How to train your (Chat)GPT Assistant
Pretraining
Data collection - download a large amount of publicly available data, such as web scrapes, Stack Overflow/Exchange, Wikipedia, etc.
Context length = max # of integers that the model will look at to predict the next integer in the sequence
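As a rough illustration of what "context length" means (the token IDs below are made up), the context window is just the most recent slice of the token sequence:

```python
def fit_to_context(tokens: list[int], context_length: int) -> list[int]:
    """Keep only the last `context_length` tokens; the model predicts
    the next integer from this window alone."""
    return tokens[-context_length:]

# With a context length of 4, only the last 4 integers are visible.
history = [101, 7592, 1010, 2088, 999, 102]
window = fit_to_context(history, 4)  # -> [1010, 2088, 999, 102]
```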
Don't judge a model by the number of parameters it contains.
"Base models are not assistants, they just want to complete documents"
For example - if you ask it to write a poem about bread and cheese, it will just respond with more questions. But if you write "here's a poem about bread and cheese:" it will autocomplete your document.
You can trick base models into acting like assistants with few-shot prompting.
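A minimal sketch of few-shot prompting (the Q/A format and examples are illustrative, not a fixed API): the base model sees a pattern of completed examples and simply continues it.

```python
def few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Format Q/A pairs so a document-completing base model
    continues the pattern and 'answers' the final question."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)

prompt = few_shot_prompt(
    [("What is the capital of France?", "Paris"),
     ("What is the capital of Japan?", "Tokyo")],
    "What is the capital of Italy?",
)
```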
Mode Collapse
RLHF vs SFT
RLHF models are ranked higher than SFT models by human evaluators.
Base models can be better at tasks where you have N examples of things and want to generate more of them. Base models = more entropy.
Disadvantages: LLM Text generation vs Human
LLMs spend the same amount of compute on every token
They don't reflect, they don't sanity check, they don't correct their mistakes along the way.
Cognitive advantages: LLM vs human
They DO have a very large fact-based knowledge across a vast number of areas
They do have a large and ~perfect "working memory" (the context window)
Prompting is making up for the difference between these two architectures: human brains vs. LLM "brains"
Chain of thought
"Models need tokens to think"
"Let's think step by step"
By snapping into a mode of showing its work, the model does less computational work per token and is more likely to succeed, because it spreads its reasoning out over more tokens.
Ask for reflection
LLMs can often recognize later that their samples didn't work out well.
An LLM can get unlucky in sampling and emit a "bad" token, but it doesn't know to go back; it will continue down that bad alley even if it's a dead end. You have to ask it to check whether it met your prompt's requirements.
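One way to operationalize this is a sample-then-check loop. `sample_answer` below is a deterministic stand-in for a real model call, rigged to fail once so the resampling step is visible:

```python
def sample_answer(prompt: str, attempt: int) -> str:
    """Stand-in for an LLM call (a real harness would query a model API).
    It deliberately samples a 'bad' token on the first attempt."""
    return "5" if attempt == 0 else "4"

def generate_with_check(prompt: str, requirement, max_tries: int = 5):
    """Sample, then explicitly check the prompt's requirement and
    resample on failure, instead of continuing down a bad alley."""
    for attempt in range(max_tries):
        answer = sample_answer(prompt, attempt)
        if requirement(answer):  # the explicit "did it work out?" step
            return answer
    return None

result = generate_with_check("What is 2 + 2?", lambda a: a == "4")
```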
Chains / Agents
Think less āone-turnā Q&A and more chains, pipelines, state machines, agents
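A toy two-step chain (with a stubbed `llm` function in place of a real model API) to show the shape of a pipeline rather than a one-turn answer:

```python
def llm(prompt: str) -> str:
    """Stub for a model call; a real pipeline would hit an LLM API here."""
    if prompt.startswith("Outline:"):
        return "1. intro 2. body 3. conclusion"
    return f"Draft following: {prompt}"

def write_essay(topic: str) -> str:
    """Two chained calls: first produce an outline, then draft from it."""
    outline = llm(f"Outline: {topic}")
    return llm(f"Write an essay using this outline: {outline}")
```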
Condition on Good Performance
LLMs don't "want" to succeed; they want to imitate training sets that contain a whole spectrum of performance qualities. You want to succeed, and you should ask for it.
For example: "Let's think step by step" is okay
"Let's work this out in a step by step way to be sure we have the right answer" is better
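Sketch of conditioning on good performance by baking the stronger phrasing into a prompt template (the wording is from the example above; the helper name is made up):

```python
STEP_BY_STEP = ("Let's work this out in a step by step way "
                "to be sure we have the right answer.")

def conditioned_prompt(question: str) -> str:
    """Explicitly ask for competence; otherwise the model imitates
    the full quality spectrum of its training data."""
    return f"Q: {question}\nA: {STEP_BY_STEP}"
```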
Tools Use / Plugins
Offload tasks that LLMs are not good at. Importantly, they don't "know" what they are not good at.
For example, arithmetic: you need to tell the model in the prompt that it is not good at arithmetic and ask it to use a tool.
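A sketch of the harness side of tool use: the model (stubbed here) is prompted to emit a `CALC(...)` marker instead of guessing digits, and the harness substitutes the real result. The marker syntax is made up for illustration.

```python
import re

def llm(prompt: str) -> str:
    """Stub model that was told it is bad at arithmetic and should
    emit a CALC(...) tool call instead of guessing digits."""
    return "The product is CALC(1234 * 5678)."

def run_with_calculator(prompt: str) -> str:
    """Replace each CALC(a op b) tool call with the result
    computed outside the model."""
    ops = {"+": lambda x, y: x + y, "-": lambda x, y: x - y,
           "*": lambda x, y: x * y, "/": lambda x, y: x // y}

    def evaluate(match: re.Match) -> str:
        a, op, b = re.match(r"(\d+)\s*([+\-*/])\s*(\d+)", match.group(1)).groups()
        return str(ops[op](int(a), int(b)))

    return re.sub(r"CALC\(([^)]*)\)", evaluate, llm(prompt))
```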
Retrieval-Augmented LLMs
Load related context/information into the "working memory" (context window)
Emerging recipe
Break up relevant documents/data connectors into chunks
Use Embeddings API to index chunks into a vector store
Given a test-time query, retrieve related information
Organize the information into the prompt.
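The recipe above can be sketched end to end. The bag-of-words "embedding" and cosine similarity here stand in for a real Embeddings API and vector store, which a production system would use instead:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would call an
    Embeddings API and store dense vectors."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def build_index(documents: list[str], chunk_size: int = 50):
    """Steps 1-2: split documents into chunks and index their embeddings."""
    chunks = []
    for doc in documents:
        words = doc.split()
        for i in range(0, len(words), chunk_size):
            chunks.append(" ".join(words[i:i + chunk_size]))
    return [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(index, query: str, k: int = 2) -> list[str]:
    """Step 3: fetch the chunks most related to the test-time query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(item[1], q), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def make_prompt(index, query: str) -> str:
    """Step 4: pack retrieved context into the model's working memory."""
    context = "\n".join(retrieve(index, query))
    return f"Use the context below to answer.\n\nContext:\n{context}\n\nQuestion: {query}"
```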
Constrained Prompting
"Prompting languages" that interleave generation, prompting, and logical control
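A minimal sketch of the idea (real constrained-prompting libraries do this against an actual model): fixed template text is enforced verbatim, and the model, stubbed here as `llm_fill`, is only sampled inside the `{{slot}}` markers, so the output structure is guaranteed.

```python
import re

def llm_fill(slot: str, context: str) -> str:
    """Stub generator; a real library would sample the model
    only at each slot, conditioned on the text so far."""
    return {"name": "Paris", "country": "France"}.get(slot, "?")

def run_template(template: str) -> str:
    """Interleave fixed text with generated fills: everything outside
    {{slot}} is emitted verbatim, so the output stays well-formed."""
    out = ""
    pos = 0
    for match in re.finditer(r"\{\{(\w+)\}\}", template):
        out += template[pos:match.start()]
        out += llm_fill(match.group(1), out)
        pos = match.end()
    return out + template[pos:]

result = run_template('{"city": "{{name}}", "country": "{{country}}"}')
# The braces are fixed template text, so the result parses as JSON.
```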
Default Recommendations
Other recommendations for Use cases
Use in low-stakes applications, combine with human oversight
Source of inspiration, suggestions
Copilots over autonomous agents