Is OpenAI still a good choice for fine-tuning?
OpenAI now allows you to fine-tune gpt-3.5-turbo. According to the latest documentation, gpt-3.5-turbo is the recommended model to fine-tune.
For months, OpenAI has offered fine-tuning of some of their smaller models (such as Ada, Babbage, Curie, and Davinci), but now they offer their star model, gpt-3.5-turbo (the 4k version), for fine-tuning as well. Earlier, they recommended fine-tuning their smaller models (Ada and Babbage) for simpler tasks and their larger models (Curie and Davinci) for more complex tasks. Now they have kept the range of options similar (from babbage-002 to gpt-3.5-turbo), but their recommendation has been slightly modified.
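(As a quick illustration of what this looks like in practice: gpt-3.5-turbo fine-tuning expects chat-formatted training data as JSONL, one example per line, while the older completion-style models use prompt/completion pairs, per OpenAI's guide. The snippet below is only a toy sketch, and the conversation content is made up.)

```python
import json

# Toy chat-formatted training examples (the content here is made up);
# gpt-3.5-turbo fine-tuning expects one JSON object like this per line.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a polite support assistant."},
            {"role": "user", "content": "My order has not arrived yet."},
            {"role": "assistant", "content": "Sorry about that! Could you share your order number so I can check its status?"},
        ]
    },
]

with open("training_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```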
In this article, we will try to demystify some of the details that OpenAI's documentation leaves out and go over some good reads on fine-tuning with OpenAI.
Mystery #1: Offered fine-tuning models
When OpenAI removed the Ada and Curie models from their fine-tuning offering, it wasn't a surprise to many, as OpenAI has been in the news for changing their embeddings in the past. These moves could have cost their client companies token dollars, but I think they have a good reason for doing so.
Improved embeddings can have a huge impact on the performance of the models. Also, offering multiple large language models adds a huge operational burden, as each of these models carries its own GPU requirements and serving costs for OpenAI to manage behind the API.
Takeaway: Even though babbage-002 or davinci-002 could be the more optimized model for your use case, I would highly recommend using gpt-3.5-turbo-0613 for your next fine-tuning project, as one could expect these smaller models to be dropped in the near future to cater to the growing demand for better and more powerful models such as gpt-4.
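To make the takeaway concrete, here is a minimal sketch of kicking off a fine-tuning job on gpt-3.5-turbo-0613. It assumes the pre-1.0 openai Python package (which reads your OPENAI_API_KEY environment variable) and reuses the placeholder training file from the earlier snippet.

```python
import openai  # assumes the pre-1.0 openai package (e.g. 0.28.x)

# Upload the chat-formatted JSONL file prepared earlier
# ("training_data.jsonl" is a placeholder name).
training_file = openai.File.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
)

# Start the fine-tuning job on the recommended snapshot.
job = openai.FineTuningJob.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo-0613",
)

print(job.id, job.status)
# Poll progress later with openai.FineTuningJob.retrieve(job.id).
```

Once the job finishes, the fine-tuned model id it returns can be used like any other model with the regular chat completions endpoint.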
Mystery #2: What method is used for the fine-tuning job?
Despite looking for an answer in OpenAI's documentation, cookbooks, and multiple blogs and tutorials, I could not find one. However, considering the huge memory footprint and operational costs, I am quite certain that OpenAI allows neither full fine-tuning nor selective fine-tuning. My best guess is that they only allow parameter-efficient fine-tuning (PEFT): either additive methods, reparameterization methods, or a combination of the two.
These fine-tuning methods are very interesting, and if they sound unfamiliar to you, I would highly recommend exploring them in detail.
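To be clear, OpenAI has not confirmed any of this, so the following is only an illustrative sketch of what a reparameterization-style PEFT method (LoRA) looks like using the open-source Hugging Face transformers and peft libraries. The base model and hyperparameters here are arbitrary examples, not anything OpenAI has disclosed.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Any small open causal LM works as a stand-in; facebook/opt-125m is arbitrary.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# LoRA freezes the base weights and learns small low-rank update matrices
# instead (a reparameterization-style PEFT method).
lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank updates
    lora_alpha=16,                         # scaling factor for the updates
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections in OPT
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# Typically reports well under 1% of parameters as trainable.
```

The point of the exercise is the last line: only a tiny fraction of the parameters end up trainable, which is exactly where the memory and cost savings discussed below come from.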
Is OpenAI still a good choice for fine-tuning?
Even though OpenAI might not allow full fine-tuning (and I don't think they will in the near future), and even though they might drop some of the low-demand models, I still think OpenAI is a great option for a lot of fine-tuning use cases. My short reason for this is that “OpenAI is doing the right thing” for most use cases.
My reasoning in more detail:
- Some PEFT methods, such as reparameterization and additive methods, have been shown to reach up to 90% of the performance of fully fine-tuned models, while cutting the memory footprint by up to 80% and greatly reducing operational cost. (Ref: Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning)
- Fully fine-tuned models at the 10-billion-plus-parameter scale are very expensive to host for low to medium usage. Thanks to their pay-per-use pricing, OpenAI models achieve very high accuracy at a very low cost per token compared with self-hosting their open-source competitors.
Some good reads for fine-tuning:
- OpenAI gpt-3.5 fine tuning update: https://openai.com/blog/gpt-3-5-turbo-fine-tuning-and-api-updates
- OpenAI Guides: https://platform.openai.com/docs/guides/fine-tuning
- Medium article by Cobus Greyling: https://cobusgreyling.medium.com/openai-gpt-3-5-turbo-model-fine-tuning-dd76f50db27f
- YouTube video by Sam Witteveen: https://www.youtube.com/watch?v=MkocIPcg5A8&t=57s&ab_channel=SamWitteveen
What’s next from OpenAI?
- OpenAI has announced that fine-tuning for gpt-4 will be available this fall. (Isn’t it fall already?)
- I have heard from official sources that gpt-3.5 (16k version) might also be available for fine-tuning sometime later this year.
- I would not be surprised if OpenAI decided to discontinue babbage-002 and davinci-002 soon because of high operational costs and lower demand.