OpenAI, the maker of ChatGPT, is working on another secretive project, codenamed “Strawberry,” that is expected to improve AI reasoning substantially and could thereby transform autonomous research and problem-solving.
Although the specifics remain undisclosed, sources and documents seen by Reuters suggest that Strawberry is the company’s new approach to training and processing AI models so that they can perform tasks that have so far been out of reach.
Exclusive: OpenAI Develops New Reasoning Technology Codenamed ‘Strawberry’
According to the source, the document describes a project that uses Strawberry models to enable the company’s AI not only to answer questions but also to plan ahead well enough to navigate the internet autonomously and reliably, performing what OpenAI calls “deep research.”
It involves a specialized form of ‘post-training’ for OpenAI’s generative AI models, that is, adapting them to improve specific aspects of their performance after they have already been trained on general data.
OpenAI reportedly wants to use Strawberry for long-horizon tasks (LHT), in which an AI model has to decide what actions to take to achieve a goal over a long period.
Strawberry appears to be an offshoot of an earlier OpenAI project called Q*, which had been hailed within the company for its strong reasoning ability. Some sources who viewed Q* demos said that it could solve math and science problems beyond the reach of commercially available AI models.
Although the specifics remain undisclosed, insiders say that Strawberry relies on a particular form of ‘post-training,’ a process that improves AI models after they have been trained on large datasets.
This post-training phase, which may use techniques such as ‘fine-tuning’ and self-generated training data, is considered important for improving the models’ reasoning skills.