The GPT-4o System Card released by OpenAI outlines the safety work carried out before releasing GPT-4o, including external red teaming and frontier risk evaluations according to OpenAI's Preparedness Framework. It includes a Preparedness Framework scorecard that provides an end-to-end safety assessment of GPT-4o, and it describes the safeguards built into the model, as with other OpenAI models, to address the safety challenges identified.
What is the ‘GPT-4o System Card’?
The System Card describes GPT-4o, an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It is trained end-to-end across text, vision, and audio, which means all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs with a latency similar to human response time in a conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while being much faster and 50% cheaper in the API. It is also better at vision and audio understanding than existing models.
The GPT-4o System Card also includes OpenAI's Preparedness Framework evaluations. In it, OpenAI provides a detailed look at GPT-4o's capabilities, limitations, and safety evaluations across multiple categories. It also includes third-party assessments of general autonomous capabilities, as well as a discussion of the potential societal impacts of GPT-4o's text and vision capabilities.
Preparedness Framework evaluations
The Preparedness Framework is a living document that describes OpenAI's procedural commitments to track, evaluate, forecast, and protect against catastrophic risks from frontier models. The evaluations cover four risk categories: cybersecurity, CBRN (chemical, biological, radiological, nuclear), persuasion, and model autonomy. The overall risk score for GPT-4o is classified as medium: persuasion was rated borderline medium while the other three categories were rated low, and the overall score is set by the highest-rated category.
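The scoring logic can be illustrated with a minimal Python sketch. It assumes that the overall score is the highest rating across the four categories and that persuasion is rated medium while the other three are rated low. The names `Risk`, `gpt4o_scorecard`, and `overall_risk` are hypothetical and chosen only for illustration; they are not part of any OpenAI tool or API.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Risk levels ordered from lowest to highest (illustrative encoding)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical scorecard: persuasion at medium, the other categories at low.
gpt4o_scorecard = {
    "cybersecurity": Risk.LOW,
    "cbrn": Risk.LOW,
    "persuasion": Risk.MEDIUM,
    "model_autonomy": Risk.LOW,
}


def overall_risk(scorecard):
    """Return the highest risk level found across all categories."""
    return max(scorecard.values())


print(overall_risk(gpt4o_scorecard).name)  # prints "MEDIUM"
```

Because the overall score is a maximum rather than an average, a single category rated medium is enough to make the overall rating medium, even when every other category is low.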
OpenAI has therefore implemented various safety measurements and mitigations throughout the GPT-4o development and deployment process, and it will continue to monitor and update those mitigations as the landscape evolves. The System Card is also intended to encourage exploration into key areas where current understanding is limited, including measurements and mitigations for the adversarial robustness of omni models and for dangerous capabilities.
Beyond these areas, OpenAI also encourages research into the economic impacts of omni models and into how tool use might advance model capabilities.