Insider Brief
- OpenAI is launching GPT-4o, a new iteration of its GPT-4 model.
- GPT-4o will be free for all users, with paid users enjoying up to five times the capacity limits of free users.
- The model is natively multimodal: GPT-4o can generate content and understand commands in voice, text and images.
OpenAI is launching GPT-4o, a new iteration of its GPT-4 model, which is arguably the world’s most widely used large language model (LLM) tool. The updated model promises to be “much faster” and offers enhanced capabilities across text, vision, and audio, according to OpenAI CTO Mira Murati.
The announcement was made during a livestream today, where Murati highlighted that GPT-4o will be free for all users, with paid users enjoying “up to five times the capacity limits” of free users.
GPT-4o’s capabilities will roll out iteratively, starting with its text and image functionality in ChatGPT today. OpenAI CEO Sam Altman emphasized the model’s native multimodal abilities, stating that GPT-4o can generate content and understand commands in voice, text, and images.
Altman also revealed that developers will have access to the GPT-4o API at half the price and twice the speed of GPT-4 Turbo.
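For developers already building on OpenAI’s API, adopting the new model is likely to be little more than swapping a model name. The snippet below is an illustrative sketch rather than anything shown in the announcement: it assumes GPT-4o is served through the existing Chat Completions endpoint under the identifier "gpt-4o" and that an API key is configured in the environment.

```python
from openai import OpenAI

# Minimal sketch of calling the new model via the Chat Completions endpoint.
# Assumes OPENAI_API_KEY is set in the environment and that the model is
# exposed under the identifier "gpt-4o".
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Summarize today's GPT-4o announcement in one sentence."}
    ],
)

print(response.choices[0].message.content)
```

If that assumption holds, existing GPT-4 Turbo integrations would only need the `model` parameter changed to pick up the lower price and higher speed Altman described.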
In a blog post following the announcement, Altman noted that while OpenAI’s original vision was to directly create benefits for the world, the company now sees itself as an enabler for developers to build innovative applications using its AI models. “Instead, it now looks like we’ll create AI and then other people will use it to create all sorts of amazing things that we all benefit from,” he wrote.
A significant enhancement with GPT-4o is the improved voice mode, which aims to function as a real-time, Her-like voice assistant, The Verge reported. This new feature contrasts with the current voice mode, which is limited to responding to individual prompts based on what it can hear. The upcoming voice mode is expected to be more interactive and capable of observing the world around the user.
Altman shared his enthusiasm for the new voice and video mode, describing it as the best computer interface he has ever used.
“It feels like AI from the movies; and it’s still a bit surprising to me that it’s real. Getting to human-level response times and expressiveness turns out to be a big change,” Altman wrote.
He praised the new model for being fast, smart, fun, natural, and helpful, noting that talking to a computer has never felt as natural as it does now.
In the run-up to today’s launch, speculation was rife with predictions ranging from an AI search engine to rival Google and Perplexity, to a new and improved model, GPT-5. The timing of the GPT-4o launch is strategic, just ahead of Google I/O, where Google is expected to unveil various AI products from its Gemini team.
Altman underscored the company’s mission to make capable AI tools accessible to everyone.
“I am very proud that we’ve made the best model in the world available for free in ChatGPT, without ads or anything like that,” he wrote.
Offering GPT-4o for free may sound like a strange business model, but Altman is not concerned.
“We are a business and will find plenty of things to charge for, and that will help us provide free, outstanding AI service to (hopefully) billions of people,” Altman wrote.
The announcement marks a significant step for OpenAI as it continues to evolve and refine its AI offerings. With GPT-4o, the company is poised to provide more powerful tools to developers and users alike, enabling them to create and experience groundbreaking applications across various modalities.