Even as Meta’s new Llama 3 has quickly rocketed up the charts of most-used and most customized large language models (LLMs), the rival company that ushered in the generative AI era, OpenAI, is shrugging off the competition by introducing new enterprise-grade features for building and programming atop its GPT-4 Turbo LLM and other models.
Today, OpenAI announced an expansion of its enterprise-grade features for API customers, further enriching its Assistants API and introducing new tools aimed at enhancing security and administrative control, as well as managing costs more effectively.
“When you talk to developers and businesses about meaningful work for AI models, OpenAI still has the lead,” said Olivier Godement, Head of Product, API at OpenAI in a video call interview with VentureBeat yesterday. “That said, we always welcome more competition — it’s how everyone gets better.”
Private Link and bolstered security features
Among the significant security upgrades is the introduction of Private Link, a secure method to enable direct communication between Microsoft’s Azure cloud service and OpenAI, which the latter says helps minimize “exposure to the open internet” for customer data and queries sent through the API.
This addition complements the existing security stack that includes SOC 2 Type II certification, single sign-on (SSO), AES-256 data encryption at rest, TLS 1.2 encryption in transit, and role-based access controls.
Furthermore, OpenAI has introduced native Multi-Factor Authentication (MFA) to bolster access control in line with growing compliance demands.
For healthcare companies that need HIPAA compliance, OpenAI continues to offer Business Associate Agreements along with a zero data retention policy for qualifying API customers.
Upgraded Assistants API that handles 500X more files
One of the lesser-publicized but most important enterprise offerings from OpenAI is its Assistants API, which allows enterprises to deploy custom fine-tuned models they’ve trained and/or call upon specific documents through retrieval augmented generation (RAG) to provide a conversational assistant within their own applications.
For example, ecommerce company Klarna earlier this year boasted about how its AI assistant, made with the OpenAI Assistants API, was able to do the work of 700 full-time human agents with a 25% decrease in repeat queries and a nearly 82% decrease in resolution time (from 11 to 2 minutes).
OpenAI has now upgraded the Assistants API to include enhanced file retrieval capabilities with the new ‘file_search’ feature, which can handle up to 10,000 files per assistant.
This represents a 500X increase over the previous limit of 20 files, with added functionalities such as parallel queries, improved reranking, and query rewriting.
Moreover, the API now supports streaming for real-time conversational responses — meaning that AI models such as GPT-4 Turbo or GPT-3.5 Turbo will return outputs as fast as the tokens can be generated, instead of waiting for a full response.
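In practice, streamed responses arrive as a sequence of small chunks, each carrying an incremental “delta” of the reply. The sketch below simulates that consumption loop with stand-in chunk dicts shaped like the Chat Completions streaming format; no live API call is made, and the chunk contents are illustrative only.

```python
# A minimal sketch of consuming a streamed chat completion. Real responses
# arrive as server-sent events; the chunks below are simulated stand-ins
# using the delta shape the Chat Completions API streams.

def collect_stream(chunks):
    """Concatenate the incremental 'delta' tokens into the full reply."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

simulated = [
    {"choices": [{"delta": {"role": "assistant"}}]},  # first chunk sets the role
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": ", world"}}]},
    {"choices": [{"delta": {}}]},                     # final chunk carries no content
]

print(collect_stream(simulated))
```

The point of streaming is that an application can render each `content` fragment the moment it arrives, rather than buffering the whole reply.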
The API further integrates new ‘vector_store’ objects for better file management, and offers more granular control over token usage to help manage costs effectively.
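To make the new pieces concrete, here is a sketch of the two request bodies involved, based on the Assistants API v2 reference at the time of this update: a vector store is created from previously uploaded files, then an assistant is given the `file_search` tool pointed at that store. All IDs and names (“acme-handbooks”, `file-abc123`, `vs_abc123`) are hypothetical placeholders, and the actual calls would be HTTP POSTs to the endpoints noted in the comments.

```python
import json

# 1. Create a vector store and attach uploaded file IDs to it
#    (POST /v1/vector_stores). File IDs come from prior /v1/files uploads.
vector_store_req = {
    "name": "acme-handbooks",
    "file_ids": ["file-abc123", "file-def456"],
}

# 2. Create an assistant that can search those files
#    (POST /v1/assistants). The 'file_search' tool is the upgraded
#    retrieval mechanism with the raised 10,000-file limit.
assistant_req = {
    "model": "gpt-4-turbo",
    "instructions": "Answer questions using the attached company documents.",
    "tools": [{"type": "file_search"}],
    "tool_resources": {
        "file_search": {"vector_store_ids": ["vs_abc123"]}  # ID returned by step 1
    },
}

print(json.dumps(assistant_req, indent=2))
```

Because the vector store is a first-class object, the same store can be reused across assistants instead of re-attaching files one by one.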
Projects lets enterprises control and partition access to specific tasks and assignments
A new feature called Projects offers improved administrative oversight by allowing organizations to manage roles and API keys at the project level.
This feature allows enterprise customers to scope permissions, control available models, and set usage-based limits to avoid unexpected costs—enhancements that promise to simplify project management significantly.
Essentially, they can isolate one fine-tuned version of an AI model or even a plain vanilla model to a specific set of tasks or documents and allow for specific people to work on each of these.
So, if you had one team at your enterprise working with a set of public-facing documents, and another working with a set of confidential or internal documents, you could assign each a separate project within OpenAI’s API and the two could work on each with AI models, without co-mingling or compromising the latter.
“As more organizations and even solo developers are deploying AI, they want to do things in constrained boxes,” said Miqdad Jaffer, a member of the product staff at OpenAI, in the same video call interview with VentureBeat conducted yesterday. “What Projects lets you do is sequester your resources, your members, into one little individualized project. You get individual usage reporting. You get the ability to have controls around access, security, latency, throughput and cost, and an organization really gets to build in a very secure way. And if you’re a solo developer, you can deploy hundreds of projects without any kind of concern.”
That last point is especially helpful for dev teams who consult or handle multiple clients at once.
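The separation described above can be sketched as follows: each project gets its own API key and project ID, and every request carries only that project’s credentials. This assumes the per-project API keys and the `OpenAI-Project` request header described in OpenAI’s authentication docs; the project IDs and keys shown are placeholders, not real credentials.

```python
# Hypothetical per-project credentials; in production these would come
# from a secrets manager, never from source code.
PROJECTS = {
    "public-docs":   {"api_key": "sk-proj-public-PLACEHOLDER",   "project_id": "proj_public123"},
    "internal-docs": {"api_key": "sk-proj-internal-PLACEHOLDER", "project_id": "proj_internal456"},
}

def headers_for(project_name):
    """Build request headers scoped to a single project's key and ID."""
    cfg = PROJECTS[project_name]
    return {
        "Authorization": f"Bearer {cfg['api_key']}",
        "OpenAI-Project": cfg["project_id"],
        "Content-Type": "application/json",
    }

# Each team's traffic carries only its own credentials, so usage,
# rate limits, and costs are reported per project.
print(headers_for("public-docs")["OpenAI-Project"])
```

A consultancy juggling several clients would simply add one entry per client, keeping each client’s models, files, and spend walled off from the others.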
The Good Batch
To further assist organizations in scaling their AI operations economically, OpenAI has introduced new cost management features.
These include discounted rates for customers maintaining a consistent level of token usage per minute and a 50% reduction in costs for asynchronous workloads through the new Batch API, which also features higher rate limits and promises results within 24 hours.
However, to use it, customers must send their batch of requests — the prompts or files they want the AI models to analyze — together in a single submission, and be comfortable with waiting up to the full 24 hours to receive a response from OpenAI’s AI models.
While this seems like a long time, OpenAI execs tell VentureBeat the return can be as fast as 10-20 minutes.
It’s also designed for customers and enterprises who don’t need an instant response from the AI models, say, an investigative journalist researching a long feature article who wants to send over a bunch of government documents to have OpenAI’s GPT-4 Turbo sort through them and pick out select details.
Or, an enterprise preparing a report looking at its past financial performance over time that is due in several weeks rather than days or minutes.
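The workflow above starts with a JSONL input file: one independent request per line, each tagged with a `custom_id` so results can be matched back, per the Batch API reference. The sketch below builds that file’s contents for a hypothetical document-review job; the document IDs and prompts are placeholders, and the subsequent upload (`/v1/files`) and batch-creation steps are noted in comments rather than executed.

```python
import json

# Hypothetical documents to analyze asynchronously.
documents = {
    "doc-001": "Summarize the key findings of this report.",
    "doc-002": "List the named officials in this memo.",
}

# Each JSONL line is one self-contained request against a supported endpoint.
lines = []
for custom_id, prompt in documents.items():
    lines.append(json.dumps({
        "custom_id": custom_id,            # echoed back so results can be matched
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4-turbo",
            "messages": [{"role": "user", "content": prompt}],
        },
    }))

batch_jsonl = "\n".join(lines)
print(batch_jsonl)
# Next steps (not run here): upload this file via /v1/files with
# purpose="batch", then create the batch referencing the returned file ID.
```

Because every line is independent, OpenAI can process the requests whenever capacity frees up, which is what funds the 50% discount.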
As OpenAI continues to enhance its offerings with a focus on enterprise-grade security, administrative control, and cost management, the updates show that the company is interested in offering a more “plug and play” experience for enterprises right out of the box. That positioning counters the ascent of Llama 3 and open models from the likes of Mistral, which may require far more setup on the enterprise’s side.