
Large enterprises embrace hybrid compute to retain control of their own intelligence

Presented by Inflection AI


Public, centralized large language model (LLM) services from providers like OpenAI have undoubtedly catalyzed the GenAI revolution, offering an accessible way for enterprises to experiment and deploy AI capabilities quickly. But as the technology matures, large enterprises, particularly those investing heavily in AI, are beginning to combine publicly available cloud models with private compute and locally run models, leading to hybrid environments.

We’d go so far as to say, if you are spending more than $10,000,000 a year on total AI spend and you don’t have some investments in models — open source or otherwise — that you own or at least control, and some private compute resources, you are headed in the wrong direction.

We see this need to Own Your Own Intelligence as especially acute for organizations with significant security concerns, regulatory requirements, or specific scalability needs. 

The future points to an increasing preference for “private compute” solutions — deployment approaches that leverage virtual private clouds (VPC) or even on-premise infrastructure for vital tasks and processes as part of your intelligence platform. 

New vendors such as Cohere, Inflection AI and SambaNova Systems, are meeting this growing demand, offering solutions that align with the needs of companies for whom public cloud solutions alone may no longer be sufficient. 

The large models from OpenAI and Anthropic promise private environments, but their employees can still access log and transaction data when needed, and companies do not believe that “just trust the contract” is sufficient to protect critical data. Let’s explore why private compute is gaining traction and what the trade-offs look like for large enterprises.

Centralized, public LLMs started the GenAI revolution 

Public LLM services have been instrumental in getting companies up to speed with GenAI. Providers such as OpenAI offer cutting-edge models that are easy to access and deploy via cloud-based APIs. This has made it possible for organizations of any size to begin integrating advanced AI capabilities into their workflows without the need for complex infrastructure or in-house AI expertise.

The five key issues we hear about public LLMs from large enterprises in production are:

  1. Security and confidentiality risks: Large enterprises often handle sensitive data, ranging from proprietary product roadmaps to confidential customer information. While public cloud providers implement stringent security protocols, some organizations are reluctant to trust non-company employees or third parties with their most valuable data. This concern is heightened when discussing future product roadmaps, which, in the wrong hands, could benefit competitors.
  1. Loss of pricing power: As companies grow more dependent on GenAI, they may find themselves vulnerable to price increases from hyperscalers. Public cloud services typically operate under a pay-per-use model, which can become more expensive as usage scales. Companies relying on public LLM services could find themselves without leverage as prices increase over time.
  1. Trust issues with future AI developments: While current contracts may seem sufficient, large enterprises may worry about the future. In a hypothetical future with true Artificial General Intelligence (AGI) — a form of AI that could theoretically outthink humans — companies may be hesitant to trust a third party to manage such powerful technologies, even with seemingly airtight contracts. After all, the potential risks of a malfunction or misuse of AGI, even if improbable, carry significant weight.
  1. Control over features and updates: Public LLM services typically push updates and feature changes centrally, meaning companies using these services cannot control when or how updates happen. This can lead to disruptions, as enterprises must continually re-test their systems and workflows whenever new versions of models are introduced.
  1. Cost efficiency as token consumption grows: Token-based pricing models used by public LLM services are convenient for low- to moderate-use cases. However, for enterprises using these models at scale, the costs can become prohibitive. We estimate that the break-even point for cost-efficiency occurs around 500,000 tokens per day with current options and pricing. Beyond that, the per-token costs start to outweigh the convenience of not managing your infrastructure.
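The break-even logic in the last point can be sketched with a simple cost comparison: pay-per-use token pricing grows linearly with volume, while private compute is roughly a fixed monthly cost. The figures below are illustrative assumptions for the sketch, not quotes from any vendor.

```python
# Hypothetical break-even sketch: token-based public API pricing vs.
# amortized private compute. All prices here are illustrative
# assumptions, not actual vendor rates.

def monthly_public_cost(tokens_per_day: float, price_per_1k_tokens: float) -> float:
    """Pay-per-use cost over a 30-day month: grows linearly with volume."""
    return tokens_per_day / 1000 * price_per_1k_tokens * 30

def monthly_private_cost(fixed_monthly: float) -> float:
    """Amortized infrastructure cost: roughly flat regardless of usage."""
    return fixed_monthly

def break_even_tokens_per_day(fixed_monthly: float, price_per_1k_tokens: float) -> float:
    """Daily token volume at which the two cost curves cross."""
    return fixed_monthly / 30 / price_per_1k_tokens * 1000
```

For example, with an assumed $30,000/month private deployment and an assumed $0.02 per 1,000 tokens on a public API, the curves cross at 50 million tokens per day; below that volume the public API is cheaper, above it the fixed cost wins. The actual crossover point for any given enterprise depends on real vendor pricing and infrastructure costs.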

Key buyer benefits of public LLM clouds

  1. Easy and cost-effective for testing: Public clouds offer an extremely low barrier to entry. Companies can experiment with different models, features and applications without a significant upfront investment in infrastructure or technical talent.
  1. No/low capital outlay: Using a public cloud service, companies are spared from the hefty capital expenses required for building or maintaining high-performance compute clusters.
  1. No need to manage on-premise infrastructure: When relying on a public cloud provider, there’s no need for enterprises to develop, maintain and secure their own on-premise infrastructure, which can be costly and time-consuming.

Leading companies are heading to hybrid environments with private compute

We have seen at least two very different types of organizations in GenAI/AI adoption. The first are what we call "toe dippers." They've tried some isolated applications and allow only one or two vendors providing standard tools like Copilot or ChatGPT. They may have islands of automation built in different divisions.

The second group is what we call "productivity orchestrators": firms with significant systems in production. These firms run a combination of public cloud services and private compute, with solutions they have built and/or assembled to meet their current production needs. These solutions allow companies to deploy GenAI models either in their own on-premise infrastructure or within their own virtual private cloud, bringing AI capabilities closer to their "trust boundaries." Here are the benefits we hear from the orchestrators:

Pros of private compute solutions

  1. Enhanced security and confidentiality: By deploying LLMs in a private cloud or on-premise environment, enterprises keep their data within their own infrastructure, minimizing the risk of unauthorized access or accidental exposure. This is particularly important for companies in industries such as finance, healthcare and defense, where data privacy is paramount.
  1. Cost efficiency at scale: While the initial setup costs are higher, private compute solutions become more cost-effective as usage scales. Enterprises with high token consumption can avoid the variable costs of public cloud services, eventually lowering their overall spend.
  1. Greater control over AI development: With private compute, enterprises maintain full control over the models they deploy, including when and how updates occur. This allows companies to avoid disruptions to their workflows and ensure that models are always optimized for their specific use cases.
  1. Customization and flexibility: Public LLMs are designed to serve the broadest possible audience, meaning they may not always be tailored to the specific needs of any one enterprise. Private compute solutions, on the other hand, allow for more customization and fine-tuning to address unique business challenges.
  1. Development of in-house expertise: GenAI capabilities and applications continue to grow, so the internal capability to build and run new solutions will become ever more important.

Cons of private compute solutions

At the same time, these benefits do not come without costs and risks. The two biggest issues we hear are:

  1. Higher upfront capital outlays: Setting up a private compute infrastructure requires significant investment in hardware, software, and human resources. While this cost can be amortized over time, it represents a large initial financial commitment.
  1. Increased technical complexity: Managing an on-premise or VPC-based AI infrastructure requires a higher level of IT expertise, particularly in the areas of machine learning operations (MLOps) and large language model management. Enterprises must be prepared to either hire or train staff to handle the complexity of these systems.

Regulated organizations with high GenAI spend are best suited for private compute investments

For AI leaders in firms that are using AI to transform their cost base or customer experience, developing a hybrid strategy isn't optional; it's existential.

In a 2023 HBR article we introduced the idea of a new category of work: WINS. This is more precise than the commonly used ‘knowledge work’ and includes companies whose cost base is made up of functions and tasks that create or improve Words, Images, Numbers and Sounds – WINS work. For example, a heart surgeon and a chef are knowledge workers, but they are not WINS workers. Software programmers, accountants and marketing professionals are WINS workers.

And if the WINS work being created is already highly digitized, those tasks, functions, firms and industries are being transformed as we speak.

Moreover, we see very early action in WINS-intensive firms that have the following characteristics:

  • Regulated industries: Sectors such as finance, healthcare, and government are subject to strict data privacy and compliance regulations. Private compute solutions ensure that sensitive data never leaves the company’s control.
  • Enterprises with significant GenAI investment: Companies already spending $10 million or more annually on GenAI are likely to find private compute options more cost-effective at scale, particularly as their daily token consumption grows.
  • Businesses automating advanced cognitive processes: Companies in fields such as drug discovery or entertainment, where AI is used to drive highly specific and proprietary processes, benefit from keeping their AI infrastructure in-house. This reduces the risk of critical intellectual property (IP) leakage and ensures that AI development aligns closely with their business objectives.

Conclusion: Hybridize your environment to control your destiny

While public LLM services have been invaluable in jump-starting the GenAI revolution, they won’t meet the long-term needs of every enterprise. For large companies dealing with sensitive data, heavy AI use, and concerns about control and cost, private compute solutions offer a compelling alternative. 

As more vendors emerge to serve this space, we can expect to see a growing number of enterprises migrating away from public clouds and towards more secure, customizable and scalable private deployments.

Paul Baier and John Sviokla are Co-Founders of GAI Insights.


Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.