
Curbing shadow AI with calculated generative AI deployment


Presented by Dell Technologies


Shadow AI is emerging as a potentially thorny problem for IT departments struggling with how to deploy and manage generative AI services. Organizations are partly responsible, as 45% of enterprises have no formal policy in place governing gen AI’s use.

The challenges are mounting as large language models (LLMs) continue to proliferate and grow more portable.

Today, IT staff can access LLMs and GPUs-as-a-service from an array of public cloud providers, or from specialists offering API access to build gen AI services. As these LLMs become both smaller and easier to use, it's easy to envision knowledge workers creating their own bespoke digital assistants simply by picking their way through drop-down menus on laptops or speaking commands into their smartphones.

Then came the corrections

This is the state of shadow AI today, and it recalls the rise of SaaS applications and public cloud services, when IT leaders contended with business leaders and developers procuring software without their permission.

IT leaders countered by locking down shadow IT or making uneasy peace with employees consuming their preferred applications and compute resources. Sometimes they did both.

Meanwhile, another unwelcome trend unfolded, first slowly, then all at once. Cloud consumption became unwieldy and costly, with IT shooting itself in the foot through misconfigurations, overprovisioning and other implementation errors.

As they often do when weighing investment against business value, IT leaders began looking for ways to reduce or optimize cloud spending.

Rebalancing IT workloads became a popular course correction as organizations realized applications may run better on premises or in other clouds. With cloud vendors backtracking on data egress fees, more IT leaders have begun reevaluating their positions.

Make no mistake: The public cloud remains a fine environment for testing and deploying applications quickly and scaling them rapidly to meet demand. But it also makes organizations susceptible to unauthorized workloads.

The growing democratization of AI capabilities is an IT leader's governance nightmare. Even so, CEOs are hot for gen AI services, so canceling them outright is a non-starter.

Rather, IT leaders must strike a delicate balance between nurturing employees' gen AI plans and exercising responsible AI governance and guardrails, all while, crucially, respecting the budget.

The path to the right AI use cases

To do this, IT leaders must collaborate with business leaders on the right gen AI use cases. This will include compromises on both sides, with IT culling some gen AI services and standardizing on specific tools.

IT should evaluate each use case for the best cost and performance in local or hosted environments. Some may run better in public cloud environments; many will perform well on premises, where they will also benefit from the protection IT has to offer.

In certain scenarios, deploying an LLM on premises can save IT some precious cash.

A recent study by Enterprise Strategy Group (ESG) found that conducting inferencing (the workload that enables AI models to make predictions) with an open-source LLM running through retrieval-augmented generation (RAG) on premises is more cost-effective than running it in the public cloud or using an API-based approach.

For the public cloud comparison, ESG conducted two tests. It tested an on-premises instance of the open-source Mistral 7B model (7 billion parameters) against the same model running on Amazon Web Services (AWS) EC2 and found the on-premises deployment was 38% to 48% more cost-effective, with the advantage increasing as the user base grew.

For the second public cloud test, ESG compared inferencing on a 70-billion-parameter Meta Llama 2 instance (Llama 2 70B) against AWS EC2 and found the on-premises deployment was 69% to 75% more cost-effective.

Finally, ESG tested on-premises Llama 2 70B against OpenAI's GPT-4 Turbo, the API-based approach, sized for 50,000 enterprise users, and found the on-premises deployment was 81% to 88% more cost-effective.
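To make the on-premises pattern concrete, here is a minimal sketch of RAG-style inferencing with a local open-source model. It assumes the Hugging Face transformers and sentence-transformers libraries with locally cached model weights; the sample documents, query and model names are illustrative, not the configuration ESG tested.

```python
# Minimal on-premises RAG sketch: embed local documents, retrieve the
# closest ones to a query, and feed them as context to a local LLM.
# Assumes `pip install transformers sentence-transformers torch numpy`
# and locally cached model weights. Names below are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import pipeline

# Internal documents stay in the data center; no API calls leave the premises.
documents = [
    "Expense reports must be filed within 30 days of travel.",
    "VPN access requires hardware tokens issued by IT.",
    "Production database credentials rotate every 90 days.",
]

# 1. Embed the documents once with a small local embedding model.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = embedder.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ q  # normalized vectors, so dot product == cosine
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

# 2. Generate with a local open-source model (e.g., a Mistral 7B instruct
# variant), grounding the answer in the retrieved context.
generator = pipeline("text-generation",
                     model="mistralai/Mistral-7B-Instruct-v0.2")

query = "How soon do I need to file an expense report?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"
result = generator(prompt, max_new_tokens=64, return_full_text=False)
print(result[0]["generated_text"])
```

In a production deployment, the in-memory document list would typically be replaced by a vector database, but the flow (embed, retrieve, then generate locally) is the same, and every step runs on hardware IT controls.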

Deploying gen AI services on premises won't curb shadow AI outright, but it could make it more manageable. Moreover, the ability of IT staff to monitor models in their own data centers could make it easier and faster to address problems arising from bizarre or flawed outputs. This underscores the value of bringing AI to an organization's data.

The reality is that organizations will likely run gen AI workloads in various other environments, including public and private clouds and the edge.

Decisions around where to deploy LLMs aren’t easy. That’s where trusted partners can help. To help ease organizations’ gen AI journeys, Dell Technologies provides AI-optimized servers, client devices for the modern workplace and professional services.

Shadow AI isn’t a trivial challenge, but that doesn’t make it a deal-breaker for creating a responsible gen AI strategy. The right partner can light the path.

Learn more about Dell AI solutions.

Clint Boulton is Senior Advisor, Portfolio Marketing, APEX at Dell Technologies.


Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.