
Edge AI: The accessible, sustainable future of AI

VentureBeat/Ideogram



Ever heard of the ENIAC computer? A massive chunk of hardware that kicked off the computer age in 1946, the ENIAC mainframe weighed an impressive 27 tons and filled 1,800 square feet. It was truly astonishing in scope, with 6,000 manual switches, 17,468 vacuum tubes and an appetite for roughly 150 kW of electricity, and as the world’s first programmable general-purpose electronic digital computer, it was a game changer.

The excited headlines ENIAC generated back then will feel eerily familiar to anyone watching the current landscape of AI.

“With the help of lightning-fast computers to do most of the drudgery on problems that have baffled men for many years, today’s equation may be tomorrow’s rocket ship,” blared Popular Science Monthly in April of 1946.

“30-Ton Electronic Brain at U. of P. Thinks Faster than Einstein,” reported the Philadelphia Evening Bulletin.


Glenn A. Beck (background) and Betty Snyder (foreground) program ENIAC in BRL building 328.
(U.S. Army photo, c. 1947–1955) Credit: Wikipedia

Cut to today, more than 75 years later, and the Cortex-M4 chip that runs your smart refrigerator is 10,000 times faster than ENIAC, using just 90 µA/MHz and a few inches of space. That’s because, as computer technology matured, devices specialized and became far more effective and efficient at exactly what they needed to do in a specific, limited, cost-effective application.

This is also the way AI is headed.

The specialization of technology

Like ENIAC, AI is currently generating a huge buzz of excitement and optimism (and a little anxiety) — particularly as generative AI has taken off over the last year. However, if we want to understand its long-term trajectory, we can learn a lot from the history of computing hardware. In fact, it’s the same path most technology follows. Things start big, powerful and centralized, and once they’ve made their mark, they start to specialize, localize and become more available for efficient edge cases.

From large telephone switchboards to smartphones, from big power plants to residential solar panels, from broadcast television to streaming services: we introduce technologies big and expensive, then begin a long process of refinement. AI is no different. In fact, the very large language models (LLMs) that make AI possible are already so massive they are in danger of becoming unwieldy. The solution will be the same specialization, decentralization and democratization of AI technology into specific use cases, something we call “edge AI.”

LLMs: Great promises (and looming challenges)

LLMs like GPT (Generative Pre-trained Transformer) have made the age of AI possible. Trained on huge sets of data and with an unprecedented ability to understand, generate and interact with human language, these colossal models blur the lines between machines and human thought. 

LLMs are still evolving and pushing the boundaries of what is possible, and it’s incredible. But it is not a blank check. The sheer volume of data required, and the computational power needed to process it, make these systems incredibly expensive to run and challenging to scale indefinitely. LLMs’ appetite for data and compute has already become ravenous; the costs and energy consumption they demand are high and will soon outpace the resources available to sustain them.

At our current pace, LLMs are soon going to hit a series of inherent constraints: 

  • The availability of high-quality data to train on.
  • The environmental impact of powering such colossal models.
  • The financial viability of continually scaling up.
  • The security of maintaining such large entities. 

A technician works in rows of server racks in a Microsoft Azure cloud data center. (Image: Microsoft)

Given the breakneck velocity of AI adoption and expansion, this tipping point is not far away. What took mainframes 75 years may be a matter of months for AI, as limitations trigger a need for a shift towards more efficient, decentralized, accessible subsets of AI: Niche edge AI models.

The rise of edge AI

Edge AI model (Credit: Nvidia)

The rise of edge AI is already underway, as we see AI deployed in smaller, more specialized models, especially across the Internet of Things (IoT). This approach moves processing from centralized data centers to the ‘edge’ of the network, closer to where data is actually generated and used. A few related terms are worth knowing:

Small language models: These are versions of AI that understand and generate human-like text, but are smaller in size. Think of them as “mini-brains.” Being smaller makes them quicker and less costly to run, especially on devices that aren’t super powerful, like your smartphone or the chip in a field device. They are very knowledgeable within their remit, but might not know as much or be as creative as their bigger LLM siblings. Recent advances in highly parallel GPUs, more mature neural networks and generalized machine learning (ML) have made these smaller models practical.
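
As a rough illustration, here is a minimal Python sketch of running a small, open language model locally with the Hugging Face transformers library. This is not drawn from the article; the model name “distilgpt2” is just one example of a compact checkpoint, and the prompt is invented:

```python
from transformers import pipeline

# Load a compact, openly available model that can run on a laptop-class CPU.
# "distilgpt2" is just one example of a small checkpoint.
generator = pipeline("text-generation", model="distilgpt2")

# Generation happens entirely on the local machine; nothing is sent to a
# remote LLM service.
result = generator("The smart thermostat should", max_new_tokens=20)
print(result[0]["generated_text"])
```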

Edge AI: This is a fancy way of saying AI that runs right where the action is, like on your phone, in a camera on the street, or inside a car, instead of on big computers far away in data centers. The “edge” refers to the outer edge of the network, close to where data is being created and used. This makes things faster because the data doesn’t have to travel far, and it can also keep your data more private since it doesn’t always need to be sent over the internet.
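
To make the “edge” idea concrete, here is a small, hypothetical Python sketch of the pattern: heavy inference runs on the device next to the sensor, and only a tiny result is sent upstream. The camera helper, model stub and endpoint URL are invented placeholders, not a real system:

```python
import json
import urllib.request

import numpy as np


def read_camera_frame() -> np.ndarray:
    """Stand-in for grabbing a frame from a local traffic camera."""
    return np.random.rand(224, 224, 3)


def run_local_model(frame: np.ndarray) -> dict:
    """Stand-in for on-device inference, e.g. a small quantized vision model."""
    return {"event": "vehicle_detected", "confidence": 0.93}


frame = read_camera_frame()
result = run_local_model(frame)  # the heavy work stays on the device

# Only a few bytes of metadata leave the device; the raw video never does.
payload = json.dumps(result).encode("utf-8")
request = urllib.request.Request(
    "https://example.com/traffic/events",  # hypothetical collection endpoint
    data=payload,
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(request)  # network call omitted in this sketch
```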

Mixture of experts: In AI, a “mixture of experts” is like a team of specialists where each member is really good at a specific task. It’s a system made up of many smaller AI units (the experts), each of which specializes in a different kind of job or pool of knowledge. When faced with a task, the system decides which expert or combination of experts is best suited to handle it. This way, the AI can tackle a wide variety of tasks very efficiently, by always using the best tool for the job.
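
Conceptually, the routing step can be sketched in a few lines of Python. Everything here is a toy: the two “experts” are stand-in functions and the gate’s weights are made up, but the shape of the idea (score the input, pick the best expert, dispatch) is the same one large mixture-of-experts models use:

```python
import numpy as np

# Two toy "experts": plain functions standing in for specialized models.
def summarization_expert(text: str) -> str:
    return f"[summary expert] handled: {text[:40]}"

def translation_expert(text: str) -> str:
    return f"[translation expert] handled: {text[:40]}"

EXPERTS = [summarization_expert, translation_expert]

def gate(features: np.ndarray) -> np.ndarray:
    """Toy gating network: a fixed linear layer followed by a softmax."""
    weights = np.array([[1.2, -0.4],
                        [-0.7, 0.9]])  # made-up "learned" weights
    logits = features @ weights
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def route(text: str, features: np.ndarray) -> str:
    probs = gate(features)        # score each expert for this input
    best = int(np.argmax(probs))  # pick the highest-scoring expert
    return EXPERTS[best](text)    # dispatch the input to that expert

print(route("Summarize this quarterly report for the board.", np.array([0.9, 0.1])))
```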

Together, these technologies are making AI more versatile and more efficient to train, run and deploy, capable of working in many different places and ways, from the tiny computers in our homes and pockets to specialized tasks that require expert knowledge. That smart refrigerator we mentioned is just one example; a traffic light array is another. Self-driving cars, diabetes management, an intelligent power grid, facial recognition: the list is as endless as human ingenuity.

The risks and rewards of edge AI

As with any technology, edge AI comes with inherent risks and rewards. Let’s quickly run down the list.

Benefits of edge AI

  • Increased innovation: By removing development bottlenecks, edge AI opens the door to a surge in creative, niche applications and micro-applications — available to anyone with the desire and ability to create an app.
  • Reduced resources and increased capacity: Edge AI reduces latency and has lower processing requirements, which significantly cuts cost and consumption.
  • Enhanced privacy and security: Local data processing means sensitive information doesn’t need to be sent across the internet, reducing the risk of breaches.
  • Customizable and independent: Edge AI allows for models trained on local, specific data, offering more accurate and relevant solutions that can run independently and reliably.

Challenges of edge AI

  • Quality control: The proliferation of models increases the need for stringent quality and validation processes, and can create a new bottleneck around QC.
  • Security and governance: More devices running AI applications expand the potential attack surface, and a larger pool of creators can overburden governance processes and open up a “Wild West” environment.
  • Limited scope and scalability: Edge AI models are designed for specific tasks, which might limit their ability to scale or generalize across different scenarios.
  • Need for oversight: Leaders will need visibility into what is happening across all development and ways to help creators stay within creative guardrails. This includes curbing the potential for redundancy as solutions proliferate and duplicate in a vacuum. The solution here will be software that can help track, develop and shepherd ideas from conception well into development.

As you can see from this list, we have a golden opportunity to reimagine how AI applications are developed and managed. However, despite the cost savings and innovation dividends, many CIOs and heads of compliance may be anxious to ensure new edge AI tech is compatible, controlled and validated. Edge AI is a powerful tool to put into the hands of the layman, and accessibility can be a double-edged sword. 

Looking forward

We are on the brink of this new era in AI development, and the move to edge AI is likely to be a paradigm change that mirrors the exponential leap from those chunky old mainframes to today’s personal computers. This transition promises to make AI more accessible, efficient and tailored to specific needs, driving innovation in ways we’ve yet to fully imagine. 

In this future, the potential of AI is boundless, limited only by our imagination and our commitment to guiding its development responsibly.

Cory Hymel is a futurist and VP of innovation and research at Crowdbotics Corp.