Intel CEO Pat Gelsinger laid out the big chip maker's vision for the future of AI and open systems at the company's Vision 2024 event in Phoenix, Arizona, today.
I caught up with Sachin Katti, senior vice president of the network and edge group at Intel, for a one-on-one interview on the subject of AI and Intel’s belief that providing an open alternative will give it a leg up in competition with rivals like Nvidia, which dominates AI processing.
During the event, Intel unveiled its Gaudi 3 AI accelerator chips with 50% better inference processing and 40% better power efficiency than Nvidia’s current H100 AI systems. It unveiled its comprehensive AI strategy for enterprises, with scalable systems across AI segments.
It also described customer wins with HPE, Lenovo and Supermicro for Gaudi 3, as well as Gaudi accelerator customers Bosch, CtrlS, IFF, Landing AI, Ola, Naver, NielsenIQ, Roboflow and Seekr.
And it focused on its efforts to create an open platform for enterprise AI to accelerate deployment of secure GenAI systems, enabled by retrieval augmented generation (RAG). This is in direct competition with what it called Nvidia’s proprietary CUDA-based AI ecosystem.
Katti drilled down on the Ultra Ethernet Consortium, through which Intel is leading an effort to define open Ethernet networking for AI fabrics. He also introduced the AI NIC (network interface card), AI connectivity chiplets for integration into XPUs, Gaudi-based systems, and a range of soft and hard reference AI interconnect designs for Intel Foundry.
Here’s an edited transcript of our interview.

Sachin Katti: At Intel Vision next week, our focus will be on how we deliver the easy button for enterprises to adopt AI across every part of their enterprise: across the AI PC, across the edge, across IT, and of course the data center and cloud. That said, I do want to emphasize that enterprises have already deployed a lot of AI, but more in the realm of computer vision or audio transcription. The next wave of enterprise adoption of AI is around generative AI and how to use those capabilities.
We look at what's happening in that world today as three phases. Right now it's all about copilots. Imagine a very helpful helper, AI as a helper that is waiting to be called upon to help with your tasks, whether it's software engineering or customer care and so on. Very quickly, if it's not happening already, enterprises will look to adopt agents based on AI. These agents will automate entire workflows. They'll take over the role of a checkout clerk, say, or a customer care agent, or an assistant, and so on, automating domain-specific complex workflows in the enterprise.
The next phase of this is going to be multiple agents, multiple AI agents, collaborating with each other to perform an enterprise departmental function. Think finance, supply chain management, store security, all these kinds of functions where you typically need multiple AI agents interacting and collaborating to perform that function. That’s how we see enterprises evolving through this world of AI and adopting it.
The challenge enterprises have today is that it’s not easy. It’s not easy along multiple dimensions. First and foremost, it’s about data. Enterprises have a lot of data. They consider this data to be their prized asset. It’s sensitive. It’s private. Often it encodes business value. Second, a lot of enterprise data is unstructured. It’s not organized neatly in databases. It’s in all kinds of documents, in audio recordings, in video that has been recorded and so on. The first piece that enterprises have to solve is how they handle this data so they retain their sovereignty, retain privacy and security, and not let it leak. And then of course the second piece is the AI piece itself. How do they build the right accelerator stack to take advantage of the new large language and large vision models that are coming to make sense of all this private data?
That’s going to be the dominant architecture: a data management stack and an AI stack. At Vision we’ll be announcing a system strategy to help enterprises adopt and deploy these kinds of solutions using Intel’s proven technologies. Any of the data management stacks today, they’ve been built on x86 over the last few decades. A lot of investment has gone into making it enterprise-grade, making it easy for enterprises. We’ll be announcing reference systems that make it easy for enterprises to handle the data management aspect. Second, we’ll be announcing accelerator stacks, AI stacks, reference designs that combine our accelerators, our networking, and of course our CPUs to handle the AI models.
The thing I want to emphasize is that you’ll see us announce a bunch of component products. Xeon 6, AI networking, AI NIC. But the point of what we’re bringing to the table is we’ll put that together into systems, into reference system designs that will put all of the hardware together, test them, validate them. Put the software together, test it, validate it. Deliver it as reference designs to our OEM partners and our ecosystem. They will in turn productize it and deliver it as solutions.
We hear loud and clear from enterprises that they need things to be easy. They cannot handle the complexity of putting all this stuff together and thinking about how to make it work. That’s too hard. At the same time, they want openness and choice. They don’t want to be locked into a particular ecosystem. We’re trying to address that head-on. Not just announcing individual products, but putting them together into systems. Use open source software, use open standards, but make sure it’s a validated reference design. Scale it through our ecosystem and give choice to enterprises. That, in a nutshell, is the high-level strategy.

VentureBeat: The messaging around openness is interesting. It's been very interesting to look at history and see the evolution there. I remember covering Intel back in the 1990s. There was Intel Architecture Labs. It was coming up with things like videoconferencing. That was Intel-owned software that helped the company sell a lot of processors. Or that was the hope: to push ahead and create software that the rest of the industry could get behind, or not.
Here we are in a more modern day where there's a dominant company in AI, and Intel is in the position of being a bigger advocate for openness. Some of the lines now are drawn between the focus on openness and the focus on more competitive, proprietary approaches. Obviously Intel has its own chips, and those are very proprietary, but as we cross the line from hardware to software, it gets more interesting.
Katti: We’re embracing openness at every layer of the stack. Obviously we have our chips. Those are a product. But when you have a CPU and a GPU or an accelerator, you need to stitch them together using networking and open standards. Let me start there as the first point. We’ll be announcing a new product category at Intel Vision next week called AI NICs. These are ethernet-based NICs that are optimized for building and deploying AI clusters. They’ll be based on an open standard from the Ultra Ethernet Consortium. This is an industry ecosystem defining an open standard that will be an alternative to InfiniBand from Nvidia.
When you put our things together, our CPUs and accelerators, the networking is open and Ethernet-based, stitched together using open standards. That's the hardware layer. The next step is the software layer. Unlike CUDA, the equivalent layer from Intel, oneAPI and oneDNN, is part of UXL under the Linux Foundation. It's completely open source, multiple companies and ecosystem players contribute to it, and there's no restriction on its use. It's not tied to particular hardware; it can be used with many pieces of hardware.
The next layer of the reference design, we make sure it's integrated into PyTorch and OpenVINO. PyTorch, of course, is the standard framework that all of AI is using to develop models. We make sure this flows naturally into PyTorch. OpenVINO, an inference runtime, is also open source, and it makes it easy to take a trained model and deploy it for inference. All of the source code is available. It supports Intel, but it also supports Arm-based systems.
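To make that last step concrete, here is a minimal sketch of the PyTorch-to-OpenVINO flow Katti describes, using the openvino Python package's convert-and-compile API. The model (ResNet-50) and input shape are illustrative choices, not part of Intel's reference designs.

```python
# A minimal sketch of taking a PyTorch model to OpenVINO for inference.
# ResNet-50 and the 224x224 input are illustrative, not from Intel's designs.
import torch
import torchvision
import openvino as ov

# A PyTorch model (weights=None keeps the sketch self-contained;
# in practice you would load trained weights).
model = torchvision.models.resnet50(weights=None)
model.eval()

# Convert the PyTorch model to OpenVINO's intermediate representation.
example = torch.randn(1, 3, 224, 224)
ov_model = ov.convert_model(model, example_input=example)

# Compile for a target device ("CPU", "GPU", etc.) and run inference.
core = ov.Core()
compiled = core.compile_model(ov_model, "CPU")
result = compiled(example.numpy())
```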

When we talk about higher layers of the stack, we're making sure that we integrate things like vLLM and DeepStream, all open source software libraries, into our reference designs. The thing I want to emphasize is that all the way from the lowest levels of the system, the hardware and open networking, to every layer of the software stack, we are embracing open source and open standards. But we're making sure that we do not leave the complexity of putting all this together, and making sure it works, to the end customer. When I say we'll deliver a reference system design, we will put it together, validate it, and make sure that when you get that reference system productized from our OEM, it just works.
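As one example of what those higher layers look like in practice, here is a minimal vLLM serving sketch. The model name is an illustrative open-weights choice, not one Intel has specified for its reference designs.

```python
# A minimal sketch of offline inference with vLLM, one of the open
# source libraries Katti cites. The model name is illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Summarize this quarter's supply chain report."], params)
for out in outputs:
    print(out.outputs[0].text)
```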
VentureBeat: Can you characterize the resources going into this software at Intel? I see some very clear areas where Nvidia is heavily investing in software. How would you say this compares as far as that open, foundational software?
Katti: Intel has a very strong software lineage and level of investment. We have more than 20,000 software engineers at the company, at every layer of the stack. Intel is obviously associated with hardware, but our view is that software is what makes the hardware usable, what makes it do the work it needs to do. We have very strong software investment, and all of that investment is being directed to make sure that we deliver these kinds of open reference designs and make them available to our ecosystem. It’s a very sizable investment and we’re focused on making sure that investment is directed toward making enterprise AI easy and open source.
VentureBeat: Where is the comfort level with enterprises and AI now? Many people in the enterprise technology ranks, and even the legal ranks, might be horrified to hear OpenAI's CTO say she's not sure whether YouTube videos went into model training. It seems like they would want to have that answer locked down. Is there a distinction you see developing, where that comfort is falling into place?
Katti: You hit the nail on the head. That’s where the discomfort comes from for the enterprise. That’s why we started the way we did. We looked at adopting AI in the enterprise as a combination of two things they need to solve. How do they handle their data, and then how do they adopt the models themselves? How do they use AI themselves? We’ll be talking about RAG, retrieval augmented generation, at Vision next week. That’s a popular technique for enterprises using AI copilots today.

If you look at what's going on there, retrieval refers to the data piece. How do I find all the relevant contextual data for a prompt? If someone types a prompt, let's say an Intel employee asks, "Tell me how the latest Intel chip has been tested. What security flaws were tested for?" That's confidential information. It's not released to the public. You need a piece of software, the data piece, the retrieval piece in RAG, that will find the raw data relevant to that prompt. Then, instead of throwing all that raw data at you, the PDF files and the design documents, and making you sort through them as a human, you use an AI model like Llama 2 to summarize it into an answer a human can digest. This way enterprises can ensure that before the data is fed to that model, they can enforce access control and security policies. They're comfortable with what is being fed to the model, and the model is only used appropriately.
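The retrieve-then-summarize pattern Katti describes can be sketched in a few lines. Everything here, the toy corpus, the access-control sets, the keyword retrieval, and the generate() stub, is a hypothetical stand-in for illustration; a real deployment would use a vector store and an actual LLM behind an inference server.

```python
# Hypothetical sketch of the RAG flow described above: retrieve documents
# relevant to a prompt, enforce access control BEFORE the model sees them,
# then have an LLM summarize. All names here are illustrative stand-ins.
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    readers: set  # users cleared to see this document

CORPUS = [
    Doc("Chip X passed side-channel and fault-injection testing.", {"engineer"}),
    Doc("Public datasheet: Chip X ships in Q3.", {"engineer", "guest"}),
]

def retrieve(prompt: str, top_k: int = 5) -> list:
    # Toy retrieval by keyword overlap; real systems use embeddings.
    words = set(prompt.lower().split())
    ranked = sorted(CORPUS, key=lambda d: -len(words & set(d.text.lower().split())))
    return ranked[:top_k]

def generate(prompt: str) -> str:
    # Stand-in for a model call (e.g. Llama 2 behind an inference server).
    return f"[LLM summary of: {prompt[:60]}...]"

def answer(prompt: str, user: str) -> str:
    candidates = retrieve(prompt)
    # Enforce access control before anything reaches the model.
    allowed = [d for d in candidates if user in d.readers]
    context = "\n\n".join(d.text for d in allowed)
    return generate(f"Context:\n{context}\n\nQuestion: {prompt}")

print(answer("What security flaws were tested for on Chip X?", "guest"))
```

Run as a guest, the confidential test results never enter the prompt; run as an engineer, they do. That ordering, policy first and model second, is the control Katti says makes enterprises comfortable.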
These are the kinds of reference designs we’ll be delivering next week. This gives enterprises more transparency and control over how their data is used as they build out these AI systems. But that’s decidedly where our focus is. How do I make it easy? How do I make it compliant? How do I make it secure for enterprises to adopt AI, so they can get over their discomfort in adopting these systems?
VentureBeat: In your talk you have a section on the age of copilots and agents and functions. Could you distinguish those for us?
Katti: We look at enterprise AI evolving in three ages. Today we’re in the age of copilots. Obviously we’ve heard of Microsoft Copilot, GitHub Copilot, copilots of all kinds. Fundamentally these are helpers. These are AI helpers that are on standby, waiting to be called upon. I’m doing a software engineering task and I call the GitHub Copilot to help me finish. That’s where AI is today.
Very quickly, as I said, they will become agents. Instead of me calling upon the AI, the AI will do some of these complex workflows by itself. It will become the customer care agent. It will become the software engineer. It will become the assistant. It will become the checkout clerk, instead of a human having to call the AI. That’s what we call the age of AI agents. But the agent is still just doing one workflow.
When we talk about the age of AI functions, it’s multiple AI agents interacting and collaborating with each other to do a more complex function, like handling enterprise finance, handling enterprise inventory management, handling store security. These things require multiple people who work together to handle that. Imagine a world where AI agents work together to perform the duties of an entire department or enterprise function. That’s how we believe enterprises will evolve. Underlying it will be this common building block. How do I handle enterprise data and how do I deploy AI models? We want to make sure we deliver systems that make it easy for enterprises to do that as they build copilots, as they deploy agents, and as they deploy functions, rolling through these different ages.
VentureBeat: I’m interested in this notion of openly available data and private data and the line between them. Jensen Huang at Nvidia has been saying we’re in the age of sovereign AI, and that’s going to be defined by borders. We have national borders where a lot of countries will have their own restrictions on data going out of their country, and we’ll have similar situations with companies. Companies will have sovereign data policies on what’s allowed to come in or go out as far as where AI is getting its data. Even individuals–how much of my household data do I want to go outside my home and be used in different AI applications? I might use something that would parse my email for analytics. I wonder how synchronized the view might be between what Intel is talking about and what others are talking about?
Katti: It’s very much in alignment. We look at the world as all of these entities, whether they’re sovereign nations, enterprises, or individuals. They’ll all want control of their data, and they’ll all want personalized models tuned to them. Today, if you look at the largest models, they’re all trained on public data. It’s one model for everyone. Over time, we expect a world where all of these models get specialized for either individual domains or individual enterprises. When we talk about our enterprise AI strategy with these reference systems, it’s about how we can enable that kind of personalization for individual enterprises. With the AI PC it’s about how to do it on the PC itself for every individual.
At every scale, whether it's an individual with an AI PC, an enterprise with its edge, IoT, and data center, or the cloud, where an enterprise might be using instances, how do I give you safe, secure, and easy-to-use systems that allow you to tailor AI for your needs? That's fundamentally our strategy.
VentureBeat: If LLMs are going to get a lot more specialized, I wonder about the contrast between the LLMs that know everything and the LLMs that are very specifically trained to help one enterprise or one person. What's necessary for those LLMs to still be smart? I can see how an LLM into which you dump everything in the world becomes pretty smart and can answer any question you bring up in a conversational way. But will LLMs that are very specifically targeted at a company still have the appearance of intelligence if they don't have everything in the world stuffed into them?

Katti: If you look at what's happening, LLMs are doing two things. One, they're learning how to do specific tasks. They're becoming intelligent, as you put it. They're learning how to talk like a human, how to reason perhaps, how to program. The other thing is they're repositories of knowledge. They ingest all of the data, and instead of going to a search engine, you can ask them something. Because of those two things combined, they become really large. The latest GPT model is probably a trillion parameters. We're talking about 10 trillion or more in the future. These are large models that require a lot of compute infrastructure to train and deploy.
The view of the world that we’re talking about here, I definitely need them to be intelligent. They need to be able to do summarization. They need to be able to do code generation. But I don’t need them to know everything in the world. I need them to know what is necessary for my enterprise. When you make that distinction, they’re capable, but they only remember what’s relevant to me. They’re specialized to me. Then you can make them smaller. You don’t need this giant model with giant infrastructure that can only run in one or two places in the world. You can make them smaller and deploy them in more places, maybe even on your PC.
That’s how we get this to scale. The cost of deploying these in the largest data centers, especially in terms of energy consumption, is going to be so high that it will be inaccessible to most people or most enterprises. We expect this is the way the world will evolve, and it’s already happening. If you look at Mistral, these open source models, they’re quite small and very capable. They do many tasks as well as the largest models. But they don’t know everything. You specialize the models to just know your data. Then it will work equally well for you.