Microsoft dives into data integrity at the latest VentureBeat AI Impact Tour

Presented by Microsoft

Data integrity, meaning start-to-finish transparency, accuracy, consistency, relevance and lack of bias throughout the data lifecycle, is critical to the success of generative AI applications. On the third stop of the VentureBeat AI Impact Tour in Boston, VentureBeat CEO Matt Marshall welcomed Microsoft’s Kathleen Mitford, corporate VP of global industry, along with data science experts from Biogen and State Street Financial. They discussed how they ensure data integrity across their gen AI applications, and the spectrum of risk across industries.

What it comes down to, Mitford said, is setting up ethical, responsible AI principles.

“This space is new. This space is moving fast. We need to take a responsible approach to it and know that it’s not perfect right now,” she said. “We’re continuing to learn, continuing to put out our responsible AI principles, not only for Microsoft, but how we engage with customers, what type of use cases we allow with our Azure OpenAI, but then also the principles of how we engage with ecosystems as well, so that as a technology ecosystem, we have standard principles that we’re all working toward.”

On data integrity from the top down

Transparency and ethics are the central tenets of responsible AI, but responsible AI also means taking responsibility for how data is managed, as this directly affects data quality and the success of AI. This responsibility goes beyond the IT department, Mitford explained; it belongs to every leader.

“When you think about data, before even getting to data integrity, what are the investments that the company has in making sure they have the right data that aligns with the business problems that they’re trying to solve?” she said. “That needs to be an executive-level discussion around making sure that the executive team is invested in that, that they’re prioritizing the areas and the use cases that will be the highest priority for their business.”

Data is very often generated as a by-product of making an application — and long after that application is gone, the data lives on. It’s a persistent asset with an intrinsic value, but it often isn’t managed that way, said Caroline Arnold, executive VP, chief data officer and CIO for global markets, risk, finance and corporate at State Street Financial.

“It has to be managed strategically,” she said. “You have to understand, what are the crown jewels you have? How do you manage them? How do you keep the noise out of this data? It’s very important to have that ownership. That’s new for a lot of people in the business world. The data creators own the data. They own the governance of the data, the quality of the data, and understanding how that data is used and consumed by others.”

That strategic focus is also essential at this turning point where generative AI is capturing the imagination of teams throughout the enterprise. Customers often come to Microsoft already bursting with full-blown ideas for how they want to leverage the technology.

“With all of that opportunity, you need guidance from the leadership team on where this is going to make an impact for the business,” Mitford explained.

On data integrity across industries

Finance and biotech are two of the most regulated industries in the world, and have long been stringent about data privacy and accuracy, but generative AI, and the vast amount of data it requires from sources both inside and outside an organization, adds a whole new wrinkle.

Biogen has been working with consortiums like UK Biobank and the All of Us research study in the U.S., said Dave Clifford, head of data science and applied machine learning at Biogen. Both studies are huge and valuable sources of data, but they also require a great deal of normalization to tackle variance and ensure the data is reliable and consistent across the board.

“As a community of scientists and a community of algorithm developers, we can do a better job at collaborating on some of these tools and techniques,” Clifford said. “If we’re serious about data integrity, we need to be serious about making sure that we all agree about what data integrity can and should look like, and that we work together with the public and private sectors to generate those resources.”

Managing the complexities, Mitford agreed, takes an ecosystem. In Microsoft’s case, responsible AI means working more closely than ever with its customers to understand industry-specific nuances and ensure data integrity. The company has invested significantly in data solutions like Fabric, its AI-powered analytics platform, which serves as a foundation for SaaS solutions and lets customers manage data effectively, no matter the source.

“We’ve worked with industry leaders on the different types of data, where does it reside, are there consortiums that manage that,” Mitford said. “We’ve partnered with those consortiums to make sure we’re taking their information into consideration as we’re building that out. Then we’re working very closely with the technology ecosystem on building standard connectors to bring that in. Getting your data estate in order, that’s a critical step in being able to get value out of AI.”

On the democratization of generative AI

Because it is uniquely positioned as an AI innovator and a leader in the market, Microsoft feels that a large part of responsible AI is also helping move the technology forward, Mitford said. The company has encoded that in its recently announced AI Access Principles, which govern its AI ecosystem and are meant to ensure fairness, enable innovation and foster competition.

Those principles include making AI models and development tools broadly available to software application developers around the world; making public APIs available for accessing and using AI models hosted on Microsoft Azure; not using any non-public information or data from the training, building, deployment or use of developers’ AI models in Microsoft Azure to compete with those models; and enabling Azure customers to switch to another cloud provider by letting them easily export and transfer their data.

“At Microsoft our mission is to empower every person and every organization on the planet to achieve more,” she said. “That also applies to what we do with AI, and not just for the richest companies or the richest countries.”

VB Lab Insights content is created in collaboration with a company that is either paying for the post or has a business relationship with VentureBeat, and it’s always clearly marked. For more information, contact sales@venturebeat.com.
