Keeping AI real | McKinsey

Companies are entering a new phase with generative AI (gen AI), as they realistically ponder how to deploy the potent technology responsibly and profitably. The answer, says Navrina Singh, founder and CEO of Credo AI and the guest on this episode of the At the Edge podcast, is proper governance in the form of continuous human oversight. Singh speaks with McKinsey senior partner Lareina Yee about the importance of monitoring, measuring, and managing AI risk for the good of humanity, as well as for gaining a competitive advantage.

An edited transcript of the discussion follows. For more conversations on cutting-edge technology, follow the series on your preferred podcast platform.

Putting AI governance first

Lareina Yee: Navrina, we see so much momentum in terms of investments and interest in AI. But we also see a pause—and it’s a big one—as companies ask themselves, “How do we do this in a responsible manner?” You created Credo AI to enable companies to adapt to this ever-changing technology landscape. Can you explain how to be both responsible and work quickly in the world of generative AI?

Navrina Singh: What a great question, Lareina. We believe the answer is AI governance, which allows you to achieve responsible AI outcomes. When you have such a powerful technology, which is going to impact business, society, and our planet in ways that we’d not thought possible before, one of the key questions that comes to mind is, “Do we understand this technology?”

Understanding this technology means answering questions such as, “Do we have a good handle on the unknown unknowns? Do we understand the risk that this technology brings? What kind of oversight mechanisms have we put in place to make sure that we are getting the business outcomes?” It’s about taking stock of and measuring the risks as well as the opportunities, and making sure you have the right information to make the right decisions at the right time, along with a good understanding of the guardrails needed.

Lareina Yee: All of these are really big concepts, so let’s take them one by one. What does AI governance really mean if you boil it down?

Navrina Singh: Responsible AI is an outcome that really needs to be defined by the organization, whereas AI governance is a practice that leads you to those outcomes.

Lareina Yee: The term that most people use, rightly or wrongly, is “compliance function,” which means looking at outcomes and then remediations. But you’re saying something different, which is that we can put governance, systems, tools, and frameworks up front as we build these models and develop solutions within our enterprises. Can you talk a little bit more about that, because I think that’s quite new?

Navrina Singh: AI systems are sociotechnical systems. Their outcomes have not only technical challenges but also massive societal implications, because the systems can be used for everything from hiring talent to creating robust military capabilities. And when you have a technology with sociotechnical outcomes, certain things need to happen first. And this is why AI governance should not be an afterthought. It should be the first thing you consider.

The other big difference about AI that I want to emphasize is the speed at which capabilities are emerging, the pervasiveness of this technology across all organizations and use cases, and the scale at which organizations are bringing artificial intelligence to bear. That changes the entire [governance] dynamic from being merely a compliance function to a competitive advantage.

The benefits of a governance-first approach

Lareina Yee: For companies that are keeping pace with this change, how has their AI governance been that competitive advantage over the past year?

Navrina Singh: The organizations that are at least trying to do the right thing now are doing a few things. First and foremost, they’re taking a step back and switching their mindsets from a checkbox compliance approach to a governance-first, responsible AI approach. And in making that mindset shift, they need to realign the organization around the core values that need to be instituted to make sure that this technology is actually going to serve their consumers and their stakeholders—whether internal or external—really well.

Lareina Yee: And who’s at the table thinking about these values that then guide these complicated but powerful systems?

Navrina Singh: We are seeing organizations bring multiple stakeholders to the table, including chief data and AI officers, governance and risk officers, leaders responsible for consent management and PII [personally identifiable information] for privacy, and, most importantly, impacted users.

Humans in the loop

Lareina Yee: We’ve moved very quickly from technology to the humans at the heart of all this. So you make your values clear as you work and coexist with these systems. How does that play forward in keeping humans at the center of this change?

Navrina Singh: I think we are going through a humanity revolution as we speak. And a part of that humanity revolution is because of not only artificial intelligence but also this cognitive need to really think about who we are going to be in this new AI world. What we are finding is an amazing opportunity for human–machine collaboration to augment everything humans bring to the table. And that augmentation, or the copilot for humans, is going to really result in productivity gains, which we are already seeing across businesses.

But more important, when you have those productivity gains, a key question is, “What new skills will humans need to bring to this new era of AI?” And this is where we are seeing what we like to call “persona metamorphosis.” We are seeing the emergence of new kinds of roles and skills that are becoming paramount for humans to adopt in this age of AI. For example, new positions such as chief AI officer and AI ethicist are now being created in enterprises.

These new roles combine a very interesting mix of skill sets, requiring people with data science and core AI expertise as well as a really good understanding of—and a finger on the pulse of—what’s happening in the policy ecosystem. Similarly, we are seeing the emergence of other new roles as AI governance leaders become responsible for managing a single pane of governance across a very distributed and state-of-the-art generative AI infrastructure.

The ‘Brussels effect’ on AI regulation

Lareina Yee: How are companies balancing the need to build all of this AI governance with an uncertain and changing regulatory landscape, which differs by geography?

Navrina Singh: One of the things that we’ve seen in this new AI landscape, especially in generative AI, is that policy has never moved as fast as we’ve witnessed in the past 18 months. And that’s actually a great indicator of the impact and scale of this technology, and also how critical it is for multiple stakeholders with diverse voices to get involved so we can direct and navigate this technology in a way that serves humanity.

In terms of regulatory changes, one thing I want to underscore is that organizations that adopt governance just to check boxes and be regulatory compliant are not going to be the winners in this space. The kind of mindset needed for enterprises and organizations to win in this age of AI is to view governance as a trust-building mechanism and competitive advantage, so they can land customers faster and retain them longer.

On the regulatory front, there’s a recognition that these powerful frontier technologies bring a lot of benefits but also come with a lot of downside if they fall into the hands of bad actors. So across the globe, AI regulation is a top legislative agenda, with the EU being the first region in the world to pass a regulatory framework, called the EU AI Act.

It is a bold statement when a group of 27 countries says, “You cannot launch any AI applications within the European Union unless you meet these guardrails, because we want our citizens to benefit from this.” I think we are going to see a “Brussels effect,” similar to what we saw with the GDPR [General Data Protection Regulation], where other nations end up complying with EU regulations. I expect to see regulatory frameworks similar to the EU AI Act show up across the world.

Four key questions to consider

Lareina Yee: You said something that will probably surprise a lot of people, which is that government regulations are moving faster than we’ve ever seen. What are the four or five questions we need to look at beyond what’s in the existing regulation?

Navrina Singh: I think it keeps coming back to some common ground principles, which look beyond regulatory frameworks to that sort of trust quotient you need to build for your enterprise. And I would say, first, that it involves a really deep understanding of where and how AI is being used within your enterprise or your organization. Taking stock of your artificial intelligence applications and creating a registry of where these systems are actually used is a great first step and a common ground principle we are finding across all our organizations.

Lareina Yee: I love that you’ve taken it from a principle to something really concrete. What are the second, third, and fourth questions?

Navrina Singh: Once you’ve taken stock of where AI is being used, the second question is, “How are you understanding and measuring its risk? What benchmarks and evaluations that align with your company values do you need to be testing your systems against?” And that alignment is really the second core piece.

The third question is, “Do you have the right people to be accountable to these evaluations and these alignments? And who is at the table?”

And then once you have that AI registry, alignment on what “good” looks like, and a set of great stakeholders, the last question is, “Are you able to, in a standardized way, scale this with the right infrastructure and tooling?” And this is where a combination of your large language model [LLM] ops tools, your MLOps [machine learning operations] tools, and your governance and risk compliance tools come into play.
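The four questions above can be sketched as a small data structure plus one standardized check. This is only an illustrative sketch, not Credo AI's product or any specific tool: the class names, the metric name, the thresholds, and the 90-day freshness window are all hypothetical assumptions.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Evaluation:
    """One risk measurement for a system (question 2). Fields are hypothetical."""
    metric: str        # illustrative metric name chosen by the organization
    value: float       # measured value
    threshold: float   # largest value the organization is willing to accept
    measured_on: date

    def passes(self) -> bool:
        return self.value <= self.threshold


@dataclass
class AISystem:
    """One registry entry recording where and how AI is used (question 1)."""
    name: str
    use_case: str
    owner: Optional[str] = None  # accountable person or team (question 3)
    evaluations: List[Evaluation] = field(default_factory=list)


def governance_gaps(registry: List[AISystem], today: date,
                    max_age_days: int = 90) -> Dict[str, List[str]]:
    """Apply the same check to every registered system (question 4)."""
    gaps: Dict[str, List[str]] = {}
    for system in registry:
        issues: List[str] = []
        if system.owner is None:
            issues.append("no accountable owner")
        if not system.evaluations:
            issues.append("no risk evaluation on record")
        for ev in system.evaluations:
            if not ev.passes():
                issues.append(f"{ev.metric} above threshold")
            if (today - ev.measured_on).days > max_age_days:
                issues.append(f"{ev.metric} measurement is stale")
        if issues:
            gaps[system.name] = issues
    return gaps


# Hypothetical registry with two systems.
registry = [
    AISystem("resume-screener", "hiring", owner="head-of-talent",
             evaluations=[Evaluation("selection_rate_gap", 0.12, 0.05,
                                     date(2024, 1, 10))]),
    AISystem("marketing-copy-bot", "marketing"),
]
report = governance_gaps(registry, today=date(2024, 3, 1))
for name, issues in report.items():
    print(name, "->", "; ".join(issues))
```

Running a check like this on a schedule, rather than once, is one way to keep the registry, the measurements, and the list of accountable owners from drifting out of date.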

Never stop measuring and reinventing

Lareina Yee: It’s helpful to step back and describe the actions we can take today. If I implemented those four steps, what’s the next thing I need to think about?

Navrina Singh: The final step, which is never done, is constant measurement. I believe that you can’t manage things you can’t measure. And what that means is constant reinvention within your organization as you review the outputs of the first four steps.

Organizations now need to be more adaptable. They must move fast but not break things, be intentional about measuring the right risk, and align against the right values to ensure ROI. But that has to happen during a constant feedback loop. So that’s the never-ending journey companies need to go on. And we believe the organizations that have done that in a standardized, scalable, and very mission-aligned way are the ones that are going to emerge as winners in this race.

Diversity is a requirement, not an option

Lareina Yee: What are some of the values that you prioritize for Credo as a user of AI, as opposed to someone who has a solution to support it?

Navrina Singh: I’ll share three of them that are really important to what we call our credo. First off, diversity is not an option; it is a requirement. More than 40 percent of my leadership team and my company are women, and we have great LGBTQ representation. Apart from AI experts, we also have folks who come from cognitive neuroscience and psychology. So we really have brought together a very diverse set of individuals.

The second is what I call “move fast but with intention.” Given the pace of change with artificial intelligence, execution becomes a given, but the question is, “How are you executing?” It’s not just about winning with AI anymore—it’s how you win with AI. Keeping pace with this very fast-moving technology becomes really, really critical.

The third is really thinking through what “good” looks like. One of our core values at Credo AI is being a bar raiser, and we have a set of core principles, like the ones I’m sharing with you. But we are not afraid to change them based on market information. And a good, strong organizational culture needs that growth mindset to rapidly adopt and adapt with new information, technology, and skill sets.

Correcting AI bias with ‘rainbow teams’

Lareina Yee: How do biases become unintentionally amplified by AI?

Navrina Singh: First, bias doesn’t get injected at a particular point in the AI life cycle. Second, bias is actually a good thing in artificial intelligence, from the perspective of making sure you’re recognizing the right patterns.

What is not good is when it results in unintended consequences and disparate outcomes for different kinds of individuals. When you look at an AI system, whether it is an LLM or a predictive system, both of them are fed data. This data needs to be thoughtfully curated—which, by the way, is not happening—especially in LLMs, because these models are meant to be dual purpose by their very nature, which means they need to understand as much of everything as possible.

And all of that data resides on the internet, including all of the YouTube videos. So you need to ask yourself, “Does that data actually represent the truth?” And as we go down that rabbit hole of determining what is true, truth could mean very different things to you versus me. A good example of that is when you use a text-to-image generator and ask it to generate a picture of a CEO or a data scientist. Most of the time, it is going to generate what it’s learned through that internet data, which means most [of these images] will be men. So biases creep in because of the data sets, and, as a result, human oversight becomes critical.

Continuing down the AI life cycle, the second place these biases can show up is when you are packaging these LLMs together for an application. You might take a GPT-4 system and create a marketing tool on top of it to serve your consumers. In that case, if you have a team of nondiverse stakeholders who might not be thinking about societal impact, what ends up happening is a lot of biases get built into these systems. And those can’t be changed.

The third place to think about biases is during testing. In the absence of the right benchmarks and evaluation, many organizations are latching onto something called “red teaming,” where you bring together a set of individuals who provide multiple prompts to test these systems. One of the challenges associated with that is who you recruit for these red teams, because if they’re all from the same stakeholder group, they’re going to test very differently than if you bring a broader cross section of society together.

One thing that we’ve been excited about is the evolution of “rainbow teaming,” where you think like a more diverse team, and how these prompts need to be curated in a scalable manner so your systems can be tested appropriately.

So bias does not occur in only one place. It can happen in your data sets, when you’re building your AI applications, during testing, or when you put that system out in the market and it keeps learning from the available data.

Moving beyond English-language LLMs

Lareina Yee: Given that most of these models are being trained on an English-language-based, maybe North American, set of data, how useful are these LLMs for other countries? I know this is a little bit different from responsible AI, but the cost of building these models is already prohibitive, and over time other countries will need their own LLMs with equally robust testing and nuance.

Navrina Singh: It actually is a question of responsible AI at its core, because if we don’t think about other cultures and languages, we are leaving the majority of the world out of this AI revolution. And the disparity we will see in the social ecosystem is something we should all be very afraid of.

Having said that, there are already a couple of things happening that I’m very excited about. One, there are companies that are not only sourcing very demographically diverse data sets but also paying their human labelers much higher rates than the rest of the world. Because if you have a two-million-strong workforce feeding these LLMs who are not being paid appropriately, I think that’s a problem. So first and foremost, we are seeing a lot of innovations on the data sets and making sure there’s cultural diversity.

The second thing we are seeing is an effort to preserve some of this more contextual, demographic information. For example, if you think about Native Americans, their language is incredibly important to them, and so it is also very important for us to maintain that historical context.

We are also seeing an awakening in the ecosystem. Whereas the initial versions of LLMs focused on the English language, in the past 18 months we have started to augment that with more cultural, geographical, and demographic information to achieve that parity across the globe.

An uphill battle for a seat at the table

Lareina Yee: So we’ve talked a lot about the technology and responsible AI, and we could probably keep going. But you’re an incredibly interesting human, and I would love to know, what was the spark that convinced you to leave a large technology company and found Credo?

Navrina Singh: I grew up in India in a very humble family. My dad was in the Indian military for 40 years, and my mum, a constant reinventor, started out teaching and then became a fashion designer. And as you can imagine, growing up in India 20–25 years ago, women weren’t seen as the breadwinners, the innovators, or the catalysts. My advantage was having a very supportive family who literally let me be that renegade.

So I moved to the United States at the age of 19, went to school here, and then started my corporate career as a hardware engineer. But it was fascinating how much of a battle it was to get a seat at the table during my career. There weren’t many women in leadership roles back then, and the few that I found I held onto tightly.

By 2011, I got into machine learning and artificial intelligence, and I saw that society was feeding data into these systems that was basically making an entire demographic—women—invisible again. So for me, this journey became more of a mission to make sure this technology serves us: humans, women, brown women. I also wanted to make sure we are no longer covered by that invisible cloak that unfortunately prevents our professional and personal growth in this ecosystem.

Overcoming boardroom bias

Lareina Yee: Women-founded start-ups attract about 2 percent of venture capital funding. What are some of the changes we’d have to see to make removing that invisibility cloak more common?

Navrina Singh: I’m still searching for answers. I think there are a lot of unsaid biases in the way that we evaluate diverse founders versus traditional ones. And that absolutely needs to change.

When I’ve gone into venture capitalist boardrooms, the questions tend to lean more toward “Prove to me that you can do it.” But for male founders, the questions tend to be directed toward envisioning “What can you accomplish if you had unlimited resources and unlimited capital?” So right there, once again, we are stuck in this world of proving versus amplifying.

The second thing that needs to change is the way we are educating the world on artificial intelligence. AI literacy needs a very diverse kind of education and skill set for us to be successful. And I think we need to embrace those across educational institutions to make a change happen.

Lareina Yee: Now I’m going to ask a couple of fun questions. In terms of generative AI, what’s your favorite feature?

Navrina Singh: It might come as a surprise to you, or it might not, but English is not my first language. So I use Google Bard [now Gemini], or Claude 3, or GPT to write emails so I can communicate in a way that would be received much more positively by the end user.

Lareina Yee: You are also a parent who thinks deeply about the world that you will leave for your daughter. What is the technology that, dare I say, she might be better at using than you are?

Navrina Singh: I would say her favorite technology right now is DALL-E. She’s a very creative kid, so she uses it for image generation.

Lareina Yee: One of the things that you shared with me is that when you were growing up, you would use your house as a lab to break and reassemble things. What was your best experiment back then?

Navrina Singh: We were living in communities that didn’t have constant electricity. So one of the experiments I did was rerouting electricity from some of my neighbors to see how much power I could generate for my house. That experiment did not end well, especially with my parents. But all our neighbors came to the house and wanted to understand my goals for that experiment. As a rebel and a renegade growing up, if I didn’t know the facts, I always wanted to seek the truth on my terms and do my own due diligence. And my community supported that.
