Lessons from a high-ROI cloud transformation journey

Transitioning to cloud is often more complex than anticipated. In some cases, for example, companies end up replicating team structures and established processes, which makes it almost impossible to capture the full scope of benefits that cloud offers. CUNA Mutual Group found success in the cloud by changing how IT works with the business, investing in upskilling its talent, building security into its development process, and committing to a new mindset where everyone is responsible for code. In this discussion with CUNA Mutual Group’s Martin Christopher, senior vice president and chief information officer, and Sidd Kuckreja, vice president of technology products, McKinsey’s Nagendra Bommadevara, James Kaplan, and Chhavi Arora explore how CUNA Mutual shifted to an operating model optimized for cloud and how the organization overcame the obstacles on its cloud journey. This is an edited version of their conversation.

Uncovering new pressures in the move to cloud

McKinsey: What were some of the substantial changes you made to your cloud program in 2018?

Martin Christopher: By 2018, we had been on a two- to three-year agile journey in our application-development space and had progressed to the point where most of our app-development teams were operating in agile scrums.

But the IT organization was still stuck in its traditional technology towers. As we began dipping our toe into cloud, we found ourselves starting to replicate our infrastructure team as a second cloud infrastructure team. We were approaching work the same way: as app development identified a need for infrastructure, whether on premises or in the cloud, they would submit a request, and the infrastructure team would build out what was needed and hand it off to development so they could begin their work.

What we quickly found with cloud is the backlog grew exponentially in a way that was very different from what we saw with infrastructure. We already had many one-size-fits-all templates, and when we got to cloud, we unlocked this unlimited opportunity of services. At one point, the leader of that area showed me a six-year backlog, which was primarily a function of security. Cloud providers take security to a certain level, but then it’s on us to make sure the workload is secured appropriately and that we satisfy all the regulations we’re subject to as a financial-services organization.

We were unprepared for the work it took to try to tackle each one of those services for an individual app-development ask and to ensure it was configured correctly, securely, and compliantly. At the same time, there was significant pressure from our business leadership to capture value from cloud. This resulted in two additional roadblocks we had to address:

  • Some of the business units attempted to set up their own cloud platforms. This in turn resulted in multiple cloud “islands,” each geared toward the needs of a business unit, moving us further away from consistency and standardization.
  • We could not find a compelling case for migrating our applications to cloud. Any migration activity conflicted with business priorities.

Creating a cloud platform to shoulder the burden

McKinsey: What was the pivot you made to more of a cloud operating model?

Martin Christopher: As we started looking at this backlog, we realized that we needed to take a completely new approach to how we managed and offered services on cloud. We laid out three principles that we were going to abide by:

  1. The services we offered to development teams had to be fully standardized and automated; in other words, we had to create a product road map for the cloud platform with prioritization based on the needs of the customers (the development teams). So there would be no more custom/ad hoc requests.
  2. Any services we offered on cloud had to be compliant from a security, privacy, and regulatory perspective from day one. So no more one-off exceptions or manual workarounds. Not only that, any applications built on top of these services also had to be compliant from day one. This was critical for us to get buy-in from our chief information security officer (CISO) and chief risk officer.
  3. Lastly, we had to come up with a creative way to educate our development teams on how to build their applications using these services; for far too long, they had been used to giving custom requests to the infrastructure team. When the current services did not fit their needs, we had to give them a mechanism to create new services that followed the first two principles.

That was the genesis of what we call today our Atlas platform. We developed a plan that looked at the most desired cloud services and created a product that templated most of the services and ensured that they are built to fit together. We’ve also ensured that they are secured together and connect to all of our back-end security logging systems.

To do that, we paused our cloud infrastructure team entirely and brought in a product owner who partnered with us to completely change how we were looking at the cloud. We actually took it offline for about 90 days. We then retrained the staff around the concept of building a product that app development could pull and consume in a self-service way, rather than just building infrastructure. The final product, Atlas, allows the app-development team to start by pulling the code into its continuous-integration and continuous-deployment (CI/CD) pipeline (Exhibit 1). It provisions itself.

So all that work that would normally have gone into making a request and then handing it back to development was eliminated, and our dependency on the infrastructure team was greatly reduced. We also made sure that the Atlas platform was automatically secure, so that developers could just build.
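The self-service idea described above can be sketched in a few lines. This is a minimal illustration only, not the actual Atlas implementation: the catalog entries, parameter names, and guardrail settings are all hypothetical. It shows a standardized catalog in which development teams choose from a small set of allowed parameters while security settings are applied automatically and cannot be overridden.

```python
from dataclasses import dataclass, field

# Hypothetical guardrails applied to every provisioned service.
# These names are assumptions for illustration, not real Atlas settings.
LOCKED_DEFAULTS = {
    "encryption_at_rest": True,
    "audit_logging": True,
    "public_access": False,
}


@dataclass(frozen=True)
class ServiceTemplate:
    """A standardized service that development teams can pull on their own."""
    name: str
    allowed_params: frozenset  # the only settings a team may choose
    defaults: dict = field(default_factory=dict)


# Hypothetical product catalog; anything not listed cannot be requested.
CATALOG = {
    "object-storage": ServiceTemplate(
        "object-storage",
        frozenset({"region", "retention_days"}),
        {"retention_days": 30},
    ),
    "message-queue": ServiceTemplate(
        "message-queue",
        frozenset({"region", "max_message_kb"}),
        {"max_message_kb": 256},
    ),
}


def provision(service: str, **params) -> dict:
    """Build the configuration for a catalog service with guardrails applied."""
    if service not in CATALOG:
        raise KeyError(f"{service!r} is not a catalog service")
    template = CATALOG[service]
    unknown = set(params) - template.allowed_params
    if unknown:
        # Custom or ad hoc settings, including security overrides, are rejected.
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    # Locked defaults are merged last so they always win.
    return {"service": service, **template.defaults, **params, **LOCKED_DEFAULTS}
```

In this sketch, a call such as `provision("object-storage", region="us-east-1")` returns a complete configuration with the guardrails included, while an attempt to pass `public_access=True` is rejected because that setting is not among the template's allowed parameters.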

Investing in building your own people’s capabilities

McKinsey: What were your biggest barriers as you went on this journey, and how did you tackle them?

Martin Christopher: The biggest barrier was trying to get agreement from the existing cloud infrastructure team that it would be OK to pause for 90 days. They had this huge backlog and were racing to keep up with everything app development needed from them. They already felt like they were failing, and now they were going to be set further back. But then we did something that really surprised them: we invested in the existing infrastructure team, who represent some of our top talent. They didn’t know how to do work in the cloud the way we needed them to, so we brought in external partners with the necessary experience to work with them—to train them while they worked. And then after six months, we released the partners.

It remains one of the best examples of true insourcing I’ve ever seen. To this day, they’re still one of our top-performing teams in the organization, and these are infrastructure veterans, some of whom have been with our company for 30 to 35 years. In the end, I think this investment in our infrastructure team is probably one of the things that has paid the most dividends.

A bottom-up approach to changing skills and mindsets

McKinsey: The adoption of this new approach must have been a tremendous skill and mindset shift for both infrastructure and development teams. What did it take to get people to make that transition?

Martin Christopher: Over the years, there had been a lot of top-down attempts to try to introduce new operating models that never stuck. What was different about our approach with this initiative was empowering individuals who knew best what the development teams wanted and needed to solve problems as they saw fit.

As they learned how to leverage these tools, they were able to recommend services and get real-time feedback from app development. There really wasn’t any management guidance happening. Management was just listening, following along, and removing impediments. And for the first time in the decades-long careers of some infrastructure team members, they were in control of the work they were doing, able to influence it, and see the positive results of their work.

Sidd Kuckreja: Developers generally feel they’re not really responsible for infrastructure and see their role as strictly writing code. Some of our developers have now morphed into that site reliability engineering (SRE) mindset, where they’re writing infrastructure as code. They’ve actually gone to the other side, and both teams are benefiting. That mindset shift was critical.

The app-development world is benefiting because developers with a pure coding background are now bringing different expertise into the infrastructure team. And the infrastructure teams are benefiting because they’re seeing the rigor of app development applied within the infrastructure realm.

I think for the first time, app development and infrastructure worked closely together. It was almost a cultural shift that occurred within our organization. Our approach now is to adopt, build, enable, and work together to figure out end-state products. It’s also helped the app-development and infrastructure teams to recognize the challenges each team faces as they try to solve problems in the best way possible.

Making agile work in a practical way

McKinsey: We talked quite a bit about the infrastructure teams. How did you go about building the capabilities needed within the development teams?

Sidd Kuckreja: Our capability building within development teams was primarily through accelerators. An accelerator is an event that typically lasts a day or two, where we bring in a problem (such as a story or feature that has been prioritized in the sprint and that the team wants to deploy on cloud) and the key stakeholders, including certain cloud-platform team members, who are necessary to solve it.

They come together with the product owner or scrum master and make sure the backlog is somewhat even and start working through some of the problems. It’s not a theoretical situation where you take a problem, suggest hypotheses, and then everybody leaves to work on it independently. Everyone works in the same room toward a shared result. You take your hypothesis and work through it with an objective and an outcome. Each accelerator has an objective and key result (OKR), and you try to get as close to that outcome as possible.

In most cases, they have been successful. Even better, certain patterns and practices that have worked are used again in other accelerators. Within a few months, we established a developer portal to laterally exchange ideas and solutions across development teams, which further accelerated the capability building.

Martin Christopher: The core premise of agile is that any team can operate independently of another. But the reality is, we’re not fully instrumented that way yet. We still have those interdependencies.

Previously, we would see a conflict or a bottleneck build, emails would start flying, and impediments wouldn’t get removed. The value that accelerators have provided us is getting those teams into a room together where they don’t focus on anything except solving that problem. They were one of the biggest enablers in helping the teams achieve true independence.

Building security into everything

McKinsey: How did you think about security in cloud?

Martin Christopher: As we transitioned to agile, moving away from monolithic teams into dozens of agile teams, the CISO and I spent a lot of time together to ensure that everybody’s work was secure and not introducing risk into the organization. In essence, our approach to the Atlas platform was to build security into every step and every iteration of a project in such a way that developers couldn’t make it insecure. And if something does manage to fall out of compliance, there’s an immediate notification and an instant, systematic remediation.

Sidd Kuckreja: Security is so much a part of everything we do, making sure our infrastructure is secure by applying appropriate auditing and monitoring components. We’ve also embedded compliance resources and architects, and even though they may not be security architects, they’re coming at it from that security focus.

Ideally, our end state would be fully automated CI/CD pipelines, with all code being systematically tested end to end, without manual intervention, on the path to production. We’re gradually getting there.
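The notify-and-remediate behavior described in this section can be illustrated with a small sketch. It is an assumption-laden example rather than a description of how Atlas works: the baseline settings, the resource identifier, and the `notify` callback are hypothetical. It compares a resource's configuration with a required baseline, reports each deviation, and returns a configuration restored to the baseline.

```python
from typing import Callable

# Hypothetical compliance baseline; the setting names are assumptions.
BASELINE = {
    "encryption_at_rest": True,
    "audit_logging": True,
    "public_access": False,
}


def find_drift(config: dict) -> dict:
    """Return each baseline setting whose actual value differs from the required one."""
    return {
        key: config.get(key)
        for key, required in BASELINE.items()
        if config.get(key) != required
    }


def enforce(resource_id: str, config: dict, notify: Callable[[str], None]) -> dict:
    """Notify on every deviation and return a configuration reset to the baseline."""
    drift = find_drift(config)
    for key, actual in drift.items():
        notify(
            f"{resource_id}: {key}={actual!r} violates baseline, "
            f"resetting to {BASELINE[key]!r}"
        )
    # Settings outside the baseline are left untouched.
    return {**config, **{key: BASELINE[key] for key in drift}}
```

In practice, `notify` might post to an alerting channel; here any callable that accepts a string will do, which keeps the sketch easy to test.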

Boosting productivity and capabilities

McKinsey: Were you able to capture or measure the impact on developer productivity at all?

Martin Christopher: When I look at the capitalization of our internally developed software before and after Atlas, [I see that] we’ve doubled the amount of software labor that we’ve capitalized, which tells me that we’re putting a lot less effort into software support and more into actual software development. We’ve also seen a multifold increase in releases to production. In a traditional waterfall approach, you do a lot of development and release maybe only once or twice a year. But through the introduction of microservices, Atlas, and agile, we’re seeing thousands of production releases per month, where, just a few years ago, we were seeing hundreds. Overall, 20 percent of our IT footprint is now on public cloud.

But the improvements in productivity and releases were more of a by-product. Our focus was primarily on building capabilities and improving team-member engagement. On that front, we’ve made tremendous progress over the past couple of years. We started off with six basic services on cloud, which have now increased to 15. Sixty percent of our developers consider themselves cloud ready. Our team-member engagement score went up by 30 points.

Building value with the business

McKinsey: What are some of the cloud-supported applications that have gotten your partners in the business most excited?

Sidd Kuckreja: Allowing microservices to be containerized within our infrastructure lets us digitize our business products and test them in ways we never could before. The business was used to using IT as a support mechanism, but enabling IT to learn about application programming interfaces (APIs) and how they can unlock an experience was something unique for them. They’ve learned significantly and are really appreciative of that.

This led to all kinds of innovation. All of our e-commerce businesses are now running on public cloud–based systems (Exhibit 2). In our lending business unit, for example, the common product APIs are allowing us to create an omnichannel experience for the end consumers—not just from our CUNA Mutual perspective but also cutting across our partner channels. (In lending, the consumers typically go through credit unions before reaching us.) This, in turn, allowed us to better understand the customer’s borrowing journey and accelerate our product distribution.

In addition, all our fintech ventures, which previously were experimenting with their own cloud environments, moved to our Atlas platform (and decommissioned the old environments). By their own estimates, this accelerated their launch timelines by 50 percent.

I would like to emphasize that none of this was planned “top-down.” If you had asked us two years ago which applications we were moving to cloud and when, we wouldn’t have been able to give you an answer. We simply created the right environment and then empowered and educated our application teams to go forth and innovate. Interestingly, we have not had a single business disruption on our cloud-based systems in the past two years.

Lastly, and here is the punchline, we accomplished all of this with a shoestring budget. We estimate our cloud-specific investments to be around $1 million. This was invested up front in building capabilities in the two cloud-platform teams so they could work as cloud engineers in a highly agile manner, delivering fully automated and always compliant cloud services.
