It started with a dare. At KubeCon 2018, one of the Kubernetes Founding Engineers at Google asked us – with a wink – to let him know when we planned to migrate TurboTax to Kubernetes.
At the time, this was an audacious challenge, and the task seemed almost insurmountable. Very few organizations had done more than tiptoe toward running business-critical applications on Kubernetes.
Intuit was already running microservices in production, but as a complex application involving hundreds of microservices and serving millions of customers, TurboTax was another matter. Would the technology be able to handle Intuit’s scale while meeting all the security and compliance requirements of this global financial services platform?
Fast-forward to 2020. In less than 12 months – from May 2019 through April 15, 2020 – we’ve met the challenge. We’re now running the majority of TurboTax critical services on Kubernetes in our production environment, including core services such as the Identity platform and Financial Data platform, which also serve Mint and QuickBooks customers. We’ve achieved performance at scale to meet the demands of even the busiest tax filing seasons, proving the viability of Kubernetes for our most business-critical applications. Along the way, our efforts were recognized by the Cloud Native Computing Foundation (CNCF) with its coveted top end user award.
The project has delivered significant benefits for Intuit, including:
- Fast, iterative product development and rollout
- Consolidation under a single platform for all development teams
- Efficient resource utilization at high scale for cost reduction
- Strong ecosystem support
- Unified distribution mechanism for service artifacts
And we’ve gained the expertise and confidence to move all remaining TurboTax support services to Kubernetes.
Of course, migrating TurboTax to Kubernetes hasn’t been easy. We’ve had to develop new skills and overcome a number of technical roadblocks along the way. If you’re considering making your own move to Kubernetes, you can take a deeper dive into the technical details of the Intuit journey here:
Top 5 lessons learned on our journey
- Gather requirements ahead of time and plan infrastructure accordingly
The main requirements you’ll need to identify include:
  - The number of microservices you’ll run on Kubernetes.
  - Compute resources, including the number of clusters and Kubernetes nodes you’ll need, the AWS availability zones and regions where they’ll be deployed, and any non-Kubernetes setup you’ll keep ready for disaster recovery purposes.
  - The scale you’ll need to support, including maximum TPS load and number of concurrent users.
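As a rough illustration of how these requirements feed into infrastructure planning, here is a minimal sizing sketch. It is not Intuit's method: the service names, throughput figures, headroom percentage, and pods-per-node value are all hypothetical, and a real plan would also account for memory, CPU, and per-zone spread.

```python
import math
from dataclasses import dataclass


@dataclass
class ServiceRequirement:
    name: str
    peak_tps: int     # maximum transactions per second expected (hypothetical)
    tps_per_pod: int  # throughput one pod can sustain (assumed known from testing)


def pods_needed(svc: ServiceRequirement, headroom_pct: int = 30) -> int:
    """Pods required to serve peak TPS plus a percentage of spare headroom."""
    return math.ceil(svc.peak_tps * (100 + headroom_pct) / (100 * svc.tps_per_pod))


def nodes_needed(services, pods_per_node: int, headroom_pct: int = 30) -> int:
    """Nodes required to host all pods, assuming a uniform pods-per-node limit."""
    total_pods = sum(pods_needed(s, headroom_pct) for s in services)
    return math.ceil(total_pods / pods_per_node)


# Hypothetical inputs, for illustration only.
services = [
    ServiceRequirement("identity", peak_tps=5000, tps_per_pod=200),
    ServiceRequirement("financial-data", peak_tps=2000, tps_per_pod=100),
]
node_count = nodes_needed(services, pods_per_node=10)
print(node_count)
```

Even a back-of-the-envelope calculation like this makes it easier to see which inputs (peak TPS, per-pod throughput) you still need to measure before committing to a cluster layout.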
- Invest in training your engineers to deliver on migration requirements
There’s no shortcut to human expertise. We took the time to deliver several weeks of training programs for hundreds of engineers across the company within our TurboTax, QuickBooks and Developer Experience teams. The training covered fundamental technologies like Docker and Kubernetes, along with advanced concepts relating to multi-tenancy, cluster add-ons, load balancers, and auto scalers. To build familiarity and proficiency, we provided our engineers with playground clusters with fully functioning Kubernetes namespaces to tinker around in.
- Keep compute and data separate, and create a robust access path between the two
A critical component of a Kubernetes migration is ensuring access to existing data. In our case, much of our existing TurboTax data layer was already being accessed via APIs (application programming interfaces) through a NAT Gateway [network address translation gateway in the AWS GovCloud (US) region]. For cases where data had to be accessed directly using AWS managed services, cross-AWS account access was set up between the Kubernetes account and the data services account. This way, our application teams can continue to manage access to their data services AWS account and use their favorite tools for managing the data. This architecture also allows service teams to keep their non-Kubernetes setup around for disaster recovery purposes.
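The post does not say which mechanism was used for cross-account access. One common approach on AWS is IAM role assumption: a role in the data services account carries a trust policy that lets a role from the Kubernetes account assume it. The sketch below only builds such a trust policy document; the account ID and role name are hypothetical placeholders, and this is an assumption about the pattern, not a description of Intuit's setup.

```python
import json


def trust_policy(trusted_role_arn: str) -> dict:
    """Trust policy for a role in the data services account that allows
    a single role from the Kubernetes account to assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": trusted_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Hypothetical ARN of a workload role in the Kubernetes account.
K8S_WORKLOAD_ROLE = "arn:aws:iam::111111111111:role/example-k8s-workload"

policy = trust_policy(K8S_WORKLOAD_ROLE)
print(json.dumps(policy, indent=2))
```

With this pattern, the data services team still decides what the assumed role may do through its permission policies, which fits the goal of letting application teams keep managing access to their own data account.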
- Test for reliability and performance at scale
It goes without saying that production Kubernetes has to meet production demands. For us, that meant performing reliably at the scale of peak tax seasons. To see whether the infrastructure could hold up, we ran weekly tests with triple the anticipated load, as well as failure testing involving traffic spikes and both regional and availability zone failures. Sure enough, we uncovered a variety of technical issues. You can find the details in the second part of our Medium blog, but the problems and their solutions gave us a much deeper understanding of running Kubernetes on AWS. Ultimately, we gained confidence that the entire setup would be able to sustain the load for peak tax season.
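To make the "triple the anticipated load" target and the availability zone failure scenario concrete, here is a small paper calculation. It is not a load test and not the tooling used in the migration; the zone names and capacity numbers are hypothetical, and it simply checks whether the capacity left after losing any one zone still covers the test target.

```python
from typing import Dict


def target_tps(anticipated_tps: int, multiplier: int = 3) -> int:
    """Load-test target: a multiple of the anticipated peak load."""
    return anticipated_tps * multiplier


def survives_zone_loss(capacity_by_zone: Dict[str, int], required_tps: int) -> Dict[str, bool]:
    """For each zone, report whether the remaining zones can still serve
    the required TPS if that zone fails."""
    total = sum(capacity_by_zone.values())
    return {
        zone: (total - capacity) >= required_tps
        for zone, capacity in capacity_by_zone.items()
    }


# Hypothetical figures, for illustration only.
required = target_tps(anticipated_tps=10_000)
capacity = {"zone-a": 16_000, "zone-b": 16_000, "zone-c": 12_000}
results = survives_zone_loss(capacity, required)
print(required, results)
```

In this made-up example, the cluster would only tolerate the loss of the smallest zone at the test target, which is the kind of gap that failure testing is meant to surface before peak season.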
- Automate, automate, automate
The more you automate, the more reliable and robust your infrastructure will become. We automated everything from onboarding new services onto Kubernetes to monitoring and fixing known problems. The governor, lifecycle-manager, iam-manager, instance-manager, upgrade-manager, and active-monitor components open-sourced in Intuit’s Keiko project are a direct result of such automation. And there’s even more automation to come.
What did success look like?
The infrastructure held up. During tax season, all services ran smoothly, meeting availability, scale and performance guarantees. And that’s saying a lot, given the nature of our seasonal traffic and the scaling requirements for core services that support more than 80 percent of the traffic powering TurboTax.
What’s next for Intuit on our Kubernetes Journey
This monumental accomplishment is a tribute to the hard work and dedication of incredibly talented individuals from across the company. Throughout the journey, Intuit TurboTax and Developer Platform engineers applied continuous testing, data-driven decisions, and focused automation to successfully tackle this audacious challenge.
And we’re proud of it.
The first generation of Intuit’s Kubernetes platform is built on Kubernetes primitives and a custom control plane, which has served us well for the desired scale and performance. However, there is always more to do, especially for even higher scalability, monitoring, observability, and manageability. Moving forward, we’ll continue migrating the remaining services in TurboTax, as well as other Intuit offerings. And we’re building the next generation of a supercharged Kubernetes platform using widely adopted Intuit open source projects, such as Keiko, Argo and Admiral.
To learn more, visit us at KubeCon + CloudNativeCon 2020 North America from November 17–20, 2020 at our virtual booth in the Silver D conference area. You’ll also find Intuit on the conference schedule, where we’re participating in five session talks on November 18, 19 and 20.