Cloud Custodian: From Internal Tool to Thriving Community



On this episode of the Open at Intel podcast, Kapil Thangavelu joins us to talk about his work on the open source project Cloud Custodian. Thangavelu initially created the tool to help a financial institution’s security team automate policy and compliance, but it’s since evolved into a Swiss army knife, helping organizations across industries easily manage cloud resources. Thangavelu shares how he got started in open source, how Cloud Custodian gained early momentum, and why open source needs to evolve to stay true to its community-focused roots. 

Listen to the full episode here. This conversation has been edited and condensed for brevity and clarity. 

Katherine Druckman: Will you introduce yourself and tell us how you’re involved in the open source community? 

Kapil Thangavelu: I self-identify as a developer. I’ve been doing open source since the late ’90s, back when it was a group of hippies, and I’ve watched it become the de facto way that everyone builds software. Back then, I was in college and I installed Linux. It was eye-opening to go from Windows—and having to buy a compiler—to having all the tools free and being able to find the source, learn from it, and fix things. And then joining communities and engaging with the people who wrote the software and other people using it, it just seemed like a better way of building things. You can use the wisdom of crowds and all the use cases that a community can bring that a single organization can’t. 

The Evolution of Cloud Custodian

Katherine Druckman: What are you working on these days? 

Kapil Thangavelu: Since about 2016, I’ve been the primary maintainer of an open source project I created called Cloud Custodian. I did that when I was at Capital One. They were making their move into the cloud in haste and were going to burn the ships, so they wanted to get everything out there. When I got there, I moved some of the first apps into production on the cloud and I noticed there were a lot of things that were costing us velocity. At a regulated institution, there are lots of things you have to do for compliance reasons. We were doing them with process and one-off scripts, and I realized, with the pace of innovation in the cloud, we had to automate them. 

That was the genesis of Cloud Custodian. It operates as a stateless command-line interface (CLI) that processes YAML domain-specific language (DSL) for policies. The policies themselves let you find resources that are interesting in some way. You basically use arbitrary filtering to find interesting sets of things and then take a set of actions on them. To this day, it’s still sort of world leading in terms of being able to fix problems vs. only being able to report them. That language of filters and actions becomes a vocabulary for creating your own policies. But we can also understand if the effect of an API is going to be compliant with policy, and we evaluate all of that in real time.  
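As a rough illustration of the filter-and-action model described above, here is a minimal Python sketch. It does not use Cloud Custodian's actual code or YAML schema; the `Policy` class, the `missing_tag` filter, the `stop` action, and the inventory records are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical model: a cloud resource is a plain dict.
Resource = dict
Filter = Callable[[Resource], bool]
Action = Callable[[Resource], Resource]


@dataclass
class Policy:
    """A named set of filters and actions for one resource type."""
    name: str
    resource_type: str
    filters: list[Filter] = field(default_factory=list)
    actions: list[Action] = field(default_factory=list)

    def run(self, resources: list[Resource]) -> list[Resource]:
        # Keep only resources of the right type that pass every filter.
        matched = [
            r for r in resources
            if r["type"] == self.resource_type and all(f(r) for f in self.filters)
        ]
        # Apply each action in order to the matched set.
        for action in self.actions:
            matched = [action(r) for r in matched]
        return matched


def missing_tag(key: str) -> Filter:
    """Filter: match resources that lack the given tag."""
    return lambda r: key not in r.get("tags", {})


def stop(resource: Resource) -> Resource:
    """Action: return a copy of the resource marked as stopped."""
    return {**resource, "state": "stopped"}


# Made-up inventory; a real tool would query the cloud provider's API.
inventory = [
    {"id": "vm-1", "type": "vm", "state": "running", "tags": {"owner": "data"}},
    {"id": "vm-2", "type": "vm", "state": "running", "tags": {}},
    {"id": "db-1", "type": "database", "state": "running", "tags": {}},
]

policy = Policy("stop-untagged-vms", "vm", [missing_tag("owner")], [stop])
remediated = policy.run(inventory)
```

In this sketch, filters select the interesting resources and actions fix them, which mirrors the "fix, not only report" idea. The real project expresses the same pairing declaratively in YAML rather than in Python objects.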

We picked up a bunch of early adopters like AOL and Ticketmaster, and we got some early contributions. And in 2017, Microsoft came to us to add support for Azure. Google Cloud Platform (GCP) followed in 2018. Recently, Oracle came to us to add support for Oracle Cloud Infrastructure (OCI), as well as Tencent Cloud. We’re continuing to add cloud providers.  

We’ve added Kubernetes support, and we’ve also added the ability to shift left so we can evaluate what’s in the actual environment and apply those same policies to what’s on your infrastructure’s code assets or on your developer workstation—with native integration into the code hosting repository and pull request (PR) mechanisms to understand where that PR is not compliant.  
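The shift-left idea above, where one policy is evaluated both against running resources and against resources declared in a pull request, can be sketched in Python as follows. This is an illustrative assumption, not Cloud Custodian's implementation: the `encrypted_at_rest` check, the `violations` helper, and the resource records are hypothetical.

```python
from typing import Callable, Iterable

# Hypothetical model: a resource is a plain dict, whether it was read from
# a live environment or parsed from infrastructure-as-code in a PR.
Resource = dict
Check = Callable[[Resource], bool]


def encrypted_at_rest(resource: Resource) -> bool:
    """Check: the resource explicitly enables encryption."""
    return bool(resource.get("encrypted", False))


def violations(
    resources: Iterable[Resource], checks: dict[str, Check]
) -> list[tuple[str, str]]:
    """Return (resource name, check name) pairs for every failed check."""
    return [
        (r["name"], check_name)
        for r in resources
        for check_name, check in checks.items()
        if not check(r)
    ]


# One policy set, defined once.
CHECKS = {"encrypted-at-rest": encrypted_at_rest}

# Made-up data: what is deployed, and what a pull request proposes to add.
deployed = [
    {"name": "orders-db", "encrypted": True},
    {"name": "legacy-db", "encrypted": False},
]
planned = [{"name": "reports-db"}]  # encryption setting omitted in the PR

runtime_findings = violations(deployed, CHECKS)  # "on the right"
pr_findings = violations(planned, CHECKS)        # "shifted left"
```

The point of the sketch is that the check is written once and reused. Only the source of the resource descriptions changes between the runtime scan and the pull request scan.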

All problems are easier to fix closer to the source, but at the same time, you’re always going to have things that are happening on the right that you need to deal with on the right. For instance, FinOps use cases around database utilization. You have to observe it in practice to be able to determine that. So that’s Custodian. We have more than 400 contributors. We’re a Cloud Native Computing Foundation (CNCF) incubating project. We have thousands of production users and organizations that are using us in mission-critical ways. 

Making It Easy for Contributors to Get Involved

Katherine Druckman: To what do you attribute the project’s success early on? Is it the nature of the problem that it solves, or do you feel like you captured something within the community?  

Kapil Thangavelu: I’ve been around for the start of multiple projects and communities, mostly in the Python ecosystem. I was an early adopter of Zope and Plone. There’s a great YouTube video on starting a movement, and it’s not about the first person; it’s about the first follower, or contributor. Making it easy for the first contributor to get involved, focusing on documentation, creating an inclusive and open community—these are key aspects. I see a lot of enterprises doing open source, but they’re not all necessarily doing it well, probably because they don’t have that DNA. There are great resources now—there’s a whole OSPO community around doing this stuff well—but it still helps to have experience. So even though we hit a lot of use cases that people had, inviting contribution is what helped us grow as a project. 

Everything Must Evolve

Katherine Druckman: Aside from your work on Cloud Custodian, what else are you excited about?  

Kapil Thangavelu: I could talk about what I’m trepidatious about, and that’s the nature of open source for projects that don’t exist within a foundation. We’ve recently seen a lot of projects change their licenses to Business Source License (BSL) derivatives, which effectively restrict the field of endeavor and fall outside the OSI definition of open source. We’ve seen that across dozens of companies. It creates a real open question about whether that’s actually open source and what it’s going to take to create sustainable open source in the future. I want to ensure that in the next 20 years, that band of hippies is still succeeding. It’s a challenge because there are lots of organizations that will potentially contribute and lots of organizations that will potentially not, so how do we make it sustainable in a way that the project and community can succeed independently of the existence of any one company? 

Katherine Druckman: I talk about this often. People who come into open source through their work don’t necessarily see it the same way as people who have been around for 20 years.  

Kapil Thangavelu: Everything has to evolve. I think some people misunderstand why open source was so important in the first place. It was fundamentally about user rights and ensuring that users who had software were able to freely modify it to meet their needs. I think there’s still some of that ethos, but maybe there are weaker boundaries around what it means. In terms of ensuring open source for the future, we all have to adjust to the times. I’m looking around at the trends happening right now, and there’s a focus on supply chain. On the flip side, we’ve got the rise of large language models (LLMs) and generative AI. What do copyright and licensing mean when something gets laundered through an LLM? It’s a little bit unclear. 

To hear more of this conversation and others, subscribe to the Open at Intel podcast.


About the Author

Katherine Druckman, Open Source Evangelist, Intel 

Katherine Druckman, an Intel open source evangelist, hosts the podcasts Open at Intel, Reality 2.0, and FLOSS Weekly. A security and privacy advocate, software engineer, and former digital director of Linux Journal, she’s a longtime champion of open source and open standards. 

Kapil Thangavelu, Cofounder and CTO, Stacklet 

Kapil Thangavelu is a cofounder and CTO at Stacklet, building products to help companies be well managed in the cloud. He started his career in open source working as a consultant with the Zope and Plone (CMS) communities. Over the last decade, he has spent time building open source projects and accelerating cloud innovation at Canonical, Capital One, and Amazon.