Ever since Ricardo Sueiras got his first taste of open source technology when he stumbled upon the Apache HTTP Server early in his career, the versatility of open source has captured his imagination. Now he’s helping customers take advantage of this versatility through his work at AWS and on open source projects Cedar and Apache Airflow. We discuss all this and more on this episode of the Open at Intel podcast.
Listen to the full episode to hear more of this conversation, including Sueiras’s DJ career as DJ Tasty Taste. If you’d like to hear more from Sueiras, check out his newsletter for a weekly roundup of open source and AWS news.
This conversation has been edited and condensed for brevity and clarity.
Earning Corporate Buy-In
Katherine Druckman: Will you tell us a little bit about who you are and what you do? Like me, you’ve been around open source for a while.
Ricardo Sueiras: I think it’s been more than 20 years now. I can still remember what got me into open source. I was in charge of building internet infrastructure at one of the big four management consultancy companies. I was knee-deep in proprietary technology like Solaris and Sun hardware. I was building Java web farms, and I came across this thing called the Apache HTTP Server (httpd). I spent some time getting the source and learning how to compile it, and at the end of it, I had a working web server. In those days web servers were proprietary and vendors charged you for them, so the thought of being able to use this technology for free was exciting. More importantly, I could start customizing things like “web server from Ricardo Sueiras.” It really caught my imagination, and I started digging deeper into Apache Tomcat and the technical side of it.
But very soon, I found that there was resistance from leaders within the organization. Their objections were around things like support and legal concerns. I worked in a very conservative company. This situation made me start what was, in essence, an OSPO for the company. It got me into the nontechnical side of things, like licensing. I started working with the legal department to educate them on what open source is. I did that by explaining that open source was evolving. I told them it was relevant to their area of expertise and that it would be useful for them to know a little bit more so they could help their customers. I must have spoken to about 200 lawyers in the company, and I found about four or five of them who were very interested in it. They became part of my virtual advocacy team for open source. I wanted to make sure I had reliable resources I could call upon.
From there, I took that model and started trying to engage other teams. Procurement was next. I said, “Software that you buy is going to include open source, and you’re going to have this in your contract. You need to know what it is.” I used things like analogies of cooking recipes to explain what open source licenses were in a way they could understand. And I gauged their level of understanding by the questions they asked me. Next was HR. I always pitched it using slightly different angles. With HR, I said, “You’re struggling to find technical people. Are you aware of all these amazing communities that are springing up around open source projects? You might want to approach these communities and see if you can find suitable candidates for jobs.”
I was doing that while I was doing my day job. It was a side hustle within the company until it grew into my main job because it was strategic for the company. I was doing that until I joined AWS. I joined AWS as an evangelist because I very much love the whole process of learning something and then teaching people about what I’m passionate about. Today, I’m an advocate for open source, and I try to tell people how they can run their open source workloads on AWS.
Cedar: Simplify Authorization Policy
Katherine Druckman: I wonder if you could educate us on a few things in your wheelhouse. Can you tell us about Cedar?
Ricardo Sueiras: Cedar is an open source project we released last year. It’s a domain-specific language for authorization. Now, what does that mean? Most people are kind of familiar with authentication: you are who you say you are, and you can validate that through an identity provider (IdP). So you log into a website, and that website now knows who you are. There are lots of technologies for that, and developers are pretty happy working with them. But when it comes to actually knowing how to provide users with access to your application resources, it gets a bit trickier. Depending on which language or framework you’re in, and as we decentralize applications and they become services spread across different languages and technologies, managing the authorization piece is hard. Cedar attempts to simplify that. It provides a Rust-based library that takes your policies and information about the request and returns an authorization decision.
The policy is a document you write that has “permit” and “forbid” policies, determining which users can do which actions on which resources. The beauty of it is you can incorporate those policies into your application in a very simple way: you call a function, is_authorized, and it reads that policy file, and now you’ve separated the business logic of your code from the authorization logic. This means a few things. First of all, if you want to make changes to your authorization, you can do it in the policy. You don’t have to mess around with your code. As you know, you typically change your policy more than your code, so it means that you don’t have to worry about releasing new updates to include a new permission or a new group. It’s already in the policy. Secondly, it makes it more readable. You can now understand within your application who’s got access to what.
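To make the idea of separating authorization from business logic concrete, here is a minimal Python sketch. It is a toy model, not Cedar itself: Cedar policies are written in Cedar's own policy language and evaluated by its Rust library. The names `Request`, `Policy`, `POLICIES`, and `view_photo` are hypothetical, and `is_authorized` here only borrows the name of the function mentioned above. The sketch does follow two Cedar semantics: a request is denied unless some policy permits it, and a matching forbid overrides any matching permit.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Request:
    principal: str  # who is asking
    action: str     # what they want to do
    resource: str   # what they want to do it to


@dataclass(frozen=True)
class Policy:
    effect: str                         # "permit" or "forbid"
    matches: Callable[[Request], bool]  # does this policy apply to the request?


def is_authorized(request: Request, policies: List[Policy]) -> bool:
    """Default deny; any matching forbid wins over any matching permit."""
    effects = [p.effect for p in policies if p.matches(request)]
    if "forbid" in effects:
        return False
    return "permit" in effects


# Hypothetical policy set for a photo-sharing app. Changing who can do what
# means editing this list, not the business logic below.
POLICIES = [
    Policy("permit", lambda r: r.principal == "alice" and r.action == "view"),
    Policy("permit", lambda r: r.principal == "bob" and r.resource == "bob/vacation.jpg"),
    Policy("forbid", lambda r: r.resource.startswith("private/")),
]


def view_photo(user: str, photo: str) -> str:
    # The business logic only asks a yes/no question of the authorizer.
    if not is_authorized(Request(user, "view", photo), POLICIES):
        return "denied"
    return f"showing {photo}"
```

In this sketch, granting a new user access means adding one entry to `POLICIES`, and `view_photo` stays untouched, which is the separation described above.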
What I really love about Cedar is the way it’s been built. If you’re using authorization within your application, it must be readable. You also want to build tools that can validate policies for you so that you can start doing things like syntax checking or duplication checking. And finally, authorization has to be reliable. If your authorization engine is down, no one gets access.
I love how they approached Cedar. It uses a language called Dafny, a project that contains formal verification tools. You start with a spec, and you write based on that spec what you want your program to do. In this instance, they wrote a model of the authorizer in Dafny, which lets you basically guarantee correctness. But you can’t run that in production because it doesn’t go very fast, so they used this model as the baseline to create a Rust-based implementation. It’s more code, but it’s memory-safe and fast. Then they did differential testing, which is basically taking a bunch of policies and requests, running them first against the Dafny model, which we know is right, then against the Rust implementation, and checking that both give the same answer.
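The differential testing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: in Cedar the trusted side is the Dafny model and the side under test is the Rust implementation, whereas here both are small, hypothetical Python functions (`reference_authorize` and `fast_authorize`), and policies are simplified to `(effect, principal, action)` tuples.

```python
import random

PRINCIPALS = ["alice", "bob", "carol"]
ACTIONS = ["view", "edit", "delete"]
EFFECTS = ["permit", "forbid"]


def reference_authorize(policies, principal, action):
    """Slow but easy-to-check model: scan every policy in turn."""
    permitted = False
    for effect, p, a in policies:
        if p == principal and a == action:
            if effect == "forbid":
                return False  # a matching forbid always wins
            permitted = True
    return permitted  # default deny when nothing matched


def build_index(policies):
    """Precompute lookup sets so each decision is two set lookups."""
    index = {"permit": set(), "forbid": set()}
    for effect, p, a in policies:
        index[effect].add((p, a))
    return index


def fast_authorize(index, principal, action):
    """Optimized implementation under test."""
    key = (principal, action)
    return key in index["permit"] and key not in index["forbid"]


def differential_test(candidate=fast_authorize, trials=1000, seed=0):
    """Run random inputs through both implementations; return any mismatches."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(trials):
        policies = [
            (rng.choice(EFFECTS), rng.choice(PRINCIPALS), rng.choice(ACTIONS))
            for _ in range(rng.randint(0, 6))
        ]
        principal, action = rng.choice(PRINCIPALS), rng.choice(ACTIONS)
        expected = reference_authorize(policies, principal, action)
        actual = candidate(build_index(policies), principal, action)
        if expected != actual:
            mismatches.append((policies, principal, action, expected, actual))
    return mismatches
```

An empty list from `differential_test()` means the two implementations agreed on every generated case. Passing a deliberately broken candidate, for example one that ignores forbid policies, should produce mismatches that pinpoint the failing inputs.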
It’s an interesting project for a number of reasons. First, it’s an interesting way to develop a project using formal verification to guarantee correctness. And it’s all open, so you can do this yourself. Second, it tackles a really hard problem. You can run it locally on a laptop, or you can also migrate it to a managed service, such as Amazon Verified Permissions. And we’ve got some partners like Permit.io that have incorporated Cedar into their offerings.
Airflow: Easily Manage Kubernetes Workloads
Katherine Druckman: Can you tell us a little about your work with Airflow?
Ricardo Sueiras: Airflow is a workflow orchestrator. You write workflows in Python code using Airflow operators. You can think of Airflow operators as templates for doing things in downstream systems. So if I wanted to upload a file to Amazon S3, I could either learn how to write that code myself, or I can use an Airflow operator that says basically “file,” “destination,” and that’s it. For creating workflows for data engineering, and really for anywhere you’ve got tasks you need to run repeatedly and reliably, it’s just brilliant. For people who run cron tasks for any kind of workflow, in the past you had to craft your own system to do that. But Airflow does it for you, and it keeps a history, which means you can go back over time to see the status of your workflow, whether it worked, and the output of the run.
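The operator idea, a reusable template that needs only a few parameters, plus ordered execution and run history, can be modeled with the Python standard library. The sketch below is a toy and does not use Airflow's API. `Operator`, `CopyFileOperator`, and `Workflow` are hypothetical names, and the "copy" step only returns a string rather than touching S3.

```python
from datetime import datetime, timezone
from graphlib import TopologicalSorter  # Python 3.9+


class Operator:
    """A reusable task template: subclasses fill in execute()."""

    def __init__(self, task_id):
        self.task_id = task_id

    def execute(self):
        raise NotImplementedError


class CopyFileOperator(Operator):
    """Hypothetical stand-in for an upload operator: just a file and a destination."""

    def __init__(self, task_id, file, destination):
        super().__init__(task_id)
        self.file = file
        self.destination = destination

    def execute(self):
        # A real operator would call the downstream system here.
        return f"copied {self.file} to {self.destination}"


class Workflow:
    """Runs tasks in dependency order and keeps a record of every run."""

    def __init__(self):
        self.tasks = {}
        self.upstream = {}  # task_id -> set of task_ids that must run first
        self.history = []

    def add(self, operator, after=()):
        self.tasks[operator.task_id] = operator
        self.upstream[operator.task_id] = set(after)
        return operator

    def run(self):
        record = {"started": datetime.now(timezone.utc), "tasks": {}}
        for task_id in TopologicalSorter(self.upstream).static_order():
            try:
                record["tasks"][task_id] = ("success", self.tasks[task_id].execute())
            except Exception as exc:
                record["tasks"][task_id] = ("failed", str(exc))
                break  # stop at the first failure; later tasks are not run
        self.history.append(record)
        return record


wf = Workflow()
wf.add(CopyFileOperator("export", "report.csv", "staging/report.csv"))
wf.add(CopyFileOperator("publish", "staging/report.csv", "bucket/report.csv"), after=["export"])
first_run = wf.run()
```

Each call to `wf.run()` appends a record to `wf.history`, so earlier runs, their status, and their output remain available to inspect, which is the kind of look-back described above.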
So the thing is, though, all those workflows are written in Python. What if you like the idea but you don’t use Python? And not just that; there are also some gotchas you need to know when you’re writing your workflows in Airflow. An emerging pattern that I’ve been seeing when I speak to developers and customers is the use of containers to effectively package up code that you write. Say you’ve written something in C++, Ruby on Rails, or Java; you package it up as a container, upload it to a container registry, and then run it. You could run it manually, or you can use the Airflow orchestrator to run these kinds of containerized workloads for you.
In my demo, I’ve got a Kubernetes environment running, I’ve got Airflow running on my laptop, and I’m packaging up a Java application that does a SQL query against my local SQL database and uploads the results to an S3 bucket. I run that on a Kubernetes cluster running on AWS, but it could be any Kubernetes cluster, anywhere. You can use one orchestrator to run your tasks anywhere, in any language, on any system.
Katherine Druckman: Versatility is always a beautiful thing. That’s what we’re all about in open source.
Ricardo Sueiras: Versatility enabled by open source. We can do this because all the operators are open source. We can see how they work, we can tweak them, we can add our own. It’s a win-win.
About the Author
Katherine Druckman, Open Source Evangelist, Intel
Katherine Druckman, an Intel open source evangelist, hosts the podcasts Open at Intel, Reality 2.0, and FLOSS Weekly. A security and privacy advocate, software engineer, and former digital director of Linux Journal, she’s a longtime champion of open source and open standards.
Ricardo Sueiras, Principal Dev Advocate for Open Source, AWS
Ricardo Sueiras has spent more than 30 years working in the technology industry and more than 20 years working with open source. He helps customers solve business problems with open source technologies and cloud. He’s currently a developer advocate at AWS, focusing on open source. Follow along with his newsletter for more open source and AWS news.