ROI in Open Source Contributions

In this episode, Open at Intel host Katherine Druckman speaks to Alex Scammon, who leads the Open Source Program Office (OSPO) at G-Research. Alex discusses the company's significant contributions to open source projects and their unique operating model. He covers the success of Armada, a CNCF sandbox project for multi-cluster batch scheduling, and the considerable efforts of G-Research’s OSPO, which includes 30 engineers dedicated to direct open source contributions. Alex also shares insights on the benefits of supporting open source projects, the complexities of project prioritization, and the collaborative efforts in the open source community. The episode emphasizes the importance of sustainable open source involvement and offers a glimpse into G-Research's mission to use AI and ML tools to drive financial market predictions. Enjoy this transcript of their conversation.  

 

“I think that our example is really compelling for a lot of people once they hear about it, that we have 30-odd people who just contribute to open source blows people's minds.”

—Alex Scammon, OSPO Lead, G-Research 

 

Katherine Druckman: Alex, thank you for joining me in my fishbowl here at KubeCon. I appreciate you taking a little time out. 

Alex Scammon: My pleasure. Thank you for having me. 

Katherine Druckman: Awesome. First, can you tell us who you are and what you do and why you're here at KubeCon? 

Overview of Alex's Role and OSPO

Alex Scammon: I'm Alex Scammon. I run a rather large open source program office for G-Research; I'll get into the details of it a bit later. It's about 30 to 35 people, mostly engineers, who are deputized to contribute philanthropically upstream to all sorts of open source projects around the technosphere. Part of the reason I'm here is that we have a couple of projects, one of which is in the CNCF now. It was accepted into the sandbox a couple of years ago; it's Armada, and it is a multi-cluster batch scheduler. For anyone wanting to do batch scheduling of some huge foundational model or something like that across thousands of nodes running Kubernetes, Armada is one of the ways that you can do that.

Katherine Druckman: There's so many threads to pick up there, but first I want to start with the OSPO. Just to get an idea of the size of the overall organization. You have 35 people in your OSPO, how big is the company altogether? 

Alex Scammon: The company altogether is probably just under 1000. It's remarkable that we have this many people contributing. 

Katherine Druckman: Yes, that was my thinking. 

Alex Scammon: As far as I know, we're the largest OSPO of this kind anywhere. I think we're larger than the OSPOs at companies like Google. One of the secrets about it is that we do it differently than a lot of other companies. A lot of other OSPOs are mostly focused on enablement and helping a large corps of developers in an organization connect with and contribute to open source more freely. While we do that for the 600 engineers that are working at G-Research, we also have 30 engineers on the OSPO itself. That sets us apart from a lot of other OSPOs: we actually consider direct contributions from the OSPO as the majority part of what we do.

Katherine Druckman: Interesting. You kind of have an open source tiger team there, just empowered to give back to things that are mission-critical? 

Alex Scammon: Exactly. 

Katherine Druckman: That's fantastic. I wish other people would follow that example. I think there should be a case study on how it's done.  

Alex Scammon: We just wrote something that explains a little bit of that. A blog that I think is linked to OSS funders. It talks a little bit about the approach and how we think about it. A big part of why we can do what we do is that those contributions, the mission-critical ones you mentioned, are tied back to the bottom line, and we often point out how much money we make the company by either paring down licenses or forestalling migrations away from tools that we would otherwise have to move away from.
 

Importance of Open Source Contributions

Katherine Druckman: It’s an important message. Open source software powers everything, people don't necessarily realize that. There's very little software in production that doesn't have open source software in it. A tremendous amount of the software in terms of just lines of code is open source. If your software is important, then it behooves you to ensure the sustainability of the projects you rely on. 

Alex Scammon: I was giving a talk a month or two ago where somebody from the audience said, "I hear that because of XZ or some other vulnerability in some open source package, there've been a lot of people who have lost faith in open source and might be moving away from it." And I was like, "What are you talking about?" You cannot possibly… 

Katherine Druckman: Show me how you would possibly do that. 


Alex Scammon: Yes, it's just not feasibly possible. The only path as far as I can see is to embrace it and to do more work to make sure that those vulnerabilities are patched and addressed. It means that you've got to pay attention to it more, rather than run away from it. 

Prioritizing Projects and G-Research

Katherine Druckman: I love that. I'm curious then, given that your mandate within the company is really to put resources where they're needed, how do you prioritize that? How do you identify the projects you are going to put engineering resources into? 

Alex Scammon: A huge complicated multivariate equation that I put all the variables into and make the machines churn. It's a very complicated question. Maybe I'll back up just a second and talk about G-Research and what it does. 

Katherine Druckman: Okay, yes. 

Alex Scammon: It's a quantitative research firm out of London, and it essentially means that we take AI/ML tools and use them to build models that predict movements in the markets. We take those predictions and use them to make money out of money, like trading on the markets essentially. So, there's a clear path for us, or a clear order of operations of what is driving the business. All the very clever research, the quants who are coming up with all the algorithms that are making the money, those people are using all the AI/ML tools: PyTorch and Pandas and Arrow and Polars and all those things.

Because they're sort of closest to the money making, we try to service the tools that they are using first and foremost. If they have a problem in Polars, we jump on that. Or in fact, we set up an enterprise support contract with the Polars organization to be able to support them directly.

And then beneath all of that AI and ML work, there's a platform that helps those researchers and then also takes the models and productizes them essentially; that's run on Kubernetes. And beneath that, there's an OpenStack layer. There are these major ecosystems that have a whole bunch of projects within them that are also super important. The first cut of the answer to your question is there's sort of a prioritization of what's most important to the business and what's closest to making the money. But then there are all these other decision points about what the project is, what the community looks like, whether they're even open to contributions from us, whether we think they have a future. It's not just one aspect. It's not purely like, well, this makes us the most money so we will focus on this. It's also what the project is, who the community is, and sometimes who the organization behind the community is.

Katherine Druckman: Among those communities, then, is Armada, which you mentioned before, the CNCF Sandbox project. Two years into sandbox status, I think you said?

Alex Scammon: Yes, two or three, something like that. 

Katherine Druckman: I'm curious about how that's going. So that's a little bit of a different perspective. When you are the primary participant in a project versus contributing back to others with maybe more diverse communities, what is your perspective on building up that community and getting outside contributors? How do you encourage that? What are you doing to nurture that? Are you encouraging contributions from the outside? 

Alex Scammon: Absolutely. Yes, it's good timing that you ask these questions because I feel like we've just actually started getting real contributions to the community. It's been a difficult project to get contributions to because it's a multi-cluster batch scheduler, which means that you have to have the need for multiple Kubernetes clusters. We have multiple Kubernetes clusters because we have so many nodes that we can't fit them into one giant Kubernetes cluster. You have to have a company that is interested in running on thousands of nodes, first of all. It's not just the average Kubernetes user; it's not for everyone. It already limits the community that we can pull from to a very select group.

Additionally, we started this project about five years ago, out of a conversation with the research user group. At the time, there really weren't very good options for batch scheduling on Kubernetes, and a whole bunch of other people had this realization. Today there are probably 13 different options for doing this. There's a fractured sort of landscape for this. On the plus side, there's this crazy thing called GenAI that has become a thing…

Katherine Druckman: It's the hot new thing. 

Alex Scammon: More and more, there are people who are interested in running large batch jobs for training foundational models. And so, in fact right now, we're starting to see a lot more interest in doing exactly the thing that we've been doing for five years, running batch scheduled jobs at scale. That's one of the reasons why it's a really good time that we're talking. There's a whole bunch of new contributors, new people picking it up and taking a look, kicking the tires. It's great. 

Challenges and Collaboration

Katherine Druckman: Yes, that's interesting. You mentioned the fractured ecosystem right now; there are a lot of people working on similar problems because nobody's problems are truly unique. And I wondered, given that realization and this proliferation of projects where there may not have been so many before, what kind of cross-pollination is happening? Are you trying to talk to each other and find common ground, offload some burden there?

Alex Scammon: There are a couple of different ways of answering that. One, I run the CNCF batch working group alongside Abhishek, who's one of the other co-chairs, and Klaus who was one of the founders of Volcano which is one of these… 

Katherine Druckman: Okay, I know Volcano. 

Alex Scammon: And then Abhishek was one of the creators of MCAD, which is IBM's multi-cluster batch scheduler. There's a bunch of us who are getting in a room every two weeks and talking about the problems, what we can do to align, what we can do to educate the community with at least the first goal of hopefully preventing anyone from building any more of them. 

Katherine Druckman: Yes. 

Alex Scammon: And then we can work on the problem of… 

Katherine Druckman: There's an XKCD cartoon somewhere. 

Alex Scammon: Yes, exactly. It feels like the tide has turned, or at least we've stopped the flow of more batch schedulers being created. Then there's some work that all of us have done around Kueue Batch. That's the sort of Kubernetes native approach to batch scheduling that serves a lot of use cases. A lot of us have contributed to that project as well. The person who's in charge of the Kubernetes batch working group who sort of oversees the Kueue project, Aldo, stepped aside, and a person named Kevin Hannon is now taking his place. Kevin used to be on my Armada team. There's a lot of cross-pollination and a lot of us know each other and work together and like each other. I think we would all appreciate a future where there is just one approach and none of us have to run our own bespoke thing. But getting there will be tough because (1) we have running systems already, and (2) all the systems are slightly different, as you would imagine, so converging on one approach is going to take time and be difficult.

Personal Journey in Open Source

Katherine Druckman: On a personal level, I'm curious how you arrived at the place where you are right now. What attracts you to this kind of open source work? I'm super biased and in my software life I have only ever really known open source. But open source people are really interesting, and I think we're kind of the best people. What draws you to the kind of community-driven work that is essential to making all this happen? 

Alex Scammon: That's an interesting question. 

Katherine Druckman: I find a lot of us are either culture junkies or we are here for the ideology, or we just really like the opportunity to work with people across the industry. I know there are a lot of reasons, and I always find that some people have a really interesting open source origin story, if you will. 

Alex Scammon: Yes, my origin story is that I tried some open source contributions really early on and got scared off because I really didn't know what I was doing. And my ego didn't like being told that I was doing it wrong. 

Katherine Druckman: It's tough. It takes some courage to get involved. 

Alex Scammon: Yes, and it was only later that I ran into enough good communities that I was enticed back to the good side and wanting to contribute. It's a mixture, I think, for me of social interaction. I think that so much of software is about how we interact with each other, and I see open source as maybe not the purest form of that, but the most emblematic; it is a really good example of the way that software actually gets made. There is something, of course, to the ideology of we're all doing this out in the open, and there's this sort of a hacker mentality. I suppose right at the beginning I got involved in open source because I liked things that were free. And it appealed to my 20-year-old brain that I was going to do things absolutely as cheaply as possible. And so that was enticing at the time when I couldn't afford anything.

Katherine Druckman: I agree, for sure. There is this kind of, like you say, hacker culture, and it's just such a human thing. We think of software as lines of code, and of course that's part of it. But I think the human part is the more challenging part in many ways. There's a lot of diplomacy involved in open source software, there's conflict resolution, there's managing people with different priorities. There's a lot of interesting stuff that goes into it that people kind of forget because we're so focused on code considerations. 

Alex Scammon: Well, here's maybe another one, sort of similar to what you're saying: there's the feeling that you can make a contribution that could actually help move the world forward. A lot of people like helping other people, and there's a sense that by adding this one line of code you might be helping hundreds of thousands of people, millions of people all around in some small way. It's really pleasing when, just last week, there was a code commit from one of my team members that referenced Gophercloud. Gophercloud was a project that started from one of my team at Rackspace on a hack day. And then he asked, "Hey, can I continue working on this thing?" And I let him continue working on that thing, and he gave it to somebody else who was better than he was eventually. And 10 years later, it's still a thing that my team is still contributing to, and it's just really pleasing. There was a moment where I smiled to myself and was like, ah, this is great.

This is a thing that I had a small hand in helping create. And 10 years later, it is still helping countless people around the world. And that's a good feeling. I didn't need money for that feeling. I just enjoyed the feeling. 

Katherine Druckman: I agree. So, the day zero event of KubeCon involved Project Lightning Talks, an opportunity for people to get together in a huge room and just talk about all the different projects. A lot of the attendees there were people who are new to KubeCon. This is their first event, they're new to the cloud native landscape and they're trying to navigate it. I was one of the speakers on that day, and the message that everybody was trying to get across, anybody who'd been around a little bit longer was very much like, we are in this together. You're not just here to listen to the people pitch their project. You're here to talk. You're here to engage with those maintainers. Maintainers actually do want to hear from you. And yes, it's confusing and overwhelming but they're humans. If you talk to each other, you can solve these problems together. And you can. We are all in it together again and trying to do good things. For the most part, we all mostly have good intentions. 

Alex Scammon: Yes, totally agree. Even when it comes off badly, I think everyone here does at some base level have good intentions. It's lovely. 

Katherine Druckman: Yes, it's lovely. Well, is there anything that you wanted to talk about that we didn't get to? 

Alex Scammon: G-Research itself has its own products, so there's nothing that I need to push. 

Katherine Druckman: No plugs. 

Encouraging Open Source Contributions

Alex Scammon: Yes, really no plugs. We come here with an open heart just to contribute and to give. I suppose I would love to talk to anybody who is trying to convince their leadership to contribute more to open source software. I think that our example is really compelling for a lot of people once they hear about it, that we have 30-odd people who just contribute to open source blows people's minds. And…

Katherine Druckman: And it's profitable. 

Alex Scammon: Absolutely. 

Katherine Druckman: It's not just out of the goodness of your heart. Those are actual business use cases. 

Alex Scammon: We've worked a lot with our finance department to prove that we actually have a 3x or 4x return on the money that G-Research invests in us. And I'd love to talk with anybody who wants to hear more about how we do that: how we talk about it, how we calculate things, how we convince people that this is a really good thing.

Katherine Druckman: Fabulous. Where can they find you? 

Alex Scammon: They can find me at alex@gr-oss.io, that's probably the easiest way to find me. I have a Twitter, it's stackedsax. If you want to find me there, that's cool. Where else can they find me? They can find me on CNCF Slack. I think it's stackedsax on there. 

Katherine Druckman: Awesome. Well, or they can find me, and I'll help them track you down.

Alex Scammon: Yeah, please. 

Katherine Druckman: Thank you so much for all of this and for everything you do. 

Alex Scammon: Thank you very much. I really appreciate you having me. 

Katherine Druckman: You've been listening to Open at Intel. Be sure to check out more about Intel’s work in the open source community at Open.Intel, on X, or on LinkedIn. We hope you join us again next time to geek out about open source.  

About the Guest 

Alex Scammon, OSPO Lead, G-Research
 
Alex Scammon is leading a large and intrepid band of open source engineers engaged in a number of philanthropic upstream contributions on behalf of G-Research. All of their work centers around open source data science and machine learning tools and the MLOps and HPC infrastructure to support those tools at scale. As part of this work, he's also leading a discussion around batch scheduling on Kubernetes as the chair of the CNCF's Batch Working Group. Please reach out if this is an area of interest for you—they'd love to have more voices at the table! 

 

About the Host 

Katherine Druckman, Open Source Security Evangelist, Intel 
 

Katherine Druckman, an Intel open source security evangelist, hosts the podcasts Open at Intel, Reality 2.0, and FLOSS Weekly. A security and privacy advocate, software engineer, and former digital director of Linux Journal, she's a long-time champion of open source and open standards. She is a software engineer and content creator with over a decade of experience in engineering, content strategy, product management, user experience, and technology evangelism. Find her on LinkedIn