Open at Intel host Katherine Druckman spoke with Nick Vidal of the Open Source Initiative about its open source AI definition released in late 2024. They discussed its principles, the unique challenges of defining openness in AI, and the ongoing process of reaching community consensus. Enjoy this transcript of their conversation.
“This new wave of AI systems, large language models, is relatively new, and everyone's learning. And how open source AI, open source principles can translate to AI.”
—Nick Vidal, community manager, Open Source Initiative (OSI)
Katherine Druckman: Hi Nick, thank you so much for taking a little time out of this event to talk to me. I know you've got a lot going on, and we'll get to that in just a second, right?
Nick Vidal: For sure.
Katherine Druckman: You've been pretty busy. But if you wouldn't mind, introduce yourself and tell us who you are and what you do.
Nick Vidal: My name is Nick Vidal. I'm with the Open Source Initiative. Actually, it's a funny story, because I was with the OSI before, and I helped them celebrate 20 years of open source. And I returned to the OSI to celebrate the 25th anniversary. I serve as the community manager for the OSI, and we're doing some really important work around the supply chain and open source AI.
AI has become a major topic in mainstream discussions. That's why defining what truly qualifies as open source AI is more important than ever.
Nick Vidal's Role at OSI
Katherine Druckman: Tell me, how does your role as a community manager tie into all these different areas, especially AI right now? I would imagine AI is taking up a lot of your time, but what else are you involved in as a community manager?
Nick Vidal: There are several communities that we are involved with. Besides open source AI, where it's actually quite challenging to define what the open source AI community even is. Where are they? But we're also involved with the supply chain. We have a project called Clearly Defined.
Community Involvement and Challenges
Nick Vidal: We connected at SOSS Fusion, a Linux Foundation event in Atlanta, where we presented Clearly Defined, a project focused on securing the software supply chain. We have developers and security experts. My work, basically, is trying to understand the challenges around open source AI and how we can help those developers and security experts. In the case of Clearly Defined, our goal, our mission, is for every component to have a clear definition of its licenses, of its licensing metadata.
When they're deploying something, when they're developing something, they know all the components that they're using and all the licenses involved. If there's a missing or wrongly identified license, they can actually fix that and work with other community members. Instead of having to fix it on their own, they can help each other. Right?
Katherine Druckman: Right.
Nick Vidal: Right now, we're a small community, Clearly Defined. We have folks from GitHub, Microsoft, Bloomberg, and so many others, and we're working together. SAP has been involved a lot with that. We try to work together with a lot of different community members to help with the supply chain as well.
Katherine Druckman: I think a really big part of any kind of community work, but especially a community manager's, is your ability to take in a lot of feedback from various communities. Especially when you're talking about something in an emerging field, like AI, there are a lot of opinions, very heated opinions. What can you tell us about the way that you have tried to collect feedback during the process of determining the OSI's open source AI definition?
Defining Open Source AI
Nick Vidal: Open source AI, AI in general, has been a hot topic, and everyone has an opinion about that. There's a lot of open washing as well. We see a lot of companies who…
Katherine Druckman: Sure, with licenses. Software licenses as well.
Nick Vidal: Yes. They clearly say of their projects, this is open source AI. When, in fact, if you look at the licenses, they're bespoke licenses with clauses that mean you really cannot call them open source. They have clauses that say you cannot actually develop something on top of that and redistribute it.
Katherine Druckman: Yes, that's for sure not open source.
Nick Vidal: As a community manager, I try to hear different opinions. Those who are very passionate about their ideas, from both sides. Right now we're dealing with people who are passionate about having open datasets. And only projects that have open datasets should be considered open source AI.
Katherine Druckman: Yes.
Nick Vidal: While others believe that the open source AI definition as it stands is too open, and some companies, they want to…
Katherine Druckman: Want to lock down the data.
Nick Vidal: Yes, lock down the data. And not just the data, but also the code used to train on the data. I'm trying, as a community manager, together with everyone, to align those different desires and find common ground. It's very challenging, because these people are very passionate about their ideas. Sometimes the community can be divided, but we try to stay calm. We try to respond to and answer misconceptions. Also, there are a lot of technical details, so we try to bring in experts to share their ideas as well. Even though it has its challenges, it has been rewarding work getting to know all those experts and sharing their knowledge.
Handling Feedback and Criticism
Katherine Druckman: I'd like to get into more discussion around the controversy on both sides, and all that in a minute. But just from a practical standpoint, how do you keep track of the conversation? What would you say to another community manager dealing with or going through a period of struggle, or a period of really having to digest a fire hose of information, and feedback, both negative and positive? How do you even keep tabs on it?
Nick Vidal: We decided early on to use Discourse as our forum. And this is where most of the conversations happen. There's a lot of conversations happening also on social media, like LinkedIn…
Katherine Druckman: LinkedIn, yes, I see a lot of it.
Nick Vidal: ... Twitter, and so forth. We try to keep track. Discourse, as you know, is an open source forum. It has a lot of interesting tools that let us analyze the conversations and keep track of them. We try to reward people who are really trying to bring knowledge and add value to the conversations by adjusting their trust levels. In fact, this is part of our code of conduct. If you're just promoting flame wars, that's not good.
Katherine Druckman: How do you define that?
Nick Vidal: It has to add value to the conversation. If it's not adding value to the conversation, or if it's just technical inaccuracies. For example, some folks believe that by having the whole training dataset, you can completely reproduce the parameters of an AI system. And that's not the case. It's not possible. But some people insist on that. At first, we try to educate them, we try to point them to some references. But if they insist, unfortunately, we can't let them keep flooding the forum with that. We try to be as open as possible, and even in certain ways a bit permissive. But if you're too permissive, people can really take over the forum and just flood it with misinformation. We try to handle that as best we can. It's challenging, for sure. Especially with AI, which is such a hot topic, and people have such strong opinions about it.
Katherine Druckman: Well, from what I've followed, a lot of the negative feedback comes from a place of understanding that open source has a very clear definition. It has always carried a certain meaning. It has always generally meant a certain thing. And now you have AI, and it's not just code. You have the training and the model, it's a different thing.
But there is a concern that if open source is used to define something that is different from what it has always meant, that it will somehow lose that meaning. And we will lose the community consensus, in a way. I understand that concern as well, but there have been a lot of very, very vocal opponents to the definition as it currently stands, as it was released as version 1.0. What would you say to those people? I understand nothing is written in stone. These things are living, breathing things, and things can be amended and adjusted. But what do you say to your harshest critics?
Nick Vidal: AI is really a different type of system. It's not just software, as you mentioned. And sometimes we get a lot of criticism from folks who are into open source and they understand open source software. They have this background, the more technical details for software. Usually they translate, they create analogies for AI given their background, their experience around software. But AI systems are like a different beast. Those analogies, those false analogies, usually don't help much. We try to acknowledge that AI systems are different. It's a new model. There are a lot of nuances. There are a lot of questions as well around privacy, around security, around copyright.
It's not just the technical aspects, but also the legal and social aspects. There are so many questions. We try to remain as humble as possible as well. The OSI board is made up of members who have different skills. Some indeed have experience with AI. Others, we have one of the best legal experts as well who is part of the board. But no one has knowledge of everything with regards to AI. I know AI as a field is more than 70 years old. But at the same time, this new wave of AI systems, large language models, is relatively new, and everyone's learning. How open source AI, how open source principles can translate to AI, is quite challenging and involves a lot of questions.
Everyone's learning together. The OSI has this humble position. We're trying to learn as much as possible. My message to critics: thank you for that. Thank you for bringing your criticism. We're hearing your feedback. We're trying to learn. We don't have a full comprehension of all the aspects, all the challenges, all the questions, and we're learning together with you. Hopefully, everyone can have this humble position and try to learn as much as possible and try to come up with a definition that really is going to be beneficial for developers, for AI builders, and for society as a whole.
Overview of Open Source AI Definition
Katherine Druckman: I'll link to the definition that you released today. But can you give us just the overview of the basics? What is open source AI as defined by this new OSI definition?
Nick Vidal: Basically, the open source AI definition is based on the four freedoms as defined by the Free Software Foundation, which are to study, to run, to modify, and to redistribute those changes. And our initial idea was to see how that applies to AI and translate that for AI. We looked at the different components of an AI system, and we asked ourselves, should this component be required, or be optional, to exercise those four freedoms? We evaluated that. We developed a co-design process together with working groups, evaluating different AI systems, and it was a really good exercise. The definition basically highlights those four freedoms and a checklist of these components, determining what should be optional and what should be required.
Katherine Druckman: And what are those things? What is optional and required? What can be closed and still qualify under the open source AI definition?
Nick Vidal: There are so many components in an AI system. We base that on the Model Openness Framework developed by the Linux Foundation, which lists those components. And not everything has to be open, just the essential, core ones that are needed. Basically, we have the model, the parameters, the open weights. We have the code for inference, and also the code for training. We have the data itself, and the data is the most challenging one. Should the datasets be made available, or maybe just data information? And so, evaluating all those components, we created this checklist. Even though a lot of components should be required and should be open, not necessarily everything should be open. From a copyright and a privacy perspective, in some cases they really shouldn't be open.
Katherine Druckman: That is a way of looking at it, certainly. Again, kudos to you for tackling a very, very tough job. I think this is a difficult place to be. What can you tell people about the next year? For example, do you have plans to revisit this definition at regular intervals? Do you have any plans to make any announcements in the future? What can people expect as this evolves?
Future Plans and Community Involvement
Nick Vidal: This whole process started in 2022, so we have been at it for two years. We have organized podcasts, panels, and webinars. We have attended various events around the world, and we plan to continue doing so and getting feedback from open source developers, AI builders, and experts in legal and all other domains. Version 1.0 is just the first version. We plan to hear that feedback and try to find a balance and a consensus. This is really important, because we have legislation around the world that provides some exceptions for open source AI systems, even though there's no clear definition of that as of yet. But the Open Source Initiative is providing a 1.0 version, providing this first definition, and we're going to continue working with this community to reach a consensus.
Katherine Druckman: Well, I think there's an analogy to be made with software, right? Nobody releases a 1.0 version of software and leaves it at that. It's not set it and forget it, right?
Nick Vidal: Yeah.
Katherine Druckman: I can guess that this might be a similar scenario.
Nick Vidal: Exactly. This certainly is going to happen.
Katherine Druckman: Is there anything that you wanted to make sure to mention that you haven't gotten to yet?
Closing Remarks and Invitation to Join
Nick Vidal: I would like to thank everyone who was involved in this process. I would also like to invite people to join this effort. You can join at opensource.org/AI to learn a bit more. We have a forum for discussions. We also have an endorsement page if you want to see who the endorsers are. Several organizations have endorsed it, from the Mozilla Foundation to EleutherAI and OpenForum Europe. We invite people to learn a bit more about this and share their opinions.
Katherine Druckman: Well, thank you. Thank you again. I look forward to seeing how this evolves and watching the various voices weigh in, and then hopefully consensus can be achieved and maintained. I do appreciate the dissent, right? The dissent is important.
Nick Vidal: Yeah.
Katherine Druckman: The dissent is incredibly important to make sure that we get it right.
Nick Vidal: Thank you so much, Katherine, for inviting me. It's a pleasure to be here.
Katherine Druckman: You've been listening to Open at Intel. Be sure to check out more about Intel’s work in the open source community at Open.Intel, on X, or on LinkedIn. We hope you join us again next time to geek out about open source.
About the Guest
Nick Vidal, Community Manager, Open Source Initiative
Nick Vidal is community manager at the Open Source Initiative and former outreach chair at the Confidential Computing Consortium from the Linux Foundation. Previously, he was the director of community and business development at the Open Source Initiative and director of Americas at the Open Invention Network.
About the Host
Katherine Druckman, Open Source Security Evangelist, Intel
Katherine Druckman, an Intel open source evangelist, hosts the podcasts Open at Intel, Reality 2.0, and FLOSS Weekly. A security and privacy advocate, software engineer, and former digital director of Linux Journal, she's a long-time champion of open source and open standards. She is also a content creator with over a decade of experience in engineering, content strategy, product management, user experience, and technology evangelism. Find her on LinkedIn.