Software Bills of Materials, better known as SBOMs, sound dangerous but are meant to keep explosive surprises out of software packages.
This humble but critically important document – sometimes compared to an ingredient list – is gaining importance. The uptick is due, in part, to the 2021 Executive Order on Improving the Nation’s Cybersecurity (EO 14028), which issued strong guidance on requiring SBOMs for government software acquisitions.
Alexios Zavras, Chief Open Source Compliance Officer at Intel, and Kate Stewart, VP, Dependable Embedded Systems at the Linux Foundation*, are SBOM experts and active contributors to the SPDX SBOM standard, one popular format currently in use.
In this interview, they provide key context and useful information all developers should understand about SBOMs.
Host Katherine Druckman:
To start, how would you explain SBOMs in the simplest terms? What problems do they solve?
In the simplest terms, an SBOM is a set of components and the relationships between those components. You want to figure out what's actually in your software and what's in your system, and understanding what components are there is part of that. Now, the term ‘components’ can have a large number of variants, and the term ‘relationships’ can also have a large number of variants, so the task is figuring out what you want and need for your own purposes internally, and what you may want to share externally.
These all may vary over time, and that's the source of a lot of the confusion and frustrations about things right now. At the minimum, though, it's a set of components, which could be files, it could be formal packages, or it could be disk images. Those are all components.
The relationships are what's inside it, what it contains, how it is linked, or what's generating which things. These are all essential and, depending on the use case, you have different levels of fidelity you must get down to.
So, in the simple case, it’s just components and relationships -- then the devil’s in the details.
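To make "components and relationships" concrete, here is a minimal sketch in Python. It is not the SPDX data model or any real SBOM library: the class names, the component names, and the `kind` strings are all invented for illustration, and the `CONTAINS` relationship label is only loosely modeled on the kind of vocabulary SBOM formats use.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    """One item in the SBOM (hypothetical model, not an SPDX type)."""
    name: str
    kind: str  # e.g. "file", "package", "disk-image"


@dataclass
class Sbom:
    """A set of components plus typed relationships between them."""
    components: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add(self, component: Component) -> None:
        self.components[component.name] = component

    def relate(self, source: str, rel_type: str, target: str) -> None:
        # Only allow relationships between components that are listed.
        for name in (source, target):
            if name not in self.components:
                raise KeyError(f"unknown component: {name}")
        self.relationships.append((source, rel_type, target))

    def targets(self, source: str, rel_type: str) -> list:
        # Everything `source` points at through one relationship type.
        return [t for s, r, t in self.relationships
                if s == source and r == rel_type]


# Illustrative data only: a disk image that contains a package,
# which in turn contains a file.
sbom = Sbom()
sbom.add(Component("firmware.img", "disk-image"))
sbom.add(Component("libexample", "package"))
sbom.add(Component("libexample/core.c", "file"))
sbom.relate("firmware.img", "CONTAINS", "libexample")
sbom.relate("libexample", "CONTAINS", "libexample/core.c")

print(sbom.targets("firmware.img", "CONTAINS"))
```

The three component kinds mirror the ones named above (files, packages, disk images). How fine-grained the components and relationship types need to be depends on the use case.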
Expanding on that, in the simplest form it's just the table of contents.
The same way that when you buy prepackaged food it has a food label and it says what’s in it, when you get your software, which nowadays is always complex and has lots of things, you should have some way of telling what's inside.
The bill of materials is the list of components that are inside. But then, as Kate said, we can expand this and say not only what is inside but what was used in order to create it, which doesn't necessarily end up inside the final product.
The whole concept of a bill of materials has been around in the hardware industry forever. The concept is there. You're sending a box out and you're sending exactly which boards are in the box, which silicon revs are in the box.
So, a bill of materials is something like what you get from a shipper: you know exactly what should be in the box. This is just taking that concept into the software space.
I hear the most about SBOMs in a security context, but its origins in licensing are interesting as well.
At the heart of licensing you need to know the components and the relationship between the components. It's the same for security, so you may as well use what's already been put in place and extend it if there's something that's missing.
That's pretty much how SPDX has been growing through the years as we've been talking to people, and if we can't show them how to use SPDX to represent their use case today, we consider adding in more of what they need in order to do their use case. It's very organic.
What are some other use cases outside security and licensing?
Export control standards. Some people want to report what standards have been complied with. Some of these things are making their way into future versions now.
Another case is coming up from machine learning. How do you represent what has been trained into your data model so that you can reproduce it? This may or may not be a security issue the same way everything else is, but being able to extend into those types of use cases is the direction we're certainly heading with SPDX. As in, “How do you build your system?” What tool chains, what options, who's been doing the builds, and things like that.
We all need that for provenance, so some of it gets into security and some of it gets used in other places too, like safety, and so things that are safety critical need this information as well.
So if an SBOM is effectively a table of contents or a food label, just a document indicating what's in your software, how is provenance represented?
Software is even more complicated than food. In food you typically want to know if this packaged food contains peanuts because someone might be allergic, and you see on the label whether it contains peanuts or not.
But what if you wanted to know not only that it does not contain peanuts, but also that it contains sugar that was originally processed using some particular method? You don't just want information about what is inside; you want information about how the final product came to be. You want to record the generation information -- this is what you usually call provenance. “We take this part” and “We use these tools” and “We generate this stuff.”
There are cases where you won't know the complete history of how a software system was created.
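One way to picture this generation information is as a list of recorded build steps that can be walked backward from a final artifact. The sketch below is a hypothetical illustration, not how SPDX or any particular tool represents provenance. `BuildStep`, `trace_provenance`, and the file and tool names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildStep:
    """'We take these parts, use these tools, and generate this output.'"""
    output: str
    inputs: tuple
    tools: tuple


def trace_provenance(artifact: str, steps: list) -> tuple:
    """Walk recorded build steps backward from `artifact`.

    Returns (ancestors, no_recorded_origin): everything that contributed
    to the artifact, and the subset for which no build step is recorded.
    """
    by_output = {step.output: step for step in steps}
    seen = set()
    no_recorded_origin = set()
    stack = [artifact]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        step = by_output.get(current)
        if step is None:
            # History stops here: a source file, a tool, or a gap in the record.
            no_recorded_origin.add(current)
            continue
        stack.extend(step.inputs)
        stack.extend(step.tools)
    seen.discard(artifact)
    return seen, no_recorded_origin


# Illustrative records only.
steps = [
    BuildStep("app.bin", inputs=("app.o", "libfoo.a"), tools=("ld",)),
    BuildStep("app.o", inputs=("app.c",), tools=("gcc",)),
]
ancestors, no_recorded_origin = trace_provenance("app.bin", steps)
print(sorted(ancestors))
print(sorted(no_recorded_origin))
```

Note that the tools show up as ancestors even though they do not end up inside the final product, and that `libfoo.a` lands in the "no recorded origin" set: the record simply does not say how it was built.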
The reason you're hearing so much about this in a security context right now is that attackers have recognized that the supply chain and its components are a fairly rich target.
Sonatype* puts out an annual report, and the 2022 report found a 742% average yearly increase in software supply chain attacks since 2019 -- so it's a pretty ripe area.
When you think about it, if the tool chains you use to build something have been compromised, that’s suddenly a large swath of things for people to take advantage of. That’s why hardening our tool chains and hardening the flows and knowing exactly which version you've been using for building things and which components are used to create other components is crucial.
Let’s talk a little bit about the US executive order in 2021 requiring guidance on software supply chain security. From your perspective, what changed between 2014, when the Cyber Supply Chain Management and Transparency Act failed to pass, and 2021?
I think it was the wake-up calls with the various vulnerabilities starting to play a role, and then the time to remediate these vulnerabilities started getting attention.
That has also increased.
Right, and it was in the regulatory sphere. In particular, the FDA got very interested in this whole space, which suddenly moved a lot of the medical device manufacturers and the hospitals to get interested. So, the National Telecommunications and Information Administration (NTIA) took the lead back in 2018 on getting people talking to each other so that, rather than every industry segment creating its own SBOM format, they could try to agree on some common principles.
What emerged in 2021 had been in progress since about 2018 and was trying to build up industry consensus before the spotlight hit to say, “OK, what is a minimum SBOM?” What the NTIA ended up publishing as their guidance wasn’t quite what the stakeholders came up with.
We asked for things like hashes to make sure things hadn’t been tampered with, and that was a step too far, apparently, but they were working toward this and trying to understand what are these ingredients and what are the relationships, and how do we expand it out?
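As a rough illustration of why hashes help detect tampering, the sketch below records a SHA-256 digest for each component and later compares it with the content actually received. The function names and the in-memory "files" are assumptions for the example; the interview does not specify an algorithm or a procedure.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a component's content."""
    return hashlib.sha256(data).hexdigest()


def find_tampered(recorded: dict, received: dict) -> list:
    """Names whose received content is missing or no longer matches
    the digest recorded for them."""
    suspect = []
    for name, expected_digest in recorded.items():
        data = received.get(name)
        if data is None or sha256_hex(data) != expected_digest:
            suspect.append(name)
    return sorted(suspect)


# Illustrative content only: digests recorded when the software was built.
original = {"core.c": b"int main(void) { return 0; }\n", "README": b"hello\n"}
recorded = {name: sha256_hex(data) for name, data in original.items()}

# Later, one component arrives with modified content.
received = dict(original)
received["core.c"] = b"int main(void) { return 1; }\n"

print(find_tampered(recorded, received))
```

A check like this only shows that content changed relative to the recorded digest. It says nothing about whether the recorded digest itself can be trusted, which is a separate problem.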
So, the NTIA guidance was a minimum, and there's a lot more out there than that. Obviously SPDX, because of its history, can represent nuances and details on the licensing side more than any of the others can, as well as on the relationship side. For the most part, the stuff is there and building up on it...
Regulatory bodies have been getting interested, and that’s shifting the industry a bit. The executive order basically summarized it and then got the power of US government acquisition behind it, so lots of companies suddenly started paying attention. Up to that point, only the really forward-thinking ones planned for it and knew that they had to do it for other reasons. They didn't really make it visible, but for all the ones who had the forethought to do this work, like Intel, it's now there and they can take advantage of it.
Yeah, one of the important points that Kate mentioned is that when we're talking about a software bill of materials, people should not think that this is restricted to software companies that release software. Nowadays, almost everything that you're getting includes software. Whether it's a medical device, a car, a satellite, or a train engine, everything includes software.
About the Author
Katherine Druckman, an Intel Open Source Evangelist, is a host of podcasts Open at Intel, Reality 2.0 and FLOSS Weekly. A security and privacy advocate, software engineer, and former digital director of Linux Journal, she's a long-time champion of open source and open standards.