Hardening Security of Hardware IPs by Verifying Negative Space Formally

Posted September 9, 2025

By

Sayak Ray, INT31

Negative testing is one of the essential components of security assurance. Its basic premise is to check the behavior of the Design Under Test (DUT) not only under expected inputs, but also under unexpected ones. Expected inputs (also termed good or positive inputs) constitute the class of inputs the designer considers valid for the functionality of the DUT. All other possible inputs are deemed by the designer to be meaningless to the DUT's functionality; hence they are called "unexpected" inputs and, for us security researchers, "malicious" or "negative" inputs. Ideally, the DUT should respond gracefully to both positive and negative inputs. In practice, however, DUTs often fail to handle negative inputs properly, which may lead to security issues. This is why systematic negative testing is so important for security assurance: it boosts confidence in the DUT's ability to handle negative inputs.

Can Software Fuzzing be Useful for Hardware?

Negative testing has been well known in the software security community since the early days of buffer overflow attacks. Software fuzzing has been adopted universally to perform negative testing of software. While fuzzing is phenomenally successful for software, it is far less effective for hardware. Formal Verification (FV) comes to our rescue instead, effectively complementing random simulation for negative testing of hardware. Let us first see why fuzzing is not effective for hardware.

Why do Software Fuzzing Ideas Not Work for Hardware Negative Testing? 

While the notion of negative testing applies equally to hardware, the details of how negative testing should be performed for hardware differ significantly, primarily due to the three following attributes:

  1. Complex hierarchical nature of hardware
  2. The way inputs are structured and processed by hardware
  3. The way the security impact is manifested in hardware upon consumption of negative inputs

Let us elaborate on the last point a bit. Memory corruption issues still account for the lion's share of software security vulnerabilities, and fuzzing offers an effective, generic mechanism for detecting them by inducing memory corruption events (e.g., a crash or a hang) during dynamic simulation. Hardware, on the other hand, does not always manifest such generic misbehavior upon receiving negative inputs. Instead, hardware may silently jump to an undesired state whose effect becomes architecturally visible only after many clock cycles, and only a specialized, logic-aware checker (or assertion) can detect that something wrong has happened. This is why we need a more disciplined approach involving characterization of the negative space for hardware, documentation of it, and deployment of the most effective negative testing technique depending on the situation. This whole paradigm is called negative space engineering.1 In addition to conventional assertion-based, constrained random simulation, we employ formal verification to effectively complement simulation in various negative testing scenarios for hardware.
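To make the "silent bad state" point concrete, here is a minimal, entirely hypothetical Python sketch: a toy DUT with an invented bug that arms a delayed unlock on an unexpected opcode. The misbehavior becomes visible only several cycles after the negative input arrives, and only a logic-aware checker that knows the security intent catches it. All names and the FSM itself are illustrative, not from any real design.

```python
class TinyDut:
    """Toy DUT: a lock that should open only after the exact opcode UNLOCK."""
    UNLOCK, NOP, BAD = "UNLOCK", "NOP", "BAD"

    def __init__(self):
        self.state = "LOCKED"
        self.delay = 0

    def clock(self, opcode):
        if self.state == "LOCKED" and opcode == self.UNLOCK:
            self.state = "OPEN"
        elif self.state == "LOCKED" and opcode == self.BAD:
            # Invented bug: unexpected opcode silently arms a delayed unlock.
            self.state = "ARMED"
            self.delay = 5
        elif self.state == "ARMED":
            self.delay -= 1
            if self.delay == 0:
                self.state = "OPEN"   # misbehavior surfaces cycles later

def logic_aware_checker(dut, opcode_history):
    # Security assertion: OPEN is legal only if UNLOCK was ever received.
    assert not (dut.state == "OPEN" and "UNLOCK" not in opcode_history), \
        "security violation: opened without UNLOCK"

dut = TinyDut()
history = []
violation_cycle = None
for cycle, op in enumerate(["NOP", "BAD", "NOP", "NOP", "NOP", "NOP", "NOP"]):
    dut.clock(op)
    history.append(op)
    try:
        logic_aware_checker(dut, history)
    except AssertionError:
        violation_cycle = cycle
        break

print(violation_cycle)  # → 6: five cycles after the BAD opcode at cycle 1
```

There is no crash or hang here for a fuzzer to latch onto; without the checker, the negative input would pass unnoticed.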

For example, let us consider an interface that sends input packets to a hardware filter, which consumes these inputs to construct protocol packets that will eventually reach some internal registers. Let us assume that the interface can be controlled by malicious actors and hence can send any arbitrary sequence of packets. Suppose the hardware filter expects packets in the sequence shown below in Figure 1 as the "Functional Legal Flow" (an INIT packet, followed by a PAYLOAD packet containing a tag (metadata), and another PAYLOAD packet containing the message body). In SystemVerilog Assertions (SVA), we may capture this as follows (omitting some details and making some simplifying assumptions):

property functional_legal_flow;
  @(posedge clk)
  (packet.header == INIT) ##[0:$]
  (packet.header == PAYLOAD) ##[0:$]
  (packet.header == PAYLOAD);
endproperty
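For readers less familiar with SVA, the property above can be approximated by a small Python matcher: an INIT header, then eventually a PAYLOAD, then eventually another PAYLOAD. This is a rough sketch of the temporal intent only (the exact `##[0:$]` cycle semantics are simplified), and the header names mirror the assertion above.

```python
INIT, PAYLOAD = "INIT", "PAYLOAD"

def matches_legal_flow(headers):
    """True if the header stream contains INIT, then PAYLOAD, then PAYLOAD, in order."""
    expected = [INIT, PAYLOAD, PAYLOAD]
    i = 0
    for h in headers:
        if i < len(expected) and h == expected[i]:
            i += 1   # advance to the next expected header
    return i == len(expected)

print(matches_legal_flow([INIT, PAYLOAD, PAYLOAD]))   # → True
print(matches_legal_flow([PAYLOAD, INIT, PAYLOAD]))   # → False: no INIT first
```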

Now, what would happen if any one of the two illegal sequences shown in the figure was sent to the filter instead of the functional legal flow?

Suppose the tag contains a security attribute based on which the packet would either be passed or dropped. So, a functional legal and secure flow must satisfy the following property:

property functional_legal_secure_flow;
  @(posedge clk)
  (packet.header == INIT) ##[0:$]
  (packet.header == PAYLOAD && trusted(packet.body) == TRUE) ##[0:$]
  (packet.header == PAYLOAD);
endproperty

 

Will the filter successfully catch the second case where there is no tag in the packet? We will certainly need a dedicated checker to verify the behavior of the filter when such a sequence arrives.

For the third case where there is a tag in the second packet, but no INIT packet – will the filter detect it? Will the filter abort the sequence? What would the security impact be, if such a sequence is accepted?  

What about any other possible sequence of packets? Do any of these cases trick the filter into accepting packets with an inadequate security attribute?

Now a property like the one below captures the essence of an illegal flow in just one assertion:

property illegal_unsecure_flow;
  @(posedge clk)
  !((packet.header == INIT) ##[0:$]
    (packet.header == PAYLOAD && trusted(packet.body) == TRUE) ##[0:$]
    (packet.header == PAYLOAD));
endproperty

 

Figure 1: A Functional Legal Flow and A Couple of Illegal Flows

This captures both illegal flow scenarios of Figure 1, along with all other possible permutations, obviating the need to create targeted simulation vectors for individual illegal scenarios. This is exactly the forte of formal verification, which offers a temporal logic to capture requirements in an unambiguous way and, with its ability to produce counterexamples, can demonstrate what happens when unexpected inputs arrive. In fact, formal verification can go beyond input validation for hardware, and we will get to this discussion later.
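A quick back-of-the-envelope enumeration illustrates why one negated property is so valuable. In this hypothetical sketch, we reduce each packet to one of three header kinds (INIT, a PAYLOAD carrying a trusted tag, a PAYLOAD without one) and enumerate every three-packet sequence; all but the legal-and-secure ones fall into the negative space covered by the single assertion. The classification function is an invented stand-in for the property above.

```python
from itertools import product

INIT, PAYLOAD_TAGGED, PAYLOAD_UNTAGGED = "INIT", "PAYLOAD_TAGGED", "PAYLOAD_UNTAGGED"
HEADERS = [INIT, PAYLOAD_TAGGED, PAYLOAD_UNTAGGED]

def legal_secure(seq):
    # INIT, then a PAYLOAD carrying a trusted tag, then a PAYLOAD body.
    return (seq[0] == INIT and seq[1] == PAYLOAD_TAGGED
            and seq[2] in (PAYLOAD_TAGGED, PAYLOAD_UNTAGGED))

all_seqs = list(product(HEADERS, repeat=3))        # every 3-packet sequence
illegal = [s for s in all_seqs if not legal_secure(s)]
print(len(all_seqs), len(illegal))                 # → 27 2: 25 of 27 sequences are illegal
```

Even in this toy setting, 25 of 27 sequences are negative; a directed-simulation approach would need a vector for each, while the negated property covers them all at once.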

Different Types of Negative Space

The above example of illegal input sequences is one type of negative input. Hardware negative space can take different forms, as it can be observed at different layers of abstraction. For example, it may be defined at the IP interface or packet exchange level (e.g., an illegal combination of fields in protocol packets), at the transaction level (e.g., an unexpected sequence of packets in a transaction, as shown in Figure 1), at the firmware level (e.g., illegal use of a mailbox interface with unsupported opcodes or out-of-bound data), or at the configuration level (e.g., an improper combination of enabled and disabled features). In this blog post, we will confine ourselves to the first category of negative space, namely the IP interface or packet exchange level.

Well-Formed vs. Malformed Packets

IPs communicate with each other by exchanging packets containing metadata and payloads. This packet exchange typically follows a protocol, either public or proprietary. A packet constructed according to the protocol rules is called a well-formed packet. A packet violating one or more protocol rules is called a malformed packet. A malformed packet may cause a breakdown of the protocol communication, such as a deadlock. Such malfunctions are often easy to observe. Other effects of malformed packets may lead to silent security issues such as information leaks or illegal accesses. These issues may go unnoticed unless a carefully constructed monitor is in place. We therefore need a rigorous verification regimen for any IP that interacts with its environment using a protocol and can potentially receive or send malformed packets.
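The well-formed/malformed distinction can be sketched as a rule set: a packet is well-formed only if it satisfies every protocol rule, and violating any rule makes it malformed. The field names and the three rules below are invented for illustration; real protocols have many more.

```python
# Hypothetical protocol rules: each maps a rule name to a predicate on a packet.
RULES = {
    "known_opcode":  lambda p: p["opcode"] in {0x1, 0x2, 0x3},
    "length_bounds": lambda p: 0 < p["length"] <= 64,
    "reserved_zero": lambda p: p["reserved"] == 0,
}

def malformedness(pkt):
    """Return the list of protocol rules the packet violates (empty if well-formed)."""
    return [name for name, rule in RULES.items() if not rule(pkt)]

good = {"opcode": 0x2, "length": 16, "reserved": 0}
bad  = {"opcode": 0x7, "length": 16, "reserved": 0}   # unknown opcode
print(malformedness(good))   # → []
print(malformedness(bad))    # → ['known_opcode']
```

Listing the violated rules, rather than returning a single pass/fail bit, mirrors how malformedness criteria can later be turned into individual assertions.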

Where Should We Use FV and Why?

The decision of whether to use dynamic simulation or formal verification often requires nuanced judgement depending on various factors. Our target DUTs behave either as transmitters or as receivers (or both) of protocol packets. When behaving as a receiver, a DUT needs to ensure that it handles both well-formed and malformed inputs properly; more precisely, it must have a safe and secure error handling policy implemented for malformed packets. Whether such error handling logic is in place and doing the "right thing" can often be checked using simulation by sending specific malformed packets and noting the response of the IP. While complete verification is still not guaranteed, simulation can be effectively used as a first layer of verification to catch some early bugs in error handling logic.

Since simulation can be reasonably effective in receiver mode verification, can formal verification still offer any unique value? The short answer is ‘yes’. Some advantages of FV are as follows:
 

  • Completeness: Formal verification performs an exhaustive search over the target state space. It can be used for verification of error handling logic on top of simulation to achieve augmented coverage for critical IPs.
     
  • Ease of use: Once the negative space specification is available as formal properties, a module can be tested as soon as it is developed. We do not have to wait for the module to be placed in a larger design environment so that simulation vectors can be sent from a known interface. This is particularly important for deeply embedded modules which are hard to trigger for corner cases from the interface using simulation.
     
  • Ease of expressing/relaxing constraints: Negative space in hardware can mostly be reached by relaxing assumptions/constraints on inputs. Such an unconstrained input space becomes combinatorially challenging to enumerate. Formal specifications, on the other hand, offer a crisp way of capturing such a large space of unconstrained inputs. For example, "illegal_unsecure_flow" for Figure 1 above captures the whole negative space in just one assertion. This increases the portability, maintainability, and auditability of a negative space specification.
     
  • Proving a fixed design: Once a bug is found and fixed by the designer, we need to establish that the fix works correctly for all inputs and has not introduced any new bug. Due to its incomplete nature, simulation cannot offer such a "proof". Formal verification should be considered a viable method in such cases.

Figure 2: Negative Space Verification problem for a receiver and transmitter.

Figure 2 presents a schematic representation of the receiver mode and transmitter mode verification problems for the negative space. While simulation and formal verification both have merits for receiver verification, simulation cannot handle the transmitter verification problem. Verification of a transmitter should prove that the transmitter never generates a malformed packet and/or sends a malformed packet into the downstream fabric. While simulation can randomly check some flows, an exhaustive proof covering all possible flows of a transmitter IP might only be feasible using formal verification.
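The transmitter-side proof obligation can be sketched explicitly: explore every reachable state of the transmitter and check that no reachable state can emit a malformed packet. Random simulation only samples this space, whereas exhaustive exploration (which an FV tool performs symbolically rather than by explicit enumeration, as here) covers all of it. The toy transmitter FSM below is invented purely for illustration.

```python
from collections import deque

def step(state, inp):
    """Toy transmitter: returns (next_state, emitted_header_or_None)."""
    if state == "IDLE" and inp == "go":
        return "SENDING", "INIT"
    if state == "SENDING":
        return "IDLE", "PAYLOAD"
    return state, None

INPUTS = ["go", "nop"]
WELL_FORMED = {"INIT", "PAYLOAD", None}   # None means "nothing emitted this cycle"

def explore(start="IDLE"):
    """Breadth-first exploration of all reachable states, collecting every emission."""
    seen, frontier, emitted = {start}, deque([start]), set()
    while frontier:
        s = frontier.popleft()
        for inp in INPUTS:
            nxt, out = step(s, inp)
            emitted.add(out)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return emitted

print(explore() <= WELL_FORMED)   # → True: no reachable state emits a malformed header
```

For real transmitters the state space is far too large to enumerate explicitly, which is precisely why the symbolic exhaustiveness of formal verification is needed.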

Verification Approach and Results

At INT31, we support FV for negative space verification of hardware IPs wherever we see fit. With this new use model of FV, we are:
 

  • Developing systematic, formal specifications of negative space for various protocols. This specification capability is remarkable in the sense that we no longer have to work out what could trigger every "corner case" behavior, which would have been necessary with test vector generation for simulation.
     
  • Completely proving many properties on many IPs.
     
  • Finding bugs early in the design cycle (mostly related to wrong error handling for negative inputs).
     

Some of the types of bugs we found on various IPs with the help of FV are:
 

  • Missing error detection logic
     
  • Error detected, but not asserted
     
  • Malformed packet not dropped
     
  • Implementation mistake in the error detection logic
     
    • Error conditions are not implemented fully
       
    • Error handling state machine is in wrong state or undefined state
       
    • Implementation and spec do not match
       

The IPs under verification were configured as receivers. During our experiments, we observed all the benefits as described above.

Our overall methodology was no different from standard FV runs for functional verification, as shown in Figure 3. Perhaps not surprisingly, we encountered similar design complexity challenges, which we will discuss in depth later. First, let's focus on the standard verification flow that we used to verify negative space properties associated with various protocols.

Figure 3: Standard FV flow and complexity handling tactics 

During this first effort, we used Compliance Monitors (CompMon in short) for interface protocols. CompMons are typically developed for functional verification to make sure that an IP functions correctly while interacting with its environment over a particular protocol. We note that CompMons can capture the negative space of the underlying protocol easily by disabling some (or all) assumptions. The primary benefit of using a CompMon is that the assumptions and assertions are well organized, reusable, and easy to extend. From an operational standpoint, there is typically one team that develops and maintains a CompMon. In the process, that team gains in-depth knowledge about the protocol and its signal-level behavior, so it can serve as a central knowledge hub for the other teams using the CompMon. This is immensely helpful for advanced debugging of complex protocol scenarios.

As we applied different CompMons to different IPs from our SoCs, we encountered a few challenges with the scalability of CompMon-based verification. We note that CompMon-based verification is not the only approach for negative space verification: similar results may be obtained through custom assumptions and assertions when a CompMon is not available.

Compliance Monitors a.k.a. CompMon

A Compliance Monitor (CompMon in short) is a verification artifact used to ensure that a DUT complies with certain prerequisites during verification. When we verify a DUT that receives/transmits information from/to its surrounding logic following a specific protocol, we need to construct a verification environment around the DUT that faithfully mimics that protocol. Otherwise, the operation of the DUT will break down during verification and no meaningful verification result will be generated. A CompMon is such an environment: it mimics a protocol through assumptions and assertions. Assumptions (a.k.a. constraints) help generate packets that comply with the protocol before they are sent to the DUT. Assertions, on the other hand, check that the packets generated by the DUT follow the protocol.
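The two roles can be sketched side by side in a small, purely illustrative Python model: the assumption role narrows the stimulus space to protocol-compliant packets before they reach the DUT, while the assertion role flags any non-compliant packet the DUT emits. The single "valid header" rule is invented.

```python
VALID_HEADERS = {"INIT", "PAYLOAD"}   # hypothetical protocol rule

def assume_compliant(candidate_inputs):
    # Assumption role: restrict the stimulus space to protocol-compliant packets.
    return [p for p in candidate_inputs if p["header"] in VALID_HEADERS]

def assert_compliant(output_packet):
    # Assertion role: flag any DUT output that violates the protocol.
    assert output_packet["header"] in VALID_HEADERS, \
        "DUT emitted a non-compliant packet"

stimulus = [{"header": "INIT"}, {"header": "GARBAGE"}, {"header": "PAYLOAD"}]
print(len(assume_compliant(stimulus)))   # → 2: the GARBAGE packet never reaches the DUT
```

Under an FV tool, the assumption is of course a symbolic constraint rather than a filter over an explicit list, but the division of labor is the same.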

Figure 4: CompMon for Positive Space Verification

Figure 5: CompMon for Negative Space Verification

Figure 4 shows a schematic diagram of an IP (DUT) surrounded by the virtual environment created by a CompMon, and how that can be used for positive space verification. Constraints, or assumptions, create protocol-compliant "valid" inputs and send them to the DUT. Any outgoing packet from the DUT is checked by assertions for protocol compliance. Figure 5 shows what needs to change if we want to use the same CompMon for negative space verification. The constraints or assumptions now need to be relaxed to allow packets that do not conform to the protocol (hence coming from the negative space). Similarly, the outgoing packets need to pass through more checks to make sure that the DUT is not generating packets that would fall into the negative space. Malformed packets, being one type of negative space input, fit perfectly into this framework. We can construct malformed packets by disabling one or more assumptions, and introduce new assertions that catch whether a packet is malformed based on a given set of malformedness rules.

Leveraging Compliance Monitor for Negative Space Verification

Consider a CompMon developed for an interface protocol. Like other CompMons, this one too was originally developed for positive space verification. There is one crucial step involved in adapting such a CompMon for negative space verification: relaxing the input constraints.

Relaxing input constraints: A CompMon comes with a set of SystemVerilog assumptions to ensure that the incoming inputs comply with the protocol, i.e., that the inputs fall into the positive space. Depending on the threat model, these constraints may need to be relaxed to permit a certain level of invalid behavior. For example, no constraints should be assumed if the inputs are received from untrusted external pins. However, if the only untrusted vector is the firmware-controlled data payload, we can assume proper transaction handshaking.

To facilitate a more efficient relaxation of the input constraints, we developed a TCL script that allows users to disable the associated SystemVerilog assumptions for various rule categories within the protocol. Users can selectively relax the input constraints based on the IP threat model and their verification objectives. This can range from allowing specific types of unsupported messages to permitting arbitrary ordering of message sequences.
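The selective relaxation described above can be modeled in a few lines. This is a hypothetical Python sketch of the idea only (the actual flow uses a TCL script driving the FV tool, whose details are not shown here): assumptions are grouped by rule category, and a user-chosen set of categories is disabled to open up exactly that slice of the negative space. All category and assumption names are invented.

```python
# Hypothetical grouping of SystemVerilog assumptions by protocol rule category.
ASSUMPTIONS = {
    "opcode_rules":   ["asm_known_opcode", "asm_opcode_phase"],
    "ordering_rules": ["asm_init_first", "asm_payload_order"],
    "field_rules":    ["asm_reserved_zero"],
}

def active_assumptions(disabled_categories):
    """Assumptions left enabled after relaxing the selected rule categories."""
    return [a for cat, asms in ASSUMPTIONS.items()
            if cat not in disabled_categories
            for a in asms]

# Threat model: the attacker controls message ordering only, so only the
# ordering rules are relaxed; all other protocol assumptions stay in force.
print(active_assumptions({"ordering_rules"}))
# → ['asm_known_opcode', 'asm_opcode_phase', 'asm_reserved_zero']
```

Keying the relaxation on rule categories rather than individual assumptions keeps the mapping from threat model to constraint set small and auditable.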

Scalability Challenge with CompMon

Since a CompMon is developed against interface signals, its assertions and assumptions work out of the box on any IP that implements the protocol. Usually there is no need to write new assertions or modify existing ones; even when that is needed, the centralized team that maintains the CompMon code can introduce the necessary changes on the IP team's behalf. However, the downside of using an off-the-shelf CompMon is that it occasionally produces undecided results (bounded proofs) in place of a full proof or counterexample. This is mainly due to the large amount of logic handed to the formal verification tool when the CompMon is bound at the protocol interface.

One solution to this problem is to work at the submodule level. Another option is to perform rigorous functional coverage analysis to gain confidence from the bounded proofs. The two possibilities are shown in Figure 3 above. An IP usually has a submodule responsible for generating the packets that will be injected into the fabric. This submodule is significantly smaller than the full IP under verification. Therefore, if we can capture the malformedness criteria as formal properties around the submodule, verification is expected to be much faster. A validation effort executed on one IP established this point. The blue box in Figure 3 illustrates this situation.

We performed such submodule-level verification on one IP. This work considered six security requirements around malformed packets. While the CompMon properties were not converging on the whole module, running the FV tool on the smaller submodule helped us uncover several RTL and FW bugs. In this first effort, it took the team approximately 1.5 months to develop the formal properties for the six requirements. But once all the bugs were fixed, it took only 12 minutes to prove that the submodule logic is correct.

Conclusion and Next Steps

We discussed why negative space testing is important for security hardening of hardware IPs, and how FV and CompMons complement the conventional methodology well. Yet, to accelerate adoption, we need to overcome the scalability challenges illustrated above. This is going to require collaboration among many stakeholders. We expect architects, designers, and verification experts from various semiconductor companies including Intel, the Electronic Design Automation (EDA) industry, assurance communities such as the CWE forum, and academia to come together to tackle these challenges and to develop a scalable verification methodology for negative space properties. With the rise of AI, there are many opportunities to further help designers embrace FV by lowering the entry barriers. More research is needed to advance LLMs' abilities to define requirements, generate properties, create CompMons, inspect waveforms, triage counterexamples, and suggest fixes. We would love to partner with our community to tackle these challenges together and make negative space verification a mainstay of security hardening practice.

Acknowledgement

The content presented in this blog is the result of my close collaboration with a dedicated team of Intel researchers and verification engineers. I extend my sincere gratitude to Bo-Yuan Huang, Mohammad (Mahdi) Bidmeshki, Arun Kanuparthi, Mehdi Mohtashemi and Jason Fung, who have been my principal collaborators in conceptualizing and advancing the negative space formal verification methodology discussed herein, as well as in driving it within INT31. I would like to acknowledge Bo-Yuan Huang for his valuable assistance with specific sections of the blog. Vedprakash Mishra, Pervaiz Muhammad, Blake Buschur and many other verification engineers from various groups at Intel actively contributed to producing negative space verification results that I discussed in the blog. My thanks also go to Vivek Tiwari, Justyna Chilczuk, Thais Moreira Hamasaki and Priyam Biswas for sharing valuable feedback with me on negative space verification in general as well as for the blog itself. Lichen Weng, Vicky Duerk and Purushottam Chaturvedi made valuable contributions in the early days of our negative space verification research. Additionally, a larger group of architects, designers and product security experts - too numerous to mention individually - have continuously helped us navigate the design-specific details involved in our endeavor.
 

Share Your Feedback

We want to hear from you. Send comments, questions, and feedback to the INT31 team.

About the Author

Sayak Ray is an Offensive Security Research Engineer/Scientist in Intel’s INT31 team. His research interest includes developing and deploying automated techniques for security verification of hardware and firmware. His current focus is to drive formal verification for various kinds of pre-silicon hardware security requirements. Before joining Intel, he was a Post-doctoral Research Associate at Princeton University. Sayak received his PhD from UC Berkeley.