MyClassNotes: Hardware Security - Hardware Trojan Detection and Trusted IC Design

Hardware Trojan (HT) and Trusted IC

Welcome back to Hardware Security. This time I'll go through the hardware Trojans, and trusted integrate circuits. From this title, you know that we're going to cover both hardware Trojans and trusted integrate circuits. For hardware Trojan, we're going to give the definition of hardware Trojans and we're going to study the taxonomy of hardware Trojans. It is very important to do such study, because instead of study dozens or hundreds of different hardware Trojans we can focus on several different categories and within each different, each category we can study the similarities between different hardware Trojans. We can capture their physical characteristics. And we develop benchmarks to do the testing. And more importance, we can, we can do, we can develop on how to work for the detection mechanisms. For trusted integrated circuits, we're going to give the definition of trusted integrated circuits. We're going to talk about how to build trust in integrated circuits design. In particular, we're going to talk about how to prevent hardware Trojan insertion. We do need a lot of background to understand this material. Start from the traditional digital logical design to crypto algorithms. And in some extent included the heart, the digital watermarking and fingerprints techniques, we have discuss, discussed earlier. However, the good news is since you are all here you're you know, in very good shape to proceed and learn this very exciting material. Hardware Trojans they are any additions or modifications to a system or to a circuit for malicious intentions. So, from this very simple definition, we can see two characteristics of a hardware Trojan. First, a hardware Trojan must have malicious goals. And the most common malicious goes for Hardware trojan includes, trying to change the circuit's functionality, trying to control the system, or trying to steal sensitive information such as secret key from the system. Also, some of the hardware trojans. They are trying to reduce the circuit's reliability, and by doing this a system will start malfunctioning before it's lifetime expires. Also from the definition, we know that a hardware Trojan must be added modified intentionally. So this basically tells us for example, the, the traditional designs, we have a lot of don't cares and we have discussed earlier, so we can introduce a lot of back doors as security vulnerabilities during the design. These are not intentionally introduced into the system, so they are not considered at hardware Trojans. And also we talk about a scan chan which is the testing, which is part of the testing circuitry. These I intentionally added to the system but they're not added for malicious purpose. So scan chan are not hardware Trojan as well. Trusted integrated circuits on the other hand. Is a system or is a integrated circuit, that does exactly what is asked to do, no more and no less. It is a very simple definition, but it is really very hard to, to assess the trustworthiness of an integrated circuit from this definition. So let's start with several examples of the system which are not trusted. So first, if a system fails, fails to deliver certain required functionalities, such searchers or such system cannot be trusted. second, if a system has hardware Trojan inside it. This system cannot be trusted either, because this system does something more, and these things are malicious hardware Trojans. So, from this definition, we want to ask it, the question, so, does such trusted integrated circuits really exist? As we have seen, the traditional digital logical design. When we specify the systems, normally we have a lot of don't cares. Even if you fully specify the system during the design design process, you may introduce some internal don't care conditions such as the satisfy ability don't cares or the other ability don't cares we have discussed earlier. So with these don't cares, and the system is going to give some values for the don't care conditions, and those are not specified so whenever they have a need to values, they will be considered as more to the system spec. So in that sense my argument here is such trusted integrated circuits, following this definition does not exist. So, this brings us to in some sense, my definition of trusted Integrated Sys, Ci, Circuit. So first of all, it must satisfy this condition, no less. So, for all of the required functionalities, all the required specifications, the, the system or the circuit has to satisfy those. And second, instead of no more, we define this trusted IC to be, it doesn't do any malicious more. So in some sense, if we know all the, all the possible attacks or all the possible vulnerabilities or all the possible bad intentions that the, that the attacker can put into the system, if you can verify none of these things exist in the system, then the IC can be considered as trusted. So let's start with a small example to show what, what we mean by trusted IC and So consider Alice asks Bob to design a circuit that computes a function f of x so Alice, she can use this circuit to authenticate the username, password pair of x and f of x. Where x is the username f of x is the password. So user will enter the username and password, and Alice can use the circuitry to compute f of x. And then try to compare and see whether this user entered one is the valued or not. So now let's see Bob gives Alice this design for a very simple case, 1 of x is x squared, and then we have ten potential users that have IDs from zero to nine. So this is a system that Bob designs for Alice. So you have this black box increments a function of f of x. As and Bob says that, because you have ten different users, so I need four bits to encode this different users. For example, user number zero will be have code 000. User number three will have code 0011, which is the binary representation of three. And in terms of output bits, so Bob says that since we are doing fx equal to x square, so the largest number he could get is 81. And for me to encode 81, I need seven bits. In this case, 81 would be 1010001. And Bob tell Alice that inside this black box all he can is the square of x. And Alice can actually test by putting these ten different values as then I see what will be the output and then can show that the system does no less. And how about no more part? So we starting from, we start from seeing what is the exact of the designing comment in this case. So this is a huge twist table here. Start from the middle two columns with with colored in blue. We see the input x goes from 0 to 9. And we will see the required output, x squared going from 0 to 81. Which is the square of 9. And for each particular input, we want the output to be the square of the input. So what Bob does for the system is, he's using a four bit binary encoding for the input, and he's using a seven bit encoding for the output. And from this very simple truth table, we do see a couple of simple things. For example, the most significant bit of the input happens to be the same as the most significant bit of the output. So, in that sense, we can define Z1 to be exactly the same as X1. And also, the last bit, or the least significant bit, of the output Z7, will be the same as the least significant bid of the input x4. And finally we realize that the sevens, the six output signals Z6 it is a constant of 0. So in some sense bowing to this design what Bob needs to design is actually only 4, only 4 up with functions, Z2, Z3, Z4, and Z5. So, before we move on to talk about this, so let's see, I will give you a small hint for one of the early quiz questions. So remember we want to do a square of X, where if you talk about in general, I want to multiply 2X and AY. And both numbers goes from zero to nine. So for in terms of the number of inputs for each out prints X I need four bids as we have discussed earlier. Because with three bids I can only encode from 0 to 7, 8 different objects and here I need 9. So if I have two input X and the Y, and I need four bits for X. I need four bits for Y. So in that case I will need eight bits. And for output side, the output in this case x square goes from zero to 81. So we know the largest number will be 81 and as we have discussed earlier I need seven bits to encode 81. So in this case, the output, the number of output bits will be seven. So of course, in this case, you can argue that I don't really need seven bits because you see, that the number of, of different outputs will be only ten different subjects or ten different values. So, I could use as few as four bits to represent these ten different values. However, the problem of doing that is, with these ten different val ten different values, if you use only four bits, you do not see exactly the values of this output in binary. So that is why we are asking that even Z6 is a constant of zero. We still need to output a bit to represent it. So that says, we have seven bits as outputs. And then finally, in this case, since we have four bits as inputs. With four bits, we know that we have a total of 16, which is to the power for different input combinations. And in this case, we only care the first of ten of them. So the Dava six from 10 to 15, they will be the don't care conditions, which means in this case we have six don't care conditions. Okay, let's come back to talk about the profit ICs and the hardware Trojan. So what is inside the box of the, of what the, the, Bob's design. So Bob claims that he does exactly FX equal to X squared. And if we open up the box, we can see a circuitry like this. Or, you may see a circuitry like this. And just to try to be honest, if you can see the difference between these circuitry, raise your hand of course I'm joking. I, I mean I wish I could see, but I cannot see your hands. So if I show you one more time, you will see what is the difference here is take a look of this part and the functionality of Z2, this is the first design where Z2 is the product of X2 and X3. And this is a secular design, which we have to X1 plus X2 times X3. And signal X2 plus X1 comes from this So. So let's show it again here. So, to this, to start, take a look at that. And now I'll show it again. So this is the original one, and then this is the second one. So apparently these are two different designs. So, what we can see here is, in addition to their difference, we see that they have the same number of gates. Because I never changed anything except I changed the connection here. The first one connect this way. And the second one connect this way. So they have the same number of gates and they have the same number of different type of gates. And the more, more over they have almost identical layouts. The only tiny difference is the change we have here. So let's see what are the difference between these two designs. So what Alex can do here is, starting from the first design, he can verify that whether this design does what he wants, she wants. She can plug in, for example, zero, where all the X1, X2, X3, X4 will be zeroes. And I'm plugging all these equations and I figure out that Z1 is equal to 0, Z2 equal to 0, Z3 equal to 0, Z4 equal to 0, because you have a Z4 equal to 0 here. And this, Z5, Z6, Z7, they are all 0s. So with the input of 0, it does out put 0. And as another example, if input is one where X4 will be 1 I can see Z7 will be 1, and Z6 will be 0 and Z5 will 0 and everything else will be 0. So that means that if input is 1 output is also 1, which is the square of 1. So Alice can confirm with all these things with 10 different input values. So, what is important for Alice to, to prove the trustworthiness of this design. Is, what more the system does. So, what Alice can check is, what happens if I give 10? Which is 1010 into the system. And according to this definition of the outputs signals. The, the system is going to produce 1, 0, 0, 0, 1, 0, 0, which is the decimal 68, which is not the square of 10. And as another, another example, if I enter 11, the output will be the binary representation of number 89. Okay. So this is does, does some more, but doesn't do anything, I mean, suspicious, because this output is not the square of 10. And this output is not the square of 11. In terms of this application, Alice wants to authenticate the user. So if the user enters 10, the output is not 10 squared, so it's not going to be authenticated. However, once we move to the second design, we know that the only difference is on the up of signal Z2. And with this new design, we can see that once we have input of 0, 1 here, 1, 0, 1, 0, then the output of Z2 will become 1 is that this is because, because X3 is 1. [SOUND] And X1 is also 1. So 1 plus X2 which is 0 times this 1 give us a 1. And similarly, when the input is 11, this bit will also be 1. And if you remember how we do the conversion from binary to decimal. So this bit position, a 1 means 32. So if it adds 32 to this numbers, 32 plus 68 give me 100. 32 plus 89 give me a 121. So this are not just a magic numbers, but if you take a look at this and then if you think further. So this 100 happens to be the square of 10. This 121 happens to be the square of 11. So, that says in the second design, there's a back door. This back door is, is in embedded intentionally. So in this case what we have here is, both pair of 10 and 100 and 11 and 121 will become valid. So what Alice used this system to build authentication this two entrants will be considered valid and in that case, the second design we can consider this as a, it has a hardware Trojan. It's going to allow user who doesn't have access to have access to the system. To summarize today's lecture, what we're going to do next will be hardware Trojans and the trusted IC device. For hardware Trojan, we have learned the two important characteristics of them, so first it must have intentional addition or modification to the system. Second, this addition or modification must have certain malicious purpose and for trusted IC's. We have defined trusted IC as, the system has to do no less, which means it has to meet all the design specifications, also it will not do anything maliciously. And from this discussion, we know that, for a system to be trusted, it cannot have Trojan. That is what we call Trojan-free IC's. And finally, for us to evaluate what to assess a, the trustworthiness of the system, we have to make sure that the system doesn't have any Trojans, which we call the Trojan-free. So hardware Trojan detection is about how to establish trust, or how to assess the trust or [INAUDIBLE] software system. And on the other hand, when we do hardware Trojan prevention, we are trying to prevent hardware Trojan being inserted into the system. This is in some sense trying to build trust into the system. And these two categories, hardware Trojan detection and hardware prevention, we are going to discuss in more details in the following lectures.

Hardware Trojan Taxonomy

Now let's talk about Hardware Trojan Taxonomy. Based on different criteria the hardware Trojan's can be categorized in many different ways. These criteria include, when the hardware Trojan will be inserted in the micro chip supply chain. And more specifically, during the many stages of chip design which stage, or at which level of abstraction the hardware Trojan will be embedded? How the hardware Trojan will be triggered or activated? And what will the hardware Trojan do once it is activated? Or what is the goal or payload of the hardware Trojan? Which system components will, where's the hardware Trojan will be inserted into such as the processor, memory unit, or clog networks? Whether the hardware Trojan is functional or non-functional? In the case of non-functional, it is also called parametric hardware Trojan. Are the hardware Trojans big or small? Tight or loose? Tight means that the hardware Trojan is clustered or centralized at one spot on the chip, and loose means that the hardware Trojans are distributed all over the chip. Does the insertion of the hardware Trojan require redesign of the layout or not? We will explain this hardware Trojan taxonomies in details with some hardware Trojan examples. The first one is the hardware Trojan taxonomy based on the IC supply chain that we have seen earlier. To refresh your memory, the green blocks we have seen here, they are generally considered to be trusted phases. The red block, blocks, they are untrusted phases. And the yellow block, it can be either trusted or untrusted. In the first system specification phase, we define the functionality of the system and its large building blocks. We also give the performance requirements for these components, and their communication protocols. For example, the system has to be of certain size, the power consumption has to be lower than certain value, and there must be some timing or delay requirements between the modules that will be communicating with each other. Hardware Trojans can be embedded at this phase by deliberately change the module's functionality or the communication protocols among modules. Next, the design phase is, is in the supply chain is perhaps the most vulnerable stage for hardware Trojan insertion. This is because during this phase third party IPs design tools, design libraries, and many other things that are out of the control of both the system user and a trial stage design team, will be involved in the design. We will give a more detailed categori, categorization of hardware Trojans at this phase later. Once the chip design is completed mask as such will be made to fabricate wafers. In this phase, attackers can embed hardware Trojan into the mask or change the chemical compositions to do the damage on the system. For example, by increasing the electron migration in power supply or clock network, they can make the chip age faster. When the chip is assembled with other hardware components extra wires can be added to leak information from a module or to control that module. Also unshielded wires can leak information through the electron magnetic set channel. Finally, in the testing phase not only systems functionality but also its trustworthiness will be tested. Hardware Trojan can be inserted in this phase, so other hardware trojans embedded earlier in the design phase will not be detected. This is a typical IC design flow, starting from the high level system and architectural synthesis, all the way down to physical design before chip fabrication. In each of these design phases, hardware Trojan can be embedded. For example, in a circuit that compute x square, we have seen last time, the valid input values to the system are from 0 to 9. A hardware Trojan, as we have shown, is in, in, inserted so the circuit can also compute the square of 10 and 11 correctly. in that way the system will authenticate attackers with the pairs of 10 and 100 or 11 and 121, as their user name and password. This is a hardware Trojan inserted at the functional level, and it is done by adding 10 and 11 as valid input to the system. As another example, consider this block which is controlled by an enable signal X. When X is equal to one, the block will be disabled. When X equal to zero, I'm sorry when X equal to one, the block will be enabled. When X equal to zero the block will be disabled. At gate level, a hardware Trojan can be easily inserted to control the usage of this block. For example, we can replace this control signal X by a two input AND gate. The attacker will set the hardware Trojan's trigger signal C to be one so the system will behave normally. When the attacker wants to disable this block, he can simply set the, the hardware Trojan trigger signal C to be, to be zero, and to disable this entire functional block. This type of hardware Trojan is sometimes called killer switch, because it can disable or queue a hardware components. Similarly, this can also be done with a two input OR gate. However in this case, this, the building block will be disabled when X is set to be one and it will be enabled when X equals to zero. So in the normal case, we set d to be zero so, to let, to let the system behave normally, to and to, once we try to disable the system, we set d to be one to disable the block. At transistor level, there are also many ways to insert hardware Trojans. For instance, some physical, some physical parameters such as threshold voltage or transistor size can be changed to impact the delay of the logical gates. The power consumption of the logical units, or the aging process of the transistors. These timing or power features may leak information, or facilitate side channel attacks. At the physical design phase, or the layout level, the physical parameters of the circuits, such as the dimensions or the locations of the circuit components, can be changed to embed hardware Trojans. These hardware Trojans are called parametric Trojans, and we'll discuss them later on. Finally, the design environment, particularly the kind of tools used for during the design, design phases, cannot be trusted. Hardware trojans might be inserted into the system through the use of these tools. Based on whether the hardware trojan changes system's functionality or not,. They can be considered as functional hardware Trojan or parametric hardware Trojan. Functional hardware Trojans are those that will change the system's functionalities. Examples include adding a actual functional unit, removing or modifying systems components. Parametric trojans on the other side, on the other hand, normally do not change system's functionalities. Instead, they try to reduce system's reliability to increase the chance of performance failure. This can be done by changing the physical parameters of the system components, such as making the wires thinner or the transistors weaker, or modifying the power supply to, to cause, cause certain parts of the system to age faster than expected. Hardware Trojans can be always on or triggered by some event or signals. The Trojans such as those based on the modification of physical features of the transistors, wires or other hardware components, they are always on. Hardware Trojans can also be triggered, and most of these cases, triggered by rare event or signals. These triggers can be data collected by the sensors or logical units from the hardware components, such as registers or counters. The triggers can also come from either internally or externally from outside of the chip. Here are several examples. A time bomb, as it name suggests, will explode when a preset time expires. If hardware Trojan this can be easily implemented with a counter, or a finite state machine. When a counter reaches a preset value with a finite state machine reaches a certain state, the hardware Trojan will be activated. This is an internally triggered and logical triggered hardware Trojan. A temperature or other physical condition based hardware Trojan can be activated when such physical conditions is met. For example, when the on chip temperature reaches certain degree, or when the supply voltage becomes lower than a given value, or when outside humidity reaches certain levels. This kind of hardware Trojans are triggered by sensors and can be either internally or externally depending on where the sensors are. Hardware Trojans can also be triggered by outside signals, such as user pushing a button, turning on or off of a switch, or once certain input values are entered into the system, maybe even wirelessly. These hardware Trojans are externally triggered. They can be either sensor based or logical value based. The payload is the action or the damage that a hardware Trojan will do once it is activated. We have mentioned from the very beginning, there are three major types of damage or malicious purpose a hardware Trojan may have. First it may try to change or control the function, functionality of the system. Examples include, the killer switch, we have mentioned earlier. The time bomb and the simple F x equal to x square example we showed last time. Second, a hardware Trojan may try to leak sensitive information. Normally this is done through side channels, such as the power network, the clock network, optical, thermal, or electrical, electrical magnetic emissions. Finally, some hardware Trojans may try, try to reduce circuit reliability or the lifetime of the system. These type of Trojans include parametric hardware Trojans or those Trojans that can activate applications or functional units that will drain resource from the system, to make system die earlier. Hardware Trojans can be embedded into many, many locations of the system. In the processing units, hardware Trojan can be hide, can be hidden there to change or control the systems functionality. In memory structures, hardware Trojan can be im, embedded to alter or to change the memory's contents, or simply to monitor memory activities and leak information. The I/O devices are the link between the outside world and under the chip. So hardware Trojan can be embedded there to control, monitor, or modify data communication between the chip and the outside. Power supply units. Hardware Trojan can be embedded there to change the power or current supply to cause system failure, or to leak information through the power side channel. And the clock grids is another position to have, to, to hide hardware Trojan. They can attacker can change the frequency to cause fault or failure. And to leak information through the timing side channels. Based on the physical features of hardware Trojans. The hardware Trojan can also be put into several different categories. So first based on their size, we can have large or big hardware Trojans such as sophisticated time bombs or powerful antennas embedded or hidden in the system, trying to receive signals from outside. There can also be a lot of very small hardware Trojans, such as the killer switch we have seen earlier, just a simple two input gate or some very small temperature sensors. Sometimes when hardware Trojan, Trojans are embedded into the system, we need to redo the layout or placement in the And examples of this are including adding functional units, which are nontrivial ones, cannot fit into a small extra space on the, on the layout. There are also hardware Trojans that doesn't need to change the layout. For example, the parametric hardware Trojans or some very small hardware Trojans that can utilize the actual white space on the, on the chips layout. And then finally we mentioned earlier, based on the distribution of the hardware Trojans, they can be either centralized or distributed. The centralized hardware Trojans, normally they are bigger, they can take over bigger space in, on one spot of the chip. And the distributed hardware Trojans, they're normally small and then they are scattered all over the chip. Next time, we're going to talk about how we can detect different type of hardware Trojans.

Hardware Trojan Detection Overview

We have learned hardware Trojan taxonomies, and before we discuss how to detect hardware Trojans or admit we cannot detect them, let's first take a high level overview, of hardware Trojan detection and its relationship with trusted IC design. For a given fabricated integrated circuit, and its specification, the goal of hardware Trojan detection, is to determine whether the IC has any hardware Trojan, or whether the IC can be trusted or not. A hardware Trojan detection algorithm or mechanism, can report user hardware Trojan found, were not found. In the case when hardware Trojan is found, we can see that integrate circuit, is untrusted, but we cannot say that the design team is untrusted. This is because the design team, may use untrusted design tools or third party IPs. On the other hand, if the hardware Trojan detects an algorithm did not detect any Trojan, we can not conclude that the IC is trusted or the design team is trusted. We can only say that, this hardware Trojan detection algorithm did not find any Trojan. And this implies that, it is very hard or impossible to claim that a IC is 100% trusted. Even if we try old input combination we can only say that the system does not have any functional hardware Trojan. They may still have parametric hardware Trojans in the system, that can reduce the reliability of the circuits or leak information through set channels. So in that sense, hardware trojan detection, is a process to evaluate or to access system's trustworthiness. Recall that, we have defined or redefined trust, trusted integrated circuits as an int, integrated circuit that does exactly, it is asked to do, no less and no malicious more. So for a system specification, there are many, many different designs, so we take a approach, consider the design solution space. So this is a design space and this vertical bar, partition to be that space into two parts. The left side is the successful design that meets all the design requirements. The right pa, pa, path is the design failures which fail to meet some of the design specifications. And as we have learned older designs will do something more. So depends on whether these more things are malicious or not, we can further petition this ad space into two parts. The lower right corner here are the malicious functionalities imparted into the design, and the rest of the part they don't have any malicious or functionalities which in some sense we call the innocent more. And finally, assume that we have a hardware trojan detection algorithm or trust assessment tool. The tool will e, will assess the design as every port whether design is trusted or untrusted. So the latter half, this pa, part is called trusted design and the upper right corner and the lower left cor, or right corner, is at untrusted design. So with this three curves, the vertical bar, this red arc, and green half su, circle here, we partition the design space into eight different regions. The first region, the successful designs, because they on the left. They do have malicious functionalities. But the, the, the two has successfully detected as untrusted. Region number two is the same as region number one. However, the two fails to detect the malicious functionalities. They sync, this design is trusted. Region number III, is a successful design with no malicious functionalities, and the two consider this as trusted. Region number IV, successful design, no malicious functionalities, but the two accidentally or incorrectly identified this as untrusted. On the right side of this vertical bar is the whole failure designs. And the first two the five and the six they have malicious functionalities and then the two easily considers them untrusted or trusted, and the trusted case this is an error. For Region VII these are these are failures with no malicious functionalities and then the two consider these are trusted. Region number VIII these are succ, they are failure designs, with no malicious functionalities and the two successfully identified these as untrusted. And finally as we have discussed of, from the first week, so in your older details designs, there might be some backdoors, what we call the security vulnerabilities. So after this we consider systems vulnerabilities. And this region inside the circle here, we called region number nine, these are what we call vulnerable designs with security vulnerabilities. So in that sense, what we are doing here is to have a good design, what we want to do is, we want to search in the trusted region, which is region III and region IV and outside region nine. We want to find in this regions the best quality design here. And the hardware, the goal of hardware trojan detection, they have a different goal here. Remember, this is the hardware trojan detection which trust assessment into their report. The report, this part is trusted and the other part, they are untrusted. And then we do see some false statement here we have a false positive here with the number IV, this region is trusted but to report this as untrusted we want to minimize this false positive. And also the two has three false negatives. Region number II, Region number VI and region number VII. And we need to pay special atten, attention to region number II because these all have some malicious functionality but that the tool fails to detect it. For VI and VII it is of least importance because they are failure, design failures. So before we talk about how to detect hardware trojans, let's step back and then think about physical attacks to a system. Assume this is my chip. It has my intellectual property, it has my secret data. And, the attackers want to launch what we have discussed the physical attacks to the, to the chip. Trying to either reveal my design details or trying to steal the secret data stored in the system. And we have learned the different physical attacks, starting from the invasive attack, semi-invasive attack, or non-invasive, invasive attacks. And the two of the most, I mean, powerful attacks, they are reverse engineering and search channel analysis attacks. So these attacks they'll be able to to they'll be, I mean, be able peel of the chip layer by layer trying to figure out what is in the structure of the chip, and then trying to find out the functionality of my design. Or it can go through set channel, trying to analyse and then steal the data. So now let's consider, I don't have this chip here but I want to design a chip, and then someone design a chip for me. I got this design back, and I have this question, whether the chip has any hardware trojan or not. So what I want to do here is, I want to do a hardware trojan detection. And my goal is to detect hardware trojan, or trying to find any information related to hardware trojan. So take a look of these two scenarios. They are very, very similar, or in some sense they are identical. In both cases, you have a chip which you, you don't know all the details. And your goal is trying to find out the details of the chip. What it does and what kind of information that chip has. If that is the case, what I can do for hardware trojan detection, I can use exactly this attacker's used for physical attacks to the system to do hardware trojan detection. If that is the case I can claim that hardware program detection is easy, but is it really easy? So let's take a look at this approach which is based on reverse engineering. So this is what we call the de, the destructive hardware Trojan detection, or it is similar to invasive attacks. So what, what it does is it's going to pick one or a couple thumb holes inte, integrate circuits. And, it, it's going to remove the package to expose the die of each of the ICs. And then, it's going to conduct a reverse engineering trying to reveal the inner structure of the system and trying to figure out the functionality of the system. Whenever we find any malicious addition or modification to the system or to the wires or to any physical features, to the, to the design we can claim hardware trojan found. Technically this is correct and you should be able to capture all of the hardware trojans, regardless they're functional trojans or parametric trojans. However, these are not practical for a couple of reasons. So first, this process is very very expensive. Not only do you need the expensive equipment and tools, they also require detailed design knowledge and also, there are many, many cases, they are very, very time consuming. And second, even if we find any additional things in the design, and how do we know they are, these more things they are malicious, and how do we know they are intentionally added, or accidentally added. And then finally, remember we start by picking one sample or a couple of samples. And it is possible for the foundry to embed hardware trojans only on several chips not all the chips. So if we pick the sample that doesn't have hardware trojan, we cannot claim that all the chips fabricated by this foundry they don't have hardware trojans. So there are several factors that make hardware trojan detection, detection a very challenge task. First the hardware trojans, as we have seen from the hardware trojan taxonomies, they are very, very versatile. They have different sizes from largeblocksof powerful antennas to some very small kill, kill switch, a couple of gates. Or even you don't have any caves, you just make the, the wires thinner or change the physical features of some other components. And hard, hardware trojan can be hidden in many, many different locations of the system. Can be in the memory, can be in the IO device, can be in the processor. So, and then most of the hardware trojans, they're not active. They stay very very quiet, they sleep there un, until it has been, it will be activated. And they will take different forms or different types, as we had mentioned about functional hardware trojans or parametric hardware trojans. This makes the detection of hardware trojans very very challenged. And second, the test, the traditional testing tools or verification tools, fail to do the hardware trojan detection. Because they're developed to detect manufacturer defects or faults, not for this intentionally added malicious purpose. And finally, it is very difficult to distinguish between hardware trojan and other noises. By noise we mean a few different scenarios. So first, the hardware trojan detection algorithms or the testing tools, they may have errors, fail to find the hardware trojan. And also the side channel, they may have noise and then the measurement we got from the side channel, we have errors. This will make hardware trojan detection algorithm based on side channel trying to create some errors. And also we have talk about earlier. They have functional noise. For example, the don't care conditions. It can be accidentally implemented to carry some malicious purpose. And finally, there might be manufacture variations. For example, some wires become thinner, some special voltage becomes higher or lower. I know this will make hardware trojan detection algorithm or approach much difficult. So, besides the destructive approach we have seen earlier, there are also a lot of non-destructive, destructive approaches, that mainly falls into two categories. The run-time monitoring approach and the test-time detection approach. For the test time detection approach, they have logical detect and side channel analysis approach. And we're, we're going to analyze this later in the next lecture. And to end today's lecture, we are going to study how the hardware trojan will be added or what's the hardware trojan designers, they will think [SOUND]. So as a hardware Trojan in embedder or designer. So the first, we have to have some motivations to insert hardware Trojans, and for their motivation normally starts by trying to see what other system or applications they are targeting. This can be a financial institute or maybe a banking system or this can be military systems. And they have to consider what is the pay load when hardware trojan is activated versus the design cost to embed such hardware torque. And then they have to consider technically, when, where, and how to embed the hardware trojans? There goal is, is two folded. First they want the hardware trojan to be effective, in a sense that they want, want to be able to trigger hardware trojan at the right time at the right place. And they want to make the damage as large as possible. And second, they want to make this hardware trojan stealthy, in the sense that it will be hidden well and then the hardware trojan detection algorithm cannot find them. Take the kill switch, for example. In this case, the hard, the, the trojan designers, they want to control the, the hardware, this hardware kill, kill switch. They want to make sure that this switch will be switched at the right time and the right, right place to disable the components they want to control. And second, they want to have little, very little, or no change to the design. Both to the functionality of the system, and the layout. This, the goal is trying to av, avoid being detected by either the test of time or doing the side channel attacks, the side channel analysis approach. And finally, because the kill switch, so the attacker wants to have some control of the trigger. So the trigger, do you want to have good control ability? It normally has to be based on rare event, and they want to have an internal control switch. So with the understanding of what's the attacker will think about having a better hardware trojan, in this case the kill switch. We can design our hardware trojan detection mechanisms to detect kill switch. For example, we'll first identify the potential target hardware components or blocks that the attackers may be interested. And second, for this block, for each of this block's B, we can do a test-time formal verification of B's control signals and input signals, make sure they are all secure. And at the run-time we can do the monitoring of the controls signals to the block B and also the input-output relations, make sure that the correct outputs will come with the input. And in the meanwhile we can try to monitor the stat channels to see if there is any abnormal behaviors. And finally we can do a very strict control of outside particular wireless signals. And all the blocks that are connected to this block B. This is because we know this is a kill switch. So ta for the kill switch to work, it must have some trigger. So with all this precautions, we cannot say that we have 100% guarantee to protect the system from this attack of kill switch. However, once we have all these mechanism in place for the attacker to impact a kill switch into the system, it becomes much harder.

Hardware Trojan Detection Methods

We now discuss some popular Hardware Trojan detection methods. We'll start with test time approaches, followed by the run time monitoring approach. In the logical test based Hardware Trojan detection, we run different test patterns, what has the vectors. And the monitors, the systems, the output and the behavior. If the system had Hardware trojan, when the trojan is activated, it's malicious behavior, or malicious payload will observed as it can be caught. However, a full coverage test is impractical for reasonable size of the circuit. For example, if we have a combinational block with n input then there potentially will be the 2 to the power n test vectors. Was this, the logical block has also sequential parts. For example, if we have m flip flops and n, an input. Then the total number of test cases will be 2 to the power n plus m. Therefore, people have developed random test, random test based Hardware Trojan detection mechanisms. However, such methods will fail because we have learned Hardware Trojan normally are triggered by a rare event. When you do a random selection, these rare test vectors may not get selected. Therefore, we know the hardware, the Hardware Trojan detection mechanism has to test the time based on logical test is hard to generate rare test vectors to activate Hardware Trojan. Another way to do Hardware Trojan detection at test-time is to do the Side Channel Analysis, just like the Side Channel Analysis attack. What we do is, we monitor the side channel information during the execution of the system at the test-time. If the system has Hardware Trojan, then its presence will show on certain physical parameters, and can be observed through side channels. This pretty good, because they not only capture functional hardware Trojans. They can also capture non-functional Hardware Trojans, like the parametric Hardware Trojans. However, they may have a high false alarm rate, because, because of the fabrication variations. Because the, the side channel information measurement errors, or because the modeling errors or the model, the normal behavior of the system. So now let's see a couple of popular side channels. First is the Power Side Channel. We can measure the supply current at the quiescent stage, when the circuit is not switching and that the inputs are also not changing. In this case, if there is there is a HardwareTrojan, the Hardware Trojan circuitry will consume additional leakage power that can be caught. However, this may also have false alarm rate, because nowadays chip, it has a very high leakage curve. Similarly, we can also measure the transient, supply current. When there is, when there is, when the Hardware Trojan is activated, it has additional switch activity, and this will cause a increase in dynamic power, so it can be caught. For this one to work, we need find ways to activate the Hardware Trojan, or at least activate part of the Hardware Trojan. For power side channels there are a couple limitations. First, it, it fails to capture small Hardware Trojans or always-on Hardware Trojans because you cannot distinguish this from the power side channel trace. And also, they are very sensitive to noise to errors and also to fabrication variations. Another side channel is the timing channel. Where we know that Hardware Trojan can change delay of a path. For example, in the case of kill switch, the added illogical gate will make the path longer. In the parametric Hardware Trojans, the change of the wire or the change of the transistors may change the delay, as well. So the limitation of delay, path delay based Hardware Trojan detection mechanism is not all the paths we can measure their delay. For us to measure the delay of a path, we need to be able to observe or to see both the starting gate and the ending gate of the path. And also the, just the path delay measurement will be very sensitive to fabrication variation and other noises. And similar to side channel analysis attack we can also capture Hardware Trojans from the EM emissions. So when the circuit has Hardware Trojan and the Trojan got activated, the additional switching will produce EM trace which can be captured, and then used to detect Hardware Trojan. So here a short summary of the two major test time approaches. In the first, the logic test-based approach, these are good to capture small Hardware Trojans. They're also very robust under noise and the fabrication variations. However, they can no handle large Hardware Trojans Mainly because of the large space of tester vectors and also it is very hard to generate the test of pattern or test of vectors that can activate Hardware Trojans. On the other side, this, the side channel analysis based approaches. They are good to capture large Hardware Trojans because this may introduce large variation in the, in the power, power trace or current trace. They can also hand, handle non-functional Hardware Trojans, like the parametric parametric Hardware Trojans. However, this all sensitive to noise and fabrication variations because the measurements of the side channel information. And also they can not detect the small Hardware Trojans. Because the small Hardware Trojan, their contribution to static channel information may not be significant enough. In addition to test of time Hardware Trojan detection, we can also capture or catch Hardware Trojans at run time. The basic idea is simply to monitor the execution of the system at real time. If the system has Hardware Trojan and the trojan get activated, then we can find the malicious behavior. However in this case most likely there will be a inter, interrupt mechanism coupled with the run time monitoring system. Once the Hardware Trojan is detected the interrupted mechanisms will stop the execution to protect the system. As we know that at test-time we cannot detect all the Hardware Trojans with 100% guarantee. Therefore run-time monitoring is a very good complementary approach to test-time approaches. However, for us to do run-time monitoring, we do need monitoring units. And these units will take resources, like power or CPU time and that they can cost performance overhead. Another feature about run-time monitoring is, once we know what type of Hardware Trojan we are looking for, then this run-time monitoring approach can be very effective, because we have a target to monitor. We know what is the expected malicious behavior. Finally, let's see a few examples about Hardware Trojan detection. So first let's consider the parametric Hardware Trojans. These are the Trojans that are not change the functionality of the system. Therefore the logical test with run time monitoring may not be able to detect these parametric Hardware Trojans. However, since they change the physical parameters of the circuits, in particular, the wires or the transistors with the logical gates. So this will lead some trace in the power or the delay side channel. So the side channel analysis based attack ana, detection can find the parametric Hardware Trojans. And a big trojan. So for a big trojan, depends on whether they change the functionality or not, we may be able to detect it using the logical test. However, it may not be easy, because when, when we have a big Hardware Trojan, it may not be easy to exhaustively search all the test patterns. And the big Hardware Trojan will cause the big variation on the power consumption, so we'll be able to detect it by the power side channel analysis. For the path delay based detection mechanism, depends on whether the big, the big Hardware Trojan is on the path or not or whether that path can be measured or not. We can either detect it or we may, may not be able to detect it. And normally, a big Hardware Trojan can be detected at runtime. On the other hand a small Hardware Trojan, this is very good for the logical test, test based approach to detect it as test time. It, it is also very good for delay based side channel analysis. Because for the small ones, the paths delay, the number of paths may be limited. However, because it is, it is small, so the changes it have made to the system of power is very limited, so we may not be able to detect it from the power side channel. At run time, it depends on what kind of impact does small Hardware Trojan will make to the system. We can either detect it, or not be able to detect it. Remember a type tight Hardware Trojan is the one that clustered at one spot on the chip. So this is something similar to a big Hardware Trojan. So we will be able to detect it through the power side channel. We can also normally detect it by the delay side channel, because this is a large unit. And depends on the, the payload of the Hardware Trojan, the run time monitor approach may be able to detect it or may not be able to detect it. And normally, the logical test-based approach can detect a such Hardware Trojan, because we have a large block here and there are certain approaches to generate test patterns to test this large block. For the loose Hardware Trojans, these are the ones that randomly distributed all over the chip. So this one may not be able to be detected by power side channel, because each of this is a small Hardware Trojan. And to similar as before, it's, in whether it can be detected by a run time monitoring approach depends on its payload. And normally it can be detected by the logical test based approach or the delay side channel. The always-on Hardware Trojan. This is very hard to detect. so, because it is always on, so at the run time monitoring, it doesn't show any difference. And also, it is always on, so the power trace doesn't show any difference. And as we mentioned, some of the always-on Hardware Trojans, they don't change the functionality of the system, so logical test may not be able to find it. However, these always-on Hardware Trojans, they do change the delay of this, the circuits. So the path the delay based as side channel can capture such Hardware Trojans. Finally, as we had mentioned earlier, so there are several purp, malicious purposes for Hardware Trojans. For example, one of them is trying to leak sensitive information. So for this type of Hardware Trojans, we may not be able to detect it by logical tests. Because to leak information, they don't change other part of the functionality of the system. And also this unit for example, that leaked the information leak through an addit, and added antenna. So the past delay may not be detected as well. However, the power side channel may be, may be able to detect it. Once thing, once the antenna is activated, the power may become higher. And notify the run time we can, once we know the target of the Hardware Trojan is to leak information. We can monitor [COUGH] the signal to data communication channel of the system and then detect such Hardware Trojan.

Trusted IC Design with HT Prevention

Hardware Trojan is arguably the biggest concern for trusted IC Design. Today, we'll cover several practical techniques that help us to build trusted ICs by preventing or deterring hardware Trojan insertion. We start with pre-synthesis techniques. When we use third party IPs, we want to ensure they don't have hardware Trojan. Or they are trusted before we can use them. So typically there are two approach to do this. First, we can change, we can turn this problem into a formal verification problem. We can verify whether the IP meets the specification or not. This can be done by doing property check, model check, or equivalence check. For example, one way to check two combinational units that are equivalent, f equal to, f equal to g. Only thing we need to verify is. The function fx times g prime x plus f prime x times gx will be a constant 0. Where when we see times, this is a logical edge. When we see prime, this is the complement. When we see plus, that is the logical or. This can be verified by the Boolean satisfiability solvers trying to verify this entity is unsatisfiable, which means it is always a 0. And for sequential components we can do the final state machine equivalence check. This one can be done with the concept called a product machine which we'll show on the next slides. On the other hand, instead of doing this as a form of verif, verification problem, we can do this as a standard testing problem. We can test to make sure the delivered IP really meets all the specs. And the traditional testing methods can be used. And the more importantly, the, the hardware Trojan detection approaches we have just discussed can also be used here to ensure that the IPs do not have hardware Trojan. So now let's see what is the final state machine equivalence and what is the product machine. Two finite state machines they are stated to be equivalent if and only if they give the same output of any input sequence. For example, consider this four state, state machine, A, B, C, D, with A as the starting state. And, a three state Final State Machine, X, Y, Z with Y as the starting state. We consider that a link with 0 for both states when the first one start with A, is going to output a 0. On the second state machine, start from Y, input is 0, is also output of 0. So these agree on output 0 and 0. When they receive one from the same starting stage, Y, the second state of machine output of 1, while in the first state of machine, also output of 1. So they agree on their output. And for two bits input sequence we have four different choices, 00, 01, 10, and the 11. For example, if we, if we receive a two bits input sequence 01. On the first state of machine, a 0, we output a 0, move to B. On the, on the second of B to 1 we follow this arc. We move to C, an output of 0. So the overall output will be 00. For the second state machine, starting from state Y, on the first of input of be to 0, we move to X, an output of 0. On the second repeat of 1, we move to Z, an output another 0. So the output will be 00 agree with the output from the first state machine. So by checking this one by one we know that for these two state machines any two bit input sequence they also agree on each other. And we are keep on checking this, until either they disagree or something. In that case, we call, they are not equivalent. Or eventually, they become the equivalent. Formally, this can be checked very easily, or automatically, by building the what, what people call the product machine. So consider the same two state machines. The four states, A, B, C, D with starting state A. And the three state finite state machine, X, Y, Z with starting state Y. So when we build the product machine, we first define their starting state. The starting state in the product machine will be any combination of the starting state from the first state machine and the second state machine. In this case because each state has one starting state, so the product machine starting state will be A and Y. If, for example, the first state machine has also A, B, as a potential starting state. Then for the product machine, it will have two starting states, A and a Y, or B and a Y. Let's come back to this example. So for the product machine, our initial state will A and Y. On the input of 1. On the first part, A goes to, stays at A. An output of 1. And the second part, Y stays at Y, output of 1. So as a, as a result, they move to the state, which is the same state A,Y and that they both output of 1. So I move to the 0, A will move to B and output of 0. Y will move to X and I'll put a 0. So they agree on their output and then the next state will be the pair of B and X. So this will be our new state for the product machine. So we have to evaluate what will be the next state of this new state when we receive different input. So start for it with the new state B, X if the input is 1. So for B it will move to C with output 0. For X an input at 1 it will move to Z without the 0. So again they agree on their output and then their next state will be a new state C and Z. So keep on this process, then we check what happens when the input is 0. In this case, for B, when the input is 0, the output of 1 moves to D. For X when input is 0 it remains at X and output of 1. So again they're agreeing on their output and their next state will be a new state D and X. So now we have two more new states C, Z and D, X and we're going to analyze them one by one. By checking what will be the their next state based on different input. And whether they agree on their output or not. So eventually we'll stop when there's no new states can be produced. And doing the way. They always agree on their output. So in this case with the M1 and M2, these two states machines still equivalent. However, if at certain point the two state machines, they disagree on their output, then by definition they are not equivalent, as I will can stop by claiming the two state machines, they are not equivalent. So this is a very systematic approach and it can be easily fully automated. Now let's go back to talk about hardware Trojan prevention. We finish about pre-synthesis so let's talk about post-synthesis techniques. So there are several post-synthesis techniques when the chip is built, when the layout is given. So first we can do this, what we call the removal of dead spaces. So when the layout is given, there are white space, the space that are not utilized. An those are the potential spots for hardware Trojan insertion. So we can add the Dami logic, trying to take away this white space to prevent direct hardware Trojan insertion. This is particular effective to prevent big hardware Trojans. We can also do circuit obfuscation trying to make reverse engineering harder. So the, the attackers will not be able to understand the circuit and then it will be hard for them to insert a functional hardware Trojan. And we can also do shielding of the wires, this will be able to prevent the hardware Trojan based on electrical magnetic radiations. And because most of the hardware Trojans, these require a trigger signal, so we can, we should pay special attention to interface protections. This interface includes the I/O pins of the system, the internal connections between modules, the power in the clock networks and also the scan chains. So once we will be able to control this, we can monitor this interface and then whatever, whenever there is a suspicious communication we can check whether there's required or they are the hardware Trojan's triggered, triggering signal. So, now let's talk about a couple of, I mean, a couple of specific techniques to prevent hardware Trojan. The first one is called a rare event removal. And, this is based on the observation that hardware Trojans use rare event as their triggers. If we can remove rare events it will be easier to detect or activate the hardware Trojans. So there's an approach proposed by Salmani Tehranipoor and Plusqellic in 2009. And this motivational example shows the basic idea. What we have here is we have a hierarchical two input and gate. Assume that all the primary inputs are equally likely to be 0 or 1. And the two input and gate will give one-quarter chance produce a 1 and three-quarter of the chance produce a 0. Because this will be 1 if and only if both signals will be 1. And if we keep on proceeds this for the second level, and the gate to produce the 1, it requires both input here to be 1? So that is one-quarter times another quarter. So one over 16 it will be a 1. Similarly, we have a identical structure at the bottom half. So now for these two input and the gate. For this one to be a 1, it will require both signals here and here to be 1. So this one would be considered a rare event, because the probability of this happening is 1 over 16th times 1 over 16th, which is one over 256. So now let's see how we can remove this rare event. Make it much more frequent. So the idea is try to add a flip-flop and an OR gate to change this wire connection with a logical connection here. So this is a, this is a scan flip-flop which we can control its input. So, the output of the flip-flop, or the content of a flip-flop, will be equally likely to be 1 or 0. And then, it goes through this OR gate. For this OR gate to be 0, we need both active signals to this OR gate to be 0. Which is, 15th over 16th and one half. So, the chance for this to be 0 will be 15 over 32. And then the rest of the cases it will be a 1. So now we consider end gate. For this one to be 1 it used to be a rare event, happens with 1 over 256. Now for this signal to be 1 we need this one to be a 1 which happens with the chance slightly higher on the half. And then this one to be 1 happens with 1 over 16th. So the product of this will be 17 over 512. Which is about eight and a half times higher than 1 over 256. So this rare event will not be that rare anymore. So this can facilitate the hardware Trojan activation and the detachment. And the next example is called Shadow Registers. So we have learned that this technique, called the path delay based hardware Trojan detection. It's going to measure the possible delay between two internal registers. Then trying to use this to detect whether a hardware Trojan was inserted or not. However, we mentioned the difficulty of this approach is sometimes we may not be able to measure the internal path delay. So the Shadow Register approach will be greatly facilitate this the measuring of the internal path delay. So what we have here is we have the source register here. We have the destination register here. To measure the parts that delay here it may not be easy because we may not be able to see this signals. So in this approach we we put a shadow register here which shares the, which has the same cog as the system cog. However, the face of the cog might be different. So, we call the cog 1, system cog. Cog 2 will be called the shadow cog, which is used to drive the shadow registers. So, initially, the shadow register and the destination register will have the same face. So, they will have the same content, the same value. So while we do a comp, comparison here, do I have the same value? And we can deliberately change the face of the shadow clock, and then eventually it will become different. So, and this can be monitored through the result bit. When they become different, the comparison result will be different. So this significantly improve the observability or visibility of this internal pass, and it can be used to facilitate the hardware Trojan detection. For details we can check the paper by Li and Lach in 2008. So there are many other approach and most of these are the best of research topics. So to end this discussion about hardware Trojan prevention, or how to build trusted integrated circuits, we list the several of these approaches. In 2009, Banga and Hsiao proposed this approach called the VITAMIN. Which is basically a reverse logical gates to the power supply and the ground. So this, basically, can turn a rare event into a frequently to happen event, which can facilitate hardware Trojan detection or activation. And Gu, Qu, and and Zhou, in 2009, proposed a information hiding-based approach to build trust in the system. The basic idea here is they tried to change the system specification deliberately such that if a hardware Trojan is inserted, it will change the system's performance dramatically and then this can be observed from the final system. In the same year, Chakraborty and Bhunia proposed to use circuit obfuscation, trying to make the circuit much harder to to be understand. And this will prevent or in some sense, deter the hardware Trojan insertion. In 2010 Potkonjak proposed measures how to build trusted circuits using untrusted CAD tools. So this is a very interesting approach and in particular consider effect most of the CAD tools will be considered untrusted. So this is a very interesting and very important topic. In 2011 Love, Jin and Makris proposed to use proof carrying hardware to build intellectual properties that has the proof of trustworthiness and I use this as the building block to build trusted integrated circuits. And more recently, Dunbar and, and a, and a Qu focused on final state machine model to discuss how to build sequential systems with the trust worthiness. And their basic idea is by adding a control signal into the building block which is a flip-flop.

MyClassNotes

Monday, August 31, 2015

Hardware Security - Hardware Trojan Detection and Trusted IC Design - Week 5

1 comment: