MyClassNotes: Hardware Security - Good Practice and Emerging Technologies

FPGA Implementation of Crypto

Today, we will talk about FPGAs and their role in security and trust. We will start with the basics of FPGAs and FPGA-based systems. We will discuss the advantages of implementing cryptosystems on FPGAs compared with using ASICs or software. And we'll talk about how FPGAs can be used to design hardware security primitives such as physical unclonable functions and true random number generators. Next time, we'll talk about the vulnerabilities in both FPGAs and FPGA-based system design, and their countermeasures. We'll take an approach based on a supply-and-demand model of the FPGA design market. In terms of background, some experience in FPGA design will be a plus, but it is not required. We do need you to be familiar with the physical attacks we discussed several weeks ago.

FPGA stands for field-programmable gate array. It has input/output blocks on the perimeter, known as I/O buffers, and the core programmable fabric in the middle. The large blocks marked with the letter L are logic blocks, the circuitry that implements various Boolean functions. The logic blocks are connected by wires in the routing channels through the connection blocks, C, and the switch blocks, S. All these blocks, the L blocks, C blocks, and S blocks, are programmable. Today's FPGA devices can have billions of transistors in 10 nanometer technology. Designing with this many logic units and such a huge capacity can be a very big challenge. To help FPGA system developers, most FPGA vendors put many additional features on their FPGAs beyond the programmable logic cells alone. Typically, an FPGA chip has hardwired functional units, such as DSP cores or embedded processors, for certain applications.
It may also have large memory blocks, high-speed input/output, and other types of design IP.

This is a typical design flow for an FPGA-based system. Starting from a given system specification, we first select the FPGA device that will be used to implement the system. Then we use a hardware description language, such as Verilog or VHDL, to specify the behavior of the system based on the system specification, and we write all the timing requirements in the design constraint file. With the help of the FPGA development tool associated with the FPGA device we have selected, a bitstream file is generated in this phase as the desired output. This file will be used to configure the FPGA device to provide the system functionality. Pay attention: this bitstream file, normally called the configuration bitstream file, works only for this particular FPGA device. FPGAs from different vendors, or of different brands or types, cannot use the same configuration bitstream file. The design has to be reconfigured, in some sense redesigned.

Compared with an ASIC, or application-specific integrated circuit, design of a system with the same functionality and the same specification, an FPGA-based implementation has a much shorter time to market and an extremely low cost. First, in this design flow we don't see any involvement of the silicon foundries. This can save a lot of time and a lot of money. Second, the same FPGA board can be reprogrammed for different applications, or for modified system specifications. These things make the FPGA a very popular design platform. Also, compared with a software implementation of the same system specification, FPGA-based systems are in general much faster, more energy and power efficient, and also have a very low cost.
This is because a software implementation normally needs the support of general purpose computers, and general purpose computers are much more expensive than FPGA boards.

Now let's look at an example of implementing a cryptographic system such as a cipher. One could go with a software implementation: you write C code, Java code, or code in whichever high-level language, compile it, and run it on a specific system. This type of implementation has a short implementation time, it is easy to debug and update, and it has a low cost. On the other hand, we can go with a hardware implementation, which offers the following advantages. First, you will have extremely low power consumption, because the system is designed specifically for this cryptosystem. It will also have high throughput, or high performance, and much faster speed than the software implementation.

The FPGA-based implementation is in some sense a compromise between hardware and software implementation, so it can enjoy the advantages of both. First, there are the programmable logic cell structures on FPGA boards. These are good for implementing bit-wise operations. Second, there are the large built-in memories on FPGA boards. These are in general very good for memory-intensive operations, such as the substitutions used in many ciphers. And third, the reconfigurability of FPGAs, which we mentioned earlier, is good for reuse and integration. There are many reported cryptosystems developed on FPGAs. These designs have demonstrated that FPGAs have advantages in implementing some popular crypto building blocks, such as finite field arithmetic and elliptic curve cryptoprocessors.

Here are some specific benefits of implementing crypto on FPGAs. First, FPGA-based implementation offers algorithm flexibility in two ways.
The first is agility. We know that many security protocols, such as SSL and IPsec, are algorithm independent and may be implemented with multiple or different crypto algorithms, and this offers several advantages. For example, if one algorithm is broken or compromised, we can simply delete it and use another algorithm or another implementation to keep the security protocol from going dark. Similarly, when a new algorithm is developed to implement the security protocol, we can add it into the system and maybe choose it for best performance. On FPGA systems, it becomes very easy to switch cryptographic algorithms during the operation of the target application, for example the security protocols. It also makes the security protocols more adaptive, because uploading new standards, or modifying standards for specific applications, is all doable on FPGAs. This may not be possible, or if it is possible will be very expensive, on application-specific implementations, that is, ASIC implementations.

Second, FPGA implementation offers architecture efficiency. We know that when we fix the values of more variables, the system becomes more application-specific rather than general purpose. In general, a system with fewer variables will be more performance and hardware efficient. Also, when we need to change the values of these parameters, we can redo the design, reoptimize it, and resynthesize it on an FPGA board, but we cannot do these things on an ASIC.

Third, the same FPGA device can be used for multiple security protocols through run-time reconfiguration, as long as these protocols are not used simultaneously. This way, we do not have to implement separate hardware for each of the security protocols, and we can save a lot of system resources.
Next, as we have already seen, compared with a software implementation the FPGA implementation will normally be much faster, often ten times or hundreds of times faster. It will actually have performance very similar to that of very powerful general purpose CPUs. Of course, a dedicated ASIC implementation of a security accelerator will be faster than an FPGA-based system. Finally, and perhaps one of the most important advantages of FPGA implementation, is its cost efficiency. By cost, we mean both the unit price and the design time and design cost.

So FPGAs are ideal for building hardware security primitives. As we have talked about earlier, both types of PUFs, or physical unclonable functions, the delay-based PUF and the memory-based PUF, can be implemented simply on FPGAs. To implement a true random number generator, and for it to be a good one, we have to evaluate the following criteria. First, the entropy source of the true random number generator: where does the randomness come from? On FPGAs and other hardware, due to fabrication variations and other unpredictable and uncontrollable nuances, there are many sources that can be used to generate randomness, such as the phase jitter in the clock network and the path delay. Second, the design footprint. This measures the cost efficiency of the true random number generator. Typical metrics include area, energy, and the time required to generate one random bit. The low cost and high availability of FPGAs make them a very good choice on this criterion. The next two metrics concern the testing of the random bitstreams generated by a true random number generator. These bitstreams have to be tested for predictability and other statistical properties.
There are many randomness test suites available, such as Diehard and the one maintained by NIST. The random bitstreams must also be secure against attacks and robust against environmental variations. As we have seen in the earlier study of PUFs, FPGA and hardware based approaches can meet these requirements. Finally, just as when Rijndael was selected as AES, the ease of implementation is also important, and FPGA-based true random number generators are easy to implement. Next time we will talk about the vulnerabilities in FPGAs and FPGA-based system design.
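
To make the statistical testing of random bitstreams a little more concrete, here is a minimal Python sketch of the frequency (monobit) test, the simplest test in the NIST suite. The function names `monobit_p_value` and `passes_monobit` are chosen here for illustration, and the 0.01 significance level is the conventional default rather than something stated in the lecture.

```python
import math

def monobit_p_value(bits):
    """Frequency (monobit) test: are ones and zeros roughly balanced?

    `bits` is any sequence of 0/1 values. Each 1 counts as +1 and each 0
    as -1; a large absolute sum means the stream is biased.
    """
    n = len(bits)
    if n == 0:
        raise ValueError("need at least one bit")
    s = sum(1 if b else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

def passes_monobit(bits, alpha=0.01):
    """Assumed convention: the stream passes if the p-value is at least alpha."""
    return monobit_p_value(bits) >= alpha

if __name__ == "__main__":
    biased = [1] * 100          # all ones: clearly not random
    alternating = [0, 1] * 50   # perfectly balanced but fully predictable
    print("biased passes:     ", passes_monobit(biased))
    print("alternating passes:", passes_monobit(alternating))
```

Note that the alternating stream is perfectly balanced and so passes this test even though it is completely predictable. That is why a test suite applies many different tests, and why a true random number generator is also judged on its entropy source and its resistance to attack.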

Vulnerabilities and Countermeasures in FPGA Systems

Last time we talked about the advantages of using FPGAs to implement cryptosystems and hardware security primitives. Now we discuss the vulnerabilities and the countermeasures in FPGA-based systems and the FPGA design flow. We have seen most of these vulnerabilities and countermeasures before, so this can be a very good review of what we have learned in the past five to six weeks.

FPGA systems, like other computing systems, have vulnerabilities. There have been reported attacks using side channels and fault injection to break FPGA systems. There have also been attacks on all types of FPGA systems: SRAM-based, anti-fuse-based, and flash-based FPGAs. On the other hand, the design flow of an FPGA system itself is not secure. At the hardware description language design level, dishonest designers may steal design details of the intellectual property, or they may insert Trojans into the design. Untrusted third-party IPs can do similar damage. During the synthesis process, dishonest EDA tools may introduce Trojans into the design or perform IP piracy, and the EDA tools themselves need to be protected from illegal use by dishonest designers. Finally, the FPGA configuration bitstream file needs to be protected from being misused by users.

Next we'll elaborate on these vulnerabilities and attacks from the market view of FPGAs. In this FPGA market, there are six major players. There are the FPGA vendors, the big FPGA companies that design and sell FPGA chips. There are the semiconductor foundries that build the chips. There are IP vendors that build DSP cores and other intellectual property. There are EDA tool vendors, and there are system developers, who develop FPGA-based systems. And eventually we have the end users, who purchase and use FPGA-based systems.
And there are several connections between pairs of parties. For example, here we have the FPGA vendors with this edge, whose single arrowhead goes to the foundries. This indicates a service request: the FPGA vendor wants its FPGA chips to be fabricated. The line with a double arrowhead indicates the returned service. In this case, the foundry fabricates the chips and returns them to the FPGA vendors. We'll study each of these connections, its vulnerabilities, and the potential attacks.

The first one is between the FPGA vendors and the foundry. In this case the FPGA vendors request fabrication from the foundry, and the foundry gives them the FPGA chips based on the design from the FPGA vendors. When the FPGA vendors get the chips from the foundry, they may not simply trust them. They may consider whether there is anything malicious inside these FPGA chips, for example, whether there is any hardware Trojan, and whether any information was leaked during the design of the FPGAs. Another main concern is whether the foundry has fabricated exactly the number of FPGA chips requested, or whether it has overbuilt, that is, built more FPGA chips.

The second connection is between the FPGA vendors and either the EDA tool vendors or the IP vendors. In this case the FPGA vendors want design tools and other important intellectual property, for example DSP cores, for their FPGA chips, and both the tool vendors and the IP vendors provide such service to the FPGA vendors. Here there are concerns on both sides. The FPGA vendors will be concerned about whether the products from the EDA tool vendors and the IP vendors contain any Trojan, and whether they will leak any design information. The tool vendors and the IP vendors will be concerned about how to protect their IPs, and whether the FPGA vendors will do reverse engineering and then misuse their IPs.
The next connection is between the system developers and the FPGA vendors. In this connection the system developers buy FPGA chips from the FPGA vendors, and the FPGA vendors simply sell the chips to them. The FPGA-based system developers will be concerned about whether the chips they get have any hardware Trojan, and whether the chips will try to steal any valuable design information.

Finally, once the system developer builds the FPGA-based system, the end user will purchase such a system and use it. When end users purchase an FPGA-based system, they will be concerned about the trust of such a system: whether there is any hardware Trojan, and whether it will leak any information while the user is using the system. From the FPGA-based system developer's point of view, the question is whether these users can be trusted or not: whether they will open the system, perform reverse engineering, and then try to clone the FPGAs or retrieve the configuration bitstream files, and whether they may launch side-channel attacks or the FPGA replay attack, which is going to be discussed next.

What we'll do now is consider each of these possible attacks and the available countermeasures. We start with foundry overbuilding. In this attack, the foundry builds more chips than requested and then keeps the extra ones for itself, or tries to sell them at a lower price. As we have discussed before, the hardware metering technique or the remote activation technique can be used to prevent such an attack. The second attack is the Trojan. We have seen many things that could be Trojans, and we have discussed a lot of Trojan detection and Trojan prevention techniques that can be applied in this scenario. We have also talked about how to test for them during the testing process.
The goal there is to find out whether any Trojan exists in the chip.

The next one is reverse engineering. Reverse engineering happens at multiple levels. It can happen at the hardware description language design level. It can happen at the configuration bitstream level. To protect against reverse engineering attacks, people can encrypt the configuration bitstream file, or they can use obfuscation to make the design harder to understand.

The next one is the cloning attack. After the reverse engineering, the illegal users try to use the reverse-engineered configuration bitstream file to configure FPGA chips, and this is what we call the cloning attack. We have talked about the digital watermarking and digital fingerprinting techniques to deter such an attack. Your watermark proves that this is your design: even after they clone it, you can prove it is yours. And once you put in a fingerprint, you can tell who has illegally tried to clone the chip. There is also some research work trying to bind the design to the FPGA chip with encryption or with a silicon PUF.

The next one is side-channel attacks. There have been a lot of reported successful attacks through the power channel, the timing channel, or the electromagnetic emission channel, trying to leak information from the FPGA chip. Most of the techniques against passive attacks that we discussed earlier can be applied to FPGA-based systems against side-channel attacks.

Finally, there is a new attack on FPGA systems, called the FPGA replay attack. What happens in this case is that when a new or updated FPGA system is on the market, attackers may try to use known vulnerabilities of the old systems.
They want to see whether the newer or updated system has fixed that known vulnerability or not. If it hasn't, then they will have a successful attack. One type of countermeasure, called remote update protocols, tries to use a reconfigurable binding to defend against such attacks.
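
The lecture does not spell out how a remote update protocol works, so the following Python sketch is only one possible reading, not the method from the course. It assumes that the replay consists of re-sending an old, validly authenticated configuration in order to bring back a known vulnerability, and that the device defends itself by keeping a version number that may only increase. The names `sign_update` and `Device` are hypothetical, and an HMAC with a shared key stands in for whatever authentication a real protocol would use.

```python
import hashlib
import hmac

def sign_update(key: bytes, version: int, bitstream: bytes) -> bytes:
    """Authenticate a (version, bitstream) pair with HMAC-SHA256.

    Assumption: vendor and device share `key`. The version is bound into
    the tag so an attacker cannot relabel an old bitstream as new.
    """
    message = version.to_bytes(4, "big") + bitstream
    return hmac.new(key, message, hashlib.sha256).digest()

class Device:
    """Hypothetical device that remembers the newest version it accepted."""

    def __init__(self, key: bytes, version: int = 0):
        self.key = key
        self.version = version

    def apply_update(self, version: int, bitstream: bytes, tag: bytes) -> bool:
        expected = sign_update(self.key, version, bitstream)
        if not hmac.compare_digest(expected, tag):
            return False   # forged or corrupted update
        if version <= self.version:
            return False   # replay of an old or already-installed version
        self.version = version
        return True

if __name__ == "__main__":
    key = b"k" * 32
    device = Device(key, version=1)
    old = (1, b"old bitstream with known flaw")
    new = (2, b"patched bitstream")
    print("new update accepted:", device.apply_update(*new, sign_update(key, *new)))
    print("old replay accepted:", device.apply_update(*old, sign_update(key, *old)))
```

The point of the sketch is that authenticating the update alone is not enough: the old bitstream carries a perfectly valid tag, and it is only the stored version number that lets the device reject it.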

Role of Hardware in Security and Trust

Let's conclude the course with some discussion of the roles that hardware plays in security and trust. Assume that you have some valuables: gold, diamonds, jewelry, or whatever. How do you protect them? Most of us will think: put them in a safe and lock it. In our discussion, hardware is the safe. Let's say you have a digital secret you want to protect. You can develop a solid crypto algorithm with a secure protocol to protect it, and you can write solid, well-protected software to implement this crypto algorithm or secure protocol. But eventually you need hardware. You need computers to execute these software products. You need devices to do the communication. So this is the first role that hardware plays in security and trust. It is an enabler: without hardware we cannot do all this computation and communication.

Also, people know that customized hardware always has better performance than a software implementation. That brings us to the second role of hardware in security and trust: it is an enhancer. Normally, it enhances the performance of the system in terms of speed, energy consumption, et cetera. We have also seen cases where, with additional hardware inside, the system gains more security.

More recently, we have seen more advanced involvement of hardware in security, represented by the Trusted Platform Module, or TPM, and also by a lot of biometric co-processors. What they do is build a first line of defense. If you cannot pass the authentication check in the TPM, or the authentication check by the biometric co-processor, you cannot access the CPU.

However, while it is important to have hardware, you also have to design your hardware properly. Let's come back to the safe. The safe is solid; however, suppose you leave the combination of the safe right on top of the safe.
Then this safe is not safe anymore. This is the same as when you design hardware and leave a lot of back doors. Through the course, we have seen a lot of vulnerabilities on the hardware side. People can use side-channel attacks. People can use physical attacks, trying to steal the information inside the hardware or to attack the physical system. This brings us to the question: has hardware become the weakest link in security or not? Unfortunately, if you don't design your hardware properly, if you don't think about security and trust during the design and optimization of your hardware, it will become the weakest link in security.

So what are the challenges for hardware design? In my opinion, there are three things we have to consider. First, as a hardware designer, you want to secure the design. You want to protect your intellectual property. We have discussed attacks such as overbuilding, counterfeiting, IP theft, and IP misuse, and we have already provided countermeasures or techniques to prevent them, like digital watermarking, digital fingerprinting, hardware metering, and all these kinds of authentication. You also have to make sure that the hardware you deliver to the customer is trusted. This is what we call trusted integrated circuits. There shouldn't be any back doors. There shouldn't be any hardware Trojan horses.

The second challenge of hardware design is that, eventually, the hardware is going to do some computation. It's going to hold some data. You want to make sure the data is secured against side-channel attacks and against any physical attacks on any kind of memory.

And finally, you have to be active in providing hardware security primitives, to enhance or improve the system's overall security performance. We have talked about the TPM, and there are also secure co-processors, which speed up the computation for security. We have seen PUFs. We have seen random number generators.
The x here can stand for T, as in true random number generators, or P, as in pseudo-random number generators. You also have to keep looking at new devices and new technologies, and at whether they will bring any vulnerabilities to hardware design, or whether they can provide new features for hardware design. I hope that in these six weeks you have learned something useful that you can use to protect your design and to make your design more secure. And finally, I will say thank you.

Physical Unclonable Functions (PUF) Basics

Welcome to Hardware Security. Today we are going to talk about physical unclonable functions, or PUFs for short. This is a hot topic in hardware security. Let's start with the definition. A PUF is a function that is based on a certain physical system. Using this physical system, it is very easy to evaluate the PUF. A PUF behaves like a random function, in the sense that it generates random output values. These random output values are unpredictable even for attackers with physical access to the system. And finally, even when we know the functionality of a PUF, it is impossible to clone it, or to reproduce it on another copy of the same physical system.

Let's see a small example. Here we have six inverters, three on the top path and three on the bottom path. Both paths lead to a modified D flip-flop with one additional control signal. When this control signal C equals 1, this is a normal D flip-flop, whose content is the same as the data input. When the control signal C is 0, regardless of the value of D, the flip-flop content remains unchanged.

Let's start with the input value equal to 0. After the three inverters, this 0 becomes 1 on both the top and the bottom paths. When the two 1s arrive at the D flip-flop, we have C equal to 1 and D equal to 1, and therefore we know the flip-flop content should be 1. Now we change the input from logic low to logic high, or from 0 to 1. Accordingly, the internal signals will change in the same way. After three inverters this 1 changes to a 0, or this logic high becomes low, and both 0s reach the D flip-flop. By definition, with C equal to 0 and D equal to 0, the flip-flop content shouldn't change, so it remains 1. However, this is based on the assumption that the path delays on the top path and the bottom path are identical. Or, in some sense, that the two 0s arrive at the D flip-flop simultaneously.
Due to the manufacturing process of the digital system, we know there are fabrication variations, sometimes also called manufacturing variations. These variations are uncontrollable and unpredictable. Because of the variations, we know these six inverters may not have the same delay. Therefore it might not be possible for the two 0s to arrive at the D flip-flop at the same time. In that case, what will be the content of this flip-flop?

First let's assume that the top path is faster. So this 0 will reach the signal D before the C signal changes from 1 to 0. What we have is C equal to 1 and D equal to 0, so in this case we know that the flip-flop content will change from 1 to 0. On the other hand, suppose the bottom path is faster. Then this 0 comes to the flip-flop earlier and changes C to 0. This disables the flip-flop, which means its content will remain unchanged.

What we have seen so far is that this is a simple physical system, and based on this system we can easily evaluate a certain function. Given the input value here, we get the value of the output by looking at the content of the D flip-flop. And we have just analyzed that this output has some randomness. It can be either 1 or 0, and this is determined by the manufacturing variation, by which path is faster. These manufacturing variations are unpredictable. They are also uncontrollable, in the sense that if I fabricate the same system, the same circuit, with six such inverters and this modified D flip-flop, we cannot guarantee that we will have the same behavior as the previous one that we analyzed. So this example meets all the requirements for a PUF.

Based on the source of the physical randomness of PUFs, we can partition them into two groups: the silicon PUFs and the non-silicon PUFs.
Silicon PUFs include memory-based PUFs, delay-based PUFs, and analog electronic signal based PUFs, such as those using power grid signals. For non-silicon PUFs, we have seen optical PUFs, paper PUFs, acoustic PUFs, magnetic PUFs, and so on. For example, in paper PUFs, people believe that the roughness of the surface of the paper is random, and it can be used for a PUF. In the rest of the lecture, we are going to focus on the memory-based PUF and the delay-based PUF, mainly because they are relatively easy to integrate into CMOS technology.

Let's start with the memory-based PUF. This is a simple cell of an SRAM, where we have the cross-coupled inverters. Say my input is 1. This 1, through the inverter, becomes 0. This 0 comes down, goes through the other inverter, and becomes 1 again. And this 1, going back, becomes a 0. So this circuit now becomes stabilized: we have a stable input of 1 and an output of 0. Similarly, if the input is 0, the output will be a stable 1. The idea behind the SRAM PUF is that during the system's initial power-up stage, the input signals to the SRAM cells are random. Therefore, the output will also be random. Or, in some sense, the initial value of each SRAM cell will be random. This can be used to build an SRAM PUF.

Similarly, a latch can be used to define a PUF bit. For example, here we initially set A equal to 0. This 0 makes both NAND gates output 1, so B and C will both be 1. Now we change A from 0 to 1. We know that these two-input NAND gates may not have the same speed, so let's assume, for example, that gate number one is faster. A is a 1 and, previously, C is also a 1, so these two 1s change B to a 0. That is what we have here. This 0 comes down here, making the output of gate number two a 1, which means there's no change on this line.
Then this 1 comes back here and, together with the 1 from A, keeps B stable at 0. So the circuit becomes stabilized. In the stable state we have B equal to 0 and C equal to 1. Because of the symmetry of this circuit, we know that if we assume gate 2 is faster, then C will be 0 and B will be 1. So in this case, when we make one change in the input, the output may have different values depending on the intrinsic properties of the circuit.

The SRAM PUF based on the initial power-up values is only good for ASICs. It does not work for FPGAs, because during FPGA power-up the configuration bitstream will initialize the SRAM cells. So we cannot build an SRAM PUF for an FPGA in this way. Researchers have proposed butterfly PUFs and flip-flop PUFs for FPGAs. These we're not going to elaborate on.

Now let's see the delay-based PUF. This is what we call the ring oscillator PUF. If we consider this as an inverter, this will be similar to what we have seen before as the small example for PUFs. For these two chains of inverters we know that they have a tiny delay difference, and that delay difference can be used to define a PUF bit. However, because this delay difference is so tiny, it is hard to measure. To solve this problem, we add this feedback signal here and replace this inverter by a two-input NAND gate, and we do the same for the bottom chain. So each one becomes an oscillator. The impact of this is as follows: say initially these two circuits have a certain delay difference. If I let each one run many times, for example a thousand times, and then measure the delay difference, it will be magnified to 1,000 times the original, so it becomes much easier to measure. Even better, instead of measuring the delay difference, it's much easier for us to give them a fixed amount of time.
For example, let both ring oscillators run for one millisecond, and use this counter to see how many rounds each has run. Whichever has run more rounds is the faster one. Then, based on which is faster and which is slower, we can define a bit.

Another popular delay-based PUF is the one called the arbiter PUF, where you have multiple stages, each with a pair of multiplexers. These multiplexers are designed carefully so that these paths are all symmetric, in the sense that they will have the same wire delay. For example, we use these bits, which we call the challenges, as the selection bits for the multiplexers. The first one has a 0, so it's going to pick this path coming into input 0. For the second multiplexer, because I also want to pick 0, I'm going to pick this path. The way these paths are built tries to make sure they have similar wire delays.

Now let's assume that initially the signal is high, or logic 1, here. At the end these two signals will also be high. Once we have C equal to 1 and D equal to 1, we know that the system has a memory content of 1. Now let's see the signal go down from high to low. Because of the variation, suppose this blue curve, or the blue path, is faster. When the blue path is faster, this one goes down to 0 first. At that time, the C signal is still 1, so this is a normal D flip-flop, and its content will be the same as the D value, which is 0. However, for the same design on another chip, because of the unpredictability of fabrication variation, we may have a completely different scenario. In this case we see the red path is much faster. Here, C becomes 0 first, and it's going to disable the flip-flop, so the flip-flop keeps its original value, which is 1.

Now let's see some applications of PUFs. First, a PUF can be used for device identification.
This is based on the observation that the same PUF circuitry will generate different PUF data on different chips. In that way, any two chips will have different PUF data, and this PUF data can be used to distinguish the two chips.

A PUF can also be used for key generation and storage. This is much more secure than storing keys in memory, because a key in memory is vulnerable to physical attacks and other attacks. If we use a PUF to generate the key, the attackers can still use an invasive attack to open up the chip and see the structure of the PUF, but without the PUF actually running, they won't be able to figure out what the key behind this structure is. However, if we want to use the PUF data as a cryptographic key, we have to do some post-processing to make this key reliable, robust, and random.

A PUF can also be used for IP protection. For example, it can be combined with [INAUDIBLE] machine to do active IC metering.

As we have seen from the arbiter PUF, there is a challenge and response pair. Based on this challenge-response pair, which we'll call the CRP, several security protocols can be defined. For example, we can define a device authentication scheme. The user will have a pair of challenge and response. When the user wants to authenticate the device, he will send the challenge to the device, and the device will return a response. If the response is correct, then the device is authenticated; otherwise it is not. This pair can also be used for encryption. For example, the data can be encrypted using the PUF as the secret key, and whoever has the public key can use that to decrypt the message. Recently there is also a proposed approach called timed authentication. This is motivated by the observation that when we do device authentication with challenges and responses, attackers can build a model to attack this.
In timed authentication, the authors observe that the time it takes for the genuine device to respond to a challenge is much less than the response time from model-building attacks, or from any kind of hardware or software emulation. A PUF can also be used to bind software licensing to a particular chip, because each chip has a different PUF ID, and we can tie this PUF ID to the software's licensing information. And finally, people have proposed to use PUFs to build secure memory and secure processors.
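The challenge-response authentication described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: `ToyPuf`, `enroll`, and `authenticate` are hypothetical names, and a hash over a hidden per-chip value stands in for the fabrication variation that a real PUF circuit would exploit. It is not how an arbiter or ring oscillator PUF computes its response.

```python
import hashlib
import random


class ToyPuf:
    """Hypothetical stand-in for a PUF on one chip.

    Assumption: a hidden per-chip value plays the role of fabrication
    variation; a real PUF derives its response from circuit delays.
    """

    def __init__(self, chip_seed: int):
        self._variation = random.Random(chip_seed).getrandbits(128).to_bytes(16, "big")

    def respond(self, challenge: int) -> int:
        digest = hashlib.sha256(self._variation + challenge.to_bytes(8, "big")).digest()
        return int.from_bytes(digest[:2], "big")  # 16-bit response


def enroll(puf, challenges):
    """Record challenge-response pairs (CRPs) while the genuine chip is at hand."""
    return {c: puf.respond(c) for c in challenges}


def authenticate(device, crp_table, challenge):
    """Send a stored challenge and compare the device's response with the record."""
    return device.respond(challenge) == crp_table[challenge]


genuine = ToyPuf(chip_seed=1)
other = ToyPuf(chip_seed=2)   # same "design", different chip
crps = enroll(genuine, challenges=range(8))

genuine_ok = all(authenticate(genuine, crps, c) for c in crps)
other_ok = all(authenticate(other, crps, c) for c in crps)
print(genuine_ok, other_ok)
```

The verifier only ever stores the pairs, never the hidden per-chip value, which mirrors the point that the key is not kept in memory.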

RO PUF - Reliability

We have introduced the basic concepts of the PUF. Before we can use a PUF in security-related applications, there are many concerns that need to be addressed. Today we'll use the ring oscillator PUF as an example to discuss the reliability issue. First, what are the reliability problems in PUFs, and why are they important? Think about the potential PUF applications we have discussed, for example, device identification and authentication, encryption and decryption. The PUF data plays a vital role in these applications, and even the flip of one single bit will cause failure. So why may a PUF bit change? There are many reasons. For example, in a delay-based PUF, circuit delay will be affected by environmental conditions such as temperature, supply voltage, and humidity. As a circuit ages, it also becomes slower, and different parts of the circuit may age at different rates. Finally, there might be measurement errors when we collect the delay information.

Take the temperature-frequency relationship for example. This figure shows that when temperature goes up, the frequency of both ring oscillators goes down; however, the blue ring oscillator is always faster than the red one. Therefore, if we define the PUF bit to be 1 when the blue ring oscillator is faster than the red one, this pair of ring oscillators will give us a reliable bit 1. But we are not that lucky for this other pair. When temperature is low, the blue ring oscillator is faster, which gives a bit 1, but when temperature becomes high, the red one will run faster, and the PUF bit will flip to 0 because now the blue ring oscillator is slower. So the question is how to make the PUF bit reliable. Remember, in the ring oscillator PUF, the bit reliability is determined by the delay or frequency gap between the ring oscillator pair, as shown in this figure.
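To make the two cases above concrete, here is a small Python sketch of a ring oscillator pair read out with a counter over a fixed window. The frequencies, slopes, and the linear temperature model are made-up assumptions for illustration only, and `cycle_count` and `puf_bit` are hypothetical helper names.

```python
def cycle_count(f0_mhz, slope_mhz_per_c, temp_c, window_us=1000.0):
    """Cycles seen by the counter in a fixed window (default 1 ms).

    Assumption: frequency falls linearly as temperature rises.
    """
    freq_mhz = f0_mhz - slope_mhz_per_c * temp_c
    return int(freq_mhz * window_us)  # MHz * microseconds = cycles


def puf_bit(blue, red, temp_c):
    """Return 1 if the blue oscillator counts more cycles than the red one."""
    return 1 if cycle_count(*blue, temp_c) > cycle_count(*red, temp_c) else 0


# Each oscillator is (frequency at 0 C in MHz, slope in MHz per degree C).
# All numbers are hypothetical.
stable_pair = ((200.0, 0.10), (198.0, 0.10))    # blue stays faster everywhere
unstable_pair = ((200.0, 0.12), (199.0, 0.10))  # the two curves cross at 50 C

temps = [0, 25, 75, 100]
stable_bits = [puf_bit(*stable_pair, t) for t in temps]
unstable_bits = [puf_bit(*unstable_pair, t) for t in temps]
print(stable_bits)
print(unstable_bits)
```

In the first pair the gap never changes sign, so the bit is the same at every temperature. In the second pair the gap shrinks and changes sign, which is exactly the bit flip described above.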
A simple solution to improve reliability is to increase the frequency gap, and the easiest way to do that is to use a large threshold during the selection of ring oscillator pairs. When the delay gap is smaller than this threshold, we simply do not define any PUF bit based on that pair of ring oscillators. Another way to do so is to increase the size of the selection pool: instead of comparing a pair of ring oscillators, we can compare the fastest one and the slowest one among k ring oscillators. This will give us a pair of ring oscillators that have a large delay gap. These approaches improve reliability, but they still cannot guarantee that there won't be any bit flips, so for crypto applications, error correction coding has been used to correct the PUF bit errors. There can be a large hardware overhead associated with these approaches, because the unreliable and unselected ring oscillators will be wasted.

Now we present one method that can utilize these unreliable ring oscillators. Consider this pair of ring oscillators. The blue ring oscillator is faster in the low-temperature region, which gives us a bit 1; however, it becomes slower in the high-temperature region, so the bit would change to 0. During the chip-testing phase, if we can identify this temperature region, we can still manage to have a reliable bit. For example, when we see the temperature is in the high-temperature region, we can take the bit 0 from the PUF and flip it to get the reliable bit value 1. The challenge is how to retrieve this bit 1 when temperature falls into the unreliable region in between. That is when the concept of cooperation comes into play. We select a secondary pair that will be able to provide a reliable bit in the unreliable temperature region of the primary pair. As part of the helper data, the primary pair will remember which pair is its secondary pair and whether the bit from that secondary pair needs to be flipped to retrieve the PUF bit defined by the primary pair.
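The cooperation idea can be sketched as follows. This is a minimal Python illustration, assuming a linear frequency-versus-temperature model and made-up numbers; `HelperData`, `raw_bit`, and `reliable_bit` are hypothetical names, and the region boundaries of 40 and 60 degrees are chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class HelperData:
    """What the primary pair remembers from chip testing (hypothetical layout)."""
    unreliable_low_c: float       # start of the primary pair's unreliable region
    unreliable_high_c: float      # end of the primary pair's unreliable region
    flip_primary_when_hot: bool   # flip the primary bit above the region
    flip_secondary: bool          # flip the secondary bit inside the region


def raw_bit(pair, temp_c):
    """1 if blue is faster than red; each oscillator is (f0_mhz, slope_mhz_per_c)."""
    (blue_f0, blue_slope), (red_f0, red_slope) = pair
    blue = blue_f0 - blue_slope * temp_c
    red = red_f0 - red_slope * temp_c
    return 1 if blue > red else 0


def reliable_bit(primary, secondary, helper, temp_c):
    if temp_c < helper.unreliable_low_c:
        return raw_bit(primary, temp_c)            # low region: trust the primary
    if temp_c > helper.unreliable_high_c:
        bit = raw_bit(primary, temp_c)             # high region: primary, maybe flipped
        return bit ^ 1 if helper.flip_primary_when_hot else bit
    bit = raw_bit(secondary, temp_c)               # unreliable region: ask the secondary
    return bit ^ 1 if helper.flip_secondary else bit


primary = ((200.0, 0.12), (199.0, 0.10))    # curves cross at 50 C (made-up numbers)
secondary = ((197.0, 0.10), (199.0, 0.10))  # blue is always slower, so raw bit is 0
helper = HelperData(40.0, 60.0, flip_primary_when_hot=True, flip_secondary=True)

temps = [0, 30, 45, 50, 55, 70, 100]
bits = [reliable_bit(primary, secondary, helper, t) for t in temps]
print(bits)
```

The raw bit of the primary pair still changes with temperature; only the combination of region, secondary pair, and flip flags yields the same value everywhere.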
With this information, we can define a reliable bit as follows. When the temperature is low, the blue ring oscillator is faster, and that gives us the correct bit value 1. When temperature rises into the unreliable region, we ask the secondary pair for help, which gives us a bit 0; according to the helper data, we know that we need to flip this bit to get the correct value of 1. When temperature goes into the high-temperature region, the primary pair yields a bit 0, which we know we need to flip to get the reliable bit 1. By doing this, we can define a reliable bit 1 across the entire temperature range.

Now let's consider this pair of ring oscillators, each with five inverters. The numbers indicate the delay of each inverter. It is easy to find that the delay of the top ring oscillator is 28 and that of the bottom one is 25, so the top one is 3 units of time slower than the bottom one. This delay difference of 3 will determine the reliability of this PUF bit, and whether this pair of ring oscillators will be used at all on this chip. However, if we take only the last three inverters to form the two ring oscillators, we see that the delay difference is doubled to 6, which indicates a much more reliable PUF bit. This is an interesting observation, but the first question we need to solve is how to construct ring oscillators from selected inverters rather than all of them. This is the architecture to make this happen. In parallel to each inverter, we add a wire connection, so that at each stage we can select whether we want to include the inverter in the ring oscillator or not. The concept is shown in this figure, where we use a multiplexer to control the configuration at each stage. If we want to use this inverter, we set the selection bit to 1. If we want to skip this inverter, we take the wire connection and set the selection bit to 0.
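The selection of inverters can be illustrated with a short Python sketch. The per-inverter delays below are hypothetical: the figure's actual values are not reproduced in these notes, so the numbers were chosen only to match the totals mentioned above (28 and 25, with a gap of 6 on the last three stages). `ring_delay` and `best_config` are illustrative names, and the sketch only sums delays; it ignores circuit-level constraints such as whether the selected ring actually oscillates.

```python
from itertools import product

# Hypothetical per-inverter delays, chosen to sum to 28 and 25.
TOP = [4, 5, 7, 6, 6]
BOTTOM = [6, 6, 5, 4, 4]


def ring_delay(delays, config):
    """Total delay of the inverters whose selection bit is 1."""
    return sum(d for d, use in zip(delays, config) if use)


def gap(config):
    """Delay difference between the top and bottom rings for one selection."""
    return ring_delay(TOP, config) - ring_delay(BOTTOM, config)


def best_config(n_stages):
    """Exhaustively pick the selection bits that maximize the absolute gap."""
    configs = [c for c in product((0, 1), repeat=n_stages) if any(c)]
    return max(configs, key=lambda c: abs(gap(c)))


all_stages = (1, 1, 1, 1, 1)
last_three = (0, 0, 1, 1, 1)
chosen = best_config(len(TOP))

print(gap(all_stages), gap(last_three), chosen)
```

The same selection bits are applied to both rings here, matching the example where the last three inverters of each oscillator are used. Exhaustive search is fine for five stages; for longer chains one would likely pick stages by the sign of each per-stage difference instead.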
The selection bits of the multiplexers are referred to as the configuration vector. Of course, in real circuit design, this can be implemented much more efficiently. Here are the advantages of this highly flexible, or configurable, ring oscillator PUF. First, it allows us to build the ring oscillator PUF at the inverter level instead of the ring oscillator level. At this finer level, as we have seen in the motivation example from the last slides, it can create larger delay gaps between the two ring oscillators. Second, the configuration vector will be decided at the post-silicon testing phase, so we can select the inverters based on their real delays to maximize the delay gap. With a large delay gap between this ring oscillator pair, we can build

Trusted Platform Module and Other Good Practices

Welcome back, everyone. This is the last week of the course. We will cover some good practices in security design and emerging hardware security topics. We will talk about the Trusted Platform Module, or TPM, with a focus on its crypto key generation and management structure. Then we will learn the basic concepts of the physical unclonable function. We will discuss the vulnerabilities, attacks, and countermeasures in FPGA design and FPGA-based systems. We will conclude the week, and the course, with a brief overview of the role that hardware plays in security and trust. We will need some basic knowledge of digital logic design to understand the physical unclonable function. A review of physical attacks, IP protection, and hardware Trojans will be helpful for the materials related to FPGA system security.

Trusted Platform Module, or TPM for short, refers both to the set of specifications for a secure crypto-processor and to the implementation of these specifications on a chip. We will focus our discussion on the TPM chips. The specifications can be found on the webpage of the Trusted Computing Group. TPM chips can be installed on the motherboard and are used in almost all PCs, laptops, and smartphones. It is suggested to use TPM together with firewalls, antivirus software, smart cards, and biometric verification. Many companies are making their own TPM chips. A TPM chip's main functions include secure generation of cryptographic keys, protection of these keys, generation of pseudo-random numbers, hardware authentication, sealed storage of passwords, keys, and digital certificates, and remote attestation. Remote attestation allows changes to the user's computer to be detected by authorized parties.

TPM has many types of keys. Among them, the endorsement key is perhaps the most important one. This is an RSA public-private key pair. It is created only once in the lifetime of the TPM.
Normally, it is generated by the TPM manufacturer after the TPM chip is fabricated and tested. The private key is stored inside the TPM and can be used internally for decryption. It will never be revealed or accessed outside the TPM. The storage root key, or SRK, is another important key stored inside the TPM. It is created when the TPM's ownership is established. The TPM's endorsement key and the user-specified password will be used to generate the SRK. The attestation identity key, or AIK, is another RSA key pair, designed for attestation. Its public key will be signed by the endorsement key. Then it will be sent to a trusted certificate authority, or CA. The CA will validate the endorsement key and issue a certificate for the attestation identity key. The TPM will authenticate itself using this certificate. This is the start of direct anonymous attestation.

Storage keys are asymmetric keys used for encrypting data or other keys. This encryption process is sometimes referred to as wrapping. Signing keys are general-purpose keys used to sign applications, data, or messages. Binding keys are used to encrypt a small amount of data or a key on one platform and to decrypt it on another platform. Authentication keys are symmetric keys used to protect transport sessions involving the TPM. Legacy keys are keys that can be exported to another TPM after creation and may be used for signing and encryption operations. They are associated with specific users and their application fields.

Based on whether a key can be transferred from one TPM to another, a TPM key will have one of the following attributes. Non-migratable keys, or NMK, are those created for one TPM that cannot be migrated or exported to another TPM. They are stored in the TPM's shielded locations, and the TPM can create certificates for non-migratable keys.
Migratable keys, or MK, are keys that are not generated for a specific TPM. They can be generated either inside or outside a TPM, and they can be integrated into a TPM or moved from one TPM to another. Migratable keys are trusted only by the party who generates them. Certified migratable keys, or CMK, are generated inside a TPM, like non-migratable keys, but they can be migrated to another TPM. At the time they are created, the creator has to pick a migration authority, or MA, or a migration selection authority, or MSA, which will have the authority to migrate the key. CMKs can be migrated and still be considered secure, as long as the MSA and the MAs that migrate the keys are trusted.

Here are some advantages and disadvantages of using TPM. Using TPM will add security against physical threats and attacks. It can also give you convenience with the single sign-on feature. However, there are also a couple of concerns about using TPM. The first and biggest concern is about user privacy. With all the user data stored on the TPM, once the TPM is compromised, the user's sensitive data will be lost. Second, as we have mentioned, there are many TPM vendors. How can we trust the hardware design process, and how can we trust these vendors? That is also a concern.

Here are some useful links about the TPM. We give the [INAUDIBLE] homepage of the Trusted Computing Group, and how to use TPM with Linux, and TPM and BitLocker in the Windows 8 operating system. Finally, besides TPM, there are many good secure design practices from most of the leading companies, such as AMD, Cisco, HP, IBM, Intel, as well as the major FPGA vendors, such as Altera and Xilinx. I'm not going to do a commercial for any of them. If you are interested, you can easily find on their web pages how they do secure system design.


Monday, August 31, 2015

Hardware Security - Good Practice and Emerging Technologies - Week 6
