Monday, August 31, 2015

Software Security - Introduction

Introducing Computer Security
Computer security, more recently known as cyber security, is an attribute of a computer system. The primary attribute that system builders focus on is correctness: they want their systems to behave as specified under expected circumstances. If I'm developing a banking website, I'm concerned that when a client specifies a funds transfer of, say, $100 from one of her accounts, then $100 is indeed transferred if funds are available. If I'm developing a word processor, I'm concerned that when a file is saved and reloaded, I get my data back where I left off. And so on.

A secure computer system is one that prevents specific undesirable behaviors under wide-ranging circumstances. While correctness is largely about what a system should do, security is about what it should not do, even when there is an adversary who is actively and maliciously trying to circumvent any protective measures you might put in place. There are three classic security properties that systems usually attempt to satisfy; violations of these properties constitute undesirable behavior. These are broad properties, and different systems will have specific instances of some of them depending on what the system does.

The first property is confidentiality. If an attacker is able to manipulate the system so as to steal resources or information, such as personal attributes or corporate secrets, then he has violated confidentiality. The second property is integrity. If an attacker is able to modify or corrupt information kept by a system, or is able to misuse the system's functionality, then he has violated the system's integrity. Example violations include the destruction of records, the modification of system logs, the installation of unwanted software like spyware, and more. The final property is availability. If an attacker compromises a system so as to deny service to legitimate users, for example to purchase products or to access bank funds, then the attacker has violated the system's availability.

Few systems today are completely secure, as evidenced by the constant stream of reported security breaches that you may have seen in the news. In 2011, for example, the RSA Corporation was breached; I'll say more about how in a moment. The adversary was able to steal sensitive tokens related to RSA's SecureID devices. These tokens were then used to break into companies that use SecureID. In late 2013, Adobe Corporation was breached, and both source code and customer records were stolen. At around the same time, attackers compromised Target's point-of-sale terminals and were able to steal around 40 million credit and debit card numbers. And these are just a few high-profile examples.

How did the attackers breach these systems? Many breaches begin with the exploitation of a vulnerability in the system in question. A vulnerability is a defect that an adversary can exploit through carefully crafted interactions to get the system to behave insecurely. In general, a defect is a problem in the design or implementation of the system such that it fails to meet its requirements; in other words, it fails to behave correctly. A flaw is a defect in the design, while a bug is a defect in the implementation. A vulnerability is a defect that affects security-relevant behavior, rather than simply correctness. As an example, consider the 2011 RSA breach. This breach hinged on a defect in the implementation of Adobe Flash Player.
Where the Flash Player should have benignly rejected malformed input files, the defect instead allowed the attacker to provide a carefully crafted input file that could manipulate the program to run code of the attacker's choice. This input file could be embedded in a Microsoft Excel spreadsheet so that Flash Player was automatically invoked when the spreadsheet was opened. In the actual attack, the adversary sent such a spreadsheet to an executive at the company. The email masqueraded as being from a colleague, so the executive was beguiled into opening the file. This sort of faked email is called a spear phishing attack, and it's quite common. Once the spreadsheet was opened, the attacker was able to silently install malware on the executive's machine and, from there, carry out the attack.

This example highlights an important distinction between viewing software through the lens of correctness and through the lens of security. From the point of view of correctness, the Flash vulnerability is just a bug, and all nontrivial software has bugs. Companies admit to shipping their software with known bugs because it would be too expensive to fix them all. Instead, developers focus on bugs that would arise in typical situations. The bugs that are left, like the Flash vulnerability, come up rarely, and users are used to dealing with them when they do. If doing something causes their software to crash, users quickly learn that that something is not something to do, and they work around it. Eventually, if a bug is burdensome enough to enough users, the company will fix it.

From the point of view of security, on the other hand, it is not sufficient to judge the importance of a bug only with respect to typical use cases. Developers must consider atypical misuse cases, because this is exactly what the adversary will do. Whereas a normal user might trip across a bug and cause the software to crash, an adversary will attempt to reproduce that crash, understand why it is happening, and then manipulate the interaction to turn that crash into an exploit. In short, to ensure that a system meets its security goals, we must strive to eliminate bugs and design flaws. We must think carefully about those properties that must always hold no matter what, and ensure that our design and implementation do not contain defects that would compromise security. We must also design the system so that any defects that inevitably do remain are harder to exploit.
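To make concrete the kind of defect an adversary can turn from an occasional crash into an exploit, here is a minimal, hedged C sketch. It is an illustration written for this discussion, not the actual Flash Player code; the function names, buffer size, and input format are assumptions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical parser for a record read from an input file (an
 * illustrative sketch, not the actual Flash code).  Typical users'
 * records are short, so the bug below surfaces only rarely, as a crash;
 * an attacker instead chooses the length and contents deliberately,
 * turning the crash into code execution. */
void parse_record(const unsigned char *data, size_t reported_len) {
    char buf[64];
    /* BUG: reported_len comes straight from the untrusted file and is
     * never checked against sizeof(buf), so crafted input overflows buf. */
    memcpy(buf, data, reported_len);
    printf("parsed a %zu-byte record\n", reported_len);
}

/* A correct version validates the untrusted field first, benignly
 * rejecting malformed input instead of acting on it. */
int parse_record_checked(const unsigned char *data, size_t reported_len) {
    char buf[64];
    if (reported_len > sizeof(buf))
        return -1;                      /* reject malformed input */
    memcpy(buf, data, reported_len);
    printf("parsed a %zu-byte record\n", reported_len);
    return 0;
}

int main(void) {
    unsigned char evil[1024] = { 0x41 };   /* pretend contents of a crafted file */
    /* parse_record(evil, sizeof(evil)) would smash the stack; the checked
     * version refuses the same input: */
    if (parse_record_checked(evil, sizeof(evil)) != 0)
        printf("malformed record rejected\n");
    return 0;
}
```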


What is software security

So far we have talked about computer security generally, but what is software security, the particular subject of this class? Software security is a branch of computer security that focuses on the secure design and implementation of software. In other words, it focuses on avoiding software vulnerabilities, flaws, and bugs. While software security overlaps with and complements other areas of computer security, it is distinguished by its focus on a secure system's code. This focus makes it a white-box approach, where other approaches are more black-box: they tend to ignore the software's internals.

Why is software security's focus on the code important? The short answer is that software defects are often the root cause of security problems, and software security aims to address these defects directly. Other forms of security tend to ignore the software and build up defenses around it. Just like the walls of a castle, these defenses are important and work up to a point. But when software defects remain, clever attackers often find a way to bypass those walls. We'll now consider a few standard methods for security enforcement and see how their black-box nature presents limitations that software security techniques can address.

Our first example is security enforcement by the operating system, or OS. When computer security was growing up as a field in the early 1970s, the operating system was the focus. To the operating system, the code of a running program is not what is important. Instead, the OS cares about what the program does, that is, its actions as it executes. These actions, called system calls, include reading or writing files, sending network packets, and running new programs. The operating system enforces security policies that limit the scope of system calls. For example, the OS can ensure that Alice's programs cannot access Bob's files, or that untrusted user programs cannot set up trusted services on standard network ports.

The operating system's security is critically important, but it is not always sufficient. In particular, some of the security-relevant actions of a program are too fine-grained to be mediated as system calls, and so the software itself needs to be involved. For example, a database management system, or DBMS, is a server that manages data whose security policy is specific to the application that is using that data. For an online store, for example, a database may contain security-sensitive account information for customers and vendors alongside other records, such as product descriptions, which are not security sensitive at all. It is up to the DBMS to implement security policies that control access to this data, not the OS.

Operating systems are also unable to enforce certain kinds of security policies. Operating systems typically act as an execution monitor, which determines whether to allow or disallow a program action based on current execution context and the program's prior actions. However, there are some kinds of policies, such as information flow policies, that simply cannot be enforced precisely without consideration of potential future actions, or even non-actions. Software-level mechanisms can be brought to bear in these cases, perhaps in cooperation with the OS. We will consider information flow policies in more depth later in this class.
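Before moving on to network-based defenses, here is a small illustration of what OS-level mediation looks like from a program's point of view. It is a minimal C sketch for a POSIX system such as Linux; the path /home/bob/secret.txt is a hypothetical stand-in for any file the calling user is not permitted to read.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    /* "/home/bob/secret.txt" is a made-up path standing in for a file
     * owned by another user, with permissions that exclude us. */
    int fd = open("/home/bob/secret.txt", O_RDONLY);
    if (fd < 0) {
        /* The open() system call is mediated by the OS: it consults the
         * file's permission bits and the process's user identity, and
         * refuses the request, so the program never sees the data. */
        printf("open failed: %s\n", strerror(errno)); /* e.g. "Permission denied" */
        return 1;
    }
    /* ... otherwise the OS allowed the access; read and close the file ... */
    close(fd);
    return 0;
}
```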
Another popular sort of security enforcement mechanism is a network monitor, like a firewall or an intrusion detection system (IDS). A firewall generally works by blocking connections and packets from entering the network. For example, a firewall may block all attempts to connect to network servers except those listening on designated ports, such as TCP port 80, the standard port for web servers. Firewalls are particularly useful when there is software running on the local network that is only intended to be used by local users. An intrusion detection system provides more fine-grained control by examining the contents of network packets, looking for suspicious patterns. For example, to exploit a vulnerable server, an attacker may send a carefully crafted input to that server as a network packet. An IDS can look for such packets and filter them out to prevent the attack from taking place.

Firewalls and IDSs are good at reducing the avenues for attack and preventing known vectors of attack, but both devices can be worked around. For example, most firewalls will allow traffic on port 80, because they assume it is benign web traffic. But there is no guarantee that port 80 only runs web servers, even if that's usually the case. In fact, developers invented SOAP, which stands for Simple Object Access Protocol, to work around firewalls' blocking of ports other than port 80. SOAP permits more general-purpose message exchanges, but encodes them using the web protocol so they can travel over port 80. Now, IDS patterns are more fine-grained and are better able to look at the details of what's going on than firewalls are, but IDSs can be fooled as well by inconsequential differences in attack patterns. Attempts to fill those gaps by using more sophisticated filters can slow down traffic, and attackers can exploit such slowdowns by sending lots of problematic traffic, creating a denial of service, that is, a loss of availability.

Finally, consider anti-virus scanners. These are tools that examine the contents of files, emails, and other traffic on a host machine, looking for signs of attack. They are quite similar to IDSs, but they operate on files and so have less stringent performance requirements. But they too can often be bypassed by making small changes to attack vectors.

Now we conclude our comparison of software security to black-box security with an example: the Heartbleed bug. Heartbleed is the name given to a bug in version 1.0.1 of the OpenSSL implementation of the Transport Layer Security protocol, or TLS. This bug can be exploited by getting a buggy server running OpenSSL to return portions of its memory. The bug is an example of a buffer overflow, which we will consider in detail later in this course.

Let's look at black-box security mechanisms and how they fare against Heartbleed. Operating system enforcement and anti-virus scanners can do little to help. For the former, an exploit that steals data does so using the privileges normally granted to a TLS-enabled server, so the OS can see nothing wrong. For the latter, the exploit occurs while the TLS server is executing, leaving no obvious traces in the file system. Basic packet filters used by IDSs can look for signs of exploit packets; the FBI issued signatures for the Snort IDS soon after Heartbleed was announced. These signatures should work against basic exploits, but exploits may be able to apply variations in packet format, such as chunking, to bypass the signatures. In any case, the ramifications of a successful attack are not easily determined, because any exfiltrated data goes back over the encrypted channel.
Now, compared to these, software security methods would aim to go straight to the source of the problem by preventing or more completely mitigating the defect in the software.  
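To see what "going straight to the source" looks like, here is a minimal, hedged C sketch in the spirit of the Heartbleed defect. It is not OpenSSL's actual code; the struct layout, field names, and sizes are made-up simplifications. The buggy echo trusts an attacker-supplied length, while the fixed echo validates it first.

```c
#include <stdio.h>
#include <string.h>

/* Simplified "heartbeat" request: the peer sends a payload plus a claimed
 * payload length, and the server is supposed to echo the payload back. */
struct heartbeat {
    unsigned short claimed_len;    /* length claimed by the (untrusted) sender */
    unsigned char  payload[64];    /* bytes actually carried in this sketch    */
};

/* BUGGY: trusts claimed_len.  A request claiming, say, 65535 bytes makes
 * memcpy read far past payload[], so whatever happens to sit in adjacent
 * server memory (keys, passwords, other users' data) is copied into the
 * response.  resp must be sized for the worst case (65535 bytes). */
size_t echo_buggy(const struct heartbeat *req, unsigned char *resp) {
    memcpy(resp, req->payload, req->claimed_len);
    return req->claimed_len;
}

/* FIXED, at the software level: check the untrusted length against the
 * number of payload bytes actually received before copying anything. */
size_t echo_fixed(const struct heartbeat *req, unsigned char *resp,
                  size_t bytes_received) {
    size_t n = req->claimed_len;
    if (n > bytes_received)
        return 0;                  /* silently discard the malformed request */
    memcpy(resp, req->payload, n);
    return n;
}

int main(void) {
    unsigned char resp[65536];
    struct heartbeat req = { .claimed_len = 65535, .payload = "hello" };
    /* echo_buggy(&req, resp) would copy 65535 bytes, leaking memory far
     * beyond the 5 bytes actually sent.  The fixed version rejects it: */
    printf("fixed echo returned %zu bytes\n", echo_fixed(&req, resp, 5));
    return 0;
}
```

Note that the remedy is a single bounds check in the software itself, exactly the kind of in-code fix that the black-box mechanisms above cannot supply.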






Tour of the course and expected background


Now we're going to take a tour of what you will learn in this course. The target audience for the course is people involved in, or interested in, developing secure software. This includes people who design software systems, who write code to implement those systems, who review code and designs that aim to be secure, and who test software to make sure it is secure. This course is one of several in the Maryland Cybersecurity Center's series of courses in Coursera's Cybersecurity Specialization. At various points it will overlap with topics covered in the other courses, which include usable security, cryptography, and hardware security.

That said, much of our focus will be on the core activities of building software: from designing its architecture, to writing its code, to testing and otherwise checking that the code is secure. We assume, as a result, that students taking the course have a substantial background in computer science; this is a technical class. As a rough estimate, students should have a background equivalent to that of a third-year university student majoring in computer science or a related field. More specifically, we assume participants have a fair amount of experience writing code. Ideally, we're looking for the equivalent of three semester-long courses on programming. Two of these courses might cover high-level languages like Java, Ruby, or Python, but at least one should be on low-level programming in C or C++. Along with the language features common to C and other languages, and those particular to C, we assume students are familiar with constructs like arrays and pointers, and with C's approach to memory management using the stack and heap (a small refresher appears below). These concepts are critical to really understanding certain sorts of pernicious security vulnerabilities.

We also assume some familiarity with the following material, for which we'll provide some introductory review. Our labs will be implemented in an environment running Linux, so we assume familiarity with Linux basics like the command shell. We will also use the gdb program debugger for one of our labs, and so some knowledge of it will be helpful. We assume students are familiar with the world wide web and some basic networking concepts. And finally, we assume students have some familiarity with assembly language, preferably Intel x86 assembly. We expect students to understand the basics of machine instructions and how processors execute them, but we will do some review of the standard architectural memory model as it pertains to memory-based attacks.
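For readers who want a quick reminder of the C background just described, here is a small, self-contained refresher contrasting arrays, pointers, and stack versus heap storage; the names and sizes are arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char on_stack[16];              /* automatic (stack) storage: gone when the function returns */
    char *on_heap = malloc(16);     /* dynamic (heap) storage: lives until free() is called      */
    if (on_heap == NULL)
        return 1;

    strcpy(on_stack, "stack data"); /* 11 bytes including the terminating '\0': fits in 16 */
    strcpy(on_heap,  "heap data");  /* 10 bytes including the terminating '\0': fits in 16 */

    char *p = on_stack;             /* a pointer holds the address of the array's first element */
    printf("%s / %s (first byte via pointer: %c)\n", on_stack, on_heap, *p);

    free(on_heap);                  /* forgetting this leaks memory; using on_heap afterwards
                                       would be a dangling-pointer error */
    return 0;
}
```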
Our goal in this class is to learn how to make software that is more secure. We want to learn how to make better designs, write better implementations, and have better assurance at the end that the software we've written is resilient to attack. So how can we go about learning to do this? We're going to take on two points of view when looking at software security: black hat and white hat. Black hat takes on the point of view of the adversary; white hat, that of the defender. Wearing our black hat, we're going to ask: what are the security-relevant defects that constitute vulnerabilities in our software, and how can those defects be exploited? Putting on our white hat, we're going to ask: how do we prevent security-relevant defects before we deploy, and how do we make the vulnerabilities we don't manage to avoid harder to exploit? Looking at just one hat or the other only gives us part of the picture. Therefore, during the course we will wear both hats, switching from one to the other.

We will begin wearing our black hat, looking at low-level vulnerabilities in programs written in C and C++. These vulnerabilities include many things, most notably buffer overflows, which can take place on the stack or the heap, can arise due to integer overflow, and can involve over-writing or over-reading buffers in memory. We'll also look at format string mismatches and dangling pointer dereferences. These vulnerabilities lead to attacks like stack smashing, format string attacks, and return-oriented programming. All of them are violations of a property called memory safety, in which accesses to memory via pointers reach memory the pointers don't own, straying into other parts of the program.

The easiest way to ensure memory safety, and thereby avoid these different sorts of attacks, is to use what's called a memory-safe programming language, or better yet a type-safe programming language. If you still want to use C or C++, which are not memory safe, then there are several automated defenses that will help prevent or mitigate attacks. We'll discuss several: stack canaries, non-executable data, address space layout randomization, memory-safety enforcement, and control-flow integrity. And we'll discuss how to augment these defenses using safe programming patterns and libraries. One key idea in these programming patterns and libraries is to validate untrusted input and thereby prevent certain sorts of attacks.

Next we'll turn our attention to securing the world wide web. Much of what we do today is on the web and, as a result, it's also a target of attack. The web brings with it new vulnerabilities and attacks, and we'll discuss several, including SQL injection, cross-site scripting, cross-site request forgery, and session hijacking. We'll also look at the defenses against these attacks, and we'll see that they have a theme similar to the defenses we've seen to this point. For example, we should be careful who or what we trust by validating input, and we should try to reduce the possible damage of an attack by making exploitation harder.
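As a small preview of the "validate untrusted input" theme, and of one of the low-level vulnerability classes mentioned above, here is a minimal C sketch of a format string mismatch and its fix. It is an illustrative example written for this overview, not code from the course projects.

```c
#include <stdio.h>

/* VULNERABLE: the untrusted string is used as the format argument, so
 * input like "%x %x %x %s" makes printf walk the stack and leak memory,
 * and "%n" directives can even write to memory. */
void log_vulnerable(const char *user_input) {
    printf(user_input);
}

/* SAFE: treat the untrusted input strictly as data by supplying a
 * constant format string. */
void log_safe(const char *user_input) {
    printf("%s", user_input);
}

int main(int argc, char *argv[]) {
    const char *input = (argc > 1) ? argv[1] : "%x.%x.%x";
    log_vulnerable(input);   /* interprets any %-directives in the input */
    putchar('\n');
    log_safe(input);         /* prints the input verbatim */
    putchar('\n');
    return 0;
}
```

Running it with an argument such as "%x.%x.%x" shows the vulnerable call leaking stack contents while the safe call prints the argument verbatim.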
Now, at this point, we'll have considered the web and low-level software, but we'll want to step back and look at the software development process generally. How are the ideas and attacks that we've seen to this point relevant to the overall software development process? We'll look at the different phases of the software development lifecycle, including requirements, design, implementation, and testing and assurance, and we'll look at the corresponding activities that take security to heart. For example: defining security requirements and defining abuse cases; performing architectural risk analysis and threat modeling; and using a security-conscious design. We'll also want to conduct code reviews, perform risk-based security testing, and perform penetration testing to make sure that the software we have designed and built truly is secure.

Let's look at a couple of the phases of the lifecycle in a bit more detail. For requirements and design, we'll look at how to identify sensitive data and resources, and define security requirements for them. These requirements include things we've mentioned already, like confidentiality, integrity, and availability. We'll consider the expected threats against our system, and the abuse cases that could violate the requirements we've set down. Next we'll apply principles for secure software design to prevent, mitigate, and detect possible attacks. There are several different principles and rules that we'll use, but they fall into three main categories. First, favor simplicity in your design and code. Second, trust components with reluctance. And third, defend in depth, relying not on one defense but on many. At the end, as an exemplar of this sort of design practice, we'll look closely at the Very Secure FTP daemon. It was written very much with security in mind and employs many of the principles we've just mentioned.

Next, we'll turn our attention to the implementation and testing phases, and focus on rules and tools. In particular, we'll apply coding rules to implement our secure design. These rules have goals similar to those of the principles we've looked at: preventing, mitigating, or detecting possible attacks. We'll also look at how to apply automated code review techniques to find potential vulnerabilities in components. In particular, we'll look at a technique called static analysis, which is able to analyze a program and consider all of its possible executions when making a judgment. We'll also look at symbolic execution, which is a sort of hybrid technique between static analysis and testing, and which underlies a technique called whitebox fuzz testing. Finally, we'll look at applying penetration testing to find potential flaws in a real system in a deployment environment. We'll look at different attack patterns as enabled by different sorts of pen testing tools, and we'll look at a technique called fuzz testing for trying to find failure scenarios in software programs (a minimal sketch of the idea appears at the end of this post).

Stepping back, we can see that the content of this course has six overall units. The first is memory attacks, followed by memory defenses, looking at low-level software. Next, we look at web security. We follow that with secure design and development. Then we dig into automated code review, via static analysis and symbolic execution. And we finally finish up with techniques for penetration testing, notably fuzzing. You can expect about 80 minutes of video per week for six weeks. We'll also have some supplemental readings and interviews with experts.

I've managed to put together interviews with four experts. Here we see Andy Chou, a static analysis expert and the CTO of Coverity, a company that makes a static analysis product. We also see Gary McGraw, who is a software security expert and noted author, and the CTO of Cigital, Inc. I should say, many of the ideas that I put forth in this course are inspired by Gary's books, and I was really pleased to interview him. Next we will hear from Eric Eames, who is a penetration testing expert and a principal security consultant for the FusionX company. And finally, Patrice Godefroid, a whitebox fuzzing expert who is a principal researcher at Microsoft Research.

We will assess your understanding of the material in the course in two ways. First, we'll provide projects that should help you get a better grasp of the things we're talking about, like vulnerabilities and how to exploit them, how to use tools to better build software securely, and so on. And we'll have quizzes once per week that check your knowledge directly using tests. These quizzes will also cover things that you should have learned by doing the projects. That concludes our introduction, so let's get on with learning about software security.
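Finally, as promised above, here is a minimal, hedged sketch of the core idea behind blackbox fuzz testing: throw many random inputs at the code under test and watch for crashes. The harness and the parse_input placeholder are hypothetical, and real fuzzers (including the whitebox fuzzing covered later) are far more sophisticated about choosing inputs.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Stand-in for whatever routine we want to test; in a real harness this
 * would be the parser or protocol handler under scrutiny. */
static void parse_input(const unsigned char *data, size_t len) {
    (void)data;
    (void)len;    /* hypothetical placeholder: does nothing */
}

int main(void) {
    unsigned char buf[256];
    srand((unsigned)time(NULL));

    for (int trial = 0; trial < 100000; trial++) {
        size_t len = (size_t)(rand() % (int)sizeof(buf));
        for (size_t i = 0; i < len; i++)
            buf[i] = (unsigned char)(rand() % 256);

        /* Save the input before using it, so a crash still leaves the
         * offending bytes behind for later analysis and replay. */
        FILE *f = fopen("last_input.bin", "wb");
        if (f) { fwrite(buf, 1, len, f); fclose(f); }

        parse_input(buf, len);   /* a crash here signals a bug to investigate */
    }
    printf("no crashes observed in 100000 trials\n");
    return 0;
}
```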
