Penetration Testing - Introduction
Penetration testing, or pen testing for short, is a direct assessment of the security of a complete software system. Its goal is to find evidence of insecurity, typically taking the form of exploitable vulnerabilities. Pen testing is a black hat activity done by so-called red teams or tiger teams, and employed for the good purpose of finding security defects prior to deployment.

What is the target of a penetration test? The focus can be any of several different levels of a system made up of executable components. We can fuzz test single programs, or we can look at complete applications, like web applications, consisting of communicating programs such as the browser and server. We can even look at entire networks, looking for weaknesses that cross application boundaries, where one system could be exploited to then take advantage of another. In all cases, we're looking at whole programs, not parts thereof, like code fragments, configuration files, libraries and so on.

Now who are pen testers, and how do they work? Well, pen testers are teams that use guile and automated tools to find security issues. A good pen tester is creative, thinking carefully about how a system is put together to find assumptions that turn out to be weaknesses. Good tool support is essential for an effective pen testing engagement. As the pen tester comes up with hypotheses about potential weaknesses, tool support can be used to systematically probe a target and see if those weaknesses are present.

Pen testing is carried out late in the development process by a team that is different from the one that built the system. Having a separate team ensures that the target is given a fresh look. Developers can have blind spots about their own software, and a separate team can help see past those. A separate team is also useful in that pen testing requires a specialized skill set, and this skill set can be developed across multiple projects.
The pen testing team may be told a lot or a little about the system that they are testing, and given a lot or a little access to its deployment environment. We might, on the one hand, try to simulate the access that an attacker should have. For example, only from outside a company firewall, via forward facing components. On the other hand, we might provide more access or information to model an attack by a company insider. In general, giving an adversary more powers gives a better idea of how the system holds up if some defenses fail to hold. That said, if the design was against a weaker threat model than it's being tested against, then it shouldn't be surprising that we'll find additional issues, and discovered issues should be assessed and kept in perspective. Pen testing has been around for quite a while. As computers were coming into wider use in the 1960s, time sharing operating systems emerged, which allowed multiple users to use a computer at the same time. Previously, computers were largely used in batch mode, running one program at a time which left no remnant behind. Time sharing with per user storage introduced new security challenges. In particular, operators were worried that one user might present a security risk to another user. As such, a task force of experts headed by Willis Ware of the RAND Corporation was formed to look at the issue. His team produced a report in 1967 that came to be known as the Ware Report. Many of the ideas that we have discussed in this class were first brought to light in the Ware Report. For example, the different categories of security policy, confidentiality, integrity and so on. Relevant to our current discussion, the report also used the term penetration to describe a successful attack on a computer system. By the 1970s, the government was regularly using teams to assess the security of computer systems by trying to penetrate them. These teams were referred to as red teams, or tiger teams. 
Today, penetration testing is a mature field, with services provided by independent companies, as well as divisions within larger organizations. Much of the buzz about cybersecurity, particularly among students, centers on penetrations, which are gamified in capture the flag (CTF) competitions, like DefCon. There is also a penetration testing certification organized by the Information Assurance Certification Review Board, or IACRB. The certification is based on both conceptual material and a demonstration of skills by hacking into a target VM.

Why should we do pen testing? Well, it has several benefits. First, penetrations are certain. After all, they have been demonstrated. Hopefully, they also come with some reusable evidence. For example, evidence of an SQL injection could be the exploit payload entered in a form field of a web application. As such, unlike the errors reported by, say, code reviews or static analysis tools, penetrations are not hypothetical. By virtue of the fact that they are applied to whole, deployable components, they are also relevant under realistic configurations. And, just as they are not hypothetical, they're not wrong. Penetrations are not, or are rarely, false alarms.

Pen testing has non-technical benefits as well. One of them is a feel-good factor. After completing an engagement, those responsible for the target system can feel like they have really improved security, or will soon, once they fix the issues. And this is because they have found real vulnerabilities that probably would have gone unfixed and could really have been exploited in the wild. Before this point, such vulnerabilities were only hypothetical, not made manifest.

Now on the other hand, penetration testing is not a panacea. We have to be careful not to assume we're getting more out of it than we really are. Most importantly, penetration testing will not find all of a system's security problems.
As such, an absence of any discoveries does not imply the absence of vulnerabilities. Likewise, fixing vulnerabilities that were found, if any, does not mean that none are left. Penetration testers may not even have looked for certain sorts of problems, depending on the rules of the engagement and the assumptions of the threat model. Security, to the extent that pen testing has established it, is ephemeral. We cannot rest on our laurels. When you change the software, the configuration, the network topology, and so on, you potentially create new vulnerabilities. Why? Well, an important thing to keep in mind is that security, in general, is not compositional. This observation was first made by Leslie Lamport, recent Turing Award winner, back in the 1970's. What it means is that two components that are secure on their own are not necessarily secure when used in combination. Penetration testing looks at the whole system and that's its advantage, but subsequent changes might break a component and thus compromise the system. Even worse, a change to one component might not break that component, but could break the whole system anyway due to the lack of compositionality. As such, we must employ the other processes discussed in this course to reduce the chances as much as possible that we break components. And we must employ whole system testing to ensure that the composition works too. In short, despite its limitations, penetration testing is something very much worth doing. Okay, in the remainder of this unit, we're going to consider two parts. First, we present an overview of pen testing and the tools that are commonly used by pen testers. We will see that pen testing is both art and science. As an art form it relies on the creativity and ingenuity of the pen tester. As clearly successful techniques emerge, pen testing moves to becoming a science. And the fruits of this science are turned into tools and automation that subsequent pen testers can use. 
And we'll look at several tools in particular. First, Nmap is a tool for scanning networks, probing for computers and other devices that might be targets of attack. Second, Zap is a web proxy that intercepts communications between a web browser and a web server, allowing the pen tester to see what's going on and to manipulate it, looking for vulnerabilities. Third, Metasploit is a tool for developing and deploying exploits. It is highly configurable, with a tool kit that pen testers can use to find and exploit vulnerabilities.

Now in the second part of the unit we'll consider fuzz testing, or fuzzing, a technique that many penetration testing tools employ. Fuzz testing works by corrupting inputs and interactions between target components. The goal is to see whether such corruptions will cause the system to break in a way that could be exploited in an attack. And we'll consider several tools and techniques that have proven to be successful.
Pen Testing
What is penetration testing? Penetration testing is the process of trying to find exploitable vulnerabilities in complete systems or system components. And it is both an art and a science. Pen testers must be creative. They must think about how a system is put together, and poke it and probe it to see where assumptions made by designers represent weaknesses. Pen testers will cleverly adapt to what they find: as they gain a foothold in one place, they may use that foothold to exploit a weakness somewhere else. Over time, certain patterns of weaknesses emerge. Many systems will be incorrectly built or misconfigured in the same sorts of ways. When patterns are discovered, we can build tools that systematically look for those patterns, or try to exploit them. In a sense, this is ingenuity automated. As Donald Knuth once said, science is something we understand well enough to tell to a computer, while art is everything else.

A pen tester's job is to find vulnerabilities in the applications being penetration tested. What's in a pen tester's bag of tricks to make them successful at that job? Well, a penetration tester needs to approach a target knowing a lot about the target domain. For example, if the pen tester is attacking web applications, then the pen tester needs to know a lot about how the web works. You also need to know how systems are built in that domain. For example, what protocols allow applications to communicate; for the web, that's HTTP, TCP, and IP. Or the languages that are used to build applications, which for the web include PHP, Java, and Ruby. Or the frameworks that are used to build applications or application components like, for the web, Ruby on Rails, Dreamweaver, Drupal, and so on. You also want to know common weaknesses from that domain. For example, the bugs that are common to web applications, like SQL injection, cross-site scripting, or cross-site request forgery.
Or common misconfigurations or bad designs, like the use of default passwords or hidden files.

Continuing our example of pen testing the web, consider one professional's point of view of what web hacking is about. 70% of the process is simply messing with parameters. So for example, suppose the target URL is one that might be generated when clicking a button to buy an item in an online store. You can try to change the URL to change the price and see what happens, or change the item number to see if you can buy something else. This would demonstrate that client parameters are unwisely trusted by the web application. Or you can try to insert a script in the parameters that are set in the URL. If this works, then maybe the application is susceptible to cross-site scripting. Or you could add structural characters to the URL, for example, to see whether or not you can inject other sorts of code.

Another 10% of web hacking might be default passwords. That is, research what the default password might be and try it out to see whether or not the target has changed it. This research could be through online searches or through looking at user manuals. Another 10% might be hidden files and directories. Again, you could look through manuals for clues, or you could just try out a bunch of common file names on the directory by typing in different URLs. You might find password files, or other administrative pages. And the remaining 10% might be other things, like authentication problems, such as bypass or replay. Or insecure web services, that is, ones that provide APIs that don't require any authentication. Or a configuration page that gives away passwords.

Now let's talk about some pen testing tools. Pen testers use tools for a variety of tasks. They use them to probe the target system, to understand important characteristics about it. Pen testers also use tools to gather information about a system and perhaps test hypotheses about it.
For example, they may see how it responds to certain sorts of stimulus. Finally, pen testers will use tools to actually exploit a target system to demonstrate an actual vulnerability in it.

Now there are many possible security tools out there. Which should we use? Well, the answer depends on the target of our test and on our goal. If we're pen testing a network, then tools should explore the entire network, its components, its topology and so on, looking for issues. If we're targeting a single machine, tools might help us figure out what files are on that machine, what software it's running, and what exploits it's susceptible to. If the target of our pen test is a single program, then tools will help us to explore that program's attack surface to see where it might handle input incorrectly, and as such be vulnerable to an attack.

The first tool we'll consider is Nmap, which is used for network probing. Nmap stands for network mapper, and it will figure out for you what hosts are available on the network; what services, that is, application name and version, those hosts are offering; what operating systems the hosts are running; what type of packet filters or firewalls are in use; and many other things. Nmap works by sending raw IP packets into the network and observing the effects. You can give it a range of addresses and it will send packets to those addresses, watching how the hosts that it tries to find at those addresses respond. Nmap is free and open source, though there are commercial variants as well, and you can get it at nmap.org.

So in a little more detail, here's how Nmap might find hosts and services by sending pings to various IP addresses. In particular, it might send ICMP Echo Requests or Timestamp Requests. These are the implementation of the standard ping protocol. Now, sometimes these packets are dropped by routers or firewalls, and so other packets should be injected as well.
We might send TCP SYN or TCP SYN/ACK messages to port 443 or port 80. If we get any sort of response, that suggests that an HTTPS or HTTP server, that is, a web server, is running on that port. Nmap will try other things as well, as designated by the operator. That is, it might send UDP packets, that is, User Datagram Protocol packets, to particular ports. And it might fill out those packets to look like what's expected of packets on those ports. For example, DNS, the Domain Name System, often uses UDP, and we could send DNS packets to the DNS port and see whether or not there is a domain name server running at a particular address. Nmap will also send probes to other TCP ports, for example, looking for web servers or other TCP-based services like FTP. And it will send probes that may elicit different responses on different operating systems, in an attempt to fingerprint which machines it's talking to. Nmap can also attempt to be stealthy. A flurry of activity may be detected by an intrusion detection system, and so Nmap can be configured to emit packets at a slower rate to stay under the radar.

Another common tool in a pen tester's arsenal is a web proxy. Web applications are common pen testing targets, and web proxies are useful because they sit between the browser and server, capturing packets. They'll display any packets that are exchanged and allow the pen tester to modify them. Some proxies have additional features for vulnerability scanning, exploitation, site probing, and so on. One example of a web proxy is Zap, the OWASP Zed Attack Proxy. It provides a GUI that the pen tester can use to inspect or modify captured packets. It also allows the pen tester to set breakpoints, so that packets go through quickly until a certain condition is met, at which point the exchange is stopped and the pen tester can have a look or make a modification.
Zap has several other features too, like active scanning, which will attempt cross-site scripting, SQL injection and so on; fuzzing, in which the proxy sends context-specific payloads to see whether it can crash or otherwise corrupt the web application; and a spider that explores a site to construct a model of its structure. The spider is useful to give a pen tester a lay of the land of what the site is about, and an idea of where to look for hidden URLs and things like that. Zap is free and open source, but commercial tools are available as well, such as the Burp Suite. This is quite popular among professional testers. There is a free version of it, but the professional version actually doesn't cost that much more and provides many useful features.

The third and final tool we'll consider that a pen tester may find useful is Metasploit. Metasploit is quite popular. It's an advanced open-source platform for developing, testing, and using exploit code. It provides an extensible model through which payloads, encoders, no-op generators, and exploits can be integrated together. It works by allowing the pen tester to script attacks. These scripts take several steps. First, probe the remote site looking for vulnerable services. Based on what's found, construct a payload. Encode the payload to avoid detection, for example, by introducing superfluous features that intrusion detection systems will be confused about and therefore fail to block. Inject the payload and wait for the shellcode to connect back. In the end, you'll have a command prompt that you can use post-exploitation.

So here's how it works, visually depicted. On the left, we see the pen tester armed with Metasploit and some scripts. And on the right, we see the target network. Two of the servers on the network are patched, but one of them is not. In step one, Metasploit probes the target systems, and maybe it's able to find the vulnerable server.
Next, it constructs the attack payload, which it then sends along, according to the script, triggering the vulnerability and compromising the target. In the best case, the target will connect back, providing the remote shell.

The Metasploit framework has a bunch of different ways to support user interaction. For example, msfconsole is an interactive, text-based command prompt for executing Metasploit commands. There's also a web-based front end and a command line interface from the normal shell. All of these support probing and communication commands, payload construction and encoding, and both active and passive attacks. Active attacks are those that involve connecting to a remote service, while passive attacks are those that involve setting up a service and letting clients connect to you.

Meterpreter is a command processor that's actually injected into the target, for example, into the memory of a compromised process. If someone on the target system were to look at the process table, they would not see that anything is amiss. They would see all the programs running that they expect to see running. Unknown to them, inside one of those programs a Meterpreter is actually running, following the commands from the remote Metasploit operator. Obviously, this permits the pen tester to probe more stealthily. Msfpayload and msfencode are tools that generate stealthy shellcode. Msfpayload generates the payload, and msfencode encodes it to make it hard to detect.

Metasploit comes with hundreds of modules and scripts developed by a broad community over time. These include exploits against particular vulnerabilities, along with stagers and other modifiers that generalize the exploits to different platforms; tools for password sniffing; privilege escalation; and keylogging and backdoors. All of these can be combined together using scripts to generate very sophisticated exploits, and there's much more.
Kali is a Linux distribution with many open-source pen testing tools installed and configured. These include the ones we have already mentioned Nmap, Zap, Metasploit, and the Burp Suite and dozens of other ones such as John the Ripper for password cracking, Valgrind for dynamic binary analysis, Reaver for WiFi password cracking, peepdf for scanning PDF files for attack vectors and many more. Overall, there are many, many pen testing tools that are available, all in different levels of use, and many of them are installed on this Kali distribution. As I mentioned earlier in this unit, sectools.org gives a comprehensive list. At this point, I want to make a note about ethical hacking. Penetration testing tools are meant to reveal security vulnerabilities so that they can be fixed, not so that they can be exploited in the wild for the purposes of crime or harm. But it is true that people will use these penetration testing tools for nefarious purposes. In that way, they are sort of two way tools. Just as guns can be used to defend, guns can be used to attack. Don't be someone who uses pen testing tools to attack.
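To make the earlier discussion of host and service discovery a little more concrete, here is a minimal Python sketch of a TCP probe. It is not how Nmap itself works: Nmap crafts raw IP packets, whereas this sketch assumes an ordinary operating-system connect call is good enough for illustration. The function name `probe_tcp_ports` and the example port list are hypothetical, and in keeping with the note on ethical hacking, it should only ever be pointed at machines you own or are authorized to test.

```python
import socket
from typing import Dict, Iterable


def probe_tcp_ports(host: str, ports: Iterable[int],
                    timeout: float = 0.5) -> Dict[int, bool]:
    """Map each port to True if a TCP connection to host:port succeeds.

    Illustrative helper (not part of any tool): a full connect() is
    attempted, which is simpler but noisier than Nmap's raw-packet probes.
    """
    results: Dict[int, bool] = {}
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise,
            # instead of raising an exception for a refused connection.
            results[port] = sock.connect_ex((host, port)) == 0
    return results


if __name__ == "__main__":
    # Assumed example: probe a few well-known ports on the local machine.
    for port, is_open in probe_tcp_ports("127.0.0.1", [22, 80, 443]).items():
        print(port, "open" if is_open else "closed or filtered")
```

A successful connection only says that something is listening. Working out which service and version it is, or which operating system is behind it, is the part where a real scanner earns its keep.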
Fuzzing
What is fuzz testing? Fuzz testing is basically a kind of random testing, that is, a kind of testing in which inputs for a test case are generated randomly or semi-randomly. The goal of fuzz testing is to make sure that bad things don't happen. By bad things I mean crashes, such as those due to memory errors or uncaught exceptions, or hangs due to non-termination. As we have seen, memory errors often correspond to exploitable vulnerabilities, and thus are security critical. Non-termination is also a security problem, since it can create a denial of service. Now, these are not the only bad things, of course. For example, a program could produce the wrong output. But a fuzz tester doesn't know anything about what the outputs of its tests should be. The fuzz testing tool only knows that those tests should not hang or crash. As such, fuzz testing is a very useful but very limited endeavor. It complements, but does not replace, traditional functional testing. In fact, functional tests can be the starting point for fuzz tests. Input from functional tests is assumed to be legitimate, and a fuzz tester can therefore derive its inputs from these legitimate tests.

There are three basic kinds of fuzzing. The first is black box fuzzing. In this case the testing tool knows nothing about the program or its input format, and instead just generates random inputs to throw at that program. Now this is very easy to use, and to get started using. But in the end it may explore only shallow states in the program, unless it gets very lucky. Grammar-based fuzzing works by having the tool generate input informed by a grammar that describes the input format expected by the target program. Now this is more work to use, because the operator needs to define the grammar. But on the other hand, it will often go deeper in the program's state space.
In particular, a grammar-based fuzzer will be able to get past the well-formedness checks that the target program probably implements on its input, and therefore will be able to cover more parts of the program. The last kind of fuzzer is a white box fuzzer. In this case the tool generates new inputs at least partially informed by the code of the target program itself. Now these tools are often easy to use, because the fuzzing tool itself is able to look at the code and decide what inputs to generate to reach different parts of that target program's code. But as a result, they can be computationally expensive, because they'll involve things like theorem provers.

There are different ways that fuzzing tools generate inputs to pass to the target program. The most common way is by mutation. In this case, the fuzzer takes a legal input provided by the operator and mutates it, using the result as an input instead. Such legal inputs might be human produced or automatically generated, for example from a grammar or an SMT solver query. The mutation might also be forced to adhere to a grammar. Legal inputs are often borrowed from legitimate functional tests. The second way that inputs are produced is generational. In this case, the tool generates an input from scratch. It might do so entirely randomly, or it might do so following a grammar for the program's input. And there may be combinations of these. For example, we could generate an initial input, then mutate it n times, generate new inputs, mutate those, and so on. And of course mutations could be generated according to a grammar too. That is, if the original program's input adheres to a grammar, then we create mutations to that input that are informed by that grammar, so that the mutated input is likely to get further into the program's state space.

Now there are two basic ways that fuzzers are used. One is as file-based fuzzers, and the other is as network-based fuzzers. So let's look at the file-based approach.
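The mutation approach described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the three mutation operators (bit flip, byte insertion, byte deletion), the function names `mutate` and `mutations`, and the seed input are all choices made for this example, not the behavior of any particular fuzzer.

```python
import random
from typing import List


def mutate(seed_input: bytes, rng: random.Random, max_changes: int = 3) -> bytes:
    """Return a copy of seed_input with one to max_changes random edits."""
    data = bytearray(seed_input)
    for _ in range(rng.randint(1, max_changes)):
        choice = rng.choice(("flip", "insert", "delete"))
        if choice == "flip" and data:
            # Flip a single random bit in a random byte.
            pos = rng.randrange(len(data))
            data[pos] ^= 1 << rng.randrange(8)
        elif choice == "insert":
            # Insert a random byte at a random position.
            data.insert(rng.randint(0, len(data)), rng.randrange(256))
        elif choice == "delete" and data:
            # Delete a random byte.
            del data[rng.randrange(len(data))]
    return bytes(data)


def mutations(seed_input: bytes, count: int, seed: int = 0) -> List[bytes]:
    """Produce `count` mutated variants; the same seed gives the same variants."""
    rng = random.Random(seed)
    return [mutate(seed_input, rng) for _ in range(count)]


if __name__ == "__main__":
    # A legal arithmetic expression serves as the operator-provided input.
    for variant in mutations(b"1 + (2 + (3 + 4))", count=5, seed=12):
        print(variant)
```

Fixing the random seed makes a run repeatable, which matters when a mutated input triggers a crash and we need to produce it again.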
What's going to happen here is the fuzzing tool will mutate or generate inputs. It will then run the target program using those inputs and see what happens. So here it is visualized. An example of a file-based fuzzer is Radamsa; another one is Blab. Radamsa is a mutation-based, black box fuzzer. It mutates inputs that are given, passing them along to the target program. So here's an interaction with Radamsa. We start by echoing a legal input, passing it to Radamsa. Radamsa's arguments include a random seed, as well as the number of random input mutations to generate. You can see here that the inputs that Radamsa produced are variations of the input it originally received. Now, Radamsa can sit in between the generation of normal inputs and the receiving target program, as follows. Here the original input, which is a legal arithmetic expression that we might pass to the calculator program bc, is fuzzed by Radamsa first, before bc is able to operate on it.

Blab is a grammar-based fuzzer. The grammars are specified by the user as regular expressions and context-free grammars. So as an example, Blab would take a regular expression as a command line argument, describing the legal input space, and using that regular expression, it can generate an input that can be fed to the target program.

Another example of a file-based fuzzer is American Fuzzy Lop. It is a mutation-based, white box fuzzer, and it works according to the following process. We begin by instrumenting the target program to gather run-time information. This is how it uses the code to determine what the next inputs ought to be to improve coverage. The instrumentation it inserts tracks tuples that indicate where the program's execution has taken it. These tuples consist of the ID of the current code location, which you can think of as just the line number, and the ID of the last code location before reaching the current one.
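To illustrate the tuples just described, here is a small Python sketch that records (previous location, current location) pairs while a function runs. It is an analogy, not American Fuzzy Lop's mechanism: AFL inserts instrumentation at compile time, while this sketch assumes Python's `sys.settrace` hook and uses line numbers as location IDs. The names `run_with_edge_coverage` and `toy_parser` are hypothetical.

```python
import sys
from typing import Callable, Set, Tuple

Edge = Tuple[int, int]  # (ID of previous location, ID of current location)


def run_with_edge_coverage(target: Callable[[bytes], object],
                           data: bytes) -> Set[Edge]:
    """Run target(data) and return the set of line-to-line tuples it executed."""
    edges: Set[Edge] = set()
    prev = [0]  # 0 stands for "no previous location yet"
    code = target.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # only trace the target function itself
        if event == "line":
            edges.add((prev[0], frame.f_lineno))
            prev[0] = frame.f_lineno
        return tracer

    old_tracer = sys.gettrace()
    sys.settrace(tracer)
    try:
        target(data)
    except Exception:
        pass  # a crash matters to a fuzzer, but here we only record coverage
    finally:
        sys.settrace(old_tracer)
    return edges


def toy_parser(data: bytes) -> str:
    """Stand-in target with a shallow and a deeper path."""
    if data.startswith(b"A"):
        if data[1:2] == b"B":
            return "deep"
        return "shallow"
    return "rejected"


if __name__ == "__main__":
    seen = run_with_edge_coverage(toy_parser, b"zz")
    new = run_with_edge_coverage(toy_parser, b"AB") - seen
    print("tuples not seen before:", sorted(new))
```

Comparing the tuples from a new run against everything seen so far is what lets a tool tell whether an input reached a part of the program that earlier inputs did not.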
With the instrumented program in hand, we run a test, and we mutate the test input to create a new test if an unseen tuple was generated by that test. Otherwise, the test was not useful for covering new parts of the program and we simply discard it. Mutations consist of things like bit flips, arithmetic, and other standard sorts of things used by mutation-based fuzzers. After running for a while, American Fuzzy Lop will periodically cull the gathered tests to avoid getting stuck in local minima. In a sense, it will stop making small local changes to tests, and instead make one big change to try to jump to a different part of the state space.

So here's an example interaction. We can start by running afl-gcc. This is meant to be a drop-in replacement for the gcc C compiler. Afl-gcc will instrument the target and pass it along to gcc for compilation. Using the instrumented target program, we call afl-fuzz, and this starts the process of fuzz testing. This will run for a long time, and produce diagnostics as it discovers failing tests. Another example of a white box fuzzer is SAGE, and this uses symbolic execution as its underlying test generation technology. We talked about SAGE in the prior unit on symbolic execution.

Another kind of fuzzer is a network-based fuzzer. In one role it can act as one half of a communicating pair, exchanging messages with a target program and trying to get it to crash. Inputs could be produced by replaying a previously recorded interaction and mutating it, or by producing an interaction from scratch, for example from a protocol grammar. Here we can imagine the interaction specification being given to the fuzzer. The fuzzer can then interact with the target, mutating that interaction input until the target potentially crashes. Another role for a network-based fuzzer is to act as a man in the middle, mutating inputs exchanged between parties, again, with those mutations possibly informed by a grammar.
An example of a network-based fuzzer is SPIKE. SPIKE is a fuzzer creation kit, and it provides a C language API for programming fuzzers that interact with remote servers using network-based protocols. The way that a programmer uses SPIKE is to create a series of blocks that form parts of protocol messages, and to leave holes in those blocks that SPIKE can fuzz. As an example, we might start off by specifying a size-string call. This indicates that in the protocol, a length field for the rest of the packet will go here. We don't know what that length will be yet, so we leave a place for it using the size-string call. Next we specify the start of the block whose length we're interested in, and we specify some contents of that block. If we want to insert constants we can use the s_string call. Here we use the s_string_variable call, where the argument is a prefix of the fuzzed component, so user=bob will be included in the block and then a bunch of random text will follow it. We end the block, and this determines the length, which can get back-patched at the start. SPIKE then allows you to connect to a particular host and port via TCP, send the block, and then close the connection. Of course there are many other examples and interactions that SPIKE permits.

Another example network fuzzer is Burp Intruder. It's one element of the Burp Suite. Burp Intruder automates customized attacks against web applications, and it's similar to SPIKE in allowing the user to craft a template of a request, but leave holes, which it calls payloads, for fuzzing by the tool. Unlike SPIKE, which is a programming API, Burp Intruder provides a nice GUI front end. And Burp Intruder integrates with the rest of the Burp Suite, which includes a proxy, scanner, spider and many other tools.

Okay, so you've been penetration testing, you're using fuzzing, and a crash occurs. You have questions. For example, what is the root cause of the crash that the fuzzer found, so that we can fix it?
In order to figure this out we might ask, is there a way to make the input smaller so that it's more understandable? Or are two or more crashes signaling what is effectively the same bug? That is, once superfluous differences are removed, do they minimize to the same crashing input? Some tools provide some assistance in answering these questions, for example by trying to make the input smaller automatically. Sometimes, though, the operator needs to figure this out for themselves.

Another important question is whether a crash is security relevant. That is, does it signal the possibility of an exploitable vulnerability? Now a crash due to dereferencing a NULL pointer is rarely exploitable, particularly when running a user space program. But buffer overruns more often can be. Memory errors are particularly difficult to deal with after fuzzing, because the effects of a memory error may occur well after the point at which it originally takes place. That is, if you overrun a buffer and scribble over some portion of memory, the program will not immediately crash. Only when that scribbled-over memory is reused will the program go the wrong way.

One way to make crashes happen immediately upon overrunning buffers is to use Address Sanitizer, or ASAN. This is a tool that can instrument a program so that accesses to arrays are checked for overflows and use-after-free errors. Given your instrumented program, you can fuzz it, and if the program crashes with an ASAN-signaled error then you can start worrying about exploitability. You can do the same trick with other sorts of error checkers for the purposes of testing. For example, Valgrind's Memcheck is another way of looking for memory errors.

We've discussed several example fuzzers so far and the basics of fuzzing overall, and there are yet more fuzzers that are available for use.
For example, the CERT Basic Fuzzing Framework, or BFF, based in part on an earlier fuzzer, Zzuf, is freely available, and it has been used to find bugs in commonly used software, like Adobe Reader, Flash Player, Apple's Preview and QuickTime, and many others. Sulley is a fuzzing tool that provides lots of extras to manage the fuzzing process. In addition to the core technology of generating random inputs to find useful test cases, Sulley will watch the network and methodically maintain records about what's happened; instrument and monitor the health of the target, reverting to a known good state if things go awry; detect, track and categorize detected faults; and even fuzz in parallel, if that's desired.

Let's summarize. Penetration testing aims to simulate deployment scenarios involving determined attackers. The goal is to see whether the penetration team can find exploitable vulnerabilities. If they can, they have provided evidence of true insecurity. On the other hand, the lack of penetrations is not evidence of their impossibility, and as such, designs should always aim to mitigate and recover from as yet unknown attacks. Pen testers employ a variety of tools in their work, ranging from scanners to proxies to exploit injection and management tools, to fuzz testing tools. Nevertheless, tools are no replacement for an ingenious and thoughtful tester. Such a tester will use tools to increase the coverage, speed, reliability, and repeatability of her efforts.
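As a footnote to the earlier question of making a crashing input smaller, here is a minimal Python sketch of automatic input reduction. It is a simple greedy chunk-removal loop, a simplified relative of delta debugging, and is not claimed to be what any of the tools above implement. The `still_crashes` callback is an assumption: in practice it would rerun the target program on the candidate input and report whether the same crash occurs.

```python
from typing import Callable


def minimize(data: bytes, still_crashes: Callable[[bytes], bool]) -> bytes:
    """Greedily remove chunks of data while the crash keeps reproducing."""
    if not still_crashes(data):
        raise ValueError("the starting input must reproduce the crash")
    chunk = max(len(data) // 2, 1)
    while chunk >= 1:
        pos = 0
        while pos < len(data):
            candidate = data[:pos] + data[pos + chunk:]
            if still_crashes(candidate):
                data = candidate  # chunk was superfluous; retry at same position
            else:
                pos += chunk      # chunk is needed; move past it
        chunk //= 2               # try finer-grained removals next
    return data


if __name__ == "__main__":
    # Toy stand-in for "run the target and check for the crash": the
    # imaginary bug needs both a NUL byte and an exclamation mark.
    def toy_crash(candidate: bytes) -> bool:
        return b"\x00" in candidate and b"!" in candidate

    print(minimize(b"hello\x00world!!", toy_crash))
```

If two different crashing inputs reduce to the same small input under a procedure like this, that is a hint that they are signaling the same underlying bug.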