31st IEEE Symposium on Security & Privacy
TaintScope: A Checksum-Aware Directed Fuzzing Tool for Automatic Software Vulnerability Detection
  Tielei Wang (1), Tao Wei (1), Guofei Gu (2), Wei Zou (1)
  (1) Peking University, China
  (2) Texas A&M University, US
Outline
  Introduction: Background, Motivation
  TaintScope: Intuition, System Design, Evaluation
  Conclusion
Fuzzing/Fuzz Testing
  Feed target applications with malformed inputs, e.g., invalid, unexpected, or random test cases.
  Proven to be remarkably successful.
  E.g., randomly mutate well-formed inputs and run the target application with the "mutations".
  [Figure: Fuzzer -> Malformed Input -> Application -> crash]
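As a rough illustration of "randomly mutate well-formed inputs", the following is a minimal sketch of a byte-level mutator. The function names (`next_rand`, `mutate_random`) and the xorshift generator are assumptions made for this example; the slides do not describe a specific mutation strategy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tiny xorshift32 PRNG so the sketch is deterministic and portable.
 * The state must be nonzero. */
static uint32_t next_rand(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Corrupt `n_mutations` randomly chosen bytes of `buf` in place.
 * Each mutation XORs the byte with a nonzero value, so a single
 * mutation always changes the byte it hits.
 * Returns the offset of the last mutated byte (0 if none). */
size_t mutate_random(unsigned char *buf, size_t len,
                     size_t n_mutations, uint32_t *state) {
    assert(buf != NULL && len > 0 && *state != 0);
    size_t pos = 0;
    for (size_t i = 0; i < n_mutations; i++) {
        pos = next_rand(state) % len;
        buf[pos] ^= (unsigned char)(1 + next_rand(state) % 255);
    }
    return pos;
}
```

A fuzzing loop would copy a well-formed input, call `mutate_random` on the copy, and run the target on the result while watching for crashes.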
Fuzzing is great
  In the best case, malformed inputs will explore different program paths and trigger security vulnerabilities.
  However ...
A quick example

     1 void decode_image(FILE* fd){
     2   ...
     3   int length = get_length(fd);
     4   int recomputed_chksum = checksum(fd, length);  // re-compute a new checksum
     5   int chksum_in_file = get_checksum(fd);         // read the attached checksum
         //line 6 is used to check the integrity of inputs
     6   if(chksum_in_file != recomputed_chksum)        // compare two values
     7     error();
     8   int Width = get_width(fd);
     9   int Height = get_height(fd);
    10   int size = Width*Height*sizeof(int); //integer overflow
    11   int* p = malloc(size);
    12   ...

  Malformed images will be dropped when the decoder function detects a checksum mismatch.
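The integer overflow marked in the example can be made concrete with a small sketch. It assumes 32-bit arithmetic for the size computation and 4-byte pixels; `wrapped_alloc_size` and `alloc_size_overflows` are hypothetical helper names, not part of the decoder on the slide.

```c
#include <assert.h>
#include <stdint.h>

/* Size as a 32-bit computation would see it: unsigned arithmetic
 * silently wraps modulo 2^32. */
uint32_t wrapped_alloc_size(uint32_t width, uint32_t height) {
    return width * height * (uint32_t)sizeof(int32_t);
}

/* Returns 1 if width * height * 4 does not fit in 32 bits, else 0.
 * The product of two 32-bit values always fits in 64 bits. */
int alloc_size_overflows(uint32_t width, uint32_t height) {
    uint64_t pixels = (uint64_t)width * height;
    return pixels > UINT32_MAX / sizeof(int32_t);
}
```

With `width = height = 0x10000` the wrapped size is 0, so an allocation based on it would be far smaller than the image the decoder then tries to fill.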
Checksum: the bottleneck
  A checksum is a common way to test the integrity of input data.
  Most mutations are blocked at the checksum test point:
    if(checksum(Data) != Chksum)
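To see why the checksum acts as a bottleneck, consider a toy format invented for this sketch: payload bytes followed by a 4-byte little-endian additive checksum. The format and all function names are assumptions; real formats typically use stronger checksums such as CRC32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Additive checksum over the payload (hypothetical toy algorithm). */
static uint32_t sum_bytes(const unsigned char *data, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) sum += data[i];
    return sum;
}

static uint32_t read_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the trailing checksum matches the payload, else 0. */
int passes_checksum(const unsigned char *buf, size_t len) {
    assert(buf != NULL);
    if (len < 4) return 0;
    size_t payload_len = len - 4;
    return sum_bytes(buf, payload_len) == read_le32(buf + payload_len);
}

/* Count how many single-byte corruptions of `buf` still pass the check.
 * The buffer is restored after each trial. */
size_t count_surviving_mutations(unsigned char *buf, size_t len) {
    size_t survivors = 0;
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= 0xFF;                               /* corrupt one byte */
        survivors += (size_t)passes_checksum(buf, len);
        buf[i] ^= 0xFF;                               /* restore it */
    }
    return survivors;
}
```

For a valid input in this format, every single-byte corruption changes either the recomputed sum or the stored field, so none of them gets past the check.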
Our motivation
  Our goal: penetrate checksum checks!
Intuition
  Disable checksum checks by control flow alteration.
  [Figure: original program -> modified program]
    if(checksum(Data) != Chksum) goto L1;
    exit();
    L1: continue();
  Fuzz the modified program.
  Repair the checksum fields in malformed inputs that can crash the modified program.
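The effect of disabling the check can be sketched on a toy decoder. Here a `check_disabled` flag stands in for the binary patch; the header layout, the enum, and `decode_header` are all assumptions made for illustration, since TaintScope alters the branch in the binary rather than adding a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum result { ACCEPTED, REJECTED, OVERFLOW_BUG };

/* Hypothetical header: [width: 2 bytes LE][height: 2 bytes LE][checksum: 1 byte],
 * where the checksum is the low 8 bits of the sum of the first four bytes. */
enum result decode_header(const unsigned char *h, size_t len, int check_disabled) {
    assert(h != NULL);
    if (len < 5) return REJECTED;

    unsigned char recomputed = (unsigned char)(h[0] + h[1] + h[2] + h[3]);
    if (h[4] != recomputed && !check_disabled)
        return REJECTED;                 /* original behavior: drop the input */

    uint32_t width  = (uint32_t)h[0] | ((uint32_t)h[1] << 8);
    uint32_t height = (uint32_t)h[2] | ((uint32_t)h[3] << 8);
    uint64_t size = (uint64_t)width * height * sizeof(int32_t);
    if (size > INT32_MAX)
        return OVERFLOW_BUG;             /* stands in for the crash a fuzzer would observe */
    return ACCEPTED;
}
```

A malformed header with a stale checksum is rejected by the original decoder, but reaches the overflow once the check is disabled. That input then only needs its checksum field repaired to crash the original program too.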
Key Questions
  Q1: How to locate the checksum test instructions in a binary program?
  Q2: How to effectively and efficiently fuzz for security vulnerability detection?
  Q3: How to generate the correct checksum value for the invalid inputs that can crash the modified program?
TaintScope Overview
  [Figure: system architecture. Components: Execution Monitor, Checksum Locator (Q1), Directed Fuzzer (Q2), Checksum Repairer (Q3). Data passed between them: Instruction Profile, Hot Bytes Info, Modified Program, Crashed Samples, Reports.]
A1: Locate the checksum test instruction
  Key Observation 1: A checksum is usually used to protect a large number of input bytes.
  [Figure: input layout "Data | Chksum", checked by if(checksum(Data) != Chksum)]
  Based on fine-grained taint analysis, we first find the conditional jump instructions (e.g., jz, je) that depend on more than a certain number of input bytes.
  Take these conditional jump instructions as candidates.
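The candidate filter can be sketched with byte-level taint labels. This is a simplified model, assuming inputs of at most 64 bytes so that a taint set fits in one 64-bit mask; the type and function names are hypothetical, and the threshold is a free parameter, as on the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Taint label: bit i set means "this value depends on input byte i". */
typedef uint64_t taint_t;

static taint_t taint_of_byte(size_t offset) {
    assert(offset < 64);
    return (taint_t)1 << offset;
}

/* A value computed from two operands depends on the union of their sources. */
static taint_t taint_union(taint_t a, taint_t b) { return a | b; }

/* Number of distinct input bytes in a taint set. */
size_t count_tainting_bytes(taint_t t) {
    size_t n = 0;
    while (t) { t &= t - 1; n++; }   /* clear the lowest set bit */
    return n;
}

/* Taint of an accumulator that folds in input bytes [start, end),
 * as a checksum loop would. */
taint_t taint_of_range(size_t start, size_t end) {
    taint_t t = 0;
    for (size_t i = start; i < end; i++)
        t = taint_union(t, taint_of_byte(i));
    return t;
}

/* A conditional jump is a candidate if its condition depends on more
 * than `threshold` input bytes. */
int is_checksum_candidate(taint_t branch_taint, size_t threshold) {
    return count_tainting_bytes(branch_taint) > threshold;
}
```

A branch comparing a checksum over 32 data bytes against a 4-byte field depends on 36 bytes and is kept, while a branch testing a 2-byte width field is filtered out.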
A1: Locate the checksum test instruction
  Key Observation 2: Well-formed inputs can pass the checksum test, but most malformed inputs cannot.
  We log the behaviors of the candidate conditional jump instructions:
  ① Run well-formed inputs and identify the always-taken and always-not-taken instructions.
  ② Run malformed inputs and likewise identify the always-taken and always-not-taken instructions.
  ③ Identify the conditional jump instruction that behaves completely differently when processing well-formed and malformed inputs.
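Steps ① to ③ amount to comparing two branch profiles. The sketch below assumes each candidate jump has been counted as taken or not taken across both input sets; the `branch_profile` struct and function names are illustrative, not TaintScope's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Per-branch counters collected over the two input sets. */
typedef struct {
    unsigned long address;
    unsigned taken_wellformed, not_taken_wellformed;
    unsigned taken_malformed, not_taken_malformed;
} branch_profile;

static int always_taken(unsigned taken, unsigned not_taken) {
    return taken > 0 && not_taken == 0;
}

static int always_not_taken(unsigned taken, unsigned not_taken) {
    return taken == 0 && not_taken > 0;
}

/* A checksum check goes one way for every well-formed input and the
 * opposite way for every malformed input. */
int looks_like_checksum_check(const branch_profile *b) {
    assert(b != NULL);
    int wf_taken = always_taken(b->taken_wellformed, b->not_taken_wellformed);
    int wf_not   = always_not_taken(b->taken_wellformed, b->not_taken_wellformed);
    int mf_taken = always_taken(b->taken_malformed, b->not_taken_malformed);
    int mf_not   = always_not_taken(b->taken_malformed, b->not_taken_malformed);
    return (wf_taken && mf_not) || (wf_not && mf_taken);
}

/* Returns the index of the first matching candidate, or -1 if none. */
int find_checksum_check(const branch_profile *candidates, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (looks_like_checksum_check(&candidates[i])) return (int)i;
    return -1;
}
```

Branches with mixed behavior, or with the same behavior on both input sets, are discarded. The slide says "most" malformed inputs fail the check, so a practical version might tolerate a few outliers instead of requiring strictly always-taken behavior.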
A2: Effective and efficient fuzzing
  Blindly mutating inputs creates a huge number of redundant test cases, which is ineffective and inefficient.
  Directly modifying the "width" or "height" fields will trigger the bug easily:

     1 void decode_image(FILE* fd){
     2   ...
     6   if(chksum_in_file != recomputed_chksum) goto 8;
     7   error();
     8   int Width = get_width(fd);
     9   int Height = get_height(fd);
    10   int size = Width*Height*sizeof(int); //integer overflow
    11   int* p = malloc(size);
    12   ...

  Directed fuzzing: focus on modifying the "hot bytes", i.e., the input bytes that flow into critical system/library calls (memory allocation, string operations).
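Once the hot bytes are known, directed mutation only needs to touch those offsets. A minimal sketch, assuming the taint analysis has already produced a list of hot offsets (`mutate_hot_bytes` and its parameters are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Overwrite only the hot-byte offsets with an "interesting" value such as
 * 0xFF, leaving the rest of the well-formed input untouched.
 * Offsets beyond the buffer are skipped.
 * Returns the number of bytes actually written. */
size_t mutate_hot_bytes(unsigned char *buf, size_t len,
                        const size_t *hot_offsets, size_t n_hot,
                        unsigned char value) {
    assert(buf != NULL && hot_offsets != NULL);
    size_t written = 0;
    for (size_t i = 0; i < n_hot; i++) {
        if (hot_offsets[i] < len) {
            buf[hot_offsets[i]] = value;
            written++;
        }
    }
    return written;
}
```

In the image example, the hot offsets would be the bytes of the width and height fields, because they flow into the size argument of `malloc`. The mutation space shrinks from the whole file to a handful of bytes.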
A3: Generate the correct checksum
  The classical solution is symbolic execution and constraint solving.
  Solving checksum(Data) == Chksum is hard or impossible if both Data and Chksum are symbolic values.
  We use combined concrete/symbolic execution:
  Only leave the bytes in the checksum field as symbolic values.
  Collect and solve the trace constraints on Chksum when reaching the checksum test instruction.
  Note that:
  checksum(Data) is a runtime-determinable constant value.
  Chksum originates from the checksum field, but may be transformed, such as from hex/oct to dec number, from little ...
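Because checksum(Data) is a constant once Data is concrete, repairing an input reduces to finding field bytes whose transformed value equals that constant. The sketch below uses a made-up format (payload followed by a 4-byte big-endian additive checksum) and inverts the transformation by hand instead of calling a constraint solver, which is a deliberate simplification of the approach on the slide. All names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Additive checksum over the payload (hypothetical toy algorithm). */
static uint32_t sum_bytes(const unsigned char *data, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) sum += data[i];
    return sum;
}

/* The transformation the program applies to the raw checksum field. */
static uint32_t decode_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* The check performed by the original program. */
int passes_checksum(const unsigned char *buf, size_t len) {
    assert(buf != NULL);
    if (len < 4) return 0;
    return sum_bytes(buf, len - 4) == decode_be32(buf + len - 4);
}

/* Repair: with Data concrete, c = checksum(Data) is a constant, so the
 * only constraint left is decode_be32(field) == c. For this
 * transformation the solution is the big-endian encoding of c. */
void repair_checksum(unsigned char *buf, size_t len) {
    assert(buf != NULL && len >= 4);
    uint32_t c = sum_bytes(buf, len - 4);
    unsigned char *field = buf + len - 4;
    field[0] = (unsigned char)(c >> 24);
    field[1] = (unsigned char)(c >> 16);
    field[2] = (unsigned char)(c >> 8);
    field[3] = (unsigned char)c;
}
```

When the field-to-value transformation is not as easy to invert by hand, solving the collected trace constraints on the field bytes plays the role that `repair_checksum` plays here.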
Design Summary
  Directed Fuzzing
    Identify and modify "hot bytes" in valid inputs to generate malformed inputs.
    Built on top of the PIN binary instrumentation platform.
  Checksum-aware Fuzzing
    Locate checksum check points and checksum fields.
    Modify the program to accept all kinds of input data.
    Generate correct checksum fields for malformed inputs that can crash the modified program.
Evaluation
  Component evaluation
    E1: Can TaintScope locate checksum points and checksum fields?
    E2: How many hot bytes are there in a valid input?
    E3: Can TaintScope generate a correct checksum field?
  Overall evaluation
    E4: Can TaintScope detect previously unknown vulnerabilities in real-world applications?
Evaluation 1: locate checksum points
  We tested several common checksum algorithms, including CRC32, MD5, and Adler32.
  TaintScope accurately located the check statements.
Evaluation 2: identify hot bytes
  We measured the number of bytes that could affect the size arguments in memory allocation functions.
Evaluation 3: generate correct checksum fields
  We tested malformed inputs in four kinds of file formats.
  TaintScope was able to generate correct checksum fields.
Evaluation 4: 27 previously unknown vulnerabilities
  Affected applications: MS Paint, Google Picasa, irfanview, gstreamer, Amaya, dillo, Adobe Acrobat, ImageMagick, Winamp, XEmacs, wxWidgets, PDFlib
Conclusion
  Checksums are a big challenge for fuzzing tools.
  TaintScope can perform:
    Directed fuzzing
      Identify which bytes flow into system/library calls.
      Dramatically reduce the mutation space.
    Checksum-aware fuzzing
      Disable checksum checks by control flow alteration.
      Generate correct checksum fields in invalid inputs.
  TaintScope detected dozens of serious ...
Thanks for your attention!