Alexandria Digital Research Library

Automatic Detection and Repair of Input Validation and Sanitization Bugs

Alkhalaf, Muath Abdullah
Degree Grantor:
University of California, Santa Barbara. Computer Science
Degree Supervisor:
Tevfik Bultan
Place of Publication:
[Santa Barbara, Calif.]
University of California, Santa Barbara
Creation Date:
Issued Date:
Computer Science
Model Checking
Formal Methods
Software Verification
Program Repair
Dissertations, Academic and Online resources
Ph.D.--University of California, Santa Barbara, 2014

A crucial problem in developing dependable web applications is the correctness of the input validation and sanitization. Bugs in string manipulation operations used for validation and sanitization are common, resulting in erroneous application behavior and vulnerabilities that are exploitable by malicious users. In this dissertation, we investigate the problem of automatic detection and repair of validation and sanitization bugs both at the client-side (JavaScript) and the server-side (PHP or Java) code.

We first present a formal model for input validation and sanitization functions along with a new domain specific intermediate language to represent them. Then, we show how to extract input validation and sanitization functions in our intermediate language from both client and server-side code in web applications. After the extraction phase, we use automata-based static string-analysis techniques to automatically verify and fix the extracted functions. One of our contributions is the development of efficient automata-based string analysis techniques for frequently used, complex string operations.

We developed two basic approaches to bug detection and repair: 1) policy-based, and 2) differential. In the policy-based approach, input validation and sanitization policies are expressed using two regular expressions, one specifying the maximum policy (the upper bound for the set of strings that should be allowed) and the other specifying the minimum policy (the lower bound for the set of strings that should be allowed). Using our string analysis techniques we can identify two types of errors in an input validation and sanitization function: 1) it accepts a set of strings that is not permitted by the maximum policy (i.e., it is under-constrained), or 2) it rejects a set of strings that is permitted by the minimum policy (i.e., it is over-constrained).

Our differential bug detection and repair approach does not require any policy specifications. It exploits the fact that, in web applications, developers typically perform redundant input validation and sanitization in both the client and the server-side since client-side checks can be by-passed. Using automata-based string analysis, we compare the input validation and sanitization functions extracted from the client- and server-side code, and identify and report the inconsistencies between them. Finally, we present an automated differential repair technique that can repair client and server-side code with respect to each other, or across applications in order to strengthen the validation and sanitization checks. Given a reference and a target function, our differential repair technique strengthens the validation and sanitization operations in the target function based on the reference function by automatically generating a set of patches.

We experimented with a number of real world web applications and found many bugs and vulnerabilities. Our analysis generates counter-example behaviors demonstrating the detected bugs and vulnerabilities to help the developers with the debugging process. Moreover, we automatically generate patches that can be used to mitigate the detected bugs and vulnerabilities until developers write their own patches.

Physical Description:
1 online resource (252 pages)
UCSB electronic theses and dissertations
Catalog System Number:
Inc.icon only.dark In Copyright
Copyright Holder:
Muath Alkhalaf
File Description
Access: Public access
Alkhalaf_ucsb_0035D_12209.pdf pdf (Portable Document Format)