Generalized vulnerability extrapolation using abstract syntax trees
Citations over time: top 10% of 2012 papers
Abstract
The discovery of vulnerabilities in source code is key to securing computer systems. While specific types of security flaws can be identified automatically, finding vulnerabilities in the general case cannot be automated, and most vulnerabilities are discovered by manual analysis. In this paper, we propose a method for assisting a security analyst during source code audits. Our method extracts abstract syntax trees from the code and determines structural patterns in these trees, such that each function in the code can be described as a mixture of these patterns. This representation enables us to decompose a known vulnerability and extrapolate it to a code base, so that functions potentially suffering from the same flaw can be suggested to the analyst. We evaluate our method on the source code of four popular open-source projects: LibTIFF, FFmpeg, Pidgin and Asterisk. For three of these projects, we are able to identify zero-day vulnerabilities by inspecting only a small fraction of the code bases.
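The extrapolation idea can be illustrated with a small sketch. It is not the paper's implementation: it assumes Python's built-in `ast` module as a stand-in for a C parser, it uses counts of parent/child node-type pairs as a simplified structural feature, and it skips the step that decomposes functions into a mixture of patterns, ranking by plain cosine similarity instead. The sample functions and the names `function_features`, `cosine` and `extrapolate` are hypothetical.

```python
import ast
import math
from collections import Counter

# Hypothetical code base: two structurally identical readers and one unrelated function.
SOURCE = '''
def read_header(buf, n):
    out = []
    for i in range(n):
        out.append(buf[i])
    return out

def read_body(buf, n):
    data = []
    for j in range(n):
        data.append(buf[j])
    return data

def greet(name):
    return "hello " + name
'''

def function_features(source):
    """Map each function name to counts of (parent, child) AST node-type pairs."""
    features = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            counts = Counter()
            for parent in ast.walk(node):
                for child in ast.iter_child_nodes(parent):
                    counts[(type(parent).__name__, type(child).__name__)] += 1
            features[node.name] = counts
    return features

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def extrapolate(source, known_vulnerable):
    """Rank all other functions by structural similarity to a known-vulnerable one."""
    features = function_features(source)
    seed = features[known_vulnerable]
    scores = [(name, cosine(seed, vec))
              for name, vec in features.items() if name != known_vulnerable]
    return sorted(scores, key=lambda item: item[1], reverse=True)

# Treat read_header as the known flaw and list candidates for the analyst.
ranking = extrapolate(SOURCE, "read_header")
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because `read_body` differs from `read_header` only in identifier names, it ranks first, while `greet` shares only a few generic node pairs and ranks last. An analyst would inspect the top of such a ranking rather than the whole code base.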