Multi-Language Code Search

Abstract

Searching for specified code patterns is a common component of many kinds of programming tools, regardless of language. Unfortunately, no existing code search approach both produces application-friendly search results and supports multiple languages within a single system. This work presents Yograph, a program representation for multi-language code search that allows languages to share a common representation for the same computation while retaining language-specific information. To bridge syntactic variations within and across languages, a single Yograph can be augmented, using equality rules, to represent many equivalent programs and high-level abstractions. We also present Yogo, a code search tool for Java and Python that implements Yograph and outputs search results as detailed pointers to AST nodes. Our evaluation shows that, in both languages, Yogo can search for realistic patterns in realistic programs and find matches that look different or are interleaved with unrelated code but ultimately perform the same computation.

Publication
Master’s Thesis