Next: Code Analysis and Optimization Up: Joeq: A Virtual Machine Previous: Overview


Front-end

The front-end component handles the loading and parsing of input files into the virtual machine. Joeq has support for three types of input files: Java class files[18], SUIF intermediate representation files[2], and ELF binary files[21].

The Java class loader decodes each Java class file into an object-oriented representation of the class and the members it contains. Our class loader fixes many of the nonuniformities and idiosyncrasies present in Java class files. For example, we make a distinction at the type level between static and instance fields and methods; i.e., there are separate classes for instance methods and static methods, and likewise for fields. Member references in class files do not make these distinctions. We handle this by deferring the creation of the object representing the field or method until we are actually forced to resolve the member, at which point we know whether it is static or instance. We also explicitly include the implicit "this" parameter in the parameter list for instance methods, so code can treat method parameters uniformly.
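The deferred-resolution scheme can be sketched roughly as follows. This is a minimal illustration, not Joeq's actual class hierarchy: the names `Method`, `StaticMethod`, `InstanceMethod`, and `MethodRef` are hypothetical, types are represented as plain strings, and the `isStatic` argument stands in for whatever lookup the loader performs at resolution time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; a sketch of the idea, not Joeq's API.
abstract class Method {
    final String declaringClass;
    final String name;
    final List<String> declaredParams;

    Method(String declaringClass, String name, List<String> declaredParams) {
        this.declaringClass = declaringClass;
        this.name = name;
        this.declaredParams = declaredParams;
    }

    // Parameter types as seen by analyses, in a uniform form.
    abstract List<String> paramTypes();
}

final class StaticMethod extends Method {
    StaticMethod(String c, String n, List<String> p) { super(c, n, p); }

    @Override
    List<String> paramTypes() {
        return new ArrayList<>(declaredParams);
    }
}

final class InstanceMethod extends Method {
    InstanceMethod(String c, String n, List<String> p) { super(c, n, p); }

    @Override
    List<String> paramTypes() {
        // Make the implicit "this" parameter explicit as parameter 0.
        List<String> all = new ArrayList<>();
        all.add(declaringClass);
        all.addAll(declaredParams);
        return all;
    }
}

// A member reference as it appears in a class file: it does not say
// whether the target is static or instance, so no Method object is
// created until resolution is forced.
final class MethodRef {
    private final String declaringClass;
    private final String name;
    private final List<String> params;
    private Method resolved; // null until resolve() is first called

    MethodRef(String declaringClass, String name, List<String> params) {
        this.declaringClass = declaringClass;
        this.name = name;
        this.params = params;
    }

    boolean isResolved() {
        return resolved != null;
    }

    // isStatic is assumed to come from the lookup done at resolution time.
    Method resolve(boolean isStatic) {
        if (resolved == null) {
            resolved = isStatic
                ? new StaticMethod(declaringClass, name, params)
                : new InstanceMethod(declaringClass, name, params);
        }
        return resolved;
    }
}
```

With this shape, a client that iterates over `paramTypes()` never needs to special-case instance methods: the receiver is simply the first entry in the list.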

The SUIF loader loads and parses files in SUIF, a standard intermediate format that is widely used in the compiler research community[2]. SUIF front-ends are available for many languages, such as C, C++, and Fortran, which allows Joeq to load and compile programs written in these languages.

The ELF binary loader can load and decode object files, libraries, and executable images in the popular ELF format[21]. The front-end also includes an intelligent disassembler, which can disassemble the binary code for a function, undoing stack spills and converting the code into operations on pseudo-registers. It also recognizes some common control-flow idioms. This allows Joeq to load and analyze binary code as if it came from just another front-end.

All three of these formats are converted into a unified intermediate representation based on pseudo-registers called the Quad form, which is covered in more detail in the next section. Because all inputs lead to a unified format, all analyses and optimizations on that format can be performed uniformly across all of these types of code. This allows us, for example, to inline Java Native Interface (JNI) C function implementations into their callers, or to analyze arbitrary library calls to see if a passed-in reference can be written to another location. Working across language boundaries in this way is especially powerful: it lets us avoid many redundant checks as well as the marshalling and unmarshalling of arguments, and it frees analyses and optimizations from having to make conservative assumptions about cross-language procedure calls.
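To make the benefit of a single representation concrete, the sketch below shows an analysis written once against a toy quad-like form and applied to code regardless of which front-end produced it. Everything here is an assumption for illustration: the `Op`, `Quad`, and `EscapeCheck` names, the three-operator instruction set, and the restriction to straight-line code are not taken from Joeq, whose actual Quad form is described in the next section.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy operator set (hypothetical, far smaller than any real IR).
enum Op { MOVE, STORE, RETURN }

// One instruction over pseudo-registers: dest is the register defined
// (null if none), src is the register used.
final class Quad {
    final Op op;
    final String dest;
    final String src;

    Quad(Op op, String dest, String src) {
        this.op = op;
        this.dest = dest;
        this.src = src;
    }
}

final class EscapeCheck {
    // Returns true if the value passed in register `param` may be written
    // to memory by a STORE. Handles straight-line code only; copies of the
    // parameter made by MOVE are tracked as aliases.
    static boolean mayBeStored(List<Quad> body, String param) {
        Set<String> aliases = new HashSet<>();
        aliases.add(param);
        for (Quad q : body) {
            if (q.op == Op.MOVE) {
                if (aliases.contains(q.src)) {
                    aliases.add(q.dest);    // dest now holds the reference
                } else {
                    aliases.remove(q.dest); // dest overwritten by something else
                }
            } else if (q.op == Op.STORE && aliases.contains(q.src)) {
                return true;
            }
        }
        return false;
    }
}
```

Because `mayBeStored` only sees a list of `Quad` objects, it gives the same kind of answer for a method body decoded from a class file as for a library routine recovered from an ELF image, which is the property the text relies on when it talks about avoiding conservative assumptions at cross-language calls.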


John Whaley 2003-03-15