Commit 03874a4

Initial contribution of compiler technology consisting of:
* high-level optimization technology featuring classic compiler optimizations, loop optimizations, control and data flow analyses, and support data structures
* code generation technology with deep platform exploitation for x86 (i386 and x86-64), Power, System Z, and ARM (32-bit)
* a robust, tree-based intermediate representation (or IL) and support code for producing IL from different method representations
* expressive tracing and logging infrastructure for problem determination
* JitBuilder technology to simplify the effort to integrate a JIT compiler into an existing language interpreter
* a framework for constructing language-agnostic unit tests for compiler technology

Issue: #199

Signed-off-by: Daryl Maier <maier@ca.ibm.com>
1 parent e648ee9 commit 03874a4

1,208 files changed: +681314 -0 lines changed

compiler/README.md

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
# Eclipse OMR Compiler Technology

The Eclipse OMR compiler technology is a collection of stable, high performance optimization and code generation technologies suitable for integration into dynamic and static language runtime environments.

It is not a standalone compiler that can be linked into another environment. Rather, it provides all the essential building blocks for integrating and adapting this advanced compiler technology for different language environments.

It originates from a mature, feature-rich compiler technology developed within IBM called Testarossa. This technology was developed from the outset to be used in highly dynamic environments such as Java, but has proven its adaptability in static compilers, trace-based compilation, and binary re-translators.

This technology is consumed as-is in some of IBM's high-performance runtimes, such as the J9 Java virtual machine, and is used in production environments.

# What has been contributed?

* high-level optimization technology featuring classic compiler optimizations, loop optimizations, control and data flow analyses, and support data structures
* code generation technology with deep platform exploitation for x86 (i386 and x86-64), Power, System Z, and ARM (32-bit)
* a robust, tree-based intermediate representation (or IL) and support code for producing IL from different method representations
* expressive tracing and logging infrastructure for problem determination
* [JitBuilder](https://developer.ibm.com/open/2016/07/19/jitbuilder-library-and-eclipse-omr-just-in-time-compilers-made-easy/) technology to simplify the effort to integrate a JIT compiler into an existing language interpreter
* a framework for constructing language-agnostic unit tests for compiler technology

# What's next?

While this is a significant initial contribution of compiler technology, more is on the way, including:

* integration of conformance tests with the Eclipse OMR makefiles, also demonstrating a sample hookup of the technology
* lots more documentation, in source and in the `doc/compiler` directory describing the compiler technology architecture and its inner workings
* further code refactoring and design consistency enhancements
* additional optimizations and code generation technology upon refactoring from within the IBM Testarossa code


# A Tour of the Source Code

The compiler technology is written largely in C++, but a handful of support functions are written in C and assembler.

The structure of the codebase and the design of the class hierarchy reflect this technology's heritage and the requirement to adapt to a wide variety of compilation environments (or *projects*, as they are often called).

Many of the classes in Testarossa use a design pattern we call [*extensible classes*](../doc/compiler/extensible_classes/Extensible_Classes.md). This is a pattern to achieve extension through composition and static polymorphism.
Extensible classes are an efficient and useful means to extend and specialize the core technology provided by Eclipse OMR for a particular project and for a particular processor architecture.
The extensible design is the reason for the shape of the class hierarchy, the layout of directories, and file naming.
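
As a rough illustration of what "extension through composition and static polymorphism" can look like, here is a much simplified sketch. It is not taken from the OMR sources: the layering of `OMR::`, `OMR::X86::`, and `TR::` follows the namespaces described in this README, while the `self()` downcast helper and the `describe` and `targetName` members are assumptions made for the example. The linked extensible classes document is the authoritative description.

```cpp
#include <cassert>
#include <string>

// Forward declaration of the most-derived ("concrete") class.
namespace TR { class CodeGenerator; }

namespace OMR
{
// Core layer: behaviour shared by every project and architecture.
class CodeGenerator
   {
public:
   // Static downcast to the concrete type; no virtual dispatch is involved.
   TR::CodeGenerator *self();

   std::string describe();                          // calls through self()
   std::string targetName() { return "generic"; }   // hypothetical member
   };

namespace X86
{
// Architecture layer: specializes the core layer by hiding targetName().
class CodeGenerator : public OMR::CodeGenerator
   {
public:
   std::string targetName() { return "x86"; }
   };
}
}

namespace TR
{
// Concrete class: the one name the rest of the compiler refers to.
class CodeGenerator : public OMR::X86::CodeGenerator {};
}

inline TR::CodeGenerator *OMR::CodeGenerator::self()
   {
   return static_cast<TR::CodeGenerator *>(this);
   }

inline std::string OMR::CodeGenerator::describe()
   {
   // Name lookup starts at TR::CodeGenerator, so the x86 version is chosen
   // at compile time even though this code lives in the core layer.
   return "codegen for " + self()->targetName();
   }
```

In this sketch, calling `describe()` on a `TR::CodeGenerator` yields `"codegen for x86"`. Swapping in a different architecture layer changes only the base class of `TR::CodeGenerator`, which is one way to read the remark above about the shape of the class hierarchy and directory layout.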

The core compiler components are provided under the `compiler/` top-level directory and are organized as follows:

Directory | Purpose
--------- | -------
codegen/ | Code for transforming IL trees into machine instructions. This includes pseudo-instruction generation with virtual registers, local register assignment, binary encoding, and relocation processing.
compile/ | Logic managing the compilation of a method.
control/ | Generic logic to decide on when and how to compile a method.
cs2/ | A legacy collection of utilities providing functionality such as container classes and lexical timers. The functions within this directory are **deprecated** and are actively being replaced with C++ STL equivalents or new implementations based on STL.
env/ | Generic interface to the environment that is requesting the compilation. In most cases this is the interface to the VM or compiler frontend that is incorporating the Eclipse OMR compiler technology. For example, it can be used to answer questions about the VM configuration, object model, classes, floating point semantics, etc.
il/ | Intermediate language definition and utilities.
ilgen/ | Utilities to help with the generation of intermediate language from some external representation.
infra/ | Support infrastructure.
optimizer/ | High-level, IL tree-based optimizations and utilities.
ras/ | Debug and serviceability utilities, including tracing and logging.
runtime/ | Post-compilation services available to compiled code at runtime.
arm/ | ARM processor specializations
x/ | X86 processor specializations
p/ | Power processor specializations
z/ | System Z processor specializations

Other resources can be found in the Eclipse OMR project as follows:

Directory | Purpose
--------- | -------
jitbuilder/ | JitBuilder technology extending Eclipse OMR technology
fvtest/compilertest | Unit tests for compiler technology
doc/compiler | Additional documentation

## Namespaces

The `OMR::`, `Test::`, and `JitBuilder::` namespaces are used to isolate compiler technology for those particular environments. Processor architecture specialized namespaces (e.g., `X86::`, `Power::`, `Z::`, and `ARM::`) can be nested within them. If you extend the Eclipse OMR technology you should choose a unique namespace for your project.

In general, the `TR::` namespace (short for Testarossa) is the canonical namespace for all of the compiler technology that is visible across multiple projects.

You may encounter references that are in the global namespace but whose identifiers are simply prefixed with `TR_`. This is inconsistent with the namespace convention just described, and they are being moved to the `TR::` namespace as refactoring continues.
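
To make the nesting concrete, the following sketch shows how a project might layer its own namespace on top of the OMR ones. Everything project-related here is hypothetical: `MyLang::` is an invented project namespace, and `Example` and `origin()` are placeholder names that do not exist in the codebase.

```cpp
#include <cassert>
#include <string>

namespace OMR
{
// Project-neutral, architecture-neutral layer.
struct Example { std::string origin() { return "OMR"; } };

namespace X86
{
// Architecture namespace nested inside the OMR namespace.
struct Example : OMR::Example { std::string origin() { return "OMR::X86"; } };
}
}

// Hypothetical project namespace, chosen to be unique to the project.
namespace MyLang
{
namespace X86
{
// The project specializes the x86 layer for its own needs.
struct Example : OMR::X86::Example { std::string origin() { return "MyLang::X86"; } };
}
}

namespace TR
{
// Canonical name that code shared across projects refers to.
struct Example : MyLang::X86::Example {};
}
```

With these definitions, `TR::Example().origin()` resolves to the `MyLang::X86` version, so code written against `TR::Example` picks up the project and architecture specializations without naming either namespace.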

## XXX_PROJECT_SPECIFIC macros

Throughout the codebase you may find code guarded with `#ifdef XXX_PROJECT_SPECIFIC` directives. Project-specific macros are an artifact of the refactoring process that produced this code. The initial code contribution originated from a much larger codebase of compiler technology used in several compiler products and compilation scenarios. In order to contribute this code sooner we eliminated most, but not all, of the code that has a tighter coupling to a particular environment. For example, guarded code may require data structures or header files not present in the initial contribution.

Generally there should not be a need to enable these macros. Our expectation is that guarded code will either be enabled over time as it is made more general purpose for other language environments, or be removed outright as that code is refactored as part of the Eclipse OMR project.

We recognize the presence of these macros is far from ideal, and IBM will be working to eliminate them over time so that the codebase is self-contained and fully testable. These macros should not be imitated in newer code commits except where absolutely necessary.
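
For readers who have not met such guards yet, a guarded region might look roughly like the sketch below. The macro name is the literal placeholder `XXX_PROJECT_SPECIFIC` used in this section; the header path, the `ProjectOnly::annotate` call, and `describeSymbol` are all invented for illustration and do not refer to real code.

```cpp
#include <cassert>
#include <string>

// XXX_PROJECT_SPECIFIC is left undefined here, as in a default build.
#ifdef XXX_PROJECT_SPECIFIC
#include "project/ProjectOnlyHeader.hpp"   // hypothetical header, not part of this contribution
#endif

// Hypothetical helper with one project-coupled code path.
inline std::string describeSymbol(const std::string &name)
   {
   std::string result = "symbol " + name;
#ifdef XXX_PROJECT_SPECIFIC
   // Guarded path: relies on declarations from the header above, so it
   // cannot compile unless the project supplies them.
   result += " [" + ProjectOnly::annotate(name) + "]";
#endif
   return result;
   }
```

With the macro undefined, the guarded lines are never seen by the compiler, which is why the missing header causes no build error and why enabling the macro without the supporting project code would.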

# Build Info

The compiler technology has been built successfully with the following compilers:

OS | Architecture | Build Compiler | Version
------|--------------|----------------|--------
Linux | x86 | g++ | 4.4.7
Linux | s390x | g++ | 4.4.7
Linux | ppc64le | XLC | 12.1
Linux | ppc64le | g++ | 4.4.7
AIX | ppc64 | XLC | 12.1
z/OS | s390x | XLC | v2r2

Older compilers may not support all the C++11 features required, and while we
endeavour to build with newer compilers, incompatibilities have occurred
previously. Issues are welcome where forward compatibility is broken.

We are aware that not everyone will have access to all these build
compilers, so we will help with any compatibility fixes required to make a pull
request mergeable. To minimize issues, see the table below containing a
summary of our ability to support C++ language features.


C++11 Core Language Features | Supported
------------------------------------------------|----------
Variadic templates | Some
static_assert | Yes
auto | Yes
decltype | Yes
Delegating constructors | No
Inheriting constructors | No
Extended friend declarations | No
Range-based for-loop | No
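
As a rough guide to what the table means in practice, the sketch below stays within the features marked "Yes" and works around two of the features marked "No". `Counter` and `sum` are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <vector>

// static_assert is listed as supported.
static_assert(sizeof(long long) >= sizeof(int), "unexpected integer sizes");

class Counter
   {
public:
   // Delegating constructors are not supported, so both constructors
   // forward to a private init() helper instead of to each other.
   Counter() { init(0); }
   explicit Counter(int start) { init(start); }

   int value() const { return _value; }
   void add(int n) { _value += n; }

private:
   void init(int start) { _value = start; }
   int _value;
   };

inline int sum(const std::vector<int> &values)
   {
   Counter total;
   // Range-based for is not supported; use an explicit iterator loop.
   // auto is supported, so the iterator type need not be spelled out.
   for (auto it = values.begin(); it != values.end(); ++it)
      total.add(*it);

   // decltype is supported; here it names the return type of value().
   decltype(total.value()) result = total.value();
   return result;
   }
```

Writing new code in this style should reduce the chance that a pull request builds on a recent compiler but fails on one of the older build compilers listed above.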
