Compiler Design Spring 2018 - Computer Science Department ETH Zurich, Switzerland
Compiler Design Spring 2018 Thomas R. Gross Computer Science Department ETH Zurich, Switzerland 1
Logistics § Lecture § Tuesdays: 10:15 – 11:55 § Thursdays: 10:15 – 11:55 § In ETF E1 § Recitation § Announced later § Watch lecture website and your ETH email § Lecture website: via www.lst.inf.ethz.ch § Lecture slides § Homework assignment § If questions related to assignments: contact assistants (mailing list) § If questions related to lecture: write to me 2
Rules § Rule #1: Peace in the lecture hall 3
What do you want to get out of this class? § Please fill out entry questionnaire § It’s anonymous 4
What I hope to teach you in this class 1. Compiler design: Structure of a simple compiler § Simple: 2-3K lines of Java code (maybe a bit more) § Industry: C1 compiler in HotSpot VM is considered “simple” § 30K lines of C/C++/assembly code 2. Software engineering: How to design a large(r) software system § Sometimes there is no “right” or “wrong” § Sometimes there is 3. Programming § What the programming language design document should tell you § How to use that information 8
Course structure § You will not learn the material from lectures alone § Homework is essential! 9
Homework § Core element of the course § You will build a compiler § More on this topic (organization, constraints) later 10
Compiler design and implementation § What is your favorite compiler? Please talk to your neighbor and tell him/her which compiler(s) you used and if you have a “favorite” compiler. § Why? Justify your answer. Can you and your neighbor agree on what matters to you in a compiler? 11
Observations § Languages are important § Source language L1 § Target language L2 § Host language LH § Programs can be “executed” § Program is a sequence of expressions E1, E2, … § A processor contains state § Execution of expressions: Each expression Ei may read state, modify state, and determine next expression to execute Ej § A special expression Estop indicates that program execution stops 14
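The execution model above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the names `TinyMachine`, `State`, `Expression`, and `countToFifteen` are hypothetical, the state is reduced to a single accumulator, and `STOP` plays the role of Estop.

```java
import java.util.List;

// Hypothetical model of "a program is a sequence of expressions E1, E2, ...".
final class TinyMachine {
    static final int STOP = -1; // stands in for the special expression Estop

    // The processor state; a single accumulator is an assumption made for brevity.
    static final class State {
        int acc;
    }

    // An expression may read state, modify state, and returns the index of the next expression.
    interface Expression {
        int execute(State state, int self);
    }

    // Executes expressions starting at index 0 until one of them returns STOP.
    static int run(List<Expression> program) {
        State state = new State();
        int next = 0;
        while (next != STOP) {
            next = program.get(next).execute(state, next);
        }
        return state.acc;
    }

    // Example program: add 5 until the accumulator reaches 15, then stop.
    static List<Expression> countToFifteen() {
        return List.of(
            (s, i) -> { s.acc += 5; return i + 1; }, // modifies state
            (s, i) -> s.acc < 15 ? 0 : i + 1,        // reads state, determines next expression
            (s, i) -> STOP                           // Estop
        );
    }
}
```

Calling `TinyMachine.run(TinyMachine.countToFifteen())` steps through the three expressions in a loop and returns the final accumulator value.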
Program execution § Execution (“elaboration”) of expressions E1, E2, … by some machine M § M realized by hardware – physical processor § M defined by software – “virtual machine” § Other possibilities § Expressions E1, E2, … also referred to as “statements” or “operations” § Elaboration sometimes referred to as interpretation § The word interpretation sometimes hints at “direct execution” 15
Issues § Languages: Choices for L1 and L2 16
Languages Please talk to your neighbor and find at least three languages that could serve as either source language L1 or target language L2 for a compiler. Think about compilers you used (or would have liked to use). 17
Languages L1 and L2 L1 L2 C Machine instruction ASM ASM LLVM C LLVM Java Java Byte Code C# Scala JavaScript Python JavaScript 23
(More) languages L1 and L2 PHP HTML PDF DVI LaTeX TeX VHDL SQL Lisp Haskell Prolog 24
Issues (continued) § Languages: Choices for L1 and L2 § Program written in L1 (PL1) translated into program written in L2 (PL2) § PL1 → PL2 § Aspects of translation of programs PL1 → PL2 § What does it mean that PL2 is a “translation” of PL1? § PL2 should produce the “same” result as PL1 25
Semantics § Describes the “meaning” of programs § Meaning of program defined by meaning of statements or operations § Formal specification 1. Operational semantics § Abstract machine A § Sequences of steps interpreted (“elaboration”) § Effect on A determines meaning 2. Denotational semantics § Mathematical construct describes effect § Can be manipulated (composition, projection, …) 3. Axiomatic semantics § Assertions on program state and rules that describe the effect of operations § Other ways: natural language, reference implementation 26
Semantics § Translated (target) program PL2 has the same meaning as the (source) program PL1 § At least: computes the same result(s) for all legal inputs § Same: must be defined... § What about illegal inputs? § What about non-functional properties? 27
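As a small illustration of "computes the same result(s) for all legal inputs", the sketch below compares a stand-in for PL1 with a stand-in for PL2 over a range of inputs. Both functions and the class name `SameResultCheck` are hypothetical examples, not taken from the lecture, and checking a range of inputs gives evidence rather than a proof.

```java
import java.util.function.IntUnaryOperator;

final class SameResultCheck {
    // Stand-in for the meaning of the source program PL1 (hypothetical example).
    static final IntUnaryOperator SOURCE = x -> x * 2;
    // Stand-in for the translated program PL2: the multiplication became a shift.
    static final IntUnaryOperator TARGET = x -> x << 1;

    // Returns true if both versions agree on every input in [lo, hi].
    // Assumes hi < Integer.MAX_VALUE so that the loop counter cannot overflow.
    static boolean agreeOn(int lo, int hi) {
        for (int x = lo; x <= hi; x++) {
            if (SOURCE.applyAsInt(x) != TARGET.applyAsInt(x)) {
                return false;
            }
        }
        return true;
    }
}
```

For these two functions the results also agree when the multiplication overflows, because Java defines `int` arithmetic as wrapping; in a language where overflow is undefined, the question "what about illegal inputs?" would have a different answer.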
Reasons for translation § A compiler translates a program written in language L1 into language L2. § Reasons to translate PL1 → PL2 § Faster execution of PL2 § No real machine to run PL1 § No abstract machine (virtual machine) to run PL1 § PL2 can be realized (in hardware) § (L1 == L2) PL2 is more readable/optimized/stable § Special case: L1 = ASM, binary rewriting tool adds bounds checks § PL1 cannot be edited (by humans) § Compiler from Java byte code to Java § PL2 requires less energy 31
Complications § L1 and L2 have different resource models § L1: no limit on resources, flexible description § L2: finite resources, inflexible description, hardware-based 32
Complications § L1: no limit on resources § ∞ number of variables § ∞ lines of code § ∞ number of methods § ∞ data space § ∞ nesting § ∞ characters in var name § … § L2: finite resources § Fixed number of registers § Limited storage § Finite representation § Machine properties matter § Caches § TLBs § NUMA 33
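One consequence of this mismatch is that an unbounded set of variables has to be mapped onto a fixed set of registers. The sketch below shows the simplest possible policy, assuming a hypothetical machine with registers named `r0`, `r1`, … and an overflow area called `stack`; real compilers use far better allocation strategies.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LocationAssigner {
    // Gives each distinct variable a register while registers last, then a stack slot.
    // Register and slot names are made up for illustration.
    static Map<String, String> assign(List<String> variables, int registerCount) {
        Map<String, String> location = new LinkedHashMap<>();
        int nextSlot = 0;
        for (String v : variables) {
            if (location.containsKey(v)) {
                continue; // already placed
            }
            if (location.size() < registerCount) {
                location.put(v, "r" + location.size());
            } else {
                location.put(v, "stack[" + nextSlot++ + "]");
            }
        }
        return location;
    }
}
```

With two registers, `assign(List.of("a", "b", "c", "d"), 2)` places `a` and `b` in registers and the remaining variables in stack slots.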
Compiler task: Translate PL1 → PL2 § Management of resources § Preservation of semantics § Is meaning defined? § For all possible inputs? § Check constraints on PL1 § Bailout: Not every program can be translated § Not every aspect can be checked by compiler § Escape: compiler inserts code into PL2 to check properties of program during execution (“at runtime”) 34
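The "escape" can be made concrete with an array access. The sketch below models the target machine as a flat memory, which is an assumption made for illustration, as are the layout constants and the class name `RuntimeCheckSketch`. It contrasts a translation of `a[i]` without a check against one where the check has been inserted.

```java
final class RuntimeCheckSketch {
    // Flat memory of a hypothetical target machine.
    // An array a of length 3 is assumed to start at address 4.
    static final int[] MEMORY = {0, 0, 0, 0, 10, 20, 30, 99};
    static final int BASE = 4;
    static final int LENGTH = 3;

    // Translation of "a[i]" without a check: an out-of-range i reads unrelated memory.
    static int loadUnchecked(int i) {
        return MEMORY[BASE + i];
    }

    // Translation of "a[i]" with a compiler-inserted check executed at runtime.
    static int loadChecked(int i) {
        if (i < 0 || i >= LENGTH) {
            throw new IndexOutOfBoundsException("index " + i + " out of range for length " + LENGTH);
        }
        return MEMORY[BASE + i];
    }
}
```

Here `loadUnchecked(3)` quietly returns the value stored behind the array, while `loadChecked(3)` stops execution. The compiler cannot know `i` in general, so the property is checked in PL2 rather than during translation.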
1.1 Simple compiler model
1.1 Simple and realistic compiler model § Simple: Can be handled in one semester, 8 credits § Two persons to work on the same project (more about teams later) § Realistic: Experience problems encountered by real compilers § Mirrors structure of many compilers 2
Compiler model § Source program → Compiler → ASM file → Assembler → Object file 4
Compiler model § Compilation prior to execution § AOT “Ahead of (Execution) Time” compilation § Commonly used for languages without language-specific execution environments (e.g., C, C++) § Available in Java as well (IBM J9, Oracle HotSpot) § Other model: Continuous compilation § JIT “Just in Time” compilation § Usually: optimization of methods that are frequently invoked (hot) § Commonly used with language virtual machines (e.g., Java VM) § E.g., HotSpot JVM has two JIT compilers (C1 and C2) 5
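The idea of optimizing only frequently invoked (hot) methods can be sketched with an invocation counter. This is a toy model, not how HotSpot or any other VM is implemented: the class `HotMethod`, the threshold value, and the two function objects standing in for interpreted and compiled code are all assumptions.

```java
import java.util.function.IntUnaryOperator;

final class HotMethod {
    static final int THRESHOLD = 3; // hypothetical; real VMs use tuned heuristics

    private final IntUnaryOperator interpreted; // stand-in for slow, interpreted execution
    private final IntUnaryOperator compiled;    // stand-in for the JIT-compiled version
    private int invocations = 0;

    HotMethod(IntUnaryOperator interpreted, IntUnaryOperator compiled) {
        this.interpreted = interpreted;
        this.compiled = compiled;
    }

    // The method counts as hot once it has been invoked THRESHOLD times.
    boolean isCompiled() {
        return invocations >= THRESHOLD;
    }

    // Runs the interpreted version until the method is hot, then the compiled one.
    int invoke(int arg) {
        boolean useCompiled = isCompiled();
        invocations++;
        return (useCompiled ? compiled : interpreted).applyAsInt(arg);
    }
}
```

Both versions must compute the same result; only the cost of execution is supposed to differ.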
Compiler model § Source program → Compiler → ASM file → Assembler → Object file § Inside the compiler: “Front-end” (read input, transform) → IR (intermediate representation) → “Back-end” (manage machine resources, generate code) 7
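The division into front-end, IR, and back-end can be sketched for a deliberately tiny source language, here sums such as `x + 2 + y`. Everything in this example is an assumption made for illustration: the class `MiniCompiler`, the `Sum` IR, and the instruction names `load` and `add` of a made-up one-register machine.

```java
import java.util.ArrayList;
import java.util.List;

final class MiniCompiler {
    // IR: a sum of operands, each either an integer constant or a variable name.
    static final class Sum {
        final List<String> operands;
        Sum(List<String> operands) { this.operands = operands; }
    }

    // Front-end: read input and transform it into IR.
    // Accepts only "operand (+ operand)*"; anything else is rejected (bailout).
    static Sum frontEnd(String source) {
        List<String> operands = new ArrayList<>();
        for (String part : source.split("\\+", -1)) {
            String operand = part.trim();
            if (!operand.matches("[0-9]+|[A-Za-z_][A-Za-z0-9_]*")) {
                throw new IllegalArgumentException("cannot translate: '" + operand + "'");
            }
            operands.add(operand);
        }
        return new Sum(operands);
    }

    // Back-end: generate code for a hypothetical machine with a single register r0.
    static List<String> backEnd(Sum ir) {
        List<String> asm = new ArrayList<>();
        asm.add("load r0, " + ir.operands.get(0));
        for (String operand : ir.operands.subList(1, ir.operands.size())) {
            asm.add("add r0, " + operand);
        }
        return asm;
    }

    static List<String> compile(String source) {
        return backEnd(frontEnd(source));
    }
}
```

Because the back-end only sees `Sum`, a different front-end could produce the same IR from another source notation without any change to code generation.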
Compiler model § Source program → one of several “Front-ends” → IR → “Back-end” → ASM file → Assembler → Object file 9
IR – Intermediate representation § Compiler-internal representation § E.g., compiler must distinguish between names in different scopes § E.g., many programs work with variables, computers work with locations § Must express all language constructs/concepts § Code generator maps IR to assembly code § Machine code another option § No “best” IR – all are compromises 11
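The two examples above (names in different scopes, variables versus locations) can be illustrated with a small scoped symbol table. This is a minimal sketch, assuming a hypothetical class `ScopedSymbols` and plain integers as locations; it is not the data structure prescribed for the course project.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class ScopedSymbols {
    // Innermost scope is at the head of the deque.
    private final Deque<Map<String, Integer>> scopes = new ArrayDeque<>();
    private int nextLocation = 0;

    void enterScope() { scopes.push(new HashMap<>()); }

    void exitScope() { scopes.pop(); }

    // Declares a name in the innermost scope and gives it a fresh location.
    // Assumes enterScope() has been called at least once.
    int declare(String name) {
        int location = nextLocation++;
        scopes.peek().put(name, location);
        return location;
    }

    // Resolves a name to the location of its innermost visible declaration.
    int resolve(String name) {
        for (Map<String, Integer> scope : scopes) { // iterates innermost scope first
            Integer location = scope.get(name);
            if (location != null) {
                return location;
            }
        }
        throw new IllegalArgumentException("undeclared name: " + name);
    }
}
```

Two declarations of `x` in nested scopes receive different locations, so later compiler phases can work with locations and no longer need to reason about which `x` is meant.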
Compiler model § Source program → “Front-end” → IR → Optimizer → “Back-end” → ASM file → Assembler → Native code 13