
Design and Implementation of a TriCore Backend for the LLVM Compiler Framework

Studienarbeit in Computer Science

submitted by Christoph Erhardt, born 14 November 1984 in Kronach

Prepared at the Department of Computer Science, Lehrstuhl für Informatik 4 – Verteilte Systeme und Betriebssysteme (Chair of Distributed Systems and Operating Systems), Friedrich-Alexander-Universität Erlangen-Nürnberg

Advisors: Prof. Dr.-Ing. habil. Wolfgang Schröder-Preikschat, Dipl.-Inf. Fabian Scheler

Start of work: 1 December 2008
End of work: 1 September 2009

I hereby affirm that I have written this work independently and without the use of sources other than those specified, and that the work has not previously been submitted in the same or a similar form to any other examination authority and accepted by it as part of an examination. All passages taken verbatim or in substance from other works are marked as such by citation of the source.

Erlangen, 1 September 2009

Abstract

This thesis describes the development and implementation of a new backend for the LLVM compiler framework that allows the generation of machine code for the TriCore processor architecture.

The TriCore architecture is an advanced platform for embedded systems which unites the features of a microcontroller, a RISC CPU, and a digital signal processor. It is used primarily in the automotive sector and other real-time systems, and is employed by the Chair of Distributed Systems and Operating Systems at the University of Erlangen-Nuremberg for various research projects.

LLVM is a modern compiler infrastructure designed with a particular focus on modularity and extensibility. For this reason, LLVM is finding increasingly widespread use in a wide variety of projects – from code analysis tools and just-in-time code generators to full-blown general-purpose system compilers.

The thesis first gives a technical overview of LLVM and the TriCore architecture. Subsequently, the design and implementation of the compiler backend are discussed in detail, and a comparison to the existing GCC port is made.

Contents

1 Introduction
  1.1 Existing Compilers for the TriCore Platform
  1.2 Organization of this Thesis

2 The LLVM Compiler Infrastructure
  2.1 Application Examples
  2.2 Comparison to Traditional Compilers
  2.3 Basic Architecture
  2.4 Internal Representation
  2.5 Frontends
    2.5.1 LLVM-GCC
    2.5.2 Clang
  2.6 Code Generation
  2.7 Summary

3 The TriCore Processor Architecture
  3.1 Fields of Application
  3.2 Architecture Overview
  3.3 Registers and Data Types
  3.4 Addressing Modes
  3.5 Instruction Set
  3.6 Tasks and Contexts
  3.7 Summary

4 Design and Implementation of the Backend
  4.1 TableGen
  4.2 Code Generation Process
    4.2.1 Instruction Selection
    4.2.2 Scheduling and Formation
    4.2.3 SSA-based Machine Code Optimizations
    4.2.4 Register Allocation
    4.2.5 Prologue/Epilogue Code Insertion
    4.2.6 Late Machine Code Optimizations
    4.2.7 Code Emission
  4.3 General Target Information
    4.3.1 Target Machine Characteristics
    4.3.2 Subtarget Information
    4.3.3 Target Registration
  4.4 Register Information
    4.4.1 Register Description Table
    4.4.2 Non-static Register Information
  4.5 DAG Lowering
    4.5.1 Calling Conventions
    4.5.2 Custom Lowering of Instructions
  4.6 Instruction Set Specification
    4.6.1 Instruction Formats
    4.6.2 Instruction Description Table
    4.6.3 Non-static Instruction Information
  4.7 Instruction Selector
  4.8 Virtual Instruction Resolution Pass
  4.9 Load/Store Peephole Optimizer
  4.10 Assembly Printer
  4.11 Integration into LLVM
    4.11.1 Build System
    4.11.2 Target Triple Registration
    4.11.3 Integration with the Clang Frontend
  4.12 Summary

5 Evaluation
  5.1 General Evaluation
  5.2 Comparison to GCC
    5.2.1 The CoreMark Benchmark
    5.2.2 Compilation Speed
    5.2.3 Code Size
    5.2.4 Code Performance
  5.3 Summary

6 Conclusion
  6.1 Outlook

1 Introduction

Today's world of computing is undergoing a constant, yet silent revolution. Both software and hardware systems – from embedded microcontrollers through to massively parallel high-performance computers – are experiencing an ever-increasing degree of sophistication and complexity. Although it never actually comes to the fore, compiler technology plays a central role in this revolution as the junction between software and hardware. Traditionally, a compiler has to fulfil three main requirements:

1. It has to generate efficient code for the target platform.

2. It should consume only a reasonable amount of time and memory (see Figure 1.1).

3. It must be reliable.

Over the last few years, a fourth requirement has been gaining more and more importance: a compiler must be modular, maintainable, and easily extensible.

The GNU Compiler Collection, the de-facto standard system compiler for many Unix and Unix-like operating systems, supports a tremendous number of frontends (for source languages) and backends (for target CPU architectures). However, due to its rather monolithic design, it is inherently difficult to extend and retarget [3]. LLVM offers a novel approach with an uncompromisingly modular architecture, which makes it less of a traditional compiler and more of a generic and versatile compiler infrastructure. As a consequence, LLVM is particularly interesting for research projects such as the Real-Time Systems Compiler (RTSC) developed at the University of Erlangen-Nuremberg, an operating-system-aware compiler for real-time applications [12].

The goal of this thesis was to implement and describe a new backend that generates assembly code for the TriCore processor architecture and can be employed by both the RTSC and the “vanilla” LLVM compilers. TriCore chips are primarily used in embedded real-time systems, for example in the automotive industry.

1.1 Existing Compilers for the TriCore Platform

The Infineon website lists two major compilers that support code generation for the TriCore architecture [19]: the TASKING VX-toolset for TriCore¹ and the GNU TriCore Development Suite². While the former is proprietary closed-source software and thus not of particular interest for this thesis, the latter is based on the 3.3/3.4 branch of GCC.

¹ http://tasking.com/products/32
² opment-platform.html

Figure 1.1: The reason why a compiler should not only generate efficient code, but also work efficiently itself [9].

Although the mainline GCC has been available in version 4 since 2005, the TriCore variant still sticks to the “old” code base, presumably because the porting effort would by far outweigh the actual benefits [2]. The LLVM backend was compared to this implementation.

1.2 Organization of this Thesis

Chapters 2 and 3 of this thesis give a concise technical overview of LLVM and the TriCore architecture, highlighting the properties, features, and peculiarities relevant in the context of this work. The design and implementation of the compiler backend are discussed in extensive detail in Chapter 4. Chapter 5 evaluates the implemented software and compares it to the existing TriCore port of GCC; Chapter 6 concludes the thesis.

2 The LLVM Compiler Infrastructure

The Low-Level Virtual Machine (LLVM) is a comprehensive compiler infrastructure designed from scratch to enable aggressive multi-stage optimization. It was started in 2000 by Chris Lattner at the University of Illinois at Urbana-Champaign. LLVM has increasingly gained prominence over the past few years, especially since 2005, when Apple Inc. hired Lattner and became the project's main sponsor. LLVM's code, written in C++, is released under a BSD-style licence, which makes it open-source software [10].

The name Low-Level Virtual Machine refers to the compiler's internal representation (IR) language, a RISC-like virtual instruction set. Despite its low-level nature, the representation carries extensive high-level type information, which enables the compiler to perform advanced interprocedural optimizations. Translation of the intermediate code into assembly code takes place either statically at build time or dynamically via just-in-time compilation at run time.

2.1 Application Examples

LLVM is beginning to find widespread use in both academic research and open- and closed-source projects. From version 10.5 on, Mac OS X employs LLVM's JIT capabilities in its OpenGL graphics stack, improving vertex shader performance. The Gallium3D project, a graphics driver infrastructure for Linux and other operating systems, is currently working on incorporating LLVM for the same purpose. LLVM also plays a role in a number of implementations of the OpenCL framework (e. g., by NVIDIA).

Another interesting example is the IcedTea project, an extension to Sun's OpenJDK. HotSpot, the default Java Virtual Machine, only supports x86, x86-64 and SPARC and is very hard to port to other platforms. IcedTea is working on a custom LLVM-based just-in-time compiler called Shark. The goal is to make it possible to run JIT-compiled Java programs on all platforms for which LLVM has a JIT backend. The Mono project, a Free Software implementation of the .NET Framework, is also experimenting with LLVM code generation at present.

Apart from its dynamic code generation capabilities for special-purpose applications, LLVM is also increasingly finding use as a regular compiler for “traditional” programming languages such as C or C++. This is aided by the fact that LLVM was designed from the beginning to be used as a drop-in replacement for existing system compilers. In January 2009, the FreeBSD team announced that they were considering taking this step and working towards replacing the default GCC compiler [11].

2.2 Comparison to Traditional Compilers

LLVM differs from “traditional” systems such as the GNU Compiler Collection in a number of aspects because it follows a fundamentally different design approach. Whereas GCC is primarily conceived as a part of the GNU toolchain, LLVM is not merely a compiler, but a modular and reusable compiler infrastructure. This is strongly reflected in LLVM's architecture, which has a library-based design and defines cleanly separated components. Consequently, on the one hand LLVM is easily extensible, and on the other hand it allows third-party applications to incorporate specific parts of it¹, for instance the JIT components (see the application examples above).

The need for advanced inter-procedural optimizations often poses difficult problems for existing compiler architectures: if the generated object files contain assembled machine code, the linker has only very limited information and is thus hardly able to perform non-trivial optimizations. On the other hand, if the object files contain high-level intermediate code, most optimization passes – even the intra-procedural ones – have to be deferred until link time. Rebuilding a program then becomes extremely expensive, as the majority of the work has to be performed by the linker.

LLVM's “multi-stage” approach strikes a balance by translating each compilation unit into an IR language that is low-level enough to allow optimizations at compile time while still providing sufficient high-level type information to the linker. At link time, all input files are combined into a single unit, on which inter-procedural optimizations can be performed before the actual linking process takes place.

In general, LLVM puts a strong focus on efficient compilation. In spite of performing aggressive optimizations in order to yield highly efficient machine code, LLVM has proven to be many times faster than GCC while consuming significantly less memory.

2.3 Basic Architecture

Being a modular and retargetable compiler infrastructure, LLVM features a classic three-tier architecture as shown in Figure 2.1, consisting of a number of language-dependent frontends, one intermediate code optimizer, and a number of target-specific code generators.

The frontend receives a plain-text source code file as input and performs the typical three steps:

1. Lexical analysis: The input is broken into tokens. If required by the source language, these tokens are additionally preprocessed.

2. Syntactic analysis: The token sequence is parsed in order to identify the structure of the program, building a syntax tree.

3. Semantic analysis: The syntax tree is filled with semantic information (e. g., type information) and the symbol table is built.

¹ This is also reflected in the deliberate choice of a BSD-style licence. Generally, the developers' focus on pragmatism rather than politics is often cited as one major advantage (among others) over GCC.

Figure 2.1: Basic LLVM architecture. The optimizer with its analysis and transformation passes is the core of the three-tier composition.

The resulting abstract syntax tree (AST) is still language-dependent and frontend-specific and has to be translated into the compiler's generic internal representation language.

Once the program has been converted into LLVM's intermediate code, this code is passed to the core of the compiler system, which is completely agnostic of the language of the original source code and the corresponding frontend. After a number of analyses (control flow, data flow, etc.) have been performed, the collected information can be used by various optimization passes to apply transformations to the code.

Last of all, the backend corresponding to the selected target platform translates the IR code into the target-specific assembly language. This includes the mapping of virtual instructions to machine instructions and the allocation of variables to registers or to memory.

2.4 Internal Representation

LLVM's intermediate language consists of an instruction set and type system that are both completely independent of the language the source code is written in. This makes it possible to use the IR as a universal infrastructure for almost any compilation and optimization purpose – from low-level shaders to high-level C programs.

The internal type system supports the usual primitive and composite types present in low-level programming languages like C, plus some integer and floating-point vector types. For an overview, see Table 2.1.

Data type                  Width (in bits)
Integer                    1, 8, 16, 32, 64, 128
IEEE-754 floating-point    32, 64, 80, 128
Boolean                    1
Pointer                    N/A
Integer vector             N/A
Floating-point vector      N/A
Array                      N/A
Structure                  N/A
Function                   N/A

Table 2.1: Data types of LLVM's internal representation. Supported are mappings of all primitive and composite types of low-level programming languages like C, as well as SIMD (single instruction, multiple data) vector types.

High-level types that are not supported directly must be mapped by the frontend. For instance, a C++ class with virtual methods can be represented as a struct with a number of function pointers.

It must be noted that – depending on the source language – LLVM programs are not guaranteed to be completely independent of the target machine, since the language specification may be interpreted differently on different platforms. For example, the C type long may be mapped to the LLVM type i32 or i64 by the frontend, according to whether the code is being compiled for a 32-bit or a 64-bit target platform. Consequently, when building an application written in such a language for a different hardware architecture, it is advisable to rebuild it from source.
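To make the struct mapping mentioned above concrete, the following sketch shows such a class alongside a C-style struct modelling the layout a frontend conceptually produces. The code is purely illustrative – the class Shape and all other identifiers are invented for this example and are not taken from any particular frontend:

// A C++ class with a virtual method ...
class Shape {
public:
    virtual int area() const { return x * x; }
    int x;
};

// ... can be modelled as a plain struct whose first member points to a
// table of function pointers (the "vtable"):
struct ShapeRepr;

struct ShapeVTable {
    int (*area)(const ShapeRepr *self); // one slot per virtual method
};

struct ShapeRepr {
    const ShapeVTable *vtbl; // hidden vtable pointer
    int x;                   // regular data member
};

// A virtual call then becomes an indirect call through the table:
int call_area(const ShapeRepr *s) {
    return s->vtbl->area(s);
}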

The instruction set is orthogonal and specifies a virtual RISC-like register machine with an unlimited number of registers. Instructions are mostly in three-address form. All LLVM programs are in static single assignment (SSA) form, i. e., for every use of a virtual register there is exactly one reaching definition, and that definition strictly dominates the use. This makes data-flow analysis significantly easier. The SSA-based optimizer supports three classes of optimizations: standard scalar optimizations (e. g., dead code elimination, constant propagation), aggressive loop transformations (e. g., unrolling), and advanced inter-procedural optimizations (e. g., inlining).

Figure 2.2 shows an LLVM code example for a function that computes the factorial of a given integer. Note the obvious segmentation into basic blocks, the use of explicit br instructions instead of implicit fall-through edges, and the SSA phi instruction where two reaching definitions from different paths meet.

“LLVM's virtual instruction set is a first class language which has a textual, binary, and in-memory representation” [7], so LLVM programs can be stored on disk either in a human-readable assembly-like text form (.ll) or as compressed binary code (.bc) files. Due to the project's extremely modular architecture, this makes it possible to use the middle-end as a standalone component.

define i32 @factorial(i32 %n) nounwind readnone {
entry:
  %cmp = icmp eq i32 %n, 0
  br i1 %cmp, label %return, label %if.else

if.else:
  %sub = add i32 %n, -1
  %cmp.i = icmp eq i32 %sub, 0
  br i1 %cmp.i, label %factorial.exit, label %if.else.i

if.else.i:
  %sub.i = add i32 %n, -2
  %call.i = tail call i32 @factorial(i32 %sub.i) nounwind
  %mul.i = mul i32 %call.i, %sub
  br label %factorial.exit

factorial.exit:
  %call3 = phi i32 [ %mul.i, %if.else.i ], [ 1, %if.else ]
  %mul = mul i32 %call3, %n
  ret i32 %mul

return:
  ret i32 1
}

Figure 2.2: LLVM code example. The code of the function factorial is in SSA form and computes the factorial of a given integer n.
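For reference, the IR in Figure 2.2 corresponds to a plain recursive factorial function. The thesis does not reproduce the original source code; a plausible reconstruction is sketched below. Note that the optimizer has evidently inlined one level of the recursion, which is why the IR contains the additional block %if.else.i with a call to factorial(n - 2):

// Plausible source of Figure 2.2 (a reconstruction, not taken from the thesis).
int factorial(int n) {
    if (n == 0)
        return 1;
    return n * factorial(n - 1);
}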

2.5 Frontends

LLVM per se does not include any frontend for a source language; all of its tools require .ll or .bc files as input. However, there are two official frontends developed by the LLVM team that are available separately: LLVM-GCC and Clang. In addition, many third-party projects that employ LLVM technology have created frontends of their own. Examples include Java bytecode, Python, Haskell, and many more.

2.5.1 LLVM-GCC

As developing a complete frontend for a programming language is a cumbersome and non-trivial task, the LLVM developers decided to draw on GCC's set of existing frontends. LLVM-GCC is based on Apple's GCC 4.2 tree and implants LLVM into the GCC code base. The glue code translates IR programs from GCC's GIMPLE language into LLVM's language. This approach yields two big advantages:

1. At one stroke, LLVM suddenly supported the entire set of programming languages offered by GCC – including Ada, C, C++, Fortran, Java, Objective-C, and many more.

2. LLVM-GCC can be used as a drop-in replacement for GCC without any adaptations to the source code or build systems of existing applications².

The main disadvantage is the fact that the GCC frontend is rather slow, memory-hungry, and carries a lot of legacy code.

2.5.2 Clang

Clang is a relatively new effort to create a modern frontend for C, C++ and Objective-C from scratch. It aims to be fully language-conformant while providing compatibility with GCC's extensions. Clang has a range of benefits over LLVM-GCC:

- Diagnostics (errors/warnings) are much more descriptive and clear. Clang keeps track of all column numbers and, for each diagnostic, outputs the respective line of code with a marker placed below the responsible expression. Moreover, the original information of typedefs is retained, eliminating the problem of literally “drowning” in STL-related error messages.

- Clang adds a huge performance boost to the compilation process. Measurements have shown the frontend to be more than twice as fast as LLVM-GCC and to consume significantly less memory.

- Just like the rest of LLVM, Clang is designed to be modular, extensible and easily reusable (e. g., by source analysis tools or IDEs).

While the C and Objective-C parts are already in a reasonably mature state³, the C++ part is currently still a work in progress.

2.6 Code Generation

One of the points of criticism levelled at GCC is that retargeting it is rather unpleasant and more or less requires reinventing the wheel. This is where LLVM's modular design pays off once again: it provides a generic code generation framework that includes a set of reusable target-independent algorithms for register allocation, instruction scheduling, etc. This brings a considerable simplification and unification of the retargeting process. Chapter 4 discusses this process and the code generation framework in detail.

The LLVM 2.5 release bundles code generators for the Alpha, ARM, Cell SPU, Itanium, MIPS, PIC16, PowerPC, SPARC, x86/x86-64, and XCore hardware architectures.

² This is true provided that the application in question does not make use of any builtins, pragmas, or inline assembly not (yet) implemented in LLVM.
³ As of July 2009, Clang is able to compile the complete FreeBSD kernel and most of the userland. The Linux kernel, however, still has a few build issues.

The Alpha, PowerPC and x86/x86-64 targets not only support static compilation, but also JIT code generation. LLVM additionally includes special backends for the output of C/C++ source code and .NET CIL code.

2.7 Summary

LLVM is a modern compiler infrastructure with an exceptionally modular design. It puts a main focus on the generation of highly efficient code in a highly efficient way. As it tackles many of the shortcomings of traditional compilers and is fit for a broad range of use cases, it is foreseeable that LLVM is going to rise in prominence and popularity over the next few years.

3 The TriCore Processor Architecture

TriCore is a microprocessor architecture developed by Siemens Semiconductors¹ and officially announced in 1997. Contrary to what the name might imply, TriCore is not a multi-core architecture. Instead, it aims at unifying the features of three worlds:

- a real-time microcontroller unit allowing fast context switches and low latencies,
- a powerful digital signal processor (DSP), and
- a superscalar RISC processor.

The TriCore is available as a so-called intellectual property core, a reusable building block that can be licenced for use as a component of a system-on-a-chip (SoC) solution. Such a solution includes the microprocessor core along with RAM/ROM memory as well as additional custom application-specific logic (ASIC), all integrated on a single chip [13].

The following chapter gives a brief overview of the basic architecture as well as its peculiarities.

3.1 Fields of Application

TriCore is designed for a broad range of embedded-system applications – primarily automotive control systems. BMW and several suppliers of automotive drivetrain components use chips from Infineon's AUDO family for engine and transmission control, allowing vehicle powertrains to fulfil considerable performance requirements while satisfying strict emission standards [17].

Another concrete example is a telematics and telecommunications platform developed by Infineon and Volkswagen in 2004. This unit is built around a TriCore microprocessor and offers wireless telephony and location-based information services (e. g., navigation) to the driver [15].

Apart from the various applications in the automotive industry, TriCore chips are also finding use in several other embedded real-time systems. They are especially advantageous for systems that require both controlling functionality and complex digital signal processing, eliminating the need for two separate chips with distinct functionalities.

3.2 Architecture Overview

TriCore is a 32-bit superscalar RISC architecture. It operates on 32-bit words, providing 4 GiB of address space.

¹ Since 1999 known as Infineon Technologies.

Memory words larger than one byte are stored in little-endian byte order. Instructions have a (RISC-typical) fixed length of 32 bits – with the exception of some special variants that take only 16 bits for encoding. This is because high code density was considered a major design goal.

The TriCore architecture exists in two versions: the original TriCore 1, and TriCore 2, an overhauled release introduced in 2001 which is “essentially backwards compatible” [20] and brings a set of features and improvements mainly for operating-system software. The following chapter as well as the implementation of the compiler backend restrict themselves to the TriCore 1 architecture.

3.3 Registers and Data Types

A TriCore microprocessor is equipped with 32 general-purpose registers, each of them 32 bits wide – divided into 16 regular data registers D0–D15 (for integer/floating-point operands) and 16 address registers A0–A15 (for pointer operands). In comparison to common RISC architectures such as PowerPC or SPARC, this is an unusual design in two ways:

1. Normally the floating-point unit has its own register file, whereas TriCore's FPU accesses the ALU's data register file.

2. Although pointers are most commonly treated just like regular integers, TriCore makes a strict distinction between them, processing them in different register files with different instructions. As a consequence, encoding a register operand only requires four bits instead of five. Most short instructions need two registers, so two additional bits can be used for the opcode, allowing four times as many different 16-bit instructions and ultimately providing a substantially higher code density [5].

All general-purpose registers can be freely used, with the exception of six address registers: A10 and A11 represent the stack pointer and the return address, respectively; A0, A1, A8 and A9 are reserved for global addresses by convention. D15 and A15 are used as implicit (i. e., not explicitly encoded) operands by some 16-bit instructions.

Two successive 32-bit GPRs, starting with an even-numbered register, can be combined to form an extended 64-bit register (Ex for data registers, Px for address registers). For example, E0 comprises D0 and D1; P12 comprises A12 and A13. While all regular arithmetic instructions such as ADD, SUB etc. only process 32-bit registers, some special instructions exist that make use of extended registers.

In addition to the general-purpose registers, there are three “Core Special Function Registers” that are implicitly read or modified by certain instructions: PCXI (previous context information), PSW (processor status word), and PC (program counter). For a complete overview of TriCore's architectural registers, refer to Table 3.1.

The ISA (instruction set architecture) supports a number of data types for instruction operands, as shown in Table 3.2. All basic arithmetic instructions operate on 32-bit words. Some variants of
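The even/odd register pairing described above maps naturally onto 64-bit values. The following sketch models an extended register pair in illustrative C++ – it is not code from the thesis or the backend, and it assumes (in line with the little-endian byte order) that the even-numbered register of a pair holds the least-significant word:

#include <cstdint>

// Illustrative model of an extended register pair, e.g. E0 = D1:D0.
struct ExtendedReg {
    uint32_t lo; // even-numbered register, e.g. D0 (least-significant word)
    uint32_t hi; // odd-numbered register, e.g. D1 (most-significant word)
};

// Reassemble the 64-bit value represented by the pair.
uint64_t value(const ExtendedReg &e) {
    return (static_cast<uint64_t>(e.hi) << 32) | e.lo;
}

// Split a 64-bit value across the pair, as a code generator would when
// placing a 64-bit operand into an extended register.
ExtendedReg split(uint64_t v) {
    return { static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32) };
}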
