Programming Languages Overview & Syntax

Transcription

Programming LanguagesSession 1 – Main ThemeProgramming Languages Overview & SyntaxDr. Jean-Claude FranchittiNew York UniversityComputer Science DepartmentCourant Institute of Mathematical SciencesAdapted from course textbook resourcesProgramming Language Pragmatics (3rd Edition)Michael L. Scott, Copyright 2009 Elsevier1

Agenda1Instructor and Course Introduction2Introduction to Programming Languages3Programming Language Syntax4Conclusion2

Who am I?- Profile 31 years of experience in the Information Technology Industry, including thirteen years of experienceworking for leading IT consulting firms such as Computer Sciences CorporationPhD in Computer Science from University of Colorado at BoulderPast CEO and CTOHeld senior management and technical leadership roles in many large IT Strategy and Modernizationprojects for fortune 500 corporations in the insurance, banking, investment banking, pharmaceutical, retail,and information management industriesContributed to several high-profile ARPA and NSF research projects Played an active role as a member of the OMG, ODMG, and X3H2 standards committees and as aProfessor of Computer Science at Columbia initially and New York University since 1997 Proven record of delivering business solutions on time and on budgetOriginal designer and developer of jcrew.com and the suite of products now known as IBM InfoSphereDataStageCreator of the Enterprise Architecture Management Framework (EAMF) and main contributor to the creationof various maturity assessment methodologyDeveloped partnerships between several companies and New York University to incubate newmethodologies (e.g., EA maturity assessment methodology developed in Fall 2008), develop proof ofconcept software, recruit skilled graduates, and increase the companies’ visibility 3

How to reach me?Come on what elsedid you expect?Cell(212) 203-5004Emailjcf@cs.nyu.eduAIM, Y! IM, ICQ jcf2 2003Woo hoo find the wordof the day MSN IMjcf2 kypejcf2 2003@yahoo.com4

What is the course about? Course description and syllabus:» http://www.nyu.edu/classes/jcf/CSCI-GA.2110-001 su14» dex.html Textbook:» Programming Language Pragmatics (3rd Edition)Michael L. ScottMorgan KaufmannISBN-10: 0-12374-514-4, ISBN-13: 978-0-12374-514-4, (04/06/09)5

Course goals Intellectual:» help you understand benefit/pitfalls of differentapproaches to language design, and how theywork Practical:» you may need to design languages in yourcareer (at least small ones)» understanding how to use a programmingparadigm can improve your programming evenin languages that don’t support it» knowing how a feature is implemented helpsunderstand time/space complexity6

Icons / MetaphorsInformationCommon RealizationKnowledge/Competency PatternGovernanceAlignmentSolution Approach77

Agenda1Instructor and Course Introduction2Introduction to Programming Languages3Programming Language Syntax4Conclusion8

Introduction to Programming Languages - Sub-Topics IntroductionProgramming Language Design and Usage Main ThemesProgramming Language as a Tool for ThoughtIdiomsWhy Study Programming LanguagesClassifying Programming LanguagesImperative LanguagesPL GenealogyPredictable Performance vs. WriteabilityCommon IdeasDevelopment Environment & Language LibrariesCompilation vs. InterpretationProgramming Environment ToolsAn Overview of CompilationAbstract Syntax TreeScannerless Parsing9

Introduction (1/3) Why are there so many programminglanguages?» evolution -- we've learned better ways ofdoing things over time» socio-economic factors: proprietary interests,commercial advantage» orientation toward special purposes» orientation toward special hardware» diverse ideas about what is pleasant to use10

Introduction (2/3) What makes a language successful?» easy to learn (BASIC, Pascal, LOGO, Scheme)» easy to express things, easy use once fluent,"powerful” (C, Common Lisp, APL, Algol-68, Perl)» easy to implement (BASIC, Forth)» possible to compile to very good (fast/small) code(Fortran)» backing of a powerful sponsor (COBOL, PL/1, Ada,Visual Basic)» wide dissemination at minimal cost (Pascal, Turing,Java)11

Introduction (3/3) Why do we have programminglanguages? What is a language for?» way of thinking -- way of expressingalgorithms» languages from the user's point of view» abstraction of virtual machine -- way ofspecifying what you want» the hardware to do without getting down intothe bits» languages from the implementor's point ofview12

Programming Language Design and Usage Main Themes (1/2) Model of Computation (i.e., paradigm) Expressiveness»»»»Control structuresAbstraction mechanismsTypes and related operationsTools for programming in the large Ease of MaintainabilityCompactness – writeability/expressibilityFamiliarity of ModelLess Error-PronePortabilityHides Details – simpler modelEarly detection of errorsModularity – Reuse, Composability, IsolationPerformance TransparencyOptimizability Note Orthogonal Implementation Issues:»»Compile time: parsing, type analysis, static checkingRun time: parameter passing, garbage collection, method dispatching, remote invocation, just-intime compiling, parallelization, etc.13

Programming Language Design and Usage Main Themes (2/2) Classical Issues in Language Design:» Dijkstra, “Goto Statement Considered Harmful”, http://www.acm.org/classics/oct95/#WIRTH66» Backus, “Can Programming Be Liberated from thevon Neumann Style?” s.pdf» Hoare, “An Axiomatic Basis For ComputerProgramming”, http://www.spatial.maine.edu/ worboys/processes/hoare%20axiomatic.pdf» Hoare, “The Emperor’s Old Clothes”, pdf» Parnas, “On the Criteria to be Used in DecomposingSystems into Modules”, http://www.acm.org/classics/may96/14

Programming Language as a Tool for Thought Roles of programming language as acommunication vehicle among programmers ismore important than writeability All general-purpose languages are TuringComplete (i.e., they can all compute the samethings) Some languages, however, can make therepresentation of certain algorithmscumbersome Idioms in a language may be useful inspirationwhen using another language15

Idioms Copying a string q to p in C:» while (*p *q ) ; Removing duplicates from the list @xs in Perl:» my % seen ();@xs grep { ! seen { } ; } @xs ; Computing the sum of numbers in list xs inHaskell:» foldr ( ) 0 xsIs this natural? It is if you’re used to it!16

Why Study Programming Languages? (1/6) Help you choose a language.» C vs. Modula-3 vs. C for systemsprogramming» Fortran vs. APL vs. Ada for numericalcomputations» Ada vs. Modula-2 for embedded systems» Common Lisp vs. Scheme vs. ML forsymbolic data manipulation» Java vs. C/CORBA for networked PCprograms17

Why Study Programming Languages? (2/6) Make it easier to learn new languagessome languages are similar; easy to walkdown family tree» concepts have even more similarity; if youthink in terms of iteration, recursion,abstraction (for example), you will find iteasier to assimilate the syntax and semanticdetails of a new language than if you try topick it up in a vacuum Think of an analogy to human languages: goodgrasp of grammar makes it easier to pick up newlanguages (at least Indo-European).18

Why Study Programming Languages? (3/6) Help you make better use of whateverlanguage you use» understand obscure features: In C, help you understand unions, arrays &pointers, separate compilation, varargs, catch andthrow In Common Lisp, help you understand first-classfunctions/closures, streams, catch and throw,symbol internals19

Why Study Programming Languages? (4/6) Help you make better use of whateverlanguage you use (cont.)» understand implementation costs: choosebetween alternative ways of doing things,based on knowledge of what will be doneunderneath:– use simple arithmetic equal (use x*x instead of x**2)– use C pointers or Pascal "with" statement to factoraddress calculations» ml)– avoid call by value with large data items in Pascal– avoid the use of call by name in Algol 60– choose between computation and table lookup (e.g. forcardinality operator in C or C )20

Why Study Programming Languages? (5/6) Help you make better use of whateverlanguage you use (cont.)» figure out how to do things in languages thatdon't support them explicitly: lack of suitable control structures in Fortran use comments and programmer discipline forcontrol structures lack of recursion in Fortran, CSP, etc write a recursive algorithm then use mechanicalrecursion elimination (even for things that aren'tquite tail recursive)21

Why Study Programming Languages? (6/6) Help you make better use of whateverlanguage you use (cont.)» figure out how to do things in languages thatdon't support them explicitly:– lack of named constants and enumerations in Fortran– use variables that are initialized once, then neverchanged– lack of modules in C and Pascal use comments andprogrammer discipline– lack of iterators in just about everything fake them with(member?) functions22

Classifying Programming Languages (1/2) Group languages by programming paradigms:» imperative von Neumann(Fortran, Pascal, Basic, C, Ada)– programs have mutable storage (state) modified by assignments– the most common and familiar paradigm object-oriented(Simula 67, Smalltalk, Eiffel,Ada95, Java, C#)– data structures and their operations are bundled together– inheritance scripting languages(Perl, Python, JavaScript, PHP)» declarative functional (applicative)(Scheme, ML, pure Lisp, FP, Haskell)– functions are first-class objects / based on lambda calculus– side effects (e.g., assignments) discouraged logic, constraint-based(Prolog, VisiCalc, RPG, Mercury)– programs are sets of assertions and rules Functional Logical» Hybrids:imperative OO(Curry)(C ) functional object-oriented(O’Caml, O’Haskell) Scripting (used to glue programs together)(Unix shells, PERL, PYTHON, TCLPHP, JAVASCRIPT)23

Classifying Programming Languages (2/2) Compared to machine or assembly language, all others are high-level But within high-level languages, there are different levels as well Somewhat confusingly, these are also referred to as low-level and highlevel» Low-level languages give the programmer more control (at the cost of requiringmore effort) over how the program is translated into machine code. C, FORTRAN» High-level languages hide many implementation details, often with someperformance cost BASIC, LISP, SCHEME, ML, PROLOG,» Wide-spectrum languages try to do both: ADA, C , (JAVA)» High-level languages typically have garbage collection and are ofteninterpreted.» The higher the level, the harder it is to predict performance (bad for real-time orperformance-critical applications)» Note other “types/flavors” of languages: fourth generation (SETL, SQL),concurrent/distributed (Concurrent Pascal, Hermes), markup, special purpose (reportwriting), graphical, etc.24

Imperative Languages Imperative languages, particularly the vonNeumann languages, predominate» They will occupy the bulk of our attention We also plan to spend a lot of time onfunctional, and logic languages25

PL Genealogy FORTRAN (1957) Fortran90, HP COBOL (1956) COBOL 2000» still a large chunk of installed software Algol60 Algol68 Pascal AdaAlgol60 BCPL C C APL JSnobol IconSimula SmalltalkLisp Scheme ML Haskellwith lots of cross-pollination:e.g., Java is influenced by C , Smalltalk, Lisp, Ada, etc.26

Predictable Performance vs. Writeability Low-level languages mirror the physicalmachine:» Assembly, C, Fortran High-level languages model an abstractmachine with useful capabilities:» ML, Setl, Prolog, SQL, Haskell Wide-spectrum languages try to do both:» Ada, C , Java, C# High-level languages have garbage collection,are often interpreted, and cannot be used forreal-time programming.» The higher the level, the harder it is to determine costof operations.27

Common Ideas Modern imperative languages (e.g., Ada, C ,Java) have similar characteristics:» large number of features (grammar with severalhundred productions, 500 page reference manuals, . .)» a complex type system» procedural mechanisms» object-oriented facilities» abstraction mechanisms, with information hiding» several storage-allocation mechanisms» facilities for concurrent programming (not C )» facilities for generic programming (new in Java)28

Language Mechanism & Patterns Design Patterns: Gamma, Johnson, Helm, Vlissides» Bits of design that work to solve sub-problems» What is mechanism in one language is pattern in another Mechanism: C class Pattern: C struct with array of function pointers Exactly how early C compilers worked Why use patterns» Start from very simple language, very simple semantics» Compare mechanisms of other languages by building patternsin simpler language» Enable meaningful comparisons between languagemechanisms29

Development Environment & Language Libraries (1/2) Development Environment» Interactive Development Environments Smalltalk browser environment Microsoft IDE» Development Frameworks Swing, MFC» Language aware Editors30

Development Environment & Language Libraries (2/2) The programming environment may be largerthan the language.» The predefined libraries are indispensable to theproper use of the language, and its popularity» Libraries change much more quickly than thelanguage» Libraries usually very different for different languages» The libraries are defined in the language itself, butthey have to be internalized by a good programmer» Examples: C standard template libraryJava Swing classesAda I/O packagesC Standard Template Library (STL)31

Compilation vs. Interpretation (1/16) Compilation vs. interpretation» not opposites» not a clear-cut distinction Pure Compilation» The compiler translates the high-level sourceprogram into an equivalent target program(typically in machine language), and thengoes away:32

Compilation vs. Interpretation (2/16) Pure Interpretation» Interpreter stays around for the execution ofthe program» Interpreter is the locus of control duringexecution33

Compilation vs. Interpretation (3/16) Interpretation:» Greater flexibility» Better diagnostics (error messages) Compilation» Better performance34

Compilation vs. Interpretation (4/16) Common case is compilation or simplepre-processing, followed by interpretation Most language implementations include amixture of both compilation andinterpretation35

Compilation vs. Interpretation (5/16) Note that compilation does NOT have to producemachine language for some sort of hardware Compilation is translation from one language intoanother, with full analysis of the meaning of theinput Compilation entails semantic understanding ofwhat is being processed; pre-processing doesnot A pre-processor will often let errors through. Acompiler hides further steps; a pre-processordoes not36

Compilation vs. Interpretation (6/16) Many compiled languages haveinterpreted pieces, e.g., formats in Fortranor C Most use “virtual instructions”» set operations in Pascal» string manipulation in Basic Some compilers produce nothing butvirtual instructions, e.g., Pascal P-code,Java byte code, Microsoft COM 37

Compilation vs. Interpretation (7/16) Implementation strategies:» Preprocessor Removes comments and white space Groups characters into tokens (keywords,identifiers, numbers, symbols) Expands abbreviations in the style of a macroassembler Identifies higher-level syntactic structures (loops,subroutines)38

Compilation vs. Interpretation (8/16) Implementation strategies:» Library of Routines and Linking Compiler uses a linker program to merge theappropriate library of subroutines (e.g., mathfunctions such as sin, cos, log, etc.) into the finalprogram:39

Compilation vs. Interpretation (9/16) Implementation strategies:» Post-compilation Assembly Facilitates debugging (assembly languageeasier for people to read) Isolates the compiler from changes in theformat of machine language files (onlyassembler must be changed, is shared bymany compilers)40

Compilation vs. Interpretation (10/16) Implementation strategies:» The C Preprocessor (conditional compilation) Preprocessor deletes portions of code, whichallows several versions of a program to be builtfrom the same source41

Compilation vs. Interpretation (11/16) Implementation strategies:» Source-to-Source Translation (C ) C implementations based on the early AT&Tcompiler generated an intermediate program in C,instead of an assembly language:42

Compilation vs. Interpretation (12/16) Implementation strategies:» Bootstrapping43

Compilation vs. Interpretation (13/16) Implementation strategies:» Compilation of Interpreted Languages The compiler generates code that makesassumptions about decisions that won’t befinalized until runtime. If these assumptions arevalid, the code runs very fast. If not, a dynamiccheck will revert to the interpreter.44

Compilation vs. Interpretation (14/16) Implementation strategies:» Dynamic and Just-in-Time Compilation In some cases a programming system maydeliberately delay compilation until the lastpossible moment.– Lisp or Prolog invoke the compiler on the fly, totranslate newly created source into machine language,or to optimize the code for a particular input set.– The Java language definition defines a machineindependent intermediate form known as byte code.Byte code is the standard format for distribution of Javaprograms.– The main C# compiler produces .NET CommonIntermediate Language (CIL), which is then translatedinto machine code immediately prior to execution.45

Compilation vs. Interpretation (15/16) Implementation strategies:» Microcode Assembly-level instruction set is not implementedin hardware; it runs on an interpreter. Interpreter is written in low-level instructions(microcode or firmware), which are stored in readonly memory and executed by the hardware.46

Compilation vs. Interpretation (16/16) Compilers exist for some interpreted languages,but they aren't pure:» selective compilation of compilable pieces and extrasophisticated pre-processing of remaining source.» Interpretation of parts of code, at least, is stillnecessary for reasons above. Unconventional compilers» text formatters» silicon compilers» query language processors47

Programming Environment Tools Tools48

An Overview of Compilation (1/15) Phases of Compilation49

An Overview of Compilation (2/15) Scanning:» divides the program into "tokens", which arethe smallest meaningful units; this savestime, since character-by-character processingis slow» we can tune the scanner better if its job issimple; it also saves complexity (lots of it) forlater stages» you can design a parser to take charactersinstead of tokens as input, but it isn't pretty» scanning is recognition of a regular language,e.g., via Deterministic Finite Automata (DFA)50

An Overview of Compilation (3/15) Parsing is recognition of a context-freelanguage, e.g., via Push Down Automata(PDA)» Parsing discovers the "context free" structureof the program» Informally, it finds the structure you candescribe with syntax diagrams (the "circlesand arrows" in a Pascal manual)51

An Overview of Compilation (4/15) Semantic analysis is the discovery ofmeaning in the program» The compiler actually does what is calledSTATIC semantic analysis. That's themeaning that can be figured out at compiletime» Some things (e.g., array subscript out ofbounds) can't be figured out until run time.Things like that are part of the program'sDYNAMIC semantics52

An Overview of Compilation (5/15) Intermediate form (IF) done after semanticanalysis (if the program passes all checks)» IFs are often chosen for machine independence,ease of optimization, or compactness (these aresomewhat contradictory)» They often resemble machine code for someimaginary idealized machine; e.g. a stackmachine, or a machine with arbitrarily manyregisters» Many compilers actually move the code throughmore than one IF53

An Overview of Compilation (6/15) Optimization takes an intermediate-codeprogram and produces another one thatdoes the same thing faster, or in lessspace» The term is a misnomer; we just improvecode» The optimization phase is optional Code generation phase producesassembly language or (sometime)relocatable machine language54

An Overview of Compilation (7/15) Certain machine-specific optimizations(use of special instructions or addressingmodes, etc.) may be performed during orafter target code generation Symbol table: all phases rely on a symboltable that keeps track of all the identifiers inthe program and what the compiler knowsabout them» This symbol table may be retained (in someform) for use by a debugger, even aftercompilation has completed55

An Overview of Compilation (8/15) Lexical and Syntax Analysis» GCD Program (in C)int main() {int i getint(), j getint();while (i ! j) {if (i j) i i - j;else j j - i;}putint(i);}56

An Overview of Compilation (9/15) Lexical and Syntax Analysis» GCD Program Tokens Scanning (lexical analysis) and parsingrecognize the structure of the program, groupscharacters into tokens, the smallest meaningfulunits of the programintintwhileifelse}putint}maini((j( ii )getint! j{(jj-(i);)))i,{i;j getint() i-j;;57

An Overview of Compilation (10/15) Lexical and Syntax Analysis» Context-Free Grammar and Parsing Parsing organizes tokens into a parse tree thatrepresents higher-level constructs in terms of theirconstituents Potentially recursive rules known as context-freegrammar define the ways in which theseconstituents combine58

An Overview of Compilation (11/15) Context-Free Grammar and Parsing» Example (while loop in C)iteration-statement while ( expression ) statementstatement, in turn, is often a list enclosed in braces:statement compound-statementcompound-statement { block-item-list opt }whereblock-item-list opt block-item-listorblock-item-list opt ϵandblock-item-list block-itemblock-item-list block-item-list block-itemblock-item declarationblock-item statement59

An Overview of Compilation (12/15) Context-Free Grammar and Parsing» GCD Program Parse TreeBAnext slide60

An Overview of Compilation (13/15) Context-Free Grammar and Parsing (cont.)61

An Overview of Compilation (14/15) Context-Free Grammar and Parsing (cont.)AB62

An Overview of Compilation (15/15) Syntax Tree» GCD Program Parse Tree63

Abstract Syntax Tree (1/2) Many non-terminals inside a parse tree are artifacts ofthe grammar Remember:E :: E T TT :: T * Id IdThe parse tree for B * C can be written asE(T(Id(B), Id(C)))In constrast, an abstract syntax tree (AST) captures onlythose tree nodes that are necessary for representing theprogramIn the example:T(Id(B), Id(C))Consequently, many parsers really generate abstractsyntax trees.64

Abstract Syntax Tree (2/2) Another explanation for abstract syntaxtree: It’s a tree capturing only semanticallyrelevant information for a program» i.e., omitting all formatting and comments Question 1: What is a concrete syntaxtree? Question 2: When do I need a concretesyntax tree?65

Scannerless Parsing Separating syntactic analysis into lexing andparsing helps performance. After all, regularexpressions can be made very fast But it also limits language design choices. Forexample, it’s very hard to compose differentlanguages with separate lexers and parsers —think embedding SQL in JAVA Scannerless parsing integrates lexical analysisinto the parser, making this problem moretractable.66

Agenda1Instructor and Course Introduction2Introduction to Programming Languages3Programming Language Syntax4Conclusion67

Programming Language Syntax - Sub-Topics Language DefinitionSyntax and SemanticsGrammarsThe Chomsky HierarchyRegular ExpressionsRegular Grammar ExampleLexical IssuesContext-Free GrammarsScanningParsingLL ParsingLR Parsing68

Language Definition Different users have different needs:» programmers: tutorials, reference manuals,programming guides (idioms)» implementors: precise operationalsemantics» verifiers: rigorous axiomatic or naturalsemantics» language designers and lawyers: all of theabove Different levels of detail and precision» but none should be sloppy!69

Syntax and Semantics Syntax refers to external representation:» Given some text, is it a well-formed program? Semantics denotes meaning:» Given a well-formed program, what does it mean?» Often depends on context The division is somewhat arbitrary» Note: It is possible to fully describe the syntax and semantics of a programminglanguage by syntactic means (e.g., Algol68 and W-grammars), but this ishighly impractical Typically use a grammar for the context-free aspects, and differentmethod for the rest» Similar looking constructs in different languages often havesubtly (or not-so-subtly) different meanings» Good syntax, unclear semantics: “Colorless green ideas sleepfuriously”» Good semantics, poor syntax: “Me go swimming now, sorrybye”» In programming languages: syntax tells you what a wellformed program looks like. Semantic tells you relationship ofoutput to input70

Grammars (1/2) A grammar G is a tuple (Σ,N, S, δ)» N is the set of non-terminal symbols» S is the distinguished non-terminal: the root symbol» Σ is the set of terminal symbols (alphabet)» δ is the set of rewrite rules (productions) of the form:ABC. . . :: XYZ . . .where A,B,C,D,X,Y, Z are terminals and non terminals» The language is the set of sentences containing onlyterminal symbols that can be generated by applyingthe rewriting rules starting from the root symbol (let’scall such sentences strings)71

Grammars (2/2) Consider the following grammar G:»»»»N {S;X; Y}S SΣ {a; b; c}δ consists of the following rules: S - bS - XbYX - aX - aXY - cY - Y c» Some sample derivations: S - b S - XbY - abY - abc S - XbY - aXbY - aaXbY - aaabY - aaabc72

The Chomsky Hierarchy Regular grammars (Type 3)» all productions can be written in the form: N :: TN» one non-terminal on left side; at most one on right Context-free grammars (Type 2)» all productions can be written in the form: N :: XYZ» one non-terminal on the left-hand side; mixture on right Context-sensitive grammars (Type 1)» number of symbols on the left is no greater than on theright» no production shrinks the size of the sentential form Type-0 grammars» no restrictions73

Regular Expressions (1/3) An alternate way of describing a regular language iswith regular expressionsWe say that a regular expression R denotes the language [[R]]Recall that a language is a set of stringsBasic regular expressions:» Є denotes Ǿ» a character x, where x Є Σ, denotes {x}» (sequencing) a sequence of two regular expressions RSdenotes» {αβ α Є [[R]], β Є [[S]]}» (alternation) R S denotes [[R]] U [[S]]» (Kleene star) R* denotes the set of strings which areconcatenations of zero or more strings from [[R]]» parentheses are used for grouping» Shorthands: R? Є R R RR*74

Regular Expressions (2/3) A regular expression is one of thefollowing:» A character» The empty string, denoted by » Two regular expressions concatenated» Two regular expressions separated by (i.e., or)» A regular expression followed by the Kleenestar (concatenation of zero or more strings)75

Regular Expressions (3/3) Numerical literals in Pascal may begenerated by the following:76

Regular Grammar Example A grammar for floating point numbers:» Float :: Digits Digits . Digits» Digits :: Digit Digit Digits» Digit :: 0 1 2 3 4 5 6 7 8 9 A regular expression for floating point numbers:» (0 1 2 3 4 5 6 7 8 9) (.(0 1 2 3 4 5 6 7 8 9) )? Perl offer some shorthands:» [0 -9] (\.[0 -9] )?or» \d (\.\ d )?77

Lexical IssuesLexical: formation of words or tokens Tokens are the basic building blocks of programs:»»»»»keywords (begin, end, while).identifiers (myVariable, yourType)numbers (137, 6:022e23)symbols ( , )string literals (“Hello world”)Described (mainly) by regular grammarsTerminals are characters. Some choices:» character set: ASCII, Latin-1, ISO646, Unicode, etc.» is case significant? Is indentation significant?» Python, Occam, HaskellExample: identifiersId :: Letter IdRestIdRest :: Є Letter IdRest Digit IdRestMissing from above grammar: limit of identifier lengthOther issues: international characters, case-sensitivity, limit of identifier length78

Context-Free Grammars (1/7) BNF: notation for context-free grammars» (BNF Backus-Naur Form) Some conventionalabbreviations: alternation: Symb :: Letter Digit repetition: Id :: Letter {Symb}or we can use a Kleene star: Id :: Letter Symb*for one or more repetitions: Int :: Digit option: Num :: Digit [. Digit*] abbreviations do not add to expressive powerof grammar need convention for meta-symbols – what if “ ”is in the language?79

Context-Free Grammars (2/7) The notation for context-free grammars(CFG) is sometimes called Backus-NaurForm (BNF) A CFG consists of» A set of terminals T» A set of non-terminals N» A start symbol S (a non-terminal)» A set of productions80

Context-Free Grammars (3/7) Expression grammar with precedenceand associativity81

Context-Free Grammars (4/7) A parse tree describes the grammaticalstructure of a sentence» root of tree is root symbol of grammar» leaf nodes are terminal symbols» internal nodes are non-term

Introduction to Programming Languages - Sub-Topics Introduction Programming Language Design and Usage Main Themes Programming Language as a Tool for Thought Idioms Why Study Programming Languages Classifying Programming Languages Imperative Languages PL Genealogy Predictable Performance vs. Writeability