x86 Assembly Language


x86 Assembly Language Fundamentals

Intel x86 Assembly Fundamentals
Computer Organization and Assembly Languages
Yung-Yu Chuang
2008/12/8
with slides by Kip Irvine

Instructions
- Assembled into machine code by assembler
- Executed at runtime by the CPU
- Member of the Intel IA-32 instruction set
- Four parts
  - Label (optional)
  - Mnemonic (required)
  - Operand (usually required)
  - Comment (optional)

    Label:    Mnemonic    Operand(s)    ;Comment

Labels
- Act as place markers
  - marks the address (offset) of code and data
- Easier to memorize and more flexible

    mov ax, [0020]   becomes   mov ax, val

- Follow identifier rules
- Data label
  - must be unique
  - example: myArray BYTE 10
- Code label (ends with a colon)
  - target of jump and loop instructions
  - example:

    L1: mov ax, bx
        ...
        jmp L1

Reserved words and identifiers
- Reserved words cannot be used as identifiers
  - Instruction mnemonics, directives, type attributes, operators, predefined symbols
- Identifiers
  - 1-247 characters, including digits
  - case insensitive (by default)
  - first character must be a letter, _, @, or $
  - examples: var1, Count, $first, _main, MAX, open_file, @@myfile, xVal, _12345

Mnemonics and operands
- Instruction mnemonics
  - "reminder"
  - examples: MOV, ADD, SUB, MUL, INC, DEC
- Operands
  - constant (immediate value), e.g. 96
  - constant expression, e.g. 2+4
  - register, e.g. eax
  - memory (data label), e.g. count
- Number of operands: 0 to 3

    stc              ; set Carry flag
    inc ax           ; add 1 to ax
    mov count, bx    ; move BX to count

Directives
- Commands that are recognized and acted upon by the assembler
  - Part of the assembler's syntax but not part of the Intel instruction set
  - Used to declare code and data areas, select the memory model, declare procedures, etc.
  - case insensitive
- Different assemblers have different directives
  - NASM != MASM, for example
- Examples: .data, .code, PROC

Comments
- Comments are good!
  - explain the program's purpose
  - tricky coding techniques
  - application-specific explanations
- Single-line comments
  - begin with a semicolon (;)
- Block comments
  - begin with the COMMENT directive and a programmer-chosen character, and end with the same programmer-chosen character

    COMMENT !
        This is a comment
        and this line is also a comment
    !
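The identifier rules on the slide above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides. It assumes the rules as stated there (1-247 characters, first character a letter, underscore, @, or $, later characters may also be digits), and the reserved-word set is a tiny illustrative sample rather than MASM's full list.

```python
import re

# Assumed rule set: 1-247 characters, first character a letter, _, @ or $,
# later characters may also be digits.
IDENTIFIER = re.compile(r"[A-Za-z_@$][A-Za-z0-9_@$]{0,246}")

# Illustrative sample only; the real assembler reserves many more words.
SAMPLE_RESERVED = {"mov", "add", "sub", "mul", "inc", "dec", "byte", "proc"}


def is_valid_identifier(name: str) -> bool:
    """Return True if name follows the identifier rules sketched above."""
    if name.lower() in SAMPLE_RESERVED:  # identifiers are case insensitive
        return False
    return IDENTIFIER.fullmatch(name) is not None


for candidate in ["var1", "_main", "$first", "@@myfile", "1abc", "MOV"]:
    print(candidate, is_valid_identifier(candidate))
```

The length limit is folded into the regular expression, so a 248-character name is rejected without a separate check.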

Example: adding/subtracting integers

    TITLE Add and Subtract              (AddSub.asm)

    ; This program adds and subtracts 32-bit integers.

    INCLUDE Irvine32.inc
    .code
    main PROC
        mov eax,10000h      ; EAX = 10000h
        add eax,40000h      ; EAX = 50000h
        sub eax,20000h      ; EAX = 30000h
        call DumpRegs       ; display registers
        exit
    main ENDP
    END main

Notes on the listing:
- TITLE: directive marking a comment
- ; This program ...: comment
- INCLUDE Irvine32.inc: copy definitions from Irvine32.inc
- .code: code segment (there are 3 segments: code, data, stack)
- main PROC: beginning of a procedure
- mov eax,10000h: the first operand is the destination, the second is the source
- exit: defined in Irvine32.inc to end a program
- END main: marks the last line and defines the startup procedure

Example output
Program output, showing registers and flags:

    EAX=00030000  EBX=7FFDF000  ECX=00000101  EDX=FFFFFFFF
    ESI=00000000  EDI=00000000  EBP=0012FFF0  ESP=0012FFC4
    EIP=00401024  EFL=00000206  CF=0  SF=0  ZF=0  OF=0

Alternative version of AddSub

    TITLE Add and Subtract              (AddSubAlt.asm)

    ; This program adds and subtracts 32-bit integers.

    .386
    .MODEL flat,stdcall
    .STACK 4096

    ExitProcess PROTO, dwExitCode:DWORD
    DumpRegs PROTO

    .code
    main PROC
        mov eax,10000h      ; EAX = 10000h
        add eax,40000h      ; EAX = 50000h
        sub eax,20000h      ; EAX = 30000h
        call DumpRegs
        INVOKE ExitProcess,0
    main ENDP
    END main

Program template

    TITLE Program Template              (Template.asm)

    ; Program Description:
    ; Author:
    ; Creation Date:
    ; Revisions:
    ; Date:              Modified by:

    INCLUDE Irvine32.inc    ; provides the exit macro used below
    .data
        ; (insert variables here)
    .code
    main PROC
        ; (insert executable instructions here)
        exit
    main ENDP
        ; (insert additional procedures here)
    END main
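As a cross-check of the AddSub listing and its DumpRegs output, the three instructions can be replayed with 32-bit wraparound arithmetic. This is only a Python sketch of the arithmetic, not an emulator: the helper names `add32` and `sub32` are made up for illustration, and only CF, SF and ZF are modelled.

```python
MASK32 = 0xFFFFFFFF


def add32(a: int, b: int):
    """Return (result, carry) for a 32-bit unsigned add."""
    total = a + b
    return total & MASK32, int(total > MASK32)


def sub32(a: int, b: int):
    """Return (result, borrow) for a 32-bit unsigned subtract."""
    return (a - b) & MASK32, int(a < b)


eax = 0x10000                      # mov eax,10000h
eax, cf = add32(eax, 0x40000)      # add eax,40000h
eax, cf = sub32(eax, 0x20000)      # sub eax,20000h
zf = int(eax == 0)                 # Zero flag: result is zero
sf = eax >> 31                     # Sign flag: copy of bit 31
print(f"EAX={eax:08X} CF={cf} SF={sf} ZF={zf}")
```

The values agree with the EAX, CF, SF and ZF fields shown in the program output.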

Assemble-link execute cycle
- The following diagram describes the steps from creating a source program through executing the compiled program.
- If the source code is modified, Steps 2 through 4 must be repeated.

[Diagram: Step 1, the text editor, produces the Source File. Step 2, the assembler, turns it into an Object File and a Listing File. Step 3, the linker, combines the Object File with the Link Library into an Executable File and a Map File. Step 4, the OS loader, runs the Executable File to produce Output.]

Defining data

Intrinsic data types (1 of 2)
- BYTE, SBYTE
  - 8-bit unsigned integer; 8-bit signed integer
- WORD, SWORD
  - 16-bit unsigned & signed integer
- DWORD, SDWORD
  - 32-bit unsigned & signed integer
- QWORD
  - 64-bit integer
- TBYTE
  - 80-bit integer

Intrinsic data types (2 of 2)
- REAL4
  - 4-byte IEEE short real
- REAL8
  - 8-byte IEEE long real
- REAL10
  - 10-byte IEEE extended real

Data definition statement
- A data definition statement sets aside storage in memory for a variable.
- May optionally assign a name (label) to the data.
- Only size matters; other attributes such as signed are just reminders for programmers.
- Syntax:

    [name] directive initializer [,initializer] . . .

  At least one initializer is required; it can be ?
- All initializers become binary data in memory

Integer constants
- [{+|-}] digits [radix]
- Optional leading + or - sign
- binary, decimal, hexadecimal, or octal digits
- Common radix characters:
  - h: hexadecimal
  - d: decimal (default)
  - b: binary
  - r: encoded real
  - o: octal
- Examples: 30d, 6Ah, 42, 42o, 1101b
- Hexadecimal beginning with letter: 0A5h

Integer expressions
- Operators and precedence levels:
  [Table: operators and precedence levels]
- Examples:
  [Table: example expressions]

Real number constants (encoded reals)
- Fixed point vs. floating point
- Single precision: S (1 bit), E (8 bits), M (23 bits)

    value = ±1.bbbb × 2^(E-127)

- Example: 3F800000r = +1.0, 37.75 = 42170000r
- double: S (1 bit), E (11 bits), M (52 bits)
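The two encoded-real examples above (3F800000r and 42170000r) can be verified by decoding the bit fields. The sketch below is in Python rather than assembly, purely for checking the arithmetic. The function names are illustrative, and `decode_by_formula` handles only normalized numbers, an assumption that matches the formula on the slide.

```python
import struct


def decode_encoded_real(hex_digits: str) -> float:
    """Interpret 8 hex digits as an IEEE single-precision value."""
    (value,) = struct.unpack(">f", bytes.fromhex(hex_digits))
    return value


def decode_by_formula(hex_digits: str) -> float:
    """Decode via the S/E/M fields: (-1)^S * 1.M * 2^(E-127).

    Normalized numbers only; zero, denormals, infinities and NaNs
    are outside this sketch.
    """
    bits = int(hex_digits, 16)
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased by 127
    mantissa = bits & 0x7FFFFF        # 23 bits, implicit leading 1
    return (-1) ** sign * (1 + mantissa / 2**23) * 2.0 ** (exponent - 127)


for digits in ("3F800000", "42170000"):
    print(digits, decode_encoded_real(digits), decode_by_formula(digits))
```

For 42170000r the fields are S = 0, E = 132 and M = 0010111 followed by zeros, which gives 1.0010111b × 2^5 = 100101.11b = 37.75.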

Real number constants (decimal reals)
- [sign]integer.[integer][exponent]

    sign     -> {+|-}
    exponent -> E[{+|-}]integer

- Examples: 2., +3.0, -44.2E+05, 26.E5

Character and string constants
- Enclose character in single or double quotes
  - 'A', "x"
  - ASCII character = 1 byte
- Enclose strings in single or double quotes
  - "ABC"
  - 'xyz'
  - Each character occupies a single byte
- Embedded quotes:
  - 'Say "Goodnight," Gracie'
  - "This isn't a test"

Defining BYTE and SBYTE Data
Each of the following defines a single byte of storage:

    value1 BYTE 'A'       ; character constant
    value2 BYTE 0         ; smallest unsigned byte
    value3 BYTE 255       ; largest unsigned byte
    value4 SBYTE -128     ; smallest signed byte
    value5 SBYTE +127     ; largest signed byte
    value6 BYTE ?         ; uninitialized byte

A variable name is a data label that implies an offset (an address).

Defining multiple bytes
Examples that use multiple initializers:

    list1 BYTE 10,20,30,40
    list2 BYTE 10,20,30,40
          BYTE 50,60,70,80
          BYTE 81,82,83,84
    list3 BYTE ?,32,41h,00100010b
    list4 BYTE 0Ah,20h,'A',22h
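The BYTE and SBYTE definitions above all occupy one byte; whether a pattern is read as signed or unsigned is up to the program. A small Python sketch of that idea follows. The helper names are invented here, and the accepted initializer range of -128 to 255 is an assumption made for the illustration.

```python
def stored_byte(value: int) -> int:
    """Return the 8-bit pattern stored for a one-byte initializer."""
    if not -128 <= value <= 255:
        raise ValueError("initializer does not fit in one byte")
    return value & 0xFF


def as_signed(pattern: int) -> int:
    """Read an 8-bit pattern as a two's-complement (signed) value."""
    return pattern - 256 if pattern & 0x80 else pattern


print(f"{stored_byte(-128):02X}")     # pattern stored for value4 SBYTE -128
print(f"{stored_byte(255):02X}")      # pattern stored for value3 BYTE 255
print(as_signed(stored_byte(255)))    # the same byte read as signed
```

The pattern stored for 255 is FFh, and reading FFh as a signed byte gives -1, which is why the signed/unsigned attribute is only a reminder for the programmer.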

Defining strings (1 of 2)
- A string is implemented as an array of characters
  - For convenience, it is usually enclosed in quotation marks
  - It usually has a null byte at the end
- Examples:

    str1 BYTE "Enter your name",0
    str2 BYTE 'Error: halting program',0
    str3 BYTE 'A','E','I','O','U'
    greeting1 BYTE "Welcome to the Encryption Demo program "
              BYTE "created by Kip Irvine.",0
    greeting2 \
              BYTE "Welcome to the Encryption Demo program "
              BYTE "created by Kip Irvine.",0

Defining strings (2 of 2)
- End-of-line character sequence:
  - 0Dh = carriage return
  - 0Ah = line feed

    str1 BYTE "Enter your name: ",0Dh,0Ah
         BYTE "Enter your address: ",0
    newLine BYTE 0Dh,0Ah,0

Idea: Define all strings used by your program in the same area of the data segment.

Using the DUP operator
- Use DUP to allocate (create space for) an array or string.
- Counter and argument must be constants or constant expressions

    var1 BYTE 20 DUP(0)         ; 20 bytes, all zero
    var2 BYTE 20 DUP(?)         ; 20 bytes, uninitialized
    var3 BYTE 4 DUP("STACK")    ; 20 bytes: "STACKSTACKSTACKSTACK"
    var4 BYTE 10,3 DUP(0),20

Defining WORD and SWORD data
- Define storage for 16-bit integers
  - or double characters
  - single value or multiple values

    word1 WORD 65535          ; largest unsigned
    word2 SWORD -32768        ; smallest signed
    word3 WORD ?              ; uninitialized, unsigned
    word4 WORD "AB"           ; double characters
    myList WORD 1,2,3,4,5     ; array of words
    array WORD 5 DUP(?)       ; uninitialized array
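The DUP examples above can be mimicked with byte-string repetition to see exactly which bytes are allocated. This is a Python sketch for illustration only. The `dup` helper is hypothetical, and uninitialized (`?`) storage is not modelled.

```python
def dup(count: int, *items: bytes) -> bytes:
    """Model `count DUP(items...)`: repeat the initializer list count times."""
    return b"".join(items) * count


var1 = dup(20, b"\x00")                               # var1 BYTE 20 DUP(0)
var3 = dup(4, b"STACK")                               # var3 BYTE 4 DUP("STACK")
var4 = bytes([10]) + dup(3, b"\x00") + bytes([20])    # var4 BYTE 10,3 DUP(0),20

print(len(var1), var3, list(var4))
```

Because `dup` returns plain bytes, a nested form such as `dup(5, dup(3, b"\x00"))` also works and yields 15 bytes.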

Defining DWORD and SDWORD data
Storage definitions for signed and unsigned 32-bit integers:

    val1 DWORD 12345678h          ; unsigned
    val2 SDWORD -2147483648       ; signed
    val3 DWORD 20 DUP(?)          ; unsigned array
    val4 SDWORD -3,-2,-1,0,1      ; signed array

Defining QWORD, TBYTE, Real Data
Storage definitions for quadwords, tenbyte values, and real numbers:

    quad1 QWORD 1234567812345678h
    val1 TBYTE 1000000000123456789Ah
    rVal1 REAL4 -2.1
    rVal2 REAL8 3.2E-260
    rVal3 REAL10 4.6E+4096
    ShortArray REAL4 20 DUP(0.0)

Little Endian order
- All data types larger than a byte store their individual bytes in reverse order. The least significant byte occurs at the first (lowest) memory address.
- Example:

    val1 DWORD 12345678h

Adding variables to AddSub

    TITLE Add and Subtract              (AddSub2.asm)

    INCLUDE Irvine32.inc
    .data
    val1 DWORD 10000h
    val2 DWORD 40000h
    val3 DWORD 20000h
    finalVal DWORD ?
    .code
    main PROC
        mov eax,val1         ; start with 10000h
        add eax,val2         ; add 40000h
        sub eax,val3         ; subtract 20000h
        mov finalVal,eax     ; store the result (30000h)
        call DumpRegs        ; display the registers
        exit
    main ENDP
    END main
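To make the little-endian rule concrete for the `val1 DWORD 12345678h` example, the byte layout can be printed with Python's standard `struct` module. This is an illustrative sketch; the offsets start at 0 rather than at a real address.

```python
import struct

val1 = 0x12345678
memory = struct.pack("<I", val1)   # "<I" = little-endian unsigned 32-bit

# Lowest address first: the least significant byte comes out first.
for offset, byte in enumerate(memory):
    print(f"offset {offset:04X}: {byte:02X}")

# Reading the four bytes back in little-endian order restores the value.
restored = int.from_bytes(memory, "little")
```

The bytes appear in memory as 78, 56, 34, 12, the reverse of the order in which the constant is written.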

Declaring uninitialized data
- Use the .data? directive to declare an uninitialized data segment:

    .data?

- Within the segment, declare variables with "?" initializers (they will not be assembled into the .exe).
  Advantage: the program's EXE file size is reduced.

    .data
    smallArray DWORD 10 DUP(0)
    .data?
    bigArray DWORD 5000 DUP(?)

Mixing code and data

    .code
    mov eax, ebx
    .data
    temp DWORD ?
    .code
    mov temp, eax

Symbolic constants

Equal-sign directive
- name = expression
  - expression is a 32-bit integer (expression or constant)
  - may be redefined
  - name is called a symbolic constant
- Good programming style to use symbols
  - Easier to modify
  - Easier to understand, e.g. ESC_key

    Array DWORD COUNT DUP(0)

    COUNT = 5
    mov al, COUNT
    COUNT = 10
    mov al, COUNT

    COUNT = 500
    .
    mov ax, COUNT     ; 500 does not fit in the 8-bit AL, so AX is used here

Calculating the size of a byte array
- current location counter: $
  - subtract address of list
  - difference is the number of bytes

    list BYTE 10,20,30,40
    ListSize = 4

    list BYTE 10,20,30,40
    ListSize = ($ - list)

    list BYTE 10,20,30,40
    var2 BYTE 20 DUP(?)
    ListSize = ($ - list)      ; evaluated after var2, so this also counts var2's 20 bytes

    myString BYTE "This is a long string."
    myString_len = ($ - myString)

Calculating the size of a word array
- current location counter: $
  - subtract address of list
  - difference is the number of bytes
  - divide by 2 (the size of a word)

    list WORD 1000h,2000h,3000h,4000h
    ListSize = ($ - list) / 2

    list DWORD 1,2,3,4
    ListSize = ($ - list) / 4

EQU directive
- Forms:

    name EQU expression
    name EQU symbol
    name EQU <text>

- Define a symbol as either an integer or text expression.
- Can be useful for non-integer constants
- Cannot be redefined

EQU directive (examples)

    PI EQU <3.1416>
    pressKey EQU <"Press any key to continue.",0>
    .data
    prompt BYTE pressKey

    matrix1 EQU 10*10
    matrix2 EQU <10*10>
    .data
    M1 WORD matrix1     ; M1 WORD 100
    M2 WORD matrix2     ; M2 WORD 10*10
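The `($ - list)` idiom depends on where the location counter stands when the expression is evaluated. The following Python sketch is a toy model of that counter. The `DataSegment` class and its method names are invented for illustration and are not an assembler API.

```python
class DataSegment:
    """Toy model of the assembler's location counter ($)."""

    def __init__(self):
        self.counter = 0    # current value of $
        self.labels = {}    # label name -> offset where it was defined

    def define(self, name: str, element_size: int, count: int) -> None:
        """Record the label at $ and advance $ past its storage."""
        self.labels[name] = self.counter
        self.counter += element_size * count

    def size_since(self, name: str, element_size: int = 1) -> int:
        """Evaluate ($ - name) / element_size at the current position."""
        return (self.counter - self.labels[name]) // element_size


# list WORD 1000h,2000h,3000h,4000h  /  ListSize = ($ - list) / 2
words = DataSegment()
words.define("list", element_size=2, count=4)
list_size = words.size_since("list", element_size=2)

# Pitfall: another variable is defined before the size is computed.
late = DataSegment()
late.define("list", element_size=1, count=4)
late.define("var2", element_size=1, count=20)
late_size = late.size_since("list")

print(list_size, late_size)
```

The second case shows why the size expression should directly follow the array: once `var2` has advanced the counter, the difference covers both variables.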

Addressing Modes

32-Bit Addressing Modes
- These addressing modes use 32-bit registers:

    Segment + Base + (Index * Scale) + displacement
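The address formula above can be evaluated by hand or with a few lines of code. The Python sketch below computes only the offset part (base, index, scale, displacement). It assumes a flat memory model in which the segment contributes nothing, and the function name and sample register values are made up for illustration.

```python
def effective_address(base: int = 0, index: int = 0,
                      scale: int = 1, displacement: int = 0) -> int:
    """Compute Base + (Index * Scale) + displacement, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):   # the only scale factors IA-32 encodes
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


# Hypothetical values: element 3 of a DWORD array that starts
# 8 bytes past a base address of 00404000h.
ea = effective_address(base=0x00404000, index=3, scale=4, displacement=8)
print(f"{ea:08X}")
```

With these sample values the result is 00404014h: 3 * 4 = 12 bytes for the index, plus 8 bytes of displacement, added to the base.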

Instruction operand notation
[Table: instruction operand notation]

Operand types
- Three basic types of operands:
  - Immediate: a constant integer (8, 16, or 32 bits)
    - value is encoded within the instruction
  - Register: the name of a register
    - register name is converted to a number and encoded within the instruction
  - Memory: reference to a location in memory
    - memory address is encoded within the instruction, or a register holds the address of a memory location

Direct memory operands
- A direct memory operand is a named reference to storage in memory
- The named reference (label) is automatically dereferenced by the assembler

    .data
    var1 BYTE 10h
    .code
    mov al,var1       ; AL = 10h
    mov al,[var1]     ; AL = 10h (alternate format; I prefer this one)

Direct-offset operands
A constant offset is added to a data label to produce an effective address (EA). The address is dereferenced to get the value inside its memory location. (No range checking is performed.)

    .data
    arrayB BYTE 10h,20h,30h,40h
    .code
    mov al,arrayB+1       ; AL = 20h
    mov al,[arrayB+1]     ; alternative notation
    mov al,arrayB+3       ; AL = 40h
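The direct-offset examples for `arrayB` can be reproduced by treating the data segment as a flat sequence of bytes. This Python sketch is illustrative only. The extra byte `AAh` after the array is a hypothetical neighbouring variable, added to show what the lack of range checking means.

```python
# arrayB BYTE 10h,20h,30h,40h, followed by one hypothetical neighbour byte
data = bytes([0x10, 0x20, 0x30, 0x40, 0xAA])
arrayB = 0   # offset of arrayB within this toy data segment


def load_byte(label_offset: int, constant_offset: int = 0) -> int:
    """Model `mov al,[label+constant]`: read one byte at the effective address."""
    return data[label_offset + constant_offset]


print(f"{load_byte(arrayB, 1):02X}")   # mov al,[arrayB+1]
print(f"{load_byte(arrayB, 3):02X}")   # mov al,[arrayB+3]
print(f"{load_byte(arrayB, 4):02X}")   # one past the array: the neighbour's byte
```

The last read succeeds even though offset 4 is outside `arrayB`; the assembler accepts such an offset, and the instruction simply loads whatever is stored next.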

Direct-offset operands (cont)
A constant offset is added to a data label to produce an effective address (EA). The address is dereferenced to get the value inside its memory location.

    .data
    arrayW WORD 1000h,2000h,3000h
    arrayD DWORD 1,2,3,4
    .code
    mov ax,[arrayW+2]      ; AX = 2000h
    mov ax,[arrayW+4]      ; AX = 3000h
    mov eax,[arrayD+4]     ; EAX = 00000002h

    ; will the following assemble and run?
    mov ax,[arrayW-2]      ; ?
    mov eax,[arrayD+16]    ; ?

Data-Related Operators and Directives
- OFFSET Operator
- PTR Operator
- TYPE Operator
- LENGTHOF Operator
- SIZEOF Operator
- LABEL Directive

OFFSET Operator
- OFFSET returns the distance, in bytes, of a label from the beginning of its enclosing segment
  - Protected mode: 32 bits
  - Real mode: 16 bits

[Figure: a data segment, with the offset measured from the start of the segment to myByte]

The Protected-mode programs we write only have a single segment (we use the flat memory model).

OFFSET Examples
Let's assume that bVal is located at 00404000h:

    .data
    bVal BYTE ?
    wVal WORD ?
    dVal DWORD ?
    dVal2 DWORD ?
    .code
    mov esi,OFFSET bVal     ; ESI = 00404000
    mov esi,OFFSET wVal     ; ESI = 00404001
    mov esi,OFFSET dVal     ; ESI = 00404003
    mov esi,OFFSET dVal2    ; ESI = 00404007
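The four OFFSET values in the example follow from laying the variables out back to back, starting at the assumed address 00404000h. A short Python sketch of that bookkeeping is given below. The `layout` helper is hypothetical, and it assumes no alignment padding between variables, as in the slide.

```python
def layout(base: int, declarations):
    """Assign consecutive offsets to (name, size_in_bytes) declarations."""
    offsets = {}
    address = base
    for name, size in declarations:
        offsets[name] = address
        address += size      # next variable starts right after this one
    return offsets


offsets = layout(0x00404000, [
    ("bVal", 1),    # BYTE
    ("wVal", 2),    # WORD
    ("dVal", 4),    # DWORD
    ("dVal2", 4),   # DWORD
])

for name, address in offsets.items():
    print(f"OFFSET {name} = {address:08X}")
```

Each offset is the previous offset plus the size of the previous variable, which is why `wVal` lands on the odd address 00404001h.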

Relating to C/C++
The value returned by OFFSET is a pointer. Compare the following code written for both C++ and assembly language:

    ; C++ version:
    char array[1000];
    char * p = array;

    .data
    array BYTE 1000 DUP(?)
    .code
    mov esi,OFFSET array     ; ESI is p

TYPE Operator
The TYPE operator returns the size, in bytes, of a single element of a data declaration.

    .data
    var1 BYTE ?
    var2 WORD ?
    var3 DWORD ?
    var4 QWORD ?
    .code
    mov eax,TYPE var1     ; 1
    mov eax,TYPE var2     ; 2
    mov eax,TYPE var3     ; 4
    mov eax,TYPE var4     ; 8

LENGTHOF Operator
The LENGTHOF operator counts the number of elements in a single data declaration.

    .data                             ; LENGTHOF
    byte1 BYTE 10,20,30               ; 3
    array1 WORD 30 DUP(?),0,0         ; 32
    array2 WORD 5 DUP(3 DUP(?))       ; 15
    array3 DWORD 1,2,3,4              ; 4
    digitStr BYTE "12345678",0        ; 9

SIZEOF Operator
The SIZEOF operator returns a value that is equivalent to multiplying LENGTHOF by TYPE.

    .data
    byte1 BYTE 10,20,30
    array1 WORD 30 DUP(?),0,0
    array2 WORD
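Since SIZEOF is defined above as LENGTHOF multiplied by TYPE, its values for the LENGTHOF declarations can be worked out directly. The Python sketch below does that arithmetic. The table of declarations is transcribed from the LENGTHOF example, and the SIZEOF results are computed from the stated rule rather than taken from the slides.

```python
# name -> (TYPE in bytes, LENGTHOF in elements), from the LENGTHOF example
declarations = {
    "byte1":    (1, 3),        # BYTE 10,20,30
    "array1":   (2, 30 + 2),   # WORD 30 DUP(?),0,0
    "array2":   (2, 5 * 3),    # WORD 5 DUP(3 DUP(?))
    "array3":   (4, 4),        # DWORD 1,2,3,4
    "digitStr": (1, 8 + 1),    # BYTE "12345678",0
}


def sizeof(name: str) -> int:
    """SIZEOF = LENGTHOF * TYPE for one data declaration."""
    type_size, length = declarations[name]
    return type_size * length


for name in declarations:
    print(name, sizeof(name))
```

For byte-sized declarations SIZEOF and LENGTHOF coincide; for `array1` the 32 words occupy 64 bytes.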
