The Intel processor line, colloquially referred to as x86.
Advanced Micro Devices (AMD), Intel’s biggest competitor.
gcc -O1 -o p p1.c p2.c
The command-line option -O1 instructs the compiler to apply level-one optimizations. In practice, level-two optimization (specified with the option -O2) is considered a better choice in terms of the resulting program performance.
The format and behavior of a machine-level program are defined by the instruction set architecture, or “ISA,” which defines the processor state, the format of the instructions, and the effect each of these instructions will have on the state.
gcc -O1 -S code.c
This will cause gcc to run the compiler, generating an assembly file code.s.
gcc -O1 -c code.c
This will generate an object-code file code.o that is in binary format and hence cannot be viewed directly.
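objdump -d code.o
This will disassemble the object code in code.o, producing an assembly-language listing from its bytes.
gcc -O1 -o prog code.o main.c
This will link code.o with main.c, the file containing the main function, to generate the executable file prog. The linker shifts the location of the code to a different range of addresses and assigns a location to each global variable.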
Intel uses the term “word” to refer to a 16-bit data type. Based on this, they refer to 32-bit quantities as “double words.” They refer to 64-bit quantities as “quad words.”
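As a small illustration of this terminology, the sketch below maps the three names onto C's fixed-width types from <stdint.h>. The typedef and function names (word_t, dword_t, qword_t, and the *_bits helpers) are hypothetical and chosen only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names mirroring Intel's terminology. */
typedef uint16_t word_t;   /* "word": 16 bits */
typedef uint32_t dword_t;  /* "double word": 32 bits */
typedef uint64_t qword_t;  /* "quad word": 64 bits */

/* Width of each type in bits (8 bits per byte on x86). */
size_t word_bits(void)  { return 8 * sizeof(word_t); }
size_t dword_bits(void) { return 8 * sizeof(dword_t); }
size_t qword_bits(void) { return 8 * sizeof(qword_t); }
```

Calling the three helpers from a test program is one way to confirm the 16/32/64-bit widths on a given machine.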
An IA32 central processing unit (CPU) contains a set of eight registers storing 32-bit values. The low-order 2 bytes of the first four registers (%eax, %ecx, %edx, and %ebx) can be independently read or written by the byte operation instructions, for example as %al and %ah. The low-order 16 bits of each register can be read or written by word operation instructions.
IA32 imposes the restriction that a move instruction cannot have both operands refer to memory locations. Both the movs and the movz instruction classes serve to copy a smaller amount of source data to a larger data location, filling in the upper bits by either sign extension (movs) or zero extension (movz).
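The difference between the two ways of filling the upper bits can be seen from C by widening an 8-bit value to 32 bits. This is a minimal sketch: the function names are hypothetical, and the instruction a compiler actually emits for each cast (typically a movs-class or movz-class move on IA32) is an assumption, not something guaranteed by the C standard.

```c
#include <stdint.h>

/* Widening a signed 8-bit value: the upper 24 bits are copies of the
   sign bit (the movs-style behavior). */
int32_t widen_signed(int8_t b) { return (int32_t)b; }

/* Widening an unsigned 8-bit value: the upper 24 bits are zero
   (the movz-style behavior). */
uint32_t widen_unsigned(uint8_t b) { return (uint32_t)b; }
```

For the byte pattern 0x80, the signed version gives the 32-bit pattern 0xFFFFFF80 while the unsigned version gives 0x00000080.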
Machine code provides two basic low-level mechanisms for implementing conditional behavior: it tests data values and then either alters the control flow or the data flow based on the result of these tests.
Recent generations of IA32 processors have had conditional move instructions that either do nothing or copy a value to a register, depending on the values of the condition codes.
The code compiled using conditional moves requires around 14 clock cycles regardless of the data being tested. The flow of control does not depend on data, and this makes it easier for the processor to keep its pipeline full.
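For a conditional expression v = test-expr ? then-expr : else-expr; code based on conditional moves evaluates both then-expr and else-expr before selecting one of the results. If one of those two expressions could possibly generate an error condition or a side effect, this could lead to invalid behavior.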
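The two C functions below sketch the distinction. Both function names are hypothetical, and whether a particular compiler actually uses a conditional move for the first one depends on the compiler and its options.

```c
#include <stddef.h>

/* Both arms are plain arithmetic with no side effects, so a compiler
   may compute y - x and x - y and pick one with a conditional move. */
int absdiff(int x, int y) { return x < y ? y - x : x - y; }

/* The then-arm dereferences xp. Evaluating it when xp is NULL would
   be an error, so this expression has to be compiled with a branch
   that skips the dereference. */
int read_or_zero(const int *xp) { return xp ? *xp : 0; }
```

In the second function, evaluating both arms unconditionally would dereference a null pointer whenever xp is NULL, which is exactly the kind of error condition described above.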
A switch statement can be implemented with a jump table. A jump table is an array where entry i is the address of a code segment implementing the action the program should take when the switch index equals i.
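The idea can be imitated in portable C with an array of function pointers. This is only an analogy under stated assumptions: a compiler-generated jump table holds the addresses of code labels inside one function and uses an indirect jump, whereas this sketch uses indirect calls to separate, hypothetical handler functions.

```c
/* Hypothetical handlers, one per case of the switch. */
static int add_one(int x)   { return x + 1; }
static int twice(int x)     { return x * 2; }
static int negate(int x)    { return -x; }
static int unchanged(int x) { return x; }   /* default case */

/* Entry i holds the address of the code for switch index i. */
static int (*const jump_table[])(int) = { add_one, twice, negate };

int dispatch(unsigned idx, int x) {
    unsigned n = sizeof jump_table / sizeof jump_table[0];
    if (idx >= n)                 /* index out of range: default case */
        return unchanged(x);
    return jump_table[idx](x);    /* one indexed lookup, then transfer */
}
```

The time to reach the right handler does not depend on the number of cases, which is the main advantage over a long chain of comparisons.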
A procedure call involves passing both data (in the form of procedure parameters and return values) and control from one part of a program to another. In addition, it must allocate space for the local variables of the procedure on entry and deallocate them on exit.
The portion of the stack allocated for a single procedure call is called a stack frame.
call Label       Procedure call
call *Operand    Procedure call
leave            Prepare stack for return
ret              Return from call
The effect of a call instruction is to push a return address on the stack and jump to the start of the called procedure. The return address is the address of the instruction immediately following the call in the program, so that execution will resume at this location when the called procedure returns. The ret instruction pops an address off the stack and jumps to this location. The leave instruction can be used to prepare the stack for returning.
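The push-and-jump behavior of call and the pop-and-jump behavior of ret can be mimicked with a toy model in C. Everything here is hypothetical and simplified: the machine_t type, the 16-entry stack, and the explicit instruction length are assumptions of the sketch, and it does no overflow checking.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Toy machine state: a small stack that grows toward lower indices,
   a stack pointer, and a program counter. */
typedef struct {
    uint32_t stack[STACK_WORDS];
    int sp;        /* index of the top element; STACK_WORDS when empty */
    uint32_t pc;   /* address of the instruction being executed */
} machine_t;

void machine_init(machine_t *m, uint32_t start) {
    m->sp = STACK_WORDS;
    m->pc = start;
}

/* call: push the address of the instruction after the call,
   then jump to the start of the called procedure. */
void do_call(machine_t *m, uint32_t target, uint32_t insn_len) {
    m->stack[--m->sp] = m->pc + insn_len;   /* return address */
    m->pc = target;
}

/* ret: pop the return address off the stack and jump to it. */
void do_ret(machine_t *m) {
    m->pc = m->stack[m->sp++];
}
```

With arbitrary illustrative addresses, a 5-byte call located at 0x080483dc pushes 0x080483e1, and the matching do_ret resumes execution there.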
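Register Usage Conventions: %eax, %edx, and %ecx are caller-save registers, so the callee may overwrite them. %ebx, %esi, and %edi are callee-save registers, so the callee must save them on the stack before overwriting them and restore them before returning. In addition, %ebp and %esp must be maintained according to the stack-frame conventions.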
Observe also that the overall size of a union equals the maximum size of any of its fields.
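A quick way to check this is with sizeof and offsetof. The union below is a hypothetical example. Note that a compiler may round the size of a union up to a multiple of its alignment, so the sketch only relies on the size being at least that of the largest field.

```c
#include <stddef.h>

/* Hypothetical union: all fields share the same storage. */
union u3 {
    char c;
    int i[2];
    double v;
};

/* Overall size of the union in bytes. */
size_t union_size(void) { return sizeof(union u3); }

/* Every field of a union starts at offset 0. */
size_t offset_of_v(void) { return offsetof(union u3, v); }
```

On typical IA32 and x86-64 Linux systems, where int is 4 bytes and double is 8 bytes, the largest fields here are 8 bytes.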
Linux follows an alignment policy where 2-byte data types (e.g., short) must have an address that is a multiple of 2, while any larger data types (e.g., int, int *, float, and double) must have an address that is a multiple of 4.
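The effect of such a policy on structure layout can be observed with offsetof. The struct below is a hypothetical example. The exact offsets depend on the platform, so the sketch assumes only that int is aligned to a multiple of its own size, as is typical on IA32 and x86-64 Linux.

```c
#include <stddef.h>

/* Hypothetical struct: the compiler inserts padding after c so that
   j starts at a properly aligned address. */
struct s1 {
    int i;
    char c;
    int j;
};

size_t offset_c(void) { return offsetof(struct s1, c); }
size_t offset_j(void) { return offsetof(struct s1, j); }
size_t size_s1(void)  { return sizeof(struct s1); }
```

With 4-byte int, the one-byte field c is followed by padding so that j does not start at the very next byte.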
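Pointers can also point to functions. For example, given the prototype
int fun(int x, int *p);
a pointer fp to a function of this type can be declared, assigned, and called as follows:
int (*fp)(int, int *);
fp = fun;
int y = 1;
int result = fp(3, &y);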
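A complete, compilable version of this pattern might look as follows. The body of fun is hypothetical, since the notes give only its prototype, and call_through_pointer is a helper name invented for the sketch.

```c
/* Hypothetical body: the notes give only the prototype of fun. */
int fun(int x, int *p) {
    *p += x;       /* update the value p points to */
    return *p;
}

int call_through_pointer(void) {
    int (*fp)(int, int *) = fun;   /* fp holds the address of fun */
    int y = 1;
    return fp(3, &y);              /* same effect as fun(3, &y) */
}
```

The parentheses around *fp are required: without them, the declaration would describe a function returning a pointer to int rather than a pointer to a function.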
Buffer overflow: a character array is allocated on the stack to hold a string, but the size of the string exceeds the space allocated for the array. The three techniques outlined (randomization, stack protection, and limiting which portions of memory can hold executable code) are among the most common mechanisms used to minimize the vulnerability of programs to buffer overflow attacks.
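The three mechanisms above are applied by the compiler and operating system. At the source level, the overflow itself is avoided by never copying more bytes than the destination can hold. The helper below is a minimal sketch of that idea; bounded_copy is a hypothetical name, not a standard library function.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst without writing past dst[cap - 1].
   Returns 1 if the whole string fit, 0 if it was truncated. */
int bounded_copy(char *dst, size_t cap, const char *src) {
    if (cap == 0)
        return 0;                 /* no room even for the terminator */
    size_t n = strlen(src);
    int fits = n < cap;
    if (!fits)
        n = cap - 1;              /* leave room for the terminator */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return fits;
}
```

With an 8-byte stack buffer, a 10-character input is truncated to 7 characters plus the terminating null byte instead of overwriting adjacent stack contents such as saved registers or the return address.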
Overview of x86-64: ...
• Pointers and long integers are 64 bits long. Integer arithmetic operations support 8-, 16-, 32-, and 64-bit data types.
• The set of general-purpose registers is expanded from 8 to 16.
• Much of the program state is held in registers rather than on the stack. Integer and pointer procedure arguments (up to 6) are passed via registers. Some procedures do not need to access the stack at all.
• Conditional operations are implemented using conditional move instructions when possible, yielding better performance than traditional branching code.
• Floating-point operations are implemented using the register-oriented instruction set introduced with SSE version 2, rather than the stack-based approach supported by IA32.
... Arguments (up to the first six) are passed to procedures via registers, rather than on the stack.