CS6038/CS5138 Malware Analysis, UC

Course content for UC Malware Analysis

8 February 2021

Assembly Language Crash Course (Pt. 2), A Deeper Dive

by Coleman Kane


This lecture focuses on the core internals of x86-64 assembly (the 64-bit extension of the 32-bit x86 instruction set).

The following topics are discussed:

Additionally, this lecture includes a demonstration using the in-class compiled program from Lecture W05.2. In this example, we examine a number of specific instructions that were generated when our C program was compiled. We discuss how the bytes generated in the file correspond to the components of an instruction, and then use a hex viewing utility to examine those instructions in their native context.
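As a rough illustration of how instruction bytes map onto instruction components, the sketch below splits one register-to-register MOV into its REX prefix, opcode, and ModRM fields. The three bytes are a hypothetical example (the encoding of movq %rax, %rbx), not bytes taken from the in-class program, and the helper name decode_mov_reg_reg is made up for this sketch. It handles only this one instruction form.

```python
# Hypothetical example bytes: the encoding of `movq %rax, %rbx`
# (Intel syntax: MOV RBX, RAX). Not taken from the in-class program.
CODE = bytes([0x48, 0x89, 0xC3])

# Register numbers 0..15 map to these 64-bit register names.
REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
             "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_mov_reg_reg(code):
    """Split a 3-byte REX.W + 0x89 + ModRM instruction into its fields."""
    rex, opcode, modrm = code[0], code[1], code[2]
    if rex & 0xF0 != 0x40:
        raise ValueError("expected a REX prefix (0x40-0x4F)")
    if opcode != 0x89:
        raise ValueError("this sketch only handles opcode 0x89 (MOV r/m, reg)")

    rex_w = (rex >> 3) & 1  # 1 = 64-bit operand size
    rex_r = (rex >> 2) & 1  # extends the ModRM reg field to 4 bits
    rex_b = rex & 1         # extends the ModRM r/m field to 4 bits
    if rex_w != 1:
        raise ValueError("this sketch only handles 64-bit operands (REX.W=1)")

    mod = modrm >> 6        # 0b11 = both operands are registers
    reg = (modrm >> 3) & 7  # source register number (low 3 bits)
    rm = modrm & 7          # destination register number (low 3 bits)
    if mod != 0b11:
        raise ValueError("this sketch only handles register-to-register forms")

    return {
        "rex_w": rex_w,
        "mod": mod,
        "src": REG_NAMES[(rex_r << 3) | reg],
        "dst": REG_NAMES[(rex_b << 3) | rm],
    }


fields = decode_mov_reg_reg(CODE)
print(fields)
```

The register numbers recovered from the ModRM byte are the same Rnn numbers listed in the register table (RAX is 0, RBX is 3), which is why the final byte 0xC3 (binary 11 000 011) names RAX as the source and RBX as the destination.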

Helpful reference tables

Slides: lecture-w06.pdf (PDF)

Video: CS7038: Wk06 - Assembly Language Crash Course

x86-64/32 Registers

| 64-bit register name | Rnn name | Name (description) | 32-bit register name | 16-bit register name | 8-bit high/low |
|---|---|---|---|---|---|
| RAX | R0 | Accumulator (frequently result storage) | EAX | AX | AH/AL |
| RCX | R1 | Counter (frequently used as i in iterations/loops) | ECX | CX | CH/CL |
| RDX | R2 | Data (frequently used as additional arg in operations) | EDX | DX | DH/DL |
| RBX | R3 | Base (frequently used as base address or counter) | EBX | BX | BH/BL |
| RSP | R4 | Stack Pointer (used internally to keep track of top of CPU stack) | ESP | SP | |
| RBP | R5 | Base Pointer (frequently used to keep track of the base of the current stack frame) | EBP | BP | |
| RSI | R6 | Source Index (keeps track of indexes in source arrays) | ESI | SI | |
| RDI | R7 | Destination Index (keeps track of indexes in destination arrays) | EDI | DI | |
| R8-R15 | R8-R15 | Additional general-purpose registers only available in 64-bit mode | | | |

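For many of the registers, you may access fragments of them using their synonyms. Writing to AH, for example, modifies bits 15..8 of register RAX (and therefore of EAX and AX as well). The fragments address the following bit ranges in their respective registers: the 32-bit name (EAX) covers bits 31..0, the 16-bit name (AX) covers bits 15..0, the high byte (AH) covers bits 15..8, and the low byte (AL) covers bits 7..0.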
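The aliasing between a 64-bit register and its fragments can be sketched in Python as bit-range reads and writes on a single 64-bit value. The class name Reg64 and its methods are hypothetical names invented for this illustration. One caveat: the model treats every fragment write as touching only its own bits, which matches the 8-bit and 16-bit fragments, but on real x86-64 hardware a write to a 32-bit name such as EAX also zeroes bits 63..32, which this sketch does not model.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


class Reg64:
    """Toy model of one 64-bit register with bit-range views (hypothetical)."""

    def __init__(self, value=0):
        self.value = value & MASK64

    def read(self, hi, lo):
        """Return bits hi..lo (inclusive) as an unsigned integer."""
        width = hi - lo + 1
        return (self.value >> lo) & ((1 << width) - 1)

    def write(self, hi, lo, data):
        """Overwrite bits hi..lo (inclusive), leaving all other bits alone."""
        width = hi - lo + 1
        mask = ((1 << width) - 1) << lo
        self.value = (self.value & ~mask) | ((data << lo) & mask)


rax = Reg64(0x1122334455667788)

eax = rax.read(31, 0)  # 32-bit view
ax = rax.read(15, 0)   # 16-bit view
ah = rax.read(15, 8)   # high byte of AX
al = rax.read(7, 0)    # low byte of AX
print(hex(eax), hex(ax), hex(ah), hex(al))

# Writing AH changes only bits 15..8 of the full register.
rax.write(15, 8, 0xAB)
print(hex(rax.value))
```

After the write, every wider view that contains bits 15..8 (AX, EAX, RAX) reflects the new value, while AL is unchanged.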

There are also a number of limited-use, system, and context-specific registers that are part of the architecture. I will not go into these at this time.

x86 Addressing Modes

The x86 instruction set also has a number of addressing modes. These are instruction variants that provide various mechanisms for working with constant (hard-coded) data, data present in registers, and data present in memory.

| Addr. mode name | AT&T/UNIX example | Intel/Microsoft equivalent | Description of example |
|---|---|---|---|
| Immediate | movq $0x0a, %rbx | MOV RBX, 0Ah | Copy constant 0x0a into register RBX |
| Register | movq %rax, %rbx | MOV RBX, RAX | Copy data from RAX into RBX |
| Direct Memory | movq 0x1000, %rbx | MOV RBX, QWORD PTR [1000h] | Copy 64-bit data from memory loc. 0x1000 into RBX |
| Register Indirect | movq (%rbx), %rbp | MOV RBP, [RBX] | Copy 64-bit data from memory loc. stored in RBX into RBP |
| Register Indirect w/ displacement | movq 0x08(%rbx), %rbp | MOV RBP, 08h[RBX] | Copy 64-bit data from memory location RBX+0x08 into RBP |
| Scaled Register Indirect w/ displacement | movq 0x08(,%rbx,4), %rbp | MOV RBP, 08h[RBX*04h] | Copy 64-bit data from memory location [(RBX*0x04)+0x08] into RBP |
| Scaled Register Indirect w/ displacement + base reg. | movq 0x08(%rsi,%rbx,4), %rbp | MOV RBP, 08h[RSI][RBX*04h] | Copy 64-bit data from memory location [(RBX*0x04)+RSI+0x08] into RBP |
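The memory addressing modes in the table all reduce to one formula: address = base + (index * scale) + displacement, with any unused part treated as zero. The following is a minimal Python sketch of that arithmetic. The function name effective_address and the register values chosen for RBX and RSI are assumptions made for the example, not values from the lecture.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def effective_address(disp=0, base=0, index=0, scale=1):
    """Compute base + index*scale + disp, wrapped to 64 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only allows a scale of 1, 2, 4, or 8")
    return (base + index * scale + disp) & MASK64


# Assumed register contents, chosen only for illustration.
rbx = 0x2000
rsi = 0x100

# movq (%rbx), %rbp              -> register indirect
addr_indirect = effective_address(base=rbx)

# movq 0x08(%rbx), %rbp          -> register indirect w/ displacement
addr_disp = effective_address(disp=0x08, base=rbx)

# movq 0x08(,%rbx,4), %rbp       -> scaled register indirect w/ displacement
addr_scaled = effective_address(disp=0x08, index=rbx, scale=4)

# movq 0x08(%rsi,%rbx,4), %rbp   -> scaled w/ displacement + base register
addr_full = effective_address(disp=0x08, base=rsi, index=rbx, scale=4)

for name, addr in [("indirect", addr_indirect), ("disp", addr_disp),
                   ("scaled", addr_scaled), ("full", addr_full)]:
    print(f"{name:8s} {addr:#x}")
```

In the AT&T form disp(base,index,scale), the arguments map directly onto the function's parameters, which can make the more compact syntax easier to read when examining disassembly.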


tags: malware lecture c x86 x86-64 asm cfg decompilation