CS6038/CS5138 Malware Analysis, UC

Course content for UC Malware Analysis

28 February 2017

Numeric Data Encoding, Arrays, and Memory Analysis

This lecture introduces common methods of data representation and encoding at the machine level, focusing on the schemes used on x86-64. We cover numeric encoding, how characters are encoded as bytes, and how signed and unsigned numbers are stored.

We also discuss how data is organized in one-dimensional and multidimensional arrays, along with the nuances of pointer-indirect arrays (char **blah, for instance) and dynamically allocated multidimensional arrays.

We demonstrate using the GDB debugger to examine all of this data in memory at run time, using a compiled C++ program.

Here’s a good discussion on how Two’s Complement encoding is used to represent signed numbers:

Here’s some documentation about using GDB to analyze (eXamine) memory, using the x command, as well as help on using it for run-time analysis:

Example code from class:

Slides: lecture-w08-1.pdf (PDF)

Video: CS7038: Wk08.1 - Analysis of C Data Types

Code from class

#include <cstdio>
#include <iostream>

/* These are scalar (single-value) variables of increasing width. */
static unsigned char c_ex = 34;
static unsigned short s_ex = 11234;
static unsigned int i_ex = 0xfff00000;
static unsigned long l_ex = 1012312123;
static unsigned long long ll_ex = 101231212300;

/* Here's array data. */
const char c_arr_test[] = {'a', 'b', 'c', 'd'}; /* Exactly 4 bytes, no terminator */
const char str_test[] = "this is a string"; /* 16 bytes + 1 NUL terminator = 17 bytes total */
const int i_arr_test[] = {1, 2, 200, 2000, 4242, 40000, 121123112, -2000000000};

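/* Here's matrix data. Despite its name, i_2x2_test has 3 rows and 2 columns.
   It is stored row by row, so its bytes match the flat i_6_test array. */
const short i_2x2_test[3][2] = { {200, 400}, {300, 200}, {121, 527} };
const short i_6_test[6] = {200, 400, 300, 200, 121, 527};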

/* Here's indirect data. */
int *ptr_ex[3];

/* This is a data structure definition. Does not generate in-file bytes.
   #pragma pack(1) removes alignment padding, so the fields below are laid
   out back to back. */
#pragma pack(1)
struct test_struct_def {
  unsigned int ip_address;
  char modifier;
  unsigned short port;
};

/* This is a static instantiation that tells compiler to make bytes. */
struct test_struct_def s_test_data = {0x7f000001, /* Hex for 127.0.0.1 */
                                      'c',
                                      3000};

using namespace std;

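int main(int argc, char **argv) {
  /* Fill ptr_ex with three separately heap-allocated rows of 12 ints.
     Row c holds (i + 1) << c, so each row is double the previous one.
     The rows are never freed; the process exits right after. */
  for(int c = 0; c < 3; c++) {
    ptr_ex[c] = new int[12];
    for(int i = 0; i < 12; i++) {
      ptr_ex[c][i] = (i + 1)<<c;
    }
  }

  cout << "Test" << endl;
  /* Size of the packed struct: 7 with a 4-byte int and a 2-byte short. */
  cout << sizeof(struct test_struct_def) << endl;
  return 0;
}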

tags: malware lecture gdb static-analysis