CS6038/CS5138 Malware Analysis, UC

Course content for UC Malware Analysis

5 March 2021

More Ghidra Code Analysis

by Coleman Kane


This is a continuation of the prior material on Ghidra. If you’re reading through this, it is recommended that you work through that module first.

Analyzing main Function

Returning to the main function that we labeled, we will focus our effort on the Decompiled source code.

main() Decompiled

For starters, the decompilation gives us a readable view of the C++ streams cin and cout being used, as well as the strings that were embedded within the source code that are being used with them.

It is worth noting that C++ treats the << operator as a function (in fact, most operators can be overloaded as functions in C++) that takes two arguments (the values on the left and right of the operator: left << right). When the source code is compiled, the compiler resolves these into function calls like operator<<(left, right), with left being our cout and right being the string to print to the screen. The return value of this function call is a reference to the left argument, which allows C++ to use the pretty “chaining” syntax you see:

cout << "first" << "second" << "third";

The above resolves into the following function call set, more or less.

operator<<(operator<<(operator<<(cout, "first"), "second"), "third");

In the above screenshot, the variable that Ghidra names this ends up holding the reference to cout returned by each call, and the calls are executed from inner to outer, ensuring the strings are displayed in the order you want. Ghidra actually breaks the nesting apart into one call per statement for your benefit, however:

    this = operator<<<std--char_traits<char>>((basic_ostream *)cout,"Your string \"");
    this = operator<<<char,std--char_traits<char>,std--allocator<char>>
                     (this,(basic_string *)local_38);
    this = operator<<<std--char_traits<char>>(this,"\" was NOT found!");
    operator<<((basic_ostream<char,std--char_traits<char>> *)this,endl<char,std--char_traits<char>>);
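
To see the equivalence concretely, the following minimal sketch writes the same three strings in three ways: the chained form, explicit nested calls, and one call per statement in the style of Ghidra's output. It writes to a std::ostringstream rather than cout so the result can be inspected, and the function names are illustrative only; none of them come from the analyzed sample.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Chained form, as it would appear in source code.
std::string chained() {
    std::ostringstream out;
    out << "first" << "second" << "third";
    return out.str();
}

// The same calls written as explicit, nested function calls.
std::string nested() {
    std::ostringstream out;
    std::operator<<(std::operator<<(std::operator<<(out, "first"), "second"), "third");
    return out.str();
}

// One call per statement, reusing the returned stream reference each time,
// similar to how the decompiler keeps reassigning its `this` variable.
std::string sequential() {
    std::ostringstream out;
    std::ostream *stream = &out;
    stream = &std::operator<<(*stream, "first");
    stream = &std::operator<<(*stream, "second");
    std::operator<<(*stream, "third");
    return out.str();
}
```

All three functions should produce the same string, which is why the flattened form in the decompiler can be read as if it were a single chained statement.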

Scan the Code Looking for Key System Function Calls

Using the above knowledge, you can observe that the main function calls getline as well as operator<<, indicating that the function provides some sort of interactive command-line interface to the user. Abstracted out, you’ll find this event-driven design common in a lot of malicious tools as well. It’s not always getline and cout: sometimes it is gets and printf, and in many cases it is send/recv or read/write.

Divide and Conquer

A great strategy here, though, is to break the code up into sections, using the getline prompts as a barrier. This gives us the opportunity to analyze smaller chunks individually, to decide what their function/purpose is within the larger program.

For example:

  local_c = 0;
  local_18 = param_2;
  local_10 = param_1;
  this = operator<<<std--char_traits<char>>
                   ((basic_ostream *)cout,
                    "Enter strings one per line for list. Empty line to terminate:");
  operator<<((basic_ostream<char,std--char_traits<char>> *)this,endl<char,std--char_traits<char>>);
  basic_string();
  do {
                    /* try { // try from 00401351 to 004014da has its CatchHandler @ 00401467 */
    getline<char,std--char_traits<char>,std--allocator<char>>
              ((basic_istream *)cin,(basic_string *)local_38);
    FUN_00401590(&DAT_00404348,local_38);
    lVar2 = size();
  } while (lVar2 != 0);

This fragment of code includes a cout write that prompts the user to provide strings, one per line. Following that is a do-while loop containing a call to getline, which fills the variable local_38; that variable is then passed to FUN_00401590. The size() function is then called and its result is saved in lVar2. Ghidra renders this as a bare size() call, but it is most likely the basic_string size() member function invoked on local_38, with the implicit object argument dropped from the decompiled output (the same goes for the basic_string() constructor call before the loop). To detect the “empty line”, the result is tested against zero to determine when to break out of the loop. This is basically a recipe for taking streaming chunked input (the lines of text) and running a processor, FUN_00401590, on each chunk.
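
Read back into source form, the loop might have looked roughly like the sketch below. This is a guess at the shape of the original and not the actual algo1.cpp: the StringList type and its add method are hypothetical stand-ins for the object at DAT_00404348 and for FUN_00401590, whose real behavior has not been analyzed yet, and the input stream is passed as a parameter (instead of using cin directly) only so the function can be exercised with canned input.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the global object at DAT_00404348.
struct StringList {
    std::vector<std::string> items;

    // Hypothetical stand-in for FUN_00401590; assumed here to store the line.
    void add(const std::string &line) { items.push_back(line); }
};

// Mirrors the decompiled do/while: read a line, hand it to the processing
// function, then test the line's size to decide whether to loop again.
void read_lines(std::istream &in, StringList &list) {
    std::string line;                 // corresponds to local_38
    do {
        std::getline(in, line);
        list.add(line);               // FUN_00401590(&DAT_00404348, local_38)
    } while (line.size() != 0);       // lVar2 = size(); ... while (lVar2 != 0)
}
```

Feeding it a std::istringstream that holds a few lines followed by an empty line is enough to step through the loop and compare each statement against the decompiled control flow.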

Aside: see if you can identify where I screwed up the iteration algorithm above. Think about how the original algo1.cpp code should have been written for this part to execute as intended, and keep that in mind.

If you right-click the FUN_00401590(...) call in the decompiler source code window, it will bring up a context menu that gives you some comment options. Ghidra supports marking up the disassembly listing with several types of comments, including Pre, Post, EOL (end-of-line), and Plate comments.

The Set… option brings up a dialog that allows you to edit any of the comments, while the quick choices only exist for Pre and Plate comments.

Comment Menu

Below is an example with all four populated:

Example Comments

  this = (basic_ostream *)
         operator<<((basic_ostream<char,std--char_traits<char>> *)cout,
                    endl<char,std--char_traits<char>>);
  this = operator<<<std--char_traits<char>>(this,"Now enter a string to find it in the list:");
  operator<<((basic_ostream<char,std--char_traits<char>> *)this,endl<char,std--char_traits<char>>);
  getline<char,std--char_traits<char>,std--allocator<char>>
            ((basic_istream *)cin,(basic_string *)local_38);
  bVar1 = FUN_00401600(&DAT_00404348,local_38);

The above code merely prompts the user to enter a string value; the input is read into local_38, and a reference to it is then passed to FUN_00401600. The prompt message suggests that the program will search for the entered string in the list, which is a good hint as to what FUN_00401600 does. The result of this call is saved in bVar1, which is used below to decide which of two code paths to take.

  if ((bVar1 & 1) == 0) {
    this = operator<<<std--char_traits<char>>((basic_ostream *)cout,"Your string \"");
    this = operator<<<char,std--char_traits<char>,std--allocator<char>>
                     (this,(basic_string *)local_38);
    this = operator<<<std--char_traits<char>>(this,"\" was NOT found!");
    operator<<((basic_ostream<char,std--char_traits<char>> *)this,endl<char,std--char_traits<char>>)
    ;
  }
  else {
    this = operator<<<std--char_traits<char>>((basic_ostream *)cout,"Your string \"");
    this = operator<<<char,std--char_traits<char>,std--allocator<char>>
                     (this,(basic_string *)local_38);
    this = operator<<<std--char_traits<char>>(this,"\" was found!");
    operator<<((basic_ostream<char,std--char_traits<char>> *)this,endl<char,std--char_traits<char>>)
    ;
  }

Inspecting both of the blocks above, it is clear that they form an if/else statement, which takes the first path if the value of bVar1 & 1 is 0, and the second path otherwise. Comparing the lines closely, you will see that the only difference between them is the message passed in the third operator<< call. In the first path (where bVar1 & 1 is 0), the message reports a failure to find the provided string, while in the second path, the message reports successfully finding the string.

The bVar1 & 1 construct tests only the lowest bit of the variable. This is how the compiler implemented the test of the C++ bool value here, and you will often observe this approach being used to implement boolean true/false logic in compiled C and C++ code.
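
As a small illustration, the two functions below express the same decision: one the way the decompiler renders it, testing the low bit of a byte-sized value, and one the way the source would have written it with a bool. Both function names are made up for this sketch.

```cpp
#include <cstdint>

// Decompiler-style test: look only at the lowest bit of the byte.
bool not_found_decompiled(std::uint8_t bVar1) {
    return (bVar1 & 1) == 0;
}

// Source-style test: the same condition written against a bool.
bool not_found_source(bool found) {
    return !found;
}
```

Converting true and false to a byte gives 1 and 0, so the two functions agree for every value a bool can hold.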

Side note: bool fields can sometimes be packed together as bit fields, in which case the 1 could instead be any power of 2 that fits within the machine word size. The same masking idiom then implements tests against bit field structures like this:

struct flags{
  bool enabled:1;
  bool active:1;
  bool occupied:1;
  ...
};
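
To make the mask idea concrete, the sketch below sets one field at a time in a struct like the one above and reads back the raw storage to see which bit changed. The Flags struct repeats only the three named fields, the helper names are invented for this example, and the exact bit assigned to each field is implementation-defined. Common x86-64 compilers typically yield 1, 2, and 4, but treat that as an expectation to verify rather than a guarantee.

```cpp
#include <cstddef>
#include <cstring>

struct Flags {
    bool enabled : 1;
    bool active : 1;
    bool occupied : 1;
};

// Copy the struct's raw storage into an integer so individual bits can be
// inspected, much like the masks that appear in decompiled code.
unsigned raw_bits(const Flags &f) {
    unsigned raw = 0;
    std::size_t n = sizeof(Flags) < sizeof(raw) ? sizeof(Flags) : sizeof(raw);
    std::memcpy(&raw, &f, n);
    return raw;
}

// Return a Flags value whose storage is entirely zero.
Flags zeroed() {
    Flags f;
    std::memset(&f, 0, sizeof f);
    return f;
}

// Each helper sets exactly one field and returns the resulting raw bits.
unsigned mask_for_enabled()  { Flags f = zeroed(); f.enabled = true;  return raw_bits(f); }
unsigned mask_for_active()   { Flags f = zeroed(); f.active = true;   return raw_bits(f); }
unsigned mask_for_occupied() { Flags f = zeroed(); f.occupied = true; return raw_bits(f); }
```

Printing the three return values on your own toolchain shows which constant to expect in a decompiled test such as (flags & 4) != 0 for a given field.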


tags: malware lecture c x86 x86-64 asm cfg ghidra