CS6038/CS5138 Malware Analysis, UC

Course content for UC Malware Analysis

29 March 2020

Hunting on a System With Yara

by Coleman Kane

Using what we learned in the prior sessions on yara, along with the earlier work with forensic tools, we can use yara in various test cases to identify malware on a live system. As before, we will simulate the exercise using our test lab VM.

Setting Up

Pre-built yara for Windows (32-bit, compatible with our Win7 VM) is available at the following link:

Unfortunately, the above-linked file was built with Visual Studio 14.0, and its runtime DLLs are not installed in the Win7 VM that I distributed. You can download vc_redist.x86.exe from the following link. It must be installed before yara will run:

The first step is to run the vc_redist.x86.exe installer. No reboot of the VM is necessary, as the installer merely places DLLs into a system folder so that yara can find them when you run it later.

Once that is in place, you can unzip yara-v3.11.0-994-win32.zip into a folder of your choosing. For my example, I am going to create a new folder C:\Tools.

Additionally, I will use the signature developed in the previous yara lecture, xor_string_function.yar ver 3. This is the signature that contained each of the interesting blocks identified from the function as separate individual strings. Copy this signature into the C:\Tools folder (or whatever folder you chose) as well.

Analysis of Filesystem

Using yara to analyze the filesystem with this signature is relatively straightforward. Before compromising the VM with malware.exe from the mid-term assignment, you can run the following command from inside the Tools folder, using cmd.exe. Don’t use PowerShell, as the examples later in this lecture won’t be compatible with it:

.\yara32.exe -r xor_string_function.yar C:\

If everything works correctly, yara should run and you will begin to see error messages similar to below, indicating an inability to open some locked files in Windows (this is ok for our example, so these can be ignored):

error scanning C:\\pagefile.sys: could not open file
error scanning C:\\ProgramData\Microsoft\Crypto\RSA\MachineKeys\1642d22732a7c3a2
59dfde2136572545_e05d5194-ec82-467a-a386-f43e86500624: could not open file
error scanning C:\\ProgramData\Microsoft\Crypto\RSA\MachineKeys\309246ae4a30fa7c
697da11427becb65_e05d5194-ec82-467a-a386-f43e86500624: could not open file
error scanning C:\\ProgramData\Microsoft\Crypto\RSA\MachineKeys\473b078e63adefb2
8e0bbcd640721c56_e05d5194-ec82-467a-a386-f43e86500624: could not open file
error scanning C:\\ProgramData\Microsoft\Crypto\RSA\MachineKeys\0eb8017c304cb894
a356a06384240ba3_e05d5194-ec82-467a-a386-f43e86500624: could not open file
error scanning C:\\ProgramData\Microsoft\Crypto\RSA\MachineKeys\f4122e29a286d11b
a39b8e44d4c70169_e05d5194-ec82-467a-a386-f43e86500624: could not open file

This scan will take a long time to run, as it checks every single file on your VM hard disk against the signature that was developed in the last lecture. The error messages above identify a number of files that Windows has locked. These may be system files that can only have one reader, or files intended to be kept private from everything but the Windows kernel. I’ll show you how to suppress these messages for the sake of our example, but a more production-ready analysis would take an inventory of them on a “known clean” VM, so that during analysis you could identify any newly-locked files, which might reflect behavior attributable to the malware.
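As a rough illustration of that inventory idea, here is a minimal Python sketch (Python is my choice for the illustration, not something the lecture depends on). It assumes you saved yara's error output from each run to a text file, for example by ending the command with 2> baseline_errors.txt on the clean VM and 2> current_errors.txt later (both file names are hypothetical). It also assumes each message occupies a single line in the file, rather than being wrapped as in the console display above. The function names locked_files and newly_locked are made up for this example.

```python
import re

# Matches yara's "could not open file" error lines as shown above.
# The greedy (.+) keeps the drive-letter colon inside the captured path.
_LOCKED = re.compile(r"^error scanning (.+): could not open file$")


def locked_files(log_text):
    """Return the set of paths yara reported it could not open."""
    paths = set()
    for line in log_text.splitlines():
        match = _LOCKED.match(line.strip())
        if match:
            # Lowercase so that paths differing only in case compare equal.
            paths.add(match.group(1).lower())
    return paths


def newly_locked(baseline_log, current_log):
    """Paths locked in the current scan but not in the baseline scan."""
    return sorted(locked_files(current_log) - locked_files(baseline_log))
```

Passing the contents of the two files to newly_locked returns only the paths that the later scan failed to open, which are the ones worth a closer look.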

To suppress these error messages in cmd.exe, add 2> nul to the end of the line:

.\yara32.exe -r xor_string_function.yar C:\ 2> nul

Documentation of the redirection feature is here, on Microsoft’s site:

Next, copy the malware.exe from the mid-term lab assignment into the Tools folder as well, and then run it. If you want to, you can use your Kali VM to have the malware connect and give you a shell, but that isn’t truly necessary for this exercise. The malware will perform its installation activities and remain running in memory despite having no access to the Kali VM.

Once that is complete, you’ll be able to run the above command again and identify the malware, both where it was copied into the Tools folder manually, and also where it installed itself after running. The program will continue running until it has scanned all files, but for this demonstration it is sufficient to hit CTRL-C after the lines below are reported:

xor_string_function C:\\tools\malware.exe
xor_string_function C:\\Users\Public\Libraries\helpmesvc.exe

With that, you’ve successfully used one of the basic use-cases for yara. In an end-user environment, if you happen to discover malware on a user’s system and analyze it, you can generate a signature like we have done in the prior lectures, and then use a process similar to this as a basic investigative step to determine if the threat exists elsewhere within your environment. As with anything, while a positive discovery gives confidence that you’ve uncovered the threat elsewhere, execution of the command with no matching results doesn’t guarantee a system is clean.

Additionally, in the previous lecture I discussed testing your signature against a set of known-good files. Running the exercise we just walked through on a freshly-installed VM can yield a more comprehensive fidelity test. A VM also gives you an environment where you can install any of the software that is normally deployed on your employer’s or customers’ systems, so that it is included in the fidelity testing.
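A minimal Python sketch of tallying the results of such a fidelity test might look like the following. It assumes the scan's standard output was saved to a file (for example by appending > matches.txt, a hypothetical name, to the scan command) and that each match line has the default form shown earlier: the rule name, a space, then the path. The helper name parse_matches is invented for this example.

```python
from collections import defaultdict


def parse_matches(output_text):
    """Group yara match lines ("<rule> <path>") by rule name."""
    matches = defaultdict(list)
    for line in output_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Rule names cannot contain spaces, so split only on the first one;
        # the remainder is the path, which may itself contain spaces.
        rule, _, path = line.partition(" ")
        if path:
            matches[rule].append(path)
    return dict(matches)
```

On a known-clean VM, every path this returns is a candidate false positive for the corresponding rule.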

Analysis of Processes

Yara can also be used to scan the memory content of running processes. You’ll recall that the RevolutionShell backdoor sample that we’ve been using employs a number of string decryptions at run-time to decode data that is encrypted within the file. The unencrypted form of data such as this may be visible in process memory. Additionally, all or most of the file’s content on disk is likely to reside in memory as well. Some malware even goes so far as to encrypt whole sections of the code, resulting in a file on disk that consists mostly of gibberish. This makes direct analysis with a static tool like Ghidra difficult or impossible, limiting you to debuggers like Immunity.

A good place to start is identifying what your running tasks are. You can do this using the Task Manager, or using tasklist.exe from the command line. Here’s the top of the output from running tasklist.exe on my VM:

Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
System Idle Process              0 Services                   0         24 K
System                           4 Services                   0        636 K
smss.exe                       256 Services                   0        820 K
csrss.exe                      332 Services                   0      3,292 K
csrss.exe                      380 Console                    1      4,028 K
wininit.exe                    388 Services                   0      3,304 K
winlogon.exe                   416 Console                    1      4,596 K
services.exe                   476 Services                   0      7,156 K
lsass.exe                      484 Services                   0      7,144 K
lsm.exe                        492 Services                   0      2,784 K
svchost.exe                    592 Services                   0      6,568 K
VBoxService.exe                656 Services                   0      4,908 K

The table above provides several pieces of information. For our exercise here, the most important are the Image Name, the PID (process ID number), and the Mem Usage (which can give you an estimate of how long it will take to scan). Your process ID numbers may vary, as they are assigned on demand and small variations from boot to boot can cause numbers to get assigned in different orders. Yara can accept a process ID number instead of a filename on the command line. If the final argument to yara is a number, it will assume it to be a process ID. For example, I will scan the wininit.exe process with yara:

.\yara32.exe xor_string_function.yar 388

It runs for maybe a second or two and then exits with no output, indicating that, as expected, our signature doesn’t match the legitimate Microsoft wininit.exe utility.

Unfortunately, yara doesn’t have the capability to discover and scan all processes in memory; you must tell it specifically which process you want it to scan. However, there are some tricks up Microsoft’s sleeve that allow you to use the BAT file scripting language in cmd.exe to achieve this using the tasklist.exe program I demoed above. You can use a for loop, and tell tasklist.exe not to display the column header using the /NH “No Header” option. Furthermore, the "tokens=N" modifier for the for loop tells it to extract the column you wish (or several columns, using "tokens=N,M") and put it into the variable %i:

for /F "tokens=2" %i in ('tasklist.exe /NH') do @yara32.exe xor_string_function.yar %i 2> nul

After some run time, you should see the following output (likely with a different PID):

xor_string_function 1664

The above tells us that process ID 1664 matched our signature. In the command above, the for loop steps through every PID in the list (column, or token, #2), and for each of them it calls yara32.exe to scan it, using the signature created in the last lecture.
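The same loop can be sketched in Python, which some may find easier to extend. This is a minimal sketch under a few assumptions: the text passed in is the default table output of tasklist.exe /NH in the layout shown earlier, every row has a non-empty Session Name, and the helper names parse_tasklist and yara_commands are invented for this example. One difference from the for loop is that a row whose image name contains spaces (such as System Idle Process) is still parsed correctly, whereas "tokens=2" would pick up the word Idle for that row. The sketch only builds the command lines; it does not run yara.

```python
import re

# One row of tasklist's default table: image name, PID, session name,
# session number, and memory usage. The image name may contain spaces.
_ROW = re.compile(r"^(.+?)\s+(\d+)\s+(\S+)\s+(\d+)\s+([\d,]+) K$")


def parse_tasklist(text):
    """Return (image_name, pid) pairs from `tasklist /NH` table output."""
    rows = []
    for line in text.splitlines():
        match = _ROW.match(line.strip())
        if match:
            rows.append((match.group(1), int(match.group(2))))
    return rows


def yara_commands(rows, rule_file="xor_string_function.yar"):
    """Build one yara32.exe argument list per PID (nothing is executed)."""
    return [["yara32.exe", rule_file, str(pid)] for _, pid in rows]
```

Each returned argument list could then be handed to something like subprocess.run to perform the actual scan.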


tags: malware yara dynamic lecture