Static Analysis of Compromised VM
by Coleman Kane
Though host forensic analysis is often its own subject space, it is a vital component for malware analysis. After all, part of understanding malware is attempting to understand how it behaves on a running system. For this week, we will review some of the data sources in a Windows environment where we are expecting to find evidence of malware and its actions. We will also go over some tools that exist out there to help analyze an environment after malware has run within it.
Common Data Sources for Malware Analysis
On a Windows system, there are a number of data sets that will contain evidence of malware. SANS has a fairly good poster that discusses them all: SANS Windows Forensic Analysis poster. However, we will drill down into some specific elements (some of which aren’t captured in the poster):
- Windows Registry
- NTFS Master File Table
- Windows Event Log
- Browser profiles
- System Memory
Windows Registry
This article on Microsoft’s website discusses the Windows Registry. In short, you can consider the Windows Registry to be a big database of nested groups of key/value pairs, which store settings and metadata about all software, processes, and users of the system. In modern Windows systems, the registry is divided into distinct “hives” of data. The following is a list of each of them, but the linked documentation contains much more valuable detail. These are each given a “hive key” to identify them on the system:
HKCU: Configuration data for the current user
HKU: Sets of configuration data for each user registered on the system
HKLM: Configuration data that is system-specific (not user-specific)
HKCR: Data mapping file classes/types to the software intended to open them
HKCC: Metadata about the host’s HW configuration
In effect, any configuration change you can make in the Windows GUI will be stored somewhere in here, from the list of documents in your recent MS Office history to the changes you make to a network interface. Thus, the Windows registry is a great place to find evidence that malware or an intruder has changed the system configuration for their own nefarious purposes.
The unfortunate situation with the Windows Registry is that the files on disk are binary database files and therefore don’t lend themselves to direct investigation by a human analyst. On the Windows system, you may use the reg.exe command-line tool (which you may have gotten a preview of in Lab 2), or the GUI Registry Editor (regedit.exe).
The following documentation gives examples on using both of these methods to extract the binary content of the Windows Registry into a human-readable (and machine parseable) form:
In addition to the key/value pairs, the Windows registry maintains data type information for each value, as well as permissions and a last-write timestamp on each key. This can be a forensic gold mine if you don’t know what you’re looking for but you are certain when it happened.
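Once the registry has been exported to text, it can be searched with ordinary scripting. The following is a minimal sketch, assuming a `.reg`-style text export in which each key appears in square brackets and is followed by `name=data` lines. The function name `parse_reg_export` and the sample content (including the `updater.exe` path) are hypothetical and only illustrate the idea; real exports are usually UTF-16 encoded and should be decoded accordingly before parsing.

```python
def parse_reg_export(text):
    """Parse .reg-style export text into {key_path: {value_name: raw_data}}."""
    keys = {}
    current = None
    # Long values are wrapped with a trailing backslash; rejoin them first.
    joined = text.replace("\\\r\n", "").replace("\\\n", "")
    for line in joined.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            # A bracketed line starts a new key
            current = keys.setdefault(line[1:-1], {})
        elif current is not None and "=" in line:
            # Simplification: splits on the first '=', so value names that
            # themselves contain '=' are not handled.
            name, _, data = line.partition("=")
            current[name.strip('"')] = data
    return keys


# Hypothetical sample resembling an export of a per-user autorun key
SAMPLE = r'''Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run]
"Updater"="C:\\Users\\IEUser\\AppData\\Roaming\\updater.exe"
'''

for key, values in parse_reg_export(SAMPLE).items():
    for name, data in values.items():
        print(key, name, data)
```

The data is kept in its raw exported form (quotes and escaped backslashes included), so nothing is lost before an analyst decides how to interpret each data type.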
NTFS Master File Table
The NTFS file system manages its contents through a directory that’s stored alongside the file content data. This directory maintains file names, directory locations, timestamps, and other information about the files. Like the registry hives described above, the NTFS filesystem itself is a code-optimized binary database of arbitrary files across the system, and manual analysis is impractical. Windows does offer various capabilities for searching through the Windows Search feature, and some common methods from the command line are as follows:
dir /s C:\
Or a trick to identify directories that might be protected or have a system-specific purpose:
attrib /s /d C:\ > all-files-and-dirs.txt
attrib /s C:\ > all-files-no-dirs.txt
There are also a number of tools that can extract more details from the MFT, by accessing the device directly. As the files on disk are organized into chunks, the NTFS Master File Table helps inform the system about where to find the different chunks of each file spread across the disk. Thus, the files don’t live in one place on disk, but actually are stored piecemeal all over the disk, requiring special tools to reconstruct them in the event that file recovery from a raw disk copy is needed.
The following documentation describes the NTFS filesystem layout, and contains further documentation specifically on the MFT within it:
- Linux-NTFS Documentation
- $MFT(0) Master File Table Documentation
- NTFS General Info
- NTFS Documentation (PDF)
We specifically drill into the Master File Table, as it stores the metadata about the files on disk. So, ignoring the content of files for a moment, it can provide a log of filesystem activity performed.
A tool that quickly collects and tabulates the content of the MFT is Mft2Csv.
Using the above tool, we can collect a summary of the MFT in the current VM to disk, one row per file:
Mft2Csv.exe /Volume:c: /ExtractResident:1 /OutputPath:\\VBOXSVR\sharedfolder\ /TimeZone:-5 /OutputFormat:all /ScanSlack:1
Additionally, a common alternative format is the log2timeline format, which lists all of the timestamped events as single rows (so, multiple rows per file), to give you a timeline to review filesystem activity events using timestamp information rather than file path information:
Mft2Csv.exe /Volume:c: /ExtractResident:1 /OutputPath:\\VBOXSVR\sharedfolder\ /TimeZone:-5 /OutputFormat:l2t /ScanSlack:1
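A timeline in this form is easy to narrow down to a period of interest. The sketch below assumes the l2t output is a CSV file with a header row containing `date` (MM/DD/YYYY) and `time` (HH:MM:SS) columns, as in the log2timeline CSV layout; check the header of your own output before relying on this. The function name `rows_in_window` and the sample rows are hypothetical, and the sample uses only a subset of the columns a real file would have.

```python
import csv
import io
from datetime import datetime


def rows_in_window(csv_file, start, end):
    """Yield (timestamp, row) for timeline rows with start <= timestamp <= end."""
    for row in csv.DictReader(csv_file):
        # Assumed column names and formats: date=MM/DD/YYYY, time=HH:MM:SS
        stamp = datetime.strptime(
            f"{row['date']} {row['time']}", "%m/%d/%Y %H:%M:%S"
        )
        if start <= stamp <= end:
            yield stamp, row


# Made-up rows for illustration only
SAMPLE = """date,time,MACB,filename
01/15/2020,17:40:08,M...,C:/Users/IEUser/Downloads/example1.zip
01/26/2020,03:35:04,...B,C:/Users/IEUser/Downloads/example2.exe
"""

hits = list(
    rows_in_window(
        io.StringIO(SAMPLE),
        datetime(2020, 1, 26, 0, 0, 0),
        datetime(2020, 1, 26, 23, 59, 59),
    )
)
for stamp, row in hits:
    print(stamp.isoformat(), row["MACB"], row["filename"])
```

For a real file, pass an open file object (for example `open(path, newline="")`) in place of the `io.StringIO` stand-in.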
Windows Event Logs
Any program or service running in Windows that needs to report status, failure, progress, or any other information to the OS will typically report this using the Windows Event Log subsystem. Occasionally, you’ll have some applications that write to flat log files on disk, but the prevailing data store for event logs is the Windows Event Log.
The following page introduces the analyst to Windows Event Log and also documents some of the event types that will be helpful in analyzing activity on a system.
The Windows Event Log is a general purpose event logging system for Windows. Much as you can write arbitrary logs into /var/log on a Linux system, any application can write its event logs into Windows. Microsoft released a useful forensic monitoring utility named Sysmon, which monitors the system once installed, and reports events into the log about system activity as it occurs.
Similarly, all manner of other applications installed on Microsoft Windows may do the same, such as IIS, Microsoft Office, and others.
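Event log records can be exported as XML, which makes them straightforward to process offline. The following sketch assumes a single event in the standard Windows event XML schema; the sample record is made up to resemble a Sysmon process-creation event (Event ID 1), and the function name `summarize_event` is hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace used by Windows event records
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def summarize_event(xml_text):
    """Flatten one event record into a dict of header and EventData fields."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    summary = {
        "provider": system.find("e:Provider", NS).get("Name"),
        "event_id": int(system.find("e:EventID", NS).text),
        "time": system.find("e:TimeCreated", NS).get("SystemTime"),
    }
    # Each <Data Name="..."> element carries one named field of the event
    for data in root.findall("e:EventData/e:Data", NS):
        summary[data.get("Name")] = data.text
    return summary


# Illustrative record only; not taken from a real system
SAMPLE = r"""<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-Sysmon"/>
    <EventID>1</EventID>
    <TimeCreated SystemTime="2020-01-26T03:35:04.000Z"/>
  </System>
  <EventData>
    <Data Name="Image">C:\Windows\System32\cmd.exe</Data>
    <Data Name="CommandLine">cmd.exe /c whoami</Data>
  </EventData>
</Event>"""

event = summarize_event(SAMPLE)
print(event["time"], event["event_id"], event["Image"], event["CommandLine"])
```

Flattening each record into a dictionary like this makes it simple to filter on event ID or to sort many events into a timeline.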
Browser Profiles
Browser profiles are managed by each of the different browsers in different, application-specific locations. In addition to maintaining history, web browsers will also store cookies, account information, form submission data, and other information that helps optimize the user experience. Additionally, things like browser extensions and plugins are also managed in these locations, so any browser-based malware would be installed in these areas.
For Google Chrome:
C:\Users\<username>\AppData\Local\Google\Chrome\User Data\Default
C:\Users\<username>\AppData\Local\Google\Chrome\User Data\Default\Cache
For Mozilla Firefox:
C:\Users\<username>\AppData\Roaming\Mozilla\Firefox\Profiles\xxxxxxxx.default
C:\Users\<username>\AppData\Local\Mozilla\Firefox\Profiles\<profile folder>\cache2
For MS Edge:
C:\Users\<username>\AppData\Local\Packages\<package name>\AC\MicrosoftEdge\User\Default\Favorites
C:\Users\<username>\AppData\Local\Packages\<package name>\AC\MicrosoftEdge\User\Default\Recovery
C:\Users\<username>\AppData\Local\Packages\<package name>\AC\MicrosoftEdge\User\Default\DataStore
C:\Users\<username>\AppData\Local\Microsoft\Windows\WebCache
While not necessarily key to malware analysis, analyzing this information can be helpful in learning where malware came from. As email defense tooling has become more advanced, convincing the user to download malware via their web browser (and to grant consent) has become a significantly more popular delivery technique.
The company Foxton Forensics offers a free browser history collection tool.
Another one that is Firefox/Mozilla-focused and is more exhaustive is Dumpzilla.
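Much of this browser data can also be read directly. As a minimal sketch: Chrome keeps its browsing history in an SQLite database (a file named History inside the profile folder), with a `urls` table whose `last_visit_time` column counts microseconds since 1601-01-01 UTC. The function name `recent_urls` is hypothetical, and the example below runs against an in-memory stand-in table with one made-up row rather than a real profile.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Chrome timestamps count microseconds from this epoch
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def recent_urls(conn, limit=20):
    """Return (last_visit_utc, url, title) tuples, most recent first."""
    rows = conn.execute(
        "SELECT url, title, last_visit_time FROM urls "
        "ORDER BY last_visit_time DESC LIMIT ?",
        (limit,),
    )
    return [
        (WEBKIT_EPOCH + timedelta(microseconds=ts), url, title)
        for url, title, ts in rows
    ]


# In-memory stand-in for a copied History file, holding one made-up visit
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (url TEXT, title TEXT, last_visit_time INTEGER)")
conn.execute(
    "INSERT INTO urls VALUES (?, ?, ?)",
    ("https://nmap.org/download.html", "Download", 13224483304000000),
)

for visited, url, title in recent_urls(conn):
    print(visited.isoformat(), url, title)
```

When working with a real profile, it is safer to analyze a copy of the History file, since the browser may hold a lock on the original while it is running.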
System Memory
Finally, system memory is present on every system and, in the end, malware needs to be decoded into a machine readable format in memory in order for it to be effective. Due to this, collection and analysis of system memory still remains an important malware analysis technique.
For Virtual Machines, there are really three approaches to collecting system memory. Collection inside of the VM is possible in two ways, and will enable the collection of memory as the OS and applications see it. Collection of memory using the VM hypervisor will enable collection of memory transparently to the OS (and thus, any rootkit), at the expense of memory not being readily organized in the view the OS or Application has.
To collect system memory, we will make use of the Rekall Memory Forensic Framework, published by Google. This tool provides numerous capabilities for live memory analysis and collection. On the Windows system, we will use the winpmem tool to collect memory images:
winpmem-2.1.post4.exe --output memdump.aff4 --format raw
This will export a raw memory dump into the archive named memdump.aff4. This file format is a specialized ZIP64 format file, and the version of unzip on your Kali installations should be able to extract the file from inside the archive:
unzip memdump.aff4 PhysicalMemory
mv PhysicalMemory memdump1.raw
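If unzip is not available, the same extraction can be sketched with Python's standard zipfile module, which handles ZIP64 archives. This assumes, as the commands above do, that the memory image is stored under the member name PhysicalMemory; the function name `extract_member` is hypothetical.

```python
import shutil
import zipfile


def extract_member(archive_path, member, dest_path):
    """Stream one member of a ZIP/ZIP64 archive to dest_path."""
    with zipfile.ZipFile(archive_path) as archive:
        # Copy in 1 MiB chunks so a multi-gigabyte image is never
        # held in memory all at once.
        with archive.open(member) as src, open(dest_path, "wb") as dst:
            shutil.copyfileobj(src, dst, length=1024 * 1024)
    return dest_path


# Example call, mirroring the unzip and mv commands:
# extract_member("memdump.aff4", "PhysicalMemory", "memdump1.raw")
```

Calling `zipfile.ZipFile(archive_path).namelist()` first is a quick way to confirm which member names the archive actually contains.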
Additionally, the volatility suite provides some tools for offline memory analysis. Given the PhysicalMemory file described above, volatility can be used to analyze it using various plugins. One common example is using volatility to report the type of OS:
volatility -f PhysicalMemory imageinfo
Volatility Foundation Volatility Framework 2.6
INFO : volatility.debug : Determining profile based on KDBG search...
Suggested Profile(s) : Win7SP1x86_23418, Win7SP0x86, Win7SP1x86_24000, Win7SP1x86
AS Layer1 : IA32PagedMemoryPae (Kernel AS)
AS Layer2 : FileAddressSpace (/mnt/PhysicalMemory)
PAE type : PAE
DTB : 0x185000L
KDBG : 0x82766c28L
Number of Processors : 2
Image Type (Service Pack) : 1
KPCR for CPU 0 : 0x82767c00L
KPCR for CPU 1 : 0x807c1000L
KUSER_SHARED_DATA : 0xffdf0000L
Image date and time : 2020-01-26 04:08:14 UTC+0000
Image local date and time : 2020-01-25 23:08:14 -0500
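The suggested profile names are what later plugin runs need for their --profile argument, so it can be handy to pull them out of saved imageinfo output. The following is a small sketch that assumes the output keeps one field per line, as volatility prints it; the function name `suggested_profiles` is hypothetical, and the sample is an excerpt of the output above.

```python
import re


def suggested_profiles(imageinfo_output):
    """Return the profile names listed on the 'Suggested Profile(s)' line."""
    match = re.search(r"Suggested Profile\(s\)\s*:\s*(.+)", imageinfo_output)
    if not match:
        return []
    # Names are comma-separated on a single line
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


SAMPLE = """\
Suggested Profile(s) : Win7SP1x86_23418, Win7SP0x86, Win7SP1x86_24000, Win7SP1x86
AS Layer1 : IA32PagedMemoryPae (Kernel AS)
"""

print(suggested_profiles(SAMPLE))
```

Which of the suggested profiles is the best match still has to be judged by the analyst; the sketch only extracts the candidates.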
Recover the cmd.exe command history buffer (most recently typed commands):
volatility -f /mnt/PhysicalMemory --profile=Win7SP1x86 cmdscan
Volatility Foundation Volatility Framework 2.6
**************************************************
CommandProcess: conhost.exe Pid: 3600
CommandHistory: 0x1effe8 Application: powershell.exe Flags: Allocated, Reset
CommandCount: 10 LastAdded: 9 LastDisplayed: 9
FirstCommand: 0 CommandCountMax: 50
ProcessHandle: 0x5c
Cmd #0 @ 0x1e8b20: cd \\VBOXSVR\
Cmd #1 @ 0x1f5260: cd \\VBOXSVR\vmshare\
Cmd #2 @ 0x1eefe8: dir
Cmd #3 @ 0x1f5298: .\winpmem-2.1.post4.exe
Cmd #4 @ 0x1ccd70: .\winpmem-2.1.post4.exe --help
Cmd #5 @ 0x1f13c0: .\winpmem-2.1.post4.exe --output winpdump.dmp --format raw --pagefile c:\pagefile.sys
Cmd #6 @ 0x1ef008: dir
Cmd #7 @ 0x1cbd28: del pmemdump.dmp
Cmd #8 @ 0x1f10e0: del .\winpdump.dmp
Cmd #9 @ 0x1e5ac8: .\winpmem-2.1.post4.exe --output winpdump.dmp --format raw
Cmd #36 @ 0x1b00c4: ???
Cmd #37 @ 0x1edb40: ????
**************************************************
CommandProcess: conhost.exe Pid: 3600
CommandHistory: 0x1f1118 Application: winpmem-2.1.post4.exe Flags: Allocated
CommandCount: 0 LastAdded: -1 LastDisplayed: -1
FirstCommand: 0 CommandCountMax: 50
ProcessHandle: 0x8c
Look in memory for IE history artifacts:
volatility -f /mnt/PhysicalMemory --profile=Win7SP1x86 iehistory
...
**************************************************
Process: 1820 explorer.exe
Cache type "URL " at 0x30a5480
Record length: 0x100
Location: :2020011320200120: IEUser@http://www.bing.com/
Last modified: 2020-01-15 17:42:00 UTC+0000
Last accessed: 2020-01-26 03:35:04 UTC+0000
File Offset: 0x100, Data Offset: 0x0, Data Length: 0x0
**************************************************
Process: 1820 explorer.exe
Cache type "URL " at 0x30a5580
Record length: 0x100
Location: :2020011320200120: IEUser@https://nmap.org/download.html
Last modified: 2020-01-15 17:41:02 UTC+0000
Last accessed: 2020-01-26 03:35:04 UTC+0000
File Offset: 0x100, Data Offset: 0x0, Data Length: 0x0
**************************************************
Process: 1820 explorer.exe
Cache type "URL " at 0x30a5680
Record length: 0x100
Location: :2020011320200120: IEUser@http://www.bing.com/search?q=netcat+download+windows&go=Submit+Query&qs=ds&form=QBLH
Last modified: 2020-01-15 17:40:08 UTC+0000
Last accessed: 2020-01-26 03:35:04 UTC+0000
File Offset: 0x100, Data Offset: 0x0, Data Length: 0x0
**************************************************
...
VirtualBox Memory Collection
The VirtualBox environment itself can dump the contents of virtual RAM to a file, for analysis as well. VirtualBox will store this as an ELF-format “Core dump” file, which can help facilitate analysis of it using common debugging tools.
Chapter 8 of the VirtualBox documentation covers the debugvm option of VBoxManage, which performs various debugging-related analyses. One of these is exporting RAM to a core file, using the dumpvmcore subcommand:
VBoxManage debugvm <vmname> dumpvmcore --filename=<filename.dmp>
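Before handing the dump to other tools, it can be worth confirming that the file really is an ELF core file. The sketch below checks only the ELF magic bytes and the e_type header field (the value 4 denotes a core file in the ELF specification); it does not validate the rest of the file. The function name `is_elf_core` is hypothetical, and the demonstration uses a hand-built header instead of a real dump.

```python
import struct

ET_CORE = 4  # e_type value for core files in the ELF specification


def is_elf_core(header):
    """Return True if the first bytes of a file look like an ELF core dump."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        return False
    # e_ident[5] (EI_DATA) gives the byte order: 1 = little, 2 = big endian
    byte_order = {1: "<", 2: ">"}.get(header[5])
    if byte_order is None:
        return False
    # e_type is the 16-bit field immediately after the 16-byte e_ident
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return e_type == ET_CORE


# Hand-built header: 64-bit, little-endian, e_type = ET_CORE
demo_header = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9) + struct.pack("<H", ET_CORE)
print(is_elf_core(demo_header))
```

For a real dump, read the first 18 bytes of the file in binary mode and pass them to the function.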
The following page in Volatility’s GitHub documentation explains how to use this core dump to perform analysis with volatility:
Plaso “super timeline”
Plaso is a feature-rich tool that was built by the author of log2timeline, which was very popular for a long time. That tool was originally written in Perl, and the author decided to refactor and update it while rewriting it in Python, a more modern and popular language.
Plaso performs a very exhaustive analysis, and also requires that you either export a full copy of the disk image, or that you reboot the system into Linux and mount the NTFS partition somewhere accessible so that it can perform its analysis. It is very thorough, and therefore can take hours to complete a single job. For example, it will go through all of the files on a Windows system, and if it identifies files that would contain significant information to add to a timeline, it will dive into those to extract data. Some examples are archive inspectors, browser metadata inspectors, syslog analyzers, and SQLite database analyzers.
Today, it consists of multiple tools, with log2timeline being the primary front-end tool, and with the preference leaning toward psteal as the newer front-end. You can read about all of these tools in the Plaso documentation.
Mft2Csv, discussed earlier, is capable of exporting into the l2t format, as well as a second timeline format, both of which are compatible with these tools.
Though this is a popular utility, I won’t be walking through its full functionality in class, due to the length of time it can take and the cost that can incur if examples or labs have to be redone. We may end up working with some of the output, or running the tool on some limited examples. That being said, I do strongly recommend reading about it.