Introduction to Course and VirtualBox
by Coleman Kane
This course will cover the analysis techniques that help us understand, and thus, detect and prevent malware. Learning from the adversary is core to the science of Cyber Security, and malware analysis is a key part of this - much as malware is a key part of most adversary toolboxes.
Background on Malware Analysis
Though many of the examples used in this course involve benign artifacts designed to help us learn malware analysis techniques, we may also analyze real malware from the wild (limited to artifacts from long-inactive campaigns). Much as a bomb technician uses special containment equipment, or a doctor uses gear that minimizes the risk of contamination from disease, we will use technology to apply similar containment measures in this course.
There are many good reasons for this; here are some of them:
- Accidents do happen - you wouldn’t be the first person to accidentally double-click malware while trying to copy it.
- Anti-virus avoidance: you want neither to fight with AV on your host, nor to disable the AV and other protections on your host system.
- You are opening up complex, unknown files, using analysis tools that are widely known. Are you certain your tools don’t have inherent vulnerabilities?
- Often your system configuration will differ from that of the users whom you are protecting. Setting up a lab allows us to replicate their environment without having to adopt it.
What is Malware Analysis?
One common definition describes malware as hardware, firmware, or software that is intentionally included or inserted in a system for a harmful purpose. This definition could include peripherals, whole applications, libraries, and even fragments of code inserted into running legitimate processes - and, frequently, any combination thereof.
In this class we will cover both dynamic and static analysis techniques. I typically like to look at malware analysis as answering some specific questions - when working in the field, it will be important to determine what ultimate questions your analysis is expected to answer. A person could literally spend their entire life analyzing the same malware sample, and coming up with new findings every week, if the analysis effort isn’t scoped to what questions it needs to answer.
Static analysis is the practice of analyzing the contents and structure of malware to identify characteristics that could be searched for in an environment - I often think of this as answering the question “What does it look like?”. The findings from static analysis are typically intended to identify malware regardless of whether or not it is currently running on a system. The benefit of static analysis findings is that they give you an opportunity to detect & protect against the analyzed malware before it becomes an immediate risk to your environment. Static analysis can be used, for instance, to proactively scan for malware sitting in a shared team folder, or malware that a website is serving to victims.
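As a rough illustration of the “What does it look like?” question, the sketch below computes a SHA-256 digest of a file and pulls out its printable ASCII strings, without ever executing it. The function name `static_triage` and the `min_len` parameter are made up for this example; real static analysis tooling goes much further than this.

```python
import hashlib
import re
from pathlib import Path


def static_triage(path, min_len=4):
    """Return (sha256_hex, printable_strings) for a file. Hypothetical helper."""
    # The file is only read as bytes; it is never executed or imported.
    data = Path(path).read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    # Runs of at least min_len printable ASCII bytes, similar in spirit
    # to what the `strings` utility reports.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    strings = [match.decode("ascii") for match in pattern.findall(data)]
    return digest, strings
```

The digest gives you a value you can search for across an environment, while the extracted strings (URLs, file names, messages) are often the first candidates for detection signatures.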
Dynamic analysis is typically used to discover how a particular piece of malware behaves when running on a system. I like to think of this as answering the following questions:
- What does it do?
- What doesn’t it do?
- What behaviors or actions would I want to look for to identify its presence on a system?
The idea here is to gather enough information so that you could observe a running system, such as with a forensic monitoring framework, and detect that the malware has been run on the system in question. This is important, as it is fairly common for a cyber adversary to implement stealth or subterfuge mechanisms to hide their presence from you. Malware, as well, is software much like your web browser or office suite - and it is governed by similar release engineering, evolution, and improvement practices. Some even come with formal release numbers, as well as being maintained by developers that can do custom releases for a fee. Sometimes, the content of the software might change considerably, while the underlying behavior of the tool may not change as much (sometimes, the inverse of this is true, as well!).
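One very simple way to approach the “What does it do?” question is to record some observable state before and after a sample runs, and then compare the two. The sketch below does this for files under a directory. The helper names `snapshot_tree` and `diff_snapshots` are hypothetical, and a real monitoring framework would also watch processes, registry or configuration changes, and network activity.

```python
from pathlib import Path


def snapshot_tree(root):
    """Map each file under root to its (size, modification time)."""
    root = Path(root)
    state = {}
    for path in root.rglob("*"):
        if path.is_file():
            info = path.stat()
            state[str(path.relative_to(root))] = (info.st_size, info.st_mtime_ns)
    return state


def diff_snapshots(before, after):
    """Report which files were created, deleted, or modified between snapshots."""
    created = sorted(set(after) - set(before))
    deleted = sorted(set(before) - set(after))
    modified = sorted(
        name for name in before.keys() & after.keys() if before[name] != after[name]
    )
    return {"created": created, "deleted": deleted, "modified": modified}
```

Taking one snapshot inside the VM before running a sample and another afterwards gives a first, coarse list of behaviors to look for on other systems.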
Work to do this week
This week, we will focus on a few core concepts working with VirtualBox and our systems.
Windows users may want to download the following program, ncat.exe, and install it somewhere.
Linux users should be able to install the netcat package on their favorite distribution.
The ncat tool is commonly used for providing a quick connection between two systems and allowing the user to send data across the connection using the keyboard, or using other CLI programs.
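To make the idea of a quick connection between two systems concrete, here is a minimal Python sketch of the same pattern over the loopback interface: one side listens (roughly what `ncat -l <port>` does), the other connects and sends bytes (roughly `ncat <host> <port>`). The function names are invented for this example, and both ends run in one process purely so the sketch is self-contained; in the lab, the two ends would be the host and the VM.

```python
import socket
import threading


def listen_once(server_sock, received):
    """Accept one connection and collect everything the peer sends."""
    conn, _addr = server_sock.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: the peer closed the connection
                break
            chunks.append(data)
    received.append(b"".join(chunks))


def send_over_loopback(message):
    """Send message from a client socket to a listener and return what arrived."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    received = []
    listener = threading.Thread(target=listen_once, args=(server, received))
    listener.start()

    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)

    listener.join()
    server.close()
    return received[0]
```

Calling `send_over_loopback(b"hello from the host\n")` returns the same bytes on the listening side, which is the behavior you would see typing a line into one ncat session and watching it appear in the other.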
By default, the OVA image I provided to you utilizes NAT networking so that it is easy to access the Internet, install updates, and generally use. We will want to go beyond this and use VirtualBox’s “Host-only Networking” to create isolated virtual networks within which the attacks can run. This will require us to create a new network interface in our VM, and to make configuration changes both in Kali and on our host system, to create a virtual network. We will demonstrate using the virtual networking options to achieve this.
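Once the new interface is configured, a quick sanity check is to confirm that the host and the VM both hold addresses on the same host-only subnet. The sketch below assumes the subnet is 192.168.56.0/24, which is a common VirtualBox host-only default but may not match your setup, so check your own VirtualBox network settings. The function names are hypothetical.

```python
import ipaddress

# Assumed host-only subnet; replace with the one shown in your VirtualBox settings.
HOST_ONLY_NET = ipaddress.ip_network("192.168.56.0/24")


def on_host_only_network(address, network=HOST_ONLY_NET):
    """Return True if the address belongs to the given host-only subnet."""
    return ipaddress.ip_address(address) in network


def share_isolated_network(host_addr, guest_addr, network=HOST_ONLY_NET):
    """Return True if both ends sit on the same host-only subnet."""
    return on_host_only_network(host_addr, network) and on_host_only_network(
        guest_addr, network
    )
```

For example, a guest that still reports only an address such as 10.0.2.15 is most likely using its NAT interface, not the host-only one, and `share_isolated_network("192.168.56.1", "10.0.2.15")` would return `False`.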
Another beneficial feature of VirtualBox is the capability of snapshotting and cloning your VM snapshots. This allows quick & easy revert to a known-clean state, which will become beneficial later on when we may want to experiment with various malware features and test operations. Additionally, this facilitates VM reuse across multiple analysis projects.
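Snapshots can also be driven from the command line through VirtualBox’s `VBoxManage snapshot` subcommand. The sketch below only builds the command lines and, by default, prints nothing and runs nothing. The VM name `kali-lab` and snapshot name `clean-base` used in the usage note are placeholders, and the helper names are invented for this example. It assumes `VBoxManage` is on your PATH.

```python
import shlex
import subprocess

VBOXMANAGE = "VBoxManage"  # assumes the VirtualBox CLI is on PATH


def take_snapshot_cmd(vm_name, snapshot_name):
    """Command line that records the VM's current state under snapshot_name."""
    return [VBOXMANAGE, "snapshot", vm_name, "take", snapshot_name]


def restore_snapshot_cmd(vm_name, snapshot_name):
    """Command line that reverts the VM to a previously taken snapshot."""
    return [VBOXMANAGE, "snapshot", vm_name, "restore", snapshot_name]


def run(cmd, dry_run=True):
    """Return the command as a shell-style string; execute it only if dry_run is False."""
    if not dry_run:
        subprocess.run(cmd, check=True)
    return shlex.join(cmd)
```

For instance, `run(take_snapshot_cmd("kali-lab", "clean-base"))` returns the string you could paste into a terminal, and passing `dry_run=False` would actually invoke VirtualBox. Restoring generally expects the VM to be powered off first.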
VirtualBox Shared Folders
VirtualBox provides a simplified interface for giving the VM access to a directory on your host system. In Linux guests, this is implemented using a driver and the virtual file system (VFS) layer that is part of the Linux kernel. In Windows guests, this is implemented by having the host masquerade as a file server named VBOXSVR on a local network. We will familiarize ourselves with both of these interfaces.
Network Traffic Capture with
The tcpdump utility is the de facto standard network traffic viewing tool, and the Wireshark tool is the GUI counterpart to it. We will demonstrate using both of these utilities to view the content of network traffic we generate between host & VM using