5 February 2017

Static Analysis Utility

HW03: Static Analysis Utility

In the Week 04 lectures, you were introduced to static analysis and saw a demonstration of a utility that extracts static analysis data from malware samples using tools on the REMnux VM.

Your assignment for HW03 is to take the source code that I began in Lecture Wk04.2 and add one more analysis to the program that extracts some useful data from the artifact(s).

You will also write an accompanying report that analyzes two or more malware samples and explains why the extracted information is significant.

The Python program is available here: metadata_import.py

As you may recall, the Python code that I’ve written already collects the following data from each sample, stores it in a global object within the script, and finally commits it to the database:

- MD5, SHA-1, SHA-256 hashes
- File type (as reported by exiftool / “file magic”)
- File size (in bytes)
- File names
- Compile Time
- Creation Time
- Modify Time
- Author
- Company Name
- File Description
- List of sections (if it is a PE32-type file)
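
To make the first and third items concrete, here is a minimal sketch of how the three digests and the file size could be computed with only the Python standard library. It is not the code in metadata_import.py; the function name `basic_metadata` and the keys of the returned dictionary are assumptions chosen for illustration.

```python
import hashlib
import os


def basic_metadata(path, chunk_size=65536):
    """Return MD5, SHA-1 and SHA-256 hex digests plus the size in bytes.

    The function name and dictionary keys are hypothetical; they are not
    taken from metadata_import.py.
    """
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Read in chunks so large samples are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    record = {name: digest.hexdigest() for name, digest in digests.items()}
    record["size"] = os.path.getsize(path)
    return record
```

Feeding every chunk to all three hash objects means each sample is read from disk only once.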

You will use a ZIP file containing malware that I provide to you as your experimental set for this homework. This file is available here: Malware_Bundle_HW03.zip (Password: infected7038)

Using the metadata_import.py script that I’ve provided, the lecture notes from Week04.2, and the ZIP file of malware, rebuild the MongoDB database from the samples I’ve provided, following the in-class demo.
Run a few summary analyses, using MongoDB’s JavaScript or PyMongo in a Python script you provide, to demonstrate that the data was imported. Include the results in your report. Note that I am aware that some files cannot be parsed by the script and produce errors; this is fine.
Identify information or characteristics that are available in multiple malware samples. A portion of your grade depends on the “difficulty level” here. If you simply extract an additional data point from exiftool’s output, or check whether a particular piece of static content exists, that will earn at most a B equivalent on the assignment. For a higher grade, do some research on the file type being analyzed and use your tools to extract that information. Some examples of acceptable challenges would be:
- Enumerate the encodings used in a PDF
- Calculate the sizes and locations of sections within a PE32 executable
- Parse & extract CSS style sheets from HTML
- Enumerate JavaScript function names from HTML & PDF samples
- Enumerate the symbols that a sample imports from one or more particular Windows system DLLs
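
As an illustration of the second example challenge, the sketch below reads the section table of a PE32 file using only Python's `struct` module. It assumes a well-formed file and does no further validation; the function name `pe_sections` and the dictionary keys are hypothetical. A dedicated parser such as the third-party `pefile` package is likely more robust against malformed samples.

```python
import struct


def pe_sections(data):
    """Return name, size and location of each section in a PE image.

    `data` is the raw file content as bytes. Names and keys here are
    illustrative assumptions, not part of metadata_import.py.
    """
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 20-byte COFF header follows the signature.
    coff = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    section_count, optional_size = coff[1], coff[5]
    # The section table starts right after the optional header.
    table = pe_offset + 4 + 20 + optional_size
    sections = []
    for index in range(section_count):
        # Each section header is 40 bytes long.
        fields = struct.unpack_from("<8sIIIIIIHHI", data, table + 40 * index)
        sections.append({
            "name": fields[0].rstrip(b"\0").decode("ascii", "replace"),
            "virtual_size": fields[1],
            "virtual_address": fields[2],
            "raw_size": fields[3],
            "raw_offset": fields[4],
        })
    return sections
```

Here `raw_offset` and `raw_size` describe where the section lives in the file, while `virtual_address` and `virtual_size` describe where it is mapped in memory; a large mismatch between the two sizes can be worth discussing in a report.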
Document the characteristic chosen above, why it is significant, and which malware artifacts you used to develop the extraction technique.
Document the code you added to the Python script. Either use comments and references to lines in the documentation, or document it directly in your report.
List the MD5, SHA-1, or SHA-256 digests (be consistent, though; don’t mix the digest types) of all of the malware samples that your custom analysis identified and for which it extracted some amount of data.
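
One way to produce the digest list in the last step is sketched below. It works on any iterable of dictionaries, so it can be fed the documents returned by a PyMongo `find()` call. The field names `sha256` and `pe_sections` are assumptions for illustration; substitute whatever keys your version of the script actually stores.

```python
def digests_with_data(documents, field, digest_key="sha256"):
    """Return the sorted, de-duplicated digests of documents where `field`
    holds non-empty data. Both key names are hypothetical examples."""
    hits = set()
    for doc in documents:
        # Skip documents where the custom field is missing or empty.
        if doc.get(field) and digest_key in doc:
            hits.add(doc[digest_key])
    return sorted(hits)


# Possible usage against the class database (requires a running MongoDB):
#   from pymongo import MongoClient
#   collection = MongoClient()["cs7038"]["malware"]
#   for digest in digests_with_data(collection.find(), "pe_sections"):
#       print(digest)
```

Using a single `digest_key` for the whole list keeps the digest type consistent, as the step requires.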

Use mongodump to dump the contents of the collection into a BSON output file. For example, using the database & collection names from class:

bash$ mkdir mongo-out
bash$ cd mongo-out
bash$ mongodump -d cs7038 -c malware
connected to: 127.0.0.1
Sun Feb  5 22:49:53.120 DATABASE: cs7038	 to 	dump/cs7038
Sun Feb  5 22:49:53.121 cs7038.malware to dump/cs7038/malware.bson
Sun Feb  5 22:49:53.129 7820 objects
Sun Feb  5 22:49:53.129 Metadata for cs7038.malware to dump/cs7038/malware.metadata.json
bash$

The above will create files named dump/cs7038/malware.bson and dump/cs7038/malware.metadata.json inside the mongo-out directory. These are your BSON and JSON files.
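
As a sanity check before submitting, you can confirm that the .bson file holds as many documents as mongodump reported. A dump file is a plain concatenation of BSON documents, each beginning with its total length as a little-endian 32-bit integer and ending with a zero byte, so the documents can be counted without a BSON library. This is a minimal sketch; the function name `count_bson_documents` is hypothetical.

```python
import struct


def count_bson_documents(data):
    """Count the concatenated BSON documents in `data` (bytes)."""
    count = 0
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError("truncated length prefix")
        # Each document starts with its own total length, prefix included.
        (length,) = struct.unpack_from("<i", data, offset)
        # The smallest valid document is 5 bytes: the length plus a zero byte.
        if length < 5 or offset + length > len(data):
            raise ValueError("corrupt or truncated BSON document")
        if data[offset + length - 1] != 0:
            raise ValueError("document is missing its trailing zero byte")
        offset += length
        count += 1
    return count


# Possible usage from inside mongo-out:
#   with open("dump/cs7038/malware.bson", "rb") as handle:
#       print(count_bson_documents(handle.read()))
```

For the example run above, the result should match the "7820 objects" line in the mongodump output.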

You’ll submit a report (PDF preferred), plus supporting code, artifacts, and binary data in a ZIP file. You do not need to submit the malware samples to me; instead, include the digest values that uniquely identify the malware samples significant to your analysis. Include the BSON and JSON file(s) generated by the “mongodump” operation in your ZIP file as well.

tags: malware assignment

CS6038/CS5138 Malware Analysis, UC

Course content for UC Malware Analysis
