Scanner

The project is now officially named "Scanner." It grew out of my desire to build a lab for analyzing malware, and it is now expanding to do bigger things. It is currently written in Python. Feel free to hop in and contribute.

Overview

The current iteration is 2.x. The project attempts to solve a couple of key problems, and it has let me work on an issue I've always wanted to tackle. I describe these in more detail below.

Features
  • It is open source.
  • It is easy to use. You are not limited to the GUI.
  • It is very modular. There are currently two main cores that do the analysis. The first core is heuristic (signature based). The second core is intended to be machine learning based.
  • It is accessible. Although this may not seem a big deal now, it'll be increasingly important as the project grows larger.
  • It is constantly updated. I try to keep it cutting-edge.
  • It is portable. Python is widely supported, so it should work on the major OSes (macOS, Linux and Windows).
  • With TensorFlow (Soon™), it'll be "smart."

Specifics

Surufel

Surufel was meant to be a quasi-startup under which I intended to release multiple products. Those multiple ideas eventually grew into one powerful idea. Surufel Scanner, or just Scanner, is a serious product that I intend to develop for a while.

Typical Use

Typically, users use an AV like this:

"Smart" Use

I am hoping to create a smart AV that works something like this:


Documentation

The installation guide is in the Quick Start section of the README document. The main focus here is malware analysis in general.

A Start

Three events usually initiate the game. Your computer is acting strange. Your antivirus picked something up. Your boss tells you something happened. If it's one of the first two, hopefully you have backups and it isn't something too bad. If it's your company, you will want to answer some business questions. Questions like:

  • "Will this affect our customers?"
  • "Will this interrupt our processes?"
  • "What did we lose?"

You're going to want to have some answers for those. You definitely want to find out how you were hit in the first place.

Target

While Linux and macOS machines can still get infected, the vast majority of targets are Windows machines in the US. In other countries, especially in the East like China and Japan, there is an attempt to move away from Microsoft and create their own flavor. The reason for this is obvious; for example, they fear the West may be spying on them.

From the early days to today, Microsoft has always held the majority of the OS market share. As a result, malware designers write for Windows because it's easier. Coupled with the fact that attacks today are more targeted and more political, this means even more reasons to target Windows machines. Gone are the days when people wrote malware for knowledge and hacked for the sake of hacking.

On a small footnote though, Linux and macOS currently rely on security by obscurity. In some cases, a default Windows install is more secure than a poorly configured Linux or macOS box.

Anyway, the good news is that despite the growing threat, the "good guys" are winning. Actual infections are rare for most people and even organizations. The bad news is that when it goes down, it goes down hard.

Lab Setup

You need to set up a lab. Setting up a lab is more of an art than a science. If you look up "how to setup your own lab," you will find many different setups for the same end goal. How did I set up mine? By preference, and by feeling my way through on a spare throwaway computer. I'm just documenting mine here for posterity.

  1. Set up REMnux. (Default password: malware)
  2. Make sure you have enough disk space so you can clone the VM.
  3. After the terminal shows, run this:

         update-remnux
         sudo apt-get update
         sudo apt-get upgrade
         sudo apt-get install virtualbox-guest-utils virtualbox-guest-x11 virtualbox-guest-dkms

  4. Now clone this (select "Full clone").
  5. If you absolutely have to set up a shared folder (or this is temporary): sudo adduser remnux vboxsf
  6. You will need to log out and back in for the change to take effect.

My typical workflow looks like this: determine the environment and triage -> static analysis -> dynamic analysis -> report

However you set up your lab, just make sure you've isolated it as much as possible. The nice thing about Python is how easy it makes incorporating Surufel with REMnux. So, in addition to running a scan on the suspected file, consider the following triaging techniques:

  1. Check the filetype
  2. Use public databases
  3. Explore the PE (portable executable)

I'm going to assume that you found the suspected file via an antivirus scan. This is always a good way to start off the investigation. I'm going to use "Bombermania.exe" to do a little demo.

Filetype Checking

How you do filetype checking is a matter of preference, and some methods might be redundant. I will start off with something like this: file Bombermania.exe

On a Linux box, this should yield: Bombermania.exe: PE32 executable (GUI) Intel 80386, for MS Windows, UPX compressed

If you use import magic in Python, you should get the same result. Heads up though, I've been hitting inconsistencies when setting it up, so your mileage may vary. (I installed it with pip install --user python-magic, but I've seen at least two different methods described, one of which uses brew install libmagic.)
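As a minimal sketch of the same check from Python: the helper below tries python-magic's magic.from_file and, if the import or the call is unavailable, falls back to a tiny hand-written signature table. The function names and the table are illustrative assumptions, not part of Scanner, and the fallback knows far fewer formats than libmagic does.

```python
import sys

# Hypothetical, deliberately tiny signature table (leading "magic bytes").
SIGNATURES = {
    b"MZ": "DOS/PE executable",
    b"\x7fELF": "ELF executable",
    b"PK\x03\x04": "ZIP archive",
    b"%PDF": "PDF document",
}


def sniff_signature(path):
    """Guess a file type from its first bytes, using SIGNATURES only."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    for magic_bytes, label in SIGNATURES.items():
        if header.startswith(magic_bytes):
            return label
    return "unknown"


def describe_file(path):
    """Prefer python-magic when it is usable; otherwise use the fallback."""
    try:
        import magic  # python-magic; also needs the libmagic shared library
        return magic.from_file(path)
    except (ImportError, AttributeError):
        # Not installed, libmagic missing, or a different "magic" module.
        return sniff_signature(path)


if __name__ == "__main__":
    for name in sys.argv[1:]:
        print(f"{name}: {describe_file(name)}")
```

Run it as a script with the suspect file as an argument. With python-magic working, the description should resemble what the file command prints; the fallback only reports the coarse category.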

Public Databases

VirusTotal is nice. When you upload the file to VirusTotal, it will produce a dataset you can use to investigate the nature of this malware. Here is what it produced at the time of this writing (2017):

  • MD5: 471d39a51a79f342033c5b0636c244dc
  • SHA-1: b0324ddd99677d9b0458c7328879f8fde268effc
  • SHA-256: 1154535130d546eaa33bbc9051a9cb91e2b0e3a3991286c3d5b0a708110c9aa7
  • File Type: Win32 EXE
  • Magic literal: PE32 executable for MS Windows (GUI) Intel 80386 32-bit

It'll even tell you when it was first submitted and other tidbits of information. Apparently this one was created in 2005 and first submitted in 2009.
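You can compute the same three hashes locally before (or instead of) uploading anything. The sketch below uses only the standard library's hashlib; the function names and the idea of keeping a local set of known-bad SHA-256 values are assumptions for illustration, not something Scanner is confirmed to do.

```python
import hashlib
import sys


def file_hashes(path, chunk_size=65536):
    """Return the MD5, SHA-1 and SHA-256 hex digests of a file."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Read in chunks so large samples do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


def is_known_bad(hashes, known_bad_sha256):
    """Check the SHA-256 against a (hypothetical) local set of bad hashes."""
    return hashes["sha256"] in known_bad_sha256


if __name__ == "__main__":
    for name in sys.argv[1:]:
        for algorithm, value in file_hashes(name).items():
            print(f"{name} {algorithm}: {value}")
```

The resulting values can be pasted into a public database's search box, which avoids uploading the sample itself.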

Digging into the PE

If you can find the malware in a public database from just the information gleaned above, then your life just got a little easier. If this is something that's new, then a public database isn't going to help much.

There is a lot of research in this area, and it is promising. For example, one study of Windows executables suggests that you can tell which files are good and which are not based on the PE header alone. The researcher does so using three methods, one of which uses PE-Header-Parser. Surprisingly, another method uses Icon-Extractor.

Anyway, we will want to use a tool that deals with the PE.
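To give a feel for what such a tool reads, here is a minimal sketch that pulls a few header fields out of a PE file with nothing but the standard library's struct module. It follows the documented PE/COFF layout (the offset of the PE signature is stored at 0x3C, followed by a 20-byte COFF header and the section table), but the function name, the returned dictionary and the short machine table are my own assumptions. A dedicated library such as pefile covers far more and is the more usual choice.

```python
import struct
from datetime import datetime, timezone

# Small, incomplete lookup of COFF machine types.
MACHINES = {0x014C: "Intel 386", 0x8664: "x64"}


def read_pe_summary(path):
    """Return machine, compile time and section names from a PE file."""
    with open(path, "rb") as handle:
        data = handle.read()
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")

    # e_lfanew: offset of the "PE\0\0" signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")

    # COFF file header: 20 bytes right after the signature.
    (machine, section_count, timestamp, _symbols_ptr, _symbol_count,
     optional_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)

    # Section table follows the optional header; each entry is 40 bytes
    # and starts with an 8-byte, NUL-padded name.
    table = pe_offset + 4 + 20 + optional_size
    sections = []
    for index in range(section_count):
        raw_name = data[table + 40 * index:table + 40 * index + 8]
        sections.append(raw_name.rstrip(b"\x00").decode("ascii", "replace"))

    return {
        "machine": MACHINES.get(machine, hex(machine)),
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "characteristics": characteristics,
        "sections": sections,
    }
```

Section names such as UPX0 and UPX1 usually hint at UPX packing, which would be consistent with the "UPX compressed" note in the file output earlier. The compile timestamp is easy to forge, so treat it as a hint rather than a fact.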

Report

Every malware report should include as much of the following information as possible:

  1. Title
  2. Date
  3. Author (contact information, etc.)
  4. Abstract (summary, etc.)
  5. Files
  6. Traits (hashes, size, other names, etc.)
  7. Analysis (figures, statistics, etc.)

But ultimately, this depends on what your organization requires. Take a look at these reports on the Regin malware. Symantec took a very different approach than F-Secure, yet both are excellent reports.

The end goal is usually to figure out what the malware is doing and how to get rid of it.
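If you want to keep reports consistent, the fields listed above map naturally onto a small data structure. The sketch below is one hypothetical way to do it with a dataclass; the class name, field names and text layout are assumptions, so adapt them to whatever your organization requires.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MalwareReport:
    """Hypothetical container for the report fields listed above."""
    title: str
    report_date: date
    author: str
    abstract: str
    files: list = field(default_factory=list)    # names of analyzed files
    traits: dict = field(default_factory=dict)   # hashes, size, aliases
    analysis: str = ""                           # findings, figures, stats

    def to_text(self):
        """Render the report as plain text, one section per field."""
        lines = [
            f"Title: {self.title}",
            f"Date: {self.report_date.isoformat()}",
            f"Author: {self.author}",
            f"Abstract: {self.abstract}",
            "Files:",
        ]
        lines += [f"  - {name}" for name in self.files]
        lines.append("Traits:")
        lines += [f"  {key}: {value}" for key, value in self.traits.items()]
        lines.append("Analysis:")
        lines.append(self.analysis)
        return "\n".join(lines)


if __name__ == "__main__":
    report = MalwareReport(
        title="Bombermania.exe triage",
        report_date=date.today(),
        author="analyst@example.com",  # placeholder contact
        abstract="Initial static triage of a flagged executable.",
        files=["Bombermania.exe"],
        traits={"md5": "471d39a51a79f342033c5b0636c244dc"},
        analysis="PE32 GUI executable, UPX compressed.",
    )
    print(report.to_text())
```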


FAQ

"Dynamic analysis?"

I'll write about it when I can find some time and get my hands on a Windows license and IDA Pro.

"Is this project finished?"

The short answer is no. For all intents and purposes, this project can be considered finished if you only count the features that I originally intended to implement. But I don't believe in "finished" when it comes to software.

I have the same philosophy as Shlomi Fish in this respect. There are very few things in life that would meet the philosophical meaning of "finished."

"What was the point?"

It was supposed to achieve three objectives: sharpen my skills, solve an actual real-world problem, and showcase my skills. That's why I kept the use of frameworks to a minimum. I use what I like or need.

"Why Python?"

The decision to use Python was influenced by Capers Jones and inspired by Metasploit, which was also written in a high-level language, Ruby.

"Technologies used?"

Frontend

Backend

"What is this not?"

This is not a web application but it is in consideration. I just need to find a cheap host. You'll need to download it to try it out. There is an online version now! It's also not a complete solution. No software is.

Resources

There are lots of amazing resources out there. I have utilized these resources in my journey and definitely recommend that you check them out if you want to keep exploring.

  1. EICAR
  2. Common file extensions that are attack vectors
  3. List of virus hashes

© 2018 Surufel / Sif