A Technical Overview of our Modular Windows Malware

This post starts my foray into actual technical posts. I’ll go over the malware I wrote with a group for a class, how it works, what features it has, etc., in technical detail :)

The TL;DR

We created a modular malware implant and C2 framework that run on Windows 10 machines (and perhaps other versions). Our implant is lightweight and written in Visual C. Most functionality is delivered via C2 RPCs, which deliver DLLs that are loaded in memory and never written to disk. The C2 masquerades as a deep-fried meme site, which is useful because our obfuscation method, Pixel Value Difference steganography, can encode more data in these types of images.

How this is structured

I’m going to start with the general C2 framework layout, and then dive into each part: the implant, the C2, and the C2 interface.

The C2 Framework

Here’s the lifecycle of our malware.

  1. The user executes the implant program.
  2. The implant initializes and reaches out to the C2 backend, a Flask server (hosted with Gunicorn because it is production-ready)
  3. The C2 creates a “profile” for the implant, stores it in a SQL database (Postgres) and responds with the implant’s identity
  4. The implant establishes persistence and waits patiently for C2 commands
  5. The malicious actor logs into the C2 web page and gives a command, such as list processes or drop DLL
  6. The implant receives this task, performs it, and reports back with the output
  7. Profit

The RPC mechanism

The RPC mechanism is not simply a plaintext HTTP request. It’s an image: the C2 disguises its commands as images that a user would request when scrolling the meme page, and the implant disguises its responses as meme uploads to the website. This is steganography, using Pixel Value Difference (PVD). It applies, of course, only to the C2 <-> implant communications.

How does PVD work?

Without getting into a bunch of detail (I wasn’t the one who made the functionality for this), here’s an overview of how PVD works. This steganography mechanism is based on the difference between adjacent pixels.

The image is split into pairs of pixels that sit directly next to each other. We calculate the intensity of each pixel, and then the difference between the two.

We then slightly change the value of the pixels to encode the data. Unlike LSB steganography, the amount of data that can be encoded depends on how different the two pixels are from each other. So, for images which have large differences between adjacent pixels (like deep-fried memes), we can encode a lot of data by making the pixels more different from each other. If the pixels are already very different, it will be less obvious if we change them a lot; if they’re very similar, we can only change them a little bit.

As a result, we can encode a solid amount of data, while the picture looks essentially the same to the eye.

The Implant Base

As I noted earlier, the implant is modular. So, almost all of its functionality simply doesn’t exist when it’s first executed. The implant can only do a few things, like list processes, establish persistence, and communicate with the C2.

Let’s talk about the base functionality that it can do, then we will discuss the extra functionality supplied by remote DLLs.

It’s important to note that this code does not use the C standard library. All of our functionality is based off of Visual C and the Win32 API. So, we don’t use things like malloc or printf – we instead use stuff like LocalAlloc.

Persistence

Let’s talk about establishing persistence. This is something I developed, and I based it off of Beyond the Good Ol’ Run Key, Part 29. Much of my code for the persistence was, without shame or malice, freely stolen from this page.

I don’t think I need to go into TOO much detail in this post, because that blog post covers essentially what I did. But, I’ll go over the general gist of things.

This persistence works via Windows lnk file shortcuts, which can be activated by hotkeys. An lnk file is simply a link to an actual executable; the hotkeys exist so that someone can open an app they want without clicking anywhere.

To do this, you can place the lnk file either on the Desktop or in the Start Menu. Obviously, I placed mine in the Start Menu, because it is harder to spot. The lnk file simply calls our malware executable, and the file is activated when you press the CapsLock key.

And yes, before you wonder, “Captial” is spelled incorrectly on purpose. For some reason, that spelling slipped through the cracks and that’s how you must refer to CapsLock.

set w = CreateObject("Wscript.shell")
set n = CreateObject("Wscript.Network")
d = w.SpecialFolders("Start Menu")
set l = w.CreateShortcut("C:\Users\" & n.UserName & "\AppData\Roaming\Microsoft\Windows\Start Menu\chrome_browser.lnk")
l.WindowStyle = 4
l.TargetPath = "C:\Users\" & n.UserName & "\AppData\Local\explore.exe"
l.Hotkey = "Captial"
l.Save

There is a major footgun with hotkey persistence. A minor annoyance is that the CapsLock key no longer works; the big issue is that if someone presses CapsLock while your implant is already running, the two instances must not step on each other’s toes.

How do we do this? It’s simple! This is why we ship our implant with the ability to find running processes. On startup, the implant lists the running processes, and if it finds itself already in existence (i.e. there is more than one of itself), it dies on the spot. If not, it acts just like a normal implant!

C2 Communication & Obfuscation

The implant must, of course, ship with the resources it needs to talk to the C2. That means that it must know the C2’s domain, and it also must have some images ready so that it can use PVD to talk to it; of course, it needs to know how to perform PVD.

Obfuscation is absolutely necessary to hide the domain. Otherwise, any novice running strings on our executable will know exactly what the implant is talking to, immediately. So, I created obfuscation code to ensure that this wouldn’t be possible.

Now, to be honest, I ran out of time when making the obfuscation functionality. As a result, it’s insanely trivial. All of our hardcoded strings, like the C2 domain, exist as a byte string first. Whenever we want to use it, we send it into our deobfuscation method (which XORs it with 42 – I know, it’s stupid), use it, and obfuscate it back again.

The big issue I ran into with this approach was that we have two types of strings, char and widechar. For the ASCII text we use, widechar strings are the same as char strings, except there is an extra \0 (null byte) after every character. That means I need to treat certain strings differently depending on which type of char they are. So, we had methods to obfuscate and deobfuscate both types of chars.

The big hitter: in-memory DLL loader

This is the shining star of our modular functionality. It lets us run DLLs by manually loading them ourselves, instead of doing it the normal way where a DLL is stored on disk and loaded from there.

I won’t go into every tiny step and detail about how we do this, because it would require me to explain exactly how a PE is structured. That is an entire post in and of itself :D

The steps to loading a DLL in memory are as follows:

  1. Receive DLL from C2, and store it all in a buffer
  2. Load file bytes
  3. Find NT header, then optional header
  4. Memory map the PE
  5. Step through dependencies and load all the libraries and functions it needs
  6. Perform base relocation (if we want it to be at base address X but we actually have it at base address Y)
  7. Perform TLS callbacks
  8. Run entrypoint!

We can do all of this without touching the filesystem for the DLL. That is a major superpower, because we can hide functionality and only use what we need to. It’s also more difficult to see the inner workings of our loaded DLLs because they are never stored on disk!

DLL drop-in functionality

I’ll mainly go over the DLLs that I made, because I don’t know much about the other ones.

Chrome stealer

This DLL grabs any usernames and passwords, along with the website that they map to, if they are stored on your local computer. In Chrome, you can save your login data, but if you don’t have an account, it gets saved to your filesystem. This is protected via a SQLite database and encryption, but we can decrypt it :)

Here’s an overview of how it works:

  1. Grab the encryption key: go to the user’s AppData\Local\Google\Chrome\User Data\Local State and grab the encrypted_key from the file. It’s base64 encoded and encrypted with the Win32 API’s CryptProtectData. Decode and decrypt the key for usage later.
  2. Find the database, located at the user’s AppData\Local\Google\Chrome\User Data\default\Login Data\ChromeData.db
  3. Connect to the database, and select the username, password, and URL for each row.
  4. For each password, apply the key and decrypt it!
  5. Print your results :)

This seems pretty easy, but working with data structures and memory in Visual C isn’t the most fun…

Here’s an example of some code I made to CryptUnprotect the Chrome key:

if (CryptUnprotectData(&blob_in, NULL, NULL, NULL, NULL, 0,
                         &blob_out)) {
    unsigned char *pbData = blob_out.pbData;
    DWORD cbData = blob_out.cbData;
    unsigned char *buffer = (unsigned char *)core->malloc(cbData);
    // core->wprintf(L"Data output is of length %d\n", cbData);
    if (buffer == NULL) {
      core->wprintf(L"Memory allocation failed\n");
      return NULL;
    }
    my_memcpy(buffer, pbData, cbData);
    LocalFree(pbData);
    return buffer;
  } else {
    // Decryption failed
    core->wprintf(L"%d\n", GetLastError());
    return NULL;
  }

Process Injection via Thread Hijacking

This was a fun one. This process injection technique is hard to spot from an antimalware perspective because it’s not an out-of-the-ordinary action. Many processes work with threads and freeze and change their contexts, so it goes undetected by Windows Defender.

How it works:

  1. Get shellcode that you want to run. In my case, it just shows a message box.
  2. Find a Process ID to inject to. I tested it a lot with Notepad, because who wouldn’t?
  3. Open the process with OpenProcess and use VirtualAllocEx and WriteProcessMemory to write the shellcode into the process’s memory space.
  4. Use CreateToolhelp32Snapshot to get a snapshot of the running threads.
  5. Run through all of our threads until we find the one whose th32OwnerProcessId is our victim PID.
  6. Open the thread with OpenThread,
  7. suspend it with SuspendThread,
  8. get its context with GetThreadContext,
  9. set its rip (instruction pointer) to our new code,
  10. set that context with SetThreadContext,
  11. and resume the thread with ResumeThread. This will effectively force that thread to run our code instead of the code it was going to run :)

Here’s an example of some of the code that made this work:

  while (Thread32Next(snapshot, &threadEntry)) {
    if (threadEntry.th32OwnerProcessID == targetPID) {
      core->wprintf(L"Found process with pid %zu!\n", targetPID);
      threadHijacked =
          OpenThread(THREAD_ALL_ACCESS, FALSE, threadEntry.th32ThreadID);
      break;
    }
  }
  SuspendThread(threadHijacked);
  GetThreadContext(threadHijacked, &context);
  context.Rip = (DWORD_PTR)remoteBuffer;
  SetThreadContext(threadHijacked, &context);
  ResumeThread(threadHijacked);

Other stuff

I’ve talked about a lot of the fun stuff, but I also want to touch on some of the functionality that is entirely necessary to our implant but less interesting.

Here are some of the other features we had to code:

  • Grab environment variables
  • Execute arbitrary command
  • Get hostname
  • HTTP calls
  • List files in directory
  • Read files
  • Get Windows license
  • All the code that goes into PVD
  • Task handler

A lot of this is super important for the implant to run normally but is not interesting enough to go into depth on. I could of course talk about stuff like how we did HTTP calls with the Win32 API, but that isn’t special!

C2 infra and frontend

The C2 ran on an AWS server, with a domain from Cloudflare. We used Gunicorn to run our Flask server, and our frontend was written in React. We spent the least time on the frontend for obvious reasons. We still needed to set up the filesystem structure so that we could properly upload and ship DLLs to the implants, set up the Dockerfile and document how to run our C2 in case of errors, and set up routing, APIs, JWTs, databases, and crypto keys, but these are all thankless jobs. This is the stuff that we have to do to create proper malware, but not what makes ours unique or interesting.

The end

This was honestly less technical than I originally planned, but the code is also not technically public, and it takes some decent Windows internals knowledge to get much out of it (see: DLL loading!). If you, the reader, want to ask me about stuff I didn’t cover in depth, shoot me a message :)

Thank you for reading through this! This is, without a doubt, the biggest project I’ve ever worked on, and it’s amazingly rewarding to have a finished product (which we built from the ground up) that I’m proud of!

Evan