This is Part I of a two-part article on the forensics- and attribution-resistant application of developmental tradecraft to offensive software development. In this first part, I am going to give some tips and examples on how to apply threat modeling methodology to the development process, and also share a simple technique that I experimented with back when I was researching fingerprinting-resistant data creation and storage methods.
DISCLAIMER: The following content is for educational use and research purposes only. The author does not condone unlawful acts.
Background & Threat Modeling fundamentals
If you are developing any sort of software where things should go under the radar, anything static is bad. Static means things stay as they are and can be pointed at anytime and anywhere. Things that can be pointed at can be named, which humans have done naturally since they were babies. And as Thomas L. Friedman (2007) once famously said:
In the world of ~~ideas~~ cyber, to name something is to ~~own~~ pwn it.
Look, ma! It’s a MAGNETICPOLARGOOSE!
Yes, we are talking about attribution, which is undesirable, especially if you are conducting a covert operation, because there is no longer a cover to speak of and the adversary knows that it’s you. We are also talking about fingerprinting, which is again undesirable because it can be used to detect a particular feature of an implant, whether during execution as part of an anti-malware solution or post-execution as part of a DFIR process. Today, we are going to talk about fingerprinting.
Before we move on, let’s remember the fundamentals of threat modeling and in particular, options for remedying an attack (Shostack, 2014):
- Mitigation: doing things to make it harder to take advantage of a threat (e.g. using multifactor authentication).
- Elimination: removing the feature, or not building it in the first place (e.g. disabling unused services on a computer).
- Transference: letting someone or something else handle the risk (e.g. using antivirus software).
- Acceptance: accepting the risk by regarding it as manageable and/or not mission critical (e.g. not doing this in your everyday life).
The first obvious static part of any software is its executable code section, which is a prime candidate for crafting signatures since it might contain many unique fingerprintable patterns. As for the remedies: you cannot eliminate it, because software needs to run and the show must go on. You cannot accept it, because you don’t want to be detected. That leaves us with two options: transference and mitigation. You can transfer the threat by using a commercial software packing solution such as the infamous VMProtect (but then again, what kind of APT would you be, and what would people say behind your back?). And finally, you can mitigate it on your own.
There are various approaches to mitigating the fingerprinting of an executable’s code sections. One of the oldest tricks in the book is using a poly/oligo/meta-morphic (seriously, how many are there?) engine to morph the code into many variations with the same goal, therefore making it not static (Szor, 2005). Each of these techniques has its own caveats and I will not go into detail about them; you can find more information in the referenced book.
More advanced methods aim to mitigate against scanning instead. It makes sense: if you cannot see something, you cannot point at it, name it, and so on. There are many variations of these sorts of techniques. Off the top of my head, they start with simple removal of the PE header to make it harder to dump the executable, continue with various process injection techniques to run under the disguise of another, possibly whitelisted, piece of software, and move on to more advanced techniques such as hooking system calls to redirect potentially threatening API calls (Blunden, 2012), and even more obscure and highly advanced techniques such as running code under a hypervisor or inside an SGX enclave.
But one thing that is usually overlooked is the data, which is also the subject of today’s demonstration. Well, actually, code is data as well. But labeling it as code implies that it is executable. Marking something as executable introduces its own limitation: the set of instructions eventually needs to be read by a CPU, and therefore its legibility needs to be preserved. All of the preceding xmorphic code techniques comply with this limitation. And anti-scanning approaches don’t even touch the code; they just shield it by abusing the access control logic.
When I say data, what I mean is read-writable memory or strings. That’s it. And in many cases, you don’t really need them, because generally something read-writable is meant for humans. And as someone once said to me on IRC:
If you are doing strings, you are doing it wrong.
(Note that the IRC channel in question once permabanned a user just for asking whether the Windows API has a function to make an entire string uppercase at once, akin to Python, so take it with a grain of salt.) Anyway, the main idea is that if you are developing low-level software, which is also the go-to approach for developing spyware, strings are usually a byproduct and there are better means to query something. In the end, there is usually an 8-bit or larger numeric value that shows the status of the thing that you are really querying. The same goes for exfiltration: if telemetry collection is part of your operation, don’t just send out a huge message that reads `Microsoft Windows 10, Build 18363.418`. Instead, according to your target specifications, encode it as a value between 0 and 255 (by using something like an `enum`, for instance) and send it out as a single byte. Not only do you reduce the time it takes to transmit the message, and therefore the risk of detection, but you also save space. Do not use strings unless you absolutely have to. It may seem like a very simple principle to grasp at first, but you’d be surprised to learn how many developers in the wild make these simple operational mistakes.
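As a minimal sketch of the single-byte encoding idea above: the `OsBuild` enum, the `BUILD_CODES` table, and the helper names below are hypothetical and only illustrate the principle; a real lookup table would follow your own target specifications.

```python
from enum import IntEnum


class OsBuild(IntEnum):
    """Hypothetical code table; entries and numbering are illustrative only."""
    UNKNOWN = 0
    WIN10_1903 = 1
    WIN10_1909 = 2


# Hypothetical mapping from the major build number to its one-byte code
BUILD_CODES = {
    "18362": OsBuild.WIN10_1903,
    "18363": OsBuild.WIN10_1909,
}


def encode_build(build_number: str) -> bytes:
    """Encode a build string such as '18363.418' as a single byte."""
    major = build_number.split(".")[0]
    code = BUILD_CODES.get(major, OsBuild.UNKNOWN)
    return bytes([code])


def decode_build(payload: bytes) -> OsBuild:
    """Recover the enum member on the receiving side."""
    return OsBuild(payload[0])


encoded = encode_build("18363.418")
print(len(encoded), decode_build(encoded).name)  # 1 WIN10_1909
```

The revision suffix (`.418`) is deliberately dropped in this sketch; if it matters for your use case, it would need its own numeric field rather than a string.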
But, are the strings only meant to be used for conveying a message? Well, no. They are also used as identifiers or names, for instance to get the memory address of a kernel object such as a file, mutex, named pipe etc.
So, whenever we create a kernel object, including dropping a file on the disk, we often need to give it a name. And if you have been following me closely, you should see where this is going. You see, whenever you name a kernel object, in a way you are also leaving a signature with your name on it for forensics to find, document, and share with everyone as an IOC. So let’s talk a little bit about that.
Fingerprinting-resistant identifier generation
So how can we overcome this problem? Simple, just don’t. I mean, try to do without naming kernel objects and implement some other measure. Contrary to what I said before, in this case elimination is an acceptable threat modeling remedy, because data is more expendable than code. If you need to share a pipe or a mutex, pass its address through some other means that doesn’t require a preshared name. If you need to store data for later use, ask yourself whether it is critical to the mission and whether you can do without state persistence, or consider something like storing it inside the motherboard’s NVRAM (it still requires an identifier, but it also has the characteristics of an anti-scan measure that shields it from unwanted eyes).
In some rare and limited cases you can transfer the risk, for example by weaponizing previously installed write protection software on the target computer, such as Deep Freeze. Wrote data on the disk? No problem, Deep Freeze will take care of it (But always RTFM and test, test, test; what if they have an out-of-band logging solution?). Like I said those are rare and this one also doesn’t give protection during the execution.
Finally, let’s talk about the real thing: mitigation. Whenever there is a need to name something covertly, one of my favorite techniques is to generate the name randomly. Here are a bunch of strings generated by RANDOM.ORG:
s9QZEycn9aGjRBP98LWO nWnp7DbLtgs8yIt8nRXQ 6GzgLhVPubTHp0GIwrDa z16lns0dQ15fAzymiYC1 Fe4Peghp3qT4usvKlZxo 4rRQPSlCwRD7S4ALq3HG PS2JtxlvzW2ICKTBXZit STExZZd74MT9qJqTezRU HFn6sFmLFuP9NkgFcUB6 FyqMQqk4GFf543vIv3AA
However, that doesn’t really solve anything. Even if you were to put them in the executable at compile time, they are still unique. For all we care, we could just say `ThisIsARandomString` and there would be no difference. So, we need to tweak it a little bit. And in order to do that, we need to define what I call hierarchies of time domains for random value generation (if there is already a better name for it, please let me know). Basically, there are several phases where a random value could start its existence, sorted from general to specific:
- Global-time (i.e. generated once, inserted into the code and never changed afterwards)
- Compile-time (i.e. generating a new value for each new compilation of the code)
- Run-time (i.e. generating a value during the execution of the code)
- Sub run-time (i.e. generating a value during a specific execution frame)
Those are definitely not set in stone, and the list could be expanded or shrunk according to one’s needs. Generally, every random value we see around that is used in cryptography and such is generated in the run-time domain. Because in theory, the generated values are not tied to anything except the universe (they are universally random, so to speak), and therefore they are more secure. But did you notice that I said tied? Yes, one funny and, in our case, useful characteristic of random number generators is that you can change their time domain as long as you seed them with a value that originates from your target domain.
To better understand, let’s continue with an example use case. One of the oldest tricks for checking whether an application instance is already running is to create a kernel object, such as a mutex or a semaphore, in the global or per-session namespace. Basically, if a kernel object with the given name already exists, the application is already running, there is no need to create a new instance, and it is safe to exit. But this method also has its own risks: if you choose a very common name such as `Mutex1`, there is a high risk that you are going to collide with another application. So the safer practice is to generate a random GUID value at global-time and use it as the preshared name.
However, although it might reduce the collision risk, its uniqueness also means it can be used as a fingerprint to craft a signature. To overcome that, we need to lower its time domain by seeding it with a value that originates from the targeted domain. For instance, if you want to target a per-user execution frame in the sub run-time, you can use the user name and the computer name (to increase entropy, in case the user name is very generic like user or john) as the seed:
```python
#!/usr/bin/env python3
# fingresid_poc1.py
import os
import itertools

global_guid = 'snxsqgvlslfdhhoykjhtryxsskcagymk'

# Get computer and user names from environment variables (Windows-specific names)
subrun_seed = os.getenv('username') + os.getenv('computername')

# Convert chars to their numeric counterparts and XOR them with each other
subrun_guid_ord = []
for chars in zip((ord(x) for x in global_guid), itertools.cycle(subrun_seed)):
    subrun_guid_ord.append(chars[0] ^ ord(chars[1]))

# "Asciify" the numeric values; for the sake of demonstration we are just
# distributing them within the 97-122 region, which corresponds to lowercase ASCII
subrun_guid = ''.join(chr((n - 97) % 26 + 97) for n in subrun_guid_ord)

print(subrun_guid)  # e.g. iieoipsuuqjcwsnissrnkdtkjlvivwtx
```
The preceding PoC takes `global_guid` and creates a similar-looking `subrun_guid` by combining it with the computer and user names, but this time it is in the sub run-time domain and is a unique value just for this computer and user. In a way, it is very similar to salting a password before storing the digest. Note that I chose a simple, low-entropy lowercase ASCII GUID for the sake of demonstration. In production, you should use proper GUIDs, but you will also have to deal with normalizing or asciifying them.
Even though this example is good enough for demonstration, we can still tweak it a little further. We can, for instance, combine this mitigation mechanism with a transference one by abusing ASLR and the Windows memory model, and also swap the ID generation mechanism for an alternative one.
```python
#!/usr/bin/env python3
# fingresid_poc2.py
import ctypes
import random
import string

# Declare the prototype so that 64-bit handles are not truncated to a C int
GetModuleHandleW = ctypes.windll.kernel32.GetModuleHandleW
GetModuleHandleW.argtypes = [ctypes.c_wchar_p]
GetModuleHandleW.restype = ctypes.c_void_p

# Get the base addresses of two commonly linked system DLLs
subrun_address_kernel32 = GetModuleHandleW('kernel32')
subrun_address_ntdll = GetModuleHandleW('ntdll')

# Initiate and seed the PRNG with numeric values of the addresses
subrun_prng = random.Random(subrun_address_kernel32 ^ subrun_address_ntdll)

# Craft an identifier in sub run-time that changes randomly every boot
subrun_uid = ''.join(subrun_prng.choice(string.ascii_letters + string.digits) for _ in range(32))

print(subrun_uid)  # e.g. GUfK0Jw628yFLmEo2kWctDd31MPAhcU1
```
In this example we queried the base addresses of two commonly linked system DLLs and used them as the seed for a sub run-time PRNG. Then we used the resulting PRNG to choose ASCII letters and digits to craft a 32-character identifier that is expected to change on each reboot; in other words, it is in the per-reboot sub run-time execution frame. This works beautifully because system DLLs’ base addresses are redetermined during each boot, and ASLR takes care of making things not static by giving us just enough entropy. Here is a more detailed explanation (Yosifovich & Solomon & Ionescu & Russinovich, 2017):
…For DLLs, computing the load offset begins with a per-boot, system-wide value called the image bias. This is computed by `MiInitializeRelocations` and stored in the global memory state structure (`MI_SYSTEM_INFORMATION`) in the […] (`MiImageBias` global variable in Windows 8.x/2012/R2). This value corresponds to the TSC of the current CPU when this function was called during the boot cycle, shifted and masked into an 8-bit value. This provides 256 possible values on 32 bit systems; similar computations are done for 64-bit systems with more possible values as the address space is vast. Unlike executables, this value is computed only once per boot and shared across the system to allow DLLs to remain shared in physical memory and relocated only once. If DLLs were remapped at different locations inside different processes, the code could not be shared. The loader would have to fix up address references differently for each process, thus turning what had been shareable read-only code into process-private data. Each process using a given DLL would have to have its own private copy of the DLL in physical memory.
This is great, because now you can use the resulting identifier to create a mutex to check if the application is already running. And since the identifier itself is random and changes every reboot, there is no way to create a static fingerprint. However, it should be noted that if some other software on the machine uses the same exact method, again you are risking collision. So you might want to combine it with a compile-time originated value to differentiate yourself.
Basically, what we have done so far is randomize the selection of characters. You should remember that whenever you introduce another layer of randomization, you are making it harder to fingerprint something. If you were to use the last example as it is, forensics would create an IOC such as “32-character string consisting of ASCII letters and digits 0-9”. In order to make it more resistant, we could also randomize the length of the identifier. But it’s not that simple.
First of all, if you choose the maximum number of allowed characters as the upper limit for your random length, and you get something like 1337 as a result, chances are it is going to get flagged as an anomaly. Because seriously, what kind of a sick bastard would choose a name that long? That introduces us to the disadvantage of randomization: the more random something is, the more behaviorally abnormal it becomes. So the best practice is to choose a range where the lower limit is high enough to make collisions less likely, and the upper limit is low enough to stay under the radar.
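The bounded-length idea can be sketched as follows. The bounds `MIN_LEN` and `MAX_LEN` and the helper `make_identifier` are assumptions picked purely for illustration, not values from my own testing; the seed is whatever value you derived from your target time domain.

```python
import random
import string

# Assumed bounds: long enough to make collisions unlikely,
# short enough not to look anomalous. Tune them per target.
MIN_LEN, MAX_LEN = 12, 24


def make_identifier(seed: int) -> str:
    """Derive both the length and the characters from one seeded PRNG."""
    prng = random.Random(seed)
    length = prng.randint(MIN_LEN, MAX_LEN)  # inclusive on both ends
    alphabet = string.ascii_letters + string.digits
    return ''.join(prng.choice(alphabet) for _ in range(length))


# The same seed always yields the same identifier, so every instance
# running in the same time domain agrees on the name.
print(make_identifier(0x7FFB1234))
```

Because the length is drawn from the same PRNG as the characters, no extra state has to be shared between instances.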
And even then, entropy analysis could be used to detect weird-looking names. But entropy analysis has its own problems. What if some impatient user creates a file with a random-looking, keyboard-mashed name? (You’d be surprised.) So due to the false positive risk, it could only be used as a secondary signature to further support other IOCs.
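For reference, here is a minimal sketch of the kind of per-name entropy measurement an analyst might run. It uses plain Shannon entropy over characters, which is an assumption on my part; real detection tooling may use different or additional measures, and the function name `shannon_entropy` is just illustrative.

```python
import math
from collections import Counter


def shannon_entropy(name: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not name:
        return 0.0
    total = len(name)
    counts = Counter(name)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# A short conventional name versus one of the random strings shown earlier
for candidate in ('Mutex1', 's9QZEycn9aGjRBP98LWO'):
    print(candidate, round(shannon_entropy(candidate), 3))
```

Note that the score depends heavily on string length and alphabet size, which is one more reason it only works as a supporting signal.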
Also some consideration should be made regarding the target characteristics. For instance, if the target computer is located in Asia, then Latin characters alone might be enough to raise flags. So it is advisable to adapt the exact methods you choose according to where you are targeting.
When you are modifying the time domain of a random value, always remember that a lower-hierarchy time domain will always supersede a higher one. So whenever you combine compile-time with run-time, the resulting value will be in the run-time domain. Whenever you combine a per-reboot execution frame with a per-login one, the result will be in the per-login sub run-time execution frame, and so on. The more specific a time frame, the greater its effect. The latest value from the most specific time domain acts as the password, while the previous ones from higher and more generic domains act as the salt.
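The superseding rule can be sketched as follows. This is a hedged illustration rather than the method used in the PoCs above: it swaps the XOR and raw-address seeding for a SHA-256 digest over all domain values, and `derive_identifier`, `COMPILE_TIME_VALUE`, and the sample boot values are hypothetical names and placeholders.

```python
import hashlib
import random
import string


def derive_identifier(*domain_values: str, length: int = 32) -> str:
    """Fold values from several time domains (generic first, specific last)
    into one PRNG seed and derive an identifier from it."""
    digest = hashlib.sha256()
    for value in domain_values:
        encoded = value.encode('utf-8')
        # Length-prefix each value so ('ab', 'c') and ('a', 'bc') differ
        digest.update(len(encoded).to_bytes(4, 'big'))
        digest.update(encoded)
    prng = random.Random(int.from_bytes(digest.digest(), 'big'))
    alphabet = string.ascii_letters + string.digits
    return ''.join(prng.choice(alphabet) for _ in range(length))


COMPILE_TIME_VALUE = 'build-0001'  # hypothetical compile-time value (the "salt")

# Placeholder per-boot values (the "password"); in practice these would
# come from something that changes every boot.
id_boot_a = derive_identifier(COMPILE_TIME_VALUE, 'boot-A')
id_boot_b = derive_identifier(COMPILE_TIME_VALUE, 'boot-B')

print(id_boot_a != id_boot_b)  # True: the more specific domain dominates
```

The compile-time value only differentiates you from other software using the same trick; the result still lives in the most specific domain that was mixed in.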
A better and more advanced application of this technique could be achieved by using NLP to mimic human writing, by using code samples from public repositories. I might research this in the future or someone might want to beat me to it. Would love to see how it would work.
Well, that should do it for now. In the next part, I plan to talk about how black propaganda & disinformation tactics can be used against attribution attempts.
- Shostack, A. (2014). Threat Modeling: Designing for Security. John Wiley & Sons.
- Szor, P. (2005). The Art of Computer Virus Research and Defense. Addison Wesley Professional.
- Blunden, B. (2012). The Rootkit Arsenal: Escape and Evasion in the Dark Corners of the System (2nd ed.). Jones & Bartlett Learning.
- Yosifovich, P. & Solomon, D. A. & Ionescu, A. & Russinovich, M. E. (2017). Windows Internals, Part 1: System architecture, processes, threads, memory management, and more (7th ed.). Microsoft Press.