With the latest RedTeam Engagement wrapped up, I was going through the lessons learned and figured that I needed to improve/refine my toolset again to be able to evade the EDR/AV/XDR and successfully run my payload. Tools get signatured over time, techniques become outdated, and what I had on hand just wasn't good enough to successfully evade this specific EDR and run my payload. So, I had to resort to other stealthy methods to pwn the customer. It all worked out in the end, but I left with a sour feeling, like I "barely" got away without getting caught. It was time to get back in the kitchen and cook up something fresh.
I don't know what I'm doing
Just a heads-up: I'm no coding wizard! My code might get the job done, but it's definitely not award-winning efficient. Everything you see here (techniques, snippets, the whole shebang) is borrowed and mashed together from what I've learned so far. I am not claiming any of it as my own; I am simply trying to solidify my understanding by documenting my journey. If it helps anyone, even better.
My research draws on a buffet of papers, tools, and brilliant minds. I'm standing tall on the shoulders of giants, and I'll do my best to credit every source that helped me along the way. If I miss any, it's purely accidental. This isn't an "EDR evasion 101" guide. It's a snapshot of how I navigated the EDR maze in 2024 and how I got there. You'll find methods that worked and some that did not. (Post 1 will be mostly theoretical though.)
Just as the defenders have several tools and options in place to spot malicious activity, the attackers have several different techniques to get around modern defense systems. You do not need to utilize the whole toolset and include 15 different process injection techniques, unhook everything and kill all telemetry. Doing so would certainly be an IOC in itself and would get your payload nuked. "Less is more" also applies to EDR/XDR/AV/Buzzword (calling it EDR from here on) evasion: the more you blend into "normal" traffic and processes, the less likely you are to be spotted. Think about a secret agent (payload) hiding in the crowd (Windows), dressed in casual clothing (toolset). Much harder to spot (telemetry) than a terminator equipped with a rocket launcher.
With that being said, the goals for this project were clear:
- Get around EDR and run payload without detection or at least only a few low or medium alerts.
- As input accept simple shellcode. This could be modified but it fits my usual workflow so this was supposed to be V1. As input I used very simple msfvenom-shellcode.
- OPSEC is a concern. Try to create as few indicators as possible.
As Victim-PC I used the latest Windows11 Pro and Sophos (XDR Intercept). I set up an additional client with Microsoft Defender for Endpoint because it also shows you all the low-level-alerts while Sophos does not. The test-payload was a simple meterpreter-shellcode that I created and then threw into a very basic executable. This was supposed to be the starting point. The idea was to use something that is well known (meterpreter) and add techniques to it until the evasion worked. It is important to note here that evasion != being invisible.
Many blog posts (mine included) mention evasion and give the reader the false idea that because the EDR is not flagging or killing anything, they are now invisible and can start running the loudest and most OPSEC-unsafe commands. If you did not kill the EDR, its integrity checks and all its heartbeat services that are trying to phone home (and whatever else is checking the EDR's health), then it is very likely that the EDR is still collecting data about your process and may flag you for whatever you try to do in the future.
Show me the goods
Ok enough theory...where is the code? First, we'll establish the base code by generating the main meterpreter shellcode. Using a basic Kali Linux setup, I created the following code, excluding null bytes for good practice:
Next I created the very simple C code. First define the shellcode itself and simply copy and paste the shellcode into it.
Next I used VirtualAlloc to create a buffer big enough to fit the whole shellcode in it and then copied it over with memcpy. Then I created a new thread with CreateThread and WaitForSingleObject to keep it running forever.
As expected this sample got caught immediately. Several things stand out here and scream "MALWAAARE!!".
First, RWX-Memory (VirtualAlloc) is very suspicious and should be avoided in most cases.
Second, the meterpreter-payload is obviously well-known and all modern EDRs will flag and kill it. In real engagements it would make more sense to take something less known or even create your own C2.
Third, the "Private" memory is suspicious because it is not backed by a file on disk and only exists in memory.
XOR and .data
First thing we want to try to defuse/hide our payload is something I recently read in Daniel Feichter's (@VirtualAllocEx) Blog "Meterpreter vs modern EDRs in 2023" where he tried to evade modern EDRs without using anything complicated. While the techniques used are not new, it still surprised me that this worked and Daniel was able to run his payload (meterpreter) and evade the EDR of choice. Because it worked for him, it should work for me as well... ;)
Daniel used XOR-Encryption for his shellcode and global variables. The idea here is that a) encryption should hide the malicious shellcode from static detection and b) global variables are located in the .data-section, which seemingly is not as suspicious. Doing so is simple: first we will move the shellcode-buffer and make it a global variable....easy :)
HOWEVER, the compiler and other settings will sometimes merge this. For example, in my basic example I declared the "buf"-variable in the main-function and it got put into the .rdata-section. The .rdata-section is read-only-data so logically this should be the safer/better choice (if you do not need RW) but for now I will go with the .data-section and see how it goes.
To do so, you can force the compiler to put your shellcode into the .data-section by adding the following lines:
Next, I wrote a short script to xor-encrypt the original shellcode with the xor-key 0x69 (yes, I am very funny). I see the same thing here that Daniel saw in his experiment. The static scanner will not flag the file but running it is still not possible. This is obvious as the static scanner will only see "rubbish" code when it checks the xor-encrypted shellcode (excluding all other IOCs) and not get triggered. However, as soon as the xor-decryption happens at runtime (again, excluding all the other IOCs) the original shellcode sits in memory in plain form and can be inspected.
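To make the "rubbish for the static scanner, plain again at runtime" point concrete, here is a minimal sketch of a single-byte XOR round trip. It is not the script I used; the function name `xor_bytes` and the placeholder bytes are made up for illustration, and the input is a harmless string rather than shellcode.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key; applying it twice restores the input."""
    if not 0 <= key <= 0xFF:
        raise ValueError("key must fit in one byte")
    return bytes(b ^ key for b in data)


KEY = 0x69
plain = b"harmless placeholder bytes"  # stand-in data, not real shellcode

encoded = xor_bytes(plain, KEY)    # what ends up stored in the file
decoded = xor_bytes(encoded, KEY)  # what exists in memory after decoding

print(encoded.hex())
print(decoded)
```

The stored bytes share no substring with the original, which is all a simple byte signature needs to miss it. The decoded bytes are identical to the original, which is why anything that scans memory after the decode step sees exactly what it would have seen without XOR.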
Instead of just copying the entirety of Daniel's Blog I want to take a different route and see what else I could do to get my shellcode to work. The first 2 things I want to look at are AMSI and ETW. AMSI will basically scan any code before it is supposed to run. However, AMSI is not integrated into everything, only into .NET, PowerShell, macros and some other things:
You can see the amsi.dll if you simply open a new powershell-process. Because we are not coming in contact with AMSI today I will skip it for now.
ETW & ETWTi
Event Tracing for Windows tracks several different things like process creation and whatever else. Developers can subscribe to ETW and then receive information about several things going on on the system. If you want to see what that looks like you can use the tool ETWExplorer from Pavel Yosifovich or use the logman-utility. In case you want to know more about ETW I recommend taking a look at the following ired.team-post (this site is a goldmine...)
ETW and ETWTi are similar as in they both belong to the Event Tracing for Windows. ETWTi (Event Tracing for Windows Threat Intelligence) is operating on a much smaller scale and only reports security-related events. Developers (EDR) can subscribe to both and receive info about malicious processes, file creation and so on.
What I want to try now is if I can somehow get around ETW or disable it. This may or may not be enough to get my shellcode to run. To do so I changed a few things. First, I switched from a staged to a stageless payload. This obviously increased the shellcode-size which meant I had to move it out of the .data-section and put it into the .rsrc-section. Look at Maldevacademy.com for ideas if you want to find out where you can store your payload.
Next I had to look for suitable ways of getting around ETW or ways on how to disable it. One of the many older solutions basically did the following:
1. Find ntdll.dll in the loaded modules (notepad.exe as example)
2. Get the base address of the module and then calculate the offset to EtwEventWrite
EtwEventWrite is part of the functions used to actually write the events (duh!). If you can simply let the function return then nothing will get written to disk/log. This ETW-patch follows the same idea as some AMSI-Patches (there are others that make it "fail" for example).
The issue here is that this patching obviously leaves a lot of IOCs in your code which can be used to signature you. You will have to change protections on the loaded ntdll.dll and write memory to modify the loaded function. VirtualProtect and WriteProcessMemory (Nt and/or Userland) are probably hooked and your mods are likely to be inspected. Another idea is to use hardware breakpoints (HWBP) and according to my research they seem to still be working fine for now, but I recently read about another novel method for bypassing ETW from @aceb0nd which sounds way cooler ;)
Basically you hammer EtwEventRegister until it all breaks down. Sounds fun! To not reinvent the wheel I just c&p aceb0nd's code (Thank you for the idea and code) and modified the GUID. Creating testing guids can be done HERE for example. For your final release you should use a function that creates random GUIDs otherwise you'll give the defenders an easy IOC. Running the code shows that after several calls to EtwEventRegister the last call returns with ERROR_OUTOFMEMORY.
Question is, is that enough now to circumvent the EDR? What is actually happening here? To figure this out I will use the logman-utility (logman query providers -pid <PID>) and check which providers are currently registered with one specific process. Here is an example based on notepad.exe.
TLDR; It's a lot of providers...
So to test this out with the current payload I "crashed" the ETW-Buffer and then checked with logman. While I could see fewer providers, some of them were still registered. MSDN also tells you to ignore the OUTOFMEMORY-Error and that events will eventually get flushed to disk/log, so I am guessing here that the providers are simply restarting/re-registering.
Even if this is not the case and some providers stay off the process, I want a clean solution and will therefore try something different. Not saying here that aceb0nd's solution never worked but at least for me it does not "fully" work right now. Could also be that there is some kind of "Heartbeat-Service" that checks the currently registered providers and if necessary re-registers them once they are gone (don't know, just guessing) and I was simply too slow with checking for current providers. Also very likely that I am just stupid so take all of that with a grain of salt. ;)
Patching it is then
So this did not work, so I will try to use the patching-method because it is much simpler to implement than the HWBP-method. However, instead of patching the EtwEventWrite-Function I am going to target NtTraceEvent as this is what is getting called eventually. Another candidate is EtwEventEnabled but NtTraceEvent (because it sits "lower") gets called far more often. EtwEventWrite is also a big IOC for EDRs as it has been used in older solutions. Here is the function before and after the patch (ret). All it does is the following:
- Get the NtTraceEvent-Function via GetProcAddress and LoadLibrary (ntdll)
- Change the protection to RWX via VirtualProtect.
- Patch the first byte with "c3" (ret)
- Change the protection back to RX
Is that enough though? It turns out...nope, it is not.
Well, shit! This was to be expected though, as ETW (and ETWTi) is not the only telemetry for EDRs to digest. In addition to the ETW-stuff we could avoid in userland by patching for example, there are other things that are throwing us off. Some things are easier to patch than others though. (The process of patching and using GetProcAddress and so on are also IOCs themselves.)
Creating the thread or process for example will travel all the way down to the kernel where the EDR will watch out for malicious actions. It does that by using callbacks such as PsSetCreateProcessNotifyRoutine. This basically means that anytime a process gets created, the subscriber (EDR) will get a notification about this event and can then use that information. This is obviously not the only callback and there are several others for process creation, file operations and so on.
Because all that stuff is happening in kernel-land, we cannot simply avoid going through those callbacks if we do not have higher privileges or use some other kind of trickery. In addition to the stuff happening in kernel-land most EDRs (not all of them) also like to hook "interesting" DLLs and load their own into each and every process. We can see an example with a basic notepad.exe on this host with Sophos EDR on Windows 11.
As mentioned before, in addition to those extra libraries the EDR also likes to hook other libraries and direct any code going through them towards the EDR before sending it back. We can see this again with this example of notepad.exe and Sophos on Win11. In this case I checked the function LdrLoadDll. The first instruction you can see is a jump to another address. Not everything gets hooked though, as this would very likely create performance-issues or just duplicate data which is covered by other EDR-telemetry. Some data is also just not relevant to the EDR.
Following this address we can see that we end up in the loaded hmpalert.dll which is from Sophos.
If you want to find hooked functions without doing all the work yourself you can take a look at the following repo from Mr-Un1k0d3r where he and other researchers compiled lists of multiple EDRs and their hooked functions plus tools to find even more.
Unhooking & loaded modules
There are several ways to get around this. You can unhook the hooked dll by loading a fresh copy from disk. The downside here again is that loading ntdll will go through the hooked functions and could be intercepted. The next thing you could try is removing the hook, either by restoring the original bytes or by modifying the hook so that instead of jumping to the EDR-part it jumps directly to the original hooked part. Those are not the only ways to unhook functions but ones that are often used.
An "easier" way (in terms of being lazy) seems to be to just not allow the system to load any non-microsoft-dlls lol. How is that possible? When you create a process you can set different flags to alter the behaviour of this process. One of the interesting flags you can set allows you to simply block any binary that is not signed by Microsoft. As Sophos is obviously not Microsoft this should also block the hmpalert.dll from loading.
To do so I could fork my own process for example and enable the flag. "Small" downside though is this part here:
As already mentioned, properly signed in this case means that the binary needs to carry a Microsoft signature. This covers DLLs from Microsoft BUT also DLLs from other vendors that Microsoft has signed. Unfortunately for us hmpalert.dll from Sophos is also signed by Microsoft. This means that this flag is useless in this case. Back to the drawing board again.
Nt, Zw, R0, R3, SSDT, wtf
As already mentioned, the EDR likes to hook and inspect function-calls to see what's going on. This happens in userland by hooking the functions in the imported modules (as shown, LdrLoadDll for example) but also in kernel-land and by inspecting the transition to kernel-land. With kernel-callbacks, inspection of call-stacks, source of the syscall, ETWTi and other telemetry it is possible to cover a lot of ground and make it harder for any "interesting" action like ProcessCreation to go unnoticed.
While userland-unhooking was very effective a few years ago, it became less and less important and is now just another technique you can use to cut off some telemetry but not enough on its own to get around a mature EDR. Some EDRs even moved away from userland-hooks completely or never used them in the first place. Microsoft Defender for Endpoint for example does not rely on userland-hooking and gets all its telemetry from other sources such as kernel-callbacks, ETWTi and so on. Knowing this, attackers improved their techniques and implemented ways on how to get around hooked functions and stay undetected.
To understand the next techniques it is important to understand how a hooked/unhooked function-call normally works. The following is from the "Point of view" from the EDR in regards to the telemetry or data it can collect.
One Ring to rule them all
When talking about user- and kernel-land, people are actually talking about the CPU's "privilege rings" as Windows uses them (Linux uses a similar approach). The rings go from Ring 3 (least privileged) to Ring 0 (most privileged); userland runs in Ring 3 and kernel-land in Ring 0 (we do not care about rings 1 and 2 for now). The idea here is that you as a user should not have "direct" access to anything running with higher privileges in the lower-numbered rings. You will need to go through some defined function-calls that will then forward or handle executing the wanted code/action for you. At least that's how it is supposed to be.
To better understand this, I opened up notepad.exe as normal user on Windows 11 with Sophos installed. Notepad besides saving files needs to be able to perform different actions. To do so it imports several different libraries which export the needed function-calls.
If I now want to save the file I edited in notepad I will click save and then choose a location and save the file. This action will then call several Win32-APIs and perform different actions. One of those is to obviously save the file. To do so notepad will first call the Win32-APIs (which we can call from userland) and which are documented. In this example notepad is calling SHCreateDirectoryExW from the windows.storage.dll (ignoring everything else for this example).
So far everything very simple. The call is not done here however. Before the file actually gets written the API-call will travel down to the native API (NtApi) which is mostly undocumented (from Microsoft at least) and resides in C:\Windows\System32\ntdll.dll (gets loaded into memory). In this case the next function call is NtCreateFile which we can see in this screenshot.
What is important here are the unique instructions.
The first unique instruction is mov r10, rcx. Under the x64 calling convention rcx holds the first argument, but the syscall instruction overwrites rcx with the address to return to, so the argument is copied to r10 first. Next is mov eax, 55. The value moved into eax is the syscall number, which tells the kernel what it is supposed to do (create a file in this case). As mentioned before, we are not supposed to do kernel stuff directly, so we need to switch from userland to kernel-land and tell the kernel what we want to do. This is done by moving the number of the syscall we want (0x55 in this case) into eax and then executing the special instruction syscall (sysenter on 32-bit), which tells the CPU to let the kernel take over.
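As a small illustration of why those first instructions matter, the following sketch reads a stub as plain bytes and reports whether it still starts with the expected prologue or with a jump. It only works on byte strings you hand it and does not touch any process. The function name `inspect_stub` is made up, the jump displacement in the hooked sample is hypothetical, and treating a leading `e9` as "hooked" is a simplification (hooks can take other forms).

```python
from typing import Optional, Tuple

# Encoding of: mov r10, rcx ; mov eax, imm32
CLEAN_PROLOGUE = bytes.fromhex("4c8bd1b8")


def inspect_stub(stub: bytes) -> Tuple[str, Optional[int]]:
    """Classify the start of an x64 Nt* stub given as raw bytes."""
    if stub[:4] == CLEAN_PROLOGUE and len(stub) >= 8:
        # The 32-bit immediate after the b8 opcode is the syscall number.
        return "clean", int.from_bytes(stub[4:8], "little")
    if stub[:1] == b"\xe9":
        # jmp rel32 sits where the prologue should be: the number is hidden.
        return "hooked", None
    return "unknown", None


# mov r10,rcx ; mov eax,55h ; test byte ptr [7FFE0308h],1 ; jne ; syscall ; ret
clean_stub = bytes.fromhex("4c8bd1b855000000f604250803fe7f0175030f05c3")
# Same stub with its first five bytes replaced by a jmp (displacement made up)
hooked_stub = bytes.fromhex("e9a1b2c3d4000000f604250803fe7f0175030f05c3")

print(inspect_stub(clean_stub))
print(inspect_stub(hooked_stub))
```

The point is simply that the syscall number is an ordinary immediate value inside the stub, and that a hook placed at the start of the function overwrites exactly the bytes that carry it.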
It is important to note here that there are 2 versions of this native API. One with the Nt-Prefix and one with the Zw-Prefix. For the malware-developer they are basically identical and you can use them interchangeably.
How does the system know what each number (syscall) means though? This is done with the System Service Descriptor Table (SSDT) which basically holds all the syscalls for the system. One would think that modifying the SSDT would give you unlimited power over the system and one would be right. Until 2005 (with the introduction of WinXP x64 and PatchGuard) this was possible and rootkits could abuse the SSDT to hide/modify anything going through there, for example to hide files or registry-entries. There is a great explanation and video here on how that worked. Nowadays trying to modify the SSDT ends in a bluescreen.
To see what the SSDT looks like I started a kernel-debugging-session on Windows 11 and used WinDBG to inspect the SSDT itself. The first value we see is the pointer to the SSDT (KiServiceTable).
Using the shown address we can then calculate and check where the address points to. NtAccessCheck in this case:
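The calculation behind that lookup can be sketched as follows, assuming the x64 table layout as it is commonly documented: each entry is a 32-bit value whose upper 28 bits are a signed offset relative to KiServiceTable and whose lowest 4 bits hold the number of arguments passed on the stack. The function name and all numbers below are hypothetical and not taken from my debugging session.

```python
from typing import Tuple


def decode_ssdt_entry(table_base: int, entry: int) -> Tuple[int, int]:
    """Return (routine_address, stack_arg_count) for one 32-bit x64 SSDT entry."""
    entry &= 0xFFFFFFFF
    stack_args = entry & 0xF
    # Treat the entry as a signed 32-bit value so the shift keeps the sign.
    signed = entry - (1 << 32) if entry & 0x80000000 else entry
    offset = signed >> 4
    return table_base + offset, stack_args


# Hypothetical values for illustration only
KI_SERVICE_TABLE = 0xFFFFF80000001000
address, stack_args = decode_ssdt_entry(KI_SERVICE_TABLE, 0x00012342)
print(hex(address), stack_args)
```

In the debugger this corresponds to reading the 4-byte entry at KiServiceTable + 4 * syscall number, shifting it right by 4 and adding the result to the table base.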
To recap, we know that EDRs like to hook some functions in userland. We also know that calling some functions like CreateFileA for example will in return call functions from the native API (NtCreateFile) and only then transfer it over to the kernel. In return this means that if we want to evade the EDR we would need to avoid the hook somehow or do the transfer to the kernel ourselves and do our own syscalls.
While not fully documented we could just rebuild the syscall-routine in ntdll.dll ourselves and then do our own syscalls. This would avoid any hooks in userland. There are 2 downsides to this technique. First, syscall numbers are not consistent but may change between different versions. Not only between major versions like Win 7 and 10 but also between minor versions like 20H2 and 21H2 for example. This means that we either have to know the environment beforehand or need to find a way to get the correct syscall numbers. The second downside is that doing direct syscalls is a known technique and therefore monitored. Every syscall itself is an IOC, as is the fact that syscalls should technically only come from ntdll.dll. So if they do not, then it is very likely something malicious.
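From the defender's point of view that second downside boils down to a range check: does the address the syscall returns to lie inside ntdll.dll? A minimal model of that check is sketched below. The module ranges are invented, the names `ModuleRange`, `syscall_origin` and `looks_suspicious` are hypothetical, and a real product would also have to account for other legitimate sources such as win32u.dll, so treat it as an illustration of the idea only.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ModuleRange:
    name: str
    base: int
    size: int

    def contains(self, address: int) -> bool:
        return self.base <= address < self.base + self.size


def syscall_origin(return_address: int, modules: Iterable[ModuleRange]) -> str:
    """Name of the module the syscall returns into, or 'unbacked' if none matches."""
    for module in modules:
        if module.contains(return_address):
            return module.name
    return "unbacked"


def looks_suspicious(return_address: int, modules: Iterable[ModuleRange]) -> bool:
    """Simplified rule from the text: anything not returning into ntdll.dll is odd."""
    return syscall_origin(return_address, modules).lower() != "ntdll.dll"


# Invented address ranges for illustration
loaded = [
    ModuleRange("loader.exe", 0x7FF600000000, 0x20000),
    ModuleRange("ntdll.dll", 0x7FFB00000000, 0x1F0000),
]

print(looks_suspicious(0x7FFB0009D4C2, loaded))  # returns into ntdll.dll
print(looks_suspicious(0x7FF600001A30, loaded))  # returns into the executable itself
print(looks_suspicious(0x000001D2A0000040, loaded))  # private, unbacked memory
```

This is why a hand-built syscall stub sitting in your own image or in private memory stands out, no matter how clean the rest of the code is.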
Still, to understand how they actually work I wanted to create a simple example and showcase on how a mature EDR is able to recognize malicious behaviour. To be able to do direct syscalls we will need to define an .asm-file in Visual Studio and add it to our project.
Next, change its Item-properties to Microsoft Macro Assembler and then make sure that the project > build dependencies include masm-files.
With that done we need to replicate an unhooked syscall. To do so we can simply check the syscall we want to replicate and then copy the behavior into our own code. For this example I selected NtDeleteFile and inspected it on my debugger-machine. We can see the syscall number 0xd3 and the rest of the code and can simply copy it over into our code. Keep in mind, the syscall number in this case (Win10 x64) is 0xd3, while on Windows 11 it is 0xd9. On Windows 7 or any other minor/major version it may be something different.
As you can see I did not copy the test... and jne... instructions. As far as I can tell, they check a flag in the shared user data page and branch to the older int 2e entry path instead of syscall when it is set. As this is just a test they can be ignored. Next we can use NtDoc to figure out how to use this syscall. In this case I had to define 2 structs, OBJECT_ATTRIBUTES and UNICODE_STRING, to be able to call NtDeleteFile via syscall. The whole code could look something like this:
Quite a lot of code for just one simple function. To keep the pain of declaring so much stuff to a minimum we can use the awesome tool Syswhispers3 from klezVirus (Blog) which will do all the heavy lifting for us. Not only does it write the syscalls and structs and all of that stuff for you but also takes care of some IOCs by introducing egghunters to evade static signatures and includes dynamic SSN resolution. As an example, here is the command to get the NtDeleteFile via syswhispers:
Next to Syswhispers there are other techniques such as Hell's Gate, Halo's Gate and Tartarus Gate. Shortly after (2021) VX-Underground released Hell's Gate which allowed the developer to dynamically calculate syscall IDs. The downside here was that syscalls were NOT coming from ntdll.dll. Tracing the return-address, the EDR could clearly see that the syscall did NOT originate from ntdll.dll and was therefore likely malicious. While it worked back then, it is an IOC nowadays. Another downside is that syscalls (not all of them) are hooked nowadays (the function in ntdll.dll) which means that the calculation deployed in Hell's Gate may not work all the time.
To combat this SEKTOR7 published Halo's Gate which took care of this limitation. The idea here is that not all syscalls are hooked. This means that you only need to find a clean one and then calculate the offset to the one you are interested in. Example: imagine syscall 0x45 is hooked but 0x44 is not. Now imagine we are interested in whatever is hiding behind the jmp at syscall 0x45. With a hook installed we cannot see the syscall-id, so at this point we do not know that we need the id 0x45.
To fix this we just go up (go back) and look at the previous number/syscall and check if that one is hooked or not. If it is not, then we extract the syscall id (0x44) and just add 1 to get to our wanted syscall (0x44 + 1 = 0x45). If 0x44 is also hooked we go up again, this time adding 2 to whatever clean id we find, and so on until we find a clean syscall.
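The arithmetic can be modelled on a plain list, without touching any real module. In this sketch each list element stands for one stub in memory order and holds either the syscall id that can be read from it or None if it is hooked. It assumes, as the description above does, that neighbouring stubs carry consecutive ids, and it only walks upwards as in my example. The function name is hypothetical.

```python
from typing import Optional, Sequence


def resolve_from_neighbour(visible: Sequence[Optional[int]], index: int) -> Optional[int]:
    """Recover the id of stub `index` from the nearest readable stub above it.

    visible[i] is the syscall id readable from stub i, or None if it is hooked.
    """
    for distance in range(index + 1):
        candidate = visible[index - distance]
        if candidate is not None:
            # Each step back up is one id lower, so add the distance back on.
            return candidate + distance
    return None  # every stub up to the start of the list is hooked


# Mock layout: 0x43 and 0x44 are readable, the stub we want (0x45) is hooked
stubs = [0x43, 0x44, None]
print(hex(resolve_from_neighbour(stubs, 2)))

# Both 0x44 and 0x45 hooked: two steps up to 0x43, then add 2
stubs = [0x43, None, None]
print(hex(resolve_from_neighbour(stubs, 2)))
```

If the assumption about consecutive ids does not hold on a given build, the result is simply wrong, which is one reason this family of techniques kept getting revised.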
Next update and the current version as far as I know is called Tartarus Gate by trickster0. Reason for this version was some specific EDR that Halos Gate could not handle. Downside here is the syscall(s) NOT coming from ntdll.dll. To fix this Thanasis Tserpelis released an update that revolves around Tartarus Gate with the blog post "an OPSEC safe loader for Red Team Operations" in which he switched from direct syscalls to indirect syscalls and also included a new injection technique.
To understand the whole thing I first cloned Tartarus Gate and tried it out with a simple meterpreter-shellcode that opens up the calculator. While debugging, instead of seeing the calculator I got this notification.
The issue here lies in this allocated memory-region:
Meterpreter expects (needs) an RWX memory-region to work. To fix this I just updated it to RWX and then got the calculator open.
Initially I wanted to include more code to explain Module Stomping and so on but this post is already very long and I decided to keep this post as a sort of introduction and move the actual exploitation to the second or third post. So...see you next post ;)