Crypters | Explanation, Comparison with Runtime Results, Research
Quote: I would like everyone to exercise your mind with me for a little bit and let me take you into "paranoid of being detected" mode:
- Paranoid number One - Crypted file should be clean by itself
Considering the tests results from previous posts and considering that this is a crypter comparision thread then I think a even better test could be performed to eliminate some of the "noise" and variance from these current results.
In order to understand this first let's consider this scenario with a focus on botnet software first:
Coder writes botnet software then distributes this to its fellow users.
Users take some crypt service to crypt the software.
They use it for a while, everyone is happy.
After some time or even at the genesis of the botnet software the coder or the client of this software decides to do a test run to the software without the crypter and launches in the wild his uncrypted but FUD file to some 500-1000 target machines.
Well from the moment the coder/user does that then an invisible chain reaction happens that takes a little time: AV products finds this new file with no reputation whatsoever on machine getting executed and sends this to its cloud scanning servers for further analysis. Cloud scanning reveals nothing then standard procedure for these files are to be sent its fellow AV company AV-analysts employees to manually analyse the files then the analyst makes signatures for this uncrypted FUD file. He also trains his machine learning dataset with this file but this will be put into discussion a little bit later.
After some time the uncrypted botnet software is being detected by some AV products
After all of this now the users takes the file again crypt it and it's FUD on scantime and runtime starts to fail on some products and that cannot be reverted. It will be like this like forever or until software developer reFUD's the uncrypted software while doing runtime scan checks iterations as many as necessary. Since this is a large code base then there is a huge attack surface for signatures of AV's so the coder will have a lot headache to fix this. But most probably the coder will not bother because it thinks it is crypter's job to do that.
The mistake here is of course that developer or client of the software had to launch this file uncrypted just because it worked ? which leads to this irreversible runtime detection which cannot be cleaned with any crypt software/service.
This leaves very few options to fix this like:
Software developer of the botnet could try to fix runtime detections and make his own crypt and deliver both of them always inseparable from each other
Or each botnet owner make his own software and be careful to always distribute the file only crypted.
In the end here is what could be done about the crypter in order to see how good it is statistically speaking:
In order to eliminate the risk that botnet software could be tainted by the previous scenario and most probably is then we should remove botnet software out of the picture for these tests.
We should be making a dummy payload .exe .bat file that is/was never sent in the wild that has 0 runtime detection or close to that ( that can be done I have tested it myself ) then this file should be crypted with the crypter candidates to see if they are able not to be detected with a dummy file for runtime detection.
Further test iterations can also be made by downloading a malware archive and throwing some parts of the binary of the well know malware into our payload .data then to .code section in order to specifically tune detection for a certain security product at runtime.
Also further iteration for tuning file .data and .code section and imports can be made.
This way I think noise variance is removed from the test results and will show how good a crypter is by itself.
Normaly the crypt that gives close to 0 runtime detections with the dummy file samples should be our best pick.
Then botnet software should have not been released into the wild then we are good to go for the lowest detection of a pair of crypt+software.
- Paranoind number Two - Machine Learning already knows you at first sight
Machine Learning(ML) aka Artificial Intelligence(AI) algos have grown very powerful for security products in the last 5 years or so.
ML algos makes uses of scantime and runtime behavior to gather signals in order to get a picture of how a payload looks or behaves.
Some of the signals are:
- Static scanning signals like: size of the file, file sections (.data .text or others) presence or size of those sections, file imports, visible strings in the sample file, icons, certificates, others resources etc..
- Runtime scanning like sensitive API calls, persistence events, network connections or behavior while communicating, files executed or dropped. Sample file name or dropped files name. ( BTW this should not be confused with sandbox analysis )
- Environmental context while executing a file and or execution chain that follows. Meaning which app launched the sample file. Was it powershell/cmd execution? Was it being launched from explorer? Was it being launched from a browser ? Was it downloaded from a browser as archive then extracted then executed? And so on.. All of this is also logged by the ML engine running on machine side in realtime.
Now basically ML gathers all those signals without knowing what's under the hood of the crypt actually then compiles sort of fingerprint out of all these signals then some security product analyst manually categorize this fingerprint as good/bad then inserts it into ML database on the product.
By doing this they actually train their ML model to recognize new samples that where never seen before and based on their signal count match they then put a "tag" on that newly seen sample into being good, suspicious or bad. For example if is an upgraded version of a very used good software then is safe to assume is a good sample, or if is a recrypt/upgrade of a bad malware then is safe to assume that this is a bad sample, or sample is a suspicious one.
In case file is suspicious then some AV products employs sandboxing of that launched file while letting it launch on the machine. This of course will put some limitations of that sandboxing such as readonly filesystem as a whole or parts of it. Persistence does not hold in this case.
A suspicious execution in a sandboxed evnironment which could have be avoided should be considered fault of the crypt.
Some security products that employs such sandboxing are: Comodo, Kasperksy, Mcafee, Fortinet
Normally ML algos need cloud detection to be enabled in order to work at their full power. Scantime detection services has ML disabled to a considerable degree. That includes services like avcheck and even VirusTotal on some of their AV products.
Controlled runtime scanning even with cloud scanning enabled does not takes into account the environmental context of the sample execution which has been lost or not provided.
When context is lost then runtime detections don't happen in controlled scanning environment and crypt developer thinks he won the battle. But actually when there is context then file has higher chances being detected in the wild as well.
Back on the ML training model we see that AV companies has now a very big database collection of those signal fingerprints. While matching this huge DB against a new sample, even a good benign one you might find it surprisingly that some of them could be detected as malware causing false positive detection even if they ain't and this is because some malware can have similar fingerprint to a benign software file and they will tend to match to a high degree.
This why crypt coder has to find a new recipe for its stub for creating a file fingerprint which by chance dos not cross-interfere with the existing ones in that huge DB.
This of course create a challenge for crypt coder to come up with new out of the box thinking all of the time in order to succeed.
The change of file icon for example that has been talked about to optimize file detection is because of that false positive detection of an already made malware that is similar
Just because some other person out there already used a similar crypt design with similar icon that is why now file is being detected even if has not been spread in the wild yet.
Here is a practical real example of ML into action:
Nowadays WD on Windowns 11 started to be acting differently that WD on Windows 10 when the same sample file is executed on them.
WD 11 now has the same detection algos and signatures as WD 10 but WD 11 has now a more aggressive approach when comes to ML detection.
While creating a undetected file and scanning it and even launching on WD 11 from explorer all could work just fine but after taking it a bit further:
1. making an archive and put file into archive and download this file from a host you might notice that WD 11 pops that payload is detected compared to just bringing the payload in WD 11 by other means and just executing it.
2. launching file from a shell script would make WD 11 change its behavior and consider that payload file is actually a malware
So as you can see its a mixture of how suspicious is a file plus its execution context in order to obtain best results
Now let's sum this up:
* A crypter is not enough to secure file detection to a maximum. There should be a testing of the whole pairing of: execution chain context + crypt + botnet software
After establishing the real case scenario of execution context we could then crypt the software and adjust parameters of the crypt and/or botnet software in such a way that ML detection is reduced to a minimum. In such a case good tools like scanner.to cannot control the execution context because is automated. Because of this a crypter coder would have to deploy its whole runtime scanning system for at least the most common security products out there in order to be able to reproduce the exact execution context for each software that is being used with the the crypt.
Based on some study of mine. Having general audience traffic of installs (that does not includes corporations, just regular people target machines) then usually traffic distribution by AV product as follows: has about 30% WD 10 , about 30% WD 11, about 20% Mcafee, about 5% Avast/AVG. Products which are the most used.
So at least for these the whole pair of context+crypt+botnet software should be tested and fine tuned on those to obtain the best detection results.
* A crypted software that has pass ML as suspicious should be considered as a failure because sandbox will only offer initial execution of that payload because cannot obtain persistence in any way. Crypter+Software should be refined in such a way that ML suspicion status of the file to be removed
To conclude all of this:
I think in order to have best results would be that just one coder to create a monolyth software that has it all-in-one, crypt + botnet software + well tested in the wild against all detections technologies of present, that's the only way to control detection at its best
But we can also accept compromise since the perfect outcome is very hard to achieve involving very hard work from the coder perspective which also tries to make his living out of this.
Bringing a sample file to being clean 60-70% on a real case execution scenario is now achievable but the difference of going from that 60-70% to up to a 100% is getting very harder to do by each additional 1% gained on top of that 60-70%.
We can either be "paranoid of being detected" or just make compromise and accept that shared commercial software has its limitations.
Welcome in Crypter comparison mega thread. In this topic we will narrow you everything about crypters to help you understand what is being used nowadays, this would make your work easier. With so many crypters on the market to choose from, it can be overwhelming to make a decision. When you whittle them down, all crypters do essentially the same thing: protect your file from anti-virus software. So when we compare crypters, we have to get down to the most important aspect, detection at runtime.
There are many rumors around, such as that WD cannot be bypassed. In this thread we will debunk those. Before we proceed strictly with tests let us describe in general you what are crypters and how they work.
Crypters are is in other words encrypting tools. They are used to obfuscate different type of files in order to protect them from cracking and analyzing and to bypass anti virus software. How does it work? There is many different types of encryption and injection methods and techniques to bypass anti virus software, so it would be really hard to list and describe everything, so in summary the image bellow illustrates very simply what a crypter does to your file.