Deobfuscating APT28’s HTA Trojan: A Deep Dive into VBE Techniques & Multi-Layer Obfuscation
Summary
I recently noted that APT28 has been conducting cyber espionage against the diplomatic relations of Central Asia and Kazakhstan. The third part of the sekoia.io report, which covers the HATVIBE and CHERRYSPY infection chain and relates to another report from CERT-UA, prompted me to extend the analysis. The sample is heavily obfuscated, so this analysis continues the deep dive with x32dbg debugging. It builds on my last report, “Unveiling APT28’s Advanced Obfuscated Loader and HTA Trojan: A Deep Dive with x32dbg Debugging”, posted on 2025.02.25. In this report I go further into the algorithm that decodes APT28’s HTA Trojan, to see what is happening and which evasion technique it uses.
Technical analysis
Sample hashes:
MD5: d0c3b49e788600ff3967f784eb5de973
SHA256: 332d9db35daa83c5ad226b9bf50e992713bc6a69c9ecd52a1223b81e992bc725
Format: plain text
First, let’s go back to the obfuscated code of the HTA Trojan. I noticed that the long strings appear to use “@#@” as a separator, so I split the code on it to make it more readable, as follows.
Fig.1-manually split with "@#@"
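The manual split can also be scripted. This is only a minimal sketch: the function name and the sample string are made up for illustration, and the only thing taken from the analysis is the “@#@” separator.

```python
def split_segments(obfuscated: str, delimiter: str = "@#@") -> list[str]:
    """Split one long obfuscated string into readable segments.

    Empty segments (for example, from a trailing delimiter) are dropped.
    """
    return [segment for segment in obfuscated.split(delimiter) if segment]


if __name__ == "__main__":
    # Placeholder text, not real sample content.
    demo = "aaa@#@bbb@#@ccc@#@"
    for number, segment in enumerate(split_segments(demo), start=1):
        print(number, segment)
```

Printing one segment per line gives roughly the same view as the manual split in Fig.1.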
Let’s continue from my last report. The routine decodes the obfuscated code and controls the loop iteration with the end string “1JICAA==^#~@”. The assembly instruction “add edi,2” advances EDI to the address of the next obfuscated character to be decoded.
Fig.2-iteration strings
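A rough Python sketch of that loop follows. It assumes the buffer holds 2-byte (UTF-16LE) characters, which is what the 2-byte step of “add edi,2” suggests but which I have not confirmed. The function name and buffer layout are illustrative and are not taken from the malware or from vbscript.dll.

```python
END_STRING = "1JICAA==^#~@"  # end string observed in the sample


def walk_encoded(buffer: bytes, end_string: str = END_STRING) -> str:
    """Collect 2-byte characters from buffer until the end string is reached."""
    stop = end_string.encode("utf-16-le")
    chars = []
    offset = 0  # plays the role of EDI
    while offset + 2 <= len(buffer):
        if buffer.startswith(stop, offset):
            break  # end string found: leave the loop
        code_unit = int.from_bytes(buffer[offset:offset + 2], "little")
        chars.append(chr(code_unit))
        offset += 2  # same step as "add edi,2"
    return "".join(chars)


if __name__ == "__main__":
    demo = ("xk" + END_STRING + "tail").encode("utf-16-le")
    print(walk_encoded(demo))
```

The sketch only walks the buffer; the per-character decoding happens inside the loop body in the real routine.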
Based on the result of the comparison, if the end has not been reached, it goes on to compare EDX with EAX; if the result is not greater than or equal, it moves on to the next step of the iteration loop. The EDX register points to the memory that holds the decoded output, and the EAX register is what gets processed next.
Fig.3-both edx and eax
Next, let’s look at what an obfuscated character is decoded to. The obfuscated strings sit in the DS segment, and EDI steps through them character by character; it now points at the character “x”.
Fig.4-the character “x” will be decoded
Moving forward, we can see that the next obfuscated character is compared with 80H (128), and execution jumps forward if the comparison is true. It then appears to pick, seemingly at random, the character ‘n’ embedded in the Trojan and load it into the AX register, as follows.
Fig.5-chosen by embedded character
Now let’s observe what happens. The value of the AX register is copied to the memory area that holds the decoded output.
Fig.6-the decoded character “n” showing
Continuing to debug, we find that it picks another character embedded inside the malware. The process looks like a mapping algorithm that uses an index as an address to point to the decoded character.
Fig.7-embedded strings inside the malware
The mapping algorithm is custom but not complicated; the address range runs from 6DB59CF0 to 6DB59FF0, and each deobfuscated character is taken from that range.
Fig.8-the key map algorithm
OK, now let’s make the key logic clearer with a picture, as follows. The mapping algorithm decodes the obfuscated strings into deobfuscated strings.
Fig.9-map algorithm
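To make the idea concrete, here is a toy version of such a lookup in Python. Everything in it is an assumption for illustration: the table entries, the three-column layout, and the position-based column pattern are invented (loosely modeled on how public Script Encoder decoders are usually described) and are not the real bytes from the 6DB59CF0 to 6DB59FF0 range. Only the 80H check and the “x” to “n” example come from the debugging session above.

```python
# Invented table: each encoded code point maps to three candidate characters.
TOY_TABLE = {
    ord("x"): ("n", "q", "z"),
    ord("k"): ("a", "e", "o"),
}

# Hypothetical repeating pattern that selects the column for each position.
TOY_PICK = (0, 1, 2, 0)


def decode_toy(encoded: str) -> str:
    """Decode with the toy table; unknown or >= 80H characters pass through."""
    out = []
    for position, ch in enumerate(encoded):
        code = ord(ch)
        if code >= 0x80 or code not in TOY_TABLE:
            out.append(ch)  # assumed behaviour: copy the character unchanged
        else:
            column = TOY_PICK[position % len(TOY_PICK)]
            out.append(TOY_TABLE[code][column])
    return "".join(out)


if __name__ == "__main__":
    print(decode_toy("xkx"))  # prints "nez" with the toy table above
```

A position-dependent column would explain why the choice of the embedded character looks random when stepping through the debugger, although I have not verified that the real table works exactly this way.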
Now let’s move on to the embedded strings highlighted in red. The decoding process above suggests that all of the source code was encoded into different but similar parts of the same length, all in plain text. For more detail, the big question to ask is: “How are strings like that generated?”
Fig.10-embedded strings
After debugging many times, I confirmed that the address range runs from 6DB59CF0 to 6DB59FF0, so I searched for the string “6DB59” in IDA and found something interesting. At this point I am very confident that the embedded strings come from Windows vbscript.dll, as follows.
Fig.11-vbscript.dll embedded strings
I checked further, went to the highlighted location, and compared the hex bytes at byte_6DB59CF0 with the embedded strings, which confirms that they come from Windows vbscript.dll.
Fig.12-hex embedded strings in Windows vbscript.dll
With AI’s help, I then looked for more strings and found the beginning flag #@~^ and the end flag ^#~@. I am now confident that this is the original VBE technique: Windows Script Encoder (screnc.exe), a Microsoft-provided tool that encodes VBScript (.vbs) and JavaScript (.js) files so that they are harder to read but still executable. Here’s what an encoded .vbe file might look like.
Fig.13-.vbe file
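A quick way to check a file for this encoding is to look for the marker pair. The sketch below assumes the Script Encoder markers are `#@~^` at the start and `^#~@` at the end (the latter matches the tail of the end string seen in the debugger); the function name and the demo text are hypothetical.

```python
START_MARKER = "#@~^"
END_MARKER = "^#~@"


def find_encoded_blocks(text: str) -> list[str]:
    """Return every substring that runs from a start marker to the next end marker."""
    blocks = []
    position = 0
    while True:
        start = text.find(START_MARKER, position)
        if start == -1:
            break
        end = text.find(END_MARKER, start + len(START_MARKER))
        if end == -1:
            break  # unterminated block: stop searching
        blocks.append(text[start:end + len(END_MARKER)])
        position = end + len(END_MARKER)
    return blocks


if __name__ == "__main__":
    # Placeholder content, not a valid encoded script.
    demo = "<script>#@~^AAAAAA==payload1JICAA==^#~@</script>"
    print(find_encoded_blocks(demo))
```

Each block found this way can be saved to its own .vbe file and passed to a decoder.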
To compare APT28’s Trojan against those flags, I found a Python script, “vbe-decoder.py”, on GitHub (thanks to JohnHammond) to do the deobfuscation. I saved the encoded content as a .vbe file, as follows, and ran the Python script on it.
Fig.14-APT28 .vbe sample
Finally, we have a deobfuscated .vbs script file, as follows. The hashes:
MD5: f3b5da6704f014c741fcbb8c59d3bfb0
SHA1: efc991003df3a384158cfe7f8f8658b09558e356
SHA256: 4132840174222c62aa10950e9583463a614c3ecf24775aa44cffd7530f2e80b9
Note: These hashes depend on the exact .vbe file and the Python script used, so yours may differ slightly; this is not important.
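The reason the hashes can differ is that they are computed over the exact bytes of the output file, so a different line ending or a trailing newline is enough to change them. A minimal sketch with Python’s standard hashlib module shows this; the helper name and the demo bytes are my own.

```python
import hashlib


def hash_bytes(data: bytes) -> dict[str, str]:
    """Return the MD5, SHA1 and SHA256 hex digests of data."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


if __name__ == "__main__":
    # Same script text with different line endings gives different hashes.
    unix_style = b"MsgBox 1\n"
    windows_style = b"MsgBox 1\r\n"
    print(hash_bytes(unix_style)["md5"])
    print(hash_bytes(windows_style)["md5"])
```

To hash a decoded file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to `hash_bytes`.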
Other hashes:
MD5: 690fe881d288167fde157c6fb834c3ef
SHA1: 364a51f445a69808ac7026cc585a99c3f818f360
SHA256: 0fa7e3ffb8a9ca246cc1f1e3f6118ced7a7b785de510d777b316dfcefdddb0be
Fig.15-APT28’s .vbs sample
To dive deeper into the .vbs sample in Fig.15, I modified the code so that it prints its output to a file.
Fig.16-change the malware and print out snippet
This way, we get the final malware sample, as follows. The hashes:
MD5: 2505649df3f33cf3b65059d338e3dd6f
SHA1: d2ba5a1e32232a8e52b7619d9e6c5aafd9aeb8e1
SHA256: 11992682f4c485bb0543cba2830b061a050bc9fcc853358a4c38f1e706451ff8
Fig.17-the final deobfuscated malware sample
Conclusion
APT28’s HTA Trojan uses the VBE technique inside an HTA together with multi-layer obfuscation to evade detection, which implies that APT28 is actively seeking new opportunities and changing its approach to cyber espionage campaigns. It is a significant hidden threat to the digital world; let’s pay closer attention.
IOCs
Files:
MD5: d0c3b49e788600ff3967f784eb5de973
SHA256: 332d9db35daa83c5ad226b9bf50e992713bc6a69c9ecd52a1223b81e992bc725
MD5: 690fe881d288167fde157c6fb834c3ef
SHA256: 0fa7e3ffb8a9ca246cc1f1e3f6118ced7a7b785de510d777b316dfcefdddb0be
MD5: 2505649df3f33cf3b65059d338e3dd6f
SHA256: 11992682f4c485bb0543cba2830b061a050bc9fcc853358a4c38f1e706451ff8
MD5: f3b5da6704f014c741fcbb8c59d3bfb0
SHA256: 4132840174222c62aa10950e9583463a614c3ecf24775aa44cffd7530f2e80b9
Network:
5[.]45.70[.]178
End.
Labels: #APT28 #HTA Trojan, #Multi-Layer Obfuscation, #SeekerAnalysis, #VBE Techniques