Thursday, 21 September 2017

Announcement: IRIS-H (alpha) - Online Digital Forensic Tool for Microsoft Office Files


IRIS-H is an online web service that performs static analysis of the files stored in a directory-based or strictly structured formats. The service disassembles submitted files into individual components based on the detected file format and performs static analysis of each of the components. The analysis process involves sequentially reading components' binary data and enriching it with human readable information. The enrichment process is based on the binary data description as per official file format specification. Further rule-based evaluation is performed on the extracted data in order to establish if the submitted file can be harmful to a computer system.


Currently, the service is still being developed and running in alpha phase of the release life cycle. Application updates are pushed regularly and may require full data flush. There is absolutely no guarantee the uploaded data  and corresponding generated reports will be available during 'alpha' cycle. Any development/maintenance is done in my free time - this service is the result of a hobby rather than a paid job.


I'd like to say 'Thank You!' to the following individuals and organizations for their direct and/or indirect support. 👍
  • VirusTotal crew
  • StackOverflow ReactJS and NodeJS communities
  • Decalage
  • Individual security researchers who provided their invaluable feedback that materialized into improved and new features (I do not support fame leeching, so no names, but you know who you are)

IRIS-H Service


(skip to the next paragraph if you're not into fairy tales and stuff...)
Once upon a time I decided that challenging myself with learning JavaScript flavors is a great idea... well, still think it's a great idea... So, I set out on a quest to build something as I learn it. Looking far and wide, I thought a simple console based digital forensic tool could be a good start. Little did I know at the time, on how far it will actually take me... and so I ended up creating IRIS-H. The name doesn't really stand for anything with 'Next Generation' or 'Artificial Intelligence' or 'Blockchain' or even 'Cyber' buzz words in mind(sorry, no lasers either), despite having a logo. It's simply my tribute to all the hard working Irish people I had a pleasure to encounter in my last 20 years living in Ireland.

So, what's all about?...

I'd like to share an online tool I've been working on for the last a few months. IRIS-H is mainly being positioned as a digital forensics tool, though it has some rule-based logic it applies in order to determine the outcome of opening the analysed file on a computer system. The digital forensics aspect is revolving around putting descriptive meaning on the binary data derived from the analyzed file. Where possible, the tool attempts to extract digital artifacts to allow for further manual or automated analysis. In the case of malicious files analysis, a trained eye could leverage the tool to help him/her to perform a simple 'campaign' type attribution based on the data derived by the service and their best judgment.

It's important to note that IRIS-H is not a sandbox environment. The submitted file is never opened with its corresponding host application. This slightly limits IRIS-H functionality in terms of obtaining network based indicators of compromise(IoC). Still, the service attempts to evaluate the risk of opening the file on a computer system based on the IoCs derived from the binary data and presence of certain digital artifacts. Sometimes, when you search for a quick answer, this might be all you need.

IRIS-H can do some tricks, but the service is far from being mature. It's at the stage where it just started bringing some value to the work I do, so I hope it can do the same for the others.

Right, what can it do?...

The interaction with the service is done through its web interface. The interface allows for navigation through web pages that offer specific service features.
  • Home Page - offers ability to view latest file submissions. The view is utilizing a table to present the following data: submission time, submitted file MD5 hash, file name at submission time, detected file type and the result of rule-based risk evaluation.
Latest Submissions table view
  • Search Page - offers ability to search the service database using MD5, SHA1 or SHA256 hash strings. If a successful analysis already exists for the file with the provided hash the user is forwarded to a corresponding Report Page(more on this below).
Search Form view

  • Submit Page - offers ability to submit a file to be analyzed. File size and type validations are performed on this page. The page provides 'drag-and-drop' and 'no-submit-button' component to allow service users to select a file to submit. The upload file size limit is set to 10MB. If submitted file type is not supported an alert is spawned notifying the user. Only single file submissions are supported at this time. ZIPed files are not accepted yet.
Submit Page dropzone box
The following are some examples of what IRIS-H service can handle at this stage:
  • Files saved in Microsoft Office 97-2003 format (DOC, XLS, PPT) - only DOC files are fully supported at the moment
  • Files saved in Microsoft Office 2007+ format (DOCX, PPTX) - only DOCX files are fully supported at the moment. XLSX are not being accepted yet.
  • VBA project files extracted from the Microsoft Office documents that are saved in Open Office XML format
  • Objects embedded into other Microsoft Office documents including those in OOXML format
Technically, IRIS-H will accept and attempt to process any file in OLE-CF format. There are certain case per case limitations though, where the service might not have a parser for a 'not-so-common' OLE stream types.

Once a file is accepted and uploaded, IRIS-H begins dissecting the submitted file into separate components for further automated static analysis. When the analysis is completed, the user is forwarded to a corresponding Report Page.

Report Page is where everything comes together and to help service users navigate through the chunks of information, the page provides a navigation bar.
Left Navigation Bar Example
The information presented on the Report Page is a mixture of high level and 'deep dive' forensics data. The original idea was to be able to export it into a file format that can be stored or shared(like, PDF or similar), but due to some technical challenges I couldn't overcome I enabled report availability through its URL link.

Alright, but what's in it for me?...

Well, one would have to try it out and see, right? 😜 IRIS-H service is available at . Please make sure you get yourself familiar with Terms of Service. There is also About page that provides more details about the service.

... but just to give you some ideas, the screenshots below highlight some of the findings I came across of using file samples at my disposal.
VBA Digital Signature details
Detection for Microsoft Office field characters
IRIS-H detected embedded VBS script in the submitted file
'deep-dive' digital forensics view of the VBS file reported above.
Results of  VBA scripts analysis
Partial view of document embedded form analysis showing UA string hidden in form's tag field

Partial view of document meta data analysis showing URLs stored in the document
'deep-dive' forensics view of a linked object showing object's name and network path where it's stored
'deep-dive' forensics view of data from WordDocument stream showing Revision History, System Fonts and Users data
Example of Extracted Images preview
Example of findings view showing evidence of a linked ZIP file from the document
Example of extracted downloadable artifact (VBA macro script in this case)

Closing Note

If you have any feedback please do not hesitate to reach out. I believe, there is no better way to improve something, but to hear what people think about it.

No comments:

Post a Comment