A Tale of Two Systems: How We Integrated MISP with AssemblyLine

May 30, 2024

At Cosive, we work heavily with MISP via our CloudMISP managed service. It’s a great platform for helping companies know what threats are active in the community. 

One day, a request came across our desks: 

“We need to know more about the file artifact hashes we see in CloudMISP, ideally getting a copy of the sample too. However, we don’t want to store malicious files in MISP!”

MISP has a lot of strengths, but it’s not a malware analysis service in its own right. It does deal with file hashes day in and day out, though. While you can add malicious file samples to MISP, we advise against it: keeping samples out of MISP reduces the chance of an analyst mishandling one and helps maintain good network hygiene. So, we wanted to add another screwdriver to MISP’s toolbox.

Enter Assemblyline, a scalable file triage and malware analysis system integrating the cybersecurity community's best analysis tools. We built a managed service around it called MalwareZoo to take the complexity of maintaining Kubernetes deployments off our users' hands.

With both MISP and Assemblyline up and running, we decided these tools go together like peanut butter and jam, so we got to integrating them.

The Plan

We had our mission, but how were we going to execute it? 

We didn’t want to spin up both systems and have the user bounce between them. The two systems look and feel wildly different, so switching between them adds friction. Additionally, we wanted most analysts working only in MISP, with only malware analysts having direct access to Assemblyline. We wanted to provide a solution that “just worked” (easier said than done).

Our users primarily work in CloudMISP on a day-to-day basis (much like reading the morning newspaper, with all the details, minus the funny pages). We decided that the main method of control would be through MISP, with Assemblyline operating in the background.

What we didn’t want: malware on corporate networks

Since we were working with malicious files, the last thing we wanted was a stockpile of foot-guns waiting to happen in easy reach of our users. Thankfully, with our approach of having Assemblyline work behind the scenes, it meant we had a ready made, restricted file store that our users wouldn’t have access to without the right permissions.

What we did want: streamlined malware analysis operations

We needed a set of controls that CloudMISP users would be familiar with. So we decided to use MISP tags (sorry, no prize for creativity here).

Tags gave us an interface that could be used for control and, as a bonus, could provide feedback to analysts on the status of their analysis tasks. Feedback is important because Assemblyline is fundamentally an asynchronous file processor, meaning that we need to notify the user when an event or attribute has been processed. Tags also tell other users that the event or attribute has already been submitted to Assemblyline for processing, thus avoiding duplicate submissions. We also wanted to provide feedback on whether the file could be retrieved based on its hash (e.g. it’s not on VirusTotal at all), or whether something had gone wrong with analysis.
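To make the idea of tags as both trigger and status concrete, here is a minimal sketch of the bookkeeping involved. The tag names below are hypothetical (the real ones aren't listed in this post), and the two helper functions are illustrative rather than our production code.

```python
# Hypothetical tag names, chosen for illustration only.
REQUEST_TAG = "assemblyline:analyse-this"
STATUS_TAGS = {
    "submitted": "assemblyline:submitted",
    "complete": "assemblyline:complete",
    "not_found": "assemblyline:file-not-found",
    "error": "assemblyline:error",
}


def should_submit(tags):
    """Submit only if analysis was requested and no status tag exists yet."""
    has_status = any(tag in STATUS_TAGS.values() for tag in tags)
    return REQUEST_TAG in tags and not has_status


def with_status(tags, status):
    """Return a new tag list carrying exactly one status tag."""
    kept = [tag for tag in tags if tag not in STATUS_TAGS.values()]
    return kept + [STATUS_TAGS[status]]
```

Checking for an existing status tag before submitting is what stops a second analyst (or a second ZeroMQ message) from triggering a duplicate submission.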

Does anyone speak MISP?

Okay, we have our idea, we know what we do and don’t want to do, and we know what it’s going to look like. So, how is this actually going to work? 

The solution is going to be bidirectional. CloudMISP speaks to Assemblyline (AL for short), sending it file hashes. AL pulls the file from one of its configured sources (VirusTotal, MalwareBazaar etc.), if the file is available there, and chews on it for analysis. When AL is done with its meal, it sends the report back to CloudMISP for the user to read.

The first hurdle was getting CloudMISP and AL talking with each other. Specifically, how and when to talk to each other, and what the conversation was going to be. 

AL supports sending requests to API endpoints by means of post-process actions. When a job is done, tell someone. Great! This is what we need.

MISP has its Workflows feature, which sounds like a winner… BUT, sadly, at the time of writing, MISP does not support an “On Local Tag added to” workflow trigger. Our first big hurdle. So close, but we were back to the drawing board.

With MISP Workflows out of the equation for our design right now, we fell back to MISP’s inbuilt message queue service, ZeroMQ, which supplied us with what we needed. A ZeroMQ message is sent on attaching a local tag to an Attribute or Event, so we can take action on that. 
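As a rough sketch of what listening to MISP's ZeroMQ feed looks like: MISP publishes each message as a topic name, a space, and then a JSON body. The snippet below assumes the pyzmq library, a publisher on `localhost` at port 50000 (MISP's usual default, but check your own settings), and a `misp_json` topic prefix. Which topic and payload shape carry a tag attachment is not covered here and may vary by MISP version.

```python
import json


def parse_misp_zmq_message(raw):
    """Split a MISP ZeroMQ frame ("<topic> <json>") into (topic, payload)."""
    topic, _, body = raw.decode("utf-8").partition(" ")
    return topic, json.loads(body)


def listen(host="localhost", port=50000, topic_prefix="misp_json"):
    """Yield (topic, payload) tuples from MISP's ZeroMQ publisher.

    Host, port, and topic prefix are assumptions; adjust to your deployment.
    """
    import zmq  # pyzmq, imported lazily so the parser works without it

    socket = zmq.Context.instance().socket(zmq.SUB)
    socket.connect(f"tcp://{host}:{port}")
    socket.setsockopt_string(zmq.SUBSCRIBE, topic_prefix)
    while True:
        yield parse_misp_zmq_message(socket.recv())
```

A consumer would then loop over `listen()` and filter for messages that mention the trigger tag.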

With post-process actions for AL and ZeroMQ Messages for CloudMISP we have the When, but the How and What were still in the air. With our two systems not sharing any sort of common channels, we decided the best approach would be to build an integration point between the two. We chose Python as a common language used in both platforms. We decided to break the integration into two parts. It simplified the whole process as each app was responsible for one thing, and did one thing well!

Outbound from MISP

Outbound hooks into the ZeroMQ service embedded in MISP and waits for events and attributes to show up with our custom “Analyse This” tag attached to them. On finding one, it parses the event or attribute, validates the SHA256 file hashes it finds, and sends them to AL for processing.
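The parse-and-validate step can be sketched as follows. This is a minimal illustration, not our actual Outbound code: it assumes the event arrives as a standard MISP JSON dict, and it only looks at the `sha256` and `filename|sha256` attribute types, both at event level and inside objects. The function name is hypothetical.

```python
import re

# A SHA256 digest is exactly 64 hexadecimal characters.
SHA256_RE = re.compile(r"[0-9a-fA-F]{64}")


def extract_sha256(event):
    """Collect unique, valid SHA256 hashes from a MISP event dict."""
    body = event.get("Event", event)
    attributes = list(body.get("Attribute", []))
    for obj in body.get("Object", []):
        attributes.extend(obj.get("Attribute", []))

    hashes = []
    for attr in attributes:
        attr_type = attr.get("type", "")
        if attr_type == "sha256":
            candidate = attr.get("value", "")
        elif attr_type == "filename|sha256":
            # Composite values look like "name|hash"; keep the hash part.
            candidate = attr.get("value", "").rpartition("|")[2]
        else:
            continue
        candidate = candidate.strip().lower()
        if SHA256_RE.fullmatch(candidate) and candidate not in hashes:
            hashes.append(candidate)
    return hashes
```

Validating before submission means a malformed value in an upstream event is dropped early instead of becoming a failed job in AL.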

Assemblyline processing

So what does Assemblyline do? You can always RTFM (Read the fancy manual) but in short, Assemblyline processes an uploaded file using a number of configured analysis engines, which could include YARA rules, script deobfuscation, or maldoc analysis. In fact, we wrote a whole guide to Assemblyline services if you’re so inclined.

In our case, when we provide a SHA256 hash, AL goes and pulls the file from one of its configured Malware Repositories, primarily VirusTotal. If the hash exists on VirusTotal, it downloads the file (with the right type of subscription API key) and chews on it for a while. Finally, AL spits out a report on what was found (if anything).

Inbound to MISP

Being the opposite of Outbound, Inbound waits to receive a call from AL. When it does, Inbound tailors a report in a format that CloudMISP can ingest. This format is a new MISP event, with a report attribute containing the actual details about the file.

“But wait, Cosive Developer, why are you creating an event? Shouldn’t you just add the attribute to the Event that spawned the file hash?”

You’re right… almost. While the event is in your MISP instance, you might not “own” it since events are sourced from other organisations. The event should be modified by the original creator, not us as a recipient. The last thing we want to do is pollute their Event with data that belongs to us since we would now take ownership of that CTI package. This is where MISP’s Event Extensions feature comes into play. 

Event Extensions allow you to create a new Event and link it to other Events as an extension of the linked event. The new event is something that you own, can manage, distribute, and edit to your heart’s content, all without corrupting and polluting the original event.
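In MISP's event JSON, the link to the original event is expressed with the `extends_uuid` field. The sketch below shows roughly what an extending event could look like. The function name, the `info` text, and the choice of a `text` attribute in the `External analysis` category are all assumptions made for illustration, not the exact shape Inbound produces.

```python
def build_extension_event(original_uuid, sha256, report_text):
    """Build a MISP event body that extends, rather than edits, the original."""
    return {
        "Event": {
            "info": f"Assemblyline analysis of {sha256}",
            # extends_uuid links this new event to the event we do not own.
            "extends_uuid": original_uuid,
            "Attribute": [
                {
                    "type": "text",
                    "category": "External analysis",
                    "value": report_text,
                    "to_ids": False,
                }
            ],
        }
    }
```

Because the report lives in an event we created, we can edit or redistribute it freely while the original creator's event stays untouched.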

The End …?

Okay, there we have it, from idea to fruition: integration points between CloudMISP and Assemblyline.

The end result?

  1. File analysis from hashes using only MISP: most analysts can just work in MISP, letting us keep direct access to Assemblyline reserved for malware analysts, preserving network hygiene.
  2. Use MISP tags as file operation triggers: when we see a file hash of interest, we can just add a local MISP tag to an event or attribute to indicate we want to fetch it from VirusTotal and analyse it. We can also add these tags in an automated fashion when we want to analyse everything matching certain criteria, e.g. everything from a particular feed provider.
  3. Feedback loop back to MISP: We get feedback about the analysis status via tag update.
  4. Analysis results back to MISP in new events: We get Assemblyline analysis results using MISP Extended Events attached to the original event where we found the hash.

Sure, I’ve painted a rosy picture of software development, glossed over the pain points, and left out the caffeine-to-code consumption figures, but it gives you an idea of our thinking.

This little idea sparked some bigger thinking here at Cosive. We thought to ourselves, “We offer CloudMISP and we offer Assemblyline, so why not build a suite that we can continue to add value to?” And MalwareZoo was born.

MalwareZoo is our offering where we take the headache and overhead of installing, maintaining and hosting both tools out of your hands, leaving you free to just use the software.

So if you have a collection of interesting malware that needs a home, come have a talk with us.