Hackers Dropping Mid-Heist Selfies: LLM Identifies Information Stealer Infection Vector and Extracts IoCs
10-18, 13:00–13:50 (America/New_York), Track 1 (206b)
Language: Français

Information stealer malware has become one of the most prolific and damaging threats in today’s cybercrime landscape, siphoning off everything from browser-stored credentials to session tokens and other systems secrets. In 2024 alone, we witnessed more than 30 million stealer logs traded on underground markets. Yet buried within these logs is an underexplored goldmine: screenshots captured at the precise moment of infection. Think of it as a thief taking a selfie mid-heist, unexpected but convenient for us, right? Surprisingly, these crime scene snapshots have been largely overlooked until now.

Leveraging infostealer infection screenshots and Large Language Models (LLMs), we propose a new approach to identify infection vectors, extract indicators of compromise (IoCs) and track infostealer campaigns at scale. Our approach found several hundred potential IoCs in the form of URLs leading to the download of the malware-laden payload. By applying this method to ”fresh” stealer logs, we can detect and mitigate infection vectors almost instantaneously, reducing further infections. Our analysis uncovered distribution strategies, lure themes and social engineering techniques used by threat actors in successful infection campaigns. We will break down three distinct campaigns to illustrate the tactics they use to deliver malware and deceive victims: cracked versions of popular software, ads pointing to popular software and free AI image generators.

This presentation, with its live demonstration, shows how LLMs can be harnessed to extract IoCs at scale while addressing the challenges and costs of implementation. Attendees will walk away with a deeper understanding of the modern infostealer ecosystem and will want to apply LLM to other illicit artifacts to extract actionable intelligence.


Intro

Information stealer malware represents one of the most consequential yet underexamined threats in modern cybersecurity. This presentation dives deep into the architecture and implications of stealer logs - comprehensive digital dossiers that capture victims' entire online footprints. We'll demonstrate how these logs serve as complete digital histories, containing everything from financial relationships (banks, suppliers) and travel patterns (airline, hotels) to browser search histories and authentication secrets. This data powers sophisticated attack chains, enabling everything from targeted social engineering to ransomware deployment.

Stolen information includes all browsers’ credentials, session cookies, extensions (cryptowallets, password managers, authenticator apps, etc.) and personal data out of browsers (txt files, keypass databases, docs, PDFs, etc.), in addition to other sensitive information from the host. This data is then sold in illicit Telegram channels or used for further exploitation.

We then explain the Infostealer ecosystem, how it works, how logs are sold. We then pivot on why infostealers are such an important threat: the stolen data represents a person’s virtual identity. In 2024 alone, over 30 million stealer logs were posted on illicit forums. Such numbers make the monitoring and remediation of the threat very impractical.


Screenshots are a gift from threat actors themselves

In recent years, infostealers introduced a new screenshot-capturing feature. They take a screenshot of the device at the moment of infection. At first, we saw these screenshots as just funny anecdotes, like a glimpse into how absurd some infections are. However, we quickly noticed that over 25% of stealer logs contained a screenshot. This represents several millions of compromised devices.

As the saying goes, ”a picture is worth a thousand words”; these screenshots have the potential to deliver immediate insights that can reveal context and subtleties often missed or overlooked in textual logs. For example, the webpage visited by the victim when the infection occurred, the fake installer that contained the malware or even witnessing that the victim disable anti-virus software himself.


Technical Implementation of an Analysis Pipeline

We explain how to leverage an LLM (gpt-4o-mini) to describe screenshots and extract potential IoCs and infection vectors. We describe the prompt and logic behind the design. Three main screenshot types are guiding our prompt engineering: File System Screenshots, Web Content Screenshot and Hybrid Screenshots.

We also present the results of the assessment of the LLM’s performance. We show that it is able to identify the relevant information when it comes to infostealers infection vectors and IoCs, showing consistent performance on File Identification, General Description and Suspicious Elements Identification, but inconsistent performance on Browser Tabs Identification. The complete prompt design and assessment of the model can be found in the white paper included in this proposal.


Identifying IoCs at scale

We describe how we go from screenshot description to extracting URLs, and discriminating between benign URLs and potentially malicious URLs. We present the pipeline to check if the URLs are up and redirect to something else.

Out of 1007 screenshots, the LLM identified 337 unique URLs. Among these, 247 were flagged as potential infection vectors. Further analysis revealed that 208 of these URLs were either inactive or benign (returning errors like 404, 400, or having no downloadable content), while 39 remained active. The LLM also identified 239 unique files comprising:
1. 79 executable (.exe) files
2. 38 compressed archives (.zip)
3. 23 RAR archives (.rar)
4. 2 dynamic link libraries (.dll)


Common Distribution Strategies and Luring Themes

We describe recurring lures, malware distribution methods and social engineering techniques used in successful infection campaigns identified in our analysis.

5.1 Lure Themes

  1. Cracked Software
  2. Gaming Mods and Cheats

5.2 Distribution Strategies

  1. Youtube as a Distribution System
  2. Leveraging Google Ads to exploit users’ trust

Campaign Tracking

We explain how we identified campaigns based on the frequency of their IoCs and recurring themes. Three distinct and effective infection campaigns were identified and we pieced together the workflow of each campaign to reveal the social engineering tactics that led to successful infections.

6.1 Blitz Java

Backdoored Java Runtime Environments

6.2 Zero MidJourney

MidJourney was famously only available via a Discord bot. Its popularity created an opportunity for threat actors to distribute fake Midjourney executables that were well ranked on search engines.

6.3 Snow Microsoft 2022

A cracked version of Microsoft Office 2022, laced with malware, was provided in the description box of a single Youtube video. This campaign is a great example of how threat actors leverage Youtube.


Research Discussion

The approach’s biggest strength: While traditional detection methods must constantly adapt to code-level changes, our screenshot-based approach maintains effectiveness across diverse malware families. The screenshots embody both our greatest strength and our primary limitation: we depend on the existence and quality of said screenshots.

We discuss the costs of implementing our approach and potential with newer models capable of tooling like Mistral Large 2.


Conclusion

We will conclude this presentation with the key takeaways of this presentation. We highlight that infostealers are a real threat but threat actors offered us the artifacts to identify infection vectors on a silver platter. LLMs can be leveraged to identify potential IoCs at scale and track campaigns using those screenshots. Nonetheless, there is still a need for humans in the pipeline to address limitations despite LLM greatly facilitating the identification.


Are you releasing a tool? – yes

Olivier Bilodeau, a principal researcher at Flare, brings 15+ years of cutting-edge infosec expertise in honeypot operations, binary reverse-engineering, and RDP interception. Passionate communicator, Olivier spoke at conferences like AtlSecCon, BlackHat, DEFCON, SecTor, Derbycon, and more. Invested in his community, he co-organizes MontréHack, is NorthSec’s President, and runs its Hacker Jeopardy.