BlackSheepWall: Contextually Expanded Intelligence Collection
10-17, 15:00–15:50 (America/New_York), Track 1 (206b)
Language: English

The online information space is highly dynamic: the content served depends on the access origin, and sources may impose access restrictions or modify content. Current intelligence tools are limited to representing acquired information from a single vantage point. This talk presents a newly developed and publicly released framework for information collection and enrichment that takes a multi-vantage-point approach. The framework supports identifying targeted phishing attacks, profiling threat actors, assessing content modification and misinformation, and detecting access restrictions and content availability. It is aimed at anyone working on information space awareness, including CSIRT teams and intelligence units.


The tip of the iceberg:
Current cyber threat intelligence and information space awareness tools present information to the end user from only a single collection perspective. Either the collection is performed from a single vantage point or, when it is performed from several positions on the internet, the content from just one of them is selected. The current information space (the content served over clear-net, deep-net, and dark-net services) is multi-faceted and highly dynamic, and may serve different content or impose restrictions based on the access origin. Legitimate services use this to perform geographical load balancing, serve regionally tailored content, or restrict access. Similarly, cybercriminals may modify served content to target a particular group of victims (e.g., phishing sites), threat actors conduct targeted attacks (e.g., drive-by exploitation), and nation states run content modification and misinformation campaigns (e.g., targeted messaging delivery).
Approaching the current information space from a single or limited vantage point makes it impossible to observe all the ways in which an information source serves its content. The approach developed here reaches out to an information source from a large set of globally distributed vantage points, employing mechanisms that modify how the content is accessed, in order to collect and analyse as many representations of the content as possible. This enriches existing visibility and broadens the understanding of the conditions under which the information source returns modified content.
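
As a rough illustration of what reaching out from many vantage points with modified access could mean in practice, the sketch below enumerates one collection task per combination of origin, access method, and HTTP request parameters. It is not taken from the released framework: the labels, the parameter choices, and the `build_plan` helper are all hypothetical.

```python
from itertools import product

# All values below are hypothetical placeholders, not the framework's real configuration.
VANTAGE_POINTS = ["eu-west", "us-east", "ap-south"]
ACCESS_METHODS = ["direct", "vpn", "tor"]
USER_AGENTS = ["desktop-browser", "mobile-browser"]
ACCEPT_LANGUAGES = ["en-US", "de-DE"]


def build_plan(url: str) -> list[dict[str, str]]:
    """Return one collection task per combination of origin and request parameters."""
    return [
        {
            "url": url,
            "vantage": vantage,
            "access": access,
            "user_agent": user_agent,
            "accept_language": language,
        }
        for vantage, access, user_agent, language in product(
            VANTAGE_POINTS, ACCESS_METHODS, USER_AGENTS, ACCEPT_LANGUAGES
        )
    ]


plan = build_plan("https://example.com/")
print(len(plan), "collection tasks for one URL")
```

Even this small grid shows how quickly the number of observations per URL grows, which is why the collected instances need to be aggregated and compared automatically.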

See the unseen:
The developed approach deploys specialised Docker containers across cloud infrastructure and VPS service provider networks in a globally distributed way. These containers collect data from the same resource using various additional techniques, such as proxy and VPN services, connection brokers, Tor exit nodes, and modification of HTTP request parameters. Once collected, all content instances are aggregated in a central repository, forming a single snapshot for that particular URL. Embeddings are extracted from the resource content (e.g., the HTML document) and its graphical representation (e.g., a website screenshot), and clustering is then performed to identify the collector parameters and connection origins for which the information source provides different content. The differences are presented as cluster representations, visual diffs, and source document modifications.
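
The grouping step can be sketched in a few lines. The example below is a simplified stand-in, not the framework's method: instead of learned embeddings it compares character shingles with Jaccard similarity, and it uses a greedy single-pass clustering. The `Capture` type, its fields, and the 0.8 threshold are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    """One content instance collected for a URL (hypothetical structure)."""
    vantage: str     # label of the connection origin
    user_agent: str  # HTTP request parameter varied by the collector
    body: str        # fetched document text


def shingles(text: str, k: int = 5) -> set[str]:
    """Return the set of k-character shingles of whitespace-normalised text."""
    norm = " ".join(text.split()).lower()
    if len(norm) <= k:
        return {norm}
    return {norm[i:i + k] for i in range(len(norm) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two shingle sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(captures: list[Capture], threshold: float = 0.8) -> list[list[Capture]]:
    """Greedy clustering: a capture joins the first cluster whose first member is similar enough."""
    clusters: list[tuple[set[str], list[Capture]]] = []
    for cap in captures:
        signature = shingles(cap.body)
        for representative, members in clusters:
            if jaccard(signature, representative) >= threshold:
                members.append(cap)
                break
        else:
            clusters.append((signature, [cap]))
    return [members for _, members in clusters]


snapshot = [
    Capture("eu-west", "desktop", "<html><body>Welcome to our shop</body></html>"),
    Capture("us-east", "desktop", "<html><body>Welcome  to our shop</body></html>"),
    Capture("ap-south", "mobile", "<html><body>Verify your bank account now</body></html>"),
]
groups = cluster(snapshot)
for group in groups:
    print([(c.vantage, c.user_agent) for c in group])
```

Any cluster beyond the first indicates that some combination of origin and request parameters received different content, which is the signal an analyst would then inspect with a diff.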
This approach is used to enrich information acquired through existing intelligence sources and may reveal cybercrime operations, such as regionally targeted phishing websites (i.e., malicious content served only to users in a particular region or with specific connection parameters), content availability and restrictions (i.e., geo-fencing and blocking access to resources), and content modification (i.e., targeted content delivery for profiled audiences and regions).
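
For the access-restriction case, one simple heuristic is to compare the HTTP status observed from each vantage point against the most common status in the snapshot. The sketch below assumes such per-vantage status codes are available. The `flag_restricted` helper and the sample data are hypothetical and are not part of the released tool.

```python
from collections import Counter


def flag_restricted(status_by_vantage: dict[str, int]) -> dict[str, int]:
    """Return the vantage points whose HTTP status differs from the most common one."""
    if not status_by_vantage:
        return {}
    baseline, _count = Counter(status_by_vantage.values()).most_common(1)[0]
    return {v: s for v, s in status_by_vantage.items() if s != baseline}


# Hypothetical observations for a single URL.
observed = {"eu-west": 200, "us-east": 200, "ap-south": 403, "sa-east": 200}
print(flag_restricted(observed))
```

A majority baseline is only a starting point: a resource blocked from most origins would invert the result, so the flagged set is better read as "differs from the majority" than as proof of geo-fencing.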

Enter the matrix:
This talk focuses on practical details and use-case analysis, covering at least:
1. Design and implementation approach for the framework;
2. Overview of the dynamic nature of the most popular benign websites;
3. Analysis of cybercrime campaigns;
4. Analysis of threat actor operations.


Are you releasing a tool? – yes