Automated Map Searches, Scam-Busting Tools and Twitter Search Translations: Here are the Results of Bellingcat's Second Hackathon
Last month, Bellingcat hosted its second-ever hackathon, with the event focussing on developing general tools for digital investigations. This differed slightly from our first hackathon, which encouraged participants to develop tools for network analysis.
We’re excited to present the final projects from Hackathon Two here with a brief explanation as to what they can do. We hope to see them used by open-source researchers in the future. In addition, Bellingcat is offering Tech or Tech Education Fellowships to the developers of the winning projects to further develop their tools.
One thing we particularly liked about this hackathon was the number of projects that sought to match tool development with researcher needs. This is an issue that Bellingcat identified previously, with many tools not being accessible or too technically complex for researchers to use effectively. We’re excited to see several developers thinking about ways of addressing this and considering ease of deployment.
We were also happy to see several developers from our previous hackathon return for this one. Some continued developing the tool they started last time, while others built completely new tools.
Here’s a look at what they came up with.
First Place: OSM Finder, developed by Grant Grubbs, is a tool that searches for locations in OpenStreetMap based on the distances and angles of annotated map features. It has the potential to be used by researchers to narrow down possible locations when geolocating a picture or video screen grab. For example, if an image shows a highway with two bridges across it at approximately 90 and 65 degrees which are separated by between 200 and 1,000 metres somewhere in Massachusetts, those specifications can be entered into the tool to search for results. Judges liked OSM Finder’s easy-to-use interface, its potential for partially automating geolocation workflows (a challenge Bellingcat has long been interested in), and its novel use of filtering by angle and distance between map features. The tool has the potential to offer much more than merely filtering by proximity, a feature that is already available using OpenStreetMap’s Overpass Turbo API.
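To make the idea of filtering by angle and distance concrete, here is a minimal Python sketch of the geometry behind the bridge example. It is not OSM Finder's code: the function names, the 10-degree tolerance and the choice to measure separation between the bridges' first points are all assumptions made for illustration, and each map feature is simplified to a single straight segment of two (latitude, longitude) points.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius, an approximation


def bearing(p, q):
    """Initial compass bearing in degrees from p to q, each a (lat, lon) pair."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    dlon = lon2 - lon1
    east = math.sin(dlon) * math.cos(lat2)
    north = (math.cos(lat1) * math.sin(lat2)
             - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(east, north)) % 360


def crossing_angle(seg_a, seg_b):
    """Angle between two undirected segments, folded into 0-90 degrees."""
    diff = abs(bearing(*seg_a) - bearing(*seg_b)) % 180
    return min(diff, 180 - diff)


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def pair_matches(highway, bridge_a, bridge_b,
                 angles=(90, 65), dist_range=(200, 1000), tolerance=10):
    """Hypothetical filter: do two bridges cross the highway at the wanted
    angles and sit the wanted distance apart (first point to first point)?"""
    angle_ok = (abs(crossing_angle(highway, bridge_a) - angles[0]) <= tolerance
                and abs(crossing_angle(highway, bridge_b) - angles[1]) <= tolerance)
    separation = haversine_m(bridge_a[0], bridge_b[0])
    return angle_ok and dist_range[0] <= separation <= dist_range[1]
```

In practice the candidate highways and bridges would come from OpenStreetMap data, and a filter like this would be run over every candidate pair in the search area.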
Second Place: Action Transcription, developed by Simon Willison (the creator of Datasette), is a framework for using the GitHub web interface to allow users to easily and automatically run software commands. The example displayed for the hackathon was a command that transcribed a specific YouTube video using OpenAI’s Whisper model. Judges liked its novel combination of existing capabilities (GitHub Issues and GitHub Actions) and lack of need for installing any software, as well as its potential to allow developers to easily create and modify interfaces for similar workflows.
Third Place: 451 Corporate Risk Miner, developed by Elena Dulskyte, Marko Sahan, and Peter Zatka-Haas from ComplyAdvantage, is a dashboard for visualising corporate ownership structures from UK Companies House data and analysing several risk signatures that could suggest financial crime. Judges liked its flexibility for specifying the weights of multiple risk signatures and its potential for reducing the time burden for financial investigations. This tool could be used by a researcher who wants to find corporate networks related to a specific individual and easily see if any entities in the network are on sanction lists, or if the network has a cyclical ownership structure that could indicate suspicious behaviour.
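As an illustration of what checking for a cyclical ownership structure can involve, the sketch below runs a depth-first search over a small ownership graph and returns the first loop it finds. It is a generic cycle-detection routine, not the team's implementation; the `owns` mapping, the function name and the company names are hypothetical.

```python
def find_ownership_cycle(owns):
    """Return one ownership loop as a list of entities, or None.

    `owns` maps each entity to the entities it holds shares in
    (a hypothetical input format chosen for this sketch).
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    path = []  # entities on the current search path

    def visit(entity):
        state[entity] = IN_PROGRESS
        path.append(entity)
        for owned in owns.get(entity, ()):
            status = state.get(owned, UNSEEN)
            if status == IN_PROGRESS:
                # We came back to an entity already on the path: a loop.
                return path[path.index(owned):] + [owned]
            if status == UNSEEN:
                cycle = visit(owned)
                if cycle:
                    return cycle
        path.pop()
        state[entity] = DONE
        return None

    for entity in list(owns):
        if state.get(entity, UNSEEN) == UNSEEN:
            cycle = visit(entity)
            if cycle:
                return cycle
    return None


# Made-up example: A owns B, B owns C, and C owns A again.
network = {
    "A Ltd": ["B Ltd"],
    "B Ltd": ["C Ltd"],
    "C Ltd": ["A Ltd"],
    "D Ltd": ["A Ltd"],
}
loop = find_ownership_cycle(network)
```

The recursive search is fine for small networks; a very deep ownership chain would call for an iterative version to stay within Python's recursion limit.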
Most Impactful: Chips, developed by Ishaan Jhaveri and Yoni Nachmany from The New York Times, is a tool for downloading a set of cropped satellite images from a file containing geospatial coordinates. Judges liked that, though the tool idea was simple, the solution was well implemented and filled a well-defined researcher need (as well as its potential to save investigators lots of time). This tool could be used by a researcher who has a KML, shapefile, or GeoJSON file containing the coordinates of places of interest, for example suspected military bases in a country, and wants to easily get satellite images of those places without significant manual effort. Another relevant use case is to quickly get “chips” of curated data for training satellite image recognition machine learning models. The tool can be viewed on this website.
Identifying ‘Instascam’ Accounts, Twitter Search Translation and Tracking Illegal Sand-Dredging Boats
Other projects to come out of the hackathon are listed below, alongside links so that anyone interested can try the tools out for themselves. These tools are listed in alphabetical order.
Arabella, developed by John Rodley, is a tool for matching open-source researcher needs with open-source software developers. This tool could be used to put a software developer who is interested in open-source research in contact with an open-source investigator who is having trouble using, or has found a bug in, an open-source software tool.
BellingcatCat, developed by James Fleming and Eric Brichetto, is a platform for populating graph databases using restagraph, which generates an HTTP API for a neo4j database. This tool could be used by a researcher who wants to organise information about an investigation in a graph database without having to install or understand neo4j.
Blattodea, developed by Timo Damm, Jakob Hauser, and Ioana Preoteasa, is a tool that provides a graphical user interface to scrape post information from a specified Twitter account. It also generates interactive network visualisations and statistical analyses of the accounts that have interacted with that account. This could be used if a researcher is interested in a particular Twitter account and wants to find other accounts that are associated with them.
Blurfaces, developed by Ravi Kant Sharma, is a tool for automatically blurring people’s faces in videos. This tool could be used by a user who has recorded video, for example of government repression at a protest, and wants to blur out the faces of those there in order to protect them before sharing the video.
CommentSearch, developed by hackathon participant Richard, is a Google Chrome extension that searches YouTube and Facebook comments on a video or post. This tool could be used by a researcher who wants to find all YouTube comments for a specific video that contain a particular keyword.
Commonality, developed by Fraser Crichton, is a tool, built on Telepathy (an existing tool for analysing Telegram channels), that finds links to websites commonly shared by two Telegram channels. This tool could be used by a researcher who wants to understand the overlap between Telegram channels and see what websites the two channels link to.
Dezoofier, developed by Alex Trefilov, is a search interface for Bellingcat’s Online Investigation Toolkit that allows researchers to more easily find the specific tools they’re looking for based on the information they already know and the further information they want to get. This tool could be used by a researcher who, for example, has the username and email address of a person of interest, and wants to find their name, but doesn’t know what tools to use to get that information.
Doppelgänger Finder, developed by Shivansh Sethi and Rohit, is a dashboard for analysing the behaviour of one or more Twitter accounts and performing several analyses, including follower network mapping, sentiment analysis and word cloud generation. This tool could be used by a researcher who wants to determine how similar several Twitter accounts are to each other.
Instascams, developed by Corky Hogan, is a tool for detecting scam Instagram accounts that sell fraudulent Venezuelan passports and immigration documents. This tool could be used by a researcher who wants to automatically determine whether or not a given Instagram account exhibits behaviour associated with fraudulent activity.
LCREYE, developed by Vincent Castro, is a tool for automatically detecting objects and faces in video. This tool could be used by a researcher who has a video containing some features of interest (for example a person or a tank) and wants to find the parts of the video that contain those features, without having to watch the entire video themselves.
Lucid Contributions, developed by Thane Shubaly, is a tool for easily visualising networks in political campaign contribution data, with a focus on Canadian federal elections. This tool could be used by a researcher who is interested in a particular political context where donations data is available and wants to see politicians who received contributions from similar donors.
NERDoc, developed by Ryan Willett, is a tool for analysing document contents by performing named-entity recognition searches and generating network visualisations of detected entities in documents. This tool could be used by a researcher who has a large number of documents, for example an email database, and wants to know the most important people and terms in the documents without having to read them all.
OpenSauce, developed by Tom Dwyer, is a tool for anonymous, decentralised file sharing, with additional capabilities for file verification. This tool could be used by a user who wants to share a video, for example containing evidence of a war crime, anonymously while retaining some metadata fields that would allow investigators to verify the video’s authenticity.
OSINT Community, developed by hackathon participant Pavelas, is a website that allows open-source researchers and open-source developers to communicate and collaborate on tool development while sharing and working on common challenges. This tool could be used by an open-source developer who wants to know what tools to develop, or by an open-source researcher who has a tool idea and wants to find a developer interested in implementing it. It could also be used by individuals who want to learn about tools from a community-built source.
Quintessence, developed by Morgan Hervé-Mignucci and Sean Greaves, is a tool for analysing data about corporations from UK Companies House and scoring them based on several risk signatures that could suggest suspicious behaviour or financial crime. This tool could be used by a researcher who wants to see if a particular company exhibits suspicious behaviour.
Sand Dredger, developed by Guy Phillips, Steffen Merten and Robert Hochstedler from bolster.nz, is a tool for monitoring and tracking boats that may be involved in sand dredging operations. This tool could be used by a researcher who is interested, for example, in monitoring sand dredging near the Taiwan Strait or elsewhere.
Search Optimizer, developed by Peter Kompasz, is a Google Chrome extension that allows users to easily construct advanced Google search queries that employ search operators through a simple user interface. This tool could be used by a researcher who wants to perform a Google search using specific search operators (filtering results by date, file type, site, or title content) without having to know the format or specific command of those search operators.
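For readers unfamiliar with search operators, the following Python sketch shows the kind of query string such an interface might assemble from form fields. It is not the extension's code: the function names and parameters are hypothetical, and the example only covers Google's `site:`, `filetype:`, `intitle:`, `after:` and `before:` operators.

```python
from urllib.parse import urlencode


def build_query(terms, site=None, filetype=None, intitle=None,
                after=None, before=None):
    """Combine plain search terms with optional Google search operators.

    Dates are expected as YYYY-MM-DD strings. All parameter names are
    assumptions made for this sketch.
    """
    parts = [terms]
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if intitle:
        parts.append(f'intitle:"{intitle}"')
    if after:
        parts.append(f"after:{after}")
    if before:
        parts.append(f"before:{before}")
    return " ".join(parts)


def search_url(query):
    """Turn a query string into a URL-encoded Google search address."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Example: PDF files on one (placeholder) site published since the start of 2022.
query = build_query("sand dredging", site="example.org",
                    filetype="pdf", after="2022-01-01")
url = search_url(query)
```

The researcher only fills in the fields they care about; anything left empty is simply omitted from the query.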
Twitter-Translate, developed by hackathon participant Lemming, is a Google Chrome extension that allows users to search for a specific keyword on Twitter in multiple languages simultaneously. This tool could be used by a researcher who has a specific word (for example the name of the Ukrainian city of Kherson) and wants to easily find tweets containing the Ukrainian (Херсон) and Polish (Chersoń) language translations of that word.