The problem: a photo without a location
You have a photo. Maybe it was sent to you by an anonymous source, collected from a social media post, or captured by a security camera. The photo shows a street, some buildings, maybe a traffic sign or a distinctive shopfront. The question is simple: where was it taken?
This is the core problem that street-level geolocation solves. Unlike satellite image analysis, which looks at the earth from above, street-level geolocation works with the kind of imagery that people actually produce — photos shot at eye height, showing the world as we experience it.
How street-level geolocation works
Street-level geolocation is the process of determining the geographic coordinates of a photo from its visual content alone. No metadata, no GPS tags: just the pixels in the image.
Traditional approaches relied on manual analysis. A skilled investigator might recognise a licence plate format that narrows down the country, a street sign in a particular language, or architectural styles characteristic of a specific region. The research collective Bellingcat popularised these techniques and demonstrated how careful observers could pinpoint locations from clues in photos and videos.
Modern AI-driven geolocation takes this further. Instead of relying on a single recognisable element, the machine learning model analyses the entire visual scene. It learns to recognise patterns that humans might miss: the specific curve of a road, the type of vegetation, the style of electricity poles, the colour of road markings. These models are trained on millions of geotagged street-level images and build an internal picture of what different places on earth look like.
The process typically works in stages:
- Feature extraction — An AI model converts the photo into a numerical representation (called an embedding) that captures its visual characteristics.
- Candidate retrieval — This embedding is compared against a database of millions of reference images with known locations, finding the most visually similar matches.
- Geometric verification — The top candidates are verified by matching specific local features (such as corners, edges and textures) between the search photo and the reference images.
- Location estimation — The verified matches yield GPS coordinates, providing an accurate location estimate.
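The retrieval and estimation stages above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any production system: the three-dimensional embeddings, the reference coordinates and the function names are all invented for the example, and the geometric verification stage is skipped entirely. Real embeddings have hundreds or thousands of dimensions and come from a trained model.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_candidates(query_embedding, references, k=2):
    """Return the k most similar references as (score, lat, lon) tuples."""
    scored = [
        (cosine_similarity(query_embedding, emb), lat, lon)
        for emb, lat, lon in references
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]

def estimate_location(candidates):
    """Similarity-weighted mean of the candidate coordinates.

    Averaging latitude and longitude directly is only reasonable when
    the candidates lie close together, which is assumed here.
    """
    weights = [max(score, 0.0) for score, _, _ in candidates]
    total = sum(weights)
    if total == 0:
        raise ValueError("no candidate with positive similarity")
    lat = sum(w * c[1] for w, c in zip(weights, candidates)) / total
    lon = sum(w * c[2] for w, c in zip(weights, candidates)) / total
    return lat, lon

# Hypothetical reference database: (embedding, latitude, longitude).
REFERENCES = [
    ([0.9, 0.1, 0.0], 52.3730, 4.8925),
    ([0.8, 0.2, 0.1], 52.3735, 4.8930),
    ([0.0, 0.1, 0.9], 51.9225, 4.4792),
]

# Hypothetical embedding of the photo being searched for.
QUERY = [0.85, 0.15, 0.05]

top = retrieve_candidates(QUERY, REFERENCES, k=2)
estimate = estimate_location(top)
print(estimate)
```

With millions of reference images, comparing the query against every entry one by one becomes too slow, so a real system would typically use an approximate nearest-neighbour index instead of the linear scan shown here.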
How it differs from GPS and EXIF data
It is worth clarifying what street-level geolocation is not.
GPS tagging happens when a device with a GPS receiver (such as a smartphone) records its coordinates at the moment a photo is taken. This data is stored in the EXIF metadata of the image file. If you take a photo with your phone’s location services enabled, the latitude and longitude are embedded directly in the file.
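EXIF stores latitude and longitude as degrees, minutes and seconds, with a separate reference letter for the hemisphere (N, S, E or W). The sketch below shows the conversion to decimal degrees. It assumes the values have already been read from the file; the function name is chosen for this example, and actually extracting the tags would require an image library, which is not shown.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF-style degrees/minutes/seconds to signed decimal degrees.

    ref is the hemisphere letter: "N" or "S" for latitude,
    "E" or "W" for longitude. South and west become negative.
    """
    if ref not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere reference: {ref!r}")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value

# Example values for a point in central Amsterdam (illustrative only).
latitude = dms_to_decimal(52, 22, 12.0, "N")
longitude = dms_to_decimal(4, 53, 42.0, "E")
print(latitude, longitude)
```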
The problem is that GPS/EXIF data is often missing or unreliable:
- Many cameras (especially DSLRs and security cameras) have no GPS hardware at all.
- Users frequently disable location services for privacy reasons.
- Most major social media platforms strip EXIF data when images are uploaded.
- Screenshots and re-saved images usually lose the original metadata.
- EXIF data can be trivially forged or edited.
Street-level geolocation works without any metadata. It analyses what is in the image, not what is attached to the file. This makes it indispensable when metadata is unavailable, stripped, or untrustworthy.
Photo geolocation also differs from IP-based geolocation, which estimates a user’s location based on their internet connection. IP geolocation tells you roughly where someone is browsing from; photo geolocation tells you where a specific image was captured. They solve different problems.
Real-world applications
Street-level geolocation has become an essential capability across multiple fields:
Journalism and fact-checking. When a video surfaces claiming to show an event at a particular location, journalists need to independently verify that claim. Photo geolocation can confirm whether footage genuinely shows what it claims to show, or whether it has been misattributed.
Open-source intelligence (OSINT). Researchers use geolocation to track movements, verify claims and build evidence chains. In conflict zones, geolocating photos and videos from social media can document events that would otherwise go unrecorded.
Insurance and claims verification. Insurers need to confirm that damage photos were actually taken at the claimed location. A photo of a flooded basement means something very different if it was taken at the policyholder’s address versus a location 200 kilometres away.
Law enforcement. From locating missing persons to tracking criminal activity, the ability to determine where a photo was taken gives investigators crucial leads.
Real estate and urban planning. Understanding spatial context from imagery helps professionals assess properties, plan developments and monitor changes in the built environment.
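In the insurance example above, the check comes down to a distance comparison: how far is the estimated photo location from the claimed address? A minimal sketch using the haversine formula is shown below. The function names, the coordinates and the 1 km tolerance are assumptions made for illustration; a sensible tolerance depends on how accurate the location estimate is.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def matches_claim(claimed, estimated, tolerance_km=1.0):
    """True if the estimated location lies within tolerance of the claim."""
    return haversine_km(*claimed, *estimated) <= tolerance_km

# Illustrative coordinates: a claimed address in Amsterdam and an
# estimated photo location in Rotterdam.
claimed_address = (52.3730, 4.8925)
estimated_location = (51.9225, 4.4792)

print(haversine_km(*claimed_address, *estimated_location))
print(matches_claim(claimed_address, estimated_location))
```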
The accuracy question
How precise can AI-driven geolocation be? The answer depends heavily on the scene. In dense urban environments with distinctive architecture and signage, modern systems can pinpoint locations to within a few metres. In rural or visually repetitive landscapes (think endless farmland or generic suburbs), accuracy naturally decreases.
The key point is that geolocation is not a binary success-or-failure task. Even a result that narrows a photo’s origin down to a specific city or neighbourhood can be enormously valuable for an investigator who previously had no location information at all.
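One common way to express this non-binary view is to report the share of predictions that fall within several distance thresholds rather than a single pass/fail number. The sketch below assumes the error distances (in kilometres) have already been computed; the error values and thresholds are made up for the example and are not measurements from any system.

```python
def accuracy_at_thresholds(errors_km, thresholds_km):
    """Fraction of predictions whose error is within each threshold."""
    if not errors_km:
        raise ValueError("at least one error distance is required")
    return {
        threshold: sum(e <= threshold for e in errors_km) / len(errors_km)
        for threshold in thresholds_km
    }

# Hypothetical error distances for four photos, in kilometres.
errors = [0.004, 0.8, 12.0, 150.0]

# Illustrative thresholds: roughly street, neighbourhood, city, region.
thresholds = [0.01, 1.0, 25.0, 200.0]

report = accuracy_at_thresholds(errors, thresholds)
for threshold, share in report.items():
    print(f"within {threshold} km: {share:.0%}")
```

A report like this makes the trade-off visible: a system may place only some photos to the exact street, yet still place most of them in the right city.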
Why this matters now
The volume of images without location tags keeps growing: security camera footage, social media posts with metadata stripped, images shared through messaging apps, archival photographs. The demand for tools that can extract location information from visual content alone has never been greater.
At GeoPin, we have built a system that handles this at scale, combining state-of-the-art AI models with a reference database covering the Netherlands at street level. In future articles, we will dive deeper into the specific technologies that make this possible.