Operating a geolocation service in the European Union means working under the GDPR, the General Data Protection Regulation, which is widely treated as a benchmark for data privacy law. For GeoPin, GDPR compliance is not a checklist bolted onto an existing system. It is a design principle that has shaped our architecture from the beginning. We process images to determine locations, and we do so without retaining the data that makes privacy violations possible in the first place.

The privacy challenge of image processing

Images contain rich personal data. A street photo may include faces, licence plates, building addresses or other information that can identify individuals. Under the GDPR, processing images that contain personal data requires a lawful basis, and storing such data carries obligations around access rights, rectification, erasure and data portability.

We chose a different path. Rather than managing the complexity of storing personal data responsibly, we designed a system that does not store it at all.

Architecture of data minimisation

The GDPR’s data minimisation principle (Article 5(1)(c)) states that personal data must be “adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed.” For GeoPin, the purpose is geolocation, that is, determining where a photo was taken. Here is what is necessary for that purpose and what is not:

Necessary: Temporarily processing the image to extract a visual embedding (a 512-dimensional numerical vector). Comparing that vector against our reference index. Returning coordinates to the user.

Not necessary: Storing the original image. Storing the extracted embedding. Logging what the image contained. Retaining any link between a user’s query and the image they submitted.

Our processing pipeline reflects this distinction:

1. The user uploads an image to our Cloudflare Worker endpoint.
2. The Worker converts the image to base64 and sends it to our GPU backend on RunPod for embedding extraction.
3. The GPU backend processes the image in memory, extracts the CosPlace embedding and returns the vector. The image data is not written to disk or any persistent storage on the GPU instance.
4. The Worker uses the embedding to query Cloudflare Vectorize for nearest-neighbour matches.
5. The top candidates undergo geometric verification, again processed in memory.
6. The final result (coordinates and confidence scores) is returned to the user.

The uploaded image, the extracted embedding and the intermediate candidate data exist only in working memory during request processing. Once the response is sent, they are eligible for garbage collection.
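
The steps above can be sketched as a single function whose inputs and intermediate values never leave local scope. This is a minimal illustration, not GeoPin's actual code: the names locate, Backends, embed and nearest are hypothetical, the two backends are injected as plain functions rather than real RunPod or Vectorize calls, base64 encoding and HTTP transport are omitted, and geometric verification is reduced to picking the best-scoring candidate.

```typescript
// Hypothetical types and names, for illustration only.
type Embedding = number[];

interface Candidate {
  lat: number;
  lon: number;
  score: number; // similarity score reported by the vector index
}

interface GeoResult {
  lat: number;
  lon: number;
  confidence: number;
}

// The backends are injected so the sketch stays self-contained.
interface Backends {
  // Stands in for the GPU embedding service.
  embed(image: Uint8Array): Promise<Embedding>;
  // Stands in for the nearest-neighbour query against the reference index.
  nearest(query: Embedding, topK: number): Promise<Candidate[]>;
}

async function locate(
  image: Uint8Array,
  backends: Backends
): Promise<GeoResult | null> {
  // The image and the embedding exist only as local variables of this call.
  const embedding = await backends.embed(image);
  const candidates = await backends.nearest(embedding, 5);
  if (candidates.length === 0) return null;

  // Placeholder for geometric verification: keep the best-scoring candidate.
  const best = candidates.reduce((a, b) => (b.score > a.score ? b : a));

  // Only coordinates and a confidence value are returned to the caller.
  return { lat: best.lat, lon: best.lon, confidence: best.score };
}
```

Nothing in this function writes to a database, a bucket or a log, so once it returns, the only thing that outlives the request is the small result object handed back to the caller.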

There is no images table in our database. There is no R2 bucket for uploaded photos. There is no search history recording what users searched for. These storage layers do not exist because we never built them.

What we do store

Transparency requires that we clearly state what data we do retain:

Usage records. We track API key identifiers (or IP addresses for trial users) alongside timestamps and billing periods for rate limiting. Usage records contain no information about the content of queries — they record that a query was made, not what was queried.

API keys and subscription details. Paying users have an API key with standard account information (email address, plan type, subscription status).

Reference image metadata. Our index contains metadata at the reference level: source platform, image ID, coordinates, capture date and compass heading. This is not user data — it is metadata about openly licensed imagery.

That is the complete list. We cannot look up what a user searched for because that information is never recorded.
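
To make the usage-record description concrete, here is a minimal sketch of a record that carries no query content and a quota check built on it. The field names, the in-memory array and the assumption that a billing period is a UTC calendar month are all hypothetical choices made for illustration.

```typescript
// Hypothetical record shape, for illustration only.
interface UsageRecord {
  subject: string;       // API key identifier, or IP address for trial users
  timestamp: number;     // Unix time in milliseconds
  billingPeriod: string; // assumed here to be a UTC calendar month, "YYYY-MM"
}

// Derive the billing period from a timestamp (assumption: calendar month, UTC).
function billingPeriodOf(timestampMs: number): string {
  return new Date(timestampMs).toISOString().slice(0, 7);
}

// Append a record and return true if the subject is still under its quota
// for the current period; otherwise return false and record nothing.
function recordQuery(
  log: UsageRecord[],
  subject: string,
  nowMs: number,
  quota: number
): boolean {
  const period = billingPeriodOf(nowMs);
  const used = log.filter(
    (r) => r.subject === subject && r.billingPeriod === period
  ).length;
  if (used >= quota) return false;
  log.push({ subject, timestamp: nowMs, billingPeriod: period });
  return true;
}
```

The record type has no field for an image, an embedding or a result, so there is nowhere for query content to be written even by accident.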

Data subject rights

The GDPR grants data subjects specific rights. Here is how our architecture relates to each one:

Right of access (Article 15). We hold only account information and usage records. We cannot provide search history because we do not have it.

Right to erasure (Article 17). We can delete account and billing data. There are no uploaded images to delete because they were never stored.

Right to data portability (Article 20). We can export account information and usage statistics. There is no image data to port.

Right to restriction of processing (Article 18). Deactivating an API key immediately prevents any further processing.
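
Because so little is stored, the four rights above reduce to a handful of operations on the account record. The sketch below assumes a hypothetical in-memory account store keyed by API key identifier; the Account fields and the function names are illustrative and not GeoPin's actual schema.

```typescript
// Hypothetical account store, for illustration only.
interface Account {
  email: string;
  plan: string;
  status: "active" | "cancelled";
  keyActive: boolean;
  usageCount: number;
}

type AccountStore = Map<string, Account>;

// Articles 15 and 20: export everything held about a key as JSON.
function exportData(store: AccountStore, keyId: string): string | null {
  const account = store.get(keyId);
  return account ? JSON.stringify({ keyId, ...account }) : null;
}

// Article 18: deactivate the key so no further requests are processed.
function restrictProcessing(store: AccountStore, keyId: string): boolean {
  const account = store.get(keyId);
  if (!account) return false;
  account.keyActive = false;
  return true;
}

// Gate checked before any request is processed.
function mayProcess(store: AccountStore, keyId: string): boolean {
  const account = store.get(keyId);
  return account !== undefined && account.keyActive;
}

// Article 17: delete the account record. There is no image data to remove.
function erase(store: AccountStore, keyId: string): boolean {
  return store.delete(keyId);
}
```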

The reference data question

The reference images themselves are not stored by GeoPin. We store only metadata (such as coordinates and source IDs) and CosPlace embeddings: 512-dimensional floating-point vectors that encode spatial and architectural features, not biometric or personally identifiable information. The original images reside on their respective platforms (Mapillary, KartaView, etc.), which have their own privacy processes including face and licence plate blurring. A 512-value embedding is a heavily lossy summary of a scene and does not contain enough information to reproduce the original image.
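
As a rough illustration of what a reference entry holds and how a query embedding is matched against it, consider the sketch below. The record fields are hypothetical, cosine similarity is an assumed metric (the article does not state which one the index uses), and the brute-force scan stands in for a real vector database query.

```typescript
// Hypothetical record shape; field names are assumptions.
interface ReferenceRecord {
  source: string;      // e.g. "mapillary"
  imageId: string;     // identifier on the source platform
  lat: number;
  lon: number;
  embedding: number[]; // 512 values in the article; any fixed length works here
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Brute-force nearest neighbour; a vector index would replace this scan.
function nearestReference(
  query: number[],
  index: ReferenceRecord[]
): ReferenceRecord | null {
  let best: ReferenceRecord | null = null;
  let bestScore = -Infinity;
  for (const rec of index) {
    const score = cosineSimilarity(query, rec.embedding);
    if (score > bestScore) {
      bestScore = score;
      best = rec;
    }
  }
  return best;
}
```

The record contains a pointer to the image on its source platform and a vector of numbers, but no pixels, which is the property the paragraph above relies on.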

International data transfers

Our infrastructure runs on Cloudflare’s global network with GPU processing on RunPod, both under appropriate data processing agreements. Since images are processed in transit and never persisted, the jurisdictional complexity that affects many cloud-based image processing services is significantly reduced.

Controller versus processor responsibilities

When a company integrates GeoPin’s API, the GDPR’s controller-processor framework applies. The integrating company is the data controller; GeoPin acts as the data processor. We offer a standard Data Processing Agreement (DPA) for business customers that reflects our actual architecture: in-memory processing, no image storage, coordinates-only output.

Why privacy-first is good engineering

Building a privacy-first system was not a sacrifice — it was a simplification. By not storing images, we have eliminated entire categories of technical problems: image storage costs, backup management, access control for sensitive data, retention policies, anonymisation pipelines and data subject requests relating to image content.

Our system is simpler, cheaper and faster because it does not carry the burden of managing an ever-growing image archive. The GDPR guided us towards a better design.

Data minimisation is not just a legal requirement. It is a principle that produces cleaner systems. If you do not need the data, do not collect it. If you do not collect it, you cannot leak it, lose it, or be compelled to hand it over. That is a privacy guarantee no policy document can match — it is enforced by architecture.