GDPR and Identity Data: What You Can and Cannot Store

Identity verification produces some of the most sensitive data your systems will ever touch: ID document scans, faces, dates of birth, document numbers, and biometric templates. Under the GDPR, all of it is personal data — and some of it sits in the special-category tier that carries heavier obligations.
This post breaks down what you can store, what you generally cannot, and how to design a KYC flow that holds up under regulatory scrutiny.
The data categories you are dealing with
During a typical verification, you collect several distinct types of data:
- Identity attributes — name, date of birth, document number, expiry date, nationality.
- Document images — the scanned passport, driving licence, or ID card.
- Selfie / face images — captured for face match and liveness.
- Biometric templates — the mathematical representation used to compare a selfie against a document photo.
- Verification metadata — pass/fail results, fraud flags, timestamps, IP addresses.
Under GDPR, biometric data used to uniquely identify a person is special-category data (Article 9). That changes the legal footing for storing it.
Why biometrics are treated differently
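Article 9(1) prohibits processing biometric data for the purpose of uniquely identifying a person unless one of the exceptions in Article 9(2) applies. The exceptions most often relied on for KYC are explicit consent (Article 9(2)(a)) and substantial public interest (Article 9(2)(g)); the latter needs a basis in Union or Member State law and is often underpinned by national AML legislation. "Legitimate interests" (Article 6(1)(f)) is a lawful basis under Article 6, not an Article 9 exception, so on its own it does not cover special-category biometrics.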
Heads up
A face match score is not automatically special-category data, but the biometric template that produced it usually is. If you persist templates, treat them as Article 9 data and document a valid exception before you store anything.
What you can store
You can store identity data when you have a lawful basis and a defined purpose. For most KYC use cases that basis is legal obligation (Article 6(1)(c)) tied to AML/CTF rules, supplemented by an Article 9(2) exception, typically explicit consent, for any biometric element.
Things you can reasonably retain:
- Verification outcomes — pass/fail, the reasons, and which checks ran. These support your audit trail and are often required by AML regulators.
- Identity attributes needed for ongoing customer due diligence.
- Document images and selfies, where your AML regime requires you to keep evidence of the checks performed.
- AML/PEP and sanctions screening results, so you can demonstrate you screened the customer and what you found.
The key is purpose limitation. You store data to meet a documented obligation, not "in case it is useful later."
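One way to make that documentation checkable is to keep the lawful-basis register as data. The sketch below is a minimal illustration, not a prescribed format: `ProcessingEntry`, `missing_article9`, and the example entries are hypothetical names and values chosen for this post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProcessingEntry:
    """One row of a hypothetical lawful-basis register."""
    category: str                      # e.g. "identity_attributes"
    purpose: str                       # the documented purpose for keeping it
    article6_basis: str                # lawful basis under Article 6
    special_category: bool = False     # True for Article 9 data such as templates
    article9_exception: Optional[str] = None


def missing_article9(register: List[ProcessingEntry]) -> List[str]:
    """Return categories flagged special-category with no Article 9 exception."""
    return [
        entry.category
        for entry in register
        if entry.special_category and not entry.article9_exception
    ]


# Example entries; the wording of each basis is an assumption for illustration.
REGISTER = [
    ProcessingEntry("verification_outcome", "AML audit trail", "6(1)(c) legal obligation"),
    ProcessingEntry("identity_attributes", "ongoing due diligence", "6(1)(c) legal obligation"),
    ProcessingEntry("biometric_template", "face match", "6(1)(c) legal obligation",
                    special_category=True),  # no exception recorded yet
]
```

Running `missing_article9(REGISTER)` in a CI check or at service start-up flags the template entry until someone records the exception that covers it.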
Retention: keep it only as long as you must
GDPR's storage-limitation principle clashes with AML record-keeping rules, which often mandate retention for five years (or more) after the relationship ends. Resolve this explicitly:
- Map each data element to the rule that requires keeping it.
- Set automatic deletion when the retention window closes.
- Separate AML-mandated records from operational data you no longer need.
If a piece of data is not required by law and no longer serves your stated purpose, delete it.
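The mapping and automatic deletion described above can be sketched as a small retention schedule. Everything here is an assumption made for illustration: the field names, the `RetentionRule` type, and the retention periods are placeholders to be replaced by whatever your AML regime actually mandates. The sketch only models the window that starts when the customer relationship ends.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RetentionRule:
    basis: str            # the documented rule that requires retention
    years_after_end: int  # window counted from the end of the relationship


# Hypothetical schedule: every stored field must appear here.
SCHEDULE = {
    "verification_outcome": RetentionRule("AML record-keeping", 5),
    "document_image": RetentionRule("AML record-keeping", 5),
    "ip_address": RetentionRule("operational, no legal mandate", 0),
}


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 Feb to 28 Feb when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def delete_after(field: str, relationship_end: date) -> date:
    """First date on which the field should be deleted."""
    rule = SCHEDULE[field]  # a KeyError means the field was never mapped
    return add_years(relationship_end, rule.years_after_end)


def is_due_for_deletion(field: str, relationship_end: Optional[date], today: date) -> bool:
    """True once the window has closed; an open relationship is never due here."""
    if relationship_end is None:
        return False
    return today >= delete_after(field, relationship_end)
```

A scheduled job can call `is_due_for_deletion` for each stored field. Letting an unmapped field raise `KeyError` is deliberate: data with no documented rule should not be silently retained.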
What you cannot store (or should not)
- Biometric templates without a valid Article 9 basis. No explicit consent or statutory exception means no storage.
- Raw data beyond your stated purpose. Collecting a passport for age verification does not entitle you to mine the MRZ for marketing segments.
- Data kept "indefinitely." Open-ended retention is one of the most common GDPR findings.
- Sensitive fields you never needed. If your check does not require the document photo to persist, do not store it.
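- Plaintext sensitive data at rest. The GDPR does not prohibit it by name, but Article 32 requires "appropriate technical and organisational measures" and names encryption as an example. Unencrypted identity data is hard to square with that, and it is a breach waiting to happen.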
Designing a compliant verification pipeline
A few architecture choices make GDPR compliance dramatically easier.
Minimise what leaves the device and what you persist
Run document OCR and MRZ/barcode reading to extract only the attributes you actually need. If you only need to confirm a person is over 18, you do not need to retain the full document image after the check.
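For the over-18 case, the idea can be sketched as a check that reads the date of birth transiently and returns only the outcome. The `extracted` dictionary stands in for whatever your OCR or MRZ step produces; its keys, `AgeCheckResult`, and `check_age` are hypothetical names for this illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AgeCheckResult:
    """The only thing persisted: the outcome, not the date of birth."""
    over_threshold: bool
    threshold_years: int
    checked_on: date


def age_on(dob: date, on: date) -> int:
    """Full years completed between dob and the given date."""
    years = on.year - dob.year
    if (on.month, on.day) < (dob.month, dob.day):
        years -= 1  # birthday has not happened yet this year
    return years


def check_age(extracted: dict, today: date, threshold_years: int = 18) -> AgeCheckResult:
    """Read the date of birth from the extraction output and return only the outcome."""
    dob = extracted["date_of_birth"]
    return AgeCheckResult(
        over_threshold=age_on(dob, today) >= threshold_years,
        threshold_years=threshold_years,
        checked_on=today,
    )
```

The result carries no date of birth, document number, or image, so persisting it does not quietly persist the source document too.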
Decide where biometrics live — and for how long
If you run face match and liveness, decide whether you persist the template at all. Many flows perform the comparison, store only the pass/fail result, and discard the template. That single decision can move you out of long-term Article 9 storage entirely.
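A compare-then-discard flow might look like the sketch below. It assumes templates arrive as plain lists of floats and uses cosine similarity with an arbitrary threshold of 0.8; real face-match engines use their own template formats, scoring, and thresholds, so treat every name and number here as hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Sequence


@dataclass(frozen=True)
class FaceMatchRecord:
    """What gets persisted: the outcome and when it was produced."""
    passed: bool
    checked_at: datetime


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("templates must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("templates must be non-zero")
    return dot / norm


def match_and_discard(
    selfie_template: Sequence[float],
    document_template: Sequence[float],
    threshold: float = 0.8,  # illustrative value, not a recommended setting
) -> FaceMatchRecord:
    """Compare two templates and return a record that contains neither of them."""
    score = cosine_similarity(selfie_template, document_template)
    # Only the boolean outcome leaves this function; the templates and the raw
    # score are never written to the record.
    return FaceMatchRecord(passed=score >= threshold,
                           checked_at=datetime.now(timezone.utc))
```

"Discard" here means the templates are never written to storage. It does not scrub them from process memory, and any logging or tracing around the call needs the same scrutiny.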
Encrypt, isolate, and control access
ID Analyzer is ISO 27001 certified, and our Vault lets you store verification records and documents in an encrypted, access-controlled environment instead of scattering sensitive files across your own buckets. For organisations that cannot let data leave their jurisdiction or premises at all, ID Fort offers on-premise deployment so identity data never traverses a third party.
Honour data-subject rights
Whatever you build, individuals can request access, rectification, and — within AML limits — erasure. Make sure you can:
- Locate every record tied to one person.
- Export it in a portable format.
- Delete it once retention obligations expire.
If your data is spread across logs, databases, and storage buckets with no index, fulfilling these requests becomes a manual nightmare.
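A per-subject index is one way to avoid that. The sketch below keeps an in-memory map from a subject identifier to references to every record held about them. `SubjectIndex`, `RecordRef`, and the store names are hypothetical, and a production version would live in a database rather than a dictionary. Note that `export` produces a manifest of where the data lives, not the data itself.

```python
import json
from collections import defaultdict
from dataclasses import asdict, dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RecordRef:
    store: str                     # e.g. "postgres", "object-storage", "logs"
    key: str                       # locator within that store
    retain_until: Optional[date]   # None means no legal retention applies


class SubjectIndex:
    """Maps one person to every record held about them."""

    def __init__(self) -> None:
        self._refs: Dict[str, List[RecordRef]] = defaultdict(list)

    def register(self, subject_id: str, ref: RecordRef) -> None:
        self._refs[subject_id].append(ref)

    def locate(self, subject_id: str) -> List[RecordRef]:
        """Every known record for the subject (empty list if none)."""
        return list(self._refs.get(subject_id, []))

    def export(self, subject_id: str) -> str:
        """JSON manifest of the subject's records; dates become ISO strings."""
        return json.dumps([asdict(r) for r in self.locate(subject_id)],
                          default=str, indent=2)

    def erasable(self, subject_id: str, today: date) -> List[RecordRef]:
        """Records with no retention hold, or whose hold has expired."""
        return [
            r for r in self.locate(subject_id)
            if r.retain_until is None or r.retain_until <= today
        ]
```

Registering a reference at write time, in the same code path that stores the record, is what keeps the index trustworthy. Rebuilding it after the fact means searching the same scattered stores you were trying to avoid.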
A practical checklist
- Document your lawful basis for each data type — and a separate Article 9 basis for biometrics.
- Map every field to a retention period backed by a rule.
- Minimise collection and discard biometric templates where you can.
- Encrypt at rest and in transit; restrict access.
- Automate deletion at the end of retention.
- Keep an audit trail of checks performed.
Identity verification and GDPR are not in conflict — they pull in the same direction toward minimal, purposeful, well-secured data. Build for that from the start, and compliance becomes a property of your architecture rather than a scramble before an audit.



