Here’s a question most organisations haven’t considered: when you email a PDF to a client, and that PDF contains your employee’s full name, your company name, and a creation timestamp in its metadata, have you just processed personal data without a clear legal basis? Under GDPR, the answer is almost certainly yes.
The GDPR defines personal data broadly: any information relating to an identified or identifiable natural person. File metadata frequently contains exactly this.
A Word document’s author field contains an employee’s name. A photo’s EXIF data contains GPS coordinates precise enough to identify a home address. A PDF’s producer field reveals the specific software version and operating system running on an employee’s machine. Timestamps with timezone offsets can narrow someone’s geographic location. Device serial numbers embedded in photos create a persistent identifier linkable to an individual.
All of this constitutes personal data under GDPR. And every time a file containing this metadata is shared externally, that personal data is being transmitted — often without the data subject’s knowledge or meaningful consent.
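To make this concrete, here is a minimal sketch of how little effort it takes to read the author fields out of a Word document. It relies only on the fact that a .docx file is a ZIP package whose core properties live in docProps/core.xml; the function name `read_core_properties` and the `FIELDS` mapping are illustrative choices, not part of any tool mentioned in this article.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by docProps/core.xml in Office Open XML packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Core properties that commonly carry a person's name or a timestamp.
# (Illustrative selection; other parts of the package can hold more.)
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(docx_file):
    """Return identifying core properties from a .docx (path or file object)."""
    with zipfile.ZipFile(docx_file) as package:
        if "docProps/core.xml" not in package.namelist():
            return {}
        root = ET.fromstring(package.read("docProps/core.xml"))

    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[label] = element.text
    return found
```

Calling `read_core_properties("contract.docx")` on a typical document returns a dictionary with the author's name and the creation and modification timestamps. Anyone who receives the file can do the same.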
Article 5(1)(c) of GDPR establishes data minimization as a core principle: personal data shall be adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed.
When you send a contract to a client, the purpose is conveying the contractual terms. The author’s name in the PDF metadata, the total editing time, and the template filename are not relevant to that purpose. Including them violates the data minimization principle.
This isn’t an aggressive reading of the regulation. The European Data Protection Board has consistently emphasized that data minimization applies to all personal data processing, including incidental processing that organisations might not think of as intentional data sharing.
Outbound document sharing. Every document your organisation sends externally — contracts, proposals, reports, invoices — potentially contains employee names, company metadata, and editing history. If your standard practice is to email Word documents and PDFs without stripping metadata, you’re systematically sharing personal data beyond what’s necessary for the business purpose.
Published reports and filings. Documents published on your website or submitted to regulators carry metadata that becomes publicly accessible. An annual report PDF with seventeen different author names embedded in its metadata exposes those individuals’ involvement in the document’s creation to anyone who downloads it.
Photography and marketing materials. Product photos, event photos, and marketing images carry EXIF data including GPS coordinates (where was the photo taken?), device information (whose phone or camera?), and timestamps. If your marketing team shares photos on your website or social channels without stripping metadata, you may be publishing employee location data.
Cross-border document transfers. When documents carrying employee metadata are sent to recipients outside the EU/EEA, sending that metadata constitutes a cross-border transfer of personal data, potentially triggering GDPR’s additional requirements for international data transfers.
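For the photography case above, the sketch below shows one way the EXIF block could be removed from a JPEG before publishing. It assumes a conventionally structured JPEG, where metadata segments precede the start-of-scan marker, and it drops only APP1 segments, which is where Exif (including GPS) and XMP data are stored. Other metadata, such as IPTC data in APP13 or comment segments, is left untouched. The function name `strip_app1_segments` is illustrative.

```python
import struct

SOI = b"\xff\xd8"   # start-of-image marker
APP1 = 0xE1         # segment type that holds Exif and XMP metadata
SOS = 0xDA          # start-of-scan: compressed image data follows


def strip_app1_segments(jpeg_bytes):
    """Return a copy of a JPEG with its APP1 (Exif/XMP) segments removed."""
    if jpeg_bytes[:2] != SOI:
        raise ValueError("not a JPEG: missing SOI marker")

    out = bytearray(SOI)
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("unexpected byte where a marker was expected")
        marker = jpeg_bytes[pos + 1]

        if marker == SOS:
            # Everything from here on is image data; copy it unchanged.
            out += jpeg_bytes[pos:]
            return bytes(out)

        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        segment_end = pos + 2 + length
        if marker != APP1:
            out += jpeg_bytes[pos:segment_end]
        pos = segment_end

    # No SOS marker found; keep whatever bytes remain.
    out += jpeg_bytes[pos:]
    return bytes(out)
```

Because the image data itself is copied byte for byte, this approach does not recompress the photo. A production workflow would more likely use an established imaging library and handle the additional metadata segments as well.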
No GDPR enforcement action has yet centred exclusively on file metadata. But several DPA guidance documents and decisions touch on the issue:
Multiple European data protection authorities have included document metadata in their guidance on data minimization in practice. The Irish DPC’s guidance on data protection by design specifically mentions document metadata as an area where organisations should implement technical measures to prevent unnecessary data disclosure.
The principle of data protection by design and by default (Article 25) is directly relevant. An organisation that has no process for metadata management is arguably failing to implement data protection by design: the default behaviour of its document workflow exposes personal data unnecessarily.
In practice, regulators are more likely to cite metadata issues as an aggravating factor in a broader investigation than to pursue standalone enforcement. But the principle is clear: if you’re sharing files externally, the metadata in those files should be treated as part of your data processing activities.
Establish a metadata policy. Define which document types require metadata stripping before external sharing. At minimum, this should cover all client-facing documents, publicly published files, and any documents sent to third parties.
Implement technical controls. Rather than relying on individual employees to remember to strip metadata, implement it as a workflow step. This could be a document management system that strips metadata on export, a designated tool that employees use before sending files, or automated metadata removal in your email gateway.
Audit your templates. Document templates are a common source of inherited metadata. Review your organisation’s templates to ensure they don’t carry metadata from previous authors, other organisations, or sensitive internal classifications.
Include metadata in your data processing records. If you’re maintaining records of processing activities under Article 30, consider whether file metadata should be included as a category of personal data that your organisation processes.
Train your people. Most employees have no idea that the documents they create carry hidden metadata. A brief awareness session — showing them what their own documents contain — is usually sufficient to change behaviour. The moment someone sees their home address embedded in a photo they took at their kitchen table, the lesson sticks.
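As an illustration of the technical-control step above, the following sketch shows what an automated "strip on export" pass for Word documents might look like. It rewrites the .docx package and blanks the author fields in docProps/core.xml. This is a simplified example under stated assumptions: the function name `strip_docx_authors` is hypothetical, and it does not touch docProps/app.xml (which can hold the company name and editing time), tracked changes, or comments, all of which a real control would also need to handle.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CP = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC = "http://purl.org/dc/elements/1.1/"

# Keep the conventional prefixes when the XML is written back out.
ET.register_namespace("cp", CP)
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", "http://purl.org/dc/terms/")
ET.register_namespace("dcmitype", "http://purl.org/dc/dcmitype/")

# Elements in core.xml that name a person (illustrative selection).
IDENTIFYING_TAGS = {f"{{{DC}}}creator", f"{{{CP}}}lastModifiedBy"}


def strip_docx_authors(docx_bytes):
    """Return a copy of a .docx with author names blanked in core.xml."""
    source = zipfile.ZipFile(io.BytesIO(docx_bytes))
    output = io.BytesIO()
    with source, zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            data = source.read(item.filename)
            if item.filename == "docProps/core.xml":
                root = ET.fromstring(data)
                for element in root:
                    if element.tag in IDENTIFYING_TAGS:
                        element.text = ""
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            # Every other part of the package is copied through unchanged.
            target.writestr(item.filename, data)
    return output.getvalue()
```

A function like this could sit behind an export button, a shared-folder watcher, or a mail gateway hook, so that nobody has to remember to run it by hand.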
GDPR compliance doesn’t mean treating every metadata field as a crisis. The regulation is principles-based and expects proportionate responses to data protection risks.
For most organisations, a reasonable approach is: strip metadata from documents and images before sharing them externally, audit and clean templates regularly, and include metadata in your data protection training. You don’t need to strip metadata from internal documents shared among colleagues (though there may be other reasons to do so), and you don’t need to retroactively clean every file your organisation has ever shared.
The key is demonstrating that you’ve considered the issue, implemented appropriate measures, and can show that your approach is consistent with data minimization principles.
MetaStrip processes files entirely in the browser, meaning your documents never leave your infrastructure — an important consideration for organisations handling confidential or regulated data. For teams processing documents regularly, the document batch pass handles up to 25 files at once with selective removal options.
Metadata compliance is one of those areas where the gap between what organisations should be doing and what they are doing is wide. The organisations that close that gap now will be better positioned when a regulator eventually asks the question.