Building GDPR Compliance with Open-Source Tools
This guide outlines the technical implementation steps required to align a SaaS application with GDPR requirements, focusing on data mapping, privacy-first analytics, and automated data subject request workflows. It moves beyond legal theory into specific database and infrastructure configurations.
Audit and Map PII in the Database
Identify every column containing Personally Identifiable Information (PII). In a typical SaaS schema, this includes 'email', 'billing_address', and 'last_sign_in_ip'. Create a data dictionary to track where this data resides to facilitate future erasure requests.
SELECT table_name, column_name
FROM information_schema.columns
-- '%\_ip' matches names ending in "_ip", such as last_sign_in_ip
WHERE column_name ILIKE ANY (ARRAY['%email%', '%address%', '%phone%', '%\_ip'])
AND table_schema = 'public';
⚠ Common Pitfalls
- Forgetting PII stored in JSONB blobs
- Overlooking backup files and staging database clones
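The column-name query cannot see inside JSONB values, so a data dictionary built from it alone will miss nested PII. The sketch below shows one way to classify names and walk a parsed JSONB document for suspicious keys. The pattern list, the category labels, and the function names `classifyName` and `findPiiPaths` are illustrative assumptions, not part of any library.

```javascript
// Name fragments that suggest PII. This list is an assumption; extend it
// to match the naming conventions of your own schema.
const PII_PATTERNS = [
  { category: 'contact', regex: /email|phone/i },
  { category: 'network', regex: /(^|_)ip($|_)/i },
  { category: 'location', regex: /address/i },
];

// Returns the PII category for a column or key name, or null if none matches.
function classifyName(name) {
  const hit = PII_PATTERNS.find((p) => p.regex.test(name));
  return hit ? hit.category : null;
}

// Recursively walks a parsed JSONB value and returns the dotted paths of
// every object key whose name looks like PII.
function findPiiPaths(value, prefix = '') {
  if (value === null || typeof value !== 'object') return [];
  const isArray = Array.isArray(value);
  const paths = [];
  for (const [key, child] of Object.entries(value)) {
    const path = isArray ? `${prefix}[${key}]` : prefix ? `${prefix}.${key}` : key;
    // Array indexes are positions, not names, so only object keys are classified.
    if (!isArray && classifyName(key)) paths.push(path);
    paths.push(...findPiiPaths(child, path));
  }
  return paths;
}
```

Running `findPiiPaths` over a sample of rows from each JSONB column gives a list of paths to add to the data dictionary alongside the plain columns.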
Transition to Privacy-First Analytics
Replace tracking-heavy tools like Google Analytics with privacy-focused alternatives like Plausible or Fathom. These tools set no cookies and count unique visitors with a salted hash instead of a persistent identifier, so they cannot follow individuals across sites or over time. According to their vendors, this removes the need for a cookie consent banner.
<!-- Replace GA script with Plausible -->
<script defer data-domain="yourdomain.com" src="https://plausible.io/js/script.js"></script>
⚠ Common Pitfalls
- Accidentally leaving old GTM (Google Tag Manager) snippets active
- Failing to disable IP logging in application-level error trackers like Sentry
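To make hash-based counting concrete, here is a minimal sketch loosely modeled on the approach Plausible documents: a salt that is regenerated every day is hashed together with the site domain, IP address, and User-Agent. The function names and the `|` separator are assumptions for illustration; this is not the code of any specific analytics product.

```javascript
const crypto = require('node:crypto');

// Generates a fresh random salt. In this sketch the caller is expected to
// replace the salt once per day and discard the old one.
function newDailySalt() {
  return crypto.randomBytes(16).toString('hex');
}

// Derives an anonymous visitor identifier. The raw IP address and
// User-Agent are used only as hash input and are never stored.
function visitorId(dailySalt, domain, ip, userAgent) {
  return crypto
    .createHash('sha256')
    .update(`${dailySalt}|${domain}|${ip}|${userAgent}`)
    .digest('hex');
}
```

Within one day the same visitor produces the same identifier, which is enough to count unique visits. Once the salt is discarded, identifiers from different days can no longer be linked to each other.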
Implement a 'Hard Delete' Workflow for Right-to-Erasure
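GDPR's right to erasure (Article 17) requires that personal data be deleted on request unless another legal obligation, such as statutory retention of invoices, requires keeping it. Many SaaS apps use 'soft deletes' (a deleted_at column), which only hide the row. You must implement a routine that identifies soft-deleted records and permanently purges them, along with associated rows in related tables, for example through ON DELETE CASCADE foreign keys.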
ALTER TABLE user_profiles
ADD CONSTRAINT fk_user
FOREIGN KEY (user_id)
REFERENCES users(id)
ON DELETE CASCADE;
-- To execute erasure:
DELETE FROM users WHERE id = 'target_user_uuid';
⚠ Common Pitfalls
- Orphaned records in logging tables or audit trails
- Data persisting in database WAL segments or periodic backups beyond the one-month response deadline (Article 12(3))
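The purge routine for soft-deleted rows could be sketched as a scheduled job like the one below. It assumes a node-postgres style client exposing `query(text, values)`, a `users.deleted_at` column, and cascading foreign keys as configured above. The function name and the `retentionDays` parameter are hypothetical.

```javascript
// Permanently deletes users that were soft-deleted at least `retentionDays`
// ago and returns the ids that were removed. Child rows are removed by the
// ON DELETE CASCADE constraints, not by this function.
async function purgeSoftDeletedUsers(client, retentionDays = 0) {
  const sql = `
    DELETE FROM users
    WHERE deleted_at IS NOT NULL
      AND deleted_at <= now() - ($1::int * interval '1 day')
    RETURNING id`;
  const result = await client.query(sql, [retentionDays]);
  return result.rows.map((row) => row.id);
}
```

Returning the purged ids lets the caller record that an erasure was completed without retaining any of the erased data itself.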
Configure Conditional Script Loading for Consent
If you use non-essential cookies (e.g., HubSpot, Intercom), you must block these scripts until the user provides explicit affirmative consent. Use a wrapper function to initialize these services only after the consent state is saved in local storage.
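function loadMarketingTools() {
  // Do nothing until the visitor has explicitly opted in
  if (localStorage.getItem('cookie_consent') === 'accepted') {
    // Initialize Intercom or other trackers; the vendor's loader snippet must
    // also be injected here rather than placed in the static HTML
    window.Intercom('boot', { app_id: 'your_id' });
  }
}
// Call on page load and after user clicks 'Accept'
loadMarketingTools();
⚠ Common Pitfalls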
- Loading scripts via 'async' or 'defer' before the consent check executes
- Pre-ticking 'Accept' checkboxes in the UI
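The other half of the consent flow, recording the visitor's choice, might look like the sketch below. It reuses the 'cookie_consent' key from the wrapper function; the `cookie_consent_at` timestamp key and all function names are assumptions. The storage object is passed in as a parameter so the logic can run against `window.localStorage` in the browser or a stub in tests.

```javascript
const CONSENT_KEY = 'cookie_consent';

// Persists an explicit choice. Nothing is written until the visitor acts,
// so the absence of a stored value never counts as consent.
function recordConsent(storage, accepted, now = new Date()) {
  storage.setItem(CONSENT_KEY, accepted ? 'accepted' : 'rejected');
  storage.setItem(`${CONSENT_KEY}_at`, now.toISOString());
}

// True only when the visitor has explicitly accepted.
function hasConsent(storage) {
  return storage.getItem(CONSENT_KEY) === 'accepted';
}

// Click handler body for the banner buttons: store the choice, then load
// the marketing tools only if the choice was 'accepted'.
function onConsentChoice(storage, accepted, loadTools) {
  recordConsent(storage, accepted);
  if (hasConsent(storage)) loadTools();
}
```

In the browser this could be wired up as `onConsentChoice(window.localStorage, true, loadMarketingTools)` on the 'Accept' button and with `false` on the 'Reject' button.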
Anonymize Application Logs
Server logs (Nginx/Apache) often capture IP addresses by default, which are considered PII. Configure your web server to mask the last octet of the IP address or disable IP logging entirely if not required for security forensics.
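# Derive a masked address: keep the first three octets, zero the last (IPv4 only)
map $remote_addr $remote_addr_mask {
    ~(?P<ip>\d+\.\d+\.\d+)\.\d+ $ip.0;
    default 0.0.0.0;
}
log_format privacy_aware '$remote_addr_mask - $remote_user [$time_local] "$request" '
                         '$status $body_bytes_sent "$http_referer" '
                         '"$http_user_agent"';
# Apply the format; the log path is an example
access_log /var/log/nginx/access.log privacy_aware;
⚠ Common Pitfalls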
- Storing raw IP addresses in application-level logs (e.g., Winston, Morgan, or Logrocket)
- Sending unmasked logs to third-party aggregators like Datadog or Papertrail
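The same masking can be applied inside the application before an address reaches any logger or aggregator. The helper below is a minimal sketch; the name `maskIp` and the decision to mask IPv6 addresses completely are assumptions chosen to stay conservative.

```javascript
const net = require('node:net');

// Masks an IP address for logging:
// - IPv4: zero the last octet (203.0.113.42 -> 203.0.113.0)
// - IPv4-mapped IPv6 (::ffff:a.b.c.d): treated as the embedded IPv4 address
// - any other IPv6: fully masked as '::'
// - anything unparseable: '0.0.0.0'
function maskIp(ip) {
  if (typeof ip !== 'string') return '0.0.0.0';
  const candidate = ip.toLowerCase().startsWith('::ffff:') ? ip.slice(7) : ip;
  if (net.isIPv4(candidate)) {
    const octets = candidate.split('.');
    octets[3] = '0';
    return octets.join('.');
  }
  if (net.isIPv6(ip)) return '::';
  return '0.0.0.0';
}
```

Calling `maskIp` at the point where a request address is first read, for example in a logging middleware, keeps the raw value out of every downstream log line.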
Automate Data Portability Exports
Users have the right to receive their data in a machine-readable format (JSON/CSV). Create a background job that aggregates a user's data from all tables and generates a signed URL for download.
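async function exportUserData(userId) {
  // Prisma-style query; adjust the relations to match your schema
  const data = await db.users.findUnique({
    where: { id: userId },
    include: { posts: true, settings: true, billing: true }
  });
  if (!data) throw new Error(`User ${userId} not found`);
  // Strip internal fields before export ('passwordHash' is an assumed column name)
  const { passwordHash, ...exportable } = data;
  const buffer = Buffer.from(JSON.stringify(exportable, null, 2));
  return await storage.upload(`exports/${userId}.json`, buffer);
}
⚠ Common Pitfalls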
- Exporting internal metadata (like password hashes or internal notes) that shouldn't be exposed
- Timeouts when generating exports for high-volume users
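If your storage provider does not issue signed URLs itself, the download link for an export can be protected with an HMAC signature and an expiry time. The sketch below uses only Node's built-in crypto module. The query parameter names `expires` and `sig`, the function names, and the example domain are assumptions.

```javascript
const crypto = require('node:crypto');

// HMAC over the path and expiry, so neither can be altered without detection.
function sign(path, expires, secret) {
  return crypto.createHmac('sha256', secret).update(`${path}:${expires}`).digest('hex');
}

// Builds a download URL that is valid for `ttlSeconds` from `now`.
function createSignedUrl(baseUrl, path, secret, ttlSeconds, now = Date.now()) {
  const expires = Math.floor(now / 1000) + ttlSeconds;
  const url = new URL(path, baseUrl);
  url.searchParams.set('expires', String(expires));
  url.searchParams.set('sig', sign(url.pathname, expires, secret));
  return url.toString();
}

// Returns true only if the signature matches and the link has not expired.
function verifySignedUrl(urlString, secret, now = Date.now()) {
  const url = new URL(urlString);
  const expires = Number(url.searchParams.get('expires'));
  const sig = url.searchParams.get('sig') || '';
  if (!Number.isFinite(expires) || expires < Math.floor(now / 1000)) return false;
  const expected = Buffer.from(sign(url.pathname, expires, secret), 'hex');
  const actual = Buffer.from(sig, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return actual.length === expected.length && crypto.timingSafeEqual(actual, expected);
}
```

A call such as `createSignedUrl('https://example.com', '/exports/u1.json', secret, 3600)` yields a link that stops working after an hour, which limits exposure if the export email is forwarded or intercepted.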
What you built
GDPR compliance is a continuous technical requirement rather than a one-time checklist. By automating data erasure, anonymizing logs, and switching to privacy-first analytics, you reduce the surface area of PII and minimize the administrative burden of manual compliance requests.