dataeng.tools

PII scanner

Drop a CSV, Parquet, JSON, Excel or XML file to scan its columns for personal data: emails, SSNs, credit cards, phone numbers and IPv4 addresses by value, plus names, addresses and dates of birth by column name. It runs entirely in your browser (DuckDB-WASM), so the very data you’re checking for sensitivity never leaves your device.

Parquet, Avro, CSV, TSV, JSON, JSONL, XML — scanned locally in your browser, never uploaded

Frequently asked questions

Does my data get uploaded to scan for PII?
No — and that's the point. The scan runs entirely in your browser with DuckDB-WASM: values are sampled and matched against PII patterns locally. Nothing is sent anywhere, which is exactly what you want from a tool that looks at sensitive data.
What kinds of PII does it detect?
Value patterns: email addresses, US Social Security numbers, credit card numbers (validated with the Luhn check), phone numbers, and IPv4 addresses. It also flags likely name, address and date-of-birth columns from their column names, since the values in those columns have no reliable pattern to match on.
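
To make the two detection styles concrete, here is a minimal Python sketch of value matching (including the Luhn check) and column-name matching. It is an illustration only, not the scanner's actual code: the regular expressions, the hint words in COLUMN_HINTS, and the function names are all assumptions, and phone numbers are left out for brevity.

```python
import re
from typing import Optional

# Illustrative patterns only; the scanner's real expressions may differ.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\d{3}-\d{2}-\d{4}")
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_RE = re.compile(rf"(?:{OCTET}\.){{3}}{OCTET}")
CARD_RE = re.compile(r"\d(?:[ -]?\d){12,18}")  # 13 to 19 digits, optional separators

# Hypothetical hint words for the column-name heuristic.
COLUMN_HINTS = {
    "name": ("name", "surname"),
    "address": ("address", "street", "postcode", "zip"),
    "date_of_birth": ("dob", "birth"),
}


def luhn_valid(number: str) -> bool:
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify_value(value: str) -> Optional[str]:
    """Return a PII label for a single cell value, or None if nothing matches."""
    v = value.strip()
    if EMAIL_RE.fullmatch(v):
        return "email"
    if SSN_RE.fullmatch(v):
        return "ssn"
    if IPV4_RE.fullmatch(v):
        return "ipv4"
    if CARD_RE.fullmatch(v) and luhn_valid(v):
        return "credit_card"
    return None


def classify_column_name(column: str) -> Optional[str]:
    """Return a PII label suggested by the column name alone, or None."""
    key = re.sub(r"[^a-z0-9]+", "_", column.lower())
    for label, hints in COLUMN_HINTS.items():
        if any(hint in key for hint in hints):
            return label
    return None
```

In this sketch the Luhn step is what separates a plausible card number from any other 16-digit string: `classify_value("4111-1111-1111-1111")` is labelled as a card, while `classify_value("1234-5678-9012-3456")` fails the checksum and is not. Substring hints are deliberately crude, which is one reason name-based flags need review.
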
How reliable is it?
Treat it as a fast first pass, not a compliance guarantee. It samples up to a couple thousand rows and relies on heuristics, so it can miss PII in unsampled rows or unusual formats (false negatives) and can flag values that only resemble PII (false positives). Always review the findings.
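
The effect of sampling can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the sample size of 2000, the 50% threshold in min_share, random (rather than first-N) sampling, and the helper names are hypothetical and not taken from the tool.

```python
import random
import re
from collections import Counter
from typing import Callable, Optional, Sequence, Tuple

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def label_email(value: str) -> Optional[str]:
    """Toy classifier: labels a value 'email' or returns None."""
    return "email" if EMAIL_RE.fullmatch(value.strip()) else None


def scan_column(
    values: Sequence[Optional[str]],
    classify: Callable[[str], Optional[str]],
    sample_size: int = 2000,   # assumed cap on rows inspected
    min_share: float = 0.5,    # assumed share of matches needed to flag
    seed: int = 0,
) -> Optional[Tuple[str, float]]:
    """Flag a column if enough sampled values match one PII label."""
    non_empty = [v for v in values if v not in (None, "")]
    if not non_empty:
        return None
    if len(non_empty) > sample_size:
        sample = random.Random(seed).sample(non_empty, sample_size)
    else:
        sample = non_empty
    counts = Counter(label for label in map(classify, sample) if label)
    if not counts:
        return None
    label, hits = counts.most_common(1)[0]
    share = hits / len(sample)
    return (label, share) if share >= min_share else None
```

Under these assumptions, a column made up entirely of email addresses is flagged, while a column of 10,000 product codes with ten stray emails is not: at most ten sampled values can match, far below the threshold. That second case is exactly the kind of miss a manual review should catch.
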
Are the example values safe to show?
Examples are masked (e.g. a••••••m) so you can see which columns matched without the raw sensitive value being echoed back on screen.
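
A mask of that shape can be produced with a very small function. This Python sketch assumes the first and last characters are kept and every character in between is replaced with a bullet; the tool's real masking rule may differ, for example by using a fixed number of bullets so the value's length is not revealed. The name mask_example is hypothetical.

```python
def mask_example(value: str, mask_char: str = "•") -> str:
    """Keep the first and last characters and mask everything in between."""
    if len(value) <= 2:
        # Too short to keep any character without exposing most of the value.
        return mask_char * len(value)
    return value[0] + mask_char * (len(value) - 2) + value[-1]
```

With this rule an eight-character value such as `abcdefgm` is displayed as `a••••••m`, and values of one or two characters are masked completely.
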