Platform
Everything you need to build with real data — without the production risk.
SOFI is an on-prem test data platform for virtualization, masking, refresh, and compliance evidence. Provision masked database workspaces without sending production data to an external SaaS.
Pillars
The complete stack for non-production data.
Six layers that work together to deliver realistic, secure, fresh test data to your team.
Virtualization
Thin clones built from snapshots. Each virtual database (VDB) uses less than 5% of the source storage.
Masking
50+ native rules. Automatic PII detection for SSNs, emails, credit card numbers, and addresses.
CDC
Logical replication via WAL, binlog, and LogMiner. Keeps staging in sync with prod.
38+ connectors
PostgreSQL, MySQL, Oracle, SQL Server, MongoDB, ClickHouse, Cassandra, and more.
Multi-tenancy
ORM-level isolation via tenant_id. Mandatory soft-delete. Full audit trail.
API & CLI
REST API built on FastAPI. Hook into your CI/CD pipeline and provision VDBs from pull requests.
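To make the Masking pillar more concrete, here is a minimal sketch of pattern-based PII detection followed by deterministic masking. It is not SOFI's DataMasker: the names `PII_PATTERNS`, `detect_pii`, and `mask_value`, the regular expressions, and the salt are all assumptions made for illustration, and real detection rules would need to be far more robust.

```python
import hashlib
import re

# Hypothetical detection patterns, for illustration only.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}


def detect_pii(value: str) -> list[str]:
    """Return the names of every pattern that matches somewhere in value."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(value)]


def mask_value(value: str, salt: str = "demo-salt") -> str:
    """Replace each detected match with a deterministic, salted token."""

    def make_replacer(kind: str):
        def replace(match: re.Match) -> str:
            digest = hashlib.sha256((salt + match.group(0)).encode()).hexdigest()[:8]
            return f"<{kind}:{digest}>"

        return replace

    for kind, pattern in PII_PATTERNS.items():
        value = pattern.sub(make_replacer(kind), value)
    return value


row = "Contact jane.doe@example.com, SSN 123-45-6789"
masked = mask_value(row)
```

Because the token is derived from a salted hash of the original value, the same input always maps to the same token. That is one common way to keep joins between masked tables consistent; whether SOFI's rules behave this way is not stated above.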
Architecture
Built to scale with your team.
SOFI runs close to your databases: API workers, masking jobs, snapshot management, and provisioning workers stay inside your private environment. The dashboard and automation surface sit on top of the same audited control path.
┌─ apps/api ────── FastAPI · async SQLAlchemy
│   ├─ Celery × 6 queues
│   └─ RBAC + tenant-scoped access
│
├─ apps/web ────── Next.js 14 · App Router
│   ├─ React Query · Zustand
│   └─ shadcn/ui · Tailwind
│
└─ engine ──────── private data plane
    ├─ Snapshot Manager · CoW
    ├─ DataMasker (50+ rules)
    ├─ CDC (WAL · binlog · LogMiner)
    └─ Cluster Provisioner
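The tenant-scoped access shown in the layout above, together with the soft-delete and audit trail from the Multi-tenancy pillar, can be sketched as a repository that binds every query to a single `tenant_id`. This is an illustration under stated assumptions, not SOFI's code: it uses the standard-library `sqlite3` module in place of async SQLAlchemy, and `TenantScopedRepo`, `make_db`, and the table names are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE workspaces (
            id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL,
            name TEXT NOT NULL, deleted_at TEXT);
        CREATE TABLE audit_log (
            id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL,
            action TEXT NOT NULL, workspace_id INTEGER NOT NULL);
        """
    )
    return conn


class TenantScopedRepo:
    """Hypothetical repository: every statement is filtered by one tenant_id."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str):
        self.conn = conn
        self.tenant_id = tenant_id

    def add(self, name: str) -> int:
        cur = self.conn.execute(
            "INSERT INTO workspaces (tenant_id, name) VALUES (?, ?)",
            (self.tenant_id, name),
        )
        self._audit("create", cur.lastrowid)
        return cur.lastrowid

    def list_names(self) -> list[str]:
        # Soft-deleted rows are hidden from reads, never physically removed.
        rows = self.conn.execute(
            "SELECT name FROM workspaces "
            "WHERE tenant_id = ? AND deleted_at IS NULL ORDER BY id",
            (self.tenant_id,),
        )
        return [row[0] for row in rows]

    def soft_delete(self, workspace_id: int) -> bool:
        """Mark a row as deleted; returns False if it is not this tenant's row."""
        now = datetime.now(timezone.utc).isoformat()
        cur = self.conn.execute(
            "UPDATE workspaces SET deleted_at = ? "
            "WHERE id = ? AND tenant_id = ? AND deleted_at IS NULL",
            (now, workspace_id, self.tenant_id),
        )
        if cur.rowcount == 1:
            self._audit("soft_delete", workspace_id)
            return True
        return False

    def _audit(self, action: str, workspace_id: int) -> None:
        self.conn.execute(
            "INSERT INTO audit_log (tenant_id, action, workspace_id) VALUES (?, ?, ?)",
            (self.tenant_id, action, workspace_id),
        )
```

Putting the tenant filter inside the repository, rather than in each caller, means a forgotten `WHERE` clause cannot leak another tenant's rows. An ORM-level implementation would typically apply the same filter through a session or query hook.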