Drift Detective
Drift Detective is a Python library for tracking schema evolution using versioned JSON snapshots

About this project
“Did the structure of my data change, and should I care?”
Drift Detective is a Python library for tracking schema evolution and detecting structural drift in tabular datasets using versioned JSON snapshots.
It is designed for data workflows where table schemas evolve over time.
The library focuses on schema-level changes, not row-level changes.
Drift Detective is built around four core components, each responsible for a specific part of schema tracking and reporting:
DfSnapshot: Captures the schema state of a pandas DataFrame at a specific point in time and stores it as a versioned snapshot.
SnapshotHistory: Creates a schema evolution timeline listing each version and its schema changes.
SnapshotDiff: Compares schema changes between two snapshot versions, listing all added and removed columns across intermediate versions.
SchemaReport: Combines the other three components into a single, complete report that tells the full story.
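To make the comparison step concrete, here is a minimal sketch of how added and removed columns could be collected across intermediate versions. It is not Drift Detective's actual API: `diff_schemas` and `diff_across_versions` are hypothetical helper names, and the schemas are plain column-to-dtype dictionaries (with pandas, such a mapping can be built as `{col: str(dtype) for col, dtype in df.dtypes.items()}`). The column set is an abbreviated version of the netflix_titles example.

```python
from typing import Dict, List


def diff_schemas(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Return columns added to and removed from `old` to reach `new`.

    Hypothetical helper for illustration; not part of the library.
    """
    return {
        "columns_added": [col for col in new if col not in old],
        "columns_removed": [col for col in old if col not in new],
    }


def diff_across_versions(schemas: List[Dict[str, str]], start: int, end: int) -> List[dict]:
    """Collect the changes of every step between two 1-based versions.

    `schemas` is ordered by version, so schemas[0] is v1 (an assumption
    of this sketch).
    """
    steps = []
    for version in range(start + 1, end + 1):
        change = diff_schemas(schemas[version - 2], schemas[version - 1])
        steps.append({"version": version, **change})
    return steps


# Abbreviated schemas: v2 drops "title", v3 drops "listed_in".
v1 = {"show_id": "object", "title": "object", "listed_in": "object", "release_year": "int64"}
v2 = {"show_id": "object", "listed_in": "object", "release_year": "int64"}
v3 = {"show_id": "object", "release_year": "int64"}

steps = diff_across_versions([v1, v2, v3], start=1, end=3)
for step in steps:
    print(step)
```

Walking the versions one step at a time, rather than comparing only the first and last schema, keeps a column that was added and later removed visible in the result.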
JSON snapshot
{
  "table_name": "netflix_titles",
  "filepath": "netflix_titles.csv",
  "timestamp": "20251230_161527",
  "version": 1,
  "column_count": 12,
  "row_count": 8807,
  "schema": {
    "show_id": "object",
    "type": "object",
    "title": "object",
    "director": "object",
    "cast": "object",
    "country": "object",
    "date_added": "object",
    "release_year": "int64",
    "rating": "object",
    "duration": "object",
    "listed_in": "object",
    "description": "object"
  },
  "columns_added": [],
  "columns_removed": []
}

Snapshot Timeline for table: netflix_titles
────────────────────────────────────────────────────────────
v1 ● 20251230_162126
│ columns: 12
│ rows: 8807
│ initial snapshot
v2 ● 20251230_163649
│ columns: 11
│ rows: 8807
│ - removed columns: title
v3 ● 20251230_163729
│ columns: 10
│ rows: 8807
│ - removed columns: listed_in
────────────────────────────────────────────────────────────
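The snapshot fields shown above are enough to rebuild a timeline of this kind. The following is a rough sketch under stated assumptions, not the library's implementation: `render_timeline` is a hypothetical function, the rule width of 60 characters and the "+ added columns" and "no schema changes" lines are guesses, and the input is reduced to the fields the timeline needs.

```python
import json
from typing import Dict, List


def render_timeline(snapshots: List[Dict]) -> str:
    """Format snapshot dicts as a text timeline (hypothetical helper)."""
    if not snapshots:
        return "No snapshots found"
    ordered = sorted(snapshots, key=lambda snap: snap["version"])
    rule = "─" * 60  # width chosen arbitrarily for this sketch
    lines = [f"Snapshot Timeline for table: {ordered[0]['table_name']}", rule]
    for snap in ordered:
        lines.append(f"v{snap['version']} ● {snap['timestamp']}")
        lines.append(f"│ columns: {snap['column_count']}")
        lines.append(f"│ rows: {snap['row_count']}")
        added, removed = snap["columns_added"], snap["columns_removed"]
        if added:
            # Assumed format: the source only shows removals.
            lines.append("│ + added columns: " + ", ".join(added))
        if removed:
            lines.append("│ - removed columns: " + ", ".join(removed))
        if not added and not removed:
            # Assumed wording for later versions without changes.
            lines.append("│ initial snapshot" if snap["version"] == 1 else "│ no schema changes")
    lines.append(rule)
    return "\n".join(lines)


# Two snapshots in the JSON layout shown above, trimmed to the fields used here.
raw = """[
  {"table_name": "netflix_titles", "timestamp": "20251230_163649", "version": 2,
   "column_count": 11, "row_count": 8807,
   "columns_added": [], "columns_removed": ["title"]},
  {"table_name": "netflix_titles", "timestamp": "20251230_162126", "version": 1,
   "column_count": 12, "row_count": 8807,
   "columns_added": [], "columns_removed": []}
]"""

snapshots = json.loads(raw)
timeline = render_timeline(snapshots)
print(timeline)
```

Sorting by the `version` field means the snapshot files can be read in any order and still produce a chronological timeline.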