Git for Datasets
Time-travel debugging and transformation lineage tracking
Version control for your data transformations. Track every column change, instantly replay any checkpoint, and debug pipeline mysteries with an in-memory DAG that records the complete history of your DataFrame.
Pure Python, zero overhead. Works with Pandas, Polars, and NumPy arrays.
"Why did my metric change?"
You ran a 6-hour training job. The Sharpe ratio dropped from 1.4 to 0.9. Somewhere between the raw tick data and the feature matrix, a silent transformation introduced look-ahead bias or row filtering. You have no idea where. Your logs don't track it. Git can't diff binary Parquet files. And by the time you discover the bug, you've wasted days.
File-level versioning tools version entire files with S3 backends, CI pipelines, and YAML configs. You don't want to learn a new orchestration system; you want to know what happened to column price_lag1 between step 3 and step 7.
flashback tracks changes at the column and row level with zero I/O overhead. It records every transformation in an in-memory DAG and lets you time-travel instantly to any checkpoint.
git diff on a Parquet file is binary noise. It cannot tell you "this .filter() removed 412 rows" or "this .with_columns() introduced a null in 3% of rows."
flashback shows you exactly which rows were added or removed, which columns changed, and how many nulls appeared, all in human-readable format.
flashback wraps your DataFrame in a zero-cost proxy that records every transformation as a node in an in-memory Directed Acyclic Graph (DAG). Each node is identified by a deterministic SHA-256 hash of the schema + operation arguments.
The key insight: Identical transformations applied to identical data always produce the same node ID. Transformations are deterministic by construction, making them reproducible and cache-able.
fb.checkout("before-lag") returns the exact frame at that checkpoint with no I/O unless you ask for it.
frame.diff(other) shows exactly which rows were added or removed between any two checkpoints.
fb.visualize() renders a git-log-style tree in your terminal or an SVG graph in Jupyter.
Identical transformations on identical data always produce the same node ID: no hidden state, no surprises.
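To make the row-level diff described above more concrete, here is a minimal sketch that compares two snapshots held as plain Python tuples. It is not flashback's implementation: `row_diff` and the sample rows are hypothetical, and the real `.diff()` works on frames rather than lists.

```python
from collections import Counter


def row_diff(before, after):
    """Return the rows removed from `before` and the rows added in `after`.

    Rows are hashable tuples. Counter arithmetic keeps duplicate rows
    distinct, so removing one of two identical rows is still reported.
    """
    before_counts = Counter(before)
    after_counts = Counter(after)
    removed = list((before_counts - after_counts).elements())
    added = list((after_counts - before_counts).elements())
    return {"removed": removed, "added": added}


# Hypothetical (id, price) rows before and after a filtering step.
before = [(1, 10.0), (2, -5.0), (3, 7.5)]
after = [(1, 10.0), (3, 7.5), (4, 2.0)]

result = row_diff(before, after)
print(result)  # {'removed': [(2, -5.0)], 'added': [(4, 2.0)]}
```

Counting rows instead of using plain sets is a deliberate choice here: tick data often contains exact duplicates, and a set-based diff would silently hide their removal.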
Architecture diagram: a FlashbackFrame intercepts Polars API calls (.filter(), .sort(), .join()) and records each one as a node in the LineageDAG (root → filter → sort → join). The underlying polars.DataFrame is unchanged, so Polars still optimises.
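The interception pattern in the diagram can be sketched in a few lines of plain Python. This is an illustration only, not flashback's source: `RecordingProxy` and the `Table` stand-in are hypothetical names, and a list of dicts takes the place of a real DataFrame so the example runs without any dependencies.

```python
class RecordingProxy:
    """Forward method calls to a wrapped object and record each one."""

    def __init__(self, target, history=()):
        self._target = target
        self._history = tuple(history)

    def __getattr__(self, name):
        # Only called for names not found on the proxy itself.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def recorded(*args, **kwargs):
            result = attr(*args, **kwargs)
            # Re-wrap results of the same type so chained calls stay tracked.
            if isinstance(result, type(self._target)):
                entry = {"op": name, "args": args, "kwargs": kwargs}
                return RecordingProxy(result, self._history + (entry,))
            return result

        return recorded

    @property
    def history(self):
        return list(self._history)


class Table:
    """Tiny stand-in for a DataFrame: a list of dict rows."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, predicate):
        return Table(r for r in self.rows if predicate(r))

    def sort(self, key):
        return Table(sorted(self.rows, key=lambda r: r[key]))


frame = RecordingProxy(Table([{"price": 3.0}, {"price": -1.0}, {"price": 1.5}]))
frame = frame.filter(lambda r: r["price"] > 0).sort("price")

ops = [step["op"] for step in frame.history]
print(ops)         # ['filter', 'sort']
print(frame.rows)  # [{'price': 1.5}, {'price': 3.0}]
```

Each proxy carries its own immutable history tuple, so an earlier frame never sees operations applied to a later one. That keeps the recorded chain a faithful description of how that particular frame was produced.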
Each node is identified by a SHA-256 digest, truncated to 20 hex characters, of:
{
"parents": ["<parent_node_id>"],
"op": "filter",
"kwargs": {"arg_0": "[(col(\"price\")) > (0)]"},
"schema": {"id": "Int64", "price": "Float64", ...}
}
This means the node ID is a pure function of the parent IDs, the operation, its arguments, and the schema.
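A minimal sketch of how such an ID could be derived from the payload above, using only the standard library. The function name `node_id`, the canonical JSON encoding, and the choice to keep the first 20 hex characters are assumptions made for illustration; the library's exact serialisation may differ.

```python
import hashlib
import json


def node_id(parents, op, kwargs, schema, length=20):
    """Hash a node description into a short, deterministic hex ID."""
    payload = {"parents": parents, "op": op, "kwargs": kwargs, "schema": schema}
    # sort_keys and fixed separators give a canonical byte string, so the
    # same logical payload always hashes to the same digest.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


schema = {"id": "Int64", "price": "Float64"}
predicate = '[(col("price")) > (0)]'

first = node_id(["root"], "filter", {"arg_0": predicate}, schema)
again = node_id(["root"], "filter", {"arg_0": predicate}, schema)
changed = node_id(["root"], "filter", {"arg_0": '[(col("price")) > (1)]'}, schema)

print(len(first))        # 20
print(first == again)    # True
print(first == changed)  # False
```

Note that `sort_keys=True` also makes the hash insensitive to column order inside the schema dict. If column order should affect the ID, the schema would need to be encoded as an ordered list of name and dtype pairs instead.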
Terminal output with fb.visualize():
╭─ flashback lineage • 4 commits • HEAD → rolling_mean ────────────────╮
│                                                                      │
│ LOAD  5,000 rows × 4 cols  [14:03:01]                                │
│ │                                                                    │
│ ├─ filter  arg_0=...col("price")...  4,823 rows × 4 cols  #a1b2c3d4  │
│ │                                                                    │
│ ├─ with_columns  arg_0=...alias("notional")  4,823 rows × 5  #e5f6a7 │
│ │                                                                    │
│ ├─ lag  column='price' n=1  4,823 rows × 6  [before-lag]  #b8c9d0    │
│ │                                                                    │
│ └─ rolling_mean  window=5  4,823 rows × 7  ← HEAD  #01e2f3a4         │
│                                                                      │
╰──────────────────────────────────────────────────────────────────────╯
pip install flashback-df
# or with uv (recommended):
uv add flashback

import flashback as fb
# ── 1. Load any source ──────────────────────────────────────────
df = fb.load("trades.parquet") # Parquet
df = fb.load("prices.csv") # CSV
df = fb.load(my_polars_df) # Polars DataFrame
df = fb.load(my_pandas_df) # Pandas DataFrame
# ── 2. Transform: every step is recorded automatically ──────────
df = df.filter(fb.col("price") > 0)
df = df.with_columns(
(fb.col("price") * fb.col("volume")).alias("notional")
)
# Tag a checkpoint before a risky operation
df = df.tag("before-lag")
df = df.lag("price", 1) # sugar for shift(1) + tracking
df = df.rolling_mean("notional", 5)
# ── 3. Time-travel ──────────────────────────────────────────────
df_clean = fb.checkout("before-lag") # instant; no disk I/O
# ── 4. See what broke your Sharpe ratio ─────────────────────────
fb.visualize()

| Function | Description |
|---|---|
| fb.load(source, *, label=None) | Load from file (.parquet, .csv, .json), Polars/Pandas DataFrame. Returns FlashbackFrame. |
| fb.col(name) | Alias for polars.col. Use for IDE-friendly imports. |
| fb.commit(frame, label, *, message="") | Tag the current state with a human-readable label (like git tag). |
| fb.checkout(label, *, frame=None) | Time-travel to a named checkpoint. Returns a fully materialised FlashbackFrame. |
| fb.visualize(frame=None, *, style="tree") | Render the transformation lineage. Styles: "tree" (default), "dag", "svg" (Jupyter). |

| Method | Description |
|---|---|
| .lag(column, n=1, *, alias=None) | Shift column by n periods with tracking. Creates column_lag_n. |
| .rolling_mean(column, window, *, alias=None) | Rolling mean with tracking. Creates column_rmean_window. |
| .tag(label, *, message="") | Same as fb.commit() but called on the frame. |
| .diff(other) | Structural diff. Returns Polars DataFrame with _diff column. |
| .history() | Full transformation chain as list of dicts (root → HEAD). |
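
The root-to-HEAD chain that .history() is described as returning can be pictured as a walk over parent links. The sketch below is a hypothetical illustration, not the library's code: the `nodes` dictionary, its field names, and the `history` helper are all assumptions, and for nodes with several parents (such as joins) it simply follows the first one.

```python
def history(nodes, head):
    """Return the chain of operations from the root node to `head`."""
    chain = []
    current = head
    while current is not None:
        node = nodes[current]
        chain.append({"id": current, "op": node["op"]})
        parents = node["parents"]
        # Follow the first parent only; a root node has no parents.
        current = parents[0] if parents else None
    chain.reverse()  # collected HEAD -> root, reported root -> HEAD
    return chain


# Hypothetical lineage: load -> filter -> lag
nodes = {
    "n0": {"op": "load", "parents": []},
    "n1": {"op": "filter", "parents": ["n0"]},
    "n2": {"op": "lag", "parents": ["n1"]},
}

steps = history(nodes, "n2")
print([step["op"] for step in steps])  # ['load', 'filter', 'lag']
```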
Lineage graphs can be saved to and loaded from disk for reproducible research:
from flashback.storage import Storage
# Save a lineage DAG
store = Storage(".flashback") # or Storage.from_cwd()
store.save(df, frame_id="experiment-001")
# Later, in another session:
df = store.load("experiment-001")

The .flashback/ directory layout:
.flashback/
├── config.json
├── graphs/
│   └── experiment-001.json    # serialised DAG
└── cache/
    └── <node_id>.parquet      # materialised node snapshots
Efficient storage: Only materialises nodes you explicitly checkpoint or request. Intermediate DAG nodes are compressed and stored as metadata only.
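As a rough picture of what saving a lineage graph as JSON under graphs/ might involve, here is a standard-library sketch. The helpers `save_graph` and `load_graph` and the shape of the `graph` dictionary are hypothetical; they are not the Storage class's actual methods or file format, and the example writes to a temporary directory instead of a real project.

```python
import json
import tempfile
from pathlib import Path


def save_graph(root, frame_id, graph):
    """Write a lineage graph to <root>/graphs/<frame_id>.json."""
    graphs_dir = Path(root) / "graphs"
    graphs_dir.mkdir(parents=True, exist_ok=True)
    path = graphs_dir / f"{frame_id}.json"
    path.write_text(json.dumps(graph, indent=2, sort_keys=True), encoding="utf-8")
    return path


def load_graph(root, frame_id):
    """Read a lineage graph back from <root>/graphs/<frame_id>.json."""
    path = Path(root) / "graphs" / f"{frame_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))


# Hypothetical graph: metadata only, no row data.
graph = {
    "head": "n1",
    "nodes": {
        "n0": {"op": "load", "parents": []},
        "n1": {"op": "filter", "parents": ["n0"]},
    },
}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / ".flashback"
    saved = save_graph(root, "experiment-001", graph)
    restored = load_graph(root, "experiment-001")
    relative = saved.relative_to(root).as_posix()

print(relative)           # graphs/experiment-001.json
print(restored == graph)  # True
```

Keeping the graph as plain JSON metadata is what makes it cheap to store: the row data lives only in the Parquet snapshots of nodes that were explicitly materialised.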
git clone https://github.com/flashback-dev/flashback
cd flashback
pip install -e ".[dev]"
# Lint
ruff check flashback tests
ruff format --check flashback tests
# Type-check
mypy flashback
# Test with coverage
pytest

The CI pipeline validates across:

| Check | Value | Details |
|---|---|---|
| OS | 3 | Ubuntu, macOS, Windows |
| Python | 4 | 3.10 – 3.13 |
| Coverage | 90% | Hard threshold |

fb.branch("experiment-A") for parallel pipeline exploration
Reconcile two branches at the DAG level with conflict detection
Push/pull lineage graphs to S3, GCS, or Hugging Face Hub
Track Polars lazy evaluation before .collect()
%load_ext flashback with live DAG sidebar in Jupyter
Generate .dvc stage files from a flashback DAG
Instantly see which transformation caused your metric to drop. No guessing. No log archaeology.
Time-travel to any checkpoint in milliseconds. No re-running 6-hour jobs.
Deterministic node hashing ensures the same pipeline is always reproducible across environments.
Wraps Polars without slowing it down. Lazy materialization means minimal memory use.
Load from either Pandas or Polars. Convert between them. Mix and match in your pipeline.
Pure Python. No database. No external services. Runs entirely locally.
flashback: Git for Datasets
MIT License · Python 3.10+ · Pandas & Polars · Open Source