From Notebooks to Production: Xorq’s lockfile Approach for Reproducible, Portable ML Pipelines · Episode 24


· 57:26

In this episode, Hussain shares the story behind xorq: a “lockfile for ML pipelines” that makes notebook work easier to reproduce, debug, and ship. We talk about why the research→production path is still so manual, how schemas (and Arrow) become the contract between systems, and what it takes to run the same pipeline across engines like Snowflake and Databricks. We also dig into escape hatches for imperative code, why feature stores didn’t become the default, and how xorq fits alongside other technologies like Iceberg.

Chapters
00:00 Hussain's Journey in Data Science
06:00 The Need for xorq: Bridging Research and Production
10:38 Challenges in Machine Learning Deployment
17:40 The Role of Lock Files in Data Pipelines
29:51 Understanding Schema Management in Data Systems
34:40 Navigating Declarative and Imperative Transformations
36:39 The Developer's Journey with xorq
38:34 Feature Stores vs. xorq: A Comparative Analysis
43:43 The Future of Feature Stores and Machine Learning
51:41 Reproducibility in Data Pipelines: xorq vs. Git-like Operations
55:47 The Future of xorq and the Data Ecosystem


