Data Engineering
A Little More Conversation, A Little Less Action — A Case Against Premature Data Integration
Data Science
Running a large data integration project before embarking on the ML part is easily a…
14 min read -
Parquet from scratch: A Python deep dive into a raw parquet file
10 min read -
Straight-to-the-point tips for the best SQL IDE
5 min read -
Three real-world SQL patterns that can be applied to many problems
14 min read -
If you’re an Anaconda user, you know that conda environments help you manage package dependencies, avoid compatibility…
13 min read -
“I train models, analyze data and create dashboards — why should I care about containers?”…
13 min read -
Python has grown to dominate data science, and its package Pandas has become the go-to…
14 min read -
In the world of machine learning, we obsess over model architectures, training pipelines, and hyper-parameter…
5 min read -
Follow me through the steps on how to evolve your architecture to align with your…
17 min read -
Stop Creating Bad DAGs – Optimize Your Airflow Environment By Improving Your Python Code
Data Engineering
Valuable tips to reduce your DAGs’ parse time and save resources.
10 min read