What is Dagster at a high level, and how is it different from Apache Airflow?
Nikhil Das Nomula - Principal Engineer | Founder
2024-05-17 • Data Engineering
Orchestrators play an important role in data engineering by automating workflows. They have grown from tools that simply run a sequence of steps into systems that we expect to:
Manage and coordinate complex workflows and data pipelines.
Monitor workflows so that we can understand the sequence of steps:
What succeeded and what failed
How long each step took
How the steps are related
In short, we expect an orchestrator to provide a nice interface where we can "observe" what is going on with our workflows and data pipelines.
When it comes to orchestrators, Apache Airflow and Kestra are both great options, but their approach is task-based. That means we approach the problem by focusing on the hows: the tasks, or verbs.
Dagster takes a different approach and focuses on the whats, which it calls assets. The Dagster documentation gives a great example of how this makes a difference when it comes to reusability.
For example, if we want to make cookies the task-centric way, we approach the problem like this:
Get the ingredients
Mix the ingredients
Add chocolate chips
Bake
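The four steps above can be sketched in plain Python as one function per verb, called in order. This is only an illustration of the task-centric mindset, not Airflow or Kestra code; the function names, the ingredient list, and the string placeholders are all assumptions made for the example.

```python
# Task-centric sketch: each function is a verb, and the pipeline is the
# order in which the verbs run. All names and values are hypothetical.

def get_ingredients() -> list[str]:
    # Placeholder ingredient list, assumed for illustration.
    return ["flour", "sugar", "butter", "eggs"]


def mix(ingredients: list[str]) -> str:
    return "mixed(" + ", ".join(ingredients) + ")"


def add_chocolate_chips(dough: str) -> str:
    return dough + " + chocolate chips"


def bake(dough: str) -> str:
    return f"baked({dough})"


def make_cookies() -> str:
    # The pipeline only describes how to get to the end result;
    # the intermediate values have no name or identity of their own.
    ingredients = get_ingredients()
    dough = mix(ingredients)
    dough = add_chocolate_chips(dough)
    return bake(dough)
```

Notice that nothing in this sketch names the intermediate products. The dough exists only as a local variable inside `make_cookies`, which is what makes it hard to reuse elsewhere.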
If we take the asset-centric approach instead, we approach the problem like this:
Get the ingredients and mix them to make cookie dough
Get chocolate chips and mix them with the cookie dough to get chocolate chip cookie dough
Bake the cookie dough to get cookies
What makes the asset-centric approach different is that we can reuse these assets. For example, if we want to make peanut cookies with the asset-based approach, we can take the existing cookie dough asset and add peanuts to it.
We will get into more detail in the series of Dagster articles, but this should give you an idea of what Dagster is and how it is different from Apache Airflow.