How to Use GitHub Actions to Run Cron Jobs
Nikhil Das Nomula - Principal Engineer | Founder
2024-04-07 • Platform Engineering
GitHub Actions is a great tool for CI. However, we recently had a data engineering use case where the client did not have cloud infrastructure in place but needed to move data from a source to a destination on a schedule (cron). In this article we will go over why and how we ended up using GitHub Actions and its ability to run cron schedules to address this particular use case.
We had written a Python script to perform the data engineering task, and we had to decide how best to automate it for our client. These were the options:
- Run the script on an instance from one of the major cloud providers: AWS EC2, Google Cloud VM instances, or Azure VMs.
- Use Heroku or fly.io to run the Python script.
- Use an orchestrator such as Apache Airflow or Prefect.
- Use GitHub Actions.
We chose GitHub Actions for its simplicity in this use case. With either of the first two options, we would have had to set up GitHub connectivity and manage environment variables somewhere other than GitHub, where the script resides. Apache Airflow and Prefect are overkill for what we are trying to achieve.
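From a cost perspective, GitHub Actions is free for public repositories and includes a monthly allowance of free minutes for private ones. One caveat is that scheduled runs can be delayed, or occasionally skipped, when load on the runners is high, but that was not an issue for us in this particular use case.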
The best part is that when we transition this to the client, they have just one technology to manage instead of a bunch of them. Let's get into how. The GitHub Actions workflow itself is pretty simple.
The workflow sets up the cron schedule and passes secrets to the script as environment variables, which the script reads with the standard python-dotenv setup, so secrets never live in your code. This takes care of the major concerns and provides a simple solution. That said, this approach is not suited to every need; the right option depends on multiple factors.
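To make the secrets handling concrete, here is a minimal sketch of how the script side could read a secret. It is an illustration, not the actual contents of src/main.py: the function name `load_settings` is hypothetical, and `ENV_VAR1` is simply the variable name used in the workflow's `env` block. The sketch assumes python-dotenv is only needed locally, where it loads a `.env` file; on a GitHub runner the variable is already present in the environment.

```python
import os


def load_settings(environ=None):
    """Return the settings the script needs, read from environment variables.

    `environ` is a hypothetical hook for passing a plain dict in tests;
    by default the real process environment is used.
    """
    if environ is None:
        try:
            # Locally, python-dotenv copies values from a .env file into
            # os.environ. On a GitHub runner there is usually no .env file,
            # and the workflow's `env:` block has already set the variables.
            from dotenv import load_dotenv
            load_dotenv()
        except ImportError:
            # python-dotenv is optional in this sketch.
            pass
        environ = os.environ

    value = environ.get("ENV_VAR1")
    if not value:
        # Fail loudly so a missing secret shows up in the workflow log.
        raise RuntimeError("ENV_VAR1 is not set")
    return {"ENV_VAR1": value}


if __name__ == "__main__":
    settings = load_settings()
    print("Loaded settings:", sorted(settings))  # print names only, never values
```

Failing early on a missing variable is a deliberate choice here: a scheduled run that silently proceeds without credentials is harder to notice than one that fails.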
Here is how the YAML for the GitHub action looks:
name: run script
on:
  workflow_dispatch:
  schedule:
    - cron: '*/10 * * * *'
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.12'
      - name: Run script
        env:
          ENV_VAR1: ${{ secrets.ENV_VAR1 }}
        run: python src/main.py
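The cron expression `*/10 * * * *` asks for a run every ten minutes. GitHub evaluates schedules in UTC, and the actual start time can lag behind the nominal one. As a rough way to sanity-check a schedule before committing it, the sketch below expands a cron field and lists the next nominal run times. The helpers `expand_field` and `next_runs` are hypothetical names, and the sketch only looks at the minute field, assuming every other field is `*`; it is not a full cron parser.

```python
from datetime import datetime, timedelta, timezone


def expand_field(field, low, high):
    """Expand one cron field into a sorted list of ints.

    Handles '*', '*/n', 'a', 'a-b', 'a-b/n' and comma-separated lists.
    """
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = low, high
        elif "-" in part:
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return sorted(values)


def next_runs(minute_field, after, count=3):
    """Return the next `count` nominal run times strictly after `after`.

    Only the minute field is considered; hour, day, month and weekday
    are assumed to be '*'.
    """
    minutes = set(expand_field(minute_field, 0, 59))
    candidate = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    runs = []
    while len(runs) < count:
        if candidate.minute in minutes:
            runs.append(candidate)
        candidate += timedelta(minutes=1)
    return runs


if __name__ == "__main__":
    print(expand_field("*/10", 0, 59))
    for run in next_runs("*/10", datetime.now(timezone.utc)):
        print(run.isoformat())
```

Because runs can be delayed under load, the script itself should not assume it starts exactly on a ten-minute boundary.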
If you have any questions, feel free to reach out to us at nikhil.nomula@yajur.tech