I started writing a scheduled jobs management tool and now realize it was a mistake.
The original problem I was trying to solve was “how do you set up scheduled jobs on multiple servers (dev, staging, QA/UAT, production) while also saving the run schedule to version control?” I was over-complicating this in my head and thinking it needed to be a stand-alone web service. I want to document my new, simpler proposed solution 1.) for my own future reference, and 2.) so I don’t over-complicate this again in the future.
My proposed workflow is:
- Cron/scheduled job to run every 1, 2, or 5 minutes.
- That Cron job executes a server-side script: PHP, Bash, Python, Node, whatever you like.
- That Cron file:
- Checks if a lock/ran file already exists and, if so, exits/stops.
- Creates a new lock/ran file named after the minute of the hour. That way it’s easy to check whether the job is already running/ran. You end up with 12-60 lock files depending on how often it runs.
- Loops through the files in a jobs directory, executing each one without waiting for a response/completion. The easiest way to do this is a cURL call that does not wait for the response.
- Within each file called, that file checks its own schedule (e.g. run only if the hour is 5am, or only if the minute is divisible by 5).
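The dispatcher steps above can be sketched as a small Python script. The directories, the base URL, and the lock-expiry window are all assumptions for illustration. One detail worth noting: since locks are named after the minute of the hour, the dispatcher has to ignore stale locks, otherwise last hour’s lock for the same minute would block this hour’s run.

```python
#!/usr/bin/env python3
"""Sketch of the dispatcher the Cron entry calls every minute.
LOCK_DIR, JOBS_DIR, and BASE_URL are hypothetical -- adjust to taste."""
import os
import subprocess
import time
from datetime import datetime, timezone

LOCK_DIR = "/var/run/myjobs"        # hypothetical lock-file directory
JOBS_DIR = "/var/www/html/jobs"     # hypothetical directory of job scripts
BASE_URL = "http://localhost/jobs"  # hypothetical URL prefix for the jobs
MAX_LOCK_AGE = 3300                 # seconds; a lock older than ~55 min is stale

def lock_path_for(minute: str) -> str:
    # Lock files are named after the minute of the hour, e.g. ran-05.lock.
    return os.path.join(LOCK_DIR, f"ran-{minute}.lock")

def main() -> None:
    minute = datetime.now(timezone.utc).strftime("%M")
    lock = lock_path_for(minute)

    # If a fresh lock for this minute already exists, another run owns
    # this slot -- exit. Stale locks from the previous hour are ignored.
    if os.path.exists(lock) and time.time() - os.path.getmtime(lock) < MAX_LOCK_AGE:
        return

    os.makedirs(LOCK_DIR, exist_ok=True)
    with open(lock, "w") as f:
        f.write(datetime.now(timezone.utc).isoformat())

    # Fire each job without waiting for completion: Popen returns
    # immediately, so a slow job never blocks the dispatcher.
    for name in sorted(os.listdir(JOBS_DIR)):
        subprocess.Popen(
            ["curl", "-s", "--max-time", "300", f"{BASE_URL}/{name}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )

if __name__ == "__main__":
    main()
```

Each job is then an ordinary web-reachable script, so adding a job is just committing a new file to the jobs directory.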
The value of approaching it this way is:
- The actual schedule of the scheduled jobs is stored in the code itself and thus saved to Git version control.
- Setting up the scheduled jobs only requires a single line in Cron or Task Scheduler.
- Scheduled jobs are prevented from being triggered twice at once by the central lock file.
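As an example of that single line, a crontab entry for a PHP dispatcher might look like this (the path is hypothetical):

```
# Run the dispatcher every minute; it exits immediately if this
# minute's lock file already exists.
* * * * * /usr/bin/php /var/www/cron/dispatch.php
```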
It might be worth using a security token of some kind if you don’t want a malicious user or search crawler triggering the scheduled jobs simply by browsing to the page/script. This is easily done either with a pre-shared hash such as md5('my secret salt' + day of week), or by storing a security token in a file that all the job files can access and check.
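A minimal sketch of that guard, combined with the per-job schedule check, might look like the following. The salt value and the 5am schedule are illustrative assumptions; `hmac.compare_digest` is used instead of `==` so the comparison runs in constant time.

```python
#!/usr/bin/env python3
"""Sketch of a job script that guards itself with a pre-shared hash.
SECRET_SALT and the 5am schedule are illustrative assumptions."""
import hashlib
import hmac
from datetime import datetime, timezone

SECRET_SALT = "my secret salt"  # hypothetical pre-shared secret

def expected_token(now: datetime) -> str:
    # md5 of the salt plus the day of the week (0 = Monday), as suggested above.
    return hashlib.md5((SECRET_SALT + str(now.weekday())).encode()).hexdigest()

def authorized(supplied_token: str, now: datetime) -> bool:
    # Constant-time comparison so callers can't probe the token byte-by-byte.
    return hmac.compare_digest(supplied_token, expected_token(now))

def should_run(now: datetime) -> bool:
    # The schedule lives in the job file itself: this job runs at 5am only.
    return now.hour == 5

def handle(supplied_token: str) -> None:
    now = datetime.now(timezone.utc)
    if not authorized(supplied_token, now):
        return  # reject crawlers and unauthenticated callers
    if not should_run(now):
        return  # not this job's scheduled hour
    # ... the actual work goes here ...
```

The dispatcher would then pass the token as a query parameter on its cURL call, and legitimate runs always match because both sides derive the token from the same salt and day of week.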