I have a setup where Unix cron launches a Job process at a predetermined time. Every run of a Job has a key (job_name plus a date) associated with it, to differentiate one run of a job from another. For simplicity, assume there are roughly a few lakh (hundred thousand) jobs triggered daily, and that a job may run only once a day.
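For illustration, a run key under this scheme might look like the following; the exact format (separator, ISO date) is my assumption, not something from the actual system:

```python
from datetime import date

def job_run_key(job_name: str, run_date: date) -> str:
    # Hypothetical key format: job name plus ISO date. Since each job
    # runs at most once per day, this uniquely identifies one run.
    return f"{job_name}:{run_date.isoformat()}"

print(job_run_key("daily_report", date(2023, 5, 1)))  # daily_report:2023-05-01
```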
Each Job writes its own state to the file system.
There is one directory per job run, containing a few files written by the Job process. These files hold the state of the Job (Running, Failed, Finished, etc.) along with some other details.
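As a sketch of reading that per-run layout (the `state.json` file name and its `status` field are hypothetical; the real files written by the Job process may use different names and formats):

```python
import json
import tempfile
from pathlib import Path

def read_job_state(run_dir: Path) -> dict:
    # Read the state file inside one per-run directory.
    # "state.json" and "status" are assumed names for illustration.
    state_file = run_dir / "state.json"
    if not state_file.exists():
        return {"status": "Unknown"}
    return json.loads(state_file.read_text())

# Usage: simulate one per-run directory on disk.
with tempfile.TemporaryDirectory() as root:
    run_dir = Path(root) / "daily_report:2023-05-01"
    run_dir.mkdir()
    (run_dir / "state.json").write_text(json.dumps({"status": "Running"}))
    state = read_job_state(run_dir)
    print(state["status"])  # Running
```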
In addition, I maintain a monitoring web application that reads this state from the file system and displays it. The backend of this job-monitoring web application is written in Java and the frontend in React/Redux. The Java server has an in-memory cache (basically a hashmap) that is built whenever the server restarts. This cache holds one week of Jobs and their state. The cache is refreshed periodically, every 30 seconds: it reads the new state of the jobs, evicts jobs that should no longer be in the cache, adds new jobs, and so on. The cache exists so that we can serve data from the past week quickly. If a query covers more than a week, we fall back to the file system to read the state of Jobs outside the cache window.
I am thinking of retiring this Java backend because of the complications in building and refreshing the in-memory cache.
Since the requirement here is to serve data (at least 2 months of it) very quickly, I am thinking, rather than writing yet another Java server, of using a dedicated out-of-process cache such as Redis instead of an in-memory cache.
That is, I am thinking of writing a Python server that will periodically read Job state from the file system and periodically (say, every 10 minutes) push that state to the Redis cache. I am also thinking of writing a RESTful Python API that queries the Redis cache and serves the data to the UI. I am choosing Python because the Job Scheduler and Executor are written in Python, so I already have Python persistence APIs that can read the job state.
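The refresh step could be sketched as below. I use a plain dict as a stand-in for the Redis client so the sketch is self-contained; with redis-py the assignment would instead be `cache.set(key, value, ex=TTL_SECONDS)` so Redis itself expires old runs. The `state.json` file name and directory layout are assumptions about the on-disk format:

```python
import json
import tempfile
from pathlib import Path

TTL_SECONDS = 60 * 24 * 3600  # keep roughly 2 months of runs, per the requirement

def refresh(jobs_root: Path, cache: dict) -> None:
    # Walk every per-run directory and push its latest state into the
    # cache under a namespaced key. `cache` is a dict standing in for a
    # Redis client; this loop would run every 10 minutes.
    for run_dir in sorted(jobs_root.iterdir()):
        state_file = run_dir / "state.json"  # assumed file name
        if state_file.exists():
            state = json.loads(state_file.read_text())
            cache[f"job:{run_dir.name}"] = json.dumps(state)

# Usage: build one fake run directory and refresh it into the cache.
with tempfile.TemporaryDirectory() as root:
    run = Path(root) / "daily_report:2023-05-01"
    run.mkdir()
    (run / "state.json").write_text(json.dumps({"status": "Finished"}))
    cache = {}
    refresh(Path(root), cache)
    print(json.loads(cache["job:daily_report:2023-05-01"])["status"])  # Finished
```

The REST layer would then read these `job:*` keys straight out of Redis, so a request never touches the file system on the hot path.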
Data size: 1 week of data is approximately 10 GB.
Requests Per Second: 100
Is the architecture I am suggesting reasonable, and are Python and Redis the right choices here? Suggestions are appreciated.