TaskForest
A simple, expressive, open-source, text-file-based Job Scheduler with console, HTTP, and RESTful API interfaces.

Frequently Asked Questions (FAQ)

For time zones that observe Daylight Saving Time, on the day that DST starts (in Spring) the day has only 23 hours. On the day that DST ends the day has 25 hours. If you run an hourly job on those days, the job will run either 23 or 25 times. If you prefer, you can have your Family file use a timezone that does not observe DST - like GMT, for instance.
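The 23-hour and 25-hour days can be checked with a short script. This is only an illustrative sketch in Python, not part of TaskForest; the function name `hours_in_day` and the 2024 dates (the US DST transitions for that year) are assumptions chosen for the example.

```python
from datetime import date, datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def hours_in_day(day: date, tz_name: str) -> int:
    """Return the number of elapsed hours between local midnight on `day`
    and local midnight on the following day."""
    tz = ZoneInfo(tz_name)
    next_day = day + timedelta(days=1)
    start = datetime(day.year, day.month, day.day, tzinfo=tz)
    end = datetime(next_day.year, next_day.month, next_day.day, tzinfo=tz)
    # Convert to UTC before subtracting: subtracting two datetimes that share
    # the same tzinfo object ignores any change in the UTC offset.
    elapsed = end.astimezone(timezone.utc) - start.astimezone(timezone.utc)
    return int(elapsed.total_seconds() // 3600)


print(hours_in_day(date(2024, 3, 10), "America/Chicago"))  # DST starts: 23
print(hours_in_day(date(2024, 11, 3), "America/Chicago"))  # DST ends: 25
print(hours_in_day(date(2024, 3, 10), "GMT"))              # no DST: 24
```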

Having a midnight point gives us a convenient time interval within which a run of a job is considered valid. For example, with the system as it is now, saying "job J1 has run" means that it has run *today*. If J2 depends on J1, we only have to look for successful runs of J1 today.

If we don't have a known time interval, then we have to resolve the problem that occurs when jobs run for longer than 24 hours. Let's start with a simple example first. Assume that our Family file looks like this:


+----------------------------------------------------------------
|start=>'00:00',tz=>'America/Chicago',days=>'Mon,Tue,Wed,Thu,Fri'
|
| J1( start=>22:00 )
| J2()
+----------------------------------------------------------------

Example 1: J1's run time is 5 hours. This is what happens today:

(Chicago)
Day  Time       Action(s)
Mon 00:00
    01:00
    02:00
    03:00
    04:00
    05:00
    06:00
    07:00
    08:00
    09:00
    10:00
    11:00
    12:00
    13:00
    14:00
    15:00
    16:00
    17:00
    18:00
    19:00
    20:00
    21:00
    22:00       J1 Starts
    23:00          |
Tue 00:00          |
    01:00          |
    02:00          v
    03:00       J1 Ends
    04:00
    05:00
    06:00
    07:00
    08:00
    09:00
    10:00
    11:00
    12:00
    13:00
    14:00
    15:00
    16:00
    17:00
    18:00
    19:00
    20:00
    21:00
    22:00       J1 Starts
    23:00          |
Wed 00:00          |
    01:00          |
    02:00          v
    03:00       J1 Ends
    04:00

So you can see that J2 never gets a chance to run! At 03:00 on Tuesday, the system checks to see if J1 has run for that day, and it hasn't. So J2 can't run for that day.
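The "has it run today" rule can be illustrated with a small check. This is a hedged sketch, not TaskForest's actual code: the function `finishes_same_day` is hypothetical, and the date 2024-06-03 is simply a Monday picked for the example.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def finishes_same_day(start: datetime, run_time: timedelta) -> bool:
    """Return True if the job ends on the calendar day on which it started,
    judged in the job's own time zone."""
    # Do the arithmetic in UTC so the elapsed time is exact, then convert back.
    end = (start.astimezone(timezone.utc) + run_time).astimezone(start.tzinfo)
    return end.date() == start.date()


chicago = ZoneInfo("America/Chicago")
j1_start = datetime(2024, 6, 3, 22, 0, tzinfo=chicago)  # Monday 22:00 Chicago

# J1 ends at 03:00 on Tuesday, so Monday never sees a completed J1.
print(finishes_same_day(j1_start, timedelta(hours=5)))  # False
```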

Now, I could remedy this particular case with the following technique (or hack):

Example 2: J1's run time is 5 hours. J2's run time is 1 hour.

Change the Family File to say:


+----------------------------------------------------------------
| # changed the start time, time zone and days of week
|start => '00:00', tz => 'GMT', days => 'Tue,Wed,Thu,Fri,Sat'
|
| J1( start=>03:00 )
| J2()
+----------------------------------------------------------------

What this does is *shift* the entire Family so that all jobs in it run on the same day in the time zone specified. If you use cron to invoke TaskForest, the crontab entry that invokes taskforest does not need to be changed. That can stay the same. Now, this is what happens:


(Chicago)       (GMT)
Day  Time       Day  Time      Action(s)
Mon 00:00       Mon 05:00
    01:00           06:00
    02:00           07:00
    03:00           08:00
    04:00           09:00
    05:00           10:00
    06:00           11:00
    07:00           12:00
    08:00           13:00
    09:00           14:00
    10:00           15:00
    11:00           16:00
    12:00           17:00
    13:00           18:00
    14:00           19:00
    15:00           20:00
    16:00           21:00
    17:00           22:00
    18:00           23:00
    19:00       Tue 00:00
    20:00           01:00
    21:00           02:00
    22:00           03:00      J1 Starts
    23:00           04:00         |
Tue 00:00           05:00         |
    01:00           06:00         |
    02:00           07:00         v
    03:00           08:00      J1 Ends, J2 Starts
    04:00           09:00               J2 Ends
    05:00           10:00
    06:00           11:00
    07:00           12:00
    08:00           13:00
    09:00           14:00
    10:00           15:00
    11:00           16:00
    12:00           17:00
    13:00           18:00
    14:00           19:00
    15:00           20:00
    16:00           21:00
    17:00           22:00
    18:00           23:00
    19:00       Wed 00:00
    20:00           01:00
    21:00           02:00
    22:00           03:00      J1 Starts
    23:00           04:00         |
Wed 00:00           05:00         |
    01:00           06:00         |
    02:00           07:00         v
    03:00           08:00      J1 Ends, J2 Starts
    04:00           09:00               J2 Ends
    05:00           10:00

Shifting the start/end of the days by changing the timezones has fixed this problem. J1 still starts at 22:00 Chicago time, and J2 will still run if J1 runs successfully.
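To work out the shifted start time and day for your own Family file, you can convert the local start time to the target time zone. The sketch below is illustrative only; `shifted_start` is a hypothetical helper and the dates are arbitrary Mondays. Note that Chicago is 5 hours behind GMT during Daylight Saving Time and 6 hours behind otherwise, so a fixed GMT start time corresponds to a different Chicago wall-clock time in winter.

```python
from datetime import date, datetime, time
from zoneinfo import ZoneInfo


def shifted_start(day: date, hhmm: str, from_tz: str, to_tz: str) -> datetime:
    """Convert a local start time on `day` in `from_tz` to the same instant in `to_tz`."""
    local = datetime.combine(day, time.fromisoformat(hhmm), tzinfo=ZoneInfo(from_tz))
    return local.astimezone(ZoneInfo(to_tz))


# A Monday in summer (DST in effect) and a Monday in winter (standard time).
summer = shifted_start(date(2024, 6, 3), "22:00", "America/Chicago", "GMT")
winter = shifted_start(date(2024, 1, 8), "22:00", "America/Chicago", "GMT")

print(summer.strftime("%a %H:%M"))  # Tue 03:00, as in Example 2
print(winter.strftime("%a %H:%M"))  # Tue 04:00
```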

If the jobs in your Family files run in less than 24 hours, you can use this method now.

However, this is not a universal solution. What happens if J1's run time is 26 hours? Even with the family as shown in Example 2, J1 ends the day after it started. This means that J2 will never run, no matter what we set the timezone to be. This is because no matter what day it is, today's run of J1 will never have completed that day.

So the current version of TaskForest does not support Family files whose jobs run for more than 24 hours from the start of the first job to the end of the last job, even if each individual job runs for less than 24 hours.
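One way to convince yourself that no time zone helps is to try every possible whole-hour day boundary. The sketch below is an assumption-laden illustration, not TaskForest logic: `workable_day_starts` is a hypothetical function, hours are counted from Chicago midnight on the first day, and only whole-hour boundaries are considered.

```python
def workable_day_starts(start_hour: int, span_hours: int) -> list[int]:
    """Return every hour (0-23, Chicago time) at which a shifted 'day' could
    begin so that the whole span from first start to last end fits in one day."""
    fits = []
    for boundary in range(24):
        # Index of the shifted day containing the start and the end of the span.
        start_day = (start_hour - boundary) // 24
        end_day = (start_hour + span_hours - boundary) // 24
        if start_day == end_day:
            fits.append(boundary)
    return fits


# J1 (5 hours) followed by J2 (1 hour), starting at 22:00 Chicago time.
print(workable_day_starts(22, 6))   # hours 5 through 22; 19 is midnight GMT

# J1 alone running for 26 hours: no boundary works.
print(workable_day_starts(22, 26))  # []
```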

Example 3: J1 runs for 26 hours

(Chicago)       (GMT)
Day  Time       Day  Time      Action(s)
Mon 00:00       Mon 05:00
    01:00           06:00
    02:00           07:00
    03:00           08:00
    04:00           09:00
    05:00           10:00
    06:00           11:00
    07:00           12:00
    08:00           13:00
    09:00           14:00
    10:00           15:00
    11:00           16:00
    12:00           17:00
    13:00           18:00
    14:00           19:00
    15:00           20:00
    16:00           21:00
    17:00           22:00
    18:00           23:00
    19:00       Tue 00:00
    20:00           01:00
    21:00           02:00
    22:00           03:00      J1 Starts
    23:00           04:00         |
Tue 00:00           05:00         |
    01:00           06:00         |
    02:00           07:00         |
    03:00           08:00         |
    04:00           09:00         |
    05:00           10:00         |
    06:00           11:00         |
    07:00           12:00         |
    08:00           13:00         |
    09:00           14:00         |
    10:00           15:00         |
    11:00           16:00         |
    12:00           17:00         |
    13:00           18:00         |
    14:00           19:00         |
    15:00           20:00         |
    16:00           21:00         |
    17:00           22:00         |
    18:00           23:00         |
    19:00       Wed 00:00         |
    20:00           01:00         |
    21:00           02:00         |
    22:00           03:00         |     J1 Starts
    23:00           04:00         v        |
Wed 00:00           05:00      J1 Ends     |
    01:00           06:00                  |
    02:00           07:00                  |
    03:00           08:00                  |
    04:00           09:00                  |
    05:00           10:00                  |

The other thing we're seeing here is that now we have 2 instances of J1 running simultaneously (for 2 hours), even though that is not the intent of the Family file. This could be a major problem. J2 never runs.
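The 2-hour overlap follows from simple interval arithmetic. A minimal sketch, using a hypothetical `overlap` helper and naive datetimes for two consecutive days:

```python
from datetime import datetime, timedelta


def overlap(start_a: datetime, start_b: datetime, run_time: timedelta) -> timedelta:
    """Return how long two runs of the same job are active at the same time."""
    latest_start = max(start_a, start_b)
    earliest_end = min(start_a + run_time, start_b + run_time)
    # A negative difference means the runs do not overlap at all.
    return max(earliest_end - latest_start, timedelta(0))


monday_run = datetime(2024, 6, 3, 22, 0)   # Monday 22:00
tuesday_run = datetime(2024, 6, 4, 22, 0)  # Tuesday 22:00

print(overlap(monday_run, tuesday_run, timedelta(hours=26)))  # 2:00:00
print(overlap(monday_run, tuesday_run, timedelta(hours=5)))   # 0:00:00
```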

I can't think of a foolproof way to resolve this.

One possibility is to implement rules that require a family to 'complete' all of the previous day's jobs, running until it can run no more because all jobs have completed successfully, before the next run begins. I don't know if this is a satisfactory approach or not. With such rules, J1's start times would be as follows:


   Mon: 22:00
   Wed: 00:00

On Tuesday we don't run the Family, because Monday's run hasn't completed. That's fine. But what about Wednesday? With this definition, Wednesday's run will never happen either, because Tuesday's run never completed (there wasn't a run on Tuesday). So the system has to be smart enough to realize that. And that's fine, until you change the family on Tuesday to run a new job J3 after J2. Then, the system needs to know on Wednesday not to look for a successful run of J3 for Monday's run (because J3 didn't exist on Monday). Which means that the system needs to know your change control history. Which is a whole other problem.

No. For a detailed explanation, please see the answer to the previous FAQ.

To see a list of all time zones you may use, you can read the perldoc for DateTime::TimeZone::Catalog.

Yes. You can use tokens to limit how many jobs of a class may run simultaneously. If these jobs all use the same token, you can set the cardinality of the token to the upper limit on the number of jobs that may run simultaneously.
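Conceptually, a token with a cardinality behaves like a counting semaphore. The sketch below only illustrates that idea; the `Token` class, its methods, and the job names are hypothetical and are not TaskForest's implementation or configuration syntax.

```python
class Token:
    """A named pool of permits; at most `cardinality` jobs may hold it at once."""

    def __init__(self, name: str, cardinality: int) -> None:
        self.name = name
        self.cardinality = cardinality
        self.holders: set[str] = set()

    def try_acquire(self, job: str) -> bool:
        """Give the job a permit if one is free; otherwise the job must wait."""
        if len(self.holders) < self.cardinality:
            self.holders.add(job)
            return True
        return False

    def release(self, job: str) -> None:
        """Return the job's permit to the pool when the job finishes."""
        self.holders.discard(job)


db_token = Token("DB", cardinality=2)
results = [db_token.try_acquire(job) for job in ("J1", "J2", "J3")]
print(results)  # [True, True, False]: J3 waits because both permits are taken

db_token.release("J1")
print(db_token.try_acquire("J3"))  # True: J3 can start once J1 finishes
```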

You're probably using 'run' as the run_wrapper instead of 'run_with_log'. The former will not generate log files, and therefore will not generate the links. The latter will.

When a job is on hold, it will never run during the current day. Once you release a hold on a job, then the job is 'back to normal' and it will run once its dependencies are met. When you release all dependencies on a job, it will run right away, even if all its dependencies have not been met. What's important to realize is that if a job is on Hold, that takes priority over whether or not you can release all dependencies. The website does not allow you to release all dependencies on a job that's on hold. First you must release the hold, and only then can you release all dependencies.
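The precedence described above can be summarized as a small state model. This is a hedged sketch of the rules as stated in this answer, not TaskForest code; the `Job` class and its field and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    on_hold: bool = False
    deps_met: bool = False
    deps_released: bool = False

    def can_run(self) -> bool:
        # A hold takes priority over everything else.
        if self.on_hold:
            return False
        return self.deps_met or self.deps_released

    def release_hold(self) -> None:
        self.on_hold = False

    def release_all_dependencies(self) -> None:
        # Mirrors the website's behavior: the hold must be released first.
        if self.on_hold:
            raise ValueError(f"{self.name} is on hold; release the hold first")
        self.deps_released = True


job = Job("J2", on_hold=True)
try:
    job.release_all_dependencies()
    refused = False
except ValueError:
    refused = True
print(refused)        # True

job.release_hold()
print(job.can_run())  # False: dependencies are not met yet

job.release_all_dependencies()
print(job.can_run())  # True: the job may run right away
```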