Sending Emails
You can configure TaskForest to send you an email in any of these three situations:
- When a job fails
- When a job fails but will be automatically retried
- When a job that failed and was automatically retried finally succeeds
You can configure this feature system-wide in the configuration file and override it for individual jobs in the Family files, the same way retries are configured. You can also customize the emails sent in each of the three cases, from the SMTP envelope information to the MIME headers to the body text. These settings, too, can be set system-wide and overridden for individual jobs and families. Combined with suitable email filters, this can make managing your job workflow much easier. Let's see how it works.
To send emails, TaskForest needs the following information in the configuration file:
# This is the SMTP server that will be used to send
# emails out when a job fails, for example
smtp_server = "localhost"
# The default SMTP port is 25.
smtp_port = 25
# In a production environment this should be 60 or 120
smtp_timeout = 60
# This is the SMTP envelope sender
# (the text after "MAIL FROM:")
smtp_sender = "user1@example.com"
# This is the email address that appears in the From:
# mail header
mail_from = "user1@example.com"
# If a user replies to a received email, the reply
# will go to this address instead of the From: address.
# This address is set in the Reply-To mail header.
mail_reply_to = "user2@example.com"
# This is the address to which bounces will be sent if
# they occur at the SMTP server (as opposed to the
# receiving Mail Transfer Agent).
mail_return_path = "user3@example.com"
# This is the directory that stores the contents of
# the emails that are sent by the system.
instructions_dir = "instructions"
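The difference between the envelope sender (smtp_sender) and the mail headers (mail_from, mail_reply_to, mail_return_path) is easy to miss. The following Python sketch is only an illustration of how such settings typically map onto an SMTP session. It is not TaskForest's own implementation, and the SETTINGS dictionary and the build_message and send helpers are hypothetical names introduced here.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical dictionary mirroring the configuration options shown above.
SETTINGS = {
    "smtp_server": "localhost",
    "smtp_port": 25,
    "smtp_timeout": 60,
    "smtp_sender": "user1@example.com",
    "mail_from": "user1@example.com",
    "mail_reply_to": "user2@example.com",
    "mail_return_path": "user3@example.com",
}


def build_message(settings, recipient, subject, body):
    """Build the message; these values end up in the mail headers."""
    msg = EmailMessage()
    msg["From"] = settings["mail_from"]
    msg["Return-Path"] = settings["mail_return_path"]
    msg["Reply-To"] = settings["mail_reply_to"]
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(settings, recipient, msg):
    """Send the message; smtp_sender is used only for MAIL FROM."""
    with smtplib.SMTP(
        settings["smtp_server"],
        settings["smtp_port"],
        timeout=settings["smtp_timeout"],
    ) as smtp:
        smtp.send_message(
            msg,
            from_addr=settings["smtp_sender"],  # envelope sender
            to_addrs=[recipient],               # envelope recipient
        )
```

In this sketch the envelope sender and the From: header are set independently, which is why the configuration file has separate smtp_sender and mail_from options.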
Then there are the following three settings that control who receives the emails. They can be set system-wide in the configuration file:
# When a job fails, emails are sent to this address
email = "test@example.com"
# When a job fails, but is being automatically retried,
# emails are sent to this address, as opposed to the
# one stored in the 'email' setting. If no_retry_email
# is set, then no email will be sent in this case.
retry_email = "test2@example.com"
# When a job fails, is automatically retried one or more
# times and then succeeds, emails are sent to this
# address, as opposed to any of the others. If
# no_retry_success_email is set, then no email will be sent
# in this case.
retry_success_email = "test3@example.com"
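The rules in the comments above can be summarized as a small lookup. The Python sketch below is an illustration only, not TaskForest's code. The function name recipient_for is hypothetical, the three situations are labelled FAIL, RETRY and RETRY_SUCCESS, and returning None when an address is not configured is an assumption.

```python
def recipient_for(reason, cfg):
    """Return the address to notify, or None if no email should be sent.

    `cfg` is a hypothetical dict holding the options described above.
    """
    if reason == "FAIL":
        return cfg.get("email")
    if reason == "RETRY":
        # no_retry_email suppresses the retry notification entirely.
        if cfg.get("no_retry_email"):
            return None
        return cfg.get("retry_email")
    if reason == "RETRY_SUCCESS":
        # no_retry_success_email suppresses the success notification.
        if cfg.get("no_retry_success_email"):
            return None
        return cfg.get("retry_success_email")
    raise ValueError(f"unknown reason: {reason!r}")
```

With the settings shown above, a retry notification would go to test2@example.com and a final failure to test@example.com.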
Given the setup shown above, the email generated when a job is being retried is shown below. What follows is a transcript of an SMTP session between TaskForest and a fake SMTP server used for testing. Lines that start with 'S: >' denote text sent from the SMTP server to the client (TaskForest). Lines that start with 'C: <' denote text sent from the SMTP client (TaskForest) to the SMTP server.
S: > 200 OK TaskForest Fake SMTP Server
C: < EHLO user1@example.com
S: > 200 OK
C: < MAIL FROM:<user1@example.com>
S: > 200 OK
C: < RCPT TO:<test2@example.com>
S: > 200 OK
C: < DATA
S: > 200 OK
C: < From: user1@example.com
C: < Return-Path: user3@example.com
C: < Reply-To: user2@example.com
C: < To: test2@example.com
C: < Subject: RETRY RETRY::J_Retry
C: <
C: < This is the TaskForest system at your_machine_name
C: < ------------------------------------------------------
C: <
C: <
C: < The following job failed and will be rerun automatically.
C: <
C: < Family: RETRY
C: < Job: J_Retry
C: < Exit Code: 256
C: < Retry After: 2 seconds
C: < No. of Retries: 1 of 1
C: <
C: < Instructions that apply to all jobs in the Family named RETRY.
C: <
C: < Instructions that apply to the jobs named J_Retry.
C: <
C: <
C: <
C: < ------------------------------------------------------------
C: < For help, please see http://www.taskforest.com/
C: < .
S: > 200 OK
C: < QUIT
What remains to be explained is how TaskForest chose the email's subject and body. The subject contains the reason ('RETRY', 'FAIL' or 'RETRY_SUCCESS') followed by the Family name and the Job name, as in 'RETRY RETRY::J_Retry' in the transcript above. The reason depends on whether the job failed and is about to be retried (after the sleep time), failed for the final time, or succeeded after failing and being retried one or more times.
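As a small illustration, the subject seen in the transcript ('RETRY RETRY::J_Retry') can be reproduced as follows. This Python sketch infers the format from that single example, so treat it as an assumption rather than a specification. The function name make_subject is hypothetical.

```python
VALID_REASONS = ("RETRY", "FAIL", "RETRY_SUCCESS")


def make_subject(reason, family_name, job_name):
    """Format a subject as '<reason> <family>::<job>' (format inferred
    from the example transcript, not taken from TaskForest's code)."""
    if reason not in VALID_REASONS:
        raise ValueError(f"unknown reason: {reason!r}")
    return f"{reason} {family_name}::{job_name}"
```

A fixed prefix like this is what makes mail filters practical: a rule that matches subjects starting with 'FAIL ' can be routed differently from one matching 'RETRY '.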
As for the body, TaskForest concatenates the contents of several files found in instructions_dir. If a file does not exist, TaskForest ignores it and moves on to the next one. As it retrieves the contents of each file, TaskForest performs simple substitutions, replacing each placeholder found in the file's contents with the value of that variable in the run wrapper's environment. TaskForest passes all of the environment variables available to the taskforest program on to the run wrapper (run_with_log), and also adds the following variables to the environment:
- TASKFOREST_FAMILY_NAME
- This is the name of the family in which this job was run.
- TASKFOREST_JOB_NAME
- This is the name of the job that was run.
- TASKFOREST_LOG_DIR
- This is the full path of the TaskForest log directory.
- TASKFOREST_JOB_DIR
- This is the full path of the TaskForest job directory.
- TASKFOREST_PID_FILE
- This is the name of the pid file that's used internally by TaskForest
- TASKFOREST_SUCCESS_FILE
- This is the name of the file that's created if the job succeeded.
- TASKFOREST_FAILURE_FILE
- This is the name of the file that's created if the job failed.
- TASKFOREST_UNIQUE_ID
- This is an internal identifier used by TaskForest to refer to the job.
- TASKFOREST_NUM_RETRIES
- This is the value of the num_retries configuration variable.
- TASKFOREST_RETRY_SLEEP
- This is the value of the retry_sleep configuration variable.
- TASKFOREST_EMAIL
- This is the value of the email configuration variable.
- TASKFOREST_RETRY_EMAIL
- This is the value of the retry_email configuration variable.
- TASKFOREST_NO_RETRY_EMAIL
- This is the value of the no_retry_email configuration variable.
- TASKFOREST_INSTRUCTIONS_DIR
- This is the value of the instructions_dir configuration variable.
- TASKFOREST_SMTP_SERVER
- This is the value of the smtp_server configuration variable.
- TASKFOREST_SMTP_PORT
- This is the value of the smtp_port configuration variable.
- TASKFOREST_SMTP_SENDER
- This is the value of the smtp_sender configuration variable.
- TASKFOREST_MAIL_FROM
- This is the value of the mail_from configuration variable.
- TASKFOREST_MAIL_REPLY_TO
- This is the value of the mail_reply_to configuration variable.
- TASKFOREST_MAIL_RETURN_PATH
- This is the value of the mail_return_path configuration variable.
- TASKFOREST_SMTP_TIMEOUT
- This is the value of the smtp_timeout configuration variable.
- TASKFOREST_RETRY_SUCCESS_EMAIL
- This is the value of the retry_success_email configuration variable.
- TASKFOREST_NO_RETRY_SUCCESS_EMAIL
- This is the value of the no_retry_success_email configuration variable.
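The placeholder substitution described above can be sketched in a few lines of Python. This is an illustration under assumptions, not TaskForest's implementation: the function name substitute is hypothetical, placeholders are assumed to have the form $NAME, and leaving unknown placeholders untouched is a choice made here, since the text does not say what TaskForest does in that case.

```python
import os
import re

# A placeholder is assumed to be '$' followed by a variable name.
PLACEHOLDER = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def substitute(text, env=None):
    """Replace each $NAME in `text` with env[NAME].

    Names missing from `env` are left unchanged (an assumption).
    """
    if env is None:
        env = os.environ

    def replace(match):
        name = match.group(1)
        return env.get(name, match.group(0))

    return PLACEHOLDER.sub(replace, text)
```

For example, with TASKFOREST_FAMILY_NAME set to RETRY in the environment, the text 'Family: $TASKFOREST_FAMILY_NAME' would become 'Family: RETRY'.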
Let $instructions_dir refer to the value of the instructions_dir option. Let's call the reason for the email $reason. Its value is one of 'RETRY', 'FAIL', or 'RETRY_SUCCESS' (as described above). Let's also refer to the job name as $job_name, and the family name as $family_name. TaskForest will look for the following files in the following order, inserting their contents into the email body.
$instructions_dir/HEADER
- This is a header that will show up in every email. In the example that generated the above email, the contents of this file were:
This is the TaskForest system at $HOSTNAME
------------------------------------------------------
$instructions_dir/HEADER.$reason
- This is a header that will be displayed just for that reason. In other words, it will be one of three files: HEADER.RETRY, HEADER.FAIL, and HEADER.RETRY_SUCCESS. In the example that generated the above email, the name of this file was HEADER.RETRY and its contents were:
The following job failed and will be rerun automatically.

Family: $TASKFOREST_FAMILY_NAME
Job: $TASKFOREST_JOB_NAME
Exit Code: $TASKFOREST_RC
Retry After: $TASKFOREST_RETRY_SLEEP seconds
No. of Retries: $TASKFOREST_RETRY of $TASKFOREST_NUM_RETRIES
$instructions_dir/FAMILY.$family_name.$reason
- This is a file that can contain specific instructions for that family and reason. In the example that generated the above email, the name of this file was FAMILY.RETRY.retry and its contents were:
Instructions that apply to all jobs in the Family named RETRY.
$instructions_dir/JOB.$job_name.$reason
- This is a file that can contain specific instructions for that job and reason. In the example that generated the above email, the name of this file was JOB.J_Retry.retry and its contents were:
Instructions that apply to the jobs named J_Retry.
$instructions_dir/FOOTER.$reason
- This is a footer that will be displayed just for that reason. In other words, it will be one of three files: FOOTER.RETRY, FOOTER.FAIL, and FOOTER.RETRY_SUCCESS. In the example that generated the above email, the name of this file would have been FOOTER.RETRY, but that file did not exist, so TaskForest skipped over it.
$instructions_dir/FOOTER
- This is a footer that will show up in every email. In the example that generated the above email, the contents of this file were:
------------------------------------------------------------
For help, please see http://www.taskforest.com/
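The lookup order described above can be sketched as follows. This Python code is an illustration only and not TaskForest's implementation. The function names candidate_files and build_body are hypothetical, and joining the file contents with no extra separator is an assumption.

```python
from pathlib import Path


def candidate_files(instructions_dir, reason, family_name, job_name):
    """List the instruction files in the order they are looked up."""
    base = Path(instructions_dir)
    return [
        base / "HEADER",
        base / f"HEADER.{reason}",
        base / f"FAMILY.{family_name}.{reason}",
        base / f"JOB.{job_name}.{reason}",
        base / f"FOOTER.{reason}",
        base / "FOOTER",
    ]


def build_body(instructions_dir, reason, family_name, job_name):
    """Concatenate the files that exist, skipping the ones that do not."""
    parts = []
    for path in candidate_files(instructions_dir, reason, family_name, job_name):
        if path.is_file():
            parts.append(path.read_text())
    return "".join(parts)
```

Note that the example file names quoted in the text end in a lowercase '.retry', while the pattern uses $reason, whose values are written in uppercase. This sketch uses the reason exactly as passed in, so check which case your installation expects before naming your files.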