Quick-Start Guide¶

This section is a quick-start guide for installing the OLCF Test Harness (OTH) and launching an existing test. To create a new test, see Adding a New Test. To add support for a new machine, see Adding a New Machine.

Installation¶

To launch the OLCF Test Harness (OTH), you must first have access to the harness code. There are two options: use the centralized harness code that is available on most OLCF systems, or obtain your own copy of the code.

Option 1: Using the centralized (pre-built) OTH¶

On Andes and Frontier¶

On most OLCF machines, the code is already installed in: /sw/acceptance/olcf-test-harness

Set up the environment:

export OLCF_HARNESS_DIR=/sw/acceptance/olcf-test-harness
module use $OLCF_HARNESS_DIR/modulefiles
module load olcf_harness
# Machine name examples: andes, frontier, odo
# Check ${OLCF_HARNESS_DIR}/configs/*.ini to see all available machines
export OLCF_HARNESS_MACHINE=<machine_name>

Option 2: Using your own copy of the harness¶

Clone the repo on the target system:

git clone https://github.com/olcf/olcf-test-harness.git

Set up the environment:

cd olcf-test-harness
export OLCF_HARNESS_DIR=${PWD}
module use $OLCF_HARNESS_DIR/modulefiles
module load olcf_harness
export OLCF_HARNESS_MACHINE=<machine_name>

Note

You must have the $OLCF_HARNESS_MACHINE.ini file in your current directory or in $OLCF_HARNESS_DIR/configs. An example_machine.ini file is provided in the $OLCF_HARNESS_DIR/configs directory, and OLCF machine examples are provided in the configs/olcf_examples directory. For creating a new machine, see Adding a New Machine.
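
Before launching, it can help to confirm that a machine configuration file is actually present in one of the two locations named in the note. The following is a minimal sketch: find_machine_ini is a hypothetical helper (not part of the OTH), and checking the current directory before the configs directory is an assumption, since this guide does not state which location takes priority.

```shell
# Hypothetical helper: report where <machine>.ini would be found.
# $1 = value intended for OLCF_HARNESS_MACHINE
# $2 = value intended for OLCF_HARNESS_DIR
find_machine_ini() {
    machine="$1"
    harness_dir="$2"
    # Assumed search order: current directory first, then the configs directory
    for candidate in "./${machine}.ini" "${harness_dir}/configs/${machine}.ini"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "No ${machine}.ini in . or ${harness_dir}/configs" >&2
    return 1
}
```

After the environment has been set up, the helper could be called as find_machine_ini "$OLCF_HARNESS_MACHINE" "$OLCF_HARNESS_DIR". A non-zero return suggests the machine name is misspelled or the .ini file still needs to be created.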

Launching the OTH¶

Basic Usage¶

Create a directory where you will place input files. No computation will be done here:

mkdir summit_testshot
cd summit_testshot

Prepare an input file of tests (e.g., rgt.input.summit). In the file, set Path_to_tests to the location where you would like application source and run files to be kept (note that the directory provided must be an existing directory on a file system visible to the current machine). Next, provide one or more tests to run in the format Test = <app-name> <test-name>. In this example for Summit, the application hello_mpi is used and we specify two tests: c_n001 and c_n002.

Note

Tests may be hosted in GitHub/GitLab repositories, or may be placed on the file system in the directory specified by Path_to_tests. The OTH can automatically clone Git repositories from remote servers. Configuration settings for Git repositories are in the $OLCF_HARNESS_MACHINE.ini file (see Adding a New Machine or Configuration Variables). Applications not hosted in GitHub/GitLab must be manually placed in Path_to_tests.

################################################################################
#  Set the path to the top level of the application directory.                 #
################################################################################

Path_to_tests = /some/path/to/my/applications

Test = hello_mpi c_n001
Test = hello_mpi c_n002
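
Because Path_to_tests must point to an existing directory, a quick check before launching can catch a mistyped path. This is a minimal sketch only: check_path_to_tests is a hypothetical helper (not part of the OTH), and it assumes the simple "Path_to_tests = <dir>" form shown in the example above.

```shell
# Hypothetical helper: verify that Path_to_tests in an input file is an existing directory.
# $1 = path to the harness input file (e.g., rgt.input.summit)
check_path_to_tests() {
    input_file="$1"
    # Take the value from the first "Path_to_tests = <dir>" line in the file
    tests_dir=$(sed -n 's/^Path_to_tests[[:space:]]*=[[:space:]]*//p' "$input_file" | head -n 1)
    if [ -d "$tests_dir" ]; then
        echo "ok: $tests_dir"
    else
        echo "missing: $tests_dir" >&2
        return 1
    fi
}
```

For the example above, check_path_to_tests rgt.input.summit would report the placeholder path as missing until it is replaced with a real directory.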

For convenience, the Include keyword reads in another harness input file, adding the tests, paths, and tasks to the current harness parameters. Path_to_tests is set to the first value encountered while parsing the input files. For example:

Path_to_tests = /some/path/to/my/applications

Include 1-node-tests.inp
Include 2-node-tests.inp
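
As an illustration of how the included files might be laid out, the sketch below writes all three files with here-documents. Which test goes into which file is an assumption made for this example; the only requirement stated here is that each included file is itself a harness input file.

```shell
# Illustrative split: one included file per node count (an assumption for this sketch)
cat > 1-node-tests.inp <<'EOF'
Test = hello_mpi c_n001
EOF

cat > 2-node-tests.inp <<'EOF'
Test = hello_mpi c_n002
EOF

# Top-level input file: sets Path_to_tests once and pulls in both test lists.
# The path is a placeholder and must be replaced by an existing directory.
cat > rgt.input.summit <<'EOF'
Path_to_tests = /some/path/to/my/applications

Include 1-node-tests.inp
Include 2-node-tests.inp
EOF
```

Keeping Path_to_tests only in the top-level file avoids relying on the first-value-wins rule described above.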

Optionally, set a scratch area for this specific instance of the harness. A default is taken from $OLCF_HARNESS_MACHINE.ini; export the following variable to override it:

export RGT_PATH_TO_SSPACE=<some path in the file system>/Scratch

The latest version of the harness accepts tasks from the command line as well as from the input file. If the input file contains no tasks, the harness uses the modes given on the command line. To launch from the command line, use commands like the following:

# Running the checkout separately is preferred, so you can verify that it succeeded
runtests.py --inputfile rgt.input.summit --mode checkout
runtests.py --inputfile rgt.input.summit --mode start stop

To launch tasks from the input file instead of the command line, add lines like the following to rgt.input.summit:

# 1 task per line
harness_task check_out_tests
harness_task start_tests
harness_task stop_tests
harness_task display_status

When using the checkout mode, the application source repository will be cloned to the <Path_to_tests>/<app-name> directory for all the tests, but no tests will be run. If the repository already exists, no action will be taken. Updating the repo via git pull or git fetch should be done outside of the test harness.

After using the start mode, the results of each test run can be found in <Path_to_tests>/<app-name>/<test-name>/Run_Archive/<testid>. The results of the most recent test run are also reachable through the <Path_to_tests>/<app-name>/<test-name>/Run_Archive/latest symbolic link.

Note

The latest link may not update cleanly if multiple instances of the same test are running simultaneously. The OTH will print a warning, but will continue running.
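
To find out which run the latest link currently points to, the link can be resolved directly. This is a minimal sketch: latest_test_id is a hypothetical helper (not part of the OTH), and it assumes that the final path component of the link target is the test ID.

```shell
# Hypothetical helper: print the test ID that Run_Archive/latest points to.
# $1 = <Path_to_tests>/<app-name>/<test-name>/Run_Archive
latest_test_id() {
    run_archive="$1"
    # readlink fails if the link does not exist (e.g., the test has never been started)
    target=$(readlink "${run_archive}/latest") || return 1
    basename "$target"
}
```

Given the caveat in the note above, comparing this ID with the one expected from your own launch is a reasonable sanity check when several instances of the same test run at once.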

Command-line Options¶

The OTH is configured through two primary mechanisms: command-line flags and environment variables. This section details the command-line parameters, and the next section details the available environment variables.

The primary OTH driver script, runtests.py, supports the following command-line parameters:

-h,--help                           show help message and exit
-i,--inputfile INPUTFILE            Input file name (default: rgt.input)
-c,--configfile CONFIGFILE          Configuration file name (default: ${OLCF_HARNESS_MACHINE}.ini)
-l,--loglevel LOGLEVEL              Logging level (default: NOTSET)
                Options: [NOTSET,DEBUG,INFO,WARNING,ERROR,CRITICAL]
-o,--output {screen,logfile}        Destination for harness stdout/stderr messages (default: 'screen')
                Options: [screen,logfile]
                        'screen'  - print messages to console (default)
                        'logfile' - print messages to log file
-m,--mode MODE [MODE ...]           Specify the mode(s) to run the harness with (default: 'use_input_file')
                Options: [use_input_file,checkout,start,stop,status]
                        'use_input_file' - use tasks defined in the input file
                        'checkout'       - checkout application tests listed in input file
                        'start'          - start application tests listed in input file
                        'stop'           - stop application tests listed in input file
                        'status'         - check status of application tests listed in input file

--fireworks                         Use FireWorks to run harness tasks (beta)
-sb, --separate-build-stdio         Separate output from build into build_out.stderr.txt and build_out.stdout.txt

Note

The --loglevel flag currently does not apply to all output from the OTH. This issue is tracked by Issue 130.

Run-time environment parameters¶

The OTH is designed to automatically ingest some parameters from user-set environment variables at launch time. Nearly all parameters in the $OLCF_HARNESS_MACHINE.ini file can be directly overridden by a corresponding environment variable. For example, git_reps_branch is a parameter in $OLCF_HARNESS_MACHINE.ini that specifies the branch of the remote repository to clone. The RGT_GIT_REPS_BRANCH environment variable can be used to override this value at launch time. The general precedence of configuration options from lowest to highest is:

$OLCF_HARNESS_MACHINE.ini
User-set environment variables (e.g., RGT_GIT_REPS_BRANCH, RGT_PROJECT_ID)
<Path_to_tests>/<app-name>/<test-name>/Scripts/rgt_test_input.ini

The specific parameters are defined in Adding a New Test and Adding a New Machine.

The exception to this is setting the batch queue and project ID used for submission. The precedence of configuration options for the batch queue and project ID from lowest to highest is:

batch_queue and project_id from $OLCF_HARNESS_MACHINE.ini (overridden by setting RGT_BATCH_QUEUE and RGT_PROJECT_ID at launch)
batch_queue and project_id from <Path_to_tests>/<app-name>/<test-name>/Scripts/rgt_test_input.ini
User-set environment variables: RGT_SUBMIT_QUEUE and RGT_SUBMIT_ACCT

Because the test configuration overrides the machine configuration for these two settings, the same environment variable names cannot also be used to override the test configuration. The OTH cannot tell whether RGT_BATCH_QUEUE was set by the user or derived from the machine .ini file, so the test configuration overrides it either way. Two separate variables are therefore used to override both the machine and test configuration: RGT_SUBMIT_QUEUE sets the batch queue, and RGT_SUBMIT_ACCT sets the account ID used for submission.
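
The two precedence orders described in this section can be modeled with nested shell default expansions. This is only an illustration of the ordering, not the harness's implementation: resolve_param and resolve_submit_queue are hypothetical helpers, and the branch and queue names in the calls are made-up values.

```shell
# General rule: test config > user environment variable > machine .ini
# $1 = value from the machine .ini, $2 = environment override, $3 = value from rgt_test_input.ini
resolve_param() {
    machine_value="$1"; env_value="$2"; test_value="$3"
    printf '%s\n' "${test_value:-${env_value:-$machine_value}}"
}

# Queue rule: RGT_SUBMIT_QUEUE > test config > machine .ini (or RGT_BATCH_QUEUE)
# $1 = machine-level queue, $2 = queue from rgt_test_input.ini, $3 = RGT_SUBMIT_QUEUE
resolve_submit_queue() {
    machine_queue="$1"; test_queue="$2"; submit_override="$3"
    printf '%s\n' "${submit_override:-${test_queue:-$machine_queue}}"
}

resolve_param main develop ""               # prints develop
resolve_param main develop release          # prints release
resolve_submit_queue batch debug ""         # prints debug
resolve_submit_queue batch debug killable   # prints killable
```

The last two calls show the difference that matters in practice: only the RGT_SUBMIT_* style of variable can win over a value set in the test's own configuration.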

Finding Test Output¶

This section details where all output files can be found once a harness run completes. Four directories are referenced in this section:

$BUILD_DIR - the build directory, equal to $RGT_PATH_TO_SSPACE/<app>/<test>/<test-id>/build_directory
$WORK_DIR - equal to $RGT_PATH_TO_SSPACE/<app>/<test>/<test-id>/workdir
$RESULTS_DIR - the directory used to launch the job and store relevant output, equal to <Path_to_tests>/<app>/<test>/Run_Archive/<test-id>
$STATUS_DIR - the directory used to store harness status files, equal to <Path_to_tests>/<app>/<test>/Status/<test-id>
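
The four locations can be derived mechanically from the scratch path, Path_to_tests, and the application, test, and test-ID names. A minimal sketch follows; oth_dirs is a hypothetical helper (not part of the OTH) that simply applies the definitions listed above.

```shell
# Hypothetical helper: print the four directories for one test run.
# $1 = RGT_PATH_TO_SSPACE, $2 = Path_to_tests, $3 = app, $4 = test, $5 = test-id
oth_dirs() {
    sspace="$1"; tests="$2"; app="$3"; test_name="$4"; test_id="$5"
    echo "BUILD_DIR=${sspace}/${app}/${test_name}/${test_id}/build_directory"
    echo "WORK_DIR=${sspace}/${app}/${test_name}/${test_id}/workdir"
    echo "RESULTS_DIR=${tests}/${app}/${test_name}/Run_Archive/${test_id}"
    echo "STATUS_DIR=${tests}/${app}/${test_name}/Status/${test_id}"
}
```

The output is in NAME=value form, so it can be passed to eval in an interactive shell if you want the variables set for the commands that follow.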

Build, Submit, and Check Output¶

Each OTH test run consists of 4 primary stages – build, submit, run, and check, as can be seen in Overview of the Test Harness. The build, submit, and check stages have pre-defined locations for the output:

build output: ${BUILD_DIR}/output_build.txt
submit output: ${RESULTS_DIR}/submit.{out,err}
check output: ${RESULTS_DIR}/output_check.txt
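
When a test fails, a quick way to see how far it got is to check which of these files exist. The sketch below is one way to do that; report_stage_outputs is a hypothetical helper (not part of the OTH), and it expands submit.{out,err} into the two file names submit.out and submit.err.

```shell
# Hypothetical helper: report which stage output files are present.
# $1 = BUILD_DIR, $2 = RESULTS_DIR
report_stage_outputs() {
    build_dir="$1"; results_dir="$2"
    for f in "${build_dir}/output_build.txt" \
             "${results_dir}/submit.out" \
             "${results_dir}/submit.err" \
             "${results_dir}/output_check.txt"; do
        if [ -f "$f" ]; then
            echo "found   $f"
        else
            echo "missing $f"
        fi
    done
}
```

A missing output_check.txt alongside existing build and submit files, for example, suggests that the run has not yet reached the check stage.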

Application Output¶

Output from the executable run can be found in one of two places.

Uncaptured output will go to the scheduler’s stdout/stderr mechanism, and will commonly be found in ${RESULTS_DIR}
Any output files created by the application should be in ${WORK_DIR}

Harness-maintained Log Files¶

The OTH also produces log files throughout the run using the Python logging module. These files contain harness messages that are useful for debugging failed tests. They default to the Python logging level INFO and switch to DEBUG if the --loglevel=DEBUG flag is provided on the command line. The table below lists the available log files and the scope of the messages in each, in relative chronological order.

Harness-generated log files¶
Directory	Logfile name	Description
<launch_directory>	`main.log`	The highest-level execution information. For example, “Completed reading the harness input file.” This scope ends when the `runtests.py` command returns.
<launch_directory>/harness_log_files.<timestamp>	`libraries.regression_test.<timestamp>.txt`	High-level logging messages while preparing tests. For example, “Application Hello, World! has completed launching.” This scope ends when the `runtests.py` command returns.
<launch_directory>/harness_log_files.<timestamp>/<app>	`<app>__<test>.logfile.txt`	A brief scope between the regression_test log-file and the log files located inside the test’s results directory. This scope ends when the `runtests.py` command returns.
${RESULTS_DIR}/LogFiles	`application_logfile.txt`	The majority of log messages generated while building, submitting, running, and checking a test.
${RESULTS_DIR}/LogFiles	`status_logfile.txt`	Log messages generated at various checkpoints during the lifetime of a test, such as `build_end`, `binary_execute_start`, and `check_end`.
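
When debugging a failed test, the per-test log files can be searched for high-severity messages. The sketch below assumes that the level names ERROR and CRITICAL appear literally in the log lines, which depends on the harness's logging format and is not confirmed here; scan_test_logs is a hypothetical helper (not part of the OTH).

```shell
# Hypothetical helper: print ERROR/CRITICAL lines from a test's log files.
# $1 = RESULTS_DIR
scan_test_logs() {
    results_dir="$1"
    # /dev/null forces grep to print file names even if only one log file matches the glob
    grep -n -E 'ERROR|CRITICAL' "${results_dir}"/LogFiles/*.txt /dev/null
}
```

Each match is printed as file:line:message. The function returns non-zero when nothing matches, which is grep's standard behavior.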