Analysis Getting Started

From SBS wiki
Revision as of 17:04, 4 March 2022 by Puckett (Talk | contribs)


These instructions are specific to the analyzer and SBS-offline installations existing under /work/halla/sbs, maintained by Andrew Puckett.

How to Reach the SBS Work Directory

  • The SBS work directory is located at /work/halla/sbs
    • Create a directory here with mkdir username
    • If you do not have permission, contact Ole Hansen (ole@jlab.org) and ask to be added to the SBS user group.


Setting up Environments

  • Open ~/.cshrc and add the lines:
source /site/12gev_phys/softenv.csh 2.5
source /work/halla/sbs/ANALYZER/install/bin/setup.csh 
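With these lines in ~/.cshrc, a fresh login shell should have the Hall A analyzer on your PATH. A quick sanity check (assuming the analyzer binary is named analyzer and that setup.csh defines the ANALYZER variable, as in a standard Hall A analyzer installation):

```shell
# Open a fresh shell (or source ~/.cshrc), then verify the environment:
which analyzer        # should resolve under /work/halla/sbs/ANALYZER/install
echo $ANALYZER        # setup.csh normally defines this variable
```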

Getting Files from Cache

  • All raw EVIO files from GMn are on tape at /mss/halla/sbs/raw
  • Cached EVIO files are located at /cache/halla/sbs/raw
  • To copy files from tape to cache, see the documentation at https://scicomp.jlab.org/docs/node/586
    • For example, to stage all EVIO splits for run runnumber to cache, execute jcache get /mss/halla/sbs/raw/*runnumber*
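As a concrete example (run number 13500 is hypothetical; substitute a real GMn run number):

```shell
# Request all EVIO segments of run 13500 from tape:
jcache get /mss/halla/sbs/raw/*13500*

# Once staging completes, the files appear under the cache area:
ls -lh /cache/halla/sbs/raw/*13500*
```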


Storing Output Files

  • No output files (logs or ROOT) should be stored in the work directory
  • They should be stored in the "volatile" directory
    • Go to /volatile/halla/sbs
    • Create a directory here with mkdir username
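For example (username is a placeholder for your own CUE account name):

```shell
# Create a per-user output area on /volatile; -p creates parents as needed:
mkdir -p /volatile/halla/sbs/username/rootfiles
mkdir -p /volatile/halla/sbs/username/logs
```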


Setting up the SBS Replay


SBS Installation

  • Follow the README instructions on https://github.com/JeffersonLab/SBS-offline to install SBS-offline.
  • After installing, there should be a directory install/run_replay_here
  • Inside there should be one file named .rootrc (it is a hidden file).
  • Wherever you run the replay, this file must be present so that the SBS-offline libraries are loaded. Either run your replays here, or copy the .rootrc file to your chosen replay directory.
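For example, if you prefer to run replays from your own directory (both paths below are placeholders):

```shell
# .rootrc is hidden, so copy it explicitly into your replay directory:
cp /path/to/SBS-offline/install/run_replay_here/.rootrc /path/to/my_replay_dir/
```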


SBS Replay Environments

  • The following lines should be used in a script to define where the data and output files are located:
setenv SBS_REPLAY path-to-your-replay/SBS-replay
setenv DB_DIR $SBS_REPLAY/DB
setenv DATA_DIR /cache/halla/sbs/raw
setenv OUT_DIR path-to-your-volatile/rootfiles
setenv LOG_DIR path-to-your-volatile/logs
setenv ANALYZER_CONFIGPATH $SBS_REPLAY/replay
  • DATA_DIR tells the replay where the EVIO files are.
  • OUT_DIR tells the replay where to put the output ROOT files.
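Putting these pieces together, a replay launch script might look like the sketch below. The username placeholder, the SBS-replay install location, and the replay macro name and its arguments are all assumptions; check the scripts shipped in $SBS_REPLAY/replay for the actual macro names and signatures.

```shell
#!/bin/csh
# Sketch of a replay script (tcsh syntax); all paths are placeholders.
setenv SBS_REPLAY /work/halla/sbs/username/SBS-replay
setenv DB_DIR $SBS_REPLAY/DB
setenv DATA_DIR /cache/halla/sbs/raw
setenv OUT_DIR /volatile/halla/sbs/username/rootfiles
setenv LOG_DIR /volatile/halla/sbs/username/logs
setenv ANALYZER_CONFIGPATH $SBS_REPLAY/replay

# Run from a directory containing the .rootrc from install/run_replay_here.
# The macro name and argument list here are illustrative only:
analyzer -b -q 'replay_gmn.C(13500)' >& $LOG_DIR/replay_13500.log
```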

Running the SBS Replay

Working example scripts for GMN analysis on the batch farm

The most efficient and convenient way to analyze GMN data on the batch farm is to use the swif2 system. A general overview of the swif2 system is available from the computer center's documentation page here.

The first step in using the swif2 system is setting up a "workflow" under your CUE account using "swif2 create", as documented here.

Once you have created a workflow, it can be used to launch jobs on the batch farm. The general command-line reference for using swif2 can be found here.

Working example scripts to launch GMN replay jobs on the batch farm can be found at

/work/halla/sbs/puckett/GMN_ANALYSIS/launch_GMN_replay_swif2.sh

/work/halla/sbs/puckett/GMN_ANALYSIS/run_GMN_swif2.sh

These scripts both refer to directories and workflows that are specific to the "puckett" user account on the farm. They should be viewed as templates and examples for you to copy to your own work disk area and develop your own scripts and workflows. The first of these two scripts takes just two arguments: a run number and a maximum number of segments. The proper usage would be:

./launch_GMN_replay_swif2.sh runnum maxsegment


Here "runnum" refers to the CODA run number and "maxsegment" is the number of EVIO file segments to be replayed. The script will create one batch job per EVIO file (assuming the file exists) and add it to the workflow "puckett_GMN_analysis".

After executing this script, if the workflow is not already running, you can tell it to start releasing jobs to the batch farm using the command:

swif2 run <workflow name>

The second script (run_GMN_swif2.sh) sets up the environment on the farm node, runs the analyzer, and copies the output files of the replay job to an appropriate directory on /volatile. You do not need to call this script directly; it is invoked with the appropriate arguments by the swif2 jobs created by the first script.
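Putting the steps above together, a typical session might look like the following. The workflow name and run/segment numbers are placeholders, the command forms mirror the swif2 run usage above, and your copy of the launch script must be edited to use your own workflow name; consult the swif2 command-line reference for exact options.

```shell
# Create a workflow under your CUE account (name is a placeholder):
swif2 create username_GMN_analysis

# Queue replay jobs for run 13500, segments up to 10:
./launch_GMN_replay_swif2.sh 13500 10

# Start releasing jobs to the batch farm:
swif2 run username_GMN_analysis

# Check progress:
swif2 status username_GMN_analysis
```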