Difference between revisions of "Fixing the SBS DAQ"

Latest revision as of 13:46, 14 May 2024

The DAQ will not run and I know nothing about CODA

Figure 1. CODA GUI with ROC information printed at the bottom


Before attempting anything else, try resetting and starting CODA twice; this often fixes the issue. If CODA is still not working, follow these steps:

  • If it is during the work day, or you think troubleshooting will take you more than 15 minutes, call the RC immediately instead of following these steps.
  • First, figure out which DAQ crate is causing the issue.
  • Look at the CODA GUI (Figure 1) and check the "Severity" information at the bottom.
    • Yellow "Warn" messages are fine.
    • Red or orange "error" messages must be fixed.
  • If you see no errors, scroll up; errors are sometimes pushed out of view before you notice them.
    • If there are still no obvious errors, call a DAQ expert.
  • Find the crate name in the left column and determine which crate it is. The table below lists which subsystem each crate belongs to.
  • At this point, if you want to try fixing the DAQ on your own, go to "Fixing the DAQ yourself".
    • Do not do this if it is between 8 am and 10 pm and you think it will take you more than 10 minutes. Call the expert instead.
    • If it is night time, try for about 20 minutes before calling the relevant expert.
  • Otherwise, look at the table under "List of crates in the DAQ" and call the expert matching that subsystem.

List of crates in the DAQ

The passwords for these links are on a wall in the counting house

All network devices at https://hallaweb.jlab.org/wiki/index.php/SBS_Network_Devices_in_Hall_A


CODA ROC Name          | Subsystem     | Reset Link
ER1, ER2, ER3          | CODA Platform | HERE
SEB1, SEB2, SEB3       | CODA Platform | HERE
DC1, DC2               | CODA Platform | HERE
sbsvme29ROC1           | CODA Platform | HERE
sbsvtpROC24            | SBS GEMs      | HERE
sbsvtpROC25            | SBS GEMs      | HERE
vtpROC20               | BB GEMs       | HERE
(portserver) for HCal  | HCal          | http://129.57.188.119 (port 5)
hcalROC16 hcalvtpROC28 | HCal          | http://hcalvxs1.jlab.org http://129.57.192.243 portservhats4 2002
hcalROC17 hcalvtpROC29 | HCal          | http://hcalvxs2.jlab.org portservhats4 2004
sbsgemROC23            | SBS GEMs      | HERE
sbsgemROC22            | SBS GEMs      | HERE
grinchROC7             | GRINCH        | http://grinchvxs.jlab.org
lhrsROC10              | LHRS          | http://lefthrsvxs.jlab.org
bbgemROC19             | BB GEMs       | HERE
bbshowerROC6           | BBCal         | http://bbshowervxs.jlab.org
bbhodoROC5             | BBhodo        | http://bbhodovxs.jlab.org
sbsTS21                | CODA Platform | HERE http://sbstsvxs.jlab.org

*Note: If rebooting the HCal portserver does not fix the issue, consider rebooting the VXS crate. hcalvtp1ROC28 is in crate http://hcalvxs1.jlab.org (http://129.57.192.243) and hcalvtp2ROC29 is in crate http://hcalvxs2.jlab.org. Rebooting the hcalvxs2 crate also reboots the TDCs (hcalROC17) and the DAC (the BBCal and HCal thresholds go back to their defaults) along with hcalvtp2ROC29, so be careful before rebooting it.

DO NOT START A RUN AFTER REBOOTING HCALVXS2 OR HCALROC17 WITHOUT RELOADING THE TRIGGER THRESHOLD DAC AND SETTING THE THRESHOLDS!

Fixing the DAQ yourself

Almost all issues are fixed by restarting the right crate. Look at the table under "List of crates in the DAQ" and determine which subsystem the failing crate belongs to.

  • If it is a GEM crate go to GEM Crates
  • If it is a different crate, follow the reset links in the table above.
  • If it is a "CODA Platform" issue then the reset must be through CODA. Go to Fixing CODA Platform Issues
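The routing above depends on matching the crate name from the CODA GUI to a subsystem. As a minimal sketch, the lookup could be written as a shell function; the ROC names and subsystems are copied from the crate table on this page, while the function name `roc_subsystem` is hypothetical and not an existing script on the DAQ machines.

```shell
# Map a CODA ROC name (as shown in the CODA GUI) to its subsystem.
# Names and subsystems are copied from the crate table on this page;
# roc_subsystem itself is a hypothetical helper, not an installed tool.
roc_subsystem() {
    case "$1" in
        ER[1-3]|SEB[1-3]|DC[12]|sbsvme29ROC1|sbsTS21)
            echo "CODA Platform" ;;
        sbsvtpROC24|sbsvtpROC25|sbsgemROC22|sbsgemROC23)
            echo "SBS GEMs" ;;
        vtpROC20|bbgemROC19)
            echo "BB GEMs" ;;
        hcalROC16|hcalROC17|hcalvtpROC28|hcalvtpROC29)
            echo "HCal" ;;
        grinchROC7)   echo "GRINCH" ;;
        lhrsROC10)    echo "LHRS" ;;
        bbshowerROC6) echo "BBCal" ;;
        bbhodoROC5)   echo "BBhodo" ;;
        *)
            # Unrecognized name: report it and signal failure to the caller.
            echo "unknown"
            return 1 ;;
    esac
}
```

For example, `roc_subsystem bbgemROC19` prints `BB GEMs`, which points to the GEM Crates section, while a name that is not in the table prints `unknown` and returns a non-zero status.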

Fixing CODA Platform Issues

  • On the adaq@adaq2 desktop look for the folder labeled "SBS_coda_scripts" and open it.

  • The following options will appear in front of you. First try clicking on "Restart CODA Components"

  • If a CODA component remains in a red "disconnected" state, try "Kill CODA Xterms" and then "Start Xterms" instead.
  • If that still does not work, restart the platform with the commands below.
 ssh adaq@adaq2
 sudo systemctl restart platform.service
  • Use the "Kill CODA Xterms" button and restart all terminals with "Start Xterms"
  • If the DAQ still will not run then try this from any hadesk:
 ssh sbs-onl@eel124gemdaq
 kcoda
  • This will kill other processes that may be causing an issue
  • Then click on the "Restart CODA Components" button
  • If things are still not working then call the DAQ expert

RC GUI is Frozen

Sometimes the RC GUI freezes and you cannot press any buttons or close it. If that happens, run the following in a terminal:

ps aux | grep ui.rcgui

You will get information in the terminal that looks like the picture below.

Inside the red rectangle is the process ID (PID). To kill the GUI, run the following, replacing PID with that number:

kill -9 PID

The GUI should close and then you can open a new GUI using the "CODA RunControl" button without issue.
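The PID lookup can also be done without reading the number off the screen. The sketch below assumes the usual `ps aux` layout, where the PID is the second column; the helper name `rcgui_pids` is hypothetical. It only prints PIDs and kills nothing, so you can check the result first.

```shell
# Read `ps aux` output on stdin and print the PID (second column) of every
# line that mentions ui.rcgui. Writing the pattern as [u]i\.rcgui keeps grep
# from matching its own command line in the process list.
# rcgui_pids is a hypothetical helper name.
rcgui_pids() {
    grep '[u]i\.rcgui' | awk '{print $2}'
}

# Typical use (review the printed PIDs before killing anything):
#   ps aux | rcgui_pids
```

If several PIDs are printed, more than one RC GUI process is running; confirm which one is frozen before killing it.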

Resetting sbsvme29ROC1

First log in to the adaq@adaq2 computer. Then open a terminal and execute the following commands:

ssh sbs-onl@sbsvme29 ssh sbs-onl@adaq2
cd ~/bin
caen_power.py intelha9 0 # turn off

From adaq2, run the following (probably twice) until it comes back up:

caen_power.py intelha9 1 # turn on
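Because the power-on command may need to be repeated, a small retry loop can take over the repetition. This is only a sketch: the `retry` helper is hypothetical, and the commented usage assumes that a successful `ping` of sbsvme29 is an acceptable sign that the crate is back up, which this page does not state.

```shell
# Run a command repeatedly until it succeeds or the attempt limit is reached.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# retry is a hypothetical helper, not an installed tool.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        # Stop as soon as the command reports success.
        if "$@"; then
            return 0
        fi
        # Wait before the next attempt, but not after the last one.
        if [ "$attempt" -lt "$max" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Hypothetical usage (the ping check and the 60 s wait are assumptions):
#   power_on_and_check() {
#       caen_power.py intelha9 1 && sleep 60 && ping -c 1 sbsvme29 >/dev/null
#   }
#   retry 3 10 power_on_and_check
```

Capping the number of attempts keeps the loop from cycling power indefinitely; if it still fails, call the DAQ expert.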

GEM Crates

Figure 2. GEM Reset GUI

On the a-onl machine do this:

gosbs
GEM_resets.sh

A GUI like the one shown on the right will appear. Refer to the "Subsystem" column of the table under "List of crates in the DAQ" to figure out whether the BB or SBS GEMs need to be reset. If you are uncertain, reset both. Shift workers may press the big reset buttons; the GEMs take a couple of minutes to reset completely.

After the GEMs have reset, CODA may still show a ROC as disconnected. In that case, restart the CODA terminals from the folder called SBS_coda_scripts: run the "Kill CODA Xterms" script and then the "Start Xterms" script. Then proceed to Configure the DAQ.