wiki:HetProcedures/RA/quick/TimsHetFail

Version 2 (modified by sco, 8 months ago)

--

When a timshet failure occurs, what do I do?

  1. We receive an email notifying us that hpf has lost tcs metadata. We may also hear an alert from netsounds.sh (although I have never heard one).
  1. Try a graceful tims shutdown: go to the rasession scripts window and issue "tims het shutdown".
  1. If step 2 does not succeed, go to the tims client window (in rasession this is the upper-right window) and enter Ctrl-C to kill the timshet run, which is presumed to be hung.
  1. Press the "TCS client restart" button in the RA Ops web to restart the timshet communication, then watch the output in the tims client window (rasession) to verify that the restart has occurred. --OR-- Issue "timshet" in the rasession client window.
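
The order of the steps above can be summarized as a short decision flow. The Python below is only an illustrative sketch, not an existing script: the function name `recover_timshet` and its three callables are hypothetical stand-ins for the manual actions (issuing "tims het shutdown", entering Ctrl-C in the tims client window, and restarting and checking the client output). Treating an unverified restart as an error is also an assumption made here.

```python
from typing import Callable, List


def recover_timshet(
    graceful_shutdown: Callable[[], bool],
    force_kill: Callable[[], None],
    restart: Callable[[], bool],
) -> List[str]:
    """Run the recovery steps in order and return the names of the steps taken.

    All three callables are hypothetical placeholders for operator actions.
    """
    steps: List[str] = []

    # Step 2: try the graceful shutdown first.
    steps.append("graceful_shutdown")
    if not graceful_shutdown():
        # Step 3: fall back to interrupting the presumably hung timshet run.
        steps.append("force_kill")
        force_kill()

    # Step 4: restart timshet communication and confirm that it came back.
    steps.append("restart")
    if not restart():
        # Assumption: an unverified restart is treated as a failure.
        raise RuntimeError("timshet restart could not be verified")
    return steps
```

The point of the sketch is that the Ctrl-C kill is only a fallback: it is skipped whenever the graceful shutdown works, and the restart step follows in either case.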