== Virtual Host and/or Network Switch outage recovery procedure ==

=== Shutdown before outage ===

For a planned outage, do the following before shutting down the virtual hosts and network switches:

 * on utility-apps.het.astronomy.utexas.edu
   1. shut down the apc server to cleanly close the network connections to the TDK power supplies
      a. `sudo service apcServer stop`
 * on htcs.het.astronomy.utexas.edu
   1. shut down all the servers to cleanly close the network connections
      a. `sudo service pfipServer stop`
      b. `sudo service pasServer stop`
      c. `sudo service legacyServer stop`
      d. `sudo service trackerServer stop`
      e. `sudo service tcsServer stop`
   2. kill the caRepeater process
 * on lrs2
   1. disable vacuum gauges and ion pumps in lrsServer
      a. `syscmd -l "disable_pressure_gauges()"`

=== Restart and recovery ===

Once the virtual hosts and network switches are operating again, the following steps are needed to recover operations.

 * on utility-apps.het.astronomy.utexas.edu
   1. shut down the running tcs services
      a. `sudo service tcsNamed stop`
      b. `sudo service logRelay stop`
      c. `sudo service apcServer stop`
   2. restart the tcs services
      a. `sudo service tcsNamed start`
      b. `sudo service logRelay start`
      c. `sudo service apcServer start`
   3. verify that the services are running
      a. `chksys`, which should return a response like
      {{{
hetdex 19373 1 0 102152 24116 1 Mar06 ? 00:00:45 /opt/het/hetdex/bin/tcsnamed --named-route tcp://192.168.66.99:30000 -c /opt/het/hetdex/etc/configfiles/tcsnamed.conf
hetdex 19508 1 0 156388 95896 1 Mar06 ? 00:01:17 /opt/het/hetdex/bin/tcs_log_relay --named-route tcp://192.168.66.99:30000
hetdex 11093 1 0 335329 14600 1 Mar06 ? 00:05:04 /opt/het/hetdex/bin/apcServer --named-route tcp://192.168.66.99:30000 -c /opt/het/hetdex/etc/configfiles/apcServer.conf
      }}}
 * start a weather GUI and verify that all readings are present. It may take 5 minutes for dust readings to appear.
   If weather readings are missing,
   1. try restarting the weather system from the `tolauncher`
   2. check the weather process logs in `/data1/archive/weather/logs/`
   3. log in to hetwx.het.astronomy.utexas.edu
      a. `su - wx`
      b. troubleshoot the programs in `/home/hetwx/wx/weather_sys`
   If truss temperatures are missing,
   1. use rdesktop to connect to trusst.het.astronomy.utexas.edu and log in as guider
   2. restart the `Truss Temp Reader and Server - v3.3` via the shortcut on the desktop
 * on ute2.het.astronomy.utexas.edu
   1. verify that the event monitors are running
      a. `chksys`, which should return a response like
      {{{
hetdex 12996 1 0 121132 6124 0 12:02 ? 00:00:37 /opt/het/hetdex/bin/tcs_monitor --named-route tcp://192.168.66.99:30000 --system-names pfip_epics_relay --key-filter .*status --db-file /tmp/MonLogs/20220307T120000.pfipEpicsRelay_db
hetdex 13062 1 0 122182 11208 0 12:05 ? 00:00:25 /opt/het/hetdex/bin/tcs_monitor --named-route tcp://192.168.66.99:30000 --system-names tcp://scs-smoco1:10001 --db-file /tmp/MonLogs/20220307T120501.scsMonitor_db
hetdex 12895 1 4 122291 10724 2 12:00 ? 00:12:05 /opt/het/hetdex/bin/event_monitor --named-route tcp://192.168.66.99:30000 --system-names apc,tracker,tcs,pas,pfip,legacy,lrs2,virus,log-relay,event-monitor,thermocube_side1_epics_relay,thermocube_side2_epics_relay,dimmpoller,pfipmaxon --system-filter .* --source-filter .* --key-filter .* --db-file /tmp/MonLogs/20220307T120000.db
hetdex 12996 1 0 121132 6124 0 12:02 ? 00:00:37 /opt/het/hetdex/bin/tcs_monitor --named-route tcp://192.168.66.99:30000 --system-names pfip_epics_relay --key-filter .*status --db-file /tmp/MonLogs/20220307T120000.pfipEpicsRelay_db
hetdex 12895 1 4 122291 10724 2 12:00 ? 00:12:05 /opt/het/hetdex/bin/event_monitor --named-route tcp://192.168.66.99:30000 --system-names apc,tracker,tcs,pas,pfip,legacy,lrs2,virus,log-relay,event-monitor,thermocube_side1_epics_relay,thermocube_side2_epics_relay,dimmpoller,pfipmaxon --system-filter .* --source-filter .* --key-filter .* --db-file /tmp/MonLogs/20220307T120000.db
      }}}
 * on htcs.het.astronomy.utexas.edu
   1. restart the Tcs servers
      a. `sudo service pfipServer start`
      b. `sudo service pasServer start`
      c. `sudo service legacyServer start`
      d. `sudo service trackerServer start`
      e. `sudo service tcsServer start`
   2. verify the servers are running by starting the tcsGui and checking for green lights
   3. the caRepeater will start with the pfipServer
 * on lrs2
   1. enable pressure gauges
      a. `syscmd -l "enable_pressure_gauges()"`
      b. if there is no connection, power off the pressure gauges and ion pumps
      c. restart the lrs2 server
      d. power on the pressure gauges and ion pumps
 * launch rdesktop and connect to mirrormaster.het.astronomy.utexas.edu
   1. log in as guider
   2. verify the Y:\ drive is connected to gracie::\common
   3. launch the `CXRecorder` from the desktop shortcut
   4. on the Devices tab, select both accelerometers
   5. click `Start Recording`
   6. close the remote desktop connection
 * on dome.het.astronomy.utexas.edu
   1. ensure that the DAS server has been restarted
 * on scs-smoco1
   1. ensure that the SCS server has been restarted
      a. on tolauncher, SCS->Restart SCS System
 * on poster
   1. ensure that the het-operations email list is operational
      a. restart sendmail (not sure if this is all that needs to be done; mailman may also need to be started, but mailman won't remain started)
 * on izar
   1. make sure the weather display is shown on the LCD on top of the control room rack
      a. with the keyboard in the rack, click the button to reload the web page
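
The htcs stop and start steps above repeat the same five `sudo service` commands in a fixed order, so they can be wrapped in a small helper. The following is only a sketch, not an existing site tool: the names `HTCS_SERVERS`, `DRY_RUN`, `run`, and `service_all` are hypothetical, and the service order is simply copied from the lists in this procedure. It defaults to a dry run that prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: HTCS_SERVERS, DRY_RUN, run and service_all are hypothetical
# names, not part of the installed HET software.

# Servers on htcs, in the order listed in this procedure.
HTCS_SERVERS="pfipServer pasServer legacyServer trackerServer tcsServer"

# Default to a dry run so nothing is stopped or started by accident.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Apply one action (stop or start) to every server, halting on the first failure.
service_all() {
    action="$1"
    for name in $HTCS_SERVERS; do
        run sudo service "$name" "$action" || return 1
    done
}
```

With the functions loaded in a shell on htcs, `service_all stop` prints the five stop commands for review, and `DRY_RUN=0 service_all stop` would actually run them. The caRepeater and `tcsGui` checks are not covered by this sketch and still need to be done by hand.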