PS1 IPP Czar Logs for the week 2013.04.08 - 2013.04.14

Monday : 2013.04.08

  • 10:15 Bill stopped pstamp pantasks to perform a mysqldump of ippc17 ippRequestServer. These have not been done regularly since ippc19 went offline. Will restart it once the backup finishes.
  • 14:15 Bill adjusted the MOPS postage stamp priority. QUB has submitted over 10,000 requests, which are going to block MOPS with the default settings.

Tuesday : 2013.04.09

  • 06:10 Serge: ippdb02 backup on ippc63 and ipp064 complete. Restarting replication on ippdb02.
  • 10:25 Bill: update pantasks died. Started up a new one. Also restarted pstamp pantasks.
  • 12:10 CZW: stopped processing to allow Rita to fix a cabling issue on ipp020 and ipp021.
  • 12:23 haf: I'm cleaning up the faulted isp. We now have a script that finds the faulted ones, gets the proper bytes/md5sum, and changes the db. We don't like this...
  • 13:20 CZW: processing restarted. ipp020 and ipp021 are back up.
  • 15:30 CZW: processing stopped for ippdb00 disk upgrade procedure.
  • 16:30 CZW: all clear from Serge. Restarting pantasks servers and starting processing (except cleanup, which will remain off).
  • 17:20 SC: restarted pstamp and update after ippc17 reboot
  • 20:00 SC: created ipp user on ippdb02. Should be useful for czartool
  • 23:20 Serge: All seems fine
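
For reference, the byte count and md5sum mentioned in the 12:23 entry can be obtained with standard tools. This is only a minimal sketch, assuming GNU coreutils `md5sum` is available; the function name `file_bytes_md5` is hypothetical. The actual IPP script and the db update it performs are not shown in this log and are not reproduced here.

```shell
#!/bin/sh
# Hypothetical helper: print "<bytes> <md5>" for one file.
# Assumes md5sum (GNU coreutils) is on PATH.
file_bytes_md5() {
    f=$1
    [ -f "$f" ] || { echo "not a file: $f" >&2; return 1; }
    bytes=$(wc -c < "$f" | tr -d ' ')      # strip padding that some wc versions add
    md5=$(md5sum "$f" | awk '{print $1}')  # keep only the hash column
    echo "$bytes $md5"
}
```

The printed values could then be compared by hand against what the db records before anything is changed.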

Wednesday : 2013.04.10

  • 12:35 Bill: The pstamp pantasks is running cleanup of old data, which will cause the load on ippc17 to appear high. Scratch that, will wait until MOPS and QUB pstamps are finished.

Thursday : 2013.04.11

  • 12:00 MEH: Gene put stsci06 into neb-host up state after rebooting into 3.7.6 kernel yesterday. MD08.pv1.20130403 running warps only to give it something to read/write (until nightly data starts, unless needed)
  • 19:55 MEH: registration seems to be getting hung up easily lately, manually reverting the problem to get nightly processing going

Friday : 2013.04.12

  • 20:10 MEH: looks like stdscience is due for regular restart. doing before nightly data gets going full force
  • 23:00 MEH: again download fault, manually revert to help along

Saturday : 2013.04.13

  • 10:50 MEH: nightly data finished, no problems with db02 and cleanup so adding MD processing back in (warps to delay until stsci nodes back fully neb-host up)
  • 23:10 MEH: registration stalled waiting for faulted download for ~1.5hr? manually reverted

Sunday : 2013.04.14

  • 08:55 MEH: some missing files in registration due to being in drop state again in pzDownloadExp -- fault 110
  • 10:25 MEH: the 5 stalled exposures have been cleared and downloaded for processing. They had pzDownloadImfile fault 110 and pzDownloadExp state drop, so:
    pztool -dbname gpc1 -revertcopied -exp_name o6396g0289o -inst gpc1 -telescope ps1 -fault 110
    pztool -updatepzexp -exp_name o6396g0289o -inst gpc1 -telescope ps1 -set_state run -summit_id 595332 -dbname gpc1
  • 19:00 MEH: misc MD missed chip cleanups. Many MD nightly science warps not cleaned up from past summer, from the initial plan of keeping all online and on the stsci nodes. Should be cleaned to free up space since the plan has changed; will keep the key fields MD04, 07, 09 (maybe MD08)
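
The manual recovery in the 10:25 entry has to be repeated per stalled exposure, so it can be sketched as a small helper. This is a minimal sketch, not an IPP tool: the function name `pz_recover_cmds` is hypothetical, the `pztool` flags are copied verbatim from the entry above, and the summit_id must be looked up for each exposure. It only prints the commands (a dry run) so they can be reviewed before being run by hand.

```shell
#!/bin/sh
# Hypothetical helper: print the two pztool recovery commands for one exposure.
# Dry run only: nothing is executed.
pz_recover_cmds() {
    exp_name=$1
    summit_id=$2
    if [ -z "$exp_name" ] || [ -z "$summit_id" ]; then
        echo "usage: pz_recover_cmds EXP_NAME SUMMIT_ID" >&2
        return 1
    fi
    # Step 1: revert the faulted (fault 110) download for this exposure.
    echo "pztool -dbname gpc1 -revertcopied -exp_name $exp_name -inst gpc1 -telescope ps1 -fault 110"
    # Step 2: move the exposure from drop state back to run.
    echo "pztool -updatepzexp -exp_name $exp_name -inst gpc1 -telescope ps1 -set_state run -summit_id $summit_id -dbname gpc1"
}
```

For example, `pz_recover_cmds o6396g0289o 595332` prints the two commands used above.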