PS1 IPP Czar Logs for the week 2011-12-26 - 2012-01-01


Monday : 2011-12-26

Bill is czar today

  • Only got 33 exposures last night. Wind and humidity.
  • LAP is proceeding. Postage stamp server is busy.
  • marked all ps_ud_% data to be cleaned.
  • 09:42 doubled number of hosts doing update processing.
  • 10:31 restarted stdscience and distribution. The pcontrols were spinning.
  • 12:16 pstamp is done with the backlog of outstanding jobs. Restarted update with normal number of hosts.

Tuesday : 2011-12-27

Roy is czar.

  • 08:00: no data last night.
  • 11:45: Bill stopped stdscience to "integrate a change to psastro"
  • 16:13: yeeeehaw! Heather stopped pantasks to pull in a pile of addstar changes (into the trunk, into ipp-dvo, and into the gpc1 db). ipp's ipp is unchanged for the moment; the changes only affect addstar-related activities.

Wednesday : 2011-12-28

Roy is czar.

  • 08:00: no data last night

Thursday : 2011-12-29

  • no data last night (hexapod)
  • Bill restarted distribution; this 'kicked' LAP
  • Heather started condooooooor! on JTMD stuff

Friday : 2011-12-30

Saturday : 2011-12-31

  • 06:26 Gene rebooted ipp058
  • 07:45 or so Bill repaired several bad raw instances and skipped some chips with empty burntool tables.
  • 11:30 Bill restarted stdscience. pcontrol was running amok, making communication between pantasks_client and server error-prone.
  • 16:30 Mark: MD04 staticsky had been stalled in distribution for the past 24 hrs; restarted, and it cleared along with 480 LAP.
  • 18:00 Mark: restarted deepstack pantasks to rerun MD06 skycell.087-i that failed with poor quality
  • 22:00 Mr Fussy noticed a forlorn MD06.GRO stack failing that had a broken input file. Then noticed many many MD06 broken files (we really need to bite the bullet and replicate transient input files) and rebuilt the chips and warp.
    • Mark has been slowly working through the remaining MD06xxxx that pop up from distribution and the annoying missing .kernel files. Please resist the temptation to revert the MD06 faults for now :)

Sunday : 2012-01-01

  • 11:00 Mark restarted distribution pantasks to see if that unsticks the 2 remaining MD06 stacks (and 10 LAP) for distribution. Nope.
  • 13:08 Bill cleaned up the latest stuck LAP files. Every few hours some exposures do not finish because of various problems. Here are the steps that I have been following to get LAP moving again:
    • corrupt files on XY26: fix with tools/repair_bad_instance --repair --exp_id <exp_id> --class_id <class_id>. To check without fixing, leave off --repair.
    • rawImfiles with no good instances: also handled by tools/repair_bad_instance (sets rawImfile.ignored).
    • missing burntool tables: chiptool -dropprocessedimfile -set_quality 40 -exp_id <exp_id> --class_id <class_id>
    • warpSkyfiles with no remaining instances: warptool -updatewarpskyfile -set_quality 42 -fault 0 -warp_id <warp_id> -skycell_id <skycell_id>
    • sometimes the following is also needed: warptool -tofullskyfile -warp_id <warp_id> -skycell_id <skycell_id> to get the warpRun to go to full.
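For anyone repeating the steps above, here is a rough Python sketch that only assembles the command lines; it does not run anything. The tool names and flags are copied verbatim from the list above (including the mixed -exp_id / --class_id spelling in the chiptool line, which has not been checked). The helper function names and the example IDs are hypothetical and are not part of the IPP tools.

```python
import shlex


def repair_bad_instance(exp_id, class_id, repair=False):
    """Corrupt files, or rawImfiles with no good instances.

    Without repair=True the command only checks, as noted in the log.
    """
    cmd = ["tools/repair_bad_instance"]
    if repair:
        cmd.append("--repair")
    cmd += ["--exp_id", str(exp_id), "--class_id", str(class_id)]
    return cmd


def drop_chip_missing_burntool(exp_id, class_id):
    """Missing burntool tables: flags copied as written in the log."""
    return ["chiptool", "-dropprocessedimfile", "-set_quality", "40",
            "-exp_id", str(exp_id), "--class_id", str(class_id)]


def drop_warp_skyfile(warp_id, skycell_id, to_full=False):
    """warpSkyfiles with no remaining instances.

    Returns a list of commands; the optional second one is the
    -tofullskyfile call that is "sometimes also needed".
    """
    ids = ["-warp_id", str(warp_id), "-skycell_id", str(skycell_id)]
    cmds = [["warptool", "-updatewarpskyfile", "-set_quality", "42",
             "-fault", "0"] + ids]
    if to_full:
        cmds.append(["warptool", "-tofullskyfile"] + ids)
    return cmds


def show(cmd):
    """Render an argv list as a shell-quoted string for review."""
    return shlex.join(cmd)


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    print(show(repair_bad_instance(123456, "XY26")))
    print(show(drop_chip_missing_burntool(123456, "XY26")))
    for c in drop_warp_skyfile(1000, "skycell.087", to_full=True):
        print(show(c))
```

Printing the commands first and pasting them by hand keeps a human in the loop, which seems sensible given that these calls set quality flags and drop data from processing.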