PS1 IPP Czar Logs for the week 2011.08.29 - 2011.09.04

Monday : 2011-08-29

  • 08:00 Roy: Everything downloaded from summit; standard processing looks OK.
  • 09:00 Mark: Set MD01.GR warps and MD10.GR WSdiffs to cleanup. Should be mostly finished with GR cleanup now.
  • 13:00 Another offnight diffim with a large number of detections (pub_id=345096, diff_id=158270) had been running on ippc07 for 15 hr (and on ippc09 for 29 hr before that). Killed the ppMops process and set the run to drop:
    pubtool -dbname gpc1 -updaterun -pub_id 345096 -set_state drop
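
A minimal sketch of how runaway jobs like the ppMops process in the 13:00 entry might be spotted earlier. It is not part of the IPP tools. It assumes text captured from `ps -eo pid=,etimes=,args=` (procps-ng; `etimes` is elapsed seconds), and the 12-hour threshold, the function name and the sample lines are all hypothetical.

```python
def find_long_running(ps_output, name, max_hours):
    """Return (pid, hours) for commands containing `name` that have run
    longer than `max_hours`.

    `ps_output` is assumed to be text from: ps -eo pid=,etimes=,args=
    """
    hits = []
    for line in ps_output.splitlines():
        # Split into pid, elapsed seconds, and the full command line.
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, etimes, args = parts
        if not (pid.isdigit() and etimes.isdigit()):
            continue
        hours = int(etimes) / 3600.0
        if name in args and hours > max_hours:
            hits.append((int(pid), round(hours, 1)))
    return hits


# Hypothetical sample output; the pids and arguments are made up.
sample = """\
 4321  54000 ppMops some_input
 4400   1200 ppMops other_input
 5000  90000 rsync -a src dst
"""

print(find_long_running(sample, "ppMops", 12))  # [(4321, 15.0)]
```

The function only reports candidates. Killing the process and setting the run to drop would remain a manual czar decision, as in the entry above.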
    

Tuesday : 2011-08-30

  • 07:00 Bill: There are lots of faults due to files wanted from ipp013 and ipp014 which are not available. Set diff.revert off to stop the madness.
  • 07:00 Created label for phot.jtrp to keep them out of the way. Chips are faulting and reverting repeatedly due to missing files.
  • 07:37 The update pantasks has more faults than successes (probably due to the down nodes). Restarted pantasks to get a clearer picture of what's happening.
  • 08:00 Roy: Everything copied from summit. Same MD diff failures as yesterday: waiting for rsync to finish copying to ipp014, which should resolve the issue

Wednesday : 2011-08-31

  • 10:01 Serge: Memory usage increase on ippdb00 and ippdb01. Processing stopped around 9:00 and restarted at 10:00
  • 10:30 Bill: reran camRun 255845 to replace corrupted mask file that was causing a destreak failure
  • 11:09 CZW: Made change to ipp user's .tcshrc file to select a nebulous server directly, instead of using the randomization via DNS. I tested it yesterday on my own account, so I don't expect issues.
  • 11:45 CZW: Unstuck stuck exposure: regtool -revertprocessedexp -exp_id 384925
  • 14:45 Serge: All nightly processing finished except for the weird o5804g0186o / o5804g0204o diff case.
  • 16:50 Bill: Set chip, warp, and diff runs that were in state "error_cleaned" back to "goto_cleaned". (These runs were stalling some postage stamp jobs.)

Thursday : 2011-09-01

  • 07:05 Bill: Postage stamp server has finished all pending requests. Set all data from ps_ud labels to be cleaned up. Set all pstampDependents that were in state new to full. This will allow us to try and update any components that were temporarily "not available" or "gone" due to temporary disk outages.
  • 07:10 Bill: queued STS diffs for 2011-08-26 and 2011-08-28
  • 08:17 Bill: Dropped STS magicRuns 195005 and 195083, which fail due to corrupt difference images. They cannot be remade because the input warps have been deleted.
  • 09:45 Serge: Ran cam stage again for cam_id 257085 (corrupted cam.mk fits file)
  • 11:35 Serge: chip.off and warp.off. LAP prevents nightly science execution
  • 11:50 Serge: set STS.nightlyscience priority to 399 (so that 3pi.ns is higher)
  • 14:05 Serge: chip.on and warp.on

Friday : 2011-09-02

  • 01:55 CZW: Noticed registration was hung while checking to see how LAP was going. Ran "regtool -updateprocessedimfile -exp_id 385775 -class_id XY11 -set_state pending_burntool" to twiddle the registration state back after a failed burntool. This unstuck registration and suggests we need a way to auto-revert failed burntool issues.
  • 10:40 Heather is now the czar. I reverted some stacks that had a fault of 4; some of them have already succeeded.
  • 21:00 Mark: taking advantage of some marginal weather and adding more MD02.V3 GR0/refstack jobs to the queue.

Saturday : 2011-09-03

  • 13:00 Mark: Another zero-size file for neb://ipp012.0/gpc1/20101117/o5517g0443o/o5517g0443o.ota20.fits. There is also a third copy that doesn't exist, at
    /data/ippb02.0/nebulous/63/8c/977648223.gpc1:20101117:o5517g0443o:o5517g0443o.ota20.fits
    
    To fix the zero-size file, ran
    cp /data/ipp029.0/nebulous/63/8c/544873317.gpc1:20101117:o5517g0443o:o5517g0443o.ota20.fits /data/ipp012.0/nebulous/63/8c/977647893.gpc1:20101117:o5517g0443o:o5517g0443o.ota20.fits
    
  • 20:00 Mark: 12 warps from 9/02 were stuck for a while in warp_overlap.pl. The RawExp comments are for OT tests like "30Hz, OT half focal plane". Set warp state to drop with a note:
    warptool -dbname gpc1 -updaterun -set_state drop -set_note "OT, 30Hz testing" -warp_id 244936
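
The manual `cp` used in the 13:00 entry to fix the zero-size replica could be sketched as a small filesystem check. This is only an illustration under stated assumptions: it works on plain file paths, knows nothing about the nebulous database, and the function name and demo files are hypothetical. A replica that does not exist on disk (like the third copy above) is only reported, not created.

```python
import os
import shutil
import tempfile


def repair_zero_size(replicas):
    """Overwrite zero-size replicas with the first non-empty replica.

    `replicas` is a list of paths that should all hold the same file.
    Paths that do not exist are reported under "missing" and left alone.
    """
    existing = [p for p in replicas if os.path.isfile(p)]
    missing = [p for p in replicas if p not in existing]
    good = [p for p in existing if os.path.getsize(p) > 0]
    empty = [p for p in existing if os.path.getsize(p) == 0]
    if not good:
        raise RuntimeError("no non-empty replica to copy from")
    for path in empty:
        shutil.copyfile(good[0], path)
    return {"source": good[0], "repaired": empty, "missing": missing}


# Demo on throwaway files (hypothetical names, not real nebulous paths).
demo_dir = tempfile.mkdtemp()
good_copy = os.path.join(demo_dir, "good.fits")
empty_copy = os.path.join(demo_dir, "empty.fits")
missing_copy = os.path.join(demo_dir, "missing.fits")
with open(good_copy, "wb") as f:
    f.write(b"SIMPLE")
open(empty_copy, "wb").close()

report = repair_zero_size([good_copy, empty_copy, missing_copy])
print(len(report["repaired"]), len(report["missing"]))  # 1 1
```

Picking the first non-empty replica as the source mirrors what was done by hand. It does not verify that the source is itself uncorrupted, so a checksum comparison would be a sensible addition where more than one non-empty copy exists.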
    

Sunday : 2011-09-04

  • 21:00 Mark: Discovered that a duplicate set for MD02.GR0.20110904 was being included in MD02.refstack.20110907. Will need to purge the duplicate and redo the stacks.