PS1 IPP Czar Logs for the week 2012.01.09 - 2012.01.14

Monday : 2012.01.09

Mark is czar

  • 07:00 nightly science mostly finished. seeing a few remaining XY17 with fault 4 again. reverting and moving forward, including LAP.
  • 11:20 restarting stdscience after fixing the SSdiff trigger time to be ~noon HST rather than 3am.
  • 14:10 since magic not required now, setting up SMFs from MD06.GR0 for distribution to ps1-md-GR-cat. will slowly add MD07,04 as other fields with full exposure processing since 4/2009.
  • 16:00 started remaining MD06.GR0 nightly stacks in the deepstack pantasks, changed -min_num from 4 to 2, edge examples of 2-3 look okay. will do rest of MD06.GR0 nightly stacks this way for comparison.

Tuesday : 2012.01.10

Mark is czar

  • 07:10 data still downloading (780 exposures last night). MD06 stack failing w/ bad PSF (similar for MD06.GR0 nightly stack but with only 2 inputs). Looks like 4 LAP chips are stuck with corrupted extensions; will see if fixed after nightly science finishes.
  • 10:00 nightly science finished, set MD06 stack to quality 42 since all PSF inputs rejected (fault 5). LAP looks to be stalled in warp with skycell_id having quality>0 (not user 42) and not being updated. Bill suggested running warptool -dbname gpc1 -tofullskyfile -warp_id -skycell_id on the stalled ones and they have cleared. 900 stacks to do now.
  • 11:30 setting up another MD06.GR0 nightly stack run under deepstack pantasks and distributing MD04.GR0 SMFs (ignore the MD04 stacks on czar page).
  • 12:30 shifting the SSdiff time earlier to 10am HST, stopping stdscience and restarting after rebuild of ippTasks (since LAP is tied up in stack still).
  • 13:30 Bill reports that the repair of the damaged M31.v4.2010 distribution runs has finally finished. All data from 2010 and 2011 have now been processed without magic and distributed.
  • 13:40 MOPS's postage stamps aren't progressing as quickly as we would like. Restarted pstamp and update pantasks and raised the priority for the MOPS label and lowered QUB.
  • 13:50 Mark mistakenly sent pre-condor processing of MD04.GR through distribution. must be sitting around somewhere but don't see it on datastore. also don't see the proper condor_MD04 SMFs either. many things going to datastore right now, LAP, M31, and rcserver.makefileset.run has limit of 5 so will be a bit.
  • 14:20 Chris implementing LAP changes now -- remaining magic/destreak will continue until lap_id waiting for them clear through cleanup.

Wednesday : 2012.01.11

  • 10:00 bill repaired a few bad instance files
  • 10:29 bill noticed a LAP stack failed due to a corrupt file on ipp015. So much for the "we don't get corrupt files anymore" statement. Repaired with tools/runwarpskyfile and reverted the stack.
  • 12:30 serge: bzipping log files in ~ipp/cleanup/logs since the home directory is 99% full (for f in `find . -type f`; do echo $f; bzip2 $f; done). Before:
    Filesystem           1K-blocks      Used Available Use% Mounted on
    /dev/sdb1            961228084 895112064  17288428  99% /export/ippc18.0
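
The backtick loop above word-splits the output of find, so it would trip on any log file name containing whitespace, and it would also try to recompress files that already end in .bz2. A minimal sketch of a more defensive variant follows, assuming a POSIX find and a standard bzip2; the function name compress_logs is hypothetical and not part of any IPP tool.

```shell
# Hypothetical helper (not an IPP tool): compress every regular file under a
# directory with bzip2, leaving files that already end in .bz2 alone.
compress_logs() {
    dir=${1:-.}
    # "-exec ... {} +" hands file names to bzip2 as separate arguments,
    # so names containing spaces or newlines are passed through intact.
    # "--" tells bzip2 that everything after it is a file name.
    find "$dir" -type f ! -name '*.bz2' -exec bzip2 -- {} +
}
```

It could be called as compress_logs ~ipp/cleanup/logs; unlike the loop in the log entry, it does not echo each file name as it goes.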
    

Thursday : 2012.01.12

Roy is Czar in the morning

  • 08:00: not much data last night (32 exposures), all downloaded
  • 09:30 Mark is continuing to run 1000s of MD06 nightly stacks in the deepstack pantasks.
  • 09:40 Mark is kicking LAP, 6 XY26 chips extn problem has halted all, one had no instances (neb://ipp028.0/gpc1/20100617/o5364g0282o/o5364g0282o.ota26.fits).
  • 13:50 Serge: bzipping logs in cleanup finished at 18:33 yesterday. After:
    ipp@ippc18:/home/panstarrs/ipp>df .
    Filesystem           1K-blocks      Used Available Use% Mounted on
    /dev/sdb1            961228084 799222380 113178112  88% /export/ippc18.0
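
For reference, the space recovered by the compression can be worked out from the "Used" columns of the two df outputs (Wednesday's before, Thursday's after). The sketch below only does that subtraction; the variable names are illustrative, and the GiB figure uses integer division, so it is rounded down.

```shell
# "Used" column (1K-blocks) from the two df outputs for /export/ippc18.0
used_before=895112064   # Wednesday, before bzipping the cleanup logs
used_after=799222380    # Thursday, after the bzip2 run finished

# Difference in 1K-blocks, then converted to whole GiB (integer division)
freed_kib=$((used_before - used_after))
freed_gib=$((freed_kib / 1024 / 1024))

echo "freed ${freed_kib} KiB (about ${freed_gib} GiB)"
# prints: freed 95889684 KiB (about 91 GiB)
```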
    

Friday : 2012.01.13

Roy is morning Czar

Saturday : 2012.01.14

  • Bill repaired 9 bad instances of raw files (xy26) and marked three rawImfiles with no remaining good files as ignored: 229878 XY31, 229889 XY04, 229719 XY32
  • 12:00 Mark has MD06.GR0 on pause in distributing all the nightly stacks until he can look at some in more detail

Sunday : 2012.01.15

  • 00:04 Mark: LAP looks fully stalled by 13 chips, mix of extn issue and no instances.
  • 18:00 kicking more LAP ota extn errors and missing instances
  • 17:55 Bill was fixing ota errors at the same time as Mark.
  • 18:30 Johannes K reports that M31 distribution is not complete. The raw files from 2011 have not been fully distributed yet and also the variance images (background corrected warps) were not received.
  • 18:40 Bill dropped distRuns 1626613 - 1626616. They are "dummy diffRuns" that have no diffSkyfiles. We should not set the dist_group for runs like this to ThreePi.