PS1 IPP Czar Logs for the week 2011.04.11 - 2011.04.17


Monday : 2011.04.11

  • Bill queued STS.2010.a data for destreak and distribution.

Tuesday : 2011.04.12

  • Bill set stackRun 262194 to drop. It fails due to the problem reported in ticket 1427.
  • Roy: 200 or so exposures, through warp by 1:30ish, and distributed by 6am
  • Bill 10:49 Set MD04.deeptest.20110320 to survey lists for WSdiff, magic, destreak, and distribution
  • Bill 11:15 ipp005 panicked. Distribution was using it as a compute node. Set the host to off.
  • Roy: distribution crashed at around 8:20pm. Restarted.

Wednesday : 2011.04.13

  • 14:55 Serge fixed the "PROBLEM" entry in the replication statuses table.
  • 19:13 CZW: finally got load under control on ippc18. Similar to last week, I broke it by running ppStack in a pantasks with extra debugging information on (I forgot to remove one). This caused a massive number of write attempts to my home directory, freezing the system. However, unlike last week, pantasks ignored the first "shutdown now", creating a problem where it would spawn new stacks as I killed them off. Everything's now back to normal.
  • 19:52 CZW: unstuck registration. See ticket #1477.

Thursday : 2011.04.14

  • 05:50 Set all runs with label 'ps_ud%' to goto_cleaned.

Friday : 2011.04.15

  • 15:01 Bill restarted the pstamp pantasks because pcontrol was using a whole cpu on ippc17
  • 21:12 Bill, with advice from Chris, worked around a registration bug by setting a dark's burntool_state to -14. Later Chris finished the fix by setting data_state to full. (See ticket #1477)
  • 22:23 Bill set stdscience to stop in preparation for restarting it. It seems slow, and the large number of old faults is distracting.

Saturday : 2011.04.16

Bill is acting as czar this morning.

  • 07:00 Queued diffs for sts test exposures by hand with: difftool -input_label STS.nightlyscience -template_label STS.nightlyscience -definewarpwarp -dateobs_begin 2011-04-15 -set_workdir neb://@HOST@.0/gpc1/STS.ng/2011/04/15 -set_data_group STS.20110415 -set_label STS.nightlyscience -set_dist_group STS -set_reduction WARPWARP -simple -distance 0.1 -good_frac 0.1
  • 07:12 distribution had a lot of outstanding work and lots of idle nodes. Restarted distribution. Also set the pantasks host ippc15 to off in stdscience (4 instances) and distribution (2 instances).
  • 08:00 slopes of the magic and magicDS graphs in czartool are much steeper since the restart.
  • 08:16 chip 214041 XY17 repeatedly faulting. It's the last one left from last night. Set it to fault 0, quality 42. See ticket 1479.
  • 08:38 repaired typo in the last activity. Now chip run 214041 is done.
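
Note on the 07:00 entry: the hand-typed difftool command embeds the observation date in three places (-dateobs_begin, the workdir path, and the data group). A minimal sketch of a helper that rebuilds the same argument list from a single date, to avoid mismatched dates when queuing by hand. The function name build_difftool_args is hypothetical and not part of the IPP tools, and the assumption that the workdir and data group follow this date pattern for other nights is extrapolated from this one command only. All flags and values are copied from the entry above.

```python
from datetime import date
import shlex


def build_difftool_args(night: date) -> list[str]:
    """Hypothetical helper: rebuild the 07:00 difftool command for a given night.

    Assumes the workdir and data group are derived from the observation
    date in the same way as in the logged command (an assumption).
    """
    dashed = night.strftime("%Y-%m-%d")    # e.g. 2011-04-15
    as_path = night.strftime("%Y/%m/%d")   # e.g. 2011/04/15
    compact = night.strftime("%Y%m%d")     # e.g. 20110415
    return [
        "difftool",
        "-input_label", "STS.nightlyscience",
        "-template_label", "STS.nightlyscience",
        "-definewarpwarp",
        "-dateobs_begin", dashed,
        "-set_workdir", f"neb://@HOST@.0/gpc1/STS.ng/{as_path}",
        "-set_data_group", f"STS.{compact}",
        "-set_label", "STS.nightlyscience",
        "-set_dist_group", "STS",
        "-set_reduction", "WARPWARP",
        "-simple",
        "-distance", "0.1",
        "-good_frac", "0.1",
    ]


if __name__ == "__main__":
    # Print the command for review; this sketch does not execute it.
    print(shlex.join(build_difftool_args(date(2011, 4, 15))))
```

The helper only assembles and prints the command, so it can be checked by eye before being pasted into a shell.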

Sunday : 2011.04.17

  • 16:00 ran stacktool -revertsumskyfile
  • 17:00 started processing for a new STS reference stack. The first draft of the new STS tessellation was unsatisfactory. At Johannes' request changed the size in RA and doubled the overlap to 120 arcseconds. Kept the same tess_id and overwrote the installation. Set previous STS.V3 data to goto_cleaned. Will set to goto_purged later.