PS1 IPP Czar Logs for the week 2011.02.14 - 2011.02.20

Monday : 2011.02.14

  • 13:35 (heather) added magic test labels into distribution. All diffs/etc. are done in heather's stdsci; heather is preparing for the switchover to magictesting on the current trunk.
  • 16:40 (bill) Bill is running cleanup out of his build. The new code is working and has been checked into the production tree. Once last night's processing completes he will stop things, update the production build, and start cleanup from there.
  • 16:40 (bill) dist.cleanup is off in order to give destreak cleanup time to catch up. The backlog is blocking postage stamp processing.

Tuesday : 2011.02.15

  • 10:00 (eam) ipp005 is still rebuilding. I set it to neb down to protect it from activity. I tried to umount it, but could only remount it read-only.
  • 11:09 (heather) heather added MD04.jtrp, MD05.jtrp, MD06.jtrp to ipp's stdscience. These have low priority. Heather also added (for diff of one item only) magictest.3Pi.ipp20101215.20110215. Heather is currently queueing/processing/kicking mopstestsuite.20110214. The processing for that mopstest goes into heather's stdscience, but the publishing will be done by ipp.
  • 17:30 (eam) stopped and restarted the stdscience pantasks, partly to clear out errors, but mostly to include the addnoise reduction class for the bigboss tests. (boss.MD*.20110215 labels)

Wednesday : 2011.02.16

  • 11:45 bill queued 8 3pi exposures from 2010-12-30 for reprocessing. They didn't get run through magic then.
  • 12:52 bill reset the warp book. It was full of entries in state DONE again.
  • 16:05 bill added ThreePi.missing to distribution. This is 8 exposures from 12/30/2010 processed through stdscience this morning for Bertrand.
  • 16:45 bill ran warptool -revertwarped on 3pi data. We have at least 1 corrupted file.

Thursday : 2011.02.17

  • 18:28 bill turned dist.revert back on since ipp005 is back
  • 21:00 bill restarted pstamp and update so that we don't have to look at the ridiculously large fault counts

Friday : 2011.02.18

  • 12:40 CZW: Bill pointed out that ipp005 and ippdb00 both had loads approaching 150. This is probably due to the disk balance code still not being very well tuned. It seems to have caused ippdb00/apache to go into a segfault loop. I stopped disk balance, restarted apache, and everything seems to be working fine now.

Saturday : 2011.02.19

  • 04:00 Bill went to bed early last night so he woke up at 3:30 am. Man the moon is bright. Can't see hardly any stars. Decided to check on processing and RATS lots of destreak failures on czartool. And even more failures in destreak.revert in pantasks. Bill's rule in action ("If it isn't tested regularly, it doesn't work"). While the changes I made to destreak in the trunk worked fine for chip stage (which I tested thoroughly), they fell over completely for camera stage. There were 3 bugs which I have now fixed.
  • 04:22 There is a persistent distribution failure due to a corrupt file. It's a destreaked file so these are more troublesome to fix. However I have been working on this. Setting magicDSRun for diff 112000 to be restored... OH CRUD the magicDSRun has been cleaned up... Never mind. Setting distRun 428739 to be cleaned. Magic and cleanup are going to be the death of me.
  • 04:30 Queued STS raw images for 20100524 to be destreaked.
  • later queued dist runs for this (survey task can't be used because it needs -use_alternate)
  • 08:39 Queued STS raw images for 20100603 to be destreaked.
  • 08:40 Ran the following command to put a dead end diff skycell to sleep:
difftool -updaterun -set_state wait -diff_id 112362 -set_note 'skycell.077 faults with data error set to wait for debugging'

Sunday : 2011.02.20

  • 02:00 pstamp pantasks crashed around 7pm last night. Restarted.
  • 16:15 fixed corrupt warp output perl ~bills/ipp/tools/ --redirect-output --warp_id 163766 --skycell_id skycell.1735.140
  • 16:20 changed diff.revert back to only reverting fault 2