IPP Progress Report for the week 2011-09-26 - 2011-10-02

Eugene Magnier

  • vacation

Serge Chastel

  • condor: chip to stack implementation
  • partially tested it on JT's MD

Heather Flewelling

  • helped Serge with condor stuff (queueing of JT stacks for testing, answered many questions)
  • helped Roy with dvodb stuff
  • investigated GALFit (via Jim) as a sanity check against our code
  • czar 1 day
  • cleaned up ipp006 when I discovered it was full (why was it full?)
  • sick 1 day

Roy Henderson

  • huge speed-up for ipptopsps by using multiple 'clients'. This included:
    • lots of headaches with TRANSACTIONS and LOCK TABLES to protect the critical section of the database when queuing multiple clients
    • with a mysql LOCK in place, it was necessary to speed up many of my queuing queries to avoid deadlock
    • each process now stores its PID and hostname for unique log-naming
    • lots of monitoring to check things were running smoothly
    • some tricky clean-up after a screw-up before the mutex was properly implemented
  • other ipptopsps development:
    • added coordinate constraints to queuing query so we can load certain regions of old 3PI data
    • datastore class now takes an ipptopsps database object in its constructor so it can update the database itself
    • changes to cleanup code: now only deletes from local disk batches that have been merged
  • other:
    • duplicate exp_ids in DVO broke merge, some cleanup work with Sue was required
    • more cleanup work with Sue later in the week when DXLayer stopped and local files were deleted
    • filled up /var with mysql logs: Gavin kindly moved the mysql data dir to the data partition and reduced the logging retention time
    • some help getting DXLayer to start again: too many subdirs
    • changes to PSPS news page to reflect current status after the merge
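
The multi-client queuing and per-process log naming described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the ipptopsps code: it uses Python's sqlite3 module, with BEGIN IMMEDIATE standing in for the MySQL LOCK TABLES critical section, and the table name queue, its columns, and the functions log_name and claim_next_batch are all hypothetical.

```python
import os
import socket
import sqlite3


def log_name(prefix="ipptopsps"):
    """Build a per-process log file name from the hostname and PID."""
    return f"{prefix}.{socket.gethostname()}.{os.getpid()}.log"


def claim_next_batch(conn, client_id):
    """Claim the lowest unclaimed batch for client_id; return its id or None.

    The select and the update run inside one write transaction, so two
    clients cannot claim the same batch. (Hypothetical schema: a table
    `queue` with columns `batch_id` and `client`.)
    """
    # BEGIN IMMEDIATE takes SQLite's write lock before the SELECT runs.
    # It is a stand-in here for MySQL's LOCK TABLES.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT batch_id FROM queue "
            "WHERE client IS NULL ORDER BY batch_id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE queue SET client = ? WHERE batch_id = ?",
                (client_id, row[0]),
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return None if row is None else row[0]
```

Keeping the locked section down to one short select and one update reflects the point made above: the longer a client holds the lock, the longer every other client waits. The connection is assumed to be opened with isolation_level=None so that the explicit BEGIN/COMMIT statements control the transaction.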

Mark Huber

  • Finished office setup with Chris and Roy
  • Czar Thursday and Friday
  • Processing throughput
    • added another test monitor to ganglia - iostats for disk activity on ipp053
    • sat up with pantasks overnight and found strange interruption behavior in processing that was reducing the rate by nearly 50% between 2 and 6am. Working with Gavin and Cindy to track it down.
  • MD.GR0
    • still adding to documentation summary
    • all remaining MD.V3 tessellations set up/checked (MD03,04,05,06,08,09) and configured for processing, except for MD08 and MD09 as they are still active.
    • MD03-zy exposures reprocessed for reference stack, y reference stack finished
    • MD04 past pre-V3 altstack warps retired, and queue set up for i-band exposures the following week.
  • returning to ppSub auto convolution direction, re-setup of test set.

Bill Sweeney

  • medical leave for surgery
  • start work on reprocessing the 2010 M31 data with the new tessellation.

Chris Waters

  • LAP: defined new areas for processing, and finished testing of new code. Identified common failures that are caused by reprocessing:
    • chip stage: due to bad shuffles in the past, some nebulous keys do not have two good instances. This causes errors when the bad instance is selected for processing. I've developed a workaround, and will be implementing a permanent fix for this problem in the future.
    • warp stage: as we are now updating data instead of launching new processing, we can occasionally create warp skycells that are unusable in stacking because they do not have a PSF. This appears to be a result of the destreak process masking the small number of PSF stars on the image. This case is now detected and the quality of these skycells is set to reflect the bad data quality.
    • stack stage: we occasionally orphan stacks by cleaning up the data. This happens because the stack faults, the LAP code detects that all stacks have been attempted, and triggers the cleaning. As these faulted stacks may not ever complete correctly due to code issues, this seems to be the correct solution for the moment.
  • Disk space: worked on increasing the reliability of the shuffle code. No great improvement, although there is some evidence that the shuffle is slowly freeing space on the fullest disks.
  • Stacking: Looked into how the stack PSF is determined, as we have good evidence that the output PSF in the convolved stacks is larger than expected based on the set of input warps. No conclusion yet.