IPP Progress Report for the week 2010.10.18 - 2010.10.22


Eugene Magnier

Serge Chastel

* IPP czar
* Crash of ps1sc db replication
* Still tried to replicate nebulous.

  • InnoDB cannot be replicated easily
  • InnoDB is proprietary software: backup/replication needs the HotBackup software
  • A HotBackup license costs about $1200 but is free for a one-month trial (I got a temporary license).

* Started to look at IPP-MOPS ICD/code.

Heather Flewelling

  • czar Monday & Tuesday - slow processing because we had problems with ipp018 and fsck, and ipp006 going down.
  • addstar
    • cleanup of the ipp004 fill-up mess (merged db is fine; the latest minidvodb was removed, and the database was updated to make addstar think it had never processed the last minidvodb)
    • restarted addstar (a number of timeout faults - these need to be investigated)
  • ifa liaison (tomo)
  • publishing for wainscoat - 2 nights of OSS + 2 extra images
    • there is no easy way to reprocess (and diff) OSS. I wrote some scripts (not checked in) to diff exactly the same exposures as before.
  • publishing bugs found (one fixed), instructions on how to use publishing: http://svn.pan-starrs.ifa.hawaii.edu/trac/ipp/wiki/CreatingNewDatastoreForPublishing
  • minidvodbcopy

Roy Henderson

* PSPS

  • prepping for PSPS software engineer interviews (reading, Googling...)
  • full day of interviews and discussions about candidates
  • hassles restarting loading after IPP IP changes

* IPP

  • czartool
    • roboczar now emailing (just me, for now) when a stage of processing is 'stuck', i.e. plateaued while (pending - faults) > 0
    • more work on 'analysis' code. Can now figure out when processing starts, stops and when it is stuck for a given period
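
The 'stuck' condition described above can be sketched roughly as follows. This is only an illustration, not the czartool code: the `Sample` fields, the `is_stuck` name and the use of minutes are all assumptions made for the example. A stage is treated as stuck when its processed count has stayed flat for at least a given period while (pending - faults) > 0.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One snapshot of a processing stage (hypothetical schema)."""
    minute: int      # time of the snapshot, in minutes
    processed: int   # items completed so far
    pending: int     # items not yet completed
    faults: int      # pending items currently in a fault state


def is_stuck(samples: List[Sample], period: int) -> bool:
    """Return True if the processed count has been flat for at least
    `period` minutes while (pending - faults) > 0 the whole time."""
    if not samples:
        return False
    latest = samples[-1]
    if latest.pending - latest.faults <= 0:
        return False  # nothing left that could still make progress

    # Walk backwards to find where the current plateau began.
    plateau_start = latest.minute
    for s in reversed(samples):
        if s.processed != latest.processed or s.pending - s.faults <= 0:
            break
        plateau_start = s.minute
    return latest.minute - plateau_start >= period
```

With snapshots taken every 10 minutes, a call such as `is_stuck(samples, 20)` would flag a stage whose processed count has not moved over the last 20 minutes even though unfaulted work remains.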

* Other stuff

  • some time fixing desktop machine after MySQL killed it
  • day off on Monday

Bill Sweeney

  • Fixed bug in data store reported by MPG where distribution runs were getting cleaned up but the filesets were left on the data store. Fixed this by reworking the error handling in dsreg.
  • wrote a script to repair the damage caused by that bug.
  • dealt with a postage stamp request storm.
  • 2 days processing czar.
  • Changed the magic streak detection code to save final results in nebulous so that this data can be replicated. It is very expensive to calculate.
  • Wrote a script to copy existing magic outputs to nebulous.
  • set up OSS data to get distributed. (we forgot to do this when the new labels were added)
  • fixed distribution bug that was preventing cleaned camera runs from getting distributed.
  • Found data on ipp004 to be deleted or moved to solve the out-of-space problem on that node.
  • reworked destreak cleanup to allow for a real update process. The primary purpose of this was to preserve the masking statistics, which are saved in rows that were getting deleted from the database. Came up with a strategy to recover those data.
  • various hours recovering from errors caused by troublesome machines.

Chris Waters