PS1 IPP Czar Logs for the week 2011.11.28 - 2011.12.02

Monday : 2011.11.28

Mark is czar

  • 10:00 updating ~1600 MD04 chips from the past condor set so they run through destreak and the camera files get distributed
  • 10:40 stdscience not fully loading? Restarting.
  • 14:20 also restarting distribution after the condor_MD04.V3_01 destreak was found running slowly

Tuesday : 2011.11.29

Mark is czar

  • 10:20 restarted distribution
  • 11:08 restarted the postage stamp pantasks. pcontrol was spinning and throughput was low.
  • 13:30 set up SAS staticsky 5-filter distribution with filter='multi' under ps1-lap-staticsky and LAP.ThreePi, but this also triggered re-distribution of the single-filter LAP staticsky. Removing the LAP.ThreePi.20110809 label from distribution for now. Set the wrongly added "new" entries to drop.
  • ~20:00-21:00 odd stack fault=2 for tens of stacks (SAS.123, MD01, MD10). Reverts ok.
  • 22:40 stdscience lagging, restarted.

Wednesday : 2011.11.30

  • 08:30 Mark: 10-20 more stacks remaining from the past night, including a set in the SAS2.123 re-run. Looks like a /local/ipp/tmp dir access problem when running on ipp064, as the disk is read-only. Reverts succeed, but ipp064 is not useful to have in the stack pantasks right now, so removing it.
     Unable to open FITS file /local/ipp/tmp//RINGS.V3.skycell.1315.080.stk.550485.0.conv.im.fits to write image.
     Unable to open FITS file /local/ipp/tmp//RINGS.V3.skycell.1315.090.stk.550782.0.conv.im.fits to write image.
     Unable to open FITS file /local/ipp/tmp//RINGS.V3.skycell.1406.018.stk.550894.0.conv.im.fits to write image.
     Unable to open FITS file /local/ipp/tmp//MD03.V3.skycell.026.stk.551045.0.conv.im.fits to write image.
     Unable to open FITS file /local/ipp/tmp//MD03.V3.skycell.066.stk.551077.0.conv.im.fits to write image.
     Unable to open FITS file /local/ipp/tmp//MD03.V3.skycell.035.stk.551052.0.conv.im.fits to write image.
     Unable to open FITS file /local/ipp/tmp//MD02.V3.skycell.022.stk.550356.0.conv.im.fits to write image.
     Unable to open FITS file /local/ipp/tmp//MD02.V3.skycell.042.stk.550371.0.conv.im.fits to write image.
    
  • Looks like this is also causing diff faults. Removing ipp064 from stdscience and adding it to ignore_hosts in ~/ippconfig/pantasks_hosts.input until fixed.
  • 09:00 17000 postage stamp jobs outstanding. Proceeding slowly. Restarted pstamp pantasks and doubled the number of hosts working on update.
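
A quick way to see which surveys were hit by the "Unable to open FITS file" failures listed in the 08:30 entry is to tally the messages by target directory and filename prefix. The sketch below is only an illustration: the function name `summarize_fits_write_failures` is hypothetical, the regular expression is inferred from the messages quoted in this log rather than from any IPP tool, and the input is assumed to be plain log lines.

```python
import re
from collections import Counter

# Message pattern inferred from the lines quoted in this log (an assumption,
# not taken from IPP source code).
PATTERN = re.compile(r"Unable to open FITS file (\S+) to write image\.")


def summarize_fits_write_failures(lines):
    """Count failed FITS writes per (directory, survey prefix)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # not a FITS write failure
        path = match.group(1)
        # Split off the filename; the logged paths contain a doubled slash,
        # so strip trailing slashes from the directory part.
        directory, _, filename = path.rpartition("/")
        directory = directory.rstrip("/")
        # Survey prefix is the text before the first dot, e.g. RINGS or MD03.
        survey = filename.split(".", 1)[0]
        counts[(directory, survey)] += 1
    return counts


if __name__ == "__main__":
    # Two of the messages from the 08:30 entry, used as sample input.
    sample = [
        "Unable to open FITS file /local/ipp/tmp//RINGS.V3.skycell.1315.080.stk.550485.0.conv.im.fits to write image.",
        "Unable to open FITS file /local/ipp/tmp//MD03.V3.skycell.026.stk.551045.0.conv.im.fits to write image.",
    ]
    for (directory, survey), n in sorted(summarize_fits_write_failures(sample).items()):
        print(directory, survey, n)
```

If every failure points at the same directory, as here with /local/ipp/tmp, that supports a host-level disk problem rather than a problem with particular skycells.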

Thursday : 2011.12.01

  • 09:00 ipp058 set to neb-host repair and taken out of processing while the disk problem is being investigated.
  • 13:45 restarted distribution and pstamp pantasks because pcontrol was using too much CPU. Doubled the number of distribution hosts.

Friday : 2011.12.02

Roy is czar

  • 09:15 restarted stdscience as it seemed sluggish
