PS1 IPP Czar Logs for the week 2012.06.18 - 2012.06.25

Monday : 2012.06.18

  • 21:30 Mark: queued a staticsky run in deepstack for MD07.refstack.20111106 (measurements now use raw/unconv, so it is okay to run before the ppstack convolution bug fix). However, it looks like 25/70 have faulted and need looking into.
  • 23:00 Mark: set up SAS2 chips for ps1-sas distribution for Johannes.

Tuesday : 2012.06.19

Wednesday : 2012.06.20

Serge is czar

  • No data last night; no lap

Thursday : 2012.06.21

Serge is czar

  • No data last night

Friday : 2012.06.22

Mark is czar

  • power is back on at the summit, but the camera is still being pumped down. No data last night or tonight.
  • according to the web status page, PSS has been running for the past couple of days but its jobs have just been stalling. Restarting pstamp and update.
    • log now shows an error with pstamp_save_server_status.pl --update-link, but that is not the problem
    • things are still stalled.
  • Chris warns to keep an eye on replication; it may be a good idea to turn it off once ippb03 reaches ~95% full, just to be safe
  • 13:50 Mark: turning stdscience off to keep daily cleanup from running until PSS is sorted out.
  • for PSS, the chips appear to be a mix of stalled cleaned/update/full states -- manually resetting some to test
    • adding compute3 power to updates, since there is no nightly science tonight while the camera pumps down
    • selected chip updates finished, still working on kicking things
  • 22:10 Mark: PSS is moving again. Was it something I did, or was someone else also monkeying with it? Seems not, so a brief summary: I don't fully understand why it was blocked (not faulting and clearing), but I gave up and put the problem request in a hold state with
    pstamptool -dbname ippRequestServer -dbserver $PSDBSERVER -updatereq -req_id 182918 -set_state hold
    
    • more detail, status at stalled state
      | req_id | name                              | label  | reqType | state | fault | Num Rows | Total Jobs | Completed Jobs | Pending Jobs | Faulted Jobs | Image updates | last state change (UTC) |
      +--------+-----------------------------------+--------+---------+-------+-------+----------+------------+----------------+--------------+--------------+---------------+-------------------------+
      | 182918 | web_59225                         | WEB    | pstamp  | run   |     0 |        1 |       9872 |           4811 |         5061 |         1370 |          1783 |     2012-06-20 20:49:14 | 
      | 182921 | MOPSSTAMPREQsp20120620T233423Z_00 | MOPS   | pstamp  | run   |     0 |        4 |          4 |              1 |            3 |            1 |             0 |     2012-06-20 23:34:59 | 
      | 182922 | MOPSSTAMPREQsp20120621T001932Z_00 | MOPS   | pstamp  | run   |     0 |       14 |         14 |              1 |           13 |            1 |             0 |     2012-06-21 00:20:15 | 
      | 182926 | cheee_20120620T143140             | WEB.UP | pstamp  | run   |     0 |      409 |        409 |            110 |          299 |           70 |             0 |     2012-06-21 00:36:34 |
      
      
    • initially restarted pstamp and update, and started getting errors related to an update for the suspected problem PSS request
      config error for: chip_imfile.pl --threads @MAX_THREADS@ --exp_id 123499 --chip_id 55143 --chip_imfile_id 3148661 --class_id XY52 --uri neb://ipp033.0/gpc1/20100117/o5213g0177o/o5213g0177o.ota52.fits --camera GPC1 --run-state update  --deburned 0 --outroot neb://ipp033.0/gpc1/MD04.20100208/o5213g0177o.123499/o5213g0177o.123499.ch.55143 --redirect-output --dbname gpc1 --verbose
      
      gpc1/MD04.20100208/o5213g0177o.123499/o5213g0177o.123499.ch.55143.XY52.log
      gpc1/MD04.20100208/o5213g0177o.123499/o5213g0177o.123499.ch.55143.XY52.log.update
      
      Unable to parse camera.
       -> psMetadataLookupMetadata (psMetadata.c:1148): I/O error
           Couldn't find PPIMAGE.PATTERN in the metadata.
       -> pmFPAfileDefineOutputForFormat (pmFPAfileDefine.c:187): I/O error
           Can't find file rule PPIMAGE.PATTERN!
       -> ppImageParseCamera (ppImageParseCamera.c:139): I/O error
           Unable to generate output file from PPIMAGE.PATTERN
      --> very old data
      
    • tried some manually triggered cleanup+updates for QUB and MOPS; all ran fine, but pstamp/update never recognized them and so did not continue
    • tried changing the label priority with labeltool; it seemed to change in -listlabels but not on the PSS status page, even after restarting
    • tried tracing through, but gave up and just put the request on hold. This probably created a lot of cases for manual cleanup in the future.
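
For reference, the stalled-request check done by eye on the status table above could be sketched roughly as follows. This is only an illustration, not an existing IPP tool: the function names, the shortened column names, the 24-hour threshold, and the "now" timestamp are all assumptions; the row values are copied from the table.

```python
from datetime import datetime

# Rows copied from the status table above; column names are shortened
# here for illustration only (hypothetical, not the PSS page's own names).
STATUS = """\
| req_id | name | label | reqType | state | fault | rows | total | completed | pending | faulted | updates | last_change |
| 182918 | web_59225 | WEB | pstamp | run | 0 | 1 | 9872 | 4811 | 5061 | 1370 | 1783 | 2012-06-20 20:49:14 |
| 182921 | MOPSSTAMPREQsp20120620T233423Z_00 | MOPS | pstamp | run | 0 | 4 | 4 | 1 | 3 | 1 | 0 | 2012-06-20 23:34:59 |
| 182922 | MOPSSTAMPREQsp20120621T001932Z_00 | MOPS | pstamp | run | 0 | 14 | 14 | 1 | 13 | 1 | 0 | 2012-06-21 00:20:15 |
| 182926 | cheee_20120620T143140 | WEB.UP | pstamp | run | 0 | 409 | 409 | 110 | 299 | 70 | 0 | 2012-06-21 00:36:34 |
"""


def parse_status(text):
    """Turn a pipe-delimited status table into a list of dicts keyed by header."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip().startswith("|")]
    header = [c.strip() for c in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[1:]:
        cells = [c.strip() for c in ln.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


def stalled(rows, now, min_hours=24):
    """Requests still in 'run' with pending jobs and no state change for min_hours.

    Returns (req_id, pending_jobs, hours_since_last_change), most pending first.
    """
    out = []
    for r in rows:
        changed = datetime.strptime(r["last_change"], "%Y-%m-%d %H:%M:%S")
        age_h = (now - changed).total_seconds() / 3600
        if r["state"] == "run" and int(r["pending"]) > 0 and age_h >= min_hours:
            out.append((int(r["req_id"]), int(r["pending"]), round(age_h, 1)))
    return sorted(out, key=lambda t: -t[1])


# Assumed reference time (UTC), chosen only for this illustration.
now = datetime(2012, 6, 22, 22, 10)
for req_id, pending, age_h in stalled(parse_status(STATUS), now):
    print(req_id, pending, age_h)
```

Sorting by pending jobs puts the large WEB request (182918) first, which is the one that was eventually put on hold.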

Saturday : 2012.06.23

Sunday : 2012.06.24