Monday : 2012-08-20

  • 09:15 (Serge): stdscience apparently crashed. Restarted.
  • 10:10 (Serge): Recovered gpc1/20100527/o5343g0315o/o5343g0315o.ota26.fits. Temporarily set ippb00 to up in nebulous to manually replicate an instance on ippb00.2.

Tuesday : 2012-08-21

Serge is czar

  • 06:30 (Serge): Nightly processing almost complete. Just a few 3pi at the warp stage.
  • 10:40 (Serge): LAP
    • Lost:
      • gpc1/20090628/o5010g0074o/o5010g0074o.ota25.fits (XY25 82622 553606)
      • gpc1/20090628/o5010g0066o/o5010g0066o.ota44.fits (XY44 82614 553693)
      • gpc1/20090628/o5010g0077o/o5010g0077o.ota24.fits (XY24 82625 553724)
      • gpc1/20090628/o5010g0077o/o5010g0077o.ota31.fits (XY31 82625 553724)
    • Recovered:
      • gpc1/20100629/o5376g0035o/o5376g0035o.ota03.fits
      • gpc1/20100629/o5376g0038o/o5376g0038o.ota46.fits
      • gpc1/20100626/o5373g0186o/o5373g0186o.ota56.fits
      • gpc1/20100629/o5376g0045o/o5376g0045o.ota05.fits
      • gpc1/20100629/o5376g0041o/o5376g0041o.ota64.fits
      • gpc1/20100629/o5376g0042o/o5376g0042o.ota44.fits
      • gpc1/20100629/o5376g0042o/o5376g0042o.ota64.fits
      • gpc1/20100629/o5376g0054o/o5376g0054o.ota05.fits
    • Fixed:
      • gpc1/20100702/o5379g0103o/o5379g0103o.ota41.burn.tbl
      • gpc1/20100702/o5379g0120o/o5379g0120o.ota76.burn.tbl
      • gpc1/20100702/o5379g0104o/o5379g0104o.ota52.burn.tbl
      • gpc1/20100702/o5379g0119o/o5379g0119o.ota05.burn.tbl
      • gpc1/20100702/o5379g0105o/o5379g0105o.ota41.burn.tbl
      • gpc1/20100702/o5379g0122o/o5379g0122o.ota17.burn.tbl
      • gpc1/20100627/o5374g0246o/o5374g0246o.ota33.burn.tbl
      • gpc1/20100627/o5374g0237o/o5374g0237o.ota30.burn.tbl
      • gpc1/20100627/o5374g0204o/o5374g0204o.ota24.burn.tbl
      • gpc1/20100702/o5379g0155o/o5379g0155o.ota41.burn.tbl
  • 11:00 (Serge): All 4 ATRC nodes shut down (A/C check there)
  • 14:35 (Serge): The 4 ATRC nodes are back. All partitions are set to repair except ippb00.2, which is up.
  • 15:00 (Serge): LAP
    • Fixed:
      • gpc1/20100702/o5379g0107o/o5379g0107o.ota76.burn.tbl
      • gpc1/20100702/o5379g0125o/o5379g0125o.ota31.burn.tbl
      • gpc1/20100702/o5379g0126o/o5379g0126o.ota53.burn.tbl
      • gpc1/20100702/o5379g0138o/o5379g0138o.ota41.burn.tbl
      • gpc1/20100702/o5379g0152o/o5379g0152o.ota43.burn.tbl
      • gpc1/20100702/o5379g0159o/o5379g0159o.ota53.burn.tbl
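
The file keys logged above share a visible pattern: camera, observation date, exposure name, then the exposure name again followed by the OTA number and an extension (.fits or .burn.tbl). When tallying lost, recovered, and fixed files, it may help to split a key into those parts. The following is a minimal Python sketch that assumes this pattern holds for every entry; parse_key, LogKey, and the regular expression are illustrative names inferred from the entries in this log, not part of IPP or nebulous.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the keys in this log (an assumption, not an IPP spec):
#   <camera>/<YYYYMMDD>/<exposure>/<exposure>.ota<NN>.<fits|burn.tbl>
KEY_RE = re.compile(
    r"^(?P<camera>\w+)/(?P<date>\d{8})/(?P<exp>o\d{4}g\d{4}o)/"
    r"(?P=exp)\.ota(?P<ota>\d{2})\.(?P<ext>fits|burn\.tbl)$"
)


class LogKey(NamedTuple):
    camera: str
    date: str
    exposure: str
    ota: str
    ext: str


def parse_key(key: str) -> Optional[LogKey]:
    """Split a logged file key into its parts; return None if it does not match."""
    m = KEY_RE.match(key.strip())
    if m is None:
        return None
    return LogKey(m["camera"], m["date"], m["exp"], m["ota"], m["ext"])


if __name__ == "__main__":
    print(parse_key("gpc1/20090628/o5010g0074o/o5010g0074o.ota25.fits"))
    print(parse_key("gpc1/20100702/o5379g0159o/o5379g0159o.ota53.burn.tbl"))
```

The backreference to the exposure name means a key whose directory and file name disagree is rejected rather than silently accepted, which is a cheap sanity check when copying keys by hand.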

Wednesday : 2012-08-22

Serge is czar

  • 06:45 (Serge): Nightly processing almost complete (at warp stage for 10ish exposures)
  • 09:30 (Serge): LAP
    • Recovered:
      • gpc1/20100626/o5373g0166o/o5373g0166o.ota56.fits
      • gpc1/20100626/o5373g0168o/o5373g0168o.ota56.fits
      • (15:55) gpc1/20100626/o5373g0170o/o5373g0170o.ota56.fits
  • 16:05 (Serge): Sluggish stdscience: restarted

Thursday : 2012-08-23

Serge is czar

  • 09:15 (Serge): Added WSdiff for OSS to stdscience.
  • 10:00 (Serge): LAP
    • Recovered:
      • gpc1/20100627/o5374g0131o/o5374g0131o.ota26.fits
      • gpc1/20100512/o5328g0422o/o5328g0422o.ota15.fits
    • Lost:
      • gpc1/20090628/o5010g0078o/o5010g0078o.ota24.fits (o5010g0078o XY24 82626 555189)
      • gpc1/20090628/o5010g0078o/o5010g0078o.ota25.fits (o5010g0078o XY25 82626 555189)
      • gpc1/20090628/o5010g0067o/o5010g0067o.ota45.fits (o5010g0067o XY45 82615 555322)
      • gpc1/20090628/o5010g0076o/o5010g0076o.ota24.fits (o5010g0076o XY24 82624 555638)
      • gpc1/20090628/o5010g0075o/o5010g0075o.ota24.fits (o5010g0075o XY24 82623 555688)
      • gpc1/20090628/o5010g0065o/o5010g0065o.ota45.fits (o5010g0065o XY45 82613 555710)
      • gpc1/20090628/o5010g0068o/o5010g0068o.ota30.fits (o5010g0068o XY30 82616 555794)
      • gpc1/20090628/o5010g0068o/o5010g0068o.ota45.fits (o5010g0068o XY45 82616 555794)
    • Fixed:
      • gpc1/20100629/o5376g0164o/o5376g0164o.ota73.burn.tbl
      • gpc1/20100629/o5376g0197o/o5376g0197o.ota46.burn.tbl
      • gpc1/20100629/o5376g0198o/o5376g0198o.ota50.burn.tbl
  • 10:45 (Serge): ATRC nodes are down (ippb0[0-3]) for A/C maintenance
  • 11:15 (Bill): shut down the unused deepstack pantasks. I will be taking it over for some tests.
  • 14:50 (Serge): ATRC nodes are up (all partitions in "repair" state except ippb00.2, which is up)
  • 15:45 (Serge): LAP
    • Recovered:
      • gpc1/20100624/o5371g0097o/o5371g0097o.ota64.fits
      • gpc1/20100626/o5373g0143o/o5373g0143o.ota26.fits
      • gpc1/20100626/o5373g0120o/o5373g0120o.ota16.fits
      • gpc1/20100626/o5373g0154o/o5373g0154o.ota50.fits
    • Fixed:
      • gpc1/20100626/o5373g0120o/o5373g0120o.ota56.burn.tbl
  • 17:00 (Bill): restarting stdscience

Friday : 2012-08-24

  • 11:25 (Serge): LAP
    • Fixed:
      • gpc1/20100627/o5374g0078o/o5374g0078o.ota16.burn.tbl
    • Recovered:
      • gpc1/20100626/o5373g0077o/o5373g0077o.ota26.fits
  • 11:45 (Serge): Restarted stdscience after making some changes:
    • survey.WSdiff: added an explicit target label to the arguments list;
    • stdscience/input: MOPS will have OSS WSdiffs produced with label OSS.trail_fitting.test.nightlyscience;
    • publishing/input: OSS.trail_fitting.test.nightlyscience will be published to client 12, i.e. IPP-MOPS-TEST-2.

Saturday : 2012-08-25

  • 14:30 (Serge):
    • GPC1 mysql replication broken;
    • Restarted publishing and stdscience;
    • MOPS OSS not published

Sunday : 2012-08-26

  • 06:45 (Serge): MOPS OSS not published (it looks like survey.add.publish OSS OSS.nightlyscience 5 NULL and survey.add.publish OSS OSS.trail_fitting.test.nightlyscience 12 NULL can't coexist). Changed stdscience/input and restarted stdscience.
  • 08:33 (Bill): Looks like stdscience was restarted but not initialized. Now it is. Also, czartool says that there are stacks left to be processed, but gpc1 disagrees.
  • 10:08 (Bill): czartool was not getting updated because it was using the replicated gpc1 database on ippdb03, which is not up to date. I don't know how to fix that, so for now I restarted czarpoll with czarconfig.xml changed to point to ippdb01.