Update Processing Versus Magic - Problems and potential solutions

There will not be enough disk space in the IPP cluster to retain all data products for the PS1 survey.

The static sky images and catalogs will be retained but most of the processed images will be deleted sometime after they are processed and perhaps released. The deletion process is known as cleanup. During cleanup enough information is retained so that the pixels can be regenerated. The process of recreating the images is known as update.

When dealing with destreaked images, performing 'updates' becomes somewhat complicated. This note discusses the issues involved and describes the approach that we propose to implement.

Overview

When a Run (chipRun, warpRun, or diffRun) is cleaned, its state in the database is set to 'cleaned'. The run is queued for update by changing its state to 'update'. After the run has been updated, its state is set to 'full'.
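The state changes described above can be sketched as a small transition table. This is an illustrative model only: the names ALLOWED_TRANSITIONS and advance are hypothetical and are not part of the ippTools, and the real database may use additional states that this note does not describe.

```python
# Hypothetical model of the run lifecycle described above; not the IPP schema or API.
ALLOWED_TRANSITIONS = {
    "full": {"cleaned"},     # cleanup deletes the pixels
    "cleaned": {"update"},   # the run is queued for update
    "update": {"full"},      # update processing has regenerated the pixels
}


def advance(state: str, new_state: str) -> str:
    """Return new_state if the transition is allowed, else raise ValueError."""
    if new_state not in ALLOWED_TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition {state!r} -> {new_state!r}")
    return new_state


state = "full"
for step in ("cleaned", "update", "full"):
    state = advance(state, step)
print(state)  # full
```

In this sketch a cleaned run cannot jump straight back to 'full'; it has to pass through 'update' first.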

When a run has dependencies, the runs it depends on must be in the 'full' state before it can be updated. For example, a warpRun needs its input chipRuns to be in the 'full' state. The various pending queries of the ippTools enforce this.

We assume that the update is being performed because some user wants to see the pixels. For example, the postage stamp server needs to queue update processing when stamps from cleaned images are requested. This means that the resulting images must have been de-streaked, i.e. detected satellite streaks must be masked.

There are two ways that the de-streaked requirement can be met.

  • A. By performing the update processing on input images that have been de-streaked
  • B. By re-running the destreak program on images containing the streak pixels

Plan A is much easier to implement in the context of normal IPP operations because the ultimate source of all pixels, the raw images, will have had their streaks removed. (The excised pixels will be saved in separate image files.)

We have done some experiments running de-streaked chip processed images through warp and difference processing. There are some differences between these results and the original runs, which included the streak pixels:

  • Areas that were not included in the magic streak detection phase were masked
  • The masked areas of the streaks were widened somewhat TODO: quantify

Implementation of Plan B would require several enhancements to the IPP and to operations:

  • excised pixels from destreaking would need to be saved at all stages (deleted during cleaning)
  • psModules would need the ability to recreate the original image by merging the excised pixels with the de-streaked image

This solution is interesting and we can implement it later if we find that Plan A doesn't meet our requirements.
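For reference, the merge step that Plan B would require can be sketched as follows. This is not psModules code. It assumes, purely for illustration, that the excised-pixel image has the same shape as the de-streaked image and holds NaN everywhere except at the pixels that were removed; the function name merge_excised is hypothetical.

```python
import math


def merge_excised(destreaked, excised):
    """Rebuild the pre-destreak image from two equally shaped 2-D lists.

    Assumption: 'excised' is NaN except where a pixel was removed by destreaking.
    """
    if len(destreaked) != len(excised):
        raise ValueError("images have different numbers of rows")
    restored = []
    for d_row, e_row in zip(destreaked, excised):
        if len(d_row) != len(e_row):
            raise ValueError("images have different row lengths")
        # Take the excised value where one was saved, else keep the destreaked pixel.
        restored.append([d if math.isnan(e) else e for d, e in zip(d_row, e_row)])
    return restored


nan = float("nan")
destreaked = [[1.0, nan], [3.0, 4.0]]   # one pixel was masked by destreak
excised = [[nan, 2.0], [nan, nan]]      # the saved value of that pixel
print(merge_excised(destreaked, excised))  # [[1.0, 2.0], [3.0, 4.0]]
```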

Specific Implementation Tasks

*Run.magicked and *file.magicked values

Currently, when an image file is de-streaked, the column 'magicked' in the corresponding database table is set to the magic_ds_id value of the corresponding magicDSRun. If this value is non-zero, the image has been destreaked and thus can be sent through the distribution system or used as input to postage stamp requests. When destreak processing is finished for the entire run, the Run's magicked bit is set as well.

In the case where, say, a chipProcessedImfile is updated using a destreaked raw image, we'd like to propagate the value from the rawImfile. However, this is the magic_ds_id of the run that destreaked the raw file. It doesn't make much sense to use that value in the chip run's data.

We will replace those values with the magic_id of the magic streak detection run that generated the list of streaks that were removed from the image. This will be consistent in all cases.

The magicked column needs to be managed properly through the cleanup and update process. Currently the value is unchanged when the run is cleaned; it is zeroed only when the magicDSRun is cleaned up. (The magicDSRun controls the destreak process, and the associated magicDSFile table contains the location of the backup and excised pixel images.) This is a bug, since nothing enforces that the magicDSRuns be cleaned up.

We propose that when a run with 'magicked > 0' is cleaned up, the value is set to -1. Tests in the system that check destreaked status will need to be changed

from
        if magicked
to
        if magicked > 0
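A minimal sketch of why the test has to change, using hypothetical helper names (is_destreaked, magicked_after_cleanup) that do not exist in the IPP: once -1 is used to mark a cleaned run, a plain truth test would treat the cleaned run as destreaked, because -1 is non-zero.

```python
# Hypothetical helpers illustrating the proposed 'magicked' convention.
CLEANED_SENTINEL = -1  # proposed marker: run was destreaked, then cleaned


def is_destreaked(magicked: int) -> bool:
    """Proposed test: only positive values mean the pixels are destreaked."""
    return magicked > 0


def magicked_after_cleanup(magicked: int) -> int:
    """Value to store when a run is cleaned up; 0 is left unchanged."""
    return CLEANED_SENTINEL if magicked > 0 else magicked


value = magicked_after_cleanup(1234)  # 1234 stands in for an arbitrary id
print(value, bool(value), is_destreaked(value))  # -1 True False
```

The old style of test (bool(value)) reports True for the cleaned run, while the proposed test reports False.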

During update processing the check for the status of the inputs will be changed

from
        *Run.state = 'full'
to
        (*Run.state = 'full' and *Run.magicked >= 0)

This ensures that the updated images will be restored to the state they were in before cleanup. Accepting magicked == 0 allows runs that have been cleaned but not yet magicked to be updated. (It also allows update to work transparently for cameras that don't require magic de-streaking, without a lot of tedious SQL editing by the ipp tools.)
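The revised input check can be written out as a small predicate. This is a sketch of the condition above in Python rather than the SQL used by the ipp tools; the function name input_ready and the example values are hypothetical.

```python
def input_ready(state: str, magicked: int) -> bool:
    """Mirror of (*Run.state = 'full' and *Run.magicked >= 0)."""
    return state == "full" and magicked >= 0


cases = [
    ("full", 1234),     # destreaked input: usable
    ("full", 0),        # never magicked (or camera without magic): usable
    ("full", -1),       # was destreaked before cleanup, not yet restored: blocked
    ("cleaned", 1234),  # pixels not available: blocked
]
for state, magicked in cases:
    print(state, magicked, input_ready(state, magicked))
```

Only the first two cases pass, which matches the intent that an input previously destreaked must be destreaked again before downstream runs consume it.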

chipRun update when rawExp has not been de-streaked

Once we get into operations, the raw images will be de-streaked and the excised pixels saved. Since we want to be able to easily reprocess data as the IPP analysis software evolves, we have not been de-streaking the raw images.

In order to fit in with the processing flow outlined in this note, we will apply the streak mask after the chip processed images have been updated. The mask images with the destreak mask bits are saved in the camera stage. The camera stage files are relatively small and are not cleaned up.