From Pixels to PSPS

The following is my understanding of how the data from a PS1 exposure makes its way from the camera through the IPP and finally into PSPS.

Special attention is paid to the names and identifiers as they make their way through the various systems.

On the Summit of Haleakala

The summit systems take an exposure for one of the various PS1 surveys. 60 FITS files, one for each GPC1 OTA, are written. Several keywords identifying the contents are placed in the header of each file.

  • OBS_MODE is set to identify the "survey" for the exposure. e.g. 3PI, M31, MD, STS, etc.
  • OBJECT provides a more fine-grained distinction. e.g. MD01
  • CMTOBS is set to a string that further identifies the exposure. e.g. 'MD03 B g,r N55173 g MD03 dither 2: -753,-399'

Note: since OBJECT is not yet being populated we are currently queuing processing based on the CMTOBS.
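The fallback described in this note can be sketched as follows. This is only an illustration: the function name queue_key, the use of a plain dict for the header, and the idea that the first whitespace-separated token of CMTOBS names the field (as it does in the example above) are my assumptions, not the actual IPP logic.

```python
def queue_key(header):
    """Pick a string to queue on: OBJECT if populated, else the first token of CMTOBS."""
    obj = (header.get("OBJECT") or "").strip()
    if obj:
        return obj
    comment = (header.get("CMTOBS") or "").strip()
    # Assumption: the first token of the comment names the field, e.g. 'MD03'.
    return comment.split()[0] if comment else None


header = {
    "OBS_MODE": "MD",
    "OBJECT": "",  # not yet populated by the summit systems
    "CMTOBS": "MD03 B g,r N55173 g MD03 dither 2: -753,-399",
}
print(queue_key(header))
```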

At MHPCC the IPP begins the analysis process

The IPP downloads the image files, assigns an integer exposure identifier exp_id, examines the images, and loads metadata about the exposure into the IPP database. When this is complete the exposure is said to be registered.

Some IPP task looks at the metadata and, based upon the obs_mode and comment, queues the exposure for chip processing.

As part of this process the IPP needs to identify which survey the exposure belongs to and set various parameters in the exposure's row in the chipRun table. These parameters will usually propagate to subsequent pipeline stages.

  • tess_id is set to the sky tessellation that the pixels will be warped to.
  • dvodb is set to the DVO database associated with the survey.
  • dist_group is set to identify the data for distribution.
  • label is set to a value used by pantasks to select the exposure for processing (label may change later).
  • data_group is set to the same value as label. Unlike label, this value is intended to stay constant over time.

At this time these values are being set manually by the IPP operators.
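To make the relationship between label and data_group concrete, here is a minimal sketch. The class and function names are hypothetical and the values are placeholders; only the field names come from the list above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ChipRunParams:
    tess_id: str
    dvodb: str
    dist_group: str
    label: str
    data_group: str


def new_chip_run(tess_id, dvodb, dist_group, label):
    # data_group starts out equal to label.
    return ChipRunParams(tess_id, dvodb, dist_group, label, data_group=label)


def relabel(params, new_label):
    # label may change later; data_group is deliberately left alone.
    return replace(params, label=new_label)


run = new_chip_run("example_tess", "example_dvodb", "example_dist", "example_label")
rerun = relabel(run, "example_label.reprocess")
print(rerun.label, rerun.data_group)
```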

Chip Stage

The pixels are run through detrend processing, which applies corrections for dark current and flat field and, depending on the filter, possibly for fringing.

The 64 cells of each ota are combined into a mosaic and saved as a single image for each ota.

psphot performs photometric analysis on each detrended image and identifies sources (detections).

The parameters for these sources are saved in a 'cmf' file, one for each ota. Each cmf file has a primary header and two extensions.

  • .psf contains the sources
  • .deteff contains data associated with detection efficiency

Each source is given a row in the 'psf' table and is labeled with a value named IPP_IDET. This value starts at zero for the first source on the chip and increments. Note that only the position of the source on the ota is computed. The RA and DEC values are set to NULL.

Note: All discussion of sources or detections in this note refers to "PSF matched" (point) sources. Extended sources are not considered here.

When chip processing is complete, a row is inserted into the chipProcessedImfile table. Each chipProcessedImfile has an associated chip_imfile_id. This value is placed into the fits header of the image and cmf files with the keyword IMAGEID.

The headers for these files also contain keywords copied from the raw image and various values added at the chip stage.

A keyword related to IMAGEID is SOURCEID. This value is formed from the project associated with the IPP database and the processing stage.

    SOURCEID = (project_id(dbname) << 3) | $PS_TABLE_CHIP

In the current setup project_id = 4 and PS_TABLE_CHIP = 1, so SOURCEID = (4 << 3) | 1 = 33 for gpc1 chip images.
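The same arithmetic as a small sketch. The function names are mine, and the inverse function assumes the stage code always fits in the three low bits, which is what the shift by 3 implies.

```python
PS_TABLE_CHIP = 1  # stage code for chip images, per the text above


def source_id(project_id, stage=PS_TABLE_CHIP):
    """Combine the project id and the stage code into a SOURCEID."""
    return (project_id << 3) | stage


def split_source_id(sid):
    """Recover (project_id, stage), assuming stage occupies the low 3 bits."""
    return sid >> 3, sid & 0b111


print(source_id(4))         # 33
print(split_source_id(33))  # (4, 1)
```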

Given SOURCEID and IMAGEID one can uniquely identify an image's files. This is used later in DVO.

Camera Stage

At the camera stage psastro determines the astrometric fit for the entire exposure.

The 60 cmf files from the chip stage are combined into a single 'smf' file. The smf file contains several extensions. The primary is an empty image. The fits header values from the input chip images are passed along here. (XXX: we have 60 inputs for this but only one output; is this an application of "concepts averaging"?).

The primary header also contains the overall (XXX: need proper term here) astrometric transformation for the exposure.

The next extension is called MATCHED_REFS. (I don't know what this is used for)

Following the first two extensions are 3 fits extensions for each ota

  • .hdr is an empty fits image. It carries the chip level astrometric transformation (XXX: use proper terminology)
  • .psf contains the sources copied from the cmf file. Here the RA and DEC are filled in by applying the astrometry to the chip coordinates
  • .deteff contains the deteff extension computed at the chip stage

Note: otas may be omitted from the smf file if the quality of the analysis at either the chip or camera stage is not good enough.
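The layout described above can be sketched by listing the extension names one would expect to find. Several things here are my assumptions rather than anything confirmed in this note: that the otas are named XYxy on an 8x8 grid with the four corners missing (which gives 60), that the per-ota extensions are named by prefixing the ota name (e.g. XY33.psf), and that the primary is called PRIMARY.

```python
def ota_names():
    """Assumed OTA names: an 8x8 grid without the four corners, 60 in total."""
    corners = {(0, 0), (0, 7), (7, 0), (7, 7)}
    return [f"XY{x}{y}" for x in range(8) for y in range(8) if (x, y) not in corners]


def smf_extensions(present=None):
    """Expected extension order; pass `present` to model an smf with omitted otas."""
    otas = ota_names() if present is None else present
    names = ["PRIMARY", "MATCHED_REFS"]
    for ota in otas:
        names += [f"{ota}.hdr", f"{ota}.psf", f"{ota}.deteff"]
    return names


print(len(ota_names()))       # 60
print(len(smf_extensions()))  # 182 = 2 + 3 * 60
```

Because otas may be omitted, code that reads an smf file should look extensions up by name rather than by position.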

The smf files are the inputs to DVO which manages the catalogs for IPP.

Waiting for MAGIC

Since PS1 images need to have satellite streaks removed before data can be released, we cannot insert the sources into dvo until the exposure is run through magic.

During the 'de-streak process' any sources whose pixels are masked with the bit STREAK are removed from the smf file.

Note that this leaves holes in the sequence of IPP_IDET values in the psf tables.
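A toy illustration of how the holes arise. The per-source flags field and the numeric value of the STREAK bit are placeholders I made up for the sketch; the point is only that surviving sources keep their original IPP_IDET.

```python
STREAK = 0x1  # placeholder bit value, not the real mask definition


def destreak(sources, streak_bit=STREAK):
    """Drop sources flagged with the STREAK bit; IPP_IDET is not renumbered."""
    return [s for s in sources if not (s["flags"] & streak_bit)]


sources = [{"IPP_IDET": i, "flags": 0} for i in range(5)]
sources[2]["flags"] |= STREAK  # pretend source 2 sits on a satellite streak

kept = destreak(sources)
print([s["IPP_IDET"] for s in kept])  # [0, 1, 3, 4]
```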

Insertion into DVO

Once magic processing for the exposure is complete, the sources are processed by the DVO program addstar.

addstar inserts new measurements into a given DVO database. As noted above the DVO database is selected based on the survey to which the exposure belongs.

The algorithm used by DVO can be described in the following pseudo-code. (It's not actually implemented in this order).

foreach image in the exposure's smf file {
    Add an entry into the Images table.

        newDVOImage.IMAGE_ID  = ++maxImageID in Images table
        newDVOImage.EXTERN_ID = chip.hdr.IMAGEID
        newDVOImage.SOURCE_ID = chip.hdr.SOURCEID
        newDVOImage.CCDNUM = atoi(first string of digits in chip.hdr.EXTNAME) // CCDNUM(XY33) = 33

    foreach source in the chip.psf extension of the smf file {

        if source doesn't match some set of cuts discard it (FilterStars)

        if (!(aveObject = find_matches(source))) {
            // No existing object matches this source.
            Find catalog file that contains the new object's coordinates
            // create a new row in the 'averages' table DVO_AVERAGE_PS1_V1
            newAveObject.RA  = source.RA
            newAveObject.DEC = source.DEC
            newAveObject.OBJ_ID = ++maxObjIDInCatalog
            newAveObject.CAT_ID = catalogID
            newAveObject.EXTERN_ID = CreatePSPSObjectID(source.RA, source.DEC)  // algorithm provided by PSPS
            aveObject = newAveObject
        }
        // create an entry in the 'measurements' table DVO_MEASURE_PS1_V1
        newMeasure.D_RA  = aveObject.RA  - source.RA
        newMeasure.D_DEC = aveObject.DEC - source.DEC
        newMeasure.AVE_REF = aveObject.OBJ_ID
        newMeasure.DET_ID  = source.IPP_IDET
        newMeasure.IMAGE_ID = newDVOImage.IMAGE_ID
        newMeasure.CAT_ID = catalogID
        newMeasure.EXTERN_ID = CreatePSPSDetectionID(MJD of observation, newDVOImage.CCDNUM, newMeasure.DET_ID)
                            // algorithm provided by PSPS
    }
}
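Two pieces of the pseudo-code are easy to try out in isolation: the CCDNUM extraction and the find-or-create step for average objects. The sketch below is not addstar. The function names are mine, the match test is a crude box in RA and DEC (it ignores cos(DEC) and RA wrap-around), the match radius is an arbitrary parameter, and the PSPS id algorithms and catalog lookup are left out entirely.

```python
import re


def ccdnum_from_extname(extname):
    """Integer value of the first run of digits in EXTNAME, e.g. 'XY33' -> 33."""
    m = re.search(r"\d+", extname)
    if m is None:
        raise ValueError(f"no digits in EXTNAME {extname!r}")
    return int(m.group())


def add_source(averages, measures, source, image_id, match_radius_deg):
    """Attach one source to a matching average object, creating the object if needed."""
    ave = next(
        (a for a in averages
         if abs(a["RA"] - source["RA"]) <= match_radius_deg
         and abs(a["DEC"] - source["DEC"]) <= match_radius_deg),
        None,
    )
    if ave is None:
        # No match: the source seeds a new average object at its own position.
        ave = {"OBJ_ID": len(averages) + 1, "RA": source["RA"], "DEC": source["DEC"]}
        averages.append(ave)
    measures.append({
        "D_RA": ave["RA"] - source["RA"],
        "D_DEC": ave["DEC"] - source["DEC"],
        "AVE_REF": ave["OBJ_ID"],
        "DET_ID": source["IPP_IDET"],
        "IMAGE_ID": image_id,
    })
    return ave
```

Note that a source which creates its own average object gets D_RA = D_DEC = 0, since the object takes its coordinates from that source.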

Transferring Detections to PSPS

Periodically the IPP will compute new bundles of data for PSPS.

The format of the data will be fits tables placed on a data store web site with some xml descriptor files.

The data for a particular exposure are represented by 3 extension types. XXX: get names for the extensions.

Each extension corresponds to a table in the PSPS SQL database.

  • FrameMeta Describes an exposure
  • ImageMeta Describes the otas used for the exposure
  • Detection List of detections for the exposure

The files will be created by a program that reads the original smf file and the corresponding DVO database.

In the following we ignore the fits files used for transport and instead focus on the correspondence between the various identifiers.

PSPS Table FrameMeta
PSPS column name    type        IPP Value inserted
frameID             BIGINT      rawExp.exp_id
surveyID            TINYINT     value from PSPS defined list for given survey
filterID            TINYINT     value from PSPS defined list computed from smf header's FILTER
                                g = 1, r = 2, i = 3, z = 4, y = 5, w = 6
cameraID            SMALLINT    1 (GPC1)
cameraConfigID      SMALLINT    TBD
telescopeID         SMALLINT    1 (PS1)
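A sketch of filling in a FrameMeta row. The filter codes come from the table above; everything else is assumed for illustration, in particular that the band letter is the first character of the FILTER keyword (e.g. "g.00000"), that surveyID is looked up elsewhere and passed in, and that the TBD column is simply left as None.

```python
FILTER_IDS = {"g": 1, "r": 2, "i": 3, "z": 4, "y": 5, "w": 6}


def filter_id(filter_keyword):
    """Map the smf FILTER keyword to the PSPS filterID (assumes band is the first char)."""
    band = filter_keyword.strip()[:1].lower()
    return FILTER_IDS[band]


def frame_meta_row(exp_id, survey_id, filter_keyword):
    return {
        "frameID": exp_id,
        "surveyID": survey_id,          # from the PSPS defined list, looked up elsewhere
        "filterID": filter_id(filter_keyword),
        "cameraID": 1,                  # GPC1
        "cameraConfigID": None,         # TBD
        "telescopeID": 1,               # PS1
    }


print(frame_meta_row(123456, 0, "g.00000")["filterID"])  # 1
```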

PSPS Table ImageMeta
PSPS column name    type        IPP Value inserted
imageID             BIGINT      (rawExp.exp_id * 100) | dvoImage.CCDNUM
frameID             BIGINT      FrameMeta.frameID
ccdID               SMALLINT    DVOImage.CCDNUM
photoCalID          INT         DVOImage.PHOTCODE
filterID            TINYINT     FrameMeta.filterID
detectorID          SMALLINT    integer value from some configuration file that identifies the actual
                                OTA part. Computed from string in ota's header in the smf file.
calibModNum         SMALLINT    TBD based on negotiation between IPP and PSPS
dataRelease         TINYINT     0   // placeholder. Actual value inserted by PSPS
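The imageID row combines a decimal shift (* 100) with a bitwise OR. Those two do not mix cleanly, so it may be worth checking which was intended. The sketch below, with function names of my own, shows that the expression as written differs from a plain decimal packing (exp_id * 100 + CCDNUM), and that only the latter can be split back into its parts with a simple divmod. I am assuming CCDNUM is always below 100, which holds for names like XY33.

```python
def image_id_as_written(exp_id, ccdnum):
    """The table's expression taken literally."""
    return (exp_id * 100) | ccdnum


def image_id_decimal(exp_id, ccdnum):
    """Decimal packing: the last two digits are the CCDNUM (my reading of the intent)."""
    return exp_id * 100 + ccdnum


def split_image_id(image_id):
    """Inverse of image_id_decimal: returns (exp_id, ccdnum)."""
    return divmod(image_id, 100)


print(image_id_as_written(1, 33))  # 101
print(image_id_decimal(1, 33))     # 133
print(split_image_id(133))         # (1, 33)
```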

PSPS Table Detection
PSPS column name    type        IPP Value inserted
objID               BIGINT      aveObject.EXTERN_ID     // Computed per PSPS defined algorithm
detectID            BIGINT      measurement.EXTERN_ID   // Computed per PSPS defined algorithm
ippObjID            BIGINT      (aveObject.CAT_ID << 32) | aveObject.OBJ_ID
ippDetectID         BIGINT      (measurement.IMAGE_ID << 32) | measurement.DET_ID
filterID            TINYINT     ImageMeta.filterID
surveyID            TINYINT     FrameMeta.surveyID
imageID             BIGINT      ImageMeta.imageID
activeFlag          TINYINT     0       // placeholder updated by psps
assocDate           DATE        TBD     // Quote from Jim
                                        "This should be the detection is first reported to the ODM.
                                        If you are reporting it it has to be associated to an object
                                        by our agreement. So, for a given detection it'd be the
                                        reporting date" Perhaps this should be the date that the fits
                                        table is made?
historyModNum       SMALLINT    0
dataRelease         TINYINT     0       // This is PSPS data release. Not to be confused with calibration
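ippObjID and ippDetectID both pack two DVO values into one 64 bit integer: the catalog or image id in the upper 32 bits and the object or detection id in the lower 32 bits. A minimal sketch of the packing and its inverse follows. The function names are mine, and the range checks assume the result has to fit in a signed 64 bit BIGINT.

```python
MASK32 = 0xFFFFFFFF


def pack_ids(high, low):
    """(high << 32) | low, e.g. high = CAT_ID or IMAGE_ID, low = OBJ_ID or DET_ID."""
    if not 0 <= low <= MASK32:
        raise ValueError("low part must fit in 32 bits")
    if not 0 <= high < 2**31:
        raise ValueError("high part must keep the result within a signed BIGINT")
    return (high << 32) | low


def unpack_ids(packed):
    """Recover (high, low) from a packed id."""
    return packed >> 32, packed & MASK32


ipp_obj_id = pack_ids(3, 7)    # CAT_ID = 3, OBJ_ID = 7
print(unpack_ids(ipp_obj_id))  # (3, 7)
```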


Since DVO allows for the catalogs to be re-organized, will aveObject.CAT_ID be constant?


relphot, relastro, and reporting "re-calibrations" to PSPS.