DVO Table Structures

The DVO shell is the easiest way to explore data contained in a DVO database. However, involved analysis usually requires custom-designed tools, and for these tasks the DVO shell may not be convenient. This document discusses the structure of the DVO database tables, with the intent of outlining how one might work with these tables using external tools. I give particular emphasis to working with DVO tables in IDL.

All of the DVO tables are saved as multi-extension FITS files. The primary HDU (extension 0) of the FITS file is usually a dummy, and most of the interesting information is in the extension that follows it (extension 1). To read one of these files into an IDL data structure, use the MRDFITS routine from the IDL Astronomy User's Library:

IDL> data = mrdfits(fileName, 1, header)

Note that this article is concerned only with reading data from the DVO tables, and not with changing their contents. In my opinion, given the number of dependencies and relationships between different quantities in these tables, it is best to use the IPP to update the DVO database.
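By default, MRDFITS returns a binary table as an array of structures, with one element per row and one tag per column. The sketch below uses a small made-up structure array in place of a real DVO table (the three tags are only illustrative, not a real table layout) to show some common ways of inspecting such a variable.

```idl
;- Stand-in for: data = mrdfits(fileName, 1, header)
;- The tags below are illustrative only.
data = replicate({ra:0D, dec:0D, nmeasure:0}, 4)

print, n_elements(data)    ;- number of rows
print, n_tags(data)        ;- number of columns
print, tag_names(data)     ;- column names
help, data, /structure     ;- column types, with values from the first row

col = data.ra              ;- one column, as an ordinary array
row = data[2]              ;- one row, as a scalar structure
```

Note that rows of the IDL array are zero-indexed, which matters for the row-number cross references between tables described later.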

File Hierarchy

The top level of a DVO database contains three files and many folders:

beaumont% ls
Images.dat     n0000  n2230  n4500  n6730  s0000  s2230  s4500  s6730
Photcodes.dat  n0730  n3000  n5230  n7500  s0730  s3000  s5230  s7500
SkyTable.fits  n1500  n3730  n6000  n8230  s1500  s3730  s6000  s8230

Each of the folders contains science information for a specific subset of the sky, and has 0, 1, or many sets of the same 4 files:

beaumont% ls n0000/
0148.cpm  0148.cpn  0148.cps  0148.cpt  0149.cpm  0149.cpn  0149.cps  0149.cpt

Let us first discuss the 3 files at the top of the hierarchy, and then move on to the 4 types of files in the subdirectories.

Top Level Files

The top level of the DVO database has 3 data files: Images.dat, Photcodes.dat, and SkyTable.fits.

Images.dat

This file contains information about each 'image' in the DVO database. A single exposure from a telescope is often broken into multiple images. For example, one Megacam exposure comprises 37 images: one for each of the 36 chips, plus one dummy image which summarizes the exposure.

The header summarizes the data fields in the Images.dat file. Some useful quantities:

  • IMAGE_ID: An integer uniquely identifying this image. This is the 1-indexed row number in the Images.dat table
  • PHOTCODE: This image's photcode (see Photcodes.dat discussion below)
  • NAME: A string giving the filename
  • EXPTIME: Exposure time, usually in seconds (but check the header)
  • NSTAR: Number of stars detected in the image
  • NX / NY: Image dimensions, in pixels

In addition, there are a number of astrometry-related keywords like CRVAL, etc. However, I think these are defined in a non-standard way. Luckily, there is more direct information about each object's location in other tables.

Photcodes.dat

Each filter / chip combination is associated with a unique photcode. References to photcodes are found in the Images.dat table above, and in the .cpm tables discussed below. The Photcodes.dat file contains information about all of the photcodes recognized by the IPP.

The photcodes table format is simple:

IDL> p = mrdfits('Photcodes.dat', 1, h)
MRDFITS: Binary table.  21 columns by  818 rows.

IDL> help, p[150], /structure
** Structure <8d1ec2c>, 21 tags, length=108, data length=104, refs=2:
   CODE            INT            204
   NAME            STRING    'MEGACAM.g.04'
   TYPE            STRING    ''
   DUMMY           STRING    ''
   C_LAM           INT          26460
   C_LAM_ERR       INT              0
   X_ERR           INT              0
   K               FLOAT         -0.150000
   C1              LONG                 0
   C2              LONG                 0
   EQUIV           LONG                 1
   NC              LONG                 1
   X               FLOAT     Array[4]
   ASTROM_ERR_SYS  FLOAT           0.00000
   ASTROM_ERR_SCALE
                   FLOAT           0.00000
   ASTROM_ERR_MAG_SCALE
                   FLOAT           1.00000
   ASTROM_POOR_MASK
                   INT              0
   ASTROM_BAD_MASK INT          14472
   PHOTOM_ERR_SYS  FLOAT           0.00000
   PHOTOM_POOR_MASK
                   INT              0
   PHOTOM_BAD_MASK INT              0
  • CODE: The photcode value which other tables reference via their PHOTCODE data field. Note that this is NOT the same as the row number in the Photcodes.dat table.
  • NAME: A more instructive label. In this case, this photcode refers to CCD4 on Megacam, using the g filter.
  • C_LAM, C_LAM_ERR, X_ERR, K, C1, C2, EQUIV, NC, X: Constants which translate between instrumental magnitudes and AB magnitudes. For examples of their use, see Ohana/src/libdvo/src/dvo_photcode_ops.c in the IPP.

There are a number of bit masks for each photcode. During reduction, the IPP sets a number of quality flags, which get copied into the PHOT_FLAGS data field in the .cpm files. The photcode fields ASTROM_BAD_MASK, ASTROM_POOR_MASK, PHOTOM_BAD_MASK encode which of those bits indicate failures in an object's astrometry or photometry. See examples at the end for working with these values in IDL.
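Because CODE is not a row number, a photcode has to be looked up by value. The following is a minimal sketch of that lookup, and of applying each measurement's own ASTROM_BAD_MASK through a lookup array indexed by CODE. The two small structure arrays are made-up stand-ins for Photcodes.dat and a .cpm table: only code 204 with its mask 14472 and the flag value 268470272 come from the listings in this article, while the second photcode's name and mask are hypothetical. It assumes CODE values are non-negative.

```idl
;- Stand-ins for p = mrdfits('Photcodes.dat', 1) and m = mrdfits('....cpm', 1)
p = replicate({code:0, name:'', astrom_bad_mask:0}, 2)
p.code            = [204, 307]
p.name            = ['MEGACAM.g.04', 'hypothetical.code']
p.astrom_bad_mask = [14472, 8]            ;- second mask is made up

m = replicate({photcode:0, phot_flags:0L}, 3)
m.photcode   = [307, 204, 204]
m.phot_flags = [268470272L, 0L, 8L]

;- Look up a single photcode by value, not by row number
row = where(p.code eq 204, nrow)
if nrow gt 0 then print, p[row[0]].name

;- Lookup array: element i holds the mask for CODE = i (0 if unknown)
lut = lonarr(max(p.code) + 1)
lut[p.code] = p.astrom_bad_mask

;- Apply to each measurement the mask belonging to its own photcode
mask = lut[m.photcode]
bad  = where((m.phot_flags and mask) ne 0, nbad, complement=good)
print, nbad
```

The lookup array trades a little memory for a fully vectorized match, which is convenient when a .cpm table mixes measurements from several photcodes.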

SkyTable.fits

As mentioned above, the objects in the DVO database get grouped into regions based on their position in the sky, and placed in one of the DVO subdirectories like n0000. The SkyTable.fits file gives the sky boundaries of each of these subdirectories, and can be used to locate which directory a particular region of interest is in:

IDL> sky = mrdfits('SkyTable.fits', 1, h)
MRDFITS: Binary table.  12 columns by  161905 rows.
IDL> help, sky[1], /struct
** Structure <a04384c>, 12 tags, length=80, data length=80, refs=2:
   R_MIN           FLOAT           0.00000
   R_MAX           FLOAT           360.000
   D_MIN           FLOAT           0.00000
   D_MAX           FLOAT           7.50000
   CHILD_S         LONG                25
   CHILD_E         LONG                50
   PARENT          LONG                 1
   INDEX           LONG                 1
   DEPTH           STRING    ''
   CHILD           STRING    ''
   TABLE           STRING    ''
   NAME            STRING    'n0000'

This particular sky region, n0000, is a 7.5 degree wide declination band centered at dec=3.75.
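As a rough sketch of how the boundaries might be used, the snippet below finds the rows whose RA and Dec limits enclose a given position. The three-row structure array is a stand-in for the real table: the n0000 limits are those shown above, while the limits attached to the other two names are made up for illustration. The real table has many more rows and PARENT / CHILD fields whose meaning is not covered here, so more than one row may well match a position.

```idl
;- Stand-in for: sky = mrdfits('SkyTable.fits', 1)
sky = replicate({r_min:0., r_max:0., d_min:0., d_max:0., name:''}, 3)
sky.r_max = 360.
sky.d_min = [0., 7.5, 15.]              ;- last two rows are illustrative
sky.d_max = [7.5, 15., 22.5]
sky.name  = ['n0000', 'n0730', 'n1500']

;- Position of interest, in degrees
ra  = 102.62241D
dec = 0.015223970D

;- Rows whose boundaries enclose the position
hit = where((ra  ge sky.r_min) and (ra  lt sky.r_max) and $
            (dec ge sky.d_min) and (dec lt sky.d_max), nhit)
if nhit gt 0 then print, sky[hit].name
```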

Subdirectory files

Most of the science content of the DVO database is found in the subdirectories with names like n0000. There are four types of files, with suffixes of .cpm, .cpn, .cps, and .cpt.

The Averages Table: .cpt

Each .cpt table has one entry for each object in that region of the sky. It summarizes the average properties of that object as long as those properties can be derived independently of the filter used. This means that the magnitude of the object cannot be found here (it is different for each filter).

Note: it is important that your DVO database is sorted. Do this by running addstar -resort on your database after initially populating it (usually with addstar -update).

IDL> t = mrdfits('n0000/0148.cpt', 1, h)
MRDFITS: Binary table.  23 columns by  285933 rows.
IDL> help, t, /struct
** Structure <8f069ac>, 23 tags, length=96, data length=96, refs=1:
   RA              DOUBLE           102.62241
   DEC             DOUBLE         0.015223970
   RA_ERR          FLOAT         0.0480983
   DEC_ERR         FLOAT         0.0481240
   U_RA            FLOAT           0.00000
   U_DEC           FLOAT           0.00000
   V_RA_ERR        FLOAT           0.00000
   V_DEC_ERR       FLOAT           0.00000
   PAR             FLOAT           0.00000
   PAR_ERR         FLOAT           0.00000
   SIGMA_POS       FLOAT          -374.000
   CHISQ_POS       FLOAT           0.00000
   NUMBER_POS      INT              0
   NMEASURE        INT            136
   NMISSING        INT              0
   NEXTEND         INT              0
   OFF_MEASURE     LONG                 0
   OFF_MISSING     LONG                -1
   OFF_EXTEND      LONG                -1
   FLAGS           LONG                 1
   OBJ_ID          LONG                 0
   CAT_ID          LONG               578
   EXT_ID          LONG      Array[2]
  • RA, DEC: Average RA and DEC, in degrees
  • RA_ERR, DEC_ERR: Error in position, in arcsec
  • U_RA, U_DEC, etc.: proper motions, in mas/yr (note that these must be explicitly calculated using relastro --update-objects +pm)
  • PAR: Parallax in mas (must be populated via relastro --update-objects +par)
  • NMEASURE: Number of measurements associated with this object
  • NMISSING: Number of times that an object doesn't appear in other exposures. Not currently populated.
  • OFF_MEASURE: The zero-indexed row number in the measurement (.cpm) table that contains the first measurement for this object. That row, and the next NMEASURE - 1 rows, are the measurements for this object

The Measurement Table: .cpm

Each .cpm table contains all of the measurement information for each object in the averages (.cpt) table.

IDL> m = mrdfits('n0000/0148.cpm',1,h, range=[0,5])
MRDFITS: Binary table.  42 columns by  6 rows.
IDL> help, m, /structure                           
** Structure <8ee013c>, 42 tags, length=160, data length=158, refs=1:
   D_RA            FLOAT         0.0669184
   D_DEC           FLOAT          0.135676
   MAG             FLOAT               NaN
   M_CAL           FLOAT           0.00000
   M_APER          FLOAT               NaN
   MAG_ERR         FLOAT               NaN
   MAG_CAL_ERR     FLOAT          0.660210
   M_TIME          FLOAT           1.21749
   AIRMASS         FLOAT           4.57614
   AZ              FLOAT          -89.8685
   X_CCD           FLOAT          0.393112
   Y_CCD           FLOAT           3051.92
   SKY_FLUX        FLOAT           17.5981
   SKY_FLUX_ERR    FLOAT           3.94064
   TIME            LONG        1067008742
   AVE_REF         LONG                 0
   DET_ID          LONG              1504
   IMAGE_ID        LONG                 9
   OBJ_ID          LONG                 0
   CAT_ID          LONG               578
   EXT_ID          LONG      Array[2]
   PSF_QF          FLOAT               NaN
   PSF_CHISQ       FLOAT               NaN
   PSF_NDOF        LONG                 0
   PSF_NPIX        LONG                 0
   CR_NSIGMA       FLOAT               NaN
   EXT_NSIGMA      FLOAT               NaN
   FWHM_MAJOR      INT              0
   FWHM_MINOR      INT              0
   PSF_THETA       INT              0
   MXX             INT              0
   MXY             INT              0
   MYY             INT              0
   TIME_MSEC       INT              0
   PHOTCODE        INT            307
   X_CCD_ERR       INT            358
   Y_CCD_ERR       INT             48
   PAD             STRING    ''
   POSANGLE        INT           8262
   PLTSCALE        FLOAT          0.185317
   DB_FLAGS        LONG                 0                        
   PHOT_FLAGS      LONG         268470272
  • D_RA: Offset, in arcseconds, between the average RA (in the .cpt table) and this measurement.
  • D_DEC: Offset, in arcseconds, between the average DEC (in the .cpt table) and this measurement.
  • MAG: Instrumental magnitude for this object
  • X_CCD, Y_CCD: Pixel location of this object's centroid
  • TIME: Time of exposure. For the data I worked with, this was the Unix time (seconds since UTC Jan 1 1970)
  • AVE_REF: The zero-indexed row number in the averages (.cpt) table of the object associated with this measurement
  • IMAGE_ID: The one-indexed row number in Images.dat for this measurement's parent image
  • X_CCD_ERR, Y_CCD_ERR: The centroid error, in units of 1/100 of a pixel
  • FWHM_MAJOR, FWHM_MINOR: The PSF FWHM, in units of 1/100 of a pixel
  • PHOTCODE: The photcode corresponding to this measurement's parent image.
  • DB_FLAGS: Flags supplied by relastro, relphot, etc.
  • PHOT_FLAGS: Flags set by the IPP. Bad photcode bit masks are found in the Photcodes.dat table
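The two row-number references use different conventions: AVE_REF is zero-indexed while IMAGE_ID is one-indexed. The sketch below shows how each measurement might be matched to its average object and its parent image, and how the scaled integer fields convert to pixels. The structure arrays are made-up, heavily trimmed stand-ins for Images.dat, a .cpt table and a .cpm table; only the field names come from the listings above, and the image names are hypothetical.

```idl
;- Stand-ins for the tables normally read with MRDFITS
images = replicate({image_id:0L, name:''}, 10)
images.image_id = lindgen(10) + 1                        ;- 1-indexed row number
images.name     = 'image_' + strtrim(images.image_id, 2) ;- hypothetical names

t = replicate({ra:0D, dec:0D}, 2)
t.ra = [102.62241D, 102.70000D]

m = replicate({ave_ref:0L, image_id:0L, x_ccd_err:0, fwhm_major:0}, 3)
m.ave_ref    = [0, 0, 1]
m.image_id   = [9, 3, 9]
m.x_ccd_err  = [358, 12, 40]
m.fwhm_major = [0, 410, 395]

;- AVE_REF is zero-indexed, so it can be used directly as an IDL subscript
obj = t[m.ave_ref]

;- IMAGE_ID is one-indexed, so subtract 1 to get the IDL row
img = images[m.image_id - 1]

;- Scaled integers: units of 1/100 of a pixel
x_err_pix = m.x_ccd_err  / 100.
fwhm_pix  = m.fwhm_major / 100.

print, img.name
print, x_err_pix
```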

The missing table: .cpn

The secfilt table: .cps

Examples of Working with DVO tables in IDL

  • Determine which objects in a .cpm table have bad astrometry
    m = mrdfits('n0000/0148.cpm',1)
    p = mrdfits('Photcodes.dat',1)
    
    ;- photcode values are stored in the CODE field of Photcodes.dat
    photcode = m[0].photcode
    index = where(p.code eq photcode)
    ;- take a scalar mask, so that it is applied to every row of m
    ;- (usually the same mask for all photcodes associated with a given camera)
    mask = p[index[0]].astrom_bad_mask
    
    bad = where((m.phot_flags and mask) ne 0, complement = good)

  • Extract all measurements of a given object into a variable without reading in the entire .cpm table
    t = mrdfits('n0000/0148.cpt', 1)
    object = 10
    lo = t[object].off_measure
    hi = lo + t[object].nmeasure - 1
    
    measurements = mrdfits('n0000/0148.cpm',1, range=[lo,hi])
    
  • Convert the Unix time in the .cpm table to Julian Date
    ;+
    ; PURPOSE:
    ;  Convert linux time (the number of seconds since UTC Jan 1 1970) to
    ;  julian date (the number of days since Jan 1 4713 BC).
    ;
    ; CATEGORY:
    ;  time
    ;
    ; CALLING SEQUENCE:
    ;  result = linux2jd(linuxTime)
    ;
    ; INPUT:
    ;  The linuxTime, in seconds
    ;
    ; OUTPUT:
    ;  The julian date, in days
    ;
    ; MODIFICATION HISTORY:
    ;  March 2009 Written by Chris Beaumont
    ;  April 8 2009: Fixed bug which treated j2000 as j2001
    ;-
    function linux2jd, linuxTime
    compile_opt idl2
    on_error, 2
    
    ;- check inputs
    if n_params() ne 1 then begin
       print, 'linux2jd calling sequence:'
       print, ' jd = linux2jd(linux time)'
       return, !values.f_nan
    endif
    
    ;- reference numbers
    j2000 = 946684800D        ;-linux time at 2000
    juldate, [2000,1,1], jd0  ;-reduced julian date at 2000
    jd0 += 2400000D           ;-conversion to normal jd
    
    return, jd0 + (linuxTime - j2000) / 86400
    end
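
As an independent cross-check of the conversion above, the same result can be sketched with IDL's built-in JULDAY and CALDAT routines, measuring from the Unix epoch itself rather than from the year 2000. This is not part of the original routine, just a way to sanity-check it; the input is the TIME value from the .cpm listing earlier in this article.

```idl
;- TIME value from the .cpm listing (seconds since 1970 Jan 1, 0h UTC)
t_unix = 1067008742L

;- Julian date of the Unix epoch, from the built-in JULDAY
;- (arguments: month, day, year, hour, minute, second)
jd_epoch = julday(1, 1, 1970, 0, 0, 0)

;- Seconds to days, added to the epoch
jd = jd_epoch + t_unix / 86400D
print, jd, format='(F16.6)'

;- Convert back to a calendar date as a sanity check
caldat, jd, month, day, year, hour, minute, second
print, year, month, day, hour, minute
```

If linux2jd is compiled, its result for the same input should agree with jd to within floating-point rounding.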