# Changeset 40130

Timestamp:
09/09/17 15:20:24
Message:

Files:
1 modified

• trunk/doc/release.2015/ps1.datasystem/datasystem.tex

r40071 r40130
9393\label{sec:intro}
9494
95\note{missing figures: analysis elements, DVO schema}
96
9597The 1.8m Pan-STARRS\,1 telescope is located on the summit of Haleakala
9698on the Hawaiian island of Maui.  The wide-field optical design of the

104106The \PSONE\ camera \citep{2009amos.confE..40T}, known as GPC1, consists of a
105107mosaic of 60 back-illuminated CCDs manufactured by Lincoln Laboratory.
106 The CCDs each consist of an $8\times8$ grid of $\sim 600\times 600$
107 pixel readout regions, yielding an effective $4800\times4800$
108The CCDs each consist of an $8\times8$ grid of $590 \times 598$
109pixel readout regions, yielding an effective $4846 \times 4868$
108110detector.  Initial performance assessments are presented in
109111\cite{2008SPIE.7014E..0DO}.  Routine observations are conducted remotely from the
110112Advanced Technology Research Center in Kula, the main facility of the
111 University of Hawaii's Institute for Astronomy operations on Maui.
113University of Hawaii's Institute for Astronomy (IfA) operations on Maui.
112114The Pan-STARRS1 filters and photometric system have already been
113115described in detail in \cite{2012ApJ...750...99T}.

167169%Pan-STARRS Pixel Analysis : Source Detection
168170\citet[][Paper IV]{magnier2017.analysis}
169 describes the details of the source detection and photometry, including point-spread-function and extended source fitting models, and the techniques for forced" photometry measurements.
171describes the details of the source detection and photometry, including point-spread-function and extended source fitting models, and the techniques for ``forced'' photometry measurements.
170172
171173%Magnier et al. 2017 (Paper V)

202204reducing data from other cameras and telescopes.
203205
204 \note{overview discussion of Pan-STARRS: the telescope, survey time
205   period, surveys.  2 paragraphs.}
206
207 The Pan-STARRS Image Processing Pipeline consists of a suite of
208 software programs and data systems that are designed to reduce
209 astronomical images, with a focus on parallelization necessary to
210 speed the processing of the large images produced by the GPC1 camera.
211 Part of this parallelization is derived from the fact that this camera
212 consists of 60 independent orthogonal transfer array (OTA) devices,
213 and can therefore be processed simultaneously.  Although there are
214 multiple stages that operate on an entire exposure at once, the
215 majority of stages operate only on smaller segments of a full exposure
216 to allow the processing tasks to be spread over the machines in the
217 processing cluster.
218
219
220 \note{fix this summary once outline is solidified}
221
222 This paper presents a description of the IPP data handling system.
223 Section \ref{sec:subsystems} describes the major IPP subsystems that
224 underlie the main pipeline, providing a set of common interfaces and
225 tools used at multiple stages.  The main processing stages of the
226 pipeline are described in Section \ref{sec:stages}, although all
227 exposures may not necessarily pass through each of these stages.  The
228 hardware systems that have done the processing for the PV3 data
229 release are listed in Section \ref{sec:hardware}, with some details
230 on the scale of computing needed to reduce this large number of
231 exposures.  Finally, Section \ref{sec:discussion} presents a
232 discussion of some of the lessons learned in the creation of the IPP,
233 and its utility in reducing data from other cameras and telescopes.
234
235206{\color{red} {\em Note: These papers are being placed on arXiv.org to
236207    provide crucial support information at the time of the public

244215\label{sec:overview}
245216
246 The Pan-STARRS Data Analysis system consists of many elements to
247 support the wide range of activities: archiving and management of the
217\subsection{Elements of the Pan-STARRS Data Processing System}
218
219The Pan-STARRS data analysis system consists of many elements to
220support a wide range of activities: archiving and management of the
248221raw and processed image files; real-time nightly processing of images
249222for transient and moving object science; large-scale re-processing and
250223calibration to produce measurements for the science collaboration and
251 the wider public; specialized image processing tasks to facilitate
252 research and development of the analysis system itself; distribution
253 of the resulting data products to various consumers in a variety of
254 formats and modes.
255
256 The Pan-STARRS Data Analysis system is divided internally into several major
224the wider public; specialized image processing to facilitate research
225and development of the analysis system itself; and distribution of the
226resulting data products to various consumers in a variety of formats
227and modes.
228
229The Pan-STARRS data analysis system is divided internally into several major
257230components:
258231\begin{itemize}

260233  data analysis tasks needed to support the on-going observations.
261234  In this article, we focus only on those aspects used by the off-summit
262   analysis stages.  \note{is summit processing discussed anywhere?}
235  analysis stages.
263236\item Image Processing Pipeline (IPP) : this portion of the data
264237  analysis system takes the data from raw pixels on the summit

295268the summit systems are described by \note{REF?}.
296269
270\begin{figure*}[htbp]
271  \begin{center}
272 \includegraphics[width=\hsize,clip]{PS1_Data_Analysis_System_Overview.pdf}
273  \caption{\label{fig:analysis.elements} Elements of the Pan-STARRS\,1
274    Data Analysis System.  Rectangles represent data analysis steps;
275    ellipses represent databases; rounded rectangles represent
276    external groups (``customers'').  The arrows show a simplified representation
277  of the major flow of data between the analysis stages and data
278  processing elements.}
279  \end{center}
280\end{figure*}
281
282\subsection{Nightly Processing Analysis Stages}
283
297284Data analysis to support nightly science operations is driven by two
298285main goals: 1) rapid detection of the moving and transient sources to

309296(\IPPstage{warp}).  Warped images may either be added together
310297(\IPPstage{stack}) or used in an image subtraction (\IPPstage{diff}).
311 For nightly science operations, images for certain fields such as the
312 Medium Deep survey fields \citep[see][]{MDref}, are stacked together
313 in nightly chunks, providing deeper detection capability on 1-day
314 timescales.  Depending on the survey mode, difference images are
315 generated for the nightly stack images (vs a deep stack template) or
316 for individual warp images.  In the later case, the warp images may be
317 difference against another warp from the same night or against a
298As part of nightly science processing, images for certain fields, such
299as the Medium Deep survey fields \citep[see][]{MDref}, are stacked
300together in nightly chunks, providing deeper detection capability on
3011-day timescales.  Depending on the survey mode, difference images are
302generated for the nightly stack images (using a deep stack template)
302or for individual warp images.  In the latter case, the warp images may
304be differenced against another warp from the same night or against a
318305reference stack from the appropriate part of the sky.
319306
307\subsection{Re-processing Analysis Stages}
308
320309Pan-STARRS has performed several large-scale reprocessings of both the
321 Medium Deep and 3pi Survey data for internal consumption.  For the 3pi
322 Survey data, we identify these large-scale reprocessings as PV1, PV2,
323 and PV3, with PV3 the analysis used for the first public data release,
324 DR1.  We also refer to the nightly science analysis of the data as
325 PV0.  For these reprocessing stages, the standard steps of chip
326 through warp, plus stack and diff are performed, starting from raw
327 data, usually using a single homogenous version of the data analysis
328 procedures.  PV2 was a special case in which we started from the
329 camera level products of PV1 to speed up the turn-around to the
330 community.  In addition to the analysis stages listed above which are
331 shared with the nightly processing, these large-scale reprocessing
332 analyses include additional processing.  A more detailed photometric
333 analysis is performed on the stacks, including morphological analysis
334 appropriate to galaxies.  The results of the stack photometry analysis
335 are used to drive a forced-photometry analysis of the warp images.
336 The data products from the camera, stack photometry, and forced-warp
337 photometry analysis stages are ingested into the internal calibration
338 database (DVO, the Desktop Virtual Observatory) and used for
339 photometric and astrometric calibrations.
310Medium Deep and $3\pi$ Survey data for internal consumption.  For the
311$3\pi$ Survey data, we identify these large-scale reprocessings as
312PV1, PV2, and PV3, with PV3 the analysis used for the first public
313data release, DR1.  We also refer to the nightly science analysis of
314the data as PV0.  For these reprocessing stages, the standard steps of
315\ippstage{chip} through \ippstage{warp}, plus \ippstage{stack} and
316\ippstage{diff} are performed, starting from raw data, usually using a
317single homogeneous version of the data analysis procedures.  PV2 was a
318special case in which we started from the camera level products of PV1
319to speed up the turn-around to the community.  In addition to the
320analysis stages listed above which are shared with the nightly
321processing, these large-scale reprocessing analyses include additional
322processing.  A more detailed photometric analysis is performed on the
323stacks, including morphological analysis appropriate to galaxies.  The
324results of the stack photometry analysis are used to drive a
325forced-photometry analysis of the warp images.  The data products from
326the camera, stack photometry, and forced-warp photometry analysis
327stages are ingested into the internal calibration database (DVO, the
328Desktop Virtual Observatory) and used for photometric and astrometric
329calibrations (see Section~\ref{sec:DVO}).
340330
341331\subsection{Data Access and Distribution}

371361{\bf Stage} & {\bf Primary Table} & {\bf Secondary Table(s)} & {\bf Key} & {\bf Notes} \\
372362\hline
363  \ippstage{summitcopy}   & \ippdbtable{pzDataStore}  &                                  & & Lists locations to check for new exposures.\\
364                          & \ippdbtable{summitExp}    & \ippdbtable{summitImfile}        & \ippdbcolumn{summit_id} & Exposures available at the telescope.\\
366                          & \ippdbtable{newExp}       & \ippdbtable{newImfile}           & \ippdbcolumn{exp_id} & Exposures that have been saved to IPP cluster.\\
367
368  \ippstage{registration} & \ippdbtable{rawExp}       & \ippdbtable{rawImfile}           & \ippdbcolumn{exp_id} & \\
369  \ippstage{chip}         & \ippdbtable{chipRun}      & \ippdbtable{chipProcessedImfile} & \ippdbcolumn{chip_id} & \\
374370  \ippstage{camera}       & \ippdbtable{camRun}       & \ippdbtable{camProcessedExp}     & \ippdbcolumn{cam_id} & \\
375   \ippstage{chip}         & \ippdbtable{chipRun}      & \ippdbtable{chipProcessedImfile} & \ippdbcolumn{chip_id} & \\
371  \ippstage{fake}         & \ippdbtable{fakeRun}      & \ippdbtable{fakeProcessedImfile} & \ippdbcolumn{fake_id} & \\
372  \ippstage{warp}         & \ippdbtable{warpRun}      & \ippdbtable{warpImfile}          & \ippdbcolumn{warp_id} & \\
373                          &                           & \ippdbtable{warpSkyCellMap}      & & Mapping of input chips to projection skycells.\\
374                          &                           & \ippdbtable{warpSkyfile}         & & \\
375  \ippstage{stack}        & \ippdbtable{stackRun}     & \ippdbtable{stackInputSkyfile}   & \ippdbcolumn{stack_id} & \\
376                          &                           & \ippdbtable{stackSumSkyfile}     & & \\
377  \ippstage{staticsky}    & \ippdbtable{staticskyRun} & \ippdbtable{staticskyInput}      & \ippdbcolumn{sky_id} & \\
378                          &                           & \ippdbtable{staticskyResult}     & & \\
379  \ippstage{skycal}       & \ippdbtable{skycalRun}    & \ippdbtable{skycalResult}        & \ippdbcolumn{skycal_id} & \\
380  \ippstage{fullforce}    & \ippdbtable{fullForceRun} & \ippdbtable{fullForceInput}      & \ippdbcolumn{ff_id} & \\
381                          &                           & \ippdbtable{fullForceResult}     & & \\
382                          &                           & \ippdbtable{fullForceSummary}    & & Properties about average parameters from all results.\\
383  \ippstage{diff}         & \ippdbtable{diffRun}      & \ippdbtable{diffSkyfile}         & \ippdbcolumn{diff_id} & \\
384                          &                           & \ippdbtable{diffInputSkyfile}    & & \\
376385  \ippstage{detrend}      & \ippdbtable{detRun}       & \ippdbtable{detRunSummary}       & \ippdbcolumn{det_id} & \\
377386                          &                           & \ippdbtable{detInputExp}         & & \\

381390                          & \ippdbtable{detResidExp}  & \ippdbtable{detResidImfile}      & & \\
382391                          & \ippdbtable{detNormalizedExp} & \ippdbtable{detNormalizedImfile} & & \\
383   \ippstage{diff}         & \ippdbtable{diffRun}      & \ippdbtable{diffSkyfile}         & \ippdbcolumn{diff_id} & \\
384                           &                           & \ippdbtable{diffInputSkyfile}    & & \\
385393  \ippstage{distribution} & \ippdbtable{distRun}      & \ippdbtable{distComponent}       & \ippdbcolumn{dist_id} & \\
386394                          &                           & \ippdbtable{distTarget}          & & \\
387   \ippstage{fake}         & \ippdbtable{fakeRun}      & \ippdbtable{fakeProcessedImfile} & \ippdbcolumn{fake_id} & \\
388   \ippstage{fullforce}    & \ippdbtable{fullForceRun} & \ippdbtable{fullForceInput}      & \ippdbcolumn{ff_id} & \\
389                           &                           & \ippdbtable{fullForceResult}     & & \\
390                           &                           & \ippdbtable{fullForceSummary}    & & Properties about average parameters from all results.\\
395  \ippstage{publish}      & \ippdbtable{publishRun}   & \ippdbtable{publishDone}         & \ippdbcolumn{pub_id} & \\
396                          &                           & \ippdbtable{publishClient}       & & \\
391397  \ippstage{lap}          & \ippdbtable{lapSequence}  & \ippdbtable{lapRun}              & \ippdbcolumn{seq_id} & Sequence of full reprocessing\\
392398                          & \ippdbtable{lapRun}       & \ippdbtable{lapExp}              & \ippdbcolumn{lap_id} & \\
393   \ippstage{publish}      & \ippdbtable{publishRun}   & \ippdbtable{publishDone}         & \ippdbcolumn{pub_id} & \\
394                           &                           & \ippdbtable{publishClient}       & & \\
395   \ippstage{summitcopy}   & \ippdbtable{pzDataStore}  &                                  & & Lists locations to check for new exposures.\\
396                           & \ippdbtable{summitExp}    & \ippdbtable{summitImfile}        & \ippdbcolumn{summit_id} & Exposures available at the telescope.\\
398                           & \ippdbtable{newExp}       & \ippdbtable{newImfile}           & \ippdbcolumn{exp_id} & Exposures that have been saved to IPP cluster.\\
399
400   \ippstage{registration} & \ippdbtable{rawExp}       & \ippdbtable{rawImfile}           & \ippdbcolumn{exp_id} & \\
401399  \ippstage{remote}       & \ippdbtable{remoteRun}    & \ippdbtable{remoteComponent}     & \ippdbcolumn{remote_id} & \\
402   \ippstage{skycal}       & \ippdbtable{skycalRun}    & \ippdbtable{skycalResult}        & \ippdbcolumn{skycal_id} & \\
403   \ippstage{stack}        & \ippdbtable{stackRun}     & \ippdbtable{stackInputSkyfile}   & \ippdbcolumn{stack_id} & \\
404                           &                           & \ippdbtable{stackSumSkyfile}     & & \\
405   \ippstage{staticsky}    & \ippdbtable{staticskyRun} & \ippdbtable{staticskyInput}      & \ippdbcolumn{sky_id} & \\
406                           &                           & \ippdbtable{staticskyResult}     & & \\
407   \ippstage{warp}         & \ippdbtable{warpRun}      & \ippdbtable{warpImfile}          & \ippdbcolumn{warp_id} & \\
408                           &                           & \ippdbtable{warpSkyCellMap}      & & Mapping of input chips to projection skycells.\\
409                           &                           & \ippdbtable{warpSkyfile}         & & \\
410400\hline
411401\end{tabular}

424414successive processing stages to begin their own tasks.
425415
426 The processing database is colloquially referred to as the gpc1'
416The processing database is colloquially referred to as the ``gpc1''
427417database, since a single instance of the database is used to track the
428418processing of images and data products related to the PS1 GPC1 camera.
429419This same database engine also has instances (same schema, different
430420data) for other cameras processed by the IPP, e.g., GPC2, the test
431 cameras TC1, TC3, and the Imaging Sky Probe (ISP).
421cameras TC1, TC3, and the Imaging Sky Probe (ISP).  In general,
422processing information for different cameras is kept separate in different
423processing databases; merging of output products takes place in DVO.
432424
433425Within the processing database, the various processing stages are

435427primary table which defines the conceptual list of processing items
436428either to be done, in progress, or completed.  An associated secondary
437 table (or set of tables) lists the details of elements which have been
438 processed.  Table \ref{tab: database schema} contains an outline of
439 the database schema, showing the relations between tables organized by
440 processing stage.  As an example, one critical stage is the
441 \ippstage{chip} processing stage (see \S\ref{sec:chip}) in which the
442 individual chips from an exposure are detrended and sources are
443 detected.  Within the gpc1 database, the primary table is called
444 \ippdbtable{chipRun} in which each exposure has a single entry.
445 Associated with this table is the \ippdbtable{chipProcessedImfile}
446 table, which contains one row for each of the chips
447 associated with the exposure (up to 60 for gpc1).  The primary tables, such as
448 \ippdbtable{chipRun}, are populated once the system has decided that a
449 specific item (e.g., an exposure) should be processed at that stage.
450 Initially, the entry is given a state of run'', denoting that the
451 exposure is ready to be processed.  The low-level table entries, such
452 as the \ippdbtable{chipProcessedImfile} entries, are only populated
453 once the element (e.g., the chip) has been processed by the analysis
454 system.  Once all elements for a given stage, e.g., chips in this
455 case, are completed, then the status of the top-level table entry
456 (\ippdbtable{chipRun}) are switched from run'' to full''.
429table (or set of tables) lists the details of component elements which
430have been processed for each top-level item.  Table \ref{tab: database
431  schema} contains an outline of the database schema, showing the
432relations between tables organized by processing stage.  As an
433example, one critical stage is the \ippstage{chip} processing stage
434(see \S\ref{sec:chip}) in which the individual chips from an exposure
435are detrended and sources are detected.  Within the gpc1 database, the
436primary table is called \ippdbtable{chipRun} in which each exposure
437has a single entry.  Associated with this table is the
438\ippdbtable{chipProcessedImfile} table, which contains one row for
439each of the chips associated with the exposure (up to 60 for gpc1).
440The primary tables, such as \ippdbtable{chipRun}, are populated once
441the system has decided that a specific item (e.g., an exposure) should
442be processed at that stage.  Initially, the entry is given a state of
443``run'', denoting that the exposure is ready to be processed.  The
444low-level table entries, such as the \ippdbtable{chipProcessedImfile}
445entries, are only populated once the element (e.g., the chip) has been
446processed by the analysis system.  Once all elements for a given
447stage, e.g., chips in this case, are completed, then the status of the
448top-level table entry (\ippdbtable{chipRun}) is switched from ``run''
449to ``full''.
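The bookkeeping described above amounts to a small state machine over the primary and secondary tables.  The sketch below is a minimal illustration, not the IPP code: the table and column names follow the schema above, while the helper itself and the use of \code{sqlite3} as a stand-in for the processing database are assumptions.  It shows how an \ippmisc{advance}-style check could promote a \ippdbtable{chipRun} entry from ``run'' to ``full'' once every chip has a fault-free \ippdbtable{chipProcessedImfile} row.
\begin{verbatim}
# Minimal sketch of the run -> full transition described in the text.
# Table/column names follow the gpc1 schema; the rest is illustrative.
import sqlite3  # stand-in for the actual MySQL processing database

def advance_chip_runs(db):
    cur = db.cursor()
    for chip_id, exp_id in cur.execute(
            "SELECT chip_id, exp_id FROM chipRun WHERE state = 'run'").fetchall():
        (n_expected,) = cur.execute(
            "SELECT COUNT(*) FROM rawImfile WHERE exp_id = ?",
            (exp_id,)).fetchone()
        (n_done,) = cur.execute(
            "SELECT COUNT(*) FROM chipProcessedImfile"
            " WHERE chip_id = ? AND fault = 0",
            (chip_id,)).fetchone()
        # Only when every chip of the exposure has a fault-free entry is the
        # top-level chipRun entry promoted from 'run' to 'full'.
        if n_done == n_expected:
            cur.execute("UPDATE chipRun SET state = 'full' WHERE chip_id = ?",
                        (chip_id,))
    db.commit()
\end{verbatim}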
457450
458451If the analysis of an element (e.g., the individual OTA chip)

467460other hand, if the analysis failed because of a problem with the input
468461data, this is noted by setting a non-zero value in a different table
469 field, \ippdbcolumn{quality}.  For example, if the chip analysis
462field, \ippdbcolumn{quality}.  For example, if the \ippstage{chip} analysis
470463failed to discover any stars because the image was completely
471464saturated, the analysis can complete successfully (\ippdbcolumn{fault}

483476of the \ippdbcolumn{fault}s which occur are ephemeral due to current
484477conditions of the processing cluster, the processing stages are set up
485 to occasionally clear and re-try the faulted entries.  Some faults
478to occasionally clear and re-try the faulted entries.  Some \ippdbcolumn{fault}s
486479represent software bugs and in the early stages of processing were
487480accumulated until the corresponding software issue could be addressed;
488481since the start of the PS1 Science Consortium Surveys, these types of
489 faults have largely been eliminated.  Thus, automatic processing is
482\ippdbcolumn{fault}s have largely been eliminated.  Thus, automatic processing is
490483able to keep the data flowing even in the face of occasional network
491484glitches or hardware crashes.

496489As exposures are taken by the PS1 telescope \& GPC1 camera system, the
497490data from the 60 OTA devices are read out by the camera software
498 wsystem and written to disk on a collection of computers at the summit
491system and written to disk on a collection of computers at the summit
499492in the PS1 facility called ``pixel servers.'' After the images are
500493written to disk, a summary listing of the information about the
501 exposure and the chip images are added to the summit datastore.
494exposure and the chip images are added to the summit datastore (an
495internal http-based data sharing tool, see
496Section~\ref{sec:datastore}).
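As a rough illustration of this hand-off (purely hypothetical: the listing URL and its one-identifier-per-line format are invented, and only the idea of polling an HTTP-based datastore and comparing against the exposures already known from \ippdbtable{summitExp} comes from the text):
\begin{verbatim}
# Hypothetical sketch of polling the summit datastore for new exposures.
# The URL and the listing format are invented for illustration only.
import urllib.request

def poll_summit_datastore(listing_url, known_exposures):
    """Return exposure identifiers on the summit that are not yet known."""
    with urllib.request.urlopen(listing_url) as resp:
        lines = resp.read().decode("utf-8").splitlines()
    available = {line.strip() for line in lines if line.strip()}
    return sorted(available - set(known_exposures))
\end{verbatim}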
502497
503498During night-time operations, while the summit datastore is being

531526
532527Once the chips for an exposure have all been downloaded, the exposure
533 is ready to be registered.  In this context, registration' refers to
528is ready to be registered.  In this context, ``registration'' refers to
534529the process of adding them to the database listing of known, raw
535 exposures (not to be confused with registration' in the sense of
536 pixel re-alignment).  The result of the registration analysis is an
530exposures (not to be confused with ``registration'' in the sense of
531pixel re-alignment).  The result of the \ippstage{registration} analysis is an
537532entry for each exposure in the \ippdbtable{rawExp} table, and one for
538533each chip in the \ippdbtable{rawImfile} table.  These tables are
539534critical for downstream processing to identify what exposures are
540 available for processing in any other stage.  At the registration
535available for processing in any other stage.  At the \ippstage{registration}
541536stage, a large amount of descriptive metadata for each chip is added
542537to the \ippdbtable{rawImfile} table, the majority of which is

552547
553548Unlike much of the rest of the IPP stages, the raw exposures may only
554 have a single entry in the registration tables of the processing
549have a single entry in the \ippstage{registration} tables of the processing
555550database tables (\ippdbtable{rawExp} and \ippdbtable{rawImfile}).
556551
557 For GPC1, the image registration stage is also the stage at which the
552For GPC1, the \ippstage{registration} stage is also the stage at which the
558553\ippprog{burntool} analysis is run.  This analysis is more completely
559554described in \citet{waters2017}.  In brief, the \ippprog{burntool}

564559observation date and time listed in the headers, with the results
565560stored in a text table.  As a result of the sequential nature of this
566 analysis, the registration of exposures is blocked until the
561analysis, the \ippstage{registration} of exposures is blocked until the
567562\ippprog{burntool} has been run on the previous exposures.
568563
569 Once the registration process has finished, new science exposures that
570 have an \ippdbcolumn{obs_mode} value that indicates they are part of
571 a particular science survey are automatically launched into the
572 science analysis by defining entries for the \ippstage{chip}
573 processing stage, as described above.  This analysis can be relaunched
574 multiple times, such as for the large scale PV3 reprocessing.
575 However, this automatic process ensures the shortest time between
576 observation and analysis, which is particularly important in the
577 search for transient sources.
564Once the \ippstage{registration} process has finished, new science
565exposures that have an \ippdbcolumn{obs_mode} value that indicates
566they are part of a particular science survey are automatically
567launched into the science analysis by defining entries for the
568\ippstage{chip} processing stage, as described above.  The science
569analysis of a given exposure can be relaunched multiple times, such as
570for the large scale PV3 reprocessing.  The automatically-launched
571analysis process ensures the shortest time between observation and
572analysis, particularly important in the search for transient sources.
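A minimal sketch of this automatic hand-off (the survey codes and the helper are hypothetical; only the \ippdbcolumn{obs_mode} test and the creation of a ``run''-state \ippstage{chip} entry follow the description above):
\begin{verbatim}
# Sketch: after registration, queue science exposures for chip processing.
SCIENCE_OBS_MODES = {"3PI", "MD"}   # hypothetical survey codes

def launch_chip_processing(db, exp_id, obs_mode, end_stage="warp"):
    if obs_mode not in SCIENCE_OBS_MODES:
        return  # not a science survey exposure: nothing to queue
    cur = db.cursor()
    # A new chipRun entry in state 'run' makes the exposure visible to the
    # chip-stage processing, which handles its OTAs in parallel.
    cur.execute(
        "INSERT INTO chipRun (exp_id, state, end_stage) VALUES (?, 'run', ?)",
        (exp_id, end_stage))
    db.commit()
\end{verbatim}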
578573
579574\subsection{Chip Processing}

619614%% attempts to target the processing for each OTA to the machine on which
620615%% the data for that detector is stored.  The output products are then
621 %% primarily saved back to the same machine.  This targetted' processing
616%% primarily saved back to the same machine.  This targetted'' processing
622617%% was an early design choice to minimize the system wide network load
623618%% during processing.  In practice, as computer disks filled up at

647642
648643The results of the image processing are then written to disk,
649 including the science, mask, and variance images, the background model
650 subtracted, the PSF model used in the photometry process, and a FITS
651 catalog of detected sources.  Additional binned images of the full OTA
652 are also saved, providing $16\times{}16$ and $256\times{}256$ pixel
653 binning scales for quick visualization.  The processing log and a
654 selection of summary metadata describing the processing results are
655 also written to disk.  This metadata is used to populate a row in the
656 \ippdbtable{chipProcessedImfile} table (linked to the
657 \ippdbtable{chipRun} entry by a shared \ippdbcolumn{chip_id} value)
658 to indicate that the processing of this OTA is complete.
644including the science, mask, and variance images, the binned
645background model that was subtracted, the PSF model used in the photometry
646process, and a FITS catalog of detected sources.  Additional binned
647images of the full OTA are also saved, using $16\times{}16$ and
648$256\times{}256$ pixel binning scales for quick visualization.  The
649processing log and a selection of summary metadata describing the
650processing results are also written to disk.  This metadata is used to
651populate a row in the \ippdbtable{chipProcessedImfile} table to
652indicate that the processing of this OTA is complete.
659653
660654As each OTA is processed independently of the others across a number
661 of computers, the \ippprog{pantasks} managing the jobs periodically
662 runs an \ippmisc{advance} task that checks that the number of rows in
663 \ippdbtable{chipProcessedImfile} with \ippdbcolumn{fault} equal to
664 zero matches the associated number of rows in \ippdbtable{rawImfile}.
665 If this condition is met, than all processing for that exposure is
666 finished, and the \ippdbcolumn{state} field is set to full''.  If
667 the \ippdbtable{chipRun}.\ippdbcolumn{end_stage} field is set to
655of computers, the \ippprog{pantasks} server managing the jobs
656periodically runs an \ippmisc{advance} task that checks that the
657number of rows in \ippdbtable{chipProcessedImfile} with
658\ippdbcolumn{fault} equal to zero matches the associated number of
659rows in \ippdbtable{rawImfile}.  If this condition is met, then all
660processing for that exposure is finished, and the \ippdbcolumn{state}
661field is set to ``full''.  If the
662\ippdbtable{chipRun}.\ippdbcolumn{end_stage} field is set to
668663\ippstage{chip}, then no further action is taken.  However, this field
669664is usually set to a subsequent stage (most often \ippstage{warp}),
670 then an entry for this exposure is added to the \ippdbtable{camRun}
665in which case an entry for this exposure is added to the \ippdbtable{camRun}
671666table, and processing continues.
672667

710705to help guarantee a solution in the case of a modest pointing error.
711706The guess astrometry is used to match the reference catalog to the
712 observed stellar positions in the focal plane coordinate system.  Once
713 an acceptable match is found, the astrometric calibration of the
707observed stellar positions in the focal plane coordinate system
708\citep[see][]{magnier2017.calibration}.
709
710Once an acceptable match is found, the astrometric calibration of the
714711individual chips is performed, including a fit to a single model for
715712the distortion introduced by the camera optics.  After the astrometic

720717used to generate synthetic w-band photometry for areas where no
721718PS1-based calibrated w-band photometry is available.  For more
722 details, see \cite{magnier2017.calibration}.  The result of these calibrations is
723 stored as a single multi-extension FITS table containing the results
724 from each OTA as a separate extension.
719details, see \cite{magnier2017.calibration}.  The result of these
720calibrations is stored as a single multi-extension FITS table
721containing the results from each OTA as a separate extension.
725722
726723In addition to the astrometric and photometric calibrations, the

740737processed all at once, this update also updates the associated
741738\ippdbtable{camRun} entry, linked by the \ippdbcolumn{cam_id}.  As
742 with the \ippstage{chip} stage, the
739with the \ippstage{chip} stage, if the
743740\ippdbtable{camRun}.\ippdbcolumn{end_stage} is for a subsequent
744741stage, an appropriate entry is added to the \ippdbtable{fakeRun}
745 table.
746
747 %% \subsection{Fake Analysis}
748 %% \label{sec:fake}
749 %%
750 %% The \ippstage{fake} stage was originally designed to do false source
751 %% injection and recovery, in order to determine the detection efficiency
752 %% of sources on the exposure.  However, early in the design of the IPP,
753 %% this task was moved to the rest of the photometry analysis done at the
754 %% \ippstage{chip} stage.  Removing the stage would require significant
755 %% changes to the database schema.  As a result, this conveniently named
756 %% stage generally does no actual data processing, and consists mainly of
757 %% database operations to move the exposure on to the \ippstage{warp}
758 %% stage.  The operations mimic the \ippstage{chip} stage, with
759 %% individual jobs run for each OTA that update rows in the
761 %% updates the \ippdbtable{fakeRun} table and promotes the exposure to
762 %% the next stage by adding a row to the \ippdbtable{warpRun} table.
742table.
743
744\subsection{Fake Analysis}
745\label{sec:fake}
746
747The \ippstage{fake} stage was originally designed to do false source
748injection and recovery, in order to determine the detection efficiency
749of sources on the exposure.  However, early in the design of the IPP,
750this task was moved to the rest of the photometry analysis done at the
751\ippstage{chip} stage.  Removing the stage would require significant
752changes to the database schema.  As a result, this conveniently named
753stage generally does no actual data processing, and consists mainly of
754database operations to move the exposure on to the \ippstage{warp}
755stage.  The operations mimic the \ippstage{chip} stage, with
756individual jobs run for each OTA that update rows in the
758updates the \ippdbtable{fakeRun} table and promotes the exposure to
759the next stage by adding a row to the \ippdbtable{warpRun} table.
763760
764761\subsection{Image Warping}

776773described by a single tangent plane projection, or for larger regions
777774which have multiple projection centers.  For the $3\pi$ survey, the
778 \ippmisc{RINGS.V3} tessellation was used that used projection centers
775\ippmisc{RINGS.V3} tessellation was used, which arranges projection centers
779776spaced every four degrees in both RA and DEC, with $0\farcs{}25$
779776pixels.  These projections are further broken down into ``skycells''

822819\label{sec:stack}
823820
824 The skycell images generated by the \ippstage{warp} process are added
825 together to make deeper, higher signal-to-noise images in the
821The skycell images generated by the \ippstage{warp} process can be
822added together to make deeper, higher signal-to-noise images in the
826823\ippstage{stack} stage.  These stacked images also fill in coverage
827824gaps between different exposures, resulting in an image of the sky

831828input images.  During nightly science processing, the 8 exposures per
832829filter for each Medium Deep field are combined into a set of stacks
833 for that field.  These so-called nightly stacks' are used by the
830for that field.  These so-called ``nightly stacks'' are used by the
834831transient survey projects to detect faint supernovae, among other
835832transient events.  For the PV3 $3\pi$ analysis, all images in each

840837For the PV3 processing of the Medium Deep fields, stacks have been
841838generated for the nightly groups and for the full depth using all
842 exposures, producing deep stacks''.  In addition, a best seeing'
839exposures, producing ``deep stacks''.  In addition, a ``best seeing''
843840set of stacks have been produced \note{using image quality cuts to be
844841  described: need input from MEH}.  We have also generated
845 out-of-season stacks for the Medium Deep fields, in which all image
842out-of-season stacks for the Medium Deep fields, in which all images
846843not from a particular observing season for a field are combined into a
847844stack.  These latter stacks are useful as deep templates when studying

850847season.
851848
852 When a given set of \ippstage{stack} stage are defined, exposures with
853 existing \ippstage{warp} entries that match the filter, position, and
854 other criteria such as seeing are grouped by their skycell.  An entry
849When a given set of \ippstage{stack} stage processing is defined,
850exposures with existing \ippstage{warp} entries that match the filter,
851position, and other criteria such as seeing are identified.  An entry
855852is then added for each skycell in the \ippdbtable{stackRun} table,
856853with the \ippdbcolumn{warp_id} entries for the exposures added to the
858 \ippdbtable{stackRun} entry by the \ippdbcolumn{stack_id} field.
859 This defines the mapping for which exposures contribute to the
860 \ippstage{stack}.  This breaks exposures into single skycells, but as
861 adjacent \ippstage{stack} skycells may contain inputs from different
862 exposures, there is no simple way to group the processing at the
863 \ippstage{stack} stage into exposures.
855\ippdbtable{stackRun} entry by the \ippdbcolumn{stack_id} field.  This
856defines the mapping for which exposures contribute to the
857\ippstage{stack}.  The \ippstage{stack} stage processing is performed
858at the skycell level.
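A minimal sketch of this queueing step (table and column names follow Table~\ref{tab: database schema}; the grouping helper itself is illustrative, not the IPP implementation):
\begin{verbatim}
# Sketch: one stackRun entry per skycell, with contributing warps recorded
# in stackInputSkyfile and linked back by stack_id.
from collections import defaultdict

def queue_stacks(db, candidate_warps):
    """candidate_warps: iterable of (warp_id, skycell) pairs that already
    satisfy the filter / position / seeing selection described above."""
    by_skycell = defaultdict(list)
    for warp_id, skycell in candidate_warps:
        by_skycell[skycell].append(warp_id)
    cur = db.cursor()
    for skycell, warp_ids in by_skycell.items():
        cur.execute("INSERT INTO stackRun (skycell, state) VALUES (?, 'run')",
                    (skycell,))
        stack_id = cur.lastrowid
        for warp_id in warp_ids:
            cur.execute(
                "INSERT INTO stackInputSkyfile (stack_id, warp_id) VALUES (?, ?)",
                (stack_id, warp_id))
    db.commit()
\end{verbatim}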
864859
865860The \ippstage{stack} jobs pass the information about the input images

867862image combinations.  See~\cite{waters2017} for details on the stack
869 variance produced at other stage, additional images are constructed
864variance produced at other stages, additional images are constructed
870865with information about the contributions to each pixel.  A number
871866image contains the number of input exposures used for each pixel,

887882deferred to the \ippstage{staticsky} stage.  This separation is
888883maintained because the photometry analysis of the \ippstage{stack}
889 images is performed on all 5 filters simultaneously.  By deferring
890 this analysis, the processing system may also decouple the generation
891 of the pixels from the source detection.  This makes the sequencing of
892 analysis somewhat easier and less subject to blocks due to a failure
893 in the stacking analysis.  Similar to the \ippstage{stack} stage, an
894 entry is created in the \ippdbtable{staticskyRun} table, linked to a
895 series of rows in the \ippdbtable{staticskyInput} table by a common
896 \ippdbcolumn{sky_id}, each of which also contains the appropriate
897 \ippdbcolumn{stack_id} entries for the skycell under consideration.
884images, including convolved galaxy model fitting, is performed on all
8855 filters simultaneously.  By deferring this analysis, the processing
886system may also decouple the generation of the pixels from the source
887detection.  This makes the sequencing of analysis somewhat easier and
888less subject to blocks due to a failure in the stacking analysis.
889Similar to the \ippstage{stack} stage, an entry is created in the
890\ippdbtable{staticskyRun} table, linked to a series of rows in the
891\ippdbtable{staticskyInput} table by a common \ippdbcolumn{sky_id},
892each of which also contains the appropriate \ippdbcolumn{stack_id}
893entries for the skycell under consideration.
898894
899895The input images are passed to the \ippprog{psphotStack} program,

927923The stack photometry output catalogs are re-calibrated for both
928924photometry and astrometry in a process very similar to the
929 \ippstage{camera} calibration stage.  In the case of this
930 \ippstage{skycal} stage, each skycell is processed independently.
931 Because of this independence, when queued for processing, the entries
932 in the \ippdbtable{skycalRun} table contain the \IPPdbcolumn{sky_id}
933 and \ippdbcolumn{stack_id} entries of the parent data directly.  As
934 in the \ippstage{camera} stage, the \ippprog{psastro} program reads in
935 the stack photometry catalog, and produces a calibrated output, with
936 format matching the input.  A different processing recipe is supplied
937 to \ippprog{psastro}, which controls for the different data.  The same
938 reference catalog is used for the \ippstage{camera} and
939 \ippstage{stack} calibration stages.  Upon completion, the analysis
940 statistics are written to the \ippdbtable{skycalResult} table.
925\ippstage{camera} calibration stage.  Although the individual warps
926which go into the stack are calibrated based on the \ippstage{camera}
927stage analysis, there was some concern that these calibrations might
928not be sufficiently well-defined for some of the input warps, biasing
929the photometry of the stack.  By re-calibrating the stacks, we can be
930sure that the stack photometry as measured is tied to the photometric
931reference system.
932
933In the case of this \ippstage{skycal} stage, each skycell is processed
934independently.  Because of this independence, when queued for
935processing, the entries in the \ippdbtable{skycalRun} table contain
936the \ippdbcolumn{sky_id} and \ippdbcolumn{stack_id} entries of the
937parent data directly.  As in the \ippstage{camera} stage, the
938\ippprog{psastro} program reads in the stack photometry catalog, and
939produces a calibrated output, with format matching the input.  A
940different processing recipe is supplied to \ippprog{psastro}, which
941controls for the different data.  The same reference catalog is used
942for the \ippstage{camera} and \ippstage{stack} calibration stages.
943Upon completion, the analysis statistics are written to the
944\ippdbtable{skycalResult} table.
941945
942946\subsection{Forced Warp Photometry}

995999individual warp images used to generate the stack.  This
9961000\ippstage{fullforce} analysis is performed on all warps for a single
997 skycell and filter as a single unit, as this matches the arrangement
998 of the input source catalog from the \ippstage{skycal} stage.  When
999 processing is queued for this stage, an entry is added to the
1000 \ippdbtable{fullForceRun} primary database table linking to the
1001 specific \ippdbcolumn{skycal_id} entry that will be used as the
1002 catalog for the photometry.  The \ippdbcolumn{warp_id} values for the
1003 input \ippstage{warp} stage images that contributed to the
1004 \ippstage{stack} associated with that \ippdbcolumn{skycal_id} are
1001skycell and filter as a single unit within the processing database,
1002while individual warps are processed individually in parallel as
1003separate processing jobs.
1004
1005When processing is queued for this stage, an entry is added to the
1006\ippdbtable{fullForceRun} primary database table with a reference to
1007the corresponding stack and \ippdbcolumn{skycal_id} entry that is the
1008input source of detections to be measured.  The \ippdbcolumn{warp_id}
1009values for the input \ippstage{warp} stage images that contributed to
1010the \ippstage{stack} associated with that \ippdbcolumn{skycal_id} are
10061012primary table by the \ippdbcolumn{ff_id} identifier.  The individual

10081014stage image products along with the \ippstage{skycal} catalog to the
10091015\ippprog{psphotFullForce} program.
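The lineage recorded at the earlier stages is enough to assemble the input list for a \ippstage{fullforce} run; the following sketch makes the relationship explicit (table and column names follow Table~\ref{tab: database schema}, while the join itself is illustrative, not the IPP code).
\begin{verbatim}
# Sketch: queue forced warp photometry for one skycell/filter.  The skycal
# entry points at its stack; the stack's input warps become fullForceInput
# rows, all linked to the new fullForceRun entry by ff_id.
def queue_fullforce(db, skycal_id):
    cur = db.cursor()
    (stack_id,) = cur.execute(
        "SELECT stack_id FROM skycalRun WHERE skycal_id = ?",
        (skycal_id,)).fetchone()
    warp_ids = [w for (w,) in cur.execute(
        "SELECT warp_id FROM stackInputSkyfile WHERE stack_id = ?",
        (stack_id,)).fetchall()]
    cur.execute(
        "INSERT INTO fullForceRun (skycal_id, state) VALUES (?, 'run')",
        (skycal_id,))
    ff_id = cur.lastrowid
    for warp_id in warp_ids:
        cur.execute(
            "INSERT INTO fullForceInput (ff_id, warp_id) VALUES (?, ?)",
            (ff_id, warp_id))
    db.commit()
\end{verbatim}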
1016
1017%% In this program, the positions of sources are loaded from the input
1018%% catalog.  PSF stars are pre-identified from the stack image and a PSF
1019%% model generated for each \ippstage{warp} image based on those stars,
1020%% using the same stars for all warps to the extent possible (PSF stars
1021%% which are excessively masked on a particular image are not used to
1022%% model the PSF).  The PSF model is fitted to all of the known source
1023%% positions in the warp images.  Aperture magnitudes, Kron magnitudes,
1024%% and moments are also measured at this stage for each warp.  Note that
1025%% the flux measurement for a faint, but significant, source from the
1026%% stack image may be at a low significance (less than the $5\sigma$
1027%% criterion used when the photometry is not run in this forced mode) in
1028%% any individual warp image; the flux may even be negative for specific
1029%% warps.  When combined together, these low-significance measurements
1030%% will result in a signficant measurement as the signal-to-noise
1031%% increases by the square root of the number of measurements.  The
1032%% individual warp measurements are combined together to generate
1033%% averages values within DVO.
10101034
10111035The convolved galaxy models are also re-measured on the

10531077images are matched.  \note{discuss Alard-Lupton}.
10541078
1055 In the \ippstage{diff} stage, the IPP generates diffferece images for
1079In the \ippstage{diff} stage, the IPP generates difference images for
10561080appropriately specified pairs of images.  It is possible for the
10571081difference image to be generated from a pair of \ippstage{warp} stage
10581082images, from a \ippstage{warp} and a \ippstage{stack} of some variety,
10591083or from a pair of \ippstage{stack} stage images.  During the PS1
1060 survey, pairs of exposures, call TTI pairs (see~\note{Survey
1084survey, pairs of exposures, called TTI pairs (see~\note{Survey
10611085  Strategy in Chambers et al}), were obtained for each pointing within a $\approx$ 1
10621086hour period in the same filter, and to the extent possible with the

10741098\ippdbtable{diffRun} table, and the appropriate input images are added
10751099to the \ippdbtable{diffInputSkyfile} table, with one entry for each
1076 skycell that are covered by the images.  For a \ippstage{diff}
1100skycell that is covered by the images.  For a \ippstage{diff}
10771101generated from two \ippstage{warp} stage products, the input images
10781102have their \ippdbcolumn{warp_id} values recorded in the

10951119catalogs passed to the \ippprog{ppSub} program.  This does the
10961120subtraction, as well as the photometry of any sources detected in the
1097 \ippstage{diff} image.  The algorithm used for PSF matching is
1098 described in \citet{waters2017}.  Upon completion of these jobs,
1099 statistics about the processing are written to an entry in the
1121\ippstage{diff} image.  Sources may be detected as a positive source
1122(flux in the minuend is higher than the subtrahend) or as a negative
1123source (flux in the subtrahend is higher).  The algorithm used for PSF
1124matching is described in \citet{waters2017}.  Upon completion of these
1125jobs, statistics about the processing are written to an entry in the
11001126\ippdbtable{diffSkyfile} table.  An \ippmisc{advance} checks for the
11011127completion of all of the components listed in

11111137\begin{table}[hb]
11121138\begin{center}
1113 \caption{DVO Database Tables\label{tab:DVO_schema}}
1139\caption{DVO Database Tables\label{tab:DVO_schema} \note{fix order,
1140    drop invalid tables}}
11141141\begin{tabular}{ll}
11151142\hline

11551182DVO tracks three main classes of information: 1) average properties of
11561183astronomical objects; 2) measurements of those objects (from which the
1157 average properties are derived); 3) properties of image which provided
1184average properties are derived); 3) properties of the images which provided
11581185some or all of the measurements.  Figure~\ref{fig:DVO_schema}
11591186illustrates the schematic relationship between these types of

11821209measurements; those which store information about the images; those
1184
1185 \subsubsubsection{Photcodes}
1186
1187 % photcodes
1188 DVO has a special metadata table called \ippdbcolumn{photcode} which
1189 identifies the photometry filter systems.  Entries in this table are
1190 used to identify the source of measurements and images.  Each row in
1191 the \ippdbcolumn{photcode} table includes a \ippdbcolumn{photcode}
1192 name, a unique numerical ID, and information about that photometry
1193 system.
11941211
11951212DVO includes two major classes of database tables: those containing

12081225levels each containing a finer mesh of regions covering the sky.
12091226
1227\subsubsubsection{Photcodes}
1228
1229% photcodes
1230DVO has a special metadata table called \ippdbtable{photcode} which
1231identifies the photometry filter systems.  Entries in this table are
1232used to identify the source of measurements and images.  Each row in
1233the \ippdbtable{photcode} table includes a \ippdbtable{photcode}
1234name, a unique numerical ID, and information about that photometry
1235system.
1236
1237There are 3 classes of photcodes defined within the DVO system.  One
1238class of photcodes defines the filter systems for the average
1239photometry measurements; these are called \ippmisc{SEC} photcodes.  A
1240second class of photcodes is associated with measurements from a
1241specific camera for which image metadata is available; these are called
1242\ippmisc{DEP} photcodes.  There are also measurements which come
1243from external data sources for which DVO does not have the information
1244needed to determine a calibration (e.g., instrumental magnitudes and detector
1245coordinates).  These measurements are reference values and are
1246assigned \ippmisc{REF} photcodes.
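The distinction can be summarised as follows (a sketch only; the example photcode names other than the filter names are invented, and real entries also carry a unique numeric ID and photometric-system metadata):
\begin{verbatim}
# Sketch of the three photcode classes described above.
from enum import Enum

class PhotcodeClass(Enum):
    SEC = "filter system used for averaged photometry (e.g., g, r, i)"
    DEP = "camera-specific measurements with image metadata available"
    REF = "external reference measurements with no recalibration info"

# Hypothetical example entries (names invented for illustration).
EXAMPLE_PHOTCODES = {
    "g":       PhotcodeClass.SEC,
    "GPC1.g":  PhotcodeClass.DEP,
    "2MASS.J": PhotcodeClass.REF,
}
\end{verbatim}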
1247
12101248The names for \ippmisc{SEC} photcodes are the names of filter systems,
12111249such as $g,r,i$ or $J,H,K$.  For \ippmisc{DEP} and \ippmisc{REF}

12291267properties derived from multiple measurements, and for which the
12301268measurement-to-image relationship is not provided.  Ingest methods
1231 have been defined for example for 2MASS, WISE, Gaia, USNO-B.  In each
1269have been defined, for example, for 2MASS, WISE, Gaia, USNO-B.  In each
12321270of these cases, the astrometric and photometric measurements are
12331271stored in the \ippdbtable{Measure} table, with the data source

12581296discussed below) and the astrometrically calibrated position.
12591297Astrometric offsets for several systematic corrections discussed below
1260 are also defined for each measurement.  Photometry from chip, warp,
1261 and stack are all placed in the same table with photcodes
1298are also defined for each measurement.  Photometry from \ippstage{chip}, \ippstage{warp},
1299and \ippstage{stack} are all placed in the same table with photcodes
12621300distinguishing the source \note{show example of stack and warp
12631301  photcodes}.  Since stacks and forced warp fluxes may have

12691307For the warp images, we also measure the weak lensing KSB parameters
12701308related to the shear and smear tensors \citep{1995ApJ...449..460K}.
1271 These measurements are stored in the \ippdbcolumn{Lensing} table,
1309These measurements are stored in the \ippdbtable{Lensing} table,
12721310along with the radial aperture fluxes for radii numbers 5, 6, \& 7
12731311(respectively 3.0, 4.63, and 7.43 arcsec).  This table contains one

12811319sorted \ippdbtable{Lensing} table entries.  \note{discuss failure of
12821320  the Lensing to Measure indexing}
1321
1322\note{Average used above but defined below}
12831323
12841324\subsubsubsection{Object Tables}

13591399these photometric distance modulus measurements are not extremely
13601400precise (see below), they provide a constraint on the distance that is used
1361 in our analysis of the astrometry \citep[][see]{magnier2017.calibration}.
1401in our analysis of the astrometry \citep[see][]{magnier2017.calibration}.
13621402
13631403In the \ippdbtable{Measure} table, there are three fields which

14161456determined by the photometry calibration analysis and the astrometric
14171457flat-field corrections determined by the astrometry calibration
1418 analysis \citep[][see]{magnier2017.calibration}.
1458analysis \citep[see][]{magnier2017.calibration}.
1459\note{use names and match DVO schema table}
14191460
14201461\subsubsection{Sky Partition}
14211462
1422 DVO includes two major classes of database tables: those containing
1463\note{re-word this sentence}  DVO includes two major classes of database tables: those containing
14231464information about astronomical objects in the sky and those containing
14241465other supporting information.  The object-related tables are

14381479on the one used by the Hubble Space Telescope Guide Star Catalog
14391480files.  \note{add figure} Level 0 is a single region covering the full
1440 sky.  Level 1 divides the sky in Declination into bands
1441 7.5\degree\ high.  Level 2 subdivides these Declination bands in the
1481sky.  Level 1 divides the sky in declination into bands
14827.5\degree\ high.  Level 2 subdivides these declination bands in the
14421483RA direction, with spacing related to the stellar density.  Level 3
14431484divides these RA chunks into 4 - 8 smaller partitions.  This level

14591500astronomical objects in the database files, with an associated maximum
14601501of \approx 30 million measurements in these files.  With the compression
1461 scheme described above, the largest database files are \approx
1502scheme described below, the largest database files are \approx
146215033GB, which can be loaded into memory in 30 seconds on the processing
14631504machines that contain partition data.

14991540tables are compressed using the (to date) experimental FITS binary
15001541table compression strategy outlined by \note{REF}.  Table compression
1501 is in general an option in DVO; for the PV3 database, the large data
1542is an option in DVO; for the PV3 database, the large data
15021543volume (70TB compressed) drove the decision to compress the tables.
15031544

15051546The FITS binary table compression scheme uses a strategy similar to
15061547that used for FITS image compression (\note{REF}).  The binary tabular
1507 data is compressed and stored in the HEAP' section of the FITS table
1548data is compressed and stored in the ``HEAP'' section of the FITS table
15081549extension, with pointers to the compressed data stored in the regular
15091550data section.  Each column in the FITS table is compressed as one (or

15111552column format (e.g., TFORM1) are replaced with keywords which describe
15121553the location and size of the compressed data in the HEAP section; the
1513 information about the uncompressed data is moved to a keyword with Z'
1554information about the uncompressed data is moved to a keyword with ``Z''
15151556the compression algorithm (e.g., ZCTYP1).  The column names (e.g.,

15331574in the tables.  In practice, we have chosen a default in which
15341575floating point numbers use \code{GZIP_2}, character strings use
1535 \code{GZIP_1}, integers use \code{RICE}.
1576\code{GZIP_1}, and integers use \code{RICE}.
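A minimal sketch of this per-column default (the dtype-to-algorithm mapping follows the text; the helper function and the fallback case are assumptions, not the DVO implementation):
\begin{verbatim}
# Sketch: choose a compression algorithm for a FITS binary-table column
# following the defaults quoted in the text.
import numpy as np

def default_compression(column_dtype):
    kind = np.dtype(column_dtype).kind
    if kind == "f":            # floating point columns
        return "GZIP_2"
    if kind in ("S", "U"):     # character strings
        return "GZIP_1"
    if kind in ("i", "u"):     # integers
        return "RICE"
    return "GZIP_1"            # assumed fallback for other column types

# e.g. default_compression(np.float32) -> 'GZIP_2'
\end{verbatim}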
15361577

15401581Upon completion of the processing of each stage, the results of the
15411582photometry analysis are stored in a large number of individual catalog
1542 files as described in~\ref{XXX}.  The data from these files are loaded
1543 into a DVO database to define the astronomical objects and to allow
1544 for calibration analysis.  The program which loads the data into the
1545 DVO database is called \ippprog{addstar}, and is associated with the
1546 the \ippstage{addstar} processing stage.  The measurement catalogs
1547 generated by the \ippstage{camera}, \ippstage{staticsky},
1548 \ippstage{skycal}, \ippstage{fullforce}, and \ippstage{diff} stages
1549 are processed loaded into DVOs in this fashion, although not every
1550 measurement in each catalog are included in the master DVO that is
1551 constructed.  For a particular re-processing version, a single master
1552 DVO is constructed for the positive image stages (\ippstage{camera},
1553 \ippstage{staticsky}, \ippstage{skycal}, \ippstage{fullforce}) and a
1554 separate one is constructed for the difference image analysis stage
1555 results.
1583files as described in \cite{magnier2017.analysis}.  The data from
1584these files are loaded into a DVO database to define the astronomical
1585objects and to allow for calibration analysis.  The program which
1586loads the data into the DVO database is called \ippprog{addstar}, and
1587is associated with the \ippstage{addstar} processing stage.  The
1588measurement catalogs generated by the \ippstage{camera},
1589\ippstage{staticsky}, \ippstage{skycal}, \ippstage{fullforce}, and
1590\ippstage{diff} stages are processed and loaded into DVOs in this fashion,
1591although not every measurement in each catalog is included in the
1592master DVO that is constructed.  For a particular re-processing
1593version, a single master DVO is constructed for the positive image
1594stages (\ippstage{camera}, \ippstage{staticsky}, \ippstage{skycal},
1595\ippstage{fullforce}) and a separate one is constructed for the
1596difference image analysis stage results.
15561597
15571598The construction of the master DVO is performed in a hierarchical

15641605databases together.  In the merge, astronomical objects are joined
15651606together using essentially the same rules as those used to associate
1566 detections into objects.  One exception: the match radius may be
1607detections into objects with one exception: the match radius may be
15671608chosen to be a different size depending on the data source.  For
15681609example, when WISE data is merged with PS1 data, as discussed below, a

16121653a function of position in the camera (essentially an astrometric
16131654flat-field correction), as a function of the brightness of the star
1614 (the so-called Koppenh\"offer effect, see~\ref{magnier2017.calibration}), and as
1615 a function of airmass and color (Differential chromatic refraction).
1655(the so-called Koppenh\"offer effect, see~\citealt{magnier2017.calibration}), and as
1656a function of airmass and color (differential chromatic refraction).
16161657Once the systematic errors have been measured, they are applied back
16171658to the measurements in the database.  Within the DVO

16241665astrometry is again performed this time using the corrected positions.
16251666
1667\note{have eddie suggest wording here?}
1668
16261669Photometric calibration consists of determination of zero points for
16271670each exposure along with corrections for systematic effects.  In this
16281671case, we rely on the efforts of our external collaborators for the initial
1630 catalog files (smf files') and determined the zero points of those
1673catalog files (``smf files'') and determined the zero points of those
16311674exposures which were believed to be obtained in photometric
1632 conditions.  This process, called \"ubercal', is described in detail
1675conditions.  This process, called ``\"ubercal'', is described in detail
16331676by \cite{2012ApJ...756..158S} for the first (PV1) version.  In brief, photometric
16341677periods, with time-scales of at least \note{half of a night}, are

16381681parameters in this solution consist of a single zero point and airmass
16391682slope for each photometric period along with a collection of
1640 flat-field offsets for several large time range (flat-field
1641 seasons').  For the PV3 \"ubercal analysis, the flat-field offsets
1683flat-field offsets for several large time ranges (``flat-field
1684seasons'').  For the PV3 \"ubercal analysis, the flat-field offsets
16421685were determined on a $2\times2$ grid for each chip and 5 flat-field
16431686seasons were chosen (listed in Table~\ref{tab:flat-field-seasons}).
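Schematically, and in our own notation rather than that of
\cite{2012ApJ...756..158S}, the quantities entering this solution can
be summarized as
\[
m_{{\rm cal},i} = m_{{\rm inst},i} + Z_{p(i)} + k_{p(i)}\,X_i
                + F_{s(i)}(c_i, x_i, y_i),
\]
where $Z_{p(i)}$ and $k_{p(i)}$ are the zero point and airmass slope of
the photometric period containing measurement $i$, $X_i$ is the
airmass, and $F_{s(i)}(c_i, x_i, y_i)$ is the flat-field offset for the
$2\times2$ sub-cell of chip $c_i$ appropriate to the flat-field season
of the exposure; all of these parameters are determined together in the
global solution.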

16731716Telescope Science Institute through their Mikulski Archive for Space
16741717Telescopes (MAST).  The underlying database at MAST is a copy of a
1675 database generated at the Institute for Astronomy by the subsystem
1718database generated at the IfA by the subsystem
16761719called PSPS: the \note{define PSPS}.  The construction of the PSPS
16771720version of the PS1 database starts once the PS1 photometry and

16811724
16821725The first stage of constructing the PSPS database consists of the
1683 generation of small files called batches' which contain a complete
1726generation of small files called ``batches'' which contain a complete
16841727set of measurements for a small chunk of the database tables.  The
16851728program which is responsible for the construction of these batches is

16901733One type of batch consists of measurements from the individual
16911734exposures.  These batches are generated based on the output catalog
1692 files generated at the \ippstage{camera} stage (smf files').  The
1735files generated at the \ippstage{camera} stage (``smf files'').  The
16931736\ippprog{ipptopsps} program loads the complete set of measurements and
16941737metadata from the smf catalog file, then queries the DVO database for

17571800might be run and to regularly generate new commands based on that
17581801concept.  The ``tasks'' are defined using the opihi scripting language
1759 (also shared by DVO and other user-interative programs within the
1802(also shared by DVO and other user-interactive programs within the
17601803IPP).
17611804
1762 Pantasks repeatedly checks each task in an attempt to generate a new
1763 command: we say pantasks attempts to execute' the task in each of
1805\ippprog{Pantasks} repeatedly checks each task in an attempt to generate a new
1806command: we say \ippprog{pantasks} attempts to ``execute'' the task in each of
17641807these attempts.  Tasks may specify the time between execution
17651808attempts, with a 1 second default.

17731816opihi language) which is run each time the task is executed.  The
17741817\code{task.exec} code may refer to variables or other data structures
1775 defined by the opihi language within the pantasks environment.  Within
1818defined by the opihi language within the \ippprog{pantasks} environment.  Within
17761819a single \ippprog{pantasks} instance, all opihi variables and data
17771820structures have global context (\ie, all are visible to all tasks).

17821825
17831826Within the \ippprog{task.exec} macro, the command to be run must be
1784 defined with the function command'.  Once the \ippprog{task.exec}
1785 macro exits successfully, the defined command is the added to the list of jobs
1827defined with the function ``command''.  Once the \ippprog{task.exec}
1828macro exits successfully, the defined command is then added to the list of jobs
17861829to be run within the UNIX environment.  Jobs may be run in one of two
17871830ways: locally or via the parallel processing system.  The task, or the
1788 \ippprog{task.exec} macro, uses the host' command to define how to
1789 run the job.  If the host is set to local', then the job is run in
1790 the background by pantasks itself (using the C \code{execvp}
1831\ippprog{task.exec} macro, uses the ``host'' command to define how to
1832run the job.  If the host is set to ``local'', then the job is run in
1833the background by \ippprog{pantasks} itself (using the C \code{execvp}
17911834function).  Otherwise, the job is sent to the parallel processing
17921835system to be run on another machine within the cluster.  If the host
1793 is set to the special value anyhost', then the parallel processing
1836is set to the special value ``anyhost'', then the parallel processing
17941837system is allowed to choose the processing computer arbitrarily.  Any
17951838other value is taken to be the DNS name of the computer on which this

17981841that the job only runs on the specifically named computer.  Otherwise,
17991842the parallel processing system may choose to redirect the command to
1800 another computer (based on whatever rules are defined for the parallel
1801 processing system).
1843another computer using its own rules, e.g., to balance processing load
1844across the cluster.
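The dispatch rule described above can be summarized with the following
schematic sketch (Python pseudocode with hypothetical helper names; the
actual implementation lives in \ippprog{pantasks} and its opihi
scripts):
\begin{verbatim}
import subprocess

job_queue = []  # stand-in for the parallel processing system's job list

def submit_to_parallel_system(command, preferred_host):
    """Stub standing in for handing a job to pcontrol."""
    job_queue.append((command, preferred_host))

def dispatch(command, host):
    """Schematic host-dispatch rule; names here are hypothetical."""
    if host == 'local':
        # pantasks runs the job itself in the background (cf. execvp)
        return subprocess.Popen(command, shell=True)
    if host == 'anyhost':
        # the parallel processing system picks any idle machine
        return submit_to_parallel_system(command, preferred_host=None)
    # otherwise host names a preferred machine; the parallel processing
    # system may pin the job there or redirect it to balance the load
    return submit_to_parallel_system(command, preferred_host=host)

dispatch('ls /tmp', 'local')
dispatch('run_chip_stage exposure_001', 'anyhost')
\end{verbatim}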
18021845
18031846When the \ippprog{task.exec} macro is run, the code may choose (e.g.,
18041847based on tests of some global variables) to exit the macro with an
1805 error condition, e.g., with the break' command.  In this
1848error condition, e.g., with the ``break'' command.  In this
18061849circumstance, no job is produced by the task.  The task will be tried
18071850again the next time it is executed.  This feature allows for the user

18181861  online user guide?}
18191862
1820 The option npending' may be used to limit the number of jobs which
1863The option ``npending'' may be used to limit the number of jobs which
18211864are simultaneously executed for a specific task.  For example, some
18221865classes of jobs should only be run one-at-a-time because they are not
18231866protected against collisions or they may overload a resource.  The use
1824 of npending' allows these situations to be handled cleanly within
1825 pantasks (avoiding cumbersome coding within with program or supporting
1867of ``npending'' allows these situations to be handled cleanly within
1868\ippprog{pantasks} (avoiding cumbersome coding within the program or supporting
18261869script).
18271870
1828 The option nmax' limits the total number of jobs which a task
1871The option ``nmax'' limits the total number of jobs which a task
18291872generates.  This option may be useful in cases where
18301873\ippprog{pantasks} is used to perform a limited set of operations.
18311874\note{do we actually use this in IPP?}
18321875
1833 The option trange' allows the user to restrict the time period during
1876The option ``trange'' allows the user to restrict the time period during
18341877which the specific task is executed.  This option is given with a
18351878start and an end time for the limiting time range.  These times may be

18461889ranges may be specified.  \note{how are they evaluated?}
18471890
1848 The option \code{nice} specifies the nice' level at which the job is
1891The option \code{nice} specifies the ``nice'' level at which the job is
18491892run when it is executed.  The parallel processing system is expected
18501893to honor this setting.
18511894
18521895The option \code{active} can be used to turn a task on and off for periods
1896of time.  Since a user command or a macro run by \ippprog{pantasks} can
1896periods.  Since a user command or a macro run by \ippprog{pantasks} can
18541897re-define task options, the \code{active} state may be changed

18571900prevent them from running for some reason.
18581901
1859 \subsubsection{pantasks passes jobs to pcontrol}
1902\subsubsection{pcontrol}
18601903
18611904Jobs which are generated by \ippprog{pantasks} may be run locally on

18831926Similarly, the hosts may also have one of several states: off, down,
18841927busy, idle, etc.  A single host can accept a single job at a time.
1885 Multiple hosts instances corresponding to the same machine may be
1928Multiple host instances corresponding to the same machine may be
18861929specified, allowing a single computer to run more than one simultaneous
18871930job.
18881931
1890 them to the list of jobs to execute.  It also accepts from pantasks
1933them to the list of jobs to execute.  It also accepts from \ippprog{pantasks}
18911934the names of computers on which it is allowed to run those jobs.
18921935
1893 \subsubsection{pcontrol passes jobs to pclient}
1894
1895 When pcontrol is provided with the name of a computer, it will attempt
1936\subsubsection{pclient}
1937
1938When \ippprog{pcontrol} is provided with the name of a computer, it will attempt
18961939to make a connection to that machine via ssh (or rsh?).  When a
18971940connection is made, the remote shell is used to run a special
18981941interface program called \ippprog{pclient}.  This program accepts
1899 command lines from pcontrol and is responsible for executing the
1942command lines from \ippprog{pcontrol} and is responsible for executing the
19001943individual commands in the local shell environment.  A single ssh
1901 connection to a remote host keeps a single pclient shell running for a
1944connection to a remote host keeps a single \ippprog{pclient} shell running for a
19021945somewhat arbirarly long time, excuting many shell commands as needed.
19031946This architecture avoids wasting overhead making the ssh connection to

19061949architecture is allowed to be very light and short running if needed.
19071950
1908 After pcontrol sends a job (commands) to a specific pclient, it checks
1951After \ippprog{pcontrol} sends a job (commands) to a specific \ippprog{pclient}, it checks
19091952back occasionally to see if the command has been run and executed.  If
1910 it has finished, then pcontrol will query for the exit status, the
1953it has finished, then \ippprog{pcontrol} will query for the exit status, the
19111954standard output and standard error streams from the command.  (where
1912 do these go, back to pantasks?), with the results associated with the
1913 job statistics.  At that point, the pclient on the remote machine is
1914 ready to accept a new job from pcontrol.  If any jobs are pending in
1915 the list of jobs known to pcontrol, it will send those jobs to any
1955do these go, back to \ippprog{pantasks}?); these results are associated with the
1956job statistics.  At that point, the \ippprog{pclient} on the remote machine is
1957ready to accept a new job from \ippprog{pcontrol}.  If any jobs are pending in
1958the list of jobs known to \ippprog{pcontrol}, it will send those jobs to any
19161959machines which are idle.
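The exchange between \ippprog{pcontrol} and the \ippprog{pclient}
instances amounts to a simple polling cycle.  A minimal sketch of that
cycle is shown below (Python, with a local stand-in for the pclient
connection; the names are hypothetical and the real programs are
written in C):
\begin{verbatim}
import subprocess, time

class LocalClient:
    """Stand-in for a pclient connection; runs jobs on the local host."""
    def __init__(self):
        self.proc = None
    def send(self, command):
        self.proc = subprocess.Popen(command, shell=True,
                                     stdout=subprocess.PIPE,
                                     stderr=subprocess.PIPE)
    def finished(self):
        return self.proc is not None and self.proc.poll() is not None
    def collect(self):
        out, err = self.proc.communicate()
        status = self.proc.returncode
        self.proc = None
        return status, out, err

def pcontrol_loop(pending_jobs, clients):
    """Schematic polling cycle: send jobs, check back, collect results."""
    running, results = {}, []
    while pending_jobs or running:
        for c in clients:                       # hand jobs to idle clients
            if c not in running and pending_jobs:
                c.send(pending_jobs.pop(0))
                running[c] = True
        for c in [c for c in running if c.finished()]:
            results.append(c.collect())         # exit status, stdout, stderr
            del running[c]                      # client is idle again
        time.sleep(1)
    return results

print(pcontrol_loop(['ls /tmp', 'echo done'], [LocalClient(), LocalClient()]))
\end{verbatim}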
19171960
1918 While pcontrol interacts with the many remote machines, it
1919 occasionally interacts with pantasks to report the results from the
1920 jobs it has been monitoring.  Pantasks occasionally requests a list of
1961While \ippprog{pcontrol} interacts with the many remote machines, it
1962occasionally interacts with \ippprog{pantasks} to report the results from the
1963jobs it has been monitoring.  \ippprog{Pantasks} occasionally requests a list of
19211964the completed jobs.  It then requests the status information for each
19221965completed job, including the standard error and standard output.  As
1924 from the list managed by pcontrol.  Thus pcontrol maintains at most a
1925 modest list of jobs which are in flight', leaving all interpretation
1927
1967from the list managed by \ippprog{pcontrol}.  Thus \ippprog{pcontrol} maintains at most a
1968modest list of jobs which are ``in flight'', leaving all interpretation
1970
19291972exit status and output products from each job.  For example, the
19301973stderr and stdout may be specified to go to a file (with static name

19361979started.  This mode is useful for testing as all errors are reported
19371980back to the opihi shell.  However, when the user exits the shell, the
1938 pantasks instance exits, shutting down pcontrol and all remote client
1939 connections.  In standard operations, pantasks is run in a client
1981\ippprog{pantasks} instance exits, shutting down \ippprog{pcontrol} and all remote client
1982connections.  In standard operations, \ippprog{pantasks} is run in a
19401983client-server mode.  The server runs continuously in the background and
19411984multiple users may connect via the \ippprog{pantasks_client} program.
19421985Users can then send commands to the server to load scripts, add
1943 parallel hosts, check status, and start or stop the pantasks
1986parallel hosts, check status, and start or stop the \ippprog{pantasks}
19441987operations.
19451988

19561999end
19572000\end{verbatim}
1958  \caption{\label{fig:task_example} Example of a simple static
1960    this example, pantasks would run a single instance of the command
1961    ({\tt ls /tmp}) every 5 seconds, sending the stdout and stderr to
1962    the listed files. }
2001\caption{\label{fig:task_example} Example of a simple static
2003  this example, \ippprog{pantasks} would run a single instance of the command
2004  ({\tt ls /tmp}) every 5 seconds, sending the stdout and stderr to
2005  the listed files. }
19632006  \end{center}
19642007\end{figure}

19692012
1970 Pantasks provides an environment in which commands can be generated
2013\ippprog{Pantasks} provides an environment in which commands can be generated
19712014and extensive parallel processing managed.  The details of how to
19722015implement the different stages of IPP processing are captured in a
1973 collection of scripts written for pantasks in the \code{opihi}
2016collection of scripts written for \ippprog{pantasks} in the \code{opihi}
19742017language.  In general, each stage is defined by an associated script
19752018collected together under the \ippmisc{ippTasks} collection.  While

20012044row in the result set, each column in the row is stored as a separate
20022045line on the \ippmisc{page}, identified by the database column name.  An
20042047can manage the processing of the job which will be generated by this
2005 page.  When the page is first generate, the
2048page.  When the page is first generated, the
20062049\ippdbcolumn{pantasksState} is set to \ippmisc{INIT}, indicating that
20072050this \ippmisc{page} is a new addition to the \ippmisc{book}.  Once all

20182061construct the appropriate command-line (e.g., lines in the page may
20192062include input file names and output file names for the specific item
2020 in the database).  The resulting command becomes a job in the pantasks
2063in the database).  The resulting command becomes a job in the \ippprog{pantasks}
20212064collection of jobs.  Most IPP analysis stages specify that the jobs
2022 are then sent to pcontrol for parallel process.  Before task generates
2065are then sent to \ippprog{pcontrol} for parallel processing.  Before the task generates
20232066the job, the \ippdbcolumn{pantasksState} is set to \ippmisc{RUN} so a
20242067future execution of the task will not attempt to re-run this specific job.

20292072this responsibility is left to the program which ran the analysis.
20302073IPP analysis steps normally consist of two main elements: a C-language
2031 program to do the data analysis work and a supporting perl script
2074program to do the data analysis work and a supporting Perl script
20322075which performs the database update once the analysis finishes.  Upon completion,
20342077status within the book, but not within the processing database.  This
2035 split keeps the interactions at the pantasks level relatively light,
2078split keeps the interactions at the \ippprog{pantasks} level relatively light,
20362079leaving the overhead of the database interaction within the job
20372080running on one of the computing machines in the cluster.

20422085clear jobs which have failed with one of the ephemeral failure modes
20432086(see the discussion in Section~\ref{sec:processing.database}).  This
2044 step allows these failures to be cleared from the system, and schedule
2045 those jobs again for a retry
2087step clears these failures from the system, allowing
2088those jobs to be scheduled again.
20462089

20662109discussed above, the query to the processing database for new items is
20672110restricted to a set of user-defined labels.  A given instance of
2068 pantasks will be supplied a set of labels which are then applied to
2111\ippprog{pantasks} will be supplied a set of labels which are then applied to
20702113manages the nightly processing of the basic science analysis stages
2071 (chip - warp, stack, diff) is supplied with several labels which
2114(\ippstage{chip}--\ippstage{warp}, \ippstage{stack}, \ippstage{diff}) is supplied with several labels which
20722115correspond to the different kinds of observations being performed.  In
20732116this way, the analysis of the nightly observations is kept separate

20832126\note{then discuss the addstar sequences with manual triggering}
20842127
2085 Outside of the basic sequence of chip to warp, there is no single
2128Outside of the basic sequence of \ippstage{chip} to \ippstage{warp}, there is no single
20862129natural next step.  For example: a stack can be generated with any
20872130number of input warps; a difference image can be generated between a

21032146significantly reduced from the arbitrary case.
21042147
2105 {\em Queuing the diffs} is done by first examining the set of all
2148Queuing the diffs is done by first examining the set of all
21062149exposures that have been taken at the summit on the current night of
21072150observing, and querying information from each stage up through

21112154group are then sorted by increasing observation date
21122155(\ippdbcolumn{dateobs}).  The database results for each stage
2113 (chip-warp) are checked to ensure that the selected exposures have
2156(\ippstage{chip}--\ippstage{warp}) are checked to ensure that the selected exposures have
21142157been successfully processed for all stages through \ippstage{warp}.
21152158Exposure groups are ignored until all exposures have either been

21292172that were excluded due to an odd number of exposures to be paired with
21302173the exposure closest in time (with the exposure that was previously
2131 first ignored).  Exposure pairs in which at least one exposures does
2174first ignored).  Exposure pairs in which at least one exposure does
21322175not have a pre-existing difference image are queued for difference
21332176image analysis.
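A simplified sketch of this pairing logic, omitting the per-group
bookkeeping and some of the tie-breaking described above, is (Python,
hypothetical names):
\begin{verbatim}
def queue_diff_pairs(exposures, has_diff):
    """exposures: list of (dateobs, exp_id); has_diff: existing-pair test."""
    exps = sorted(exposures)                  # increasing observation date
    pairs = [(exps[i][1], exps[i + 1][1])
             for i in range(0, len(exps) - 1, 2)]
    if len(exps) % 2 == 1:                    # a leftover, unpaired exposure
        last = exps[-1]
        closest = min(exps[:-1], key=lambda e: abs(e[0] - last[0]))
        pairs.append((closest[1], last[1]))   # pair with the closest in time
    # only queue pairs lacking a pre-existing difference image
    return [p for p in pairs if not has_diff(*p)]

print(queue_diff_pairs([(1.0, 'a'), (1.1, 'b'), (1.3, 'c')],
                       lambda x, y: False))
\end{verbatim}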

21382181exposures, as this is the number of exposures taken for each field.
21392182Once this number is reached, no more exposures are expected, so
2140 \ippstage{stack} database entries can be queued with the
2183\ippstage{stack} database entries can be queued from the
21412184\ippstage{warp} entries.  Again, failures and weather can reduce the
21422185number of usable exposures.  If no stack could be made for a given MD
21432186field with the minimum number of inputs by the time of the
2144 end-of-night darks, stacks are generated using using whatever
2187end-of-night darks, stacks are generated using whatever
21452188exposures are available.
21462189

21612204\ippdbtable{lapRun} entries can be queued that define a
21622205\ippdbcolumn{filter} and a \ippdbcolumn{projection_cell} to be
2163 considered.  A \ippdbcolumn{projection_cell} is a unit of sky defined
2164 to be a square four degrees on each side which has a single tangent
2165 plane projection \citep[][see]{waters2017}.  \note{does waters2017
2166   discuss RINGS.V3? if not, where?}  Once this entry is defined, is is
2167 populated with exposures (stored in the \ippdbtable{lapExp} table in
2168 the database), with any exposure located within 5 degrees of the
2169 center of the projection cell included.  This radius ensures that any
2170 exposure that overlaps the projection cell will be included.  Once the
2171 exposures have been added, the other exposures within the same
2172 sequence are checked to see if a \ippstage{chip} stage entry has been
2173 generated, and if so, the \ippdbcolumn{chip_id} for that entry is
2174 saved into the \ippdbtable{lapExp} as well.  This linkage ensures that
2175 each exposure is only processed once.  If no entry is found, a new
2176 \ippstage{chip} entry is queued for processing.  The task periodically
2177 checks the status of the exposures in each \ippdbtable{lapRun} entry,
2178 and if they have all completed the \ippstage{warp} stage, then a
2179 \ippstage{stack} is queued for each skycell contained within the
2206considered.  These projection cells match the tangent plane centers
2207used for the warp tessellation.  A \ippdbcolumn{projection_cell} is a
2208unit of sky defined to be a square four degrees on each side which has
2209a single tangent plane projection \citep[see][]{waters2017}.
2210\note{does waters2017 discuss RINGS.V3? if not, where?}  Once this
2211entry is defined, it is populated with all exposures (stored in the
2212\ippdbtable{lapExp} table in the database) that are located
2213within 5 degrees of the center of the projection cell.  This
2214radius ensures that any exposure that overlaps the projection cell
2215will be included.  Once the exposures have been added, the other
2216exposures within the same sequence are checked to see if a
2217\ippstage{chip} stage entry has been generated, and if so, the
2218\ippdbcolumn{chip_id} for that entry is saved into the
2219\ippdbtable{lapExp} as well.  This linkage ensures that each exposure
2220is only processed once.  If no entry is found, a new \ippstage{chip}
2221entry is queued for processing.  The task periodically checks the
2222status of the exposures in each \ippdbtable{lapRun} entry, and if they
2223have all completed the \ippstage{warp} stage, then a \ippstage{stack}
2224is queued for each skycell contained within the
21802225\ippdbcolumn{projection_cell}.
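The essence of this queuing logic can be sketched as follows (Python,
with hypothetical data structures and helper functions standing in for
the database tables and IPP tools):
\begin{verbatim}
def populate_lap_run(lap_run, exposures, angular_sep, queue_chip):
    """Attach every exposure within 5 deg of the cell center to the lapRun."""
    for exp in exposures:
        if angular_sep(exp, lap_run) <= 5.0:
            # reuse an existing chip-stage entry so each exposure is only
            # processed once; otherwise queue a new chip-stage entry
            chip_id = exp.get('chip_id') or queue_chip(exp)
            lap_run['lapExp'].append({'exp_id': exp['id'], 'chip_id': chip_id})

def check_lap_run(lap_run, warp_done, skycells):
    """Queue one stack per skycell once every exposure has a finished warp."""
    if lap_run['lapExp'] and all(warp_done(e['chip_id'])
                                 for e in lap_run['lapExp']):
        return [('stack', cell) for cell in skycells]
    return []

lap_run = {'ra': 180.0, 'dec': 0.0, 'lapExp': []}
exposures = [{'id': 'e1', 'ra': 181.0, 'dec': 1.0, 'chip_id': None}]
# flat-sky separation, adequate for this toy example only
sep = lambda e, r: ((e['ra'] - r['ra'])**2 + (e['dec'] - r['dec'])**2) ** 0.5
populate_lap_run(lap_run, exposures, sep, queue_chip=lambda e: 'chip_new')
print(check_lap_run(lap_run, warp_done=lambda c: True, skycells=['s1', 's2']))
\end{verbatim}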
21812226

21922237system per se, but only a method of tracking the locations of files
21932238within the file system, and of tracking duplicate copies of the same
2194 file.  The core of \ippprog{Nebulous} is a dedicated database engine
2195 which tracks storage objects'', the concept of a file exists in the
2239file.  The core of \ippprog{Nebulous} is a MySQL database which tracks
2240``storage objects'', the equivalent of a file within the
21962241system.  Each storage object may be associated with a number of copies
21972242of the actual files on the disks in the storage system (called

22132258stored on a specific computer (for at least one of the instances).
22142259All of the analysis stages which interact with that chip could then be
2215 preferentially targetted to be run on that computer.  The localization
2216 in \ippprog{Nebulous} and the host targetted processing in pantasks
2260preferentially targeted to be run on that computer.  The localization
2261in \ippprog{Nebulous} and the host targeted processing in \ippprog{pantasks}
22172262can therefore work together to encourage processing to require only
22182263local disk access, reducing the I/O load on the network

22212266practice, the as-built IPP has had sufficient network bandwidth that
22222267this targeting was not required.  Moreover, due to the timing of
2223 hardware aquisition, occasional hardware failures, and other
2224 organizational details, targetted processing has only been used to a
2268hardware acquisition, occasional hardware failures, and other
2269organizational details, targeted processing has only been used to a
22252270moderate degree within the Pan-STARRS cluster. \note{can we get a
22262271  number here?}

22292274
22302275The user interfaces to Nebulous consist of command-line programs as
2231 well as APIs in both C and Perl.  The basic user commands to interact
2232 with Nebulous are to 1) create a new storage object and associated
2233 instance; 2) add a new instance to an existing storage object; 3)
2234 remove (cull) an instance; 4) delete a storage object; and 5) find a
2235 file associated with a given storage objects.  Note that these user
2236 commands do not affect the files on disk \note{true for cull?}
2237 (exception: the create function will create an empty file if one does
2238 not exist).  They only change the state of the Nebulous database; it
2239 is the responsibility of the user program to read and write data to a
2240 file and to create the copies, etc.
2276well as APIs in both C and Perl.
2277
2278The basic user commands to interact with Nebulous are to 1) query the
2279database for an existing storage object, and find a valid file
2280instance associated with that object; 2) create a new storage object,
2281which instantiates an empty file that can be opened for writing; 3)
2282replicate an existing storage object to create more file instances; 4)
2283cull a single file instance of a storage object from the cluster; and 5)
2284remove a storage object, and ensure that all file instances are
2285removed.  The filehandles returned for newly created instances can
2286then be opened for reading and writing data to that instance.
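These five operations can be illustrated with a toy in-memory model
(this is not the Nebulous API; it only mirrors the bookkeeping and
omits the copying and deletion of the actual file data):
\begin{verbatim}
class ToyNebulous:
    def __init__(self):
        self.objects = {}              # ext_id -> list of instance paths

    def create(self, ext_id, path):    # (2) new object plus empty instance
        self.objects[ext_id] = [path]
        open(path, 'w').close()        # empty file, ready to be written
        return path

    def query(self, ext_id):           # (1) find a valid instance
        return self.objects.get(ext_id, [None])[0]

    def replicate(self, ext_id, path): # (3) record an additional instance
        self.objects[ext_id].append(path)

    def cull(self, ext_id, path):      # (4) drop a single instance
        self.objects[ext_id].remove(path)

    def remove(self, ext_id):          # (5) drop the object and all instances
        del self.objects[ext_id]

neb = ToyNebulous()
neb.create('example/ext_id.fits', '/tmp/instance_1.fits')
neb.replicate('example/ext_id.fits', '/tmp/instance_2.fits')
print(neb.query('example/ext_id.fits'))
\end{verbatim}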
2287
2288% The basic user commands to interact
2289% with Nebulous are to 1) create a new storage object and associated
2290% instance; 2) add a new instance to an existing storage object; 3)
2291% remove (cull) an instance; 4) delete a storage object; and 5) find a
2292% file associated with a given storage objects.  Note that these user
2293% commands do not affect the files on disk \note{true for cull?}
2294% (exception: the create function will create an empty file if one does
2295% not exist).  They only change the state of the Nebulous database; it
2296% is the responsibility of the user program to read and write data to a
2297% file and to create the copies, etc.
22412298
22422299For the Nebulous users, the identifier of a storage object is a unique

22472304computer (HOST) and disk (VOL).  The path and filename portions become
22482305the identifier and are recorded in the \ippmisc{storage_object} table
2249 in the \ippmisc{extern_id} field.  A storage object entry is then
2250 created in the database for this id, and an instance of the file
2251 created on the specified node (or at random from available nodes if
2252 left empty).
2306in the \ippmisc{ext_id} field.  A storage object entry is then created
2307in the database for this id, and an instance of the file is created on
2308the specified node.  If the host is unspecified, or if the specified
2309volume is full, then a host is chosen at random from available nodes.
22532310
22542311Files are stored on specific computers in a \ippprog{Nebulous}

22582315\code{nebulous}.  Beneath the top-level directory are 256
22592316subdirectories with names of the form 00--ff (i.e., two-digit
2260 hexadecimate number).  Each subdirectory again as 256 subdirectories
2261 with the same naming scheme.
2317hexadecimal numbers).  Each subdirectory has 256 subdirectories with
2318the same naming scheme.
22622319
22632320The filename of an instance in Nebulous is deterministic and derived
2264 from the \ippmisc{extern_id}: the \ippmisc{extern_id} is hashed using
2321from the \ippmisc{ext_id}: the \ippmisc{ext_id} is hashed using
22652322the SHA-1 function, and the first four hexadecimal digits of this hash
22662323are separated into two two-digit strings and used as the top and

22692326provide a unique SQL ID for each instance.  Under the subdirectory
22702327identified above, the disk file name is formed by combining the database
2271 instance id with a string derived from the \code{extern_id}: forward
2328instance id with a string derived from the \code{ext_id}: forward
22722329slash characters are replaced in the name with colons so the string
22732330can represent a file in the UNIX filesystem.  For the example URI

23332390using only the low-latency SOAP communications.
23342391
2335 \note{need a paragraph or two on stats: how many objects, how many
2336   instances?}
2392The Nebulous database currently (2017 July) contains information about
23935,560,533,654 file instances for 3,543,240,981 storage objects.  All
2394raw data, along with permanent products such as catalogs and the
2395current versions of full-sky stacks, are replicated to ensure at least
2396two copies exist in case of hardware failure.  Based on the most
2397recent database ID values (which are unique and never reused), this
2398corresponds to roughly half of all the storage objects and file
2399instances ever created, due to the transient nature of many pipeline
2400products.
2401
2402% those numbers are so_id 6758205602 ins_id 9971666505, with ratios
2403% 0.5242, 0.5576)
23372404
23382405\subsection{Datastore repositories}

23432410that exposes data in a common form.  \note{add Isani / Hoblitt
23442411  reference?}  One of the main datastores used by the IPP is the one
2345 located at the summit.  This datastore exposes, a list of the
2412located at the summit.  This datastore exposes a list of the
23462413exposures obtained since the start of the PS1 operations.  Requests to
23472414this server may be restricted to the most recent entries by time.  Each row in the

23532420associated with that exposure.  This listing includes a link to the
23542421individual chip FITS files as well as an md5 checksum.  Systems which
2355 are allowed access may download chip FITS files via http requests to
2422are allowed access may download the raw chip FITS files via http requests to
23572424

25092576These storage nodes are not fully capable of completing all processing
25102577on the short timescale necessary for each night's worth of data.  To
2511 increase the processing capability, we have a large number
2512 \note{actual number?} of compute'' nodes, that have small amounts of
2513 local storage, but are able to add processing power.  In addition to
2514 the direct processing of image data, these nodes are also used to
2515 manage the \ippprog{Nebulous} file interface, as well as controlling
2516 the job scheduling for the processing.
2578increase the processing capability, we have 212 ``compute'' nodes that
2579have small amounts of local storage, but are able to provide
2581image data, these nodes are also used to manage the \ippprog{Nebulous}
2582file interface, as well as controlling the job scheduling for the
2583processing.
25172584
25182585The final type of computer in the cluster comprises the database servers.

26312698products are present.
26322699
2633 Approximately half of the chip through warp processing for the PV3
2634 reduction was performed on Mustang, with 201,040 / 375,573 of the
2635 \ippstage{camera} stage products reduced there.  Only processing
2636 through the \ippstage{stack} stage was attempted, although with a
2637 smaller fraction of the total compared to the \ippstage{camera} stage,
2638 with 290,257 / 998,886 being produced at Los Alamos.  One reason for
2639 this decrease is that due to the memory constraints on the Mustang
2640 processing nodes, we were unable to run stacks with more than 25
2641 inputs there.  Stacks with this larger number of inputs overflow the
2642 memory of the processing node, and as they do not have disk space
2643 available for use as virtual memory, cause the machine to hang until
2644 the job time limit is reached.  These stacks were instead processed on
2645 the regular IPP cluster, where hosts with sufficent memory were
2646 available.
2700Approximately half of the \ippstage{chip} through \ippstage{warp}
2701processing for the PV3 reduction was performed on Mustang, with
2702201,040 / 375,573 of the \ippstage{camera} stage products reduced
2703there.  Only processing through the \ippstage{stack} stage was
2704attempted, although with a smaller fraction of the total compared to
2705the \ippstage{camera} stage, with 290,257 / 998,886 being produced at
2706Los Alamos.  One reason for this decrease is that, due to the memory
2707constraints on the Mustang processing nodes, we were unable to run
2708stacks with more than 25 inputs there.  Stacks with larger numbers of
2709inputs overflow the memory of the processing node, and as they do not
2710have disk space available for use as virtual memory, cause the machine
2711to hang until the job time limit is reached.  These stacks were
2712instead processed on the regular IPP cluster, where hosts with
2713sufficient memory were available.
26472714
26482715\subsection{UH Cray Cluster}