In light of continuing confusion about the psphot background measurements, I performed a series of tests to verify the consistency and possible error sources of the sky levels from the raw to stack stages for the 12-input test stack for MD04/skycell.055, built from exposures (o6538g0139o, o6540g0157o, o6540g0213o, o6543g0266o, o6550g0059o, o6558g0328o, o6567g0066o, o6569g0065o, o6569g0073o, o6569g0081o). These exposures were chosen to match the test stack Peter made. All IPP processing was done using the standard processing recipes, although with the random seed fixed to the value 1234; this ensures that differences due to pixel selection are removed. In addition, each processing stage was repeated both with the standard quantized integer output images and with floating point uncompressed images. Image histograms were computed in MANA, using a fixed 0.25 bin width for all images. For RAW and CHIP stage images, pixels with any masked bits set were replaced with NAN values and thereby excluded from further consideration; this operation was not needed for the WARP and STACK images, as they have bad pixel data replaced with NAN before being written out. Non-IPP Gaussian fits were run on these histograms, with all unpopulated and end point bins removed. MANA assigns points outside of the histogram range to the end bins, so the ranges were chosen sufficiently far from the sky distribution that these bins are not influential (RAW: +/- 10000, CHIP/WARP: +/- 100, STACK: +/- 2000).
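As a sketch of the external fitting procedure (an illustration on simulated pixels, not the MANA code; the least-squares parabola fit to the log-counts is my stand-in for the Gaussian fit, and the pixel values are illustrative of the RAW-stage example below):

```python
import math, random

random.seed(1234)

# Simulated sky pixels standing in for the masked ppImage output
# (744 +/- 29 counts, roughly matching the example OTA discussed below).
pixels = [random.gauss(744.0, 29.0) for _ in range(200_000)]

BIN = 0.25                          # fixed bin width used for all histograms
LO, HI = 744.0 - 300, 744.0 + 300   # range chosen well clear of the sky peak
nbins = int((HI - LO) / BIN)

# Fixed-width histogram; out-of-range values pile up in the two end bins.
counts = [0] * nbins
for p in pixels:
    i = min(max(int((p - LO) / BIN), 0), nbins - 1)
    counts[i] += 1

# Drop unpopulated bins and the two end bins before fitting.
pts = [(LO + (i + 0.5) * BIN, c) for i, c in enumerate(counts)
       if c > 0 and 0 < i < nbins - 1]

# ln(counts) of a Gaussian is a parabola; fit c0 + b*x + a*x^2 by least
# squares, centering x on the peak bin to keep the normal equations stable.
x0 = max(pts, key=lambda t: t[1])[0]
sx, sy = [0.0] * 5, [0.0] * 3
for x, c in pts:
    d, lc = x - x0, math.log(c)
    for k in range(5):
        sx[k] += d ** k
    for k in range(3):
        sy[k] += lc * d ** k

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

M = [[sx[0], sx[1], sx[2]], [sx[1], sx[2], sx[3]], [sx[2], sx[3], sx[4]]]
theta = []
for j in range(3):  # Cramer's rule for the 3x3 normal equations
    Mj = [row[:] for row in M]
    for i in range(3):
        Mj[i][j] = sy[i]
    theta.append(det3(Mj) / det3(M))
c0, b, a = theta

mean = x0 - b / (2 * a)           # parabola vertex -> Gaussian mean
sigma = math.sqrt(-1 / (2 * a))   # curvature -> Gaussian sigma
```

Note that the unweighted fit gives the sparsely populated tail bins equal weight with the well-sampled core, which is one way a fit like this can drift from the psphot values.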


  1. The compressed and uncompressed images produce different answers in psphot. For the chip data below, median(compressed - uncompressed) = +0.086 for the mean, and -1.29 for the sigma values. Repeating this with the gaussfit results gives median(c-uc) = -0.001589 for the mean and 0.0146 for the sigma. The median BSCALE value (which influences the compression) is 2.293.
  2. There is a trend of increasing mean bias with increasing sigma. For the compressed data, this trend is (psphot_mean - gaussfit_mean) = -0.21 + 0.01 * sigma. The trend for the uncompressed data is (delta mean) = -0.174 + 0.009 * sigma. The uncompressed trend is strongly influenced by outliers.
  3. Warp pixels have a narrower distribution due to the smoothing introduced in the warping process.
  4. Stack output pixel distributions do not appear altered based on whether compressed or uncompressed warps are used as inputs.
  5. No background subtraction is performed on the stack pixels during the stacking process. A background is constructed (and subtracted) during the staticsky/photometry stage, which should find and remove any residual background levels that result due to biases on the stack's input products.


The images grouped in the RAW stage are not true raw OTA data. They have been processed with ppImage to apply all applicable detrends, and to mosaic the images into the standard 4800x4800 arrangement. However, no background subtraction was performed, leaving any sky values in place. The following plots show one of the inputs to the stack, o6543g0266o XY33. The top row shows the results for the compressed output image, with the uncompressed images on the bottom. The logarithmic plots on the right indicate that the lower tail of the pixel distribution is Gaussian.

[plots: linear and log histograms]

The next set of plots compares the background mean and sigma values calculated by psphot against the external Gaussian fits. The means agree to within a count, although there is a slight trend with sigma (and, as the mean sky level here is essentially sigma^2, with the mean as well), such that psphot produces higher background levels than the simple Gaussian fit (for the example OTA, the fit values are PSPHOT: (744.387622, 29.123441), GAUSSFIT: (744.217, 30.1759)).

[plots: mean, sigma, delta mean, delta sigma, delta sigma (zoom)]


The CHIP stage product has had the psphot background subtracted from it, and so is expected to be centered on zero. Following the same OTA as before, we find that the distribution is at zero, and running the measurements again on the image yields PSPHOT: (0.091377, 27.990451), GAUSSFIT: (-0.00620696, 29.5627) for the compressed image, and PSPHOT: (0.016442, 28.999453), GAUSSFIT: (0.0406877, 29.5508) for the uncompressed image.

All methods agree that the pixels are within +/- 0.1 of zero. However, although the compressed/gaussfit mean is effectively zero, a close check of the logarithmic plot shows that that fit is biased low, so there does seem to be a small positive bias remaining on this image.

[plots: linear and log histograms]

Looking at the distribution for all the OTAs, a similar sigma-based trend in the mean is visible between the two methods, with the compressed data showing a larger effect. Given that the BSCALE values for the chip images have a median of 2.299 (mean 2.83442, sigma 1.81574, min 1.38277, median 2.29866, max 15.38), this is at least partially an issue with the rebinning for the Gaussian-fit histogram.
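A minimal sketch of why the rebinning matters: with a quantization step (BSCALE) of ~2.3 and a histogram bin width of 0.25, only a small fraction of the bins can ever be populated (the pixel values and the quantization scheme here are illustrative, not the IPP compression code):

```python
import random

random.seed(1234)
BSCALE = 2.3   # illustrative chip-stage quantization step (median from above)
BIN = 0.25     # histogram bin width used for the external Gaussian fits

# fpack-style scaling: store round(value / BSCALE) as an integer, then
# reconstruct by multiplying back, so every pixel lands on a multiple of BSCALE.
pixels = [random.gauss(0.0, 29.0) for _ in range(100_000)]
quantized = [round(p / BSCALE) * BSCALE for p in pixels]

# With BIN much smaller than BSCALE, at most ~BIN/BSCALE of the histogram
# bins can be populated; the rest are empty and get dropped before fitting,
# which distorts the Gaussian fit.
filled = {int(q // BIN) for q in quantized}
span = int(200 / BIN)   # bins covering [-100, 100)
fraction_filled = sum(1 for b in filled if -span // 2 <= b < span // 2) / span
```

Here `fraction_filled` comes out near BIN/BSCALE ≈ 0.11, i.e. roughly nine out of ten 0.25-count bins are structurally empty.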

[plots: mean, sigma, delta mean, delta sigma]


The following plots show the skycell.055 warp for o6543g0266o, along with all the OTAs that contribute to it. I did not trim the OTAs to exclude regions not included in the warp, and the scaling of the OTA data is set to match the peak values. The warp distribution is narrower than any of the OTAs: the warping process produces each output pixel as a combination of a set of input pixels, which is a smoothing operation, so the distribution of pixels in a warp is expected to have a smaller sigma than in a chip.
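The narrowing can be sketched with uncorrelated noise: if an output pixel is a weighted combination of input pixels with weights w_i, its variance is sum(w_i^2) times the input variance. For an illustrative equal-weight four-pixel combination (a stand-in for the warp kernel, which has different weights):

```python
import random, statistics

random.seed(1234)
sigma_chip = 29.0
chip = [random.gauss(0.0, sigma_chip) for _ in range(100_000)]

# Equal-weight combination of four independent pixels: the output variance is
# sum(w_i^2) = 4 * 0.25^2 = 0.25 of the input variance, so sigma is halved.
warp = [0.25 * (chip[i] + chip[i + 1] + chip[i + 2] + chip[i + 3])
        for i in range(0, len(chip) - 3, 4)]

sigma_warp = statistics.pstdev(warp)   # ~ sigma_chip / 2
```

The real warp kernels are less aggressive than this, so the observed narrowing is smaller, but the direction of the effect is the same.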

The logarithmic plots show that the Gaussian fit matches the central distribution somewhat more tightly than in the chip. The fit values for this warp are PSPHOT: (-0.010854, 24.262478), GAUSSFIT: (0.0364079, 25.6469) for the compressed image and PSPHOT: (-0.033963, 25.211044), GAUSSFIT: (0.078182, 25.6508) for the uncompressed image.

[plots: linear and log histograms]

The warp trends are similar to before, with a slight increase in the mean difference with sigma. The compressed fits tend to be slightly higher than the uncompressed fits, and there is again an offset and a larger trend in the sigma difference for the compressed data.

[plots: mean, sigma, delta mean, delta sigma]


The previous plots also include the values for the stack, with a scaling factor of 14 applied (following Peter's scaling relation between the stack and warps). Despite the labels, the output stack that was used was uncompressed; the "comp" stack is a stack produced from the compressed warp data. The stacks fall along the trends for the uncompressed warps, but there is no significant difference between the two stacks; ppStack produces nearly identical outputs regardless of whether it reads compressed or uncompressed warps.

The following pixel histograms for the stack have not been scaled by 14, and present the true image values. The fit values for these are PSPHOT: (-0.697008, 221.535437), GAUSSFIT: (0.552266, 224.454) for the compressed input stack, and PSPHOT: (-1.054158, 221.984595), GAUSSFIT: (0.552266, 224.454) for the uncompressed input stack.

[plots: compressed-input and uncompressed-input stacks]


Nigel sent along a series of plots based on MD04 stack tests. I do not fully agree with his interpretation of these plots. First, his plot of the stack image sky sigma (based on the pixel data) with exposure time looks fine. This shows that as more images are combined, the resulting image sigma also increases as the square root of the exposure time. This is what we should expect.

The next plot is where we disagree. This plot shows the mean pixel sky level divided by the number of input warps for the stack, as a function of the number of input warps. The number of input warps scales with the exposure time of the final stack, so the x-axis on this plot is in large part just a proxy for the sky sigma of the image. The bias in the psphot background mean measurements has a sigma dependence, due to the way we perform the image histograms. I think a similar effect occurs no matter how the histograms are done; at some level the calculated value of the mean depends on how well the inner N sigma is sampled. In any case, a sigma-dependent bias scales as the square root of the number of input images, so the plotted value of the sky level divided by the number of inputs produces a 1/sqrt(N) shape, which matches reasonably well with the values shown here.
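The 1/sqrt(N) argument in code form (the bias coefficient and single-warp sigma here are illustrative assumptions, not fitted values):

```python
import math

# If the measured sky bias scales with image sigma, bias = c * sigma, and the
# stack sigma grows as sqrt(N), then sky/N falls off as 1/sqrt(N).
c = 0.3          # assumed bias per unit sigma (illustrative)
sigma_1 = 25.0   # assumed single-warp sigma (illustrative)

def sky_per_input(n):
    sigma_n = sigma_1 * math.sqrt(n)   # stack sigma after n inputs
    return c * sigma_n / n             # = c * sigma_1 / sqrt(n)

# Multiplying by sqrt(n) should give a constant, confirming the 1/sqrt(N) shape.
flattened = [sky_per_input(n) * math.sqrt(n) for n in (4, 8, 12, 19)]
```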


Peter Draper has reported issues with the stack and warp backgrounds in PV3 reprocessed data. The relevant links are The DRAVG page and Peter's analysis of a stack.

From the DRAVG page, Peter reports that his manual psphot SKY distribution is centered around 1 count, while the SAS static sky distribution is centered around ~0.3. I cannot reproduce the SAS distribution either. The current trunk, the current PV3 processing tag (ipp-pv3-20140717), the ipp-pv3-cr-20140614 processing tag that was used for the SAS processing, and that same tag reverted back to r36878 to match the log file entry for the original static sky run all agree with Peter's distribution:

I don't know what went wrong with the SAS reduction, as I cannot reproduce the previously published values. I believe the ~1 count offset from zero in the stack is consistent with the small residual background bias left on the input warps. This test also confirms that with a fixed seed value, psphot and psphotStack produce identical background models; this is expected, as they both make an identical call to the psImageBackground function.

Moving on to Peter's analysis of warp/stack SKY levels, I first confirmed that the ppStack code does not subtract any values from the input warps. The images are scaled by a multiplicative factor, individual pixels are rejected, and the remainder are summed. Because of this, the pixel distributions of the stack and the input warps must agree to within a scaling factor. Looking at histograms of all warps and the stack (which I constructed from the warps listed on Peter's analysis, and adopting the factor of 14 between an individual warp and the stack listed there):

Fitting Gaussians to the populated bins from that histogram yields the "true" mean and sigma values, with warp "4" being the outlier Peter identified.

Image Mean Sigma
0 -0.0720957 25.3298
1 -0.0575938 24.49
2 -0.0890201 18.8579
3 -0.0399057 25.6635
4 -0.633794 64.1527
5 -0.0685074 34.0803
6 0.112552 31.1058
7 0.0159295 47.0761
8 0.104313 30.5573
9 0.119509 32.5571
stack -0.0428245 228.501

These values can be used to check that the psphot claimed background level is a function of the sky sigma, as the bias report from May below discusses. Plotted are three measurements for each warp: (1) is my ppStack run, (3) is Peter's ppStack run, and (2), which I originally thought was also a psphot background level, is actually Peter's manual histogram/Gaussian fit data, based on a closer reading of his page. Due to the quantized nature of the warp pixel data, differences in histogram bin size can influence the fit results.

Subtracting off my "true mean" values corrects the outlier warp to fall closer to the trend of the other points at smaller sigma. My understanding of the remaining trend is that even when I believe the warp pixel mean to be approximately zero, psphot still produces a measured background that is slightly inflated, by about 0.3 counts.

On the stacks, we seem to be getting a better estimate of the true sky level, but there is still some variation depending on which set of pixels is examined. Below I show the CMF sky values for multiple photometry runs with psphot and psphotStack that I ran with my stack. The mean of this distribution (which samples the background model realization) shifts by about 1 count between these runs, due to this pixel selection. Also plotted is a fixed seed comparison between psphot and psphotStack, again confirming that they produce the same models.

Finally, plotting all the warp and stack background models as a function of pixel index (index 0 is image(0,0), and index 256 is image(16,16)), it is clear that the warp background models all cluster around values of ~0.3. The stack realizations (different psphot/psphotStack runs) match the expected value around zero.


The background issue should now be solved for good. There were two issues that contributed to this problem. The first was that the value being written to disk was truncated via an effective int() call, which isn't appropriate around zero; to make this fully consistent, psLib now takes the floor() of the value. The second issue related to the fuzz value: I was confused by the math in the 2014-05-16 plot, which implied that a fuzz value of 1.0 was needed. Gene pointed out on Monday that the floor function effectively removes a random value between 0 and 1 from every number fed into it. To correct for this, the fuzz value needs to add back this random value, corresponding to a fuzz value of zero. Due to the int vs floor vs ceil distinction, this is mathematically equivalent to what I was finding previously (int for negative numbers is ceil, and floor and ceil are offset by 1 unit).

I launched a small (27 exposure) set from the SAS to compare directly with the previously processed SAS, and ran the same test as on 2014-05-16 to check that the residual background on the stacks created from these newly processed exposures does not increase with the number of inputs:

This plot shows that the background level is flat for all input numbers, confirming the fix.

For dealing with older data, the background offset can be determined by checking the BSCALE value on the image: the background is offset by +0.5 * BSCALE, due to this half-bin issue with the fuzz value.


Histogram of (quantized - real), for three fuzz offsets, defining the fuzz as (rand() - offset).
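The same test can be sketched numerically. With the fuzz defined as (rand() - offset), the mean quantization error is -offset, so only offset = 0 leaves the quantized values unbiased (simulated values, not the psLib implementation):

```python
import math, random

random.seed(1234)
vals = [random.gauss(0.0, 3.0) for _ in range(100_000)]

# Quantization error for fuzz = (rand() - offset): the error distribution is
# shifted by -offset, so offset = 0 gives a mean error of zero, offset = 0.5
# gives -0.5, and offset = 1.0 gives -1.0.
def mean_error(offset):
    return sum(math.floor(v + random.random() - offset) - v
               for v in vals) / len(vals)

errors = {off: mean_error(off) for off in (0.0, 0.5, 1.0)}
```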


Histograms from the test from last week. The jagged jumps are due to using fixed histogram bins.


After a suggestion from Gene and a series of tests, I've come to the conclusion that the small residual chip background is a function of the image quantization/compression code. The following table shows the background mean/sigma values for two warps for a series of tests.

The first two rows are the entries from the gpc1 database for the skycell and primary chip image (this skycell is composed primarily of one OTA) in the test.

All subsequent tests were performed with an identical random seed (1234). The seed.passX rows are the background values generated in ppImage with a three-pass sequence (fit background, subtract, repeat). They seem to confirm that the statistic is working just fine.

Next is the in-and-out test, where after writing the image, psphot is used to remeasure the background. This provides the ~0.3 count offset seen in previous tests. To confirm that this is a compression artifact, I repeated with an image saved by ppImage uncompressed. There are three sigma based scaling methods defined in psLib, and I checked that they exhibit biases around zero that make some sort of sense (the positive scale is what is used in the normal compressed image, so that is shown in the in-and-out test).

Finally, I noticed that the statistic used to generate the mean/sigma values for the scaling in psLib was the "robust" median/stddev, not the "fitted" statistic that is used for the background tests. Changing this and repeating the tests did not improve the result significantly.
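For reference, a sketch of a "robust" median/stddev statistic of the kind described (an assumption: psLib's exact estimator may differ; 1.4826 is the usual factor converting a MAD to a Gaussian-equivalent sigma):

```python
import random, statistics

random.seed(1234)
data = [random.gauss(0.0, 8.0) for _ in range(100_000)]

# "Robust" location/width: median for the level, scaled MAD for the sigma.
med = statistics.median(data)
mad_sigma = 1.4826 * statistics.median([abs(d - med) for d in data])
```

Unlike the "fitted" statistic, this makes no use of the shape of the pixel histogram beyond its quantiles, which is why swapping one for the other changes the scaling only slightly for near-Gaussian sky.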

The way the scaling flips around zero (and that the "both" is still offset in one direction) makes me suspect that there's an off-by-one error in the scaling code. I haven't yet tracked it down, but I wanted to get this status note out.

#test           warp.964581 (mean, sigma)   warp.964593 (mean, sigma)
chip.bg 25.1021 17.3215 28.8274 8.38539
warp.bg 0.366421 6.898341 0.357391 6.962276
seed.pass1.bg 25.729 8.0774 28.7602 8.075996
seed.pass2.bg 0.0147 8.0745 0.000294 8.07090
seed.pass3.bg 0.000620 8.0765 0.004085 8.066377
in-and-out 0.281408 7.734210 0.296758 25.729009
io.uncomp 0.013544 8.066099 -0.002336 8.080612
io.both 0.182691 7.497421 0.193386 7.664834
io.neg -0.307696 7.727381 -0.230592 7.777576
io.fitted 0.298210 7.775235 0.253315 7.689663
io.fitted-both 0.159112 7.453220 0.142325 7.238388
io.fitted.int 0.027772 8.151440 -0.108624 7.896051


After looking through the code on this, I think I've finally sorted out the source of this background. In order to ensure that all stacks have a zeropoint of 25, the input fluxes are normalized, such that:

f_stack = sum_i (f_i * w_i * n_i) / sum_i (w_i)
n_i = 10^(-0.4 * ([z_filter + eps_seeing + 2.5 * log10(t_i)]
                - [z_stack + 2.5 * log10(t_stack)]))
sum_i (n_i) ~ N * 10^(-0.4 * [z_filter - z_stack])

This means that any residual background in the stack is multiplied by N and a zeropoint factor that ranges from 1.3 (r) to 5.1 (y). For a typical g-band 8-input stack, the residual background is therefore multiplied by ~13.5.
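A rough consistency check of these numbers (the g-band zeropoint factor below is back-derived from the quoted ~13.5 for eight inputs, not taken from the recipes):

```python
import math

def stack_multiplier(n_inputs, zp_factor):
    """Residual-background multiplier: N inputs times the zeropoint factor."""
    return n_inputs * zp_factor

# Assumed g-band factor; it sits inside the quoted 1.3 (r) to 5.1 (y) range.
zp_factor_g = 1.69
mult = stack_multiplier(8, zp_factor_g)   # ~ 13.5 for an 8-input g stack
dz_g = -2.5 * math.log10(zp_factor_g)     # implied zeropoint offset, ~ -0.57 mag
```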

A typical g warp seems to have a residual background of ~0.3-0.4 counts with a sigma of ~7-8. The current psStats implementation simply cannot find that small a signal. As we start stacking, the normalization scaling pushes up that signal much faster than the image variance:

Ninput gauss_mean gauss_sigma
4 2.22026 35.853
8 5.57725 51.721
12 8.83167 50.0609
19 14.4736 82.931

We do not generate a background model during stacking, but one is constructed prior to doing the photometry. As the s/n of the background is much higher in the stacks, this step can identify the residual and subtract it off, yielding measured SKY values that are non-zero.
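Sketching the signal-to-noise argument (the per-warp residual, zeropoint factor, and warp sigma are illustrative assumptions): the residual grows as N while the sigma grows only as sqrt(N), so the background s/n in the stack improves as sqrt(N):

```python
import math

# Assumed per-warp residual, zeropoint factor, and single-warp sigma.
b, f, sigma1 = 0.35, 1.7, 27.0

def background_snr(n):
    # Residual scales as b * f * n; stack sigma scales as sigma1 * sqrt(n).
    return (b * f * n) / (sigma1 * math.sqrt(n))

# Quadrupling the inputs doubles the background detectability.
ratio = background_snr(16) / background_snr(4)   # = 2
```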

Stacks assign weights to the input exposures, and this weight includes a term related to the difference between the input zeropoint and the target zeropoint of 25.0.



I forgot to plot and post the results obtained from the final changes to psStats. The following plot shows the difference in the background levels for the entire SAS set. The "new" statistic is calculated using the updated psStats code, and the "old" statistic is the one originally obtained with the current working tag (ipp-20130712). The new statistics have mean/median/min/max values that are shifted by about 0.25 count relative to the old statistic, indicating that the new statistic believes the sky to be slightly brighter than was obtained previously. The distribution of sigma values is centered around zero, as there should not be a systematic shift between statistic calculations. This shift should help reduce the trends with stack depth shown previously.


Validation tests using ppSim images


Validation tests


Repeated background subtraction

For a set of 465 i-filter warp images, I ran psphot to obtain a background level estimate. This was done in three passes, to provide the standard background level (what we normally use), a second pass (showing the residual background remaining), and a third pass (checking that the residual had been removed). The following table shows some statistics for these runs:

Pass Average background Average background_sigma Median background Median background_sigma
1 2.14126 30.48071 1.931080 28.195907
2 -0.0381096 30.51861 -0.030491 28.232888
3 0.00361354 30.52406 0.003190 28.226071

This suggests that we are leaving a residual background on the images, and only after three iterations is this background effectively fully removed.
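The three-pass test can be sketched as a loop (the median here stands in for the psphot background fit, which is an assumption; the pass-1 level matches the table above):

```python
import random, statistics

random.seed(1234)

# Simulated warp pixels carrying the pass-1 residual from the table above.
pixels = [random.gauss(2.14, 30.5) for _ in range(50_000)]

def estimate_background(pix):
    """Stand-in for the psphot background fit (assumed: a simple median)."""
    return statistics.median(pix)

levels = []
for _ in range(3):                       # three passes, as in the table
    bg = estimate_background(pixels)
    levels.append(bg)
    pixels = [p - bg for p in pixels]    # subtract and remeasure
```

With an unbiased estimator like the median the residual vanishes after one subtraction; the fact that the real pipeline needs a third pass reflects the sigma-dependent bias in the psphot statistic.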


As the statistics and fitsio both seem to be working largely as expected, I began looking through the ppStack code to attempt to identify an issue there. It appears that the important sequence in the code is as follows:

  1. Input images are matched against the target PSF, and convolved to match that target.
  2. After convolution, a single ROBUST_MEDIAN background level is subtracted from the convolved images.
  3. The convolved images are written to a temporary disk file.
  4. Initial and final convolved stacks are constructed, using the disk files to ease the threading of these steps.
  5. For the final unconvolved stacks, the list of input original files is passed to the convolution step. These input files have an unknown (to ppStack), possibly non-zero background level.

I believe this explains the deviations in the background behavior between the convolved and unconvolved stacks. As the warps retain a small non-zero background level, failing to remove this results in a stack background level that scales as the number of exposures (equivalently, exposure time, or any input-count proxy). The solution is to retain a set of background level values for the unconvolved inputs, and optionally apply them during the stacking stage. This will require a bit of development work to implement.
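A sketch of the proposed fix (the function and names are illustrative, not the ppStack API):

```python
# stack(): weighted mean of input images, optionally removing a retained
# per-input background level first (the proposed correction).
def stack(images, weights, backgrounds=None):
    if backgrounds is None:
        backgrounds = [0.0] * len(images)
    wsum = sum(weights)
    npix = len(images[0])
    return [sum(w * (img[j] - bg)
                for img, w, bg in zip(images, weights, backgrounds)) / wsum
            for j in range(npix)]

# Two flat "warps" carrying residual backgrounds of 0.3 and 0.4 counts:
imgs = [[0.3] * 4, [0.4] * 4]
biased = stack(imgs, [1.0, 1.0])               # residual survives (~0.35)
fixed  = stack(imgs, [1.0, 1.0], [0.3, 0.4])   # residual removed
```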

For an illustration of the input dependence, I've plotted up the measured background level for a set of SAS_22 stacks, separated by filter.



From the behavior of the convolved stack background levels, it appears that there is a small statistical issue remaining, on the order of 0.25 DN/input (0.5 DN/input in y?).


=== Summary of previous discussion of this issue ===

2013-05-13 Confluence page

2013-05-20 Non-zero SKY on stacks

2013-06-03 Update to non-zero sky issue

2013-06-17 More on SKY residuals on stacks

Initial checks

The initial thought for this issue is that the stack sky levels are incorrect either due to an error in the sky statistics, or due to an error in the fitsio data storage. To investigate this, I used the set of warps for a very deep stack (~450 inputs) and pulled statistics out of logs and compared them with direct measurements.

  1. Internal background code check. Manually run psphot to fit the background level, and confirm that this matches the value recorded in the stack log.

  2. External background code check. Compare the psphot background levels to a Gaussian fit to the histogram of image values.

  3. Database check. The GPC1 database contains a background value recorded for the warps. I do not have a good explanation as to why these match so badly.

  4. BSCALE check. If the fitsio routines are to blame, it's likely an issue with the quantization. There's a correlation here, but it is more likely due to the fact that the background level is correlated with the distribution width (bg_sigma), as is BSCALE.