1 Feasibility of the environment quantification in the CMASS sample

Using the 10 × 10 deg² mock built on the semi-analytic model (SAM) of Guo et al.,
produced by G. Lemson with the flux limit i < 23, I have carried out tests to assess
the feasibility of quantifying the small-scale (i.e. few Mpc) environments of the
CMASS galaxies. These are first-order, highly idealised tests (i.e. 100% sampling,
Gaussian P(z) for galaxies in Stripe 82). They are meant to serve as guidelines for
what can be achieved in the best case and for the direction in which to go to improve
the results. The major step in preparing the tests was to optimise the codes that
handle the discretised photo-z objects so that they run in an acceptable time frame.
The slowest process is the reading (and writing) of the files, and the speed is mainly
limited by the amount of data that can be held in memory at a given moment.
The local environments in these tests are represented by the number of galaxies
within cylinders of projected radius Rp and velocity half-depth ±Vel (Vel = 1500 km/s
in most cases), centred on a CMASS galaxy. The “true” environments are obtained by
simply counting all galaxies down to the chosen magnitude limit (i < 21 or i < 22),
using their spectroscopic redshifts. The “reconstructed” environments are the sum
over the galaxies within Rp of a given CMASS galaxy, each weighted according to its
P(z), i.e. by the integral of P(z) over the redshift interval defined by ±Vel.
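The two counts can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the tests. It assumes a Gaussian P(z) of width sigma0 × (1 + z), that the neighbours passed in have already been selected within Rp, and that the velocity window converts to redshift as dz = Vel/c × (1 + z). All function and parameter names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def half_depth(z_centre, vel=1500.0):
    """Redshift half-depth of the cylinder (assumed conversion: vel/c * (1+z))."""
    return vel / C_KM_S * (1.0 + z_centre)


def gaussian_weight(z_phot, z_centre, vel=1500.0, sigma0=0.06):
    """Integral of a Gaussian P(z), centred on z_phot, over z_centre +/- dz."""
    sigma = sigma0 * (1.0 + z_phot)
    dz = half_depth(z_centre, vel)
    norm = sigma * math.sqrt(2.0)
    upper = (z_centre + dz - z_phot) / norm
    lower = (z_centre - dz - z_phot) / norm
    return 0.5 * (math.erf(upper) - math.erf(lower))


def true_count(z_centre, z_spec_neighbours, vel=1500.0):
    """'True' NN: neighbours whose spectroscopic redshift lies inside the cylinder."""
    dz = half_depth(z_centre, vel)
    return sum(1 for z in z_spec_neighbours if abs(z - z_centre) <= dz)


def reconstructed_count(z_centre, z_phot_neighbours, vel=1500.0, sigma0=0.06):
    """'Reconstructed' NN: sum of the P(z) weights of the neighbours."""
    return sum(gaussian_weight(z, z_centre, vel, sigma0) for z in z_phot_neighbours)
```

With these assumptions, a photo-z object sitting exactly at the redshift of the CMASS galaxy contributes only a few per cent of an object to the reconstructed count, because the ±1500 km/s window is much narrower than the Gaussian. This is one reason why the absolute reconstructed and true NN values differ.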
For the CMASS galaxies, P(z) is just a (Dirac) delta function at the spectroscopic
redshift of the galaxy. For the rest of the galaxies (down to the assumed magnitude
limit), P(z) is approximated in the simplest case by a Gaussian with σ = 0.06(1 + z).
The value of σ is based on the quality of the photometric redshifts in Stripe 82
achieved by K. Bundy (and presented during the BOSS Galaxy Group meeting at the IPMU
in October 2010). In addition, this initial P(z) has been modified by “enhancing” it
in the proximity of galaxies from the CMASS sample (at that redshift), by computing
the modified PZ(z) = P(z) × NN(r < RZ) / normalisation at each redshift z (the
initial P(z) and the modified PZ(z) are normalised to the same value). The count
NN(r < RZ) is the number of CMASS galaxies within a sphere of radius RZ centred at
the redshift z under consideration. This modification has been carried out in
redshift bins 0.002 wide.
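A minimal sketch of this modification on a binned P(z) is given below. It assumes that the CMASS neighbour counts NN(r < RZ) have already been computed for each redshift bin. The function name and the fallback used when no CMASS galaxy lies near any bin are assumptions of the sketch, not part of the tests described here.

```python
def modify_pz(pz, nn_cmass):
    """Return PZ(z) = P(z) * NN(r < RZ), rescaled to the same sum as P(z).

    pz       : P(z) values in redshift bins (e.g. 0.002 wide)
    nn_cmass : number of CMASS galaxies within RZ of each bin centre
    """
    if len(pz) != len(nn_cmass):
        raise ValueError("pz and nn_cmass must have the same length")
    raw = [p * n for p, n in zip(pz, nn_cmass)]
    total_raw = sum(raw)
    if total_raw == 0.0:
        # Assumption: with no CMASS galaxy near any bin, keep the initial P(z).
        return list(pz)
    scale = sum(pz) / total_raw  # the "normalisation" factor
    return [r * scale for r in raw]
```

Note that any bin without a CMASS galaxy within RZ ends up with zero probability, so the modified PZ(z) is confined to the redshifts traced by the CMASS sample.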
I have also carried out an additional modification of P(z), weighting the initial
P(z) by 1 + Σξ(rp, π), where the sum runs over all CMASS galaxies at the separations
rp and π (from the redshift under consideration in the initial P(z)). The function ξ
is the two-dimensional cross-correlation function between the CMASS and i < 22
galaxies. As the i < 22 galaxies do not trace only the regions defined by the CMASS
galaxies, this was supposed to “leave” part of PZ(z) also in the non-CMASS
structures. However, there is no improvement over the basic results obtained by
weighting all CMASS galaxies within a sphere of radius RZ by unity (the tests with
ξ-weighting have been done using only the i < 22 sample of photometric galaxies).
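The ξ-based weight for one redshift bin can be sketched as follows. The power-law `toy_xi` is purely illustrative and is not the measured cross-correlation function. Its parameters, and the function names, are assumptions made only so that the example runs.

```python
import math


def toy_xi(rp, pi, r0=5.0, gamma=1.8):
    """Hypothetical isotropic power-law xi; a stand-in for the measured xi(rp, pi)."""
    r = math.hypot(rp, pi)
    return (r / r0) ** (-gamma) if r > 0.0 else 0.0


def xi_weight(separations, xi=toy_xi):
    """Weight 1 + sum of xi(rp, pi) over the CMASS galaxies around one bin.

    separations : list of (rp, pi) pairs, one per CMASS galaxy
    """
    return 1.0 + sum(xi(rp, pi) for rp, pi in separations)
```

For a non-negative ξ the weight is at least 1 even when no CMASS galaxy is nearby, which is how this scheme keeps part of the probability outside the CMASS structures.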
The comparison of the “reconstructed” and “true” NN values is presented in
separate files: “mock densitynnrp.pdf” and “mock densitynnrp zade.pdf”. For each
case, there are 4 sets of 4-panel figures. The top 4-panel figure on each page
directly compares the reconstructed (y-axis) and true (x-axis) NN within the
projected aperture Rp indicated on top of each plot (1, 2, 3, and 4 Mpc/h),
calculated at the centres of the CMASS galaxies. The median and the lower and upper
quartiles of the reconstructed NN values at a given true NN value are shown with the
red and blue lines, respectively. The bottom 4-panel figure shows the error (y-axis)
in the reconstructed NN values as a function of the reconstructed NN (x-axis). The
error is calculated as the median of the difference between the reconstructed and
true NN values (in red) at a given reconstructed NN value, and the corresponding
lower and upper quartiles are marked in blue. The radius of the cylinder Rp is again
indicated on top of the individual panels. The first two sets of 4-panel plots are
in the redshift bin 0.45 < z < 0.75, and the last two sets are limited to the
redshift interval 0.5 < z < 0.6, as indicated above each set. All figures in
“mock densitynnrp.pdf” are obtained by integrating the photo-z P(z) over the
interval defined by ±Vel, where Vel = 1500 km/s for the spectroscopic galaxies and
can take larger values, up to Vel = 6000 km/s, for the photo-z objects, as indicated
in the x-axis label of the top 4 panels or in the y-axis label of the bottom panels.
For the figures in “mock densitynnrp zade.pdf”, photo-z galaxies are counted
according to their modified PZ(z) function, and all objects are integrated using
Vel = 1500 km/s.
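The error curves of the bottom panels can be reproduced along the following lines. This is a sketch under stated assumptions: the bin width in reconstructed NN, the use of the "inclusive" quantile method, and the choice to skip bins with fewer than two objects are mine and are not taken from the plotting code.

```python
from collections import defaultdict
from statistics import quantiles


def error_quartiles(nn_rec, nn_true, bin_width=1.0):
    """Quartiles of (reconstructed - true) NN in bins of reconstructed NN.

    Returns {lower bin edge: (lower quartile, median, upper quartile)}.
    """
    groups = defaultdict(list)
    for rec, true in zip(nn_rec, nn_true):
        groups[int(rec // bin_width)].append(rec - true)

    result = {}
    for index, errors in sorted(groups.items()):
        if len(errors) < 2:
            continue  # quartiles are undefined for a single object
        q1, med, q3 = quantiles(errors, n=4, method="inclusive")
        result[index * bin_width] = (q1, med, q3)
    return result
```

The median would correspond to the red line and the two quartiles to the blue lines in each panel.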
In addition, the histograms shown in Figures 1, 2, and 3 quantify the
goodness-of-reconstruction of the local environments of the CMASS mock galaxies for
the reconstructions using only the Gaussian P(z), PZ(z) with R = 10 Mpc/h, and PZ(z)
with R = 15 Mpc/h, respectively. The reconstructed numbers of objects (NN) are
divided into four quartiles, and histograms of the true NN are then plotted in
black, red, green, and blue for the lowest 25%, 25-50%, 50-75%, and highest 25% of
the distribution of the reconstructed NN, respectively. It makes more sense to carry
out the comparison in percentile bins of the reconstructed NN, as the absolute
values are rather different when using P(z) or PZ(z) for the photo-z objects (as
seen in the figures in the separate pdf files).
Explicitly, the x-axis shows the number NN of objects within the aperture
Rp = 2 Mpc/h and velocity ±1500 km/s, assuming all galaxies have known spectroscopic
redshifts (i.e. the “true” NN values). The y-axis gives the number of objects in
each bin of true NN.
All histogram results are in 0.45 < z < 0.6. The chosen magnitude limit is
indicated in the figure (top and bottom); s=0.06 indicates that P(z) is modelled
with σ = 0.06(1 + z); the additional label RZ < XX indicates that the initial P(z)
is modified to PZ(z), with the CMASS galaxies counted within a radius of XX Mpc/h
(comoving).
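The quartile histograms described above can be built as in the following sketch. It assumes integer true NN values and uses the "inclusive" quantile method. The function name is hypothetical.

```python
from collections import Counter
from statistics import quantiles


def true_nn_by_rec_quartile(nn_rec, nn_true):
    """Histogram the true NN separately for each quartile of reconstructed NN.

    Returns four Counters (lowest 25%, 25-50%, 50-75%, highest 25%),
    each mapping a true NN value to the number of CMASS galaxies.
    """
    q1, q2, q3 = quantiles(nn_rec, n=4, method="inclusive")
    histograms = [Counter() for _ in range(4)]
    for rec, true in zip(nn_rec, nn_true):
        # Number of quartile boundaries lying below this reconstructed value.
        index = (rec > q1) + (rec > q2) + (rec > q3)
        histograms[index][true] += 1
    return histograms
```

For a good reconstruction the four histograms are well separated along the true NN axis. Strong overlap means that the reconstructed ranking carries little information about the true environment.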
Figure 1: Goodness-of-reconstruction of local environments of the CMASS mock
galaxies assuming only Gaussian P(z). Reconstructed numbers of objects (NN) are
divided into four quartiles, and then histograms of the true NN are plotted in
black, red, green, and blue for the lowest 25%, 25-50%, 50-75%, and highest 25% of
the distribution of the reconstructed NN, respectively.
Figure 2: Goodness-of-reconstruction of local environments of the CMASS mock
galaxies assuming PZ(z) with R = 10 Mpc/h. Reconstructed numbers of objects (NN)
are divided into four quartiles, and then histograms of the true NN are plotted in
black, red, green, and blue for the lowest 25%, 25-50%, 50-75%, and highest 25% of
the distribution of the reconstructed NN, respectively.
Figure 3: Goodness-of-reconstruction of local environments of the CMASS mock
galaxies assuming PZ(z) with R = 15 Mpc/h. Reconstructed numbers of objects (NN)
are divided into four quartiles, and then histograms of the true NN are plotted in
black, red, green, and blue for the lowest 25%, 25-50%, 50-75%, and highest 25% of
the distribution of the reconstructed NN, respectively.
2
Quality of P(z)
Outside the SDSS areas with deeper photometry, the quality of the P(z) is lower and
its shape varies (it is not necessarily well described by a Gaussian function). To
quantify this, I use as a measure of the redshift uncertainty half of the difference
between the redshifts at which the cumulative P(z) reaches 75% (z75) and 25% (z25),
normalised by (1 + z50), where z50 is the redshift at which the cumulative P(z) is
0.5. I will call it the sigmaPZ parameter.
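For a P(z) tabulated on a redshift grid, sigmaPZ can be computed as in the sketch below. The trapezoidal integration and the linear interpolation of the cumulative distribution are assumptions of this example, as are the function names.

```python
def cdf_redshift(z_grid, pz, level):
    """Redshift at which the cumulative P(z) reaches `level` (between 0 and 1)."""
    cumulative = [0.0]
    for i in range(1, len(z_grid)):
        step = 0.5 * (pz[i] + pz[i - 1]) * (z_grid[i] - z_grid[i - 1])
        cumulative.append(cumulative[-1] + step)
    target = level * cumulative[-1]  # P(z) need not be normalised on input
    for i in range(1, len(cumulative)):
        if cumulative[i] >= target:
            span = cumulative[i] - cumulative[i - 1]
            frac = 0.0 if span == 0.0 else (target - cumulative[i - 1]) / span
            return z_grid[i - 1] + frac * (z_grid[i] - z_grid[i - 1])
    return z_grid[-1]


def sigma_pz(z_grid, pz):
    """sigmaPZ = (z75 - z25) / 2 / (1 + z50)."""
    z25 = cdf_redshift(z_grid, pz, 0.25)
    z50 = cdf_redshift(z_grid, pz, 0.50)
    z75 = cdf_redshift(z_grid, pz, 0.75)
    return 0.5 * (z75 - z25) / (1.0 + z50)
```

For a Gaussian P(z) this measure equals about 0.6745 σ / (1 + z50), so it is somewhat smaller than the Gaussian σ itself.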
First of all, there is a strong dependence on the observed i-band magnitude. This
can be seen in Figure 4, where, for a set of about 1400000 galaxies from the
catalogue produced by L. Cheng, the median and the lower and upper quartiles of the
distribution of the sigmaPZ parameter are plotted as a function of z50 for objects
separated into two magnitude bins. For objects with i < 21, the median values are
marked with the red curve, and the lower and upper quartiles are plotted as the
light blue dashed lines. For objects with 21 < i < 22, sigmaPZ is systematically
higher; the median values are marked with the magenta curve, and the lower and upper
quartiles are plotted as the dark blue dashed lines. The results are still
contaminated by stars in the sample (and possibly by wrongly measured magnitudes of
galaxies). The same type of plot, but for red galaxies defined by g − r > 1.4 and
only for i < 21, is shown in Figure 5.
The distribution of the mock galaxies as a function of their g − r colour in
0.4 < z < 0.8 is shown in Figure 6. Although all galaxies down to i < 22 are shown,
the number of galaxies which have colours above g − r = 1.4 (i.e. roughly the red
galaxies) and which are brighter than i = 21 is negligible. Taking into account only
the red galaxies defined in this way, the goodness-of-reconstruction histogram is
given in Figure 7, where the curves have the same meaning as in Figures 1-3, except
that the local environments are defined by the set of g − r > 1.4 and i < 21
galaxies, assuming σ = 0.06(1 + z) for these red galaxies (see Figure 5).
The complication arises from the fact that the results depend simultaneously on
the width of the photo-z P(z) and on how the tracers with certain properties (i.e.
blue or red galaxies) are correlated with the CMASS galaxies, i.e. how well they
trace the same local environments defined by the CMASS galaxies.
Taking only i < 21 objects, a more detailed quantification has been carried out
for a sample of about 5000000 objects from the same catalogue. The surface number
density of objects in the (r-i)-(g-r) diagram is given in Figure 8. The panels
encompass the full range of colours in the photo-z file, and therefore they are not
sensitive to the finer distribution of galaxies in the central, densely populated
region. The numbers of galaxies are in log10 units. The binning is 0.5 mag in both
colours, and the bins do not overlap.
I define an additional parameter “sigma” to characterise the “typical” redshift
PDF of galaxies in a given colour-colour bin. I approximate it by the median of the
distribution of the uncertainties of the individual galaxies in the corresponding
colour-colour bin, where the uncertainty in the redshift PDF of an individual galaxy
is half of the difference between the redshifts where the redshift CDF reaches 75%
and 25% (i.e. the sigmaPZ parameter). The ((r-i),(g-r)) bins are 0.5 × 0.5 mag wide.
The results are presented in Figures 9 and 10. Figure 9 shows the distribution of
“sigma” values in the colour-colour plane, and Figure 10 shows the same
distributions but normalised by (1 + z50). In both figures, one panel is for all
i < 21 objects and the other is for i < 21 objects in 0.4 < z50 < 0.7.
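The median “sigma” per colour-colour bin can be sketched as follows. The bin edges are assumed to fall on multiples of the bin width starting from zero, and the function name and the `min_count` threshold are hypothetical. The per-galaxy uncertainties are taken as already computed.

```python
import math
from collections import defaultdict
from statistics import median


def median_sigma_per_bin(g_r, r_i, sigma, width=0.5, min_count=1):
    """Median per-galaxy uncertainty in non-overlapping (r-i, g-r) bins.

    Returns {(lower r-i edge, lower g-r edge): median uncertainty}
    for bins holding at least `min_count` galaxies.
    """
    bins = defaultdict(list)
    for gr, ri, s in zip(g_r, r_i, sigma):
        key = (math.floor(ri / width), math.floor(gr / width))
        bins[key].append(s)
    return {
        (key[0] * width, key[1] * width): median(values)
        for key, values in bins.items()
        if len(values) >= min_count
    }
```

Passing the normalised or the unnormalised per-galaxy uncertainty as `sigma` gives the two variants of the map.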
Additional notes: The full distributions of the uncertainties (i.e. halves of the
difference between z75 and z25) of the individual galaxies in each colour-colour bin
are in the directory “∼ /public html/boss/photz”. They are produced only for bins
with more than 100 galaxies (they are not meant to be printed out, as each of them
has about 70 histograms).
The names of the files are as follows:
hist photz75-25stat i21all.eps - for all i < 21 objects
hist photz75-25n(1+z50)stat i21all.eps - for all i < 21 objects, with z75-z25 normalised by (1 + z50)
hist photz75-25stat i21z50in0407.eps - for all i < 21 objects in 0.4 < z50 < 0.7
hist photz75-25n(1+z50)stat i21z50in0407.eps - for all i < 21 objects in 0.4 < z50 < 0.7, with sigma normalised by (1 + z50)
There are also files with the exact numbers of objects, with the following
columns:
1 Number of objects in the bin
2 lower g-r limit
3 higher g-r limit
4 lower r-i limit
5 higher r-i limit
6 z50
7 z75 − z25
The files are named:
photz75-25stat i21all.dat - for all i < 21 objects
photz75-25n(1+z50)stat i21all.dat - for all i < 21 objects, with z75-z25 normalised by (1 + z50)
photz75-25stat i21z50in0407.dat - for all i < 21 objects in 0.4 < z50 < 0.7
photz75-25n(1+z50)stat i21z50in0407.dat - for all i < 21 objects in 0.4 < z50 < 0.7, with sigma normalised by (1 + z50)
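A file with the seven columns listed above could be read as in the following sketch. It assumes whitespace-separated columns and treats lines starting with "#" as comments; neither is confirmed by the description of the files. The class and field names are hypothetical, and the sample row contains made-up values for illustration only.

```python
from typing import List, NamedTuple


class BinRow(NamedTuple):
    n_objects: int        # column 1: number of objects in the bin
    gr_low: float         # column 2: lower g-r limit
    gr_high: float        # column 3: higher g-r limit
    ri_low: float         # column 4: lower r-i limit
    ri_high: float        # column 5: higher r-i limit
    z50: float            # column 6
    z75_minus_z25: float  # column 7


def parse_rows(text: str) -> List[BinRow]:
    """Parse the seven-column table; skip comments and malformed lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if line.lstrip().startswith("#") or len(fields) != 7:
            continue
        rows.append(BinRow(int(float(fields[0])), *map(float, fields[1:])))
    return rows


# Made-up row, for illustration only (not taken from the real files).
sample = "# n gr_lo gr_hi ri_lo ri_hi z50 z75-z25\n150 1.0 1.5 0.5 1.0 0.52 0.11\n"
rows = parse_rows(sample)
```

Selecting, for example, the bins with more than 100 galaxies is then a one-line filter on `n_objects`.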
Figure 4: Median and quartile values of the sigmaPZ parameters for a set of 1400000
i < 22 SDSS objects.
Figure 5: Median and quartile values of the sigmaPZ parameters for a set of red
(g − r > 1.4) and i < 21 galaxies from the initial sample of 1400000 i < 22 SDSS
objects.
Figure 6: Distribution of the mock galaxies as a function of their g − r colour in
0.4 < z < 0.8. Although all galaxies down to i < 22 are shown, the number of
galaxies which have colours above g − r = 1.4 and which are brighter than i = 21
is negligible.
Figure 7: Goodness-of-reconstruction of local environments of the CMASS mock
galaxies assuming Gaussian P(z) with σ = 0.06(1 + z) (top) and PZ(z) with R = 10
Mpc/h (bottom). Only red (g − r > 1.4) and i < 21 galaxies are used for the counts.
Reconstructed numbers of objects (NN) are divided into four quartiles, and then
histograms of the true NN are plotted in black, red, green, and blue for the lowest
25%, 25-50%, 50-75%, and highest 25% of the distribution of the reconstructed NN,
respectively.
[Figure 8 image. Panel titles: “N of i21all” and “N of i21 0.4<z<0.7”; axes: r−i and g−r.]
Figure 8: Surface number density of objects in a given bin of the colour-colour
plot. The top figure is for i < 21 objects at all redshifts and the bottom figure is
for i < 21 objects with 0.4 < z50 < 0.7.
[Figure 9 image. Panel titles: “(z75−z25/2) i21all” and “(z75−z25/2) i21 0.4<z<0.7”; axes: r−i and g−r.]
Figure 9: “Typical” redshift PDF of galaxies in the given colour-colour bin. I
approximate it by a median of the distribution of the uncertainties of individual
galaxies in the corresponding colour-colour bin, where the uncertainty in the redshift
PDF of the individual galaxy is expressed by the sigmaPZ parameter.
[Figure 10 image. Panel titles: “(z75−z25/2(1+z50) i21all” and “(z75−z25/2(1+z50) i21 0.4<z<0.7”; axes: r−i and g−r.]
Figure 10: “Typical” redshift PDF of galaxies in the given colour-colour bin,
normalised by (1 + z50). I approximate it by a median of the distribution of the
uncertainties of individual galaxies in the corresponding colour-colour bin, where
the uncertainty in the redshift PDF of the individual galaxy is expressed by the
sigmaPZ parameter.