Galaxy clustering: a point process
Galaxy clustering is the aggregation of galaxies in the universe driven by gravity. Galaxies tend to form larger structures, such as clusters and filaments, that weave the Cosmic Web. This Large Scale Structure of the Universe can be understood as the resulting distribution of galaxies, the outcome of a process in which all galaxies are subject to common forces and share universal properties. This distribution can be analyzed with Point Process techniques, the statistical study of configurations of points in a given space. In this thesis we apply this branch of statistics in three different approaches: summary statistics, data mining and modeling. The results show that Point Processes are an excellent tool both to unveil the properties of the galaxy distribution and to model its patterns.
Different data sources have been used as examples of galaxy Point Processes. These include modern galaxy surveys such as the Sloan Digital Sky Survey (SDSS) and the ALHAMBRA survey, which allow us to study and discover new properties of the galaxy distribution and its consequences for galaxy behavior. The Counts-in-Cells distribution of a Point Process is a simple yet powerful technique to describe a distribution. For this statistic we use data from the SDSS. We fit the observed distribution with four different probability density functions and compare their goodness of fit. Another example of a summary statistic is the correlation function, which we use to describe the clustering of galaxies in the ALHAMBRA survey over a wide redshift range. With this statistic we are able to calculate the clustering of spectrally segregated galaxies at small scales (< 0.2 Mpc/h) for the first time. The data mining and modeling algorithms used in this thesis are tested on galaxy and dark matter simulations, such as the LasDamas and MultiDark simulations.
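As an illustration of the Counts-in-Cells statistic mentioned above, the following minimal sketch divides a cubic volume into equal cells, counts the points in each cell, and returns the fraction of cells holding exactly N points. The function name `counts_in_cells`, the cubic box geometry and the synthetic Poisson catalogue are assumptions made for this example; they are not the SDSS data or the procedure used in the thesis.

```python
import numpy as np

def counts_in_cells(points, box_size, n_cells_per_side):
    """Return P(N): the fraction of cubic cells containing exactly N points."""
    points = np.asarray(points, dtype=float)
    edges = np.linspace(0.0, box_size, n_cells_per_side + 1)
    # 3D histogram: one count per cubic cell of side box_size / n_cells_per_side
    counts, _ = np.histogramdd(points, bins=(edges, edges, edges))
    counts = counts.astype(int).ravel()
    # bincount gives the number of cells holding N = 0, 1, 2, ... points
    cells_with_n = np.bincount(counts)
    return cells_with_n / counts.size

# Synthetic homogeneous Poisson catalogue (assumption: stands in for survey data)
rng = np.random.default_rng(42)
box_size = 100.0
points = rng.uniform(0.0, box_size, size=(8000, 3))

p_n = counts_in_cells(points, box_size, n_cells_per_side=10)
mean_count = np.sum(np.arange(p_n.size) * p_n)
print(f"mean count per cell: {mean_count:.2f}")
```

For a clustered catalogue the resulting P(N) would be expected to show a broader tail than in this Poisson case, and that observed shape is what candidate probability density functions would be fitted to.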
Our first model for the galaxy distribution is the point interaction Gibbs model, a probabilistic model that describes the distribution of galaxies in terms of their pairwise distances. For close galaxy pairs the model increases the intensity of the process, creating aggregated patterns. Three different models have been used, depending on the cluster profile. The Geyer model defines a top-hat profile in which galaxies are aggregated at a higher intensity than that of the homogeneous Poisson distribution found at larger scales. The Fiksel model has a continuous profile with an exponential slope, giving higher clustering amplitudes at small scales. Finally, the Power Law model also defines a continuous profile, with a pole at distance 0.
Mixture Models are a powerful tool for both modeling and mining the galaxy distribution. Given a process with a well-defined structure, such as the galaxy distribution with its clusters, filaments and other kinds of galaxy aggregations, Mixture Models can properly describe its content. We need to define the number of structures and their morphology, and use this information to build a model that localizes and fits each structure. The resulting model is a probability density function that reliably describes the content of the point process and describes each structure separately, allowing efficient data mining.
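The mixture-model idea described above can be sketched as follows. This is a minimal example under stated assumptions, not the thesis implementation: each structure is taken to be an isotropic Gaussian clump (the thesis may use other profiles), the number of structures is fixed in advance, and the fit uses a basic expectation-maximization loop. The function name `fit_gaussian_mixture` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_gaussian_mixture(points, n_components, n_iter=100):
    """Fit an isotropic Gaussian mixture to points with a basic EM loop."""
    points = np.asarray(points, dtype=float)
    n_points, dim = points.shape
    # Deterministic farthest-point initialisation of the component centres
    centres = [points[0]]
    for _ in range(n_components - 1):
        d2 = np.min([np.sum((points - c) ** 2, axis=1) for c in centres], axis=0)
        centres.append(points[np.argmax(d2)])
    means = np.array(centres)
    variances = np.full(n_components, points.var())
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point
        d2 = np.sum((points[:, None, :] - means[None, :, :]) ** 2, axis=2)
        log_p = (np.log(weights) - 0.5 * dim * np.log(2 * np.pi * variances)
                 - 0.5 * d2 / variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, centres and variances
        n_k = resp.sum(axis=0)
        weights = n_k / n_points
        means = (resp.T @ points) / n_k[:, None]
        d2 = np.sum((points[:, None, :] - means[None, :, :]) ** 2, axis=2)
        variances = np.sum(resp * d2, axis=0) / (dim * n_k)
    return weights, means, variances, resp

# Two synthetic clumps (assumption: stand-ins for galaxy clusters)
rng = np.random.default_rng(0)
clump_a = rng.normal(loc=[20.0, 20.0, 20.0], scale=3.0, size=(300, 3))
clump_b = rng.normal(loc=[70.0, 60.0, 50.0], scale=5.0, size=(500, 3))
points = np.vstack([clump_a, clump_b])

weights, means, variances, resp = fit_gaussian_mixture(points, n_components=2)
labels = resp.argmax(axis=1)  # most probable structure for each point
order = np.argsort(means[:, 0])
print("weights:", np.round(weights[order], 3))
print("centres:", np.round(means[order], 1))
```

The fitted weights, centres and variances together define a probability density function for the whole pattern, while `labels` assigns each point to one structure, which is the data-mining use described above.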