Local two-sample testing: A new tool for analysing high-dimensional astronomical data

P. E. Freeman, I. Kim, A. B. Lee

Research output: Contribution to journal › Article › peer-review


Modern surveys have provided the astronomical community with a flood of high-dimensional data, but analyses of these data often occur after their projection to lower dimensional spaces. In this work, we introduce a local two-sample hypothesis test framework that an analyst may directly apply to data in their native space. In this framework, the analyst defines two classes based on a response variable of interest (e.g. higher mass galaxies versus lower mass galaxies) and determines at arbitrary points in predictor space whether the local proportions of objects that belong to the two classes significantly differ from the global proportion. Our framework has a potential myriad of uses throughout astronomy; here, we demonstrate its efficacy by applying it to a sample of 2487 i-band-selected galaxies observed by the HST-ACS in four of the CANDELS programme fields. For each galaxy, we have seven morphological summary statistics along with an estimated stellar mass and star formation rate (SFR). We perform two studies: one in which we determine regions of the seven-dimensional space of morphological statistics where high-mass galaxies are significantly more numerous than low-mass galaxies, and vice versa, and another study where we use SFR in place of mass. We find that we are able to identify such regions, and show how high-mass/low-SFR regions are associated with concentrated and undisturbed galaxies, while galaxies in low-mass/high-SFR regions appear more extended and/or disturbed than their high-mass/low-SFR counterparts.
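To make the idea of a local two-sample test concrete, the sketch below compares the class proportion among the k nearest neighbours of a query point with the global proportion, using a normal approximation to the binomial. This is only an illustration under stated assumptions, not the test statistic used in the paper: the function name `local_proportion_test`, the Euclidean metric, the choice k=50, and the synthetic seven-dimensional data are all hypothetical.

```python
import math
import numpy as np


def local_proportion_test(X, y, query, k=50):
    """Two-sided z-test of the class-1 proportion among the k nearest
    neighbours of `query` against the global class-1 proportion.

    Assumes both classes are present (0 < global proportion < 1) and
    treats the global proportion as fixed under the null hypothesis.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    p_global = y.mean()
    # Euclidean distance from the query point to every object (assumed metric)
    dist = np.linalg.norm(X - np.asarray(query, dtype=float), axis=1)
    neighbours = np.argsort(dist)[:k]
    p_local = y[neighbours].mean()
    # Normal approximation to the binomial under H0: local == global
    se = math.sqrt(p_global * (1.0 - p_global) / k)
    z = (p_local - p_global) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return p_local, p_global, p_value


# Synthetic stand-in for seven summary statistics (not the CANDELS data):
# class membership depends only on the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 7))
prob_class1 = 1.0 / (1.0 + np.exp(-3.0 * X[:, 0]))
y = (rng.random(2000) < prob_class1).astype(int)

query = np.array([2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
p_local, p_global, p_value = local_proportion_test(X, y, query, k=50)
print(f"local={p_local:.2f} global={p_global:.2f} p={p_value:.3g}")
```

Applying such a test at many query points would call for a multiple-testing correction, which this sketch leaves out.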

Original language: English
Pages (from-to): 3273-3282
Number of pages: 10
Journal: Monthly Notices of the Royal Astronomical Society
Issue number: 3
Publication status: Published - 2017

Bibliographical note

Funding Information:
The authors would like to thank the members of the CANDELS collaboration for providing the data upon which this work is based. We would also like to thank Jeff Newman (University of Pittsburgh) for acting as IK’s external adviser for the project on which this paper is based, and Rafael Izbicki (Federal University of São Carlos) and Jen Lotz (Space Telescope Science Institute) for helpful discussions. This work was supported by NSF DMS-1520786 and NIMH R37MH057881. Our research has made use of SAOimage DS9, as well as the dmtools provided by the Chandra X-ray Center in the application package CIAO.

Publisher Copyright:
© 2017 The Authors.

All Science Journal Classification (ASJC) codes

  • Astronomy and Astrophysics
  • Space and Planetary Science

