Ultrafast Spectral Energy Distribution Modeling of Galaxies using GPUs

Project Information

optimization, parallelization, astrophysics, gpu, python
Project Status: Reviewing Applicants
Project Region: CAREERS
Submitted By: John Moustakas
Project Email: jmoustakas@siena.edu
Project Institution: Siena College
Anchor Institution: CR-Rensselaer Polytechnic Institute
Project Address: Department of Physics and Astronomy
515 Loudon Rd
Loudonville, New York 12211

Students: Samyak Tuladhar

Project Description

One of the most important outstanding problems in observational and theoretical astrophysics is to understand the physical origin and evolution of galaxies. Galaxies are gravitationally bound systems consisting of tens to hundreds of billions of stars, gas, and dust, as well as large amounts of dark matter, which we observe across the entire 14-billion-year history of the universe. Fortunately, sophisticated models exist which allow us to interpret the observed spectral energy distributions of galaxies---in essence, how bright they appear in different parts of the electromagnetic spectrum, particularly in the ultraviolet, optical, and infrared---in terms of their physical properties such as stellar mass and star-formation rate. For example, the stellar mass of a galaxy reveals how efficiently gas has been converted into stars over the evolutionary history of the galaxy, while the star-formation rate indicates the current rate at which new stars are being born, or whether star formation has ceased entirely.

Not surprisingly, the parameter likelihood space which must be explored in order to effectively model observations of galaxies can be very large. In addition, the latest generation of massively multiplexed astrophysical surveys, such as the Dark Energy Spectroscopic Instrument (DESI) survey, is observing samples of tens of millions of galaxies. Consequently, there is an acute need for massively parallelized, computationally efficient code which can extract astrophysically meaningful constraints from large observational datasets of galaxies.

The open-source Python software package needed to carry out this project is called FastSpecFit (https://fastspecfit.readthedocs.org/en/latest). The code is reasonably well-documented, and it has already been run on a high-performance computing system on samples of millions of galaxies observed by DESI. However, two computational bottlenecks hamper deploying FastSpecFit at the next scale, both in terms of input sample size and in the complexity of the underlying astrophysical models. These bottlenecks involve non-negative least-squares (NNLS) and non-linear least-squares fitting, both of which are currently handled by the CPU-optimized SciPy library.
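The two bottlenecks can be illustrated with SciPy itself. The sketch below is a toy stand-in, not FastSpecFit's actual model: `scipy.optimize.nnls` solves for non-negative template coefficients (analogous to the continuum fit), and `scipy.optimize.least_squares` fits the free parameters of a single hypothetical Gaussian emission line.

```python
import numpy as np
from scipy.optimize import nnls, least_squares

rng = np.random.default_rng(42)

# Toy stand-in for the continuum fit: find non-negative coefficients x
# minimizing ||A x - b||_2 over a hypothetical 5-template basis.
A = rng.random((100, 5))
x_true = np.array([0.5, 0.0, 2.0, 0.0, 1.0])
b = A @ x_true

coeffs, rnorm = nnls(A, b)

# Toy stand-in for the emission-line fit: one Gaussian line whose
# amplitude, center, and width are the free parameters.
wave = np.linspace(-5.0, 5.0, 200)

def gaussian(params, wave):
    amp, mu, sigma = params
    return amp * np.exp(-0.5 * ((wave - mu) / sigma) ** 2)

flux = gaussian([3.0, 0.5, 1.2], wave)  # noiseless synthetic "data"

def residuals(params):
    return gaussian(params, wave) - flux

fit = least_squares(residuals, x0=[1.0, 0.0, 1.0])
```

Both solvers run entirely on the CPU, which is why they dominate the runtime when millions of spectra are processed.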

With these issues in mind, the goal of this project is to port the computational "heart" of FastSpecFit to GPUs. We propose using JAX (https://jax.readthedocs.io/en/latest), which uses automatic (or computational) differentiation for optimization. Specifically, the open-source project JAXopt (https://jaxopt.github.io/stable) includes well-tested algorithms for solving a wide range of both linear and non-linear constrained optimization problems using GPU-accelerated automatic differentiation. After testing these algorithms on simple (simulated) datasets, we will implement an optional GPU version of FastSpecFit and ultimately test it on actual DESI data.
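As a minimal sketch of the approach (a hand-rolled projected gradient in pure JAX rather than JAXopt's wrappers, and hypothetical toy data rather than DESI spectra), the NNLS bottleneck can be expressed as a differentiable, JIT-compiled loop that runs unchanged on a GPU:

```python
import jax
import jax.numpy as jnp

def nnls_pg(A, b, n_steps=2000):
    """Minimize ||A x - b||^2 subject to x >= 0 by projected gradient descent."""
    # Step size from the Lipschitz constant of the gradient (spectral norm of A^T A).
    lr = 1.0 / jnp.linalg.norm(A.T @ A, ord=2)

    loss = lambda x: 0.5 * jnp.sum((A @ x - b) ** 2)
    grad = jax.grad(loss)  # automatic differentiation supplies the gradient

    @jax.jit
    def step(x, _):
        # Gradient step, then projection onto the non-negative orthant.
        x = jnp.clip(x - lr * grad(x), 0.0)
        return x, None

    x0 = jnp.zeros(A.shape[1])
    x, _ = jax.lax.scan(step, x0, None, length=n_steps)
    return x

# Hypothetical toy problem, not FastSpecFit internals.
key = jax.random.PRNGKey(0)
A = jax.random.uniform(key, (100, 5))
x_true = jnp.array([0.5, 0.0, 2.0, 0.0, 1.0])
b = A @ x_true
coeffs = nnls_pg(A, b)
```

JAXopt packages this same idea as `jaxopt.ProjectedGradient` with a non-negativity projection, and offers Gauss-Newton and Levenberg-Marquardt solvers for the non-linear least-squares bottleneck; the plan is to rely on those well-tested implementations rather than custom loops like the one above.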
