You’ve probably already marveled at the groundbreaking and truly mind-blowing images of space that were captured by NASA’s James Webb Space Telescope (JWST) and released this week. If you haven’t seen them yet, feast your eyes on these photos and prepare to feel very small.
While it’s practically impossible to wrap your head around the vastness of the universe, the technology that JWST uses to take these incredible images is more down-to-earth than you might think. In fact, the general-purpose programming language Python played a big role in JWST’s ability to capture and catalog these images.
Curious how programming skills can impact space exploration? Read on to learn about the connection between astronomy and data science, and the programming languages to learn if you want to get into the field.
Why astronomy is a lot like data science
“Astronomy has always been about data collection,” says Nitya Mandyam, Senior Curriculum Developer at Codecademy who has a PhD in astrophysics. Since ancient times, astronomers — the ground-based researchers who study stars, planets, and other celestial bodies — have been mapping the position of the moon and creating star catalogs for practical purposes, like tracking the seasons or planning crops.
Thanks to high-tech telescopes like JWST, astronomers today have way more data to work with, so they use code to manage it. “We spend our time at a computer writing and running code to analyze the images and data collected from telescopes and other instruments,” Erik Tollerud, Assistant Astronomer at the Space Telescope Science Institute, told GitHub’s The ReadME Project.
Python is the most common programming language that folks in the astronomy field utilize, because it’s “the language of data analysis, data manipulation, and data inference,” Nitya says.
Other programming tools that are common in astronomy include NumPy, the Python library for performing numerical operations on large quantities of data, and Matplotlib, a library for creating visuals in Python.
“So much of astronomy is basically taking data, putting it on a plot, and then drawing conclusions,” Nitya explains. “There’s so much data that the visual trends that we can see make all the difference.”
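To make that workflow concrete, here’s a minimal sketch of the “take data, look for a trend” step using NumPy. The numbers are made up for illustration (they are not JWST measurements), and the variable names are our own.

```python
import numpy as np

# Synthetic, made-up measurements (not real telescope data):
# the distance to each galaxy and how fast it appears to recede.
distance_mpc = np.array([10.0, 50.0, 100.0, 200.0, 400.0])
velocity_km_s = np.array([720.0, 3480.0, 7050.0, 13900.0, 28100.0])

# Fit a straight line (a degree-1 polynomial) through the points.
slope, intercept = np.polyfit(distance_mpc, velocity_km_s, 1)

# Compare the fitted line with the measurements.
predicted = slope * distance_mpc + intercept
largest_miss = np.max(np.abs(velocity_km_s - predicted))

print(f"trend: about {slope:.1f} km/s per megaparsec")
print(f"largest miss: {largest_miss:.0f} km/s")
```

In a Jupyter Notebook, the natural next step would be to put the points and the fitted line on a Matplotlib chart, since a trend like this is much easier to judge by eye.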
Using Python for the largest space telescope in history
The spectacular photos that JWST took of glimmering nebulas and “cosmic cliffs” also contain valuable data that scientists rely on to study galaxies. As JWST orbits 1 million miles away, software engineers back on Earth use Python to receive, organize, and file all the data that comes from the telescope.
Here’s how it works: Data from NASA’s Deep Space Network feeds down into the Space Telescope Science Institute’s processing systems using Python. “And that’s where my code comes in,” Mike Swam, the data processing team lead who worked on JWST, said on an episode of the podcast Talk Python to Me in March 2022.
The stakes are high for software engineers to make sure the data is complete, check it for errors, and shepherd all of the pieces along the processing systems pipeline so the files can be archived properly. “We have a lot of data completion checking that we do in Python,” Mike said on the podcast.
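What might a completeness check look like? Here’s a simplified sketch, not the team’s actual code. It assumes each chunk of data arrives with a sequence number, which is our own assumption for illustration, and the function name `find_missing` is hypothetical.

```python
def find_missing(received_ids, expected_count):
    """Return the packet numbers that never arrived, in ascending order."""
    expected = set(range(expected_count))
    return sorted(expected - set(received_ids))


# Pretend a download of 8 numbered packets arrived with one
# duplicate (packet 5) and two gaps.
received = [0, 1, 2, 4, 5, 5, 7]
missing = find_missing(received, expected_count=8)

if missing:
    print(f"Incomplete: still waiting on packets {missing}")
else:
    print("Complete: safe to pass along the pipeline")
```

Using sets means duplicates and out-of-order arrivals don’t affect the result. Only the gaps are reported.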
The data that these programmers work with ranges from binary data that comes from flight data recorders to engineering and “ephemeris data,” which tell you exactly where the telescope has been positioned and what it’s been doing. All of this supplemental data gets stored in the files so that scientists can access it and conduct research.
Without the data, JWST’s photos might as well be pretty screensavers: “They’re beautiful and they’re almost useless for science without the metadata,” Mike said.
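A toy example shows why. In this sketch, two exposures have identical pixel values, and only the metadata reveals that one source is ten times brighter than the other. The `Exposure` class and the `exposure_seconds` key are hypothetical names for illustration, not JWST’s actual file format.

```python
from dataclasses import dataclass


@dataclass
class Exposure:
    pixels: list    # raw detector counts, row by row
    metadata: dict  # facts about how the image was taken


def count_rate(exposure):
    """Convert raw counts to counts per second using the exposure time."""
    seconds = exposure.metadata["exposure_seconds"]
    return [[count / seconds for count in row] for row in exposure.pixels]


# Two made-up exposures: same pixel values, different durations.
short_exp = Exposure([[100, 200], [300, 400]], {"exposure_seconds": 10.0})
long_exp = Exposure([[100, 200], [300, 400]], {"exposure_seconds": 100.0})

print(count_rate(short_exp)[0])
print(count_rate(long_exp)[0])
```

Without the exposure time, the two images would be indistinguishable, and any brightness measurement based on them would be meaningless.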
How to start exploring data that came from space
If you want to be a “citizen scientist” and analyze JWST’s data on your own, there’s a whole GitHub repo available where you can access pretty much the entire pipeline. The Space Telescope Science Institute also has guides to its documentation and instructions on how to access it here.
“Obviously, you may not know what to do with it — but even as a starting point, having all of this information there is super useful,” Nitya says.
There are also some fun astronomy-specific tools and applications that you might want to check out: Galaxy Zoo is a crowd-sourced application that allows volunteers to classify galaxies based on their shapes; and Astropy is a Python package specifically designed for astronomy.
It’s a good idea to familiarize yourself with Jupyter Notebooks, the tool that enables you to write and iterate on your Python code for data analysis. Nitya calls Jupyter Notebooks “the bread and butter of data science.”
Whether you’re inspired to pick up Python or you want to dig into analyzing JWST’s data and open-source software right away, here are the courses to check out:
- Learn Python 3: Python tends to be a good language for beginners because it’s easy to read and concise. In this course, you’ll master the fundamentals of programming in Python and complete a few projects.
- Learn SQL: Basic SQL querying is super useful in astronomy, Nitya says. This course walks you through how to use SQL to access, create, query, and manipulate data.
- Visualize data with Python: This skill path will walk you through some of the go-to Python libraries that astronomers use, like Matplotlib and pandas. You’ll also learn how to create charts and graphs to tell a story with data.
- Build Deep Learning Models with TensorFlow: If you know some Python, NumPy, and machine learning already, you’ll be well-suited for this skill path. There’s even a project that will have you classifying galaxies.
- Getting Started with Python for Data Science: This free course will show you how to use Python and data science tools like pandas and Jupyter Notebook to start analyzing and visualizing real datasets.