Learn basic Python programming to create functional and effective visualizations from earth observation satellite data sets
Thousands of satellite datasets are freely available online, but scientists need the right tools to efficiently analyze data and share results. Python has easy-to-learn syntax and thousands of libraries to perform common Earth science programming tasks.
Earth Observation Using Python: A Practical Programming Guide presents an example-driven collection of basic methods, applications, and visualizations to process satellite data sets for Earth science research.
Gain Python fluency using real data and case studies
Read and write common scientific data formats, like netCDF, HDF, and GRIB2
Create 3-dimensional maps of dust, fire, vegetation indices and more
Learn to adjust satellite imagery resolution, apply quality control, and handle big files
Develop useful workflows and learn to share code using version control
Acquire skills using online interactive code available for all examples in the book
The American Geophysical Union promotes discovery in Earth and space science for the benefit of humanity. Its publications disseminate scientific knowledge and provide resources for researchers, students, and professionals. Find out more about this book from this Q&A with the author.
1 A TOUR OF CURRENT SATELLITE MISSIONS AND PRODUCTS
There are thousands of datasets containing observations of the Earth. This chapter describes some satellite types, orbits, and missions, which benefit a variety of fields within Earth sciences, including atmospheric science, oceanography, and hydrology. Data are received on the ground through receiver stations and processed with retrieval algorithms. But the raw data require further manipulation to be useful, and Python is a good choice for analysis and visualization of these datasets.
At present, there are over 13,000 satellite-based Earth observations freely and openly listed on www.data.gov. Not only is the quantity of available data notable, its quality is equally impressive; for example, infrared sounders can estimate brightness temperatures within 0.1 K of surface observations (Tobin et al., 2013), imagers can detect ocean currents with an accuracy of 1.0 km/hr (NOAA, 2020), and satellite-based lidar can measure ice-sheet elevation change with a 10 cm sensitivity (Garner, 2015). Previously remote parts of our planet are now observable, including the open oceans and sparsely populated areas. Furthermore, many datasets are available in near real time, with image latencies ranging from less than an hour down to minutes, the latter being critically important for natural disaster prediction. Having data rapidly available enables science applications and weather prediction as well as emergency management and disaster relief. Research-grade data take longer to process (hours to months) but have higher accuracy and precision, making them suitable for long-term consistency. Thus, we live in the "golden age" of satellite Earth observation. While the data are accessible, the tools and skills necessary to display and analyze this information require practice and training.
Python is a modern programming language that has exploded in popularity, both within and beyond the Earth science community. Part of its appeal is its easy-to-learn syntax and the thousands of available libraries that can be synthesized with the core Python package to do nearly any computing task imaginable. Python is useful for reading Earth-observing satellite datasets, which can be difficult to use due to the volume of information that results from the multitude of sensors, platforms, and spatio-temporal spacing. Python facilitates reading a variety of self-describing binary datasets in which these observations are often encoded. Using the same software, one can complete the entirety of a research project and produce plots. Within a notebook environment, a scientist can document and distribute the code to other users, which can improve efficiency and transparency within the Earth sciences community.
Satellite data often require some pre-processing to make them usable, but which steps to take and why are not always clear. Data users often misinterpret concepts such as data quality, how to perform an atmospheric correction, or how to implement the complex gridding schemes necessary to compare data at different resolutions. Even to a technical user, the nuances can be frustrating and difficult to overcome. This book walks you through some of the considerations a user should make when working with satellite data.
The primary goal of this text is to get the reader up to speed on the Python coding techniques needed to perform research and analysis using satellite datasets. This is done by adopting an example-driven approach. It is light on theory but will briefly cover relevant background in a nontechnical manner. Rather than getting lost in the weeds, this book purposefully uses realistic examples to explain concepts. I encourage you to run the interactive code alongside reading the text. In this chapter, I will discuss a few of the satellites, sensors, and datasets covered in this book and explain why Python is a great tool for visualizing the data.
1.1 History of Computational Scientific Visualization
Scientific data visualization used to be a very tedious process. Prior to the 1970s, data points were plotted by hand using tools such as slide rules, French curves, and graph paper. During the 1970s, IBM mainframes became increasingly available at universities and facilitated data analysis on the computer. For analysis, IBM mainframes required that a researcher write Fortran-IV code, which was then punched onto cards using a keypunch machine (Figure 1.1). The punch cards were then manually fed into a shared university computer to perform calculations. Each card held roughly one line of code. To make plots, the researcher could write a Fortran program to produce an ASCII plot, which builds an image by combining lines, text, and symbols. The plot could then be printed on a line printer or a teleprinter. Some institutions had computerized graphic devices, such as Calcomp plotters. Rather than create ASCII plots, the researcher could use a Calcomp plotting command library to control how data were visualized and store the code on computer tape. The scientist would then take the tape to a plotter, which was not necessarily (or usually) in the same area as the computer or keypunch machine. Any errors, such as bugs in the code, damaged punch cards, or damaged tape, meant the whole process would have to be repeated from scratch.
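For a sense of what an ASCII plot involves, here is a minimal modern sketch written in Python rather than Fortran-IV. It is not a reconstruction of any historical program: the function name `ascii_plot`, its parameters, and the sine-wave sample data are all assumptions made for illustration.

```python
import math


def ascii_plot(values, height=10, symbol="*"):
    """Render a sequence of numbers as a text-only plot.

    Each value occupies one column; rows run from the maximum (top)
    to the minimum (bottom). Returns the plot as a list of strings.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero for constant data
    # Map every value to a row index: 0 is the top row, height - 1 the bottom.
    rows = [round((hi - v) / span * (height - 1)) for v in values]

    lines = []
    for r in range(height):
        body = "".join(symbol if row == r else " " for row in rows)
        lines.append("|" + body)  # "|" draws the y-axis
    lines.append("+" + "-" * len(values))  # x-axis
    return lines


# Hypothetical sample data: one cycle of a sine wave sampled at 40 points.
samples = [math.sin(2 * math.pi * i / 40) for i in range(40)]
plot = ascii_plot(samples, height=11)
print("\n".join(plot))
```

Every data point becomes a single character placed in a grid of text, which is why such plots could be sent straight to a line printer or teleprinter without any graphics hardware.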
In the mid-1980s, universities provided remote terminals that would eventually replace the keypunch and card reader machine system. This substantially improved data visualization processes, as scientists no longer had to share limited resources such as keypunch machines, card readers, or terminals. By the late 1980s, personal computers became more affordable for scientists. A typical PC, such as the IBM XT 286, had 640 KB of random access memory, a 32 MB hard drive, and 5.25 inch floppy disks with 1.2 MB of disk storage (IBM, 1989). At this time, pen plotters became increasingly common for scientific visualization, followed later by the prevalence of ink-jet printers in the 1990s. These technologies allowed researchers to process and vis...