Scientific Computing Using Python - PHYS:4905
Lecture #12 - Prof. Kaaret
Image display and processing
Today we'll looking at processing and displaying images in
Python. Most of the code we'll examine involves calling
routines in various Python libraries. As discussed in lecture
#4, a large part of the effort in using libraries to perform tasks
is figuring out what function calls you need to make to do what you
want. The code that you'll use today are useful examples for
plotting images and histograms, making contour and surface plots,
and smoothing images.
M57 - the Ring nebula
The image we will use today was taken using the Hubble Space
Telescope and is of the object known as Messier 57 (M 57) and also
known as the Ring nebula. Objects like M57 are called
"planetary nebula" because they have a disk-like shape and look like
planets if viewed through a telescope with low resolution.
"Planetary" nebula have nothing to do with planets. The name
persists because astronomers are too steeped in their own history.

"Planetary" nebula are actually shells of gas expelled when a star
sheds it outer layers as it ends the giant phase of its evolution
and becomes a white dwarf - a hot, compact stellar remnant.
The white dwarf in the Ring nebula has a temperature of 1.2×105
K and is visible as the white dot in the center. Ultraviolet
light from the white dwarf illuminates the surrounding gas.
The colors in the image are approximately true colors. The color
image was assembled from three images taken through different narrow
band filters with the Hubble telescope's Wide Field Planetary Camera
2. Blue isolates emission from very hot helium (the filter bandpass
is centered at 469 nm to capture He II), which is located primarily
close to the hot central star. Green represents ionized oxygen ([O
III] at 502 nm), which is located farther from the star. Red shows
ionized nitrogen ([N II] at 658 nm), which is radiated from the
coolest gas, located farthest from the star. M57 is about a
light-year in diameter and is located about 2000 light-years away in
the direction of the constellation Lyra.
Reading and plotting an image
To get started, let's look at a simple program to read in an image
in FITS format and plot it on the screen. Load the program plot_image.py and the FITS image file m57_f658n.fits into a directory on your
computer. We are going to run through plot_image one 'cell' at
a time. Spyder allows you to execute programs in what it calls
'cells' with the start of each cell marked by a comment line
beginning with #%%. Open plot_image in Spyder and put your
cursor on the first line. Then press the button to "Run
current call and go to the next one" or press Shift+Enter.

Spyder should list the lines executed in the console pane. The
first cell has a bunch of imports and a plt.ion()
command to setup interactive plotting mode. Run the next
cell. This cell loads the image file and puts the image data
into a variable called img. Go to the variable
explorer window and you should see img. It is an
array of floats with a size of 1200 by 1400.
Run the next cell. This cell makes a histogram of the values
of each pixel in the image. A histogram shows the distribution
of values. We call plt.hist with bins=100, meaning
that our histogram has 100 bins. By default, plt.hist
will make those bins go from the minimum value in the input data to
the maximum value. Look at the plot in the histogram
window. It shows that there are a whole lot of very low values
(values in the first bins) with a small tail going to higher
bins. The histogram isn't that useful. The cell also
plots an image using plt.imshow, say this as 'plot dot
image show'. To plot the image, we first set up a color map,
which is maps from pixel values to color. We have loaded a
gray scale map that will produce what looks like a black and white
image. By default, plt.imshow adjusts the color map
to cover from the minimum value in the input data to the maximum
value. The image is even less useful because it looks totally
black.
The problem with our first try at a histogram and an image is that
we have outliers in the data - meaning one or more points with
values very different from the typical values. We can have a
look at this by executing the next cell, which you should do
now. That cell prints out some statistics about the image, in
particular the minimum, the maximum, the mean, and the median.
We can see that the maximum is much larger than the mean or the
median. That's our problem.
We need to set the histogram range with more sensible values.
The next block uses the image statistics to try to get a better
scaling. There are several lines of code to choose from to set
the variable vmax, which we will use to set the maximum
value for scaling. Think about what you want to use (maximum,
mean, or median) and uncomment the appropriate line. Also
adjust the numerical factor in front of your chosen statistic.
Then run the cell. However, use "Run the current cell"
(Ctrl+Enter ) instead of "Run current call and go to the next one"
so that you can run this cell repeatedly as you play with different
statistics and different numerical factors. Recall that you
can use the zoom button in the "Better Histogram" window to adjust
your view of that histogram. Keep trying different statistics
and/or different numerical factors until you are happy with your
image and histogram.
Smoothing images
Contour and surface plots are more sensitive to noise or
fluctuations in the data. So before generating those plots, we
have chosen to smooth our data. We do this by applying a
'Gaussian filter'. A one dimensional Gaussian looks like the
function below.

The equation for a Gaussian is
where µ is the mean and σ is the standard deviation which is
determines the width. The Full Width at Half Maximum (FWHM) =
2.35σ. The factor in front makes the integral over all real
numbers equal to one.
Since we have a 2D image, we use a 2D Gaussian that looks like the
image below. A 2D Gaussian is the product of two Gaussians,
one for x and one for y.

To smooth the image, we look at a particular pixel. We place a
2D Gaussian centered at that pixel. We then calculate a sum by
multiplying the Gaussian times the image pixel value for each pixel
in the Gaussian. We replace the value of the particular pixel
with the value of the sum. In principle, the Gaussian has
infinite extent in each dimension, but in practice, we limit the
Gaussian and the sum to cover only a finite number of pixels.
The Gaussian is normalized so that its sum over selected set of
pixels is unity. This process is called convolution. The
wikipedia page on convolution, http://en.wikipedia.org/wiki/Convolution,
has a good explanation with some nice graphics to help you get some
intuition about what convolution means and does to functions.
One can do convolution with functions other than Gaussians.
The function selected for convolution with the image is called the
'kernel'. We use gaussian_filter from the scipy.ndimage.filters
package that handles all of these details for us. That
function takes two parameters. One is the image to be smoothed
and the other is the standard deviation for the Gaussian, which is
called sigma. Sigma can be given a single number, which makes
a symmetric function, or as a sequence with a different value for
each axis.
Smoothing will change the range of pixel values in the image.
Thus, after smoothing, we need to re-evaluate the range of
interesting pixel values to use for our plots. Execute the
cell that does smoothing and then plots the smoothed image.
Try adjusting the value of sigma to see what it does to the
plots. Each time you adjust sigma, you may need to adjust the
how vmax is set in order to get a good image. Try
values of sigma of 1, 2, 5, 10, 20, and 50. How would you
describe the effect of different values for sigma on the
image? After you are done, set sigma back to 10.
Contour and surface plots
Contour and surface plots are different ways of presenting
data. They have advantages and disadvantages relative to
images. The choice of which type of plot to use is often
aesthetic and the plots of different types are sometimes
combined. For example, to compare gamma-ray and radio images
of a nebula, one might overlay contours generated from the radio
data on to an image of the gamma-ray data.
To make contour and surface plots in Python, we need three
variables: one that holds the X coordinate of each bit of
data (in our case, the pixel location), one that holds the Y
coordinate, and one that holds the Z coordinate (in our
case, the intensity in the pixel). We have the Z data
already in the image. The first thing that the next block does
is generate arrays with the needed X and Y
coordinates. We first generate arrays of numbers for X
and Y. These arrays must have the same length as the
corresponding axis of the image. Our code uses the pixel
number for the coordinate value. You might want to use
something like the astronomical coordinate values instead. We
then take those arrays and combine them into a grid. The
resulting arrays x and y have exactly the same
shape as the image. Each entry gives the x or y
coordinate of the corresponding pixel in the image.
For contour plots, we need to set the contour levels. These
are the pixel values at which the contours are drawn. For
example, if we pick a level of 0.5, the plot will have a contour
that separates pixels with values above 0.5 from those with values
below 0.5. We have chosen to generate contour levels from vmin
to vmax using np.linspace and then drop the
values at vmin and vmax (those contours would be
empty, think about it). The selected levels go into the
variables contours. The plt.contour function makes
the contour plot and returns information about the contours that can
then be used by plt.clabel to label the contours.
Surface plots are representations of 3D images. (I am still
waiting for my holographic 3D display.) We need to import
Axes3D from mpl_toolkits.mplot3d to get functions to set
up 3D plots. The plot_surface function then plots
surfaces in the graphical space set up with Axes3D. Note that
you can click and drag in the surface plot window if you want to see
the plot from different angles. It's cool.
Your turn
We used some simple statistics to attempt to adjust the scaling
for our images. Another possible way to scale the data would
be to adjust the lower and upper bound on the scale so that some
percentage of the pixel values lie below the lower bound and some
percentage lie below the upper bound. Modify the code in plot_image.py
so that it takes two parameters, vmin_frac and vmax_frac,
that adjust vmin and vmax so that vmin_frac
of the pixels have values below vmin and vmax_frac
have values below vmax. To start, you might use vmin_frac
= 0.01 and vmax_frac = 0.95, but note that they must be
adjustable. Save this code for your future use, in
particular to hand in for homework #11.
FYI
The image that we used was downloaded from the Hubble Legacy
Archive (HLA). The HLA is "designed to optimize science from
the Hubble Space Telescope by providing online, enhanced Hubble
products". If you want to plot more Hubble images, you can
access the HLA at https://hla.stsci.edu/hlaview.html.
Type an object name in the box (M16 is a nice one), click on
Search, then click on the Images tab. You can then download
by right clicking (or whatever mac people do) on one of the
FITS-Science links. You should be able to read the FITS file
directly into today's program. However, you'll need to do
some work on setting vmin and vmax as the raw
HST images usually include outliers.