Fitting Models of Halo Emission

Written Report

This is the second project that you will do as a group. Your group will hand in a single report with all group members contributing and listed as authors. The report should read as a single, complete study on the given problem. The report should be similar to a scientific paper with sections for Introduction and statement of the problem (a few sentences each), description of the models and code (turn in the code separately, do not include it in the body of the report), results/predictions from the models for given parameters, and a discussion of the results. It should address all the questions asked in the assignment description.

The beta model

Re-read the paper "Constraining the Milky Way's Hot Gas Halo with O VII and O VIII Emission Lines" by Matthew Miller and Joel Bregman at http://adsabs.harvard.edu/abs/2015ApJ...800...14M.
To begin, we will use the Python code that you wrote to calculate line intensities based on a beta model. We will restrict ourselves to a spherical beta model with two effective parameters, β and the normalization n₀ r_c³^β. Take your code from the previous assignment and modify it to calculate line intensities for this simplified model.
Using the best fitted values for these two parameters for the optically thin case given in Table 2 of M&B (β = 0.50, n₀ r_c³^β = 0.0135 cm^-3 kpc³^β), plot the O VIII line emission versus Galactic longitude and versus Galactic latitude. Make sure that the vertical scale of your plot is in line units (L.U.).

Faking the line intensities

We will simulate the case where you have a set of measurements of O VIII line emission at the same positions on the sky as used by Henley and Shelton (2012) and with the same uncertainties. Using your code from the assignment Using Python to Read and Plot Data, write Python code to read in the list of (l, b) values, line intensities, and uncertainties from H&S.
You will now write code to generate fake O VIII line data. For each (l, b) value, use your emission line code and the best fitted parameter values from M&B (β = 0.50, n₀ r_c³^β = 0.0135 cm^-3 kpc³^β) to estimate the line intensity for that direction. Make a new set of "measurements" with the measured line intensity from H&S replaced with your calculated value, but keeping l, b, and the uncertainty from H&S. You may choose to write out your fake data into a file or keep them as Python arrays. We will generate several sets of fake data.
Write Python code to evaluate your integral over the line of sight distance for a specific choice of l, b and beta model and parameters. Write code for both the spherically symmetric beta model and the flattened disk model. When writing the integration code, there are several things to keep in mind and discuss in your report. For example, what are the bounds of integration? Is the result sensitive to those bounds? Is the integrand mathematically well behaved over the full range of integration? Note also that any numerical integration of an continuous function is necessarily an approximation. How accurate is your integration?
This page is useful for doing numerical integration with Python: http://docs.scipy.org/doc/scipy/reference/tutorial/integrate.html.

Fitting the fake data

You will now write code to find the set of model parameters that best fit the fake data.
First we need to define 'best fit'. Usually the best fit is taken as the set of model parameters that minimizes the chi-squared ( $\chi ^{2}$ ) deviation of the model from the data defined as the sum over all measurements of (model - measurement)²/(measurement error)². Sometimes, the likelihood L ~ exp( $\chi ^{2}$ /2) is used.
Now we need a procedure that varies the model parameters and finds the best fit. Since we have only two parameters, we will start with a brute force search of the parameter space. Write code that tries generates a grid of parameters covering β from 0.1 to 1.2 and n₀ r_c³^β = 0.002 to 0.05 cm^-3 kpc³^β, finds the $\chi ^{2}$ for each parameter pair. Note that you might want to think about the step size that you want to use and whether you should use linear or logarithmic steps for each parameter. Make 3-d and contour plots of $\chi ^{2}$ versus parameter value. Find the parameters that minimize $\chi ^{2}$ and how much the parameters can vary and keep the increase in $\chi ^{2}$ to less than 2.71. Discuss your results in your report.
The brute force technique works when there are 2 or 3 parameters, but becomes increasingly computationally expensive for large numbers of parameters. There are a large variety of procedures that can be used to explore multi-dimensional parameter space. The scipy routine curve_fit is one and provides a variety of optimization methods. M&B use the Markov Chain Monte Marlo (MCMC) package emcee. Choose a such a routine and write code to use it to find the best fit parameter values and the uncertainty in the best fit parameter values. Compare with the results of your brute force search.

Testing the fit quality

Now let's figure out how well the fake data constrain the model parameters.
Generate fake data with value of β from 0.3 to 1.0 in steps of 0.05 and n₀ r_c³^β = 0.0135 cm^-3 kpc³^β. For each set of fake data, fit the model and record the best fitted normalization and β. Make plots of the fitted normalization and β as a function of the input β. What trends do you see in the plots? Would it be useful to fit the plots?
Repeat the previous step, but for the normalization. Use values of n₀ r_c³^β from 0.005 to 0.03 in steps of 0.0025 (units are cm^-3 kpc³^β). Make similar plots and answer similar questions.
In your report, discuss any resulting thoughts that you have regarding this work.