The Spread of AIDS
Part 3: Checking Your Model Function
In this part we consider how to
measure how well the model function fits the given data.
- Overlay a semilog
or loglog plot (whichever is appropriate) of your
model function on a semilog or loglog plot of the given data. [The
commands for doing one type of plot are already in your worksheet. If you
need the other type, modify the commands accordingly.] If you have constructed
your function f(t) correctly, its graph should appear as
a straight line, and the line should come reasonably close to most of the
data points. If this is not the case, go back now to Part
2, and rethink your construction of the function.
- Next, overlay an ordinary Cartesian
plot of your model function on the initial plot of the data points. [Again,
the commands are already in your worksheet.] Describe the extent to which
you think your model function represents the given data.
- The visual information in the first two steps can help you decide whether you
followed the right modeling strategy and whether you have reasonable values
for the parameters a and b in the model
function. However, these graphs don't really measure the accuracy of our fit
to the data. We can get a better sense of that by calculating residuals,
that is, the numerical differences between the observed numbers of cases and the numbers
predicted by the model at the corresponding times. Specifically, the j-th residual
is the difference Cases_j - f(Months_j).
Use the commands in your worksheet to plot the residuals as a function of
time. If you have a good fit, the residuals should be relatively small --
are they?
- Even with a very good fit, you will probably see that the residuals, some
positive and some negative, tend to grow in magnitude as time goes on. There
is a trick word in the preceding question: "relatively". If the
large residuals are differences between very large numbers, they may still be relatively
small. We can "scale" the residuals by dividing each one by the
number of cases at the corresponding time -- this gives us relative residuals.
Use the commands in your worksheet to plot the relative residuals, and then
comment (again) on whether your residuals are relatively small.
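
The residual and relative-residual calculations described above can be sketched in Python as follows. This is only an illustration, not the commands in your worksheet. The power-law form of f, the parameter values a and b, and the months and cases lists are all made-up placeholders, chosen so the arithmetic is easy to follow. Substitute your own model function from Part 2 and the given data.

```python
def residuals(times, cases, model):
    """Return the residuals cases[j] - model(times[j]) for each j."""
    return [c - model(t) for t, c in zip(times, cases)]


def relative_residuals(times, cases, model):
    """Return each residual divided by the observed count at that time."""
    return [(c - model(t)) / c for t, c in zip(times, cases)]


# Placeholder parameters and model form (assumptions, not fitted values).
a, b = 2, 3


def f(t):
    """Hypothetical model function; replace with your own f(t)."""
    return a * t ** b


# Placeholder data (NOT the worksheet's data).
months = [1, 2, 3, 4]
cases = [3, 15, 50, 130]

res = residuals(months, cases, f)
rel = relative_residuals(months, cases, f)

for t, r, q in zip(months, res, rel):
    print(f"t = {t}: residual = {r}, relative residual = {q:.3f}")
```

With real data, plotting res and rel against months shows whether the relative residuals stay small even where the raw residuals are large.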