[ad2ce4e] | 1 | .. residuals_help.rst |
---|
| 2 | |
---|
| 3 | |
---|
| 4 | .. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ |
---|
| 5 | |
---|
| 6 | .. _Assessing_Fit_Quality: |
---|
| 7 | |
---|
| 8 | Assessing Fit Quality |
---|
| 9 | --------------------- |
---|
| 10 | |
---|
| 11 | When performing model-fits to some experimental data it is helpful to be able to |
---|
| 12 | gauge how good an individual fit is, how it compares to a fit of the *same model* |
---|
| 13 | *to another set of data*, or how it compares to a fit of a *different model to the* |
---|
| 14 | *same data*. |
---|
| 15 | |
---|
| 16 | The simplest check is to inspect the graph of the experimental data and |
---|
| 17 | see how closely (or not!) the 'theory' calculation matches it. But *SasView* |
---|
| 18 | also provides two other measures of the quality of a fit: |
---|
| 19 | |
---|
[5ed76f8] | 20 | * $\chi^2$ (or 'Chi2'; pronounced 'chi-squared') |
---|
[ad2ce4e] | 21 | * *Residuals* |
---|
| 22 | |
---|
| 23 | .. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ |
---|
| 24 | |
---|
| 25 | Chi2 |
---|
| 26 | ^^^^ |
---|
| 27 | |
---|
[99ded31] | 28 | $\chi^2$ is a statistical parameter that quantifies the differences between |
---|
[a7c6f38] | 29 | an observed data set and an expected data set (or 'theory'). It is calculated as |
---|
[ad2ce4e] | 30 | |
---|
[5ed76f8] | 31 | .. math:: |
---|
[ad2ce4e] | 32 | |
---|
[a7c6f38] | 33 | \chi^2 |
---|
| 34 | = \sum[(Y_i - \mathrm{theory}_i)^2 / \mathrm{error}_i^2] |
---|
[5ed76f8] | 35 | |
---|
[a7c6f38] | 36 | Fitting typically minimizes the value of $\chi^2$. However, for assessing the |
---|
| 37 | quality of the model and its "fit" this parameter is not terribly helpful on its |
---|
| 38 | own. Thus *SasView* instead displays a normalized version of this parameter, |
---|
| 39 | using the traditional reduced $\chi^2_R$. This is the $\chi^2$ divided by the |
---|
| 40 | degrees of freedom (or DOF). The DOF is simply the number of data points being |
---|
| 41 | considered reduced by the number of free (i.e. fitted) parameters. Note that |
---|
| 42 | model parameters that are kept fixed do *not* contribute to the DOF (they are |
---|
| 43 | not "free"). This reduced value is then given as |
---|
[ad2ce4e] | 44 | |
---|
[99ded31] | 45 | .. math:: |
---|
| 46 | |
---|
| 47 | \chi^2_R |
---|
| 48 | = \sum[(Y_i - \mathrm{theory}_i)^2 / \mathrm{error}_i^2] |
---|
| 49 | / [N_\mathrm{pts} - N_\mathrm{par}] |
---|
| 50 | |
---|
[a7c6f38] | 51 | where $N_\mathrm{par}$ is the number of *fitted* parameters. Note that this |
---|
| 52 | means the displayed value will vary depending on the number of parameters used |
---|
| 53 | in the fit. In particular, when doing a calculation without a fit (e.g. |
---|
| 54 | manually changing a parameter) the DOF will now equal $N_\mathrm{pts}$ and the |
---|
| 55 | $\chi^2_R$ will be the smallest possible for that combination of model, data |
---|
| 56 | set and set of parameter values. |
---|
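
As an illustration of the point above, the following minimal sketch computes $\chi^2$ and $\chi^2_R$ with NumPy and shows how the displayed value depends on the number of fitted parameters. The function names (``chi2``, ``reduced_chi2``) and the numbers are made up for this example; this is not *SasView*'s own code.

```python
import numpy as np

def chi2(y, theory, error):
    """Sum over points of (Y_i - theory_i)^2 / error_i^2."""
    y, theory, error = (np.asarray(a, dtype=float) for a in (y, theory, error))
    return float(np.sum((y - theory) ** 2 / error ** 2))

def reduced_chi2(y, theory, error, n_fitted):
    """chi2 divided by the degrees of freedom, N_pts - N_par."""
    dof = len(y) - n_fitted
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    return chi2(y, theory, error) / dof

# Hypothetical values for illustration only (not real scattering data).
y = np.array([10.2, 8.1, 5.9, 4.2, 2.8, 2.1])
theory = np.array([10.0, 8.0, 6.0, 4.0, 3.0, 2.0])
error = np.full(6, 0.2)

chi2_value = chi2(y, theory, error)
# Same data and theory curve, but a different count of free parameters.
chi2_r_fit = reduced_chi2(y, theory, error, n_fitted=2)    # DOF = 6 - 2
chi2_r_nofit = reduced_chi2(y, theory, error, n_fitted=0)  # DOF = 6

print(chi2_value, chi2_r_fit, chi2_r_nofit)
```

With only six points the two reduced values differ noticeably; with hundreds of points and a handful of fitted parameters the difference would be small.
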
| 57 | |
---|
| 58 | When $N_\mathrm{pts} \gg N_\mathrm{par}$, as it should be for proper fitting, the |
---|
| 59 | value of $\chi^2_R$ changes very little with the number of fitted parameters. |
---|
[ad2ce4e] | 60 | |
---|
[99ded31] | 61 | For a good fit, $\chi^2_R$ tends to 1. |
---|
| 62 | |
---|
| 63 | $\chi^2_R$ is sometimes referred to as the 'goodness-of-fit' parameter. |
---|
[ad2ce4e] | 64 | |
---|
| 65 | .. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ |
---|
| 66 | |
---|
| 67 | Residuals |
---|
| 68 | ^^^^^^^^^ |
---|
| 69 | |
---|
| 70 | A residual is the difference between an observed value and an estimate of that |
---|
[99ded31] | 71 | value, such as a 'theory' calculation (whereas the difference between an |
---|
| 72 | observed value and its *true* value is its error). |
---|
[ad2ce4e] | 73 | |
---|
[5ed76f8] | 74 | *SasView* calculates 'normalized residuals', $R_i$, for each data point in the |
---|
[ad2ce4e] | 75 | fit: |
---|
| 76 | |
---|
[5ed76f8] | 77 | .. math:: |
---|
| 78 | |
---|
[99ded31] | 79 | R_i = (Y_i - \mathrm{theory}_i) / \mathrm{error}_i |
---|
| 80 | |
---|
| 81 | Think of each normalized residual as the number of standard deviations |
---|
| 82 | between the measured value and the theory. For a good fit, about 68% of the $R_i$ |
---|
| 83 | will be within one standard deviation, which will show up in the Residuals |
---|
| 84 | plot as $R_i$ values between $-1$ and $+1$. Almost all the values should |
---|
| 85 | be between $-3$ and $+3$. |
---|
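
The check described above can be sketched in a few lines of NumPy. The helper name ``normalized_residuals`` and the data values are hypothetical and chosen only for illustration; this is not *SasView*'s own implementation. Note that, from the definitions, $\chi^2$ is the sum of the squared $R_i$.

```python
import numpy as np

def normalized_residuals(y, theory, error):
    """R_i = (Y_i - theory_i) / error_i for each data point."""
    y, theory, error = (np.asarray(a, dtype=float) for a in (y, theory, error))
    return (y - theory) / error

# Hypothetical values for illustration only (not real scattering data).
y = np.array([10.2, 7.52, 6.24, 4.7, 2.88, 2.02])
theory = np.array([10.0, 8.0, 6.0, 4.0, 3.0, 2.0])
error = np.array([0.5, 0.4, 0.3, 0.2, 0.2, 0.1])

r = normalized_residuals(y, theory, error)

# Fraction of points within one standard deviation of the theory.
frac_within_1 = float(np.mean(np.abs(r) <= 1))

# Indices of points lying more than three standard deviations away.
outliers = np.flatnonzero(np.abs(r) > 3)

print(r, frac_within_1, outliers)
```

A sample this small says little about the 68% expectation; the fraction only becomes meaningful with many data points.
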
[ad2ce4e] | 86 | |
---|
[99ded31] | 87 | Residual values with a magnitude greater than 3 indicate that the model |
---|
| 88 | has not been fitted correctly, the wrong model was chosen (e.g., because there is |
---|
| 89 | more than one phase in your system), or there are problems in |
---|
| 90 | the data reduction. Since the goodness of fit is calculated from the |
---|
| 91 | sum-squared residuals, these extreme values will drive the choice of fit |
---|
| 92 | parameters. Any uncertainties calculated for the fitting parameters will |
---|
| 93 | be meaningless. |
---|
[ad2ce4e] | 94 | |
---|
| 95 | .. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ |
---|
| 96 | |
---|
[99ded31] | 97 | *Document History* |
---|
| 98 | |
---|
| 99 | | 2015-06-08 Steve King |
---|
| 100 | | 2017-09-28 Paul Kienzle |
---|