[ad2ce4e] | 1 | .. residuals_help.rst |
---|
| 2 | |
---|
| 3 | |
---|
| 4 | .. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ |
---|
| 5 | |
---|
| 6 | .. _Assessing_Fit_Quality: |
---|
| 7 | |
---|
| 8 | Assessing Fit Quality |
---|
| 9 | --------------------- |
---|
| 10 | |
---|
| 11 | When performing model fits to experimental data it is helpful to be able to |
---|
| 12 | gauge how good an individual fit is, how it compares to a fit of the *same model* |
---|
| 13 | *to another set of data*, or how it compares to a fit of a *different model to the* |
---|
| 14 | *same data*. |
---|
| 15 | |
---|
| 16 | The most direct way is to inspect the graph of the experimental data and to |
---|
| 17 | see how closely (or not!) the 'theory' calculation matches it. But *SasView* |
---|
| 18 | also provides two quantitative measures of the quality of a fit: |
---|
| 19 | |
---|
[5ed76f8] | 20 | * $\chi^2$ (or 'Chi2'; pronounced 'chi-squared') |
---|
[ad2ce4e] | 21 | * *Residuals* |
---|
| 22 | |
---|
| 23 | .. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ |
---|
| 24 | |
---|
| 25 | Chi2 |
---|
| 26 | ^^^^ |
---|
| 27 | |
---|
[99ded31] | 28 | $\chi^2$ is a statistical parameter that quantifies the differences between |
---|
[a7c6f38] | 29 | an observed data set and an expected data set (or 'theory'). It is calculated as |
---|
[ad2ce4e] | 30 | |
---|
[5ed76f8] | 31 | .. math:: |
---|
[ad2ce4e] | 32 | |
---|
[a7c6f38] | 33 | \chi^2 |
---|
| 34 | = \sum_i [(Y_i - \mathrm{theory}_i)^2 / \mathrm{error}_i^2] |
---|
[5ed76f8] | 35 | |
---|
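As a minimal sketch of the sum above, the following pure-Python snippet evaluates $\chi^2$ for a handful of points. The function name `chi_squared` and the numbers are illustrative assumptions, not part of *SasView*; `error` is taken to be the one-standard-deviation uncertainty on each measured value.

```python
def chi_squared(y, theory, error):
    """Sum of squared differences between data and theory, weighted by the uncertainty."""
    return sum(((yi - ti) / ei) ** 2 for yi, ti, ei in zip(y, theory, error))


# Made-up illustrative values, not real scattering data.
y = [10.0, 8.0, 6.0, 4.0]        # measured values Y_i
theory = [9.0, 8.5, 6.0, 5.0]    # model ('theory') values
error = [1.0, 0.5, 1.0, 0.5]     # uncertainty on each Y_i

chi2 = chi_squared(y, theory, error)
print(chi2)  # contributions are 1 + 1 + 0 + 4 = 6.0
```

Points with small uncertainties carry more weight: the last point is only 1.0 away from the theory, yet it contributes 4 of the total of 6.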
[84ac3f1] | 36 | Fitting typically minimizes the value of $\chi^2$. For assessing the quality of |
---|
| 37 | the model and its "fit", however, *SasView* displays the traditional reduced |
---|
| 38 | $\chi^2_R$, which normalizes this parameter by dividing it by the number of |
---|
| 39 | degrees of freedom (or DOF). The DOF is the number of data points being |
---|
| 40 | considered, $N_\mathrm{pts}$, reduced by the number of free (i.e. fitted) |
---|
| 41 | parameters, $N_\mathrm{par}$. Note that model parameters that are kept fixed do |
---|
| 42 | *not* reduce the DOF (they are not "free"). This reduced value is then |
---|
| 43 | given as |
---|
[ad2ce4e] | 44 | |
---|
[99ded31] | 45 | .. math:: |
---|
| 46 | |
---|
| 47 | \chi^2_R |
---|
| 48 | = \sum_i [(Y_i - \mathrm{theory}_i)^2 / \mathrm{error}_i^2] |
---|
| 49 | / [N_\mathrm{pts} - N_\mathrm{par}] |
---|
| 50 | |
---|
[84ac3f1] | 51 | Note that this means the displayed value will vary depending on the number of |
---|
| 52 | parameters being fitted. In particular, when doing a calculation without a |
---|
| 53 | fit (e.g. manually changing a parameter), $N_\mathrm{par} = 0$, so the DOF equals |
---|
| 54 | $N_\mathrm{pts}$ and $\chi^2_R$ is the smallest possible for that combination of model, |
---|
| 55 | data set, and set of parameter values. |
---|
[a7c6f38] | 56 | |
---|
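The dependence on the number of fitted parameters can be sketched as follows. The helper `reduced_chi_squared` and the data are hypothetical and only mirror the formula above; this is not how *SasView* itself is implemented.

```python
def reduced_chi_squared(y, theory, error, n_par):
    """Chi-squared divided by the degrees of freedom, N_pts - N_par."""
    dof = len(y) - n_par
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    chi2 = sum(((yi - ti) / ei) ** 2 for yi, ti, ei in zip(y, theory, error))
    return chi2 / dof


# Made-up illustrative values, not real scattering data.
y = [10.0, 8.0, 6.0, 4.0, 3.0, 2.0]
theory = [9.0, 8.5, 6.0, 5.0, 3.0, 2.5]
error = [1.0, 0.5, 1.0, 0.5, 1.0, 0.5]

fitted = reduced_chi_squared(y, theory, error, n_par=2)  # chi2 = 7, DOF = 4
manual = reduced_chi_squared(y, theory, error, n_par=0)  # chi2 = 7, DOF = 6
print(fitted, manual)
```

The same data, model, and parameter values give 7/4 = 1.75 when two parameters are counted as fitted, but 7/6 (about 1.17) when none are, because only the divisor changes.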
| 57 | When $N_\mathrm{pts} \gg N_\mathrm{par}$, as it should be for proper fitting, the |
---|
| 58 | value of $\chi^2_R$ changes very little with the number of fitted parameters. |
---|
[ad2ce4e] | 59 | |
---|
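A short worked example may make this concrete. The numbers are illustrative assumptions, and the notation $\chi^2_R(N_\mathrm{par})$ is introduced here only to label the value reported with $N_\mathrm{par}$ fitted parameters. For the same model, data, and parameter values, $\chi^2$ itself is unchanged, so

.. math::

    \frac{\chi^2_R(N_\mathrm{par})}{\chi^2_R(0)}
    = \frac{N_\mathrm{pts}}{N_\mathrm{pts} - N_\mathrm{par}}

With $N_\mathrm{pts} = 200$ and $N_\mathrm{par} = 4$ this ratio is $200/196 \approx 1.02$, a change of about 2%, whereas with only $N_\mathrm{pts} = 10$ it is $10/6 \approx 1.67$.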
[99ded31] | 60 | For a good fit, $\chi^2_R$ tends to 1. |
---|
| 61 | |
---|
| 62 | $\chi^2_R$ is sometimes referred to as the 'goodness-of-fit' parameter. |
---|
[ad2ce4e] | 63 | |
---|
| 64 | .. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ |
---|
| 65 | |
---|
| 66 | Residuals |
---|
| 67 | ^^^^^^^^^ |
---|
| 68 | |
---|
| 69 | A residual is the difference between an observed value and an estimate of that |
---|
[99ded31] | 70 | value, such as a 'theory' calculation (whereas the difference between an |
---|
| 71 | observed value and its *true* value is its error). |
---|
[ad2ce4e] | 72 | |
---|
[5ed76f8] | 73 | *SasView* calculates 'normalized residuals', $R_i$, for each data point in the |
---|
[ad2ce4e] | 74 | fit: |
---|
| 75 | |
---|
[5ed76f8] | 76 | .. math:: |
---|
| 77 | |
---|
[99ded31] | 78 | R_i = (Y_i - \mathrm{theory}_i) / \mathrm{error}_i |
---|
| 79 | |
---|
| 80 | Think of each normalized residual as the number of standard deviations |
---|
| 81 | between the measured value and the theory. For a good fit, about 68% of the $R_i$ |
---|
| 82 | will be within one standard deviation, which will show up in the Residuals |
---|
| 83 | plot as $R_i$ values between $-1$ and $+1$. Almost all the values should |
---|
| 84 | be between $-3$ and $+3$. |
---|
[ad2ce4e] | 85 | |
---|
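The check described above can be sketched in a few lines of Python. The function names and data are hypothetical, chosen only for illustration; with so few points the fractions are not expected to match the 68% figure, which applies to a large number of normally distributed residuals.

```python
def normalized_residuals(y, theory, error):
    """R_i = (Y_i - theory_i) / error_i for each data point."""
    return [(yi - ti) / ei for yi, ti, ei in zip(y, theory, error)]


def fraction_within(residuals, limit):
    """Fraction of residuals whose magnitude does not exceed `limit`."""
    return sum(1 for r in residuals if abs(r) <= limit) / len(residuals)


# Made-up illustrative values, not real scattering data.
y = [10.0, 8.0, 6.0, 4.0, 3.0]
theory = [9.0, 9.5, 6.0, 6.0, 3.0]
error = [1.0, 1.0, 1.0, 0.5, 2.0]

residuals = normalized_residuals(y, theory, error)  # [1.0, -1.5, 0.0, -4.0, 0.0]
within_1 = fraction_within(residuals, 1.0)          # 3 of 5 points
within_3 = fraction_within(residuals, 3.0)          # 4 of 5 points
outliers = [i for i, r in enumerate(residuals) if abs(r) > 3.0]
print(within_1, within_3, outliers)
```

Here the fourth point (index 3) lies four standard deviations from the theory and would stand out in the Residuals plot.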
[99ded31] | 86 | Residual values outside the range $-3$ to $+3$ indicate that the model |
---|
| 87 | is not fit correctly, the wrong model was chosen (e.g., because there is |
---|
| 88 | more than one phase in your system), or there are problems in |
---|
| 89 | the data reduction. Since the goodness of fit is calculated from the |
---|
| 90 | sum of squared residuals, these extreme values will drive the choice of fit |
---|
| 91 | parameters, and any uncertainties calculated for the fitted parameters will |
---|
| 92 | be meaningless. |
---|
[ad2ce4e] | 93 | |
---|
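To illustrate why extreme residuals dominate, the sketch below sums the squares of a few made-up normalized residuals (hypothetical values, one of them a deliberate outlier) and reports how much of $\chi^2$ comes from the largest one.

```python
# Made-up normalized residuals; the last one is a deliberate outlier.
residuals = [0.5, -1.0, 0.8, -0.3, 6.0]

squares = [r ** 2 for r in residuals]
chi2 = sum(squares)                     # chi-squared is the sum of squared residuals
outlier_share = max(squares) / chi2     # fraction contributed by the largest residual

print(f"chi2 = {chi2:.2f}, largest single contribution = {outlier_share:.0%}")
```

A single point at six standard deviations accounts for roughly 95% of the total here, so a minimizer would adjust the parameters mainly to reduce that one term.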
| 94 | .. ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ |
---|
| 95 | |
---|
[99ded31] | 96 | *Document History* |
---|
| 97 | |
---|
| 98 | | 2015-06-08 Steve King |
---|
[84ac3f1] | 99 | | 2017-09-28 Paul Kienzle |
---|
| 100 | | 2018-03-04 Paul Butler |
---|