We divide results into appropriate sections according to the
predictors used. Each recovered quantity is shown with its database
mean value, spread, root-mean-square error (RMSE) and the $R^2$ or
percentage-error statistic, adjusted for the degrees of freedom in the
model, which we now define. For a given variable with spread
$\sigma$, we define the sum-of-squares error (SSE) for a model fitted
over $n$ observations with $p$ fitted variables to be the sum of
squared differences between actual and recovered values. Then the
RMSE $= \sqrt{\mathrm{SSE}/(n-p)}$ and
$R^2 = 1 - \mathrm{RMSE}^2/\sigma^2$. The percentage error is
defined as $100 \times (\mathrm{RMSE}/\sigma)$ or, equivalently,
$100\sqrt{1-R^2}$.
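As an illustration only, the definitions above can be sketched in a few lines of Python. The function name `fit_statistics`, its argument names and the toy data are hypothetical. The sketch assumes that the degrees of freedom are the number of observations minus the number of fitted variables, and that the spread is the sample standard deviation of the actual values.

```python
import math
import statistics


def fit_statistics(actual, recovered, n_fitted, spread):
    """Return (SSE, RMSE, R^2, percentage error) for one recovered quantity.

    Assumption: the degrees of freedom are len(actual) - n_fitted.
    """
    n = len(actual)
    if n != len(recovered):
        raise ValueError("actual and recovered must have the same length")
    dof = n - n_fitted
    if dof <= 0:
        raise ValueError("need more observations than fitted variables")
    # Sum of squared differences between actual and recovered values.
    sse = sum((a - r) ** 2 for a, r in zip(actual, recovered))
    # RMSE adjusted for the degrees of freedom.
    rmse = math.sqrt(sse / dof)
    # R^2 and percentage error, both relative to the spread.
    r2 = 1.0 - (rmse / spread) ** 2
    pct_error = 100.0 * rmse / spread
    return sse, rmse, r2, pct_error


# Toy data (hypothetical), with two fitted variables.
actual = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
recovered = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
spread = statistics.stdev(actual)  # assumed definition of the spread
sse, rmse, r2, pct_error = fit_statistics(actual, recovered, 2, spread)
print(f"SSE={sse:.3f} RMSE={rmse:.3f} R2={r2:.4f} %err={pct_error:.2f}")
```

By construction, the two forms of the percentage error agree: `100 * rmse / spread` equals `100 * sqrt(1 - r2)` up to rounding.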
We remark that all of the above statistical quantities are adjusted
for the number of parameters in the model. Unlike their uncorrected
analogues, which generally improve monotonically with each new
parameter added to a model since they depend only on the absolute
explained variance, these are not artificially enhanced by an
overfitted model (i.e. a model containing many predictors which have
no explanatory worth). Indeed, adding parameters of no predictive
value to a model can often cause a deterioration in the corrected
quantities. Note that the physically relevant quantity is actually
the RMSE; however, statistics such as $R^2$ and the percentage error
allow results for parameters of different dimensions and widely
differing spreads to be compared on an equal footing. Also, the
percentage error is sometimes a more convenient measure than $R^2$,
as the latter can often be very close to unity for a good fit,
whereas the former offers greater resolution.
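The two points made in this paragraph can be checked numerically with a short sketch. All numbers below are invented for illustration, the helper names are hypothetical, and the corrected RMSE is assumed to divide the SSE by the number of observations minus the number of fitted variables.

```python
import math


def uncorrected_rmse(sse, n):
    # Ignores the number of fitted variables.
    return math.sqrt(sse / n)


def corrected_rmse(sse, n, p):
    # Assumed correction: divide by the degrees of freedom n - p.
    return math.sqrt(sse / (n - p))


def percentage_error(r2):
    # Percentage error expressed through R^2.
    return 100.0 * math.sqrt(1.0 - r2)


# Hypothetical fit over 50 observations: five extra predictors of little
# explanatory worth reduce the SSE only slightly (10.0 -> 9.9).
n = 50
sse_base, p_base = 10.0, 3
sse_more, p_more = 9.9, 8

print("uncorrected:", uncorrected_rmse(sse_base, n), uncorrected_rmse(sse_more, n))
print("corrected:  ", corrected_rmse(sse_base, n, p_base),
      corrected_rmse(sse_more, n, p_more))

# R^2 values that all look close to unity map to clearly distinct
# percentage errors (roughly 10, 5 and 1 per cent).
for r2 in (0.99, 0.9975, 0.9999):
    print(r2, round(percentage_error(r2), 2))
```

In this toy case the uncorrected RMSE decreases when the extra predictors are added, while the corrected RMSE increases, which is the deterioration described above.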