Generalized additive model
From Wikipedia, the free encyclopedia
The generalized additive model (GAM) is a statistical model developed by Hastie and Tibshirani (1990) that blends properties of multiple regression (a special case of the general linear model) with those of additive models.
The multiple regression model (a GAM predecessor) is written
- Y = β0 + β1x1 + β2x2 + ... + βmxm,
where Y is a response variable (more precisely, the expectation of that response variable), xi is the ith of m predictor variables, and the βi are parameters.
In contrast, the GAM replaces the terms βixi of multiple regression with functions fi(xi), one for each predictor:
- E(Y) = β0 + f1(x1) + f2(x2) + ... + fm(xm).
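The difference between the two formulas can be illustrated with a minimal Python sketch. The function names (linear_predict, additive_predict), the coefficient values, and the example component functions are hypothetical and chosen only for illustration. The sketch shows that multiple regression is the special case in which each component function is a straight line through the origin.

```python
import math

def linear_predict(x, beta0, betas):
    """Multiple regression: beta0 + sum of beta_i * x_i."""
    return beta0 + sum(b * xi for b, xi in zip(betas, x))

def additive_predict(x, beta0, funcs):
    """Additive model: beta0 + sum of f_i(x_i), one function per predictor."""
    return beta0 + sum(f(xi) for f, xi in zip(funcs, x))

x = [2.0, 0.5]

# Linear terms are the special case f_i(x_i) = beta_i * x_i.
betas = [1.5, -2.0]
as_funcs = [lambda v, b=b: b * v for b in betas]
linear_value = linear_predict(x, 1.0, betas)
additive_value = additive_predict(x, 1.0, as_funcs)

# Hypothetical nonlinear component functions (illustrative only).
curved_value = additive_predict(x, 1.0, [math.sin, lambda v: v ** 2])
```

Here linear_value and additive_value agree, while curved_value uses the same additive structure with components that no single coefficient per predictor could represent.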
The functions fi(xi) are arbitrary and often nonparametric, which gives GAMs the potential to fit data better than methods that impose a fixed parametric form. A typical GAM might use a scatterplot smoother, such as a locally weighted mean, for each fi(xi). GAMs thus prioritize predictive ability, perhaps at the expense of the interpretability of results.
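One way such component functions might be estimated is sketched below. It is not taken from the text above: it assumes a Gaussian-kernel locally weighted mean as the smoother and uses backfitting (repeatedly smoothing the partial residuals of each predictor in turn), a procedure commonly associated with additive models. The function names, the bandwidth of 0.3, the iteration count, and the synthetic data are all illustrative assumptions.

```python
import math

def local_mean(x, y, x0, bandwidth):
    """Locally weighted mean of y around x0, using Gaussian kernel weights."""
    weights = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)

def smooth(x, y, bandwidth):
    """Evaluate the scatterplot smoother at every observed x."""
    return [local_mean(x, y, x0, bandwidth) for x0 in x]

def backfit(columns, y, bandwidth=0.3, n_iter=20):
    """Estimate the intercept and each f_i at the data points by backfitting."""
    n = len(y)
    beta0 = sum(y) / n
    fitted = [[0.0] * n for _ in columns]
    for _ in range(n_iter):
        for j, xj in enumerate(columns):
            # Partial residuals: remove the intercept and all other components.
            others = [sum(fitted[k][i] for k in range(len(columns)) if k != j)
                      for i in range(n)]
            partial = [y[i] - beta0 - others[i] for i in range(n)]
            fj = smooth(xj, partial, bandwidth)
            # Center each component so the intercept stays identifiable.
            mean_fj = sum(fj) / n
            fitted[j] = [v - mean_fj for v in fj]
    return beta0, fitted

# Synthetic data (an assumption): y = 1 + sin(x1) + x2^2, with no noise.
n = 40
x1 = [-2 + 4 * i / (n - 1) for i in range(n)]
x2 = [-2 + 4 * ((7 * i) % n) / (n - 1) for i in range(n)]
y = [1 + math.sin(a) + b ** 2 for a, b in zip(x1, x2)]

beta0, fitted = backfit([x1, x2], y)
pred = [beta0 + fitted[0][i] + fitted[1][i] for i in range(n)]
rss_gam = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
rss_mean = sum((yi - beta0) ** 2 for yi in y)
```

Comparing rss_gam with rss_mean gives a rough indication of how much of the variation the additive fit captures relative to an intercept-only model. The fitted components are available only as values at the observed points, which reflects the interpretability cost mentioned above.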
Overfitting can be a problem with GAMs. The number of smoothing parameters can be specified, and this number should be reasonably small, certainly well under the degrees of freedom offered by the data. Cross-validation can be used to detect or reduce overfitting in GAMs (as with other statistical methods). Simpler models such as GLMs may be preferable unless a GAM improves predictive ability substantially for the application in question.
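As a rough illustration of how cross-validation might guide the amount of smoothing, the following Python sketch computes a leave-one-out error for a single Gaussian-kernel locally weighted mean at several candidate bandwidths. The smoother, the candidate bandwidths, the noise level, and the seeded synthetic data are assumptions made for this example, not part of any particular GAM implementation.

```python
import math
import random

def loo_cv_error(x, y, bandwidth):
    """Mean squared leave-one-out prediction error of a kernel local mean."""
    n = len(x)
    total = 0.0
    for i in range(n):
        num = den = 0.0
        for j in range(n):
            if j == i:
                continue  # hold out observation i when predicting y[i]
            w = math.exp(-0.5 * ((x[j] - x[i]) / bandwidth) ** 2)
            num += w * y[j]
            den += w
        total += (y[i] - num / den) ** 2
    return total / n

# Synthetic data (an assumption): a sine curve with seeded Gaussian noise.
rng = random.Random(0)
x = [i / 10 for i in range(60)]
y = [math.sin(xi) + rng.gauss(0, 0.3) for xi in x]

# Small bandwidths tend to chase noise; large ones tend to flatten the curve.
candidates = [0.05, 0.1, 0.2, 0.5, 1.0, 3.0]
scores = {h: loo_cv_error(x, y, h) for h in candidates}
best = min(scores, key=scores.get)
```

The bandwidth stored in best is the candidate with the lowest held-out error. The same idea extends to a full GAM by holding out observations, refitting all components, and scoring the held-out predictions.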