10 Multivariate Adaptive Regression Splines (MARS)
Packages for this section
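# Install each package only if it is missing, then load it
if (!require(earth)) { install.packages("earth"); library(earth) }
if (!require(caret)) { install.packages("caret"); library(caret) }
if (!require(AmesHousing)) { install.packages("AmesHousing"); library(AmesHousing) }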
In previous classes we reviewed extensions of linear regression (nls and polynomial regression, among others).
Other variations exist, such as Ridge regression, LASSO and Elastic Net (some of them are covered in the Machine Learning module).
10.1 Introduction
In statistics, MARS is a form of regression analysis introduced by Jerome H. Friedman in 1991.
MARS is a nonparametric regression technique and can be seen as an extension of linear models that automatically models nonlinearities and interactions between variables.
The term MARS is a trademark licensed to Salford Systems.
To avoid infringing that trademark, open-source implementations of MARS are usually called Earth (the earth package in R, for example).
10.1.1 Why use MARS models?
MARS is well suited to users who want results that read like a traditional regression while still capturing the necessary nonlinearities and interactions.
MARS can reveal important patterns in the data that other techniques often miss.
MARS builds its model by joining pieces of straight lines, each with its own slope.
This lets it approximate a wide range of nonlinear patterns in the data.
It can be used with both quantitative and qualitative response variables.
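The idea of joining straight-line pieces, each with its own slope, can be illustrated with hinge functions of the form h(x - c) = max(0, x - c), the same h(...) terms that appear in earth output. The sketch below is only an illustration: the data are simulated and the knot at 5 is fixed by hand (both are assumptions; MARS searches for the knots itself), and the two pieces are fitted with lm.

```r
# Hinge function: zero on one side of the knot, linear on the other
h <- function(x) pmax(0, x)

set.seed(1)
x <- seq(0, 10, length.out = 200)
# Simulated response: slope 2 below x = 5, slope -1 above it, plus noise
y <- 2 * x - 3 * h(x - 5) + rnorm(length(x), sd = 0.3)

# A mirrored pair of hinges at the hand-picked knot
fit <- lm(y ~ h(x - 5) + h(5 - x))
slopes <- coef(fit)
print(round(slopes, 2))

# Slope to the right of the knot and slope to the left of it
slope_right <- unname(slopes["h(x - 5)"])
slope_left  <- -unname(slopes["h(5 - x)"])
```

The coefficient of h(x - 5) is the slope to the right of the knot, and the negative of the coefficient of h(5 - x) is the slope to the left, so each piece keeps its own slope.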
MARS performs the following, all automatically and at high speed:
- variable selection.
- variable transformation.
- interaction detection.
- testing.
Areas where it has proven to be a successful technique:
- Forecasting electricity demand for power generation companies.
- Relating customer satisfaction scores to the technical specifications of a product.
- Modelling in geographic information systems.
MARS is a very versatile regression technique and a valuable tool in our Data Analytics toolbox.
10.2 Example 1
We load the data:
## Loading required package: Formula
## Loading required package: plotmo
## Loading required package: plotrix
We fit the model to the data:
mars <- earth(y~age+job+marital+education+default+balance+housing+
loan+contact+day+month+duration+campaign+pdays+previous+poutcome,
data=bankfull,pmethod="backward",nprune=20, nfold=10)
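Note the arguments used in the function:
- pmethod: the method used to prune the model terms. The options include backward, forward, cv (which requires nfold) and exhaustive.
- nprune: the maximum number of basis functions (terms, including the intercept) kept in the pruned model.
Because pmethod is backward here, nfold=10 does not drive the pruning; the 10-fold cross-validation is only used to compute the cross-validated statistics reported in the summary (CVRSq and related values).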
In summary, to set up the model we need three elements:
- Define the model (as in any regression).
- Choose the pruning method (pmethod).
- Set the maximum number of basis functions (nprune) and the maximum degree of interaction (degree).
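As a minimal sketch of these three elements together, the call below fits a small model on the built-in trees dataset rather than on the bank data. The dataset, the object name mars_trees and the values nprune = 5 and degree = 2 are assumptions chosen for illustration, not the settings used in this example.

```r
# Sketch only: runs when the earth package is installed
if (requireNamespace("earth", quietly = TRUE)) {
  mars_trees <- earth::earth(
    Volume ~ Girth + Height,  # 1. model definition, as in any regression
    data    = trees,
    pmethod = "backward",     # 2. pruning method
    nprune  = 5,              # 3a. at most 5 terms, including the intercept
    degree  = 2               # 3b. allow two-way interactions
  )
  print(summary(mars_trees))
}
```

With degree = 1 (the default) the model is additive, as in the bank example; a larger degree lets earth consider products of hinge functions.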
Let us look at the summary:
## Call: earth(formula=y~age+job+marital+education+default+balance+housin...), data=bankfull,
## pmethod="backward", nprune=20, nfold=10)
##
## coefficients
## (Intercept) 0.7775
## housingyes -0.0408
## loanyes -0.0294
## contactunknown -0.0713
## monthdec 0.1876
## monthjun 0.0519
## monthmar 0.3301
## monthoct 0.1916
## monthsep 0.1789
## poutcomesuccess 0.3809
## h(age-27) 0.0072
## h(54-age) 0.0087
## h(duration-375) 0.0003
## h(1080-duration) -0.0004
## h(duration-1080) -0.0004
## h(2-campaign) 0.0268
## h(pdays-53) -0.0020
## h(349-pdays) -0.0016
## h(pdays-349) 0.0061
## h(pdays-425) -0.0044
##
## Selected 20 of 22 terms, and 13 of 42 predictors (nprune=20)
## Termination condition: RSq changed by less than 0.001 at 22 terms
## Importance: duration, poutcomesuccess, monthmar, housingyes, monthoct, contactunknown, ...
## Number of terms at each degree of interaction: 1 19 (additive model)
## GCV 0.0707 RSS 3192 GRSq 0.315 RSq 0.316 CVRSq 0.316
##
## Note: the cross-validation sd's below are standard deviations across folds
##
## Cross validation: nterms 22.40 sd 1.65 nvars 14.10 sd 1.52
##
## CVRSq sd ClassRate sd MaxErr sd
## 0.316 0.013 0.901 0.002 -1.26 1.06
The resulting plot:
The GCV (generalized cross-validation) criterion is
\[ \text{GCV} = \frac{\text{RSS}}{N \times \left(1 - \text{EffectiveParams}/N\right)^2} \]
where RSS is the residual sum of squares measured on the training data and N is the number of observations. The effective number of parameters is
\[ \text{EffectiveParams} = \text{NumTerms} + \text{Penalty} \times (\text{NumTerms} - 1)/2 \]
where NumTerms is the number of MARS terms. The penalty is usually 2 or 3, but it can be chosen by the user; the fitted object in this example reports penalty = 2.
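As a quick check of these two formulas, the sketch below recomputes the GCV from numbers printed in this example's output: RSS 3192 (already rounded), 20 selected terms, 45211 observations and penalty 2. The variable names are chosen here for illustration.

```r
# Values read from the model output in this example
rss     <- 3192    # residual sum of squares (rounded in the output)
n       <- 45211   # number of observations
nterms  <- 20      # selected terms, including the intercept
penalty <- 2       # penalty reported in the fitted object

# Effective number of parameters, then GCV
eff_params <- nterms + penalty * (nterms - 1) / 2
gcv <- rss / (n * (1 - eff_params / n)^2)

print(eff_params)
print(round(gcv, 4))
```

The effective number of parameters is 39 and the GCV rounds to 0.0707, which agrees with the value reported in the summary.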
10.2.1 Output
The result is an object of class earth, which holds a lot of information (see help(earth.object)).
## List of 39
## $ rss : num 3192
## $ rsq : num 0.316
## $ gcv : num 0.0707
## $ grsq : num 0.315
## $ bx : num [1:45211, 1:20] 1 1 1 1 1 1 1 1 1 1 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr [1:20] "(Intercept)" "h(duration-1080)" "h(1080-duration)" "poutcomesuccess" ...
## $ dirs : num [1:22, 1:42] 0 0 0 0 0 0 0 0 0 1 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:22] "(Intercept)" "h(duration-1080)" "h(1080-duration)" "poutcomesuccess" ...
## .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## $ cuts : num [1:22, 1:42] 0 0 0 0 0 0 0 0 0 54 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:22] "(Intercept)" "h(duration-1080)" "h(1080-duration)" "poutcomesuccess" ...
## .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## $ selected.terms : num [1:20] 1 2 3 4 5 6 7 8 9 11 ...
## $ prune.terms : num [1:22, 1:22] 1 1 1 1 1 1 1 1 1 1 ...
## $ fitted.values : num [1:45211, 1] 0.0261 -0.0314 -0.074 -0.0597 0.0452 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr "yes"
## $ residuals : num [1:45211, 1] -0.0261 0.0314 0.074 0.0597 -0.0452 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr "yes"
## $ coefficients : num [1:20, 1] 0.777457 -0.000382 -0.000402 0.380944 0.330111 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:20] "(Intercept)" "h(duration-1080)" "h(1080-duration)" "poutcomesuccess" ...
## .. ..$ : chr "yes"
## $ rss.per.response : num 3192
## $ rsq.per.response : num 0.316
## $ gcv.per.response : num 0.0707
## $ grsq.per.response : num 0.315
## $ rss.per.subset : num [1:22] 4670 3880 3497 3433 3378 ...
## $ gcv.per.subset : num [1:22] 0.1033 0.0858 0.0774 0.076 0.0747 ...
## $ leverages : num [1:45211] 0.000243 0.000165 0.000299 0.000194 0.00025 ...
## $ pmethod : chr "backward"
## $ nprune : num 20
## $ penalty : num 2
## $ nk : num 85
## $ thresh : num 0.001
## $ termcond : int 4
## $ weights : NULL
## $ call : language earth(formula = y ~ age + job + marital + education + default + balance + housing + loan + contact + day + m| __truncated__ ...
## $ namesx : chr [1:16] "age" "job" "marital" "education" ...
## $ modvars : num [1:16, 1:42] 1 0 0 0 0 0 0 0 0 0 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:16] "age" "job" "marital" "education" ...
## .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## $ terms :Classes 'terms', 'formula' language y ~ age + job + marital + education + default + balance + housing + loan + contact + day + month + duration | __truncated__
## .. ..- attr(*, "variables")= language list(y, age, job, marital, education, default, balance, housing, loan, contact, day, month, duration, campai| __truncated__
## .. ..- attr(*, "factors")= int [1:17, 1:16] 0 1 0 0 0 0 0 0 0 0 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:17] "y" "age" "job" "marital" ...
## .. .. .. ..$ : chr [1:16] "age" "job" "marital" "education" ...
## .. ..- attr(*, "term.labels")= chr [1:16] "age" "job" "marital" "education" ...
## .. ..- attr(*, "order")= int [1:16] 1 1 1 1 1 1 1 1 1 1 ...
## .. ..- attr(*, "intercept")= int 1
## .. ..- attr(*, "response")= int 1
## .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
## .. ..- attr(*, "predvars")= language list(y, age, job, marital, education, default, balance, housing, loan, contact, day, month, duration, campai| __truncated__
## .. ..- attr(*, "dataClasses")= Named chr [1:17] "factor" "numeric" "factor" "factor" ...
## .. .. ..- attr(*, "names")= chr [1:17] "y" "age" "job" "marital" ...
## $ xlevels :List of 9
## ..$ job : chr [1:12] "admin." "blue-collar" "entrepreneur" "housemaid" ...
## ..$ marital : chr [1:3] "divorced" "married" "single"
## ..$ education: chr [1:4] "primary" "secondary" "tertiary" "unknown"
## ..$ default : chr [1:2] "no" "yes"
## ..$ housing : chr [1:2] "no" "yes"
## ..$ loan : chr [1:2] "no" "yes"
## ..$ contact : chr [1:3] "cellular" "telephone" "unknown"
## ..$ month : chr [1:12] "apr" "aug" "dec" "feb" ...
## ..$ poutcome : chr [1:4] "failure" "other" "success" "unknown"
## $ levels : chr [1:2] "no" "yes"
## $ cv.list :List of 10
## ..$ fold1 :List of 29
## .. ..$ rss : num 2854
## .. ..$ rsq : num 0.321
## .. ..$ gcv : num 0.0703
## .. ..$ grsq : num 0.319
## .. ..$ dirs : num [1:25, 1:42] 0 0 0 0 0 0 0 0 0 1 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:25] "(Intercept)" "h(duration-1125)" "h(1125-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ cuts : num [1:25, 1:42] 0 0 0 0 0 0 0 0 0 53 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:25] "(Intercept)" "h(duration-1125)" "h(1125-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ selected.terms : num [1:24] 1 2 3 4 5 6 7 8 9 10 ...
## .. ..$ fitted.values : num [1:40681, 1] 0.0448 -0.0398 -0.0564 -0.0508 0.0309 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : NULL
## .. .. .. ..$ : chr "yes"
## .. ..$ coefficients : num [1:24, 1] 0.575313 -0.000394 -0.000398 0.376758 0.330166 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:24] "(Intercept)" "h(duration-1125)" "h(1125-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr "yes"
## .. ..$ rss.per.response : num 2854
## .. ..$ rsq.per.response : num 0.321
## .. ..$ gcv.per.response : num 0.0703
## .. ..$ grsq.per.response: num 0.319
## .. ..$ rss.per.subset : num [1:25] 4203 3485 3141 3080 3031 ...
## .. ..$ gcv.per.subset : num [1:25] 0.1033 0.0857 0.0772 0.0757 0.0745 ...
## .. ..$ leverages : num [1:40681] 0.000325 0.000191 0.000224 0.000238 0.000293 ...
## .. ..$ pmethod : chr "backward"
## .. ..$ nprune : NULL
## .. ..$ penalty : num 2
## .. ..$ nk : num 85
## .. ..$ thresh : num 0.001
## .. ..$ termcond : int 4
## .. ..$ weights : NULL
## .. ..$ call : language earth(x = infold.x, y = infold.y, weights = infold.weights, wp = wp, subset = subset, pmethod = if (pmethod == | __truncated__ ...
## .. ..$ namesx : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ modvars : num [1:42, 1:42] 1 0 0 0 0 0 0 0 0 0 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ levels : num [1:2] 0 1
## .. ..$ icross : int 1
## .. ..$ ifold : int 1
## .. ..- attr(*, "class")= chr "earth"
## ..$ fold2 :List of 29
## .. ..$ rss : num 2863
## .. ..$ rsq : num 0.319
## .. ..$ gcv : num 0.0705
## .. ..$ grsq : num 0.317
## .. ..$ dirs : num [1:24, 1:42] 0 0 0 0 0 0 0 0 0 1 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:24] "(Intercept)" "h(duration-1080)" "h(1080-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ cuts : num [1:24, 1:42] 0 0 0 0 0 0 0 0 0 53 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:24] "(Intercept)" "h(duration-1080)" "h(1080-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ selected.terms : num [1:23] 1 2 3 4 5 6 7 8 9 10 ...
## .. ..$ fitted.values : num [1:40709, 1] 0.0433 -0.0429 -0.0567 -0.0536 0.0122 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : NULL
## .. .. .. ..$ : chr "yes"
## .. ..$ coefficients : num [1:23, 1] 1.059987 -0.000351 -0.000399 0.371647 0.332723 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:23] "(Intercept)" "h(duration-1080)" "h(1080-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr "yes"
## .. ..$ rss.per.response : num 2863
## .. ..$ rsq.per.response : num 0.319
## .. ..$ gcv.per.response : num 0.0705
## .. ..$ grsq.per.response: num 0.317
## .. ..$ rss.per.subset : num [1:24] 4203 3495 3151 3091 3041 ...
## .. ..$ gcv.per.subset : num [1:24] 0.1033 0.0859 0.0774 0.076 0.0747 ...
## .. ..$ leverages : num [1:40709] 0.000326 0.00019 0.000219 0.000236 0.00021 ...
## .. ..$ pmethod : chr "backward"
## .. ..$ nprune : NULL
## .. ..$ penalty : num 2
## .. ..$ nk : num 85
## .. ..$ thresh : num 0.001
## .. ..$ termcond : int 4
## .. ..$ weights : NULL
## .. ..$ call : language earth(x = infold.x, y = infold.y, weights = infold.weights, wp = wp, subset = subset, pmethod = if (pmethod == | __truncated__ ...
## .. ..$ namesx : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ modvars : num [1:42, 1:42] 1 0 0 0 0 0 0 0 0 0 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ levels : num [1:2] 0 1
## .. ..$ icross : int 1
## .. ..$ ifold : int 2
## .. ..- attr(*, "class")= chr "earth"
## ..$ fold3 :List of 29
## .. ..$ rss : num 2867
## .. ..$ rsq : num 0.318
## .. ..$ gcv : num 0.0706
## .. ..$ grsq : num 0.317
## .. ..$ dirs : num [1:22, 1:42] 0 0 0 0 0 0 0 0 0 1 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:22] "(Intercept)" "h(duration-1077)" "h(1077-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ cuts : num [1:22, 1:42] 0 0 0 0 0 0 0 0 0 54 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:22] "(Intercept)" "h(duration-1077)" "h(1077-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ selected.terms : num [1:21] 1 2 3 4 5 6 7 8 9 10 ...
## .. ..$ fitted.values : num [1:40693, 1] 0.0274 -0.0336 -0.0557 -0.0595 0.034 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : NULL
## .. .. .. ..$ : chr "yes"
## .. ..$ coefficients : num [1:21, 1] 0.680276 -0.000397 -0.000402 0.384156 0.325384 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:21] "(Intercept)" "h(duration-1077)" "h(1077-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr "yes"
## .. ..$ rss.per.response : num 2867
## .. ..$ rsq.per.response : num 0.318
## .. ..$ gcv.per.response : num 0.0706
## .. ..$ grsq.per.response: num 0.317
## .. ..$ rss.per.subset : num [1:22] 4203 3488 3140 3083 3034 ...
## .. ..$ gcv.per.subset : num [1:22] 0.1033 0.0857 0.0772 0.0758 0.0746 ...
## .. ..$ leverages : num [1:40693] 0.000269 0.000178 0.000224 0.000212 0.000296 ...
## .. ..$ pmethod : chr "backward"
## .. ..$ nprune : NULL
## .. ..$ penalty : num 2
## .. ..$ nk : num 85
## .. ..$ thresh : num 0.001
## .. ..$ termcond : int 4
## .. ..$ weights : NULL
## .. ..$ call : language earth(x = infold.x, y = infold.y, weights = infold.weights, wp = wp, subset = subset, pmethod = if (pmethod == | __truncated__ ...
## .. ..$ namesx : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ modvars : num [1:42, 1:42] 1 0 0 0 0 0 0 0 0 0 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ levels : num [1:2] 0 1
## .. ..$ icross : int 1
## .. ..$ ifold : int 3
## .. ..- attr(*, "class")= chr "earth"
## ..$ fold4 :List of 29
## .. ..$ rss : num 2856
## .. ..$ rsq : num 0.321
## .. ..$ gcv : num 0.0704
## .. ..$ grsq : num 0.319
## .. ..$ dirs : num [1:24, 1:42] 0 0 0 0 0 0 0 0 0 1 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:24] "(Intercept)" "h(duration-1130)" "h(1130-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ cuts : num [1:24, 1:42] 0 0 0 0 0 0 0 0 0 53 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:24] "(Intercept)" "h(duration-1130)" "h(1130-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ selected.terms : num [1:23] 1 2 3 4 5 6 7 8 9 10 ...
## .. ..$ fitted.values : num [1:40674, 1] 0.0431 -0.0431 -0.0571 -0.0525 0.031 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : NULL
## .. .. .. ..$ : chr "yes"
## .. ..$ coefficients : num [1:23, 1] 1.031902 -0.000392 -0.000398 0.375766 0.339415 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:23] "(Intercept)" "h(duration-1130)" "h(1130-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr "yes"
## .. ..$ rss.per.response : num 2856
## .. ..$ rsq.per.response : num 0.321
## .. ..$ gcv.per.response : num 0.0704
## .. ..$ grsq.per.response: num 0.319
## .. ..$ rss.per.subset : num [1:24] 4203 3485 3141 3081 3031 ...
## .. ..$ gcv.per.subset : num [1:24] 0.1033 0.0857 0.0773 0.0758 0.0745 ...
## .. ..$ leverages : num [1:40674] 0.000332 0.000191 0.00022 0.000239 0.00029 ...
## .. ..$ pmethod : chr "backward"
## .. ..$ nprune : NULL
## .. ..$ penalty : num 2
## .. ..$ nk : num 85
## .. ..$ thresh : num 0.001
## .. ..$ termcond : int 4
## .. ..$ weights : NULL
## .. ..$ call : language earth(x = infold.x, y = infold.y, weights = infold.weights, wp = wp, subset = subset, pmethod = if (pmethod == | __truncated__ ...
## .. ..$ namesx : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ modvars : num [1:42, 1:42] 1 0 0 0 0 0 0 0 0 0 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ levels : num [1:2] 0 1
## .. ..$ icross : int 1
## .. ..$ ifold : int 4
## .. ..- attr(*, "class")= chr "earth"
## ..$ fold5 :List of 29
## .. ..$ rss : num 2879
## .. ..$ rsq : num 0.315
## .. ..$ gcv : num 0.0708
## .. ..$ grsq : num 0.314
## .. ..$ dirs : num [1:23, 1:42] 0 0 0 0 0 0 0 0 0 1 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:23] "(Intercept)" "h(duration-1081)" "h(1081-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ cuts : num [1:23, 1:42] 0 0 0 0 0 0 0 0 0 55 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:23] "(Intercept)" "h(duration-1081)" "h(1081-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ selected.terms : num [1:22] 1 2 3 4 5 6 7 8 9 10 ...
## .. ..$ fitted.values : num [1:40718, 1] 0.0337 -0.0444 -0.0625 0.0274 -0.0365 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : NULL
## .. .. .. ..$ : chr "yes"
## .. ..$ coefficients : num [1:22, 1] 0.404932 -0.000372 -0.000403 0.379579 -0.040715 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:22] "(Intercept)" "h(duration-1081)" "h(1081-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr "yes"
## .. ..$ rss.per.response : num 2879
## .. ..$ rsq.per.response : num 0.315
## .. ..$ gcv.per.response : num 0.0708
## .. ..$ grsq.per.response: num 0.314
## .. ..$ rss.per.subset : num [1:23] 4204 3498 3156 3098 3052 ...
## .. ..$ gcv.per.subset : num [1:23] 0.1032 0.0859 0.0775 0.0761 0.075 ...
## .. ..$ leverages : num [1:40718] 0.000328 0.000186 0.00022 0.00029 0.00018 ...
## .. ..$ pmethod : chr "backward"
## .. ..$ nprune : NULL
## .. ..$ penalty : num 2
## .. ..$ nk : num 85
## .. ..$ thresh : num 0.001
## .. ..$ termcond : int 4
## .. ..$ weights : NULL
## .. ..$ call : language earth(x = infold.x, y = infold.y, weights = infold.weights, wp = wp, subset = subset, pmethod = if (pmethod == | __truncated__ ...
## .. ..$ namesx : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ modvars : num [1:42, 1:42] 1 0 0 0 0 0 0 0 0 0 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ levels : num [1:2] 0 1
## .. ..$ icross : int 1
## .. ..$ ifold : int 5
## .. ..- attr(*, "class")= chr "earth"
## ..$ fold6 :List of 29
## .. ..$ rss : num 2873
## .. ..$ rsq : num 0.316
## .. ..$ gcv : num 0.0708
## .. ..$ grsq : num 0.315
## .. ..$ dirs : num [1:22, 1:42] 0 0 0 0 0 0 0 0 0 1 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:22] "(Intercept)" "h(duration-1073)" "h(1073-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ cuts : num [1:22, 1:42] 0 0 0 0 0 0 0 0 0 54 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:22] "(Intercept)" "h(duration-1073)" "h(1073-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ selected.terms : num [1:21] 1 2 3 4 5 6 7 8 9 10 ...
## .. ..$ fitted.values : num [1:40674, 1] -0.0357 -0.0544 -0.0629 0.037 -0.0309 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : NULL
## .. .. .. ..$ : chr "yes"
## .. ..$ coefficients : num [1:21, 1] 0.407754 -0.000409 -0.000406 0.389145 -0.041782 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:21] "(Intercept)" "h(duration-1073)" "h(1073-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr "yes"
## .. ..$ rss.per.response : num 2873
## .. ..$ rsq.per.response : num 0.316
## .. ..$ gcv.per.response : num 0.0708
## .. ..$ grsq.per.response: num 0.315
## .. ..$ rss.per.subset : num [1:22] 4203 3498 3148 3090 3043 ...
## .. ..$ gcv.per.subset : num [1:22] 0.1033 0.086 0.0774 0.076 0.0749 ...
## .. ..$ leverages : num [1:40674] 0.000177 0.000213 0.000211 0.000282 0.000178 ...
## .. ..$ pmethod : chr "backward"
## .. ..$ nprune : NULL
## .. ..$ penalty : num 2
## .. ..$ nk : num 85
## .. ..$ thresh : num 0.001
## .. ..$ termcond : int 4
## .. ..$ weights : NULL
## .. ..$ call : language earth(x = infold.x, y = infold.y, weights = infold.weights, wp = wp, subset = subset, pmethod = if (pmethod == | __truncated__ ...
## .. ..$ namesx : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ modvars : num [1:42, 1:42] 1 0 0 0 0 0 0 0 0 0 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ levels : num [1:2] 0 1
## .. ..$ icross : int 1
## .. ..$ ifold : int 6
## .. ..- attr(*, "class")= chr "earth"
## ..$ fold7 :List of 29
## .. ..$ rss : num 2879
## .. ..$ rsq : num 0.315
## .. ..$ gcv : num 0.0709
## .. ..$ grsq : num 0.314
## .. ..$ dirs : num [1:23, 1:42] 0 0 0 0 0 0 0 0 0 1 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:23] "(Intercept)" "h(duration-1084)" "h(1084-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ cuts : num [1:23, 1:42] 0 0 0 0 0 0 0 0 0 54 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:23] "(Intercept)" "h(duration-1084)" "h(1084-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ selected.terms : num [1:21] 1 2 3 4 5 6 7 8 9 10 ...
## .. ..$ fitted.values : num [1:40689, 1] 0.035 -0.0429 -0.0572 -0.0539 0.0298 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : NULL
## .. .. .. ..$ : chr "yes"
## .. ..$ coefficients : num [1:21, 1] 0.896649 -0.00038 -0.000392 0.383577 -0.03915 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:21] "(Intercept)" "h(duration-1084)" "h(1084-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr "yes"
## .. ..$ rss.per.response : num 2879
## .. ..$ rsq.per.response : num 0.315
## .. ..$ gcv.per.response : num 0.0709
## .. ..$ grsq.per.response: num 0.314
## .. ..$ rss.per.subset : num [1:23] 4203 3499 3152 3095 3046 ...
## .. ..$ gcv.per.subset : num [1:23] 0.1033 0.086 0.0775 0.0761 0.0749 ...
## .. ..$ leverages : num [1:40689] 0.000314 0.000188 0.000219 0.000229 0.000289 ...
## .. ..$ pmethod : chr "backward"
## .. ..$ nprune : NULL
## .. ..$ penalty : num 2
## .. ..$ nk : num 85
## .. ..$ thresh : num 0.001
## .. ..$ termcond : int 4
## .. ..$ weights : NULL
## .. ..$ call : language earth(x = infold.x, y = infold.y, weights = infold.weights, wp = wp, subset = subset, pmethod = if (pmethod == | __truncated__ ...
## .. ..$ namesx : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ modvars : num [1:42, 1:42] 1 0 0 0 0 0 0 0 0 0 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ levels : num [1:2] 0 1
## .. ..$ icross : int 1
## .. ..$ ifold : int 7
## .. ..- attr(*, "class")= chr "earth"
## ..$ fold8 :List of 29
## .. ..$ rss : num 2874
## .. ..$ rsq : num 0.316
## .. ..$ gcv : num 0.0708
## .. ..$ grsq : num 0.315
## .. ..$ dirs : num [1:23, 1:42] 0 0 0 0 0 0 0 0 0 1 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:23] "(Intercept)" "h(duration-1080)" "h(1080-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ cuts : num [1:23, 1:42] 0 0 0 0 0 0 0 0 0 53 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:23] "(Intercept)" "h(duration-1080)" "h(1080-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ selected.terms : num [1:22] 1 2 3 4 5 6 7 8 9 10 ...
## .. ..$ fitted.values : num [1:40680, 1] 0.041 -0.0448 -0.0598 -0.0546 0.0276 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : NULL
## .. .. .. ..$ : chr "yes"
## .. ..$ coefficients : num [1:22, 1] 0.445842 -0.000362 -0.0004 0.387038 0.331638 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:22] "(Intercept)" "h(duration-1080)" "h(1080-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr "yes"
## .. ..$ rss.per.response : num 2874
## .. ..$ rsq.per.response : num 0.316
## .. ..$ gcv.per.response : num 0.0708
## .. ..$ grsq.per.response: num 0.315
## .. ..$ rss.per.subset : num [1:23] 4203 3495 3146 3086 3039 ...
## .. ..$ gcv.per.subset : num [1:23] 0.1033 0.0859 0.0774 0.0759 0.0747 ...
## .. ..$ leverages : num [1:40680] 0.000333 0.00019 0.000224 0.000239 0.000294 ...
## .. ..$ pmethod : chr "backward"
## .. ..$ nprune : NULL
## .. ..$ penalty : num 2
## .. ..$ nk : num 85
## .. ..$ thresh : num 0.001
## .. ..$ termcond : int 4
## .. ..$ weights : NULL
## .. ..$ call : language earth(x = infold.x, y = infold.y, weights = infold.weights, wp = wp, subset = subset, pmethod = if (pmethod == | __truncated__ ...
## .. ..$ namesx : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ modvars : num [1:42, 1:42] 1 0 0 0 0 0 0 0 0 0 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ levels : num [1:2] 0 1
## .. ..$ icross : int 1
## .. ..$ ifold : int 8
## .. ..- attr(*, "class")= chr "earth"
## ..$ fold9 :List of 29
## .. ..$ rss : num 2847
## .. ..$ rsq : num 0.323
## .. ..$ gcv : num 0.0701
## .. ..$ grsq : num 0.321
## .. ..$ dirs : num [1:29, 1:42] 0 0 0 0 0 0 0 0 0 1 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:29] "(Intercept)" "h(duration-1076)" "h(1076-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ cuts : num [1:29, 1:42] 0 0 0 0 0 0 0 0 0 54 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:29] "(Intercept)" "h(duration-1076)" "h(1076-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ selected.terms : num [1:26] 1 2 3 4 5 6 7 8 9 10 ...
## .. ..$ fitted.values : num [1:40692, 1] 0.0336 -0.0451 -0.0643 -0.0559 0.0319 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : NULL
## .. .. .. ..$ : chr "yes"
## .. ..$ coefficients : num [1:26, 1] 0.573455 -0.000383 -0.000389 0.374028 0.293365 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:26] "(Intercept)" "h(duration-1076)" "h(1076-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr "yes"
## .. ..$ rss.per.response : num 2847
## .. ..$ rsq.per.response : num 0.323
## .. ..$ gcv.per.response : num 0.0701
## .. ..$ grsq.per.response: num 0.321
## .. ..$ rss.per.subset : num [1:29] 4203 3490 3153 3094 3045 ...
## .. ..$ gcv.per.subset : num [1:29] 0.1033 0.0858 0.0775 0.0761 0.0749 ...
## .. ..$ leverages : num [1:40692] 0.000323 0.000201 0.000241 0.000244 0.000322 ...
## .. ..$ pmethod : chr "backward"
## .. ..$ nprune : NULL
## .. ..$ penalty : num 2
## .. ..$ nk : num 85
## .. ..$ thresh : num 0.001
## .. ..$ termcond : int 4
## .. ..$ weights : NULL
## .. ..$ call : language earth(x = infold.x, y = infold.y, weights = infold.weights, wp = wp, subset = subset, pmethod = if (pmethod == | __truncated__ ...
## .. ..$ namesx : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ modvars : num [1:42, 1:42] 1 0 0 0 0 0 0 0 0 0 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ levels : num [1:2] 0 1
## .. ..$ icross : int 1
## .. ..$ ifold : int 9
## .. ..- attr(*, "class")= chr "earth"
## ..$ fold10:List of 29
## .. ..$ rss : num 2867
## .. ..$ rsq : num 0.318
## .. ..$ gcv : num 0.0706
## .. ..$ grsq : num 0.317
## .. ..$ dirs : num [1:22, 1:42] 0 0 0 0 0 0 0 0 0 1 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:22] "(Intercept)" "h(duration-1073)" "h(1073-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ cuts : num [1:22, 1:42] 0 0 0 0 0 0 0 0 0 55 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:22] "(Intercept)" "h(duration-1073)" "h(1073-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ selected.terms : num [1:21] 1 2 3 4 5 6 7 8 9 10 ...
## .. ..$ fitted.values : num [1:40689, 1] 0.0176 -0.062 0.0391 -0.0274 0.0615 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : NULL
## .. .. .. ..$ : chr "yes"
## .. ..$ coefficients : num [1:21, 1] 0.962214 -0.000393 -0.000407 0.383956 0.322482 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:21] "(Intercept)" "h(duration-1073)" "h(1073-duration)" "poutcomesuccess" ...
## .. .. .. ..$ : chr "yes"
## .. ..$ rss.per.response : num 2867
## .. ..$ rsq.per.response : num 0.318
## .. ..$ gcv.per.response : num 0.0706
## .. ..$ grsq.per.response: num 0.317
## .. ..$ rss.per.subset : num [1:22] 4204 3492 3147 3089 3038 ...
## .. ..$ gcv.per.subset : num [1:22] 0.1033 0.0858 0.0774 0.0759 0.0747 ...
## .. ..$ leverages : num [1:40689] 0.000262 0.000208 0.000278 0.000176 0.000251 ...
## .. ..$ pmethod : chr "backward"
## .. ..$ nprune : NULL
## .. ..$ penalty : num 2
## .. ..$ nk : num 85
## .. ..$ thresh : num 0.001
## .. ..$ termcond : int 4
## .. ..$ weights : NULL
## .. ..$ call : language earth(x = infold.x, y = infold.y, weights = infold.weights, wp = wp, subset = subset, pmethod = if (pmethod == | __truncated__ ...
## .. ..$ namesx : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ modvars : num [1:42, 1:42] 1 0 0 0 0 0 0 0 0 0 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. .. .. ..$ : chr [1:42] "age" "jobblue-collar" "jobentrepreneur" "jobhousemaid" ...
## .. ..$ levels : num [1:2] 0 1
## .. ..$ icross : int 1
## .. ..$ ifold : int 10
## .. ..- attr(*, "class")= chr "earth"
## $ cv.nterms.selected.by.gcv: Named num [1:11] 24 23 21 23 22 21 21 22 26 21 ...
## ..- attr(*, "names")= chr [1:11] "fold1" "fold2" "fold3" "fold4" ...
## $ cv.nvars.selected.by.gcv : Named num [1:11] 15 14 13 14 13 13 14 13 18 14 ...
## ..- attr(*, "names")= chr [1:11] "fold1" "fold2" "fold3" "fold4" ...
## $ cv.groups : int [1:45211, 1:2] 1 1 1 1 1 1 1 1 1 1 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr [1:2] "cross" "fold"
## $ cv.rsq.tab : num [1:11, 1:2] 0.3 0.31 0.311 0.294 0.334 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:11] "fold1" "fold2" "fold3" "fold4" ...
## .. ..$ : chr [1:2] "yes" "mean"
## $ cv.maxerr.tab : num [1:11, 1:2] -1.09 -1.26 -1.14 -1.14 -1.04 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:11] "fold1" "fold2" "fold3" "fold4" ...
## .. ..$ : chr [1:2] "yes" "max"
## $ cv.class.rate.tab : num [1:11, 1:2] 0.899 0.9 0.904 0.898 0.903 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr [1:2] "yes" "mean"
## - attr(*, "class")= chr "earth"
From all of this, we highlight three elements:
- Variable importance
- Basis functions (the resulting model)
- Curves and surfaces (contributions)
Variable importance
## Overall
## duration 100.000000
## poutcomesuccess 68.109084
## monthmar 45.171762
## housingyes 40.087272
## monthoct 35.114270
## contactunknown 31.401977
## monthsep 27.823303
## age 24.185852
## monthjun 21.090675
## pdays 16.010587
## monthdec 14.461722
## campaign 12.631608
## loanyes 5.779968
Basis functions
## yes
## (Intercept) 0.7774569240
## h(duration-1080) -0.0003818948
## h(1080-duration) -0.0004020631
## poutcomesuccess 0.3809444003
## monthmar 0.3301108826
## housingyes -0.0407997273
## monthoct 0.1916481210
## contactunknown -0.0712999709
## monthsep 0.1788583816
## h(54-age) 0.0087017318
## h(duration-375) 0.0003026388
## monthjun 0.0518693660
## h(2-campaign) 0.0268377535
## monthdec 0.1876019796
## h(pdays-349) 0.0061454449
## h(349-pdays) -0.0015968138
## h(age-27) 0.0071639964
## h(pdays-53) -0.0020353430
## h(pdays-425) -0.0043865936
## loanyes -0.0293712807
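To read these coefficients, it can help to add up the terms that involve a single predictor. The sketch below does this for duration, using the three duration coefficients copied from the table above; the helper names h and duration_effect are introduced here for illustration only.

```r
# Hinge function used in the h(...) terms
h <- function(x) pmax(0, x)

# Contribution of duration to the fitted value for y = yes,
# built from the three duration terms in the table
duration_effect <- function(duration) {
  -0.0003818948 * h(duration - 1080) +
  -0.0004020631 * h(1080 - duration) +
   0.0003026388 * h(duration - 375)
}

d <- c(180, 375, 1080, 2000)
print(round(duration_effect(d), 4))
```

The contribution rises with duration, more steeply between the knots at 375 and 1080, and declines slightly beyond 1080, because the two terms active in that range have coefficients of opposite sign.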
Curves and surfaces
## plotmo grid: age job marital education default balance housing loan contact day month
## 39 blue-collar married secondary no 448 yes no cellular 16 may
## duration campaign pdays previous poutcome
## 180 2 -1 0 unknown