Exercises Day 3
R Tools for Data Profiling
0.11 Loading Libraries - autoEDA
library(devtools)
library(autoEDA)
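The sections that follow print autoEDA output for the built-in `iris` data, but the calls that produced it are not shown. The sketch below is one plausible reconstruction, not the original code. It assumes the package's `autoEDA(x, y)` interface and the GitHub repository `XanderHorn/autoEDA`. The regression outcome `"Sepal.Length"` is a guess based on that feature's 0% predictive power in section 0.13; the classification outcome `"Species"` is inferred the same way from section 0.14.

```r
# Sketch only: these calls are inferred from the printed output, not copied
# from the source. autoEDA is typically installed from GitHub:
# devtools::install_github("XanderHorn/autoEDA")

# Outcome per section; NULL means no outcome (univariate analysis).
# "Sepal.Length" and "Species" are assumptions, see the note above.
outcomes <- list(
  univariate     = NULL,
  regression     = "Sepal.Length",
  classification = "Species"
)

# Run the three analyses only when the package is available.
if (requireNamespace("autoEDA", quietly = TRUE)) {
  results <- lapply(outcomes, function(y) autoEDA::autoEDA(x = iris, y = y))
}
```

Each element of `results` would then hold the summary data frame printed in the matching section.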
0.12 Univariate Analysis
## autoEDA | Setting color theme
## autoEDA | Removing constant features
## autoEDA | 0 constant features removed
## autoEDA | 0 zero spread features removed
## autoEDA | Removing features containing majority missing values
## autoEDA | 0 majority missing features removed
## autoEDA | Cleaning data
## autoEDA | Correcting sparse categorical feature levels
## autoEDA | Performing univariate analysis
## autoEDA | Visualizing data
##        Feature Observations FeatureClass FeatureType PercentageMissing
## 1 Sepal.Length          150      numeric  Continuous                 0
## 2  Sepal.Width          150      numeric  Continuous                 0
## 3 Petal.Length          150      numeric  Continuous                 0
## 4  Petal.Width          150      numeric  Continuous                 0
## 5      Species          150    character Categorical                 0
##   PercentageUnique ConstantFeature ZeroSpreadFeature LowerOutliers
## 1            23.33              No                No             0
## 2            15.33              No                No             1
## 3            28.67              No                No             0
## 4            14.67              No                No             0
## 5             2.00              No                No             0
##   UpperOutliers ImputationValue MinValue FirstQuartile Median Mean   Mode
## 1             0             5.8      4.3           5.1   5.80 5.84      5
## 2             3               3      2.0           2.8   3.00 3.06      3
## 3             0            4.35      1.0           1.6   4.35 3.76    1.4
## 4             0             1.3      0.1           0.3   1.30 1.20    0.2
## 5             0          SETOSA      0.0           0.0   0.00 0.00 SETOSA
##   ThirdQuartile MaxValue LowerOutlierValue UpperOutlierValue
## 1           6.4      7.9              3.15              8.35
## 2           3.3      4.4              2.05              4.05
## 3           5.1      6.9             -3.65             10.35
## 4           1.8      2.5             -1.95              4.05
## 5           0.0      0.0              0.00              0.00
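The outlier columns above are consistent with Tukey's rule: `LowerOutlierValue` equals Q1 - 1.5 × IQR and `UpperOutlierValue` equals Q3 + 1.5 × IQR (for `Sepal.Length`, 5.1 - 1.5 × 1.3 = 3.15). The base R sketch below recomputes those columns, plus `PercentageUnique`, for the numeric features of `iris`. It is an independent check of the printed numbers, not autoEDA's own code, and the data frame name `profile` is only illustrative.

```r
# Numeric columns of the built-in iris data set.
num <- iris[sapply(iris, is.numeric)]

profile <- data.frame(
  Feature          = names(num),
  PercentageUnique = round(sapply(num, function(v) 100 * length(unique(v)) / length(v)), 2),
  FirstQuartile    = sapply(num, quantile, probs = 0.25, names = FALSE),
  ThirdQuartile    = sapply(num, quantile, probs = 0.75, names = FALSE),
  row.names        = NULL
)

# Tukey fences: 1.5 interquartile ranges beyond the quartiles.
iqr <- profile$ThirdQuartile - profile$FirstQuartile
profile$LowerOutlierValue <- profile$FirstQuartile - 1.5 * iqr
profile$UpperOutlierValue <- profile$ThirdQuartile + 1.5 * iqr

# Count the observations that fall outside each fence.
profile$LowerOutliers <- mapply(function(v, lo) sum(v < lo), num, profile$LowerOutlierValue)
profile$UpperOutliers <- mapply(function(v, hi) sum(v > hi), num, profile$UpperOutlierValue)

print(profile)
```

Under this rule only `Sepal.Width` has values outside the fences (one below 2.05, three above 4.05), which matches the `LowerOutliers` and `UpperOutliers` columns.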
0.13 Bivariate Regression
## autoEDA | Setting color theme
## autoEDA | Removing constant features
## autoEDA | 0 constant features removed
## autoEDA | Removing zero spread features
## autoEDA | 0 zero spread features removed
## autoEDA | Removing features containing majority missing values
## autoEDA | 0 majority missing features removed
## autoEDA | Cleaning data
## autoEDA | Correcting sparse categorical feature levels
## autoEDA | Sorting features
## autoEDA | Regression outcome detected
## autoEDA | Calculating feature predictive power
## autoEDA | Visualizing data
##        Feature Observations FeatureClass FeatureType PercentageMissing
## 1 Petal.Length          150      numeric  Continuous                 0
## 2  Petal.Width          150      numeric  Continuous                 0
## 3 Sepal.Length          150      numeric  Continuous                 0
## 4  Sepal.Width          150      numeric  Continuous                 0
## 5      Species          150    character Categorical                 0
##   PercentageUnique ConstantFeature ZeroSpreadFeature LowerOutliers
## 1            28.67              No                No             0
## 2            14.67              No                No             0
## 3            23.33              No                No             0
## 4            15.33              No                No             1
## 5             2.00              No                No             0
##   UpperOutliers ImputationValue MinValue FirstQuartile Median Mean   Mode
## 1             0            4.35      1.0           1.6   4.35 3.76    1.4
## 2             0             1.3      0.1           0.3   1.30 1.20    0.2
## 3             0             5.8      4.3           5.1   5.80 5.84      5
## 4             3               3      2.0           2.8   3.00 3.06      3
## 5             0          SETOSA      0.0           0.0   0.00 0.00 SETOSA
##   ThirdQuartile MaxValue LowerOutlierValue UpperOutlierValue
## 1           5.1      6.9             -3.65             10.35
## 2           1.8      2.5             -1.95              4.05
## 3           6.4      7.9              3.15              8.35
## 4           3.3      4.4              2.05              4.05
## 5           0.0      0.0              0.00              0.00
##   PredictivePowerPercentage PredictivePower
## 1                        87            High
## 2                        82            High
## 3                         0             Low
## 4                        12             Low
## 5                        78            High
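The output does not say which variable was the regression outcome. `Sepal.Length` is the only feature with 0% predictive power, which suggests it was the outcome; the sketch below treats that as an assumption. Under it, the percentages for the three numeric features (87, 82, 12) coincide with the rounded absolute Pearson correlation with `Sepal.Length`. That is an observation about these numbers, not a description of how autoEDA computes `PredictivePowerPercentage`. The `Species` row is categorical and is left out.

```r
# Assumed outcome: inferred from the 0% row above, not stated in the source.
outcome  <- "Sepal.Length"
features <- c("Petal.Length", "Petal.Width", "Sepal.Width")

# Absolute Pearson correlation with the outcome, as a rounded percentage.
abs_cor_pct <- sapply(features, function(f) {
  round(100 * abs(cor(iris[[f]], iris[[outcome]])))
})

print(abs_cor_pct)
```

The result lines up with rows 1, 2 and 4 of the table, so correlation with the outcome is a reasonable way to read the High and Low labels for numeric features here.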
0.14 Bivariate Classification
## autoEDA | Setting color theme
## autoEDA | Removing constant features
## autoEDA | 0 constant features removed
## autoEDA | Removing zero spread features
## autoEDA | 0 zero spread features removed
## autoEDA | Removing features containing majority missing values
## autoEDA | 0 majority missing features removed
## autoEDA | Cleaning data
## autoEDA | Correcting sparse categorical feature levels
## autoEDA | Sorting features
## autoEDA | Multi-class classification outcome detected
## autoEDA | Calculating feature predictive power
## autoEDA | Visualizing data
##        Feature Observations FeatureClass FeatureType PercentageMissing
## 1 Petal.Length          150      numeric  Continuous                 0
## 2  Petal.Width          150      numeric  Continuous                 0
## 3 Sepal.Length          150      numeric  Continuous                 0
## 4  Sepal.Width          150      numeric  Continuous                 0
## 5      Species          150    character Categorical                 0
##   PercentageUnique ConstantFeature ZeroSpreadFeature LowerOutliers
## 1            28.67              No                No             0
## 2            14.67              No                No             0
## 3            23.33              No                No             0
## 4            15.33              No                No             1
## 5             2.00              No                No             0
##   UpperOutliers ImputationValue MinValue FirstQuartile Median Mean   Mode
## 1             0            4.35      1.0           1.6   4.35 3.76    1.4
## 2             0             1.3      0.1           0.3   1.30 1.20    0.2
## 3             0             5.8      4.3           5.1   5.80 5.84      5
## 4             3               3      2.0           2.8   3.00 3.06      3
## 5             0          SETOSA      0.0           0.0   0.00 0.00 SETOSA
##   ThirdQuartile MaxValue LowerOutlierValue UpperOutlierValue
## 1           5.1      6.9             -3.65             10.35
## 2           1.8      2.5             -1.95              4.05
## 3           6.4      7.9              3.15              8.35
## 4           3.3      4.4              2.05              4.05
## 5           0.0      0.0              0.00              0.00
##   PredictivePowerPercentage PredictivePower
## 1                        86            High
## 2                        88            High
## 3                        46          Medium
## 4                        24             Low
## 5                         0             Low
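The log reports a multi-class outcome and `Species` has 0% predictive power, which suggests `Species` was the outcome; that is an inference, not something the output states. As a rough cross-check of the ranking, the sketch below uses a different technique from whatever autoEDA applies internally: the share of each numeric feature's variance explained by `Species` (eta squared, obtained as the R² of a one-way linear model). The percentages are not expected to match the table. Only the broad ordering, with petal measurements ahead of sepal measurements, is comparable.

```r
features <- c("Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width")

# Eta squared: R^2 of a linear model with Species as the only predictor.
# This is a swapped-in illustration, not autoEDA's predictive power measure.
eta_squared <- sapply(features, function(f) {
  fit <- lm(iris[[f]] ~ iris$Species)
  summary(fit)$r.squared
})

# Show the features from most to least separable by species.
print(round(100 * sort(eta_squared, decreasing = TRUE)))
```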