5 Summary

Before that, we compute standard summary statistics for our data set with the function summary. This gives a general statistical overview of every column, which could also be useful for other departments.

summary(data)
##  Enterprise Flag Record Number     US Postal State Code Metropolitan Statistical Area (MSA) Code County - 2010 Census
##  Min.   :1.000   Min.   :      3   Length:1000000       Min.   :10180                            Length:1000000      
##  1st Qu.:1.000   1st Qu.:1011896   Class :character     1st Qu.:19740                            Class :character    
##  Median :1.000   Median :2023230   Mode  :character     Median :33460                            Mode  :character    
##  Mean   :1.434   Mean   :2023566                        Mean   :36363                                                
##  3rd Qu.:2.000   3rd Qu.:3034943                        3rd Qu.:41420                                                
##  Max.   :2.000   Max.   :4050116                        Max.   :99999                                                
##                                                                                                                      
##  Census Tract - 2010 Census 2010 Census Tract - Percent Minority 2010 Census Tract - Median Income
##  Length:1000000             Min.   :  0.00                       Min.   :  2499                   
##  Class :character           1st Qu.: 10.74                       1st Qu.: 64153                   
##  Mode  :character           Median : 21.20                       Median : 82760                   
##                             Mean   : 29.15                       Mean   : 87967                   
##                             3rd Qu.: 40.77                       3rd Qu.:106163                   
##                             Max.   :100.00                       Max.   :250001                   
##                             NA's   :423                          NA's   :848                      
##  Local Area Median Income Tract Income Ratio Borrower’s (or Borrowers’) Annual Income Area Median Family Income (2019)
##  Min.   : 18262           Min.   :0.040      Min.   :        0                        Min.   : 19800                  
##  1st Qu.: 66070           1st Qu.:0.900      1st Qu.:    65000                        1st Qu.: 69700                  
##  Median : 73783           Median :1.110      Median :    96000                        Median : 79100                  
##  Mean   : 75430           Mean   :1.164      Mean   :   131026                        Mean   : 80343                  
##  3rd Qu.: 81823           3rd Qu.:1.380      3rd Qu.:   141000                        3rd Qu.: 87900                  
##  Max.   :133523           Max.   :5.290      Max.   :999999998                        Max.   :135500                  
##  NA's   :848              NA's   :848                                                 NA's   :1                       
##  Borrower Income Ratio Acquisition Unpaid Principal Balance (UPB) Purpose of Loan Federal Guarantee Number of Borrowers
##  Min.   :  0.000       Min.   :   5000                            Min.   :1.000   Min.   :1.000     Min.   :1.000      
##  1st Qu.:  0.810       1st Qu.: 155000                            1st Qu.:1.000   1st Qu.:1.000     1st Qu.:1.000      
##  Median :  1.200       Median : 235000                            Median :1.000   Median :1.000     Median :1.000      
##  Mean   :  1.464       Mean   : 258684                            Mean   :2.518   Mean   :1.001     Mean   :1.471      
##  3rd Qu.:  1.760       3rd Qu.: 335000                            3rd Qu.:2.000   3rd Qu.:1.000     3rd Qu.:2.000      
##  Max.   :218.170       Max.   :1395000                            Max.   :7.000   Max.   :4.000     Max.   :6.000      
##  NA's   :15                                                                                                            
##  First-Time Home Buyer Borrower Race1   Borrower Ethnicity Co-Borrower Race1 Co-Borrower Ethnicity Borrower Gender
##  Min.   :1.000         Min.   :1.00     Min.   :1.00       Min.   :1.0       Min.   :1.0           Min.   :1.00   
##  1st Qu.:2.000         1st Qu.:5.00     1st Qu.:2.00       1st Qu.:5.0       1st Qu.:2.0           1st Qu.:1.00   
##  Median :2.000         Median :5.00     Median :2.00       Median :5.0       Median :2.0           Median :1.00   
##  Mean   :1.768         Mean   :4.64     Mean   :1.89       Mean   :4.7       Mean   :1.9           Mean   :1.33   
##  3rd Qu.:2.000         3rd Qu.:5.00     3rd Qu.:2.00       3rd Qu.:5.0       3rd Qu.:2.0           3rd Qu.:2.00   
##  Max.   :2.000         Max.   :5.00     Max.   :2.00       Max.   :5.0       Max.   :2.0           Max.   :2.00   
##  NA's   :265           NA's   :141739   NA's   :143473     NA's   :602783    NA's   :609922        NA's   :79610  
##  Co-Borrower Gender Age of Borrower Age of Co-Borrower Occupancy Code  Property Type   Borrower Age 62 or older
##  Min.   :1.0        Min.   :1.000   Min.   :1.0        Min.   :1.000   Min.   :1.000   Min.   :0.0000          
##  1st Qu.:1.0        1st Qu.:2.000   1st Qu.:2.0        1st Qu.:1.000   1st Qu.:1.000   1st Qu.:0.0000          
##  Median :2.0        Median :3.000   Median :3.0        Median :1.000   Median :1.000   Median :0.0000          
##  Mean   :1.7        Mean   :3.608   Mean   :3.6        Mean   :1.164   Mean   :1.007   Mean   :0.1505          
##  3rd Qu.:2.0        3rd Qu.:5.000   3rd Qu.:5.0        3rd Qu.:1.000   3rd Qu.:1.000   3rd Qu.:0.0000          
##  Max.   :2.0        Max.   :7.000   Max.   :7.0        Max.   :3.000   Max.   :2.000   Max.   :1.0000          
##  NA's   :575553     NA's   :25      NA's   :535309                                     NA's   :25              
##  Co-Borrower Age 62 or older Loan-to-Value Ratio (LTV) Date of Mortgage Note Term of Mortgage at Origination
##  Min.   :0.0                 Min.   :  3.23            Min.   :1.0           Min.   : 60.0                  
##  1st Qu.:0.0                 1st Qu.: 67.34            1st Qu.:1.0           1st Qu.:360.0                  
##  Median :0.0                 Median : 80.00            Median :1.0           Median :360.0                  
##  Mean   :0.2                 Mean   : 75.62            Mean   :1.1           Mean   :332.5                  
##  3rd Qu.:0.0                 3rd Qu.: 90.00            3rd Qu.:1.0           3rd Qu.:360.0                  
##  Max.   :1.0                 Max.   :308.00            Max.   :2.0           Max.   :480.0                  
##  NA's   :535309              NA's   :3                                                                      
##  Number of Units Interest Rate at Origination  Note Amount       Preapproval     Application Channel
##  Min.   :1.000   Min.   :0.62                 Min.   :   5000   Min.   :1.0      Min.   :1.00       
##  1st Qu.:1.000   1st Qu.:3.87                 1st Qu.: 155000   1st Qu.:2.0      1st Qu.:1.00       
##  Median :1.000   Median :4.12                 Median : 235000   Median :2.0      Median :1.00       
##  Mean   :1.025   Mean   :4.24                 Mean   : 260106   Mean   :1.9      Mean   :1.52       
##  3rd Qu.:1.000   3rd Qu.:4.62                 3rd Qu.: 345000   3rd Qu.:2.0      3rd Qu.:2.00       
##  Max.   :4.000   Max.   :8.55                 Max.   :1395000   Max.   :2.0      Max.   :3.00       
##                  NA's   :74                   NA's   :74        NA's   :679221   NA's   :204345     
##  Automated Underwriting System (AUS) Credit Score Model - Borrower Credit Score Model - Co-Borrower
##  Min.   :1.000                       Min.   :1.00                  Min.   : 1.0                    
##  1st Qu.:1.000                       1st Qu.:1.00                  1st Qu.: 2.0                    
##  Median :1.000                       Median :2.00                  Median :10.0                    
##  Mean   :1.259                       Mean   :1.95                  Mean   : 6.3                    
##  3rd Qu.:2.000                       3rd Qu.:3.00                  3rd Qu.:10.0                    
##  Max.   :4.000                       Max.   :3.00                  Max.   :10.0                    
##  NA's   :11919                       NA's   :216866                NA's   :217164                  
##  Debt-to-Income (DTI) Ratio Discount Points   Property Value    Rural Census Tract Lower Mississippi Delta County
##  Min.   :10.00              Min.   :      0   Min.   :   5000   Min.   :0.0000     Min.   :0.0000                
##  1st Qu.:20.00              1st Qu.:      0   1st Qu.: 225000   1st Qu.:0.0000     1st Qu.:0.0000                
##  Median :36.00              Median :      0   Median : 325000   Median :0.0000     Median :0.0000                
##  Mean   :32.84              Mean   :   1016   Mean   : 369373   Mean   :0.1691     Mean   :0.0153                
##  3rd Qu.:43.00              3rd Qu.:   1305   3rd Qu.: 465000   3rd Qu.:0.0000     3rd Qu.:0.0000                
##  Max.   :60.00              Max.   :3362381   Max.   :9005000   Max.   :1.0000     Max.   :1.0000                
##  NA's   :97                 NA's   :294369    NA's   :74                                                         
##  Middle Appalachia County Persistent Poverty County Area of Concentrated Poverty High Opportunity Area
##  Min.   :0.00000          Min.   :0.00000           Min.   :0.00000              Min.   :0.0000       
##  1st Qu.:0.00000          1st Qu.:0.00000           1st Qu.:0.00000              1st Qu.:0.0000       
##  Median :0.00000          Median :0.00000           Median :0.00000              Median :0.0000       
##  Mean   :0.01714          Mean   :0.02611           Mean   :0.07877              Mean   :0.1889       
##  3rd Qu.:0.00000          3rd Qu.:0.00000           3rd Qu.:0.00000              3rd Qu.:0.0000       
##  Max.   :1.00000          Max.   :1.00000           Max.   :1.00000              Max.   :1.0000       
##                                                                                                       
##  Qualified Opportunity Zone (QOZ)
##  Min.   :0.00000                 
##  1st Qu.:0.00000                 
##  Median :0.00000                 
##  Mean   :0.04893                 
##  3rd Qu.:0.00000                 
##  Max.   :1.00000                 
##
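Two patterns in the output above are worth a closer look. The maximum of the borrower income column is 999999998, which looks like a placeholder code rather than a real income, and many columns (race, gender, purpose of loan, and so on) are numeric codes for categories, so their mean and quartiles carry little meaning. The following is a minimal sketch of how both could be handled before calling summary again. It uses a made-up toy data frame, not the real data. The column names `state`, `income` and `gender` are hypothetical, and reading 999999998 as "missing" is an assumption that should be checked against the data dictionary.

```r
# Toy stand-in for the real data set; names and values are made up.
toy <- data.frame(
  state  = c("CA", "TX", "NY", "CA"),           # character column
  income = c(65000, 96000, 999999998, 141000),  # 999999998 assumed to be a placeholder
  gender = c(1, 2, 1, NA),                      # numeric code for a category
  stringsAsFactors = FALSE
)

# Assumption: 999999998 marks a missing income, so recode it as NA.
# %in% never returns NA, which keeps the assignment safe if NAs are present.
toy$income[toy$income %in% 999999998] <- NA

# Treat the coded column as a factor so that summary() reports
# counts per level instead of quartiles of the codes.
toy$gender <- factor(toy$gender, levels = c(1, 2))

summary(toy)
```

After the recoding, the placeholder no longer distorts the mean and maximum of the income column, and the factor column is summarised as counts per code plus the number of NA's. The same idea would apply to the other coded columns of the real data set, provided their codes are confirmed first.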