2.7 EXAMPLE 6

  • In RStudio, estimate a logistic regression of the form:

\[\begin{equation} \log\bigg(\dfrac{p_i}{1-p_i}\bigg)=\beta_0 + \beta_1 x_i + \beta_2 z_i, \end{equation}\]

where:

\[\begin{align} p_i&=\text{probability that the company is listed on the stock exchange}\\ x_i&=\text{company revenue in thousands of kuna (000 kn)} \\ z_i&=\text{number of employees in the company} \end{align}\]
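
Solving the logit equation for \(p_i\) gives the probability itself, which is the quantity a fitted model reports as a predicted value. As a worked step using only the symbols defined above:

\[\begin{align} \dfrac{p_i}{1-p_i}&=e^{\beta_0 + \beta_1 x_i + \beta_2 z_i}\\ p_i&=\dfrac{1}{1+e^{-(\beta_0 + \beta_1 x_i + \beta_2 z_i)}} \end{align}\]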

  • Note that the dependent variable \(y\in\{0,1\}\) has already been created in the object mojipodaci (variable d3). Name the new logistic regression logisticka2. To estimate the model, use the glm() function with a binomial distribution and the "logit" link function.
logisticka2=glm(d3~prihod+zaposleni,data=mojipodaci,family=binomial(link="logit"))
summary(logisticka2)
## 
## Call:
## glm(formula = d3 ~ prihod + zaposleni, family = binomial(link = "logit"), 
##     data = mojipodaci)
## 
## Deviance Residuals: 
##        Min          1Q      Median          3Q         Max  
## -3.882e-04  -2.000e-08  -2.000e-08   2.000e-08   4.082e-04  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -4653.4   490610.4  -0.009    0.992
## prihod         116.8    10100.2   0.012    0.991
## zaposleni     -215.8    21126.7  -0.010    0.992
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 6.8029e+01  on 49  degrees of freedom
## Residual deviance: 3.6457e-07  on 47  degrees of freedom
## AIC: 6
## 
## Number of Fisher Scoring iterations: 25
  • Interpret the estimated coefficients in concrete terms.
exp(coefficients(logisticka2))
##  (Intercept)       prihod    zaposleni 
## 0.000000e+00 5.273650e+50 1.956816e-94
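
Exponentiated coefficients are odds ratios: the factor by which the odds of listing change for a one-unit increase in a regressor, holding the other fixed. Values such as 5.27e+50 and 1.96e-94, together with the very large standard errors, the near-zero residual deviance and the 25 Fisher scoring iterations in the summary above, are the usual symptoms of complete separation, so they should not be read as realistic effect sizes. The sketch below reproduces the phenomenon on hypothetical toy data; the vectors x and y are invented for illustration and are not taken from mojipodaci.

```r
# Hypothetical toy data (an assumption, not the mojipodaci data set):
# y equals 1 exactly when x > 5, so x separates the two classes perfectly.
x <- 1:10
y <- as.integer(x > 5)

# glm() warns when it detects fitted probabilities of 0 or 1;
# the warnings are suppressed here only to keep the sketch quiet.
fit <- suppressWarnings(glm(y ~ x, family = binomial(link = "logit")))

coef(fit)                                   # very large in absolute value
summary(fit)$coefficients[, "Std. Error"]   # even larger standard errors
deviance(fit)                               # close to zero
range(fitted(fit))                          # fitted probabilities pushed to 0 and 1
```

When this pattern appears, the classification can still be perfect, but the individual coefficients and their z tests carry little information.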
  • Save the fitted probabilities as a new variable p2 in the data frame mojipodaci.
mojipodaci$p2=predict(logisticka2,type="response")
head(mojipodaci)
##   prihod zaduzenost  djelatnost kotacija zaposleni reklama rizik d1 d2 d3
## 1  60.53    0.88859    trgovina       ne        16   58.80     5  0  1  0
## 2  50.33    0.06934 proizvodnja       ne        13   37.27     1  1  0  0
## 3 130.61    0.21144      usluge       da        41   40.00     2  0  0  1
## 4 100.67    0.55482    trgovina       ne        33   42.98     3  0  1  0
## 5 130.25    0.14767    trgovina       da        41   72.00     1  0  1  1
## 6 130.95    0.14211    trgovina       da        41   63.07     1  0  1  1
##        p           p2
## 1 0.0000 2.220446e-16
## 2 0.0000 2.220446e-16
## 3 1.0000 1.000000e+00
## 4 0.3665 7.536640e-08
## 5 1.0000 1.000000e+00
## 6 1.0000 1.000000e+00
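
To see what type="response" returns, it can help to compare it with the linear predictor passed through the inverse logit. The sketch below uses R's built-in mtcars data as a stand-in (an assumption made for illustration; it is not the data set used in this example) because the model on it converges normally.

```r
# Stand-in model on the built-in mtcars data (assumption: illustration only).
fit <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))

eta    <- predict(fit, type = "link")      # linear predictor b0 + b1 * wt
p_resp <- predict(fit, type = "response")  # fitted probabilities

# The response scale is the inverse logit of the link scale.
p_manual <- 1 / (1 + exp(-eta))
all.equal(unname(p_resp), unname(p_manual))
all.equal(unname(p_resp), unname(plogis(eta)))
```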
confusionMatrix(mojipodaci$d3,mojipodaci$p2,threshold=0.5)
1-misClassError(mojipodaci$d3,mojipodaci$p2,threshold=0.5)
##    0  1
## 0 29  0
## 1  0 21
## [1] 1
  • With the variable zaposleni added, all companies are classified correctly at a threshold of \(0.5\).
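
The classification step itself can be sketched in base R. The vectors below copy only the six d3 and p2 values printed by head(mojipodaci) above, so the result covers those six rows, not all 50 companies; the object names are illustrative.

```r
# First six rows of d3 and p2, copied from the head(mojipodaci) output.
actual <- c(0, 0, 1, 0, 1, 1)
prob   <- c(2.220446e-16, 2.220446e-16, 1, 7.536640e-08, 1, 1)

# A company is classified as listed when its probability exceeds the threshold.
threshold <- 0.5
predicted <- as.integer(prob > threshold)

# Cross-tabulate predictions against actual outcomes and compute accuracy.
conf_table <- table(predicted = predicted, actual = actual)
accuracy   <- mean(predicted == actual)
conf_table
accuracy
```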

  • What would the optimal threshold be?

optimalCutoff(mojipodaci$d3,mojipodaci$p2,returnDiagnostics=TRUE)
## $optimalCutoff
## [1] 0.01
## 
## $sensitivityTable
##           CUTOFF FPR       TPR YOUDENSINDEX SPECIFICITY MISCLASSERROR
## 1   1.000000e+00   0 0.9047619    0.9047619           1          0.04
## 2   9.900000e-01   0         1    1.0000000           1          0.00
## 3   9.800000e-01   0         1    1.0000000           1          0.00
## 4   9.700000e-01   0         1    1.0000000           1          0.00
## 5   9.600000e-01   0         1    1.0000000           1          0.00
## 6   9.500000e-01   0         1    1.0000000           1          0.00
## 7   9.400000e-01   0         1    1.0000000           1          0.00
## 8   9.300000e-01   0         1    1.0000000           1          0.00
## 9   9.200000e-01   0         1    1.0000000           1          0.00
## 10  9.100000e-01   0         1    1.0000000           1          0.00
## 11  9.000000e-01   0         1    1.0000000           1          0.00
## 12  8.900000e-01   0         1    1.0000000           1          0.00
## 13  8.800000e-01   0         1    1.0000000           1          0.00
## 14  8.700000e-01   0         1    1.0000000           1          0.00
## 15  8.600000e-01   0         1    1.0000000           1          0.00
## 16  8.500000e-01   0         1    1.0000000           1          0.00
## 17  8.400000e-01   0         1    1.0000000           1          0.00
## 18  8.300000e-01   0         1    1.0000000           1          0.00
## 19  8.200000e-01   0         1    1.0000000           1          0.00
## 20  8.100000e-01   0         1    1.0000000           1          0.00
## 21  8.000000e-01   0         1    1.0000000           1          0.00
## 22  7.900000e-01   0         1    1.0000000           1          0.00
## 23  7.800000e-01   0         1    1.0000000           1          0.00
## 24  7.700000e-01   0         1    1.0000000           1          0.00
## 25  7.600000e-01   0         1    1.0000000           1          0.00
## 26  7.500000e-01   0         1    1.0000000           1          0.00
## 27  7.400000e-01   0         1    1.0000000           1          0.00
## 28  7.300000e-01   0         1    1.0000000           1          0.00
## 29  7.200000e-01   0         1    1.0000000           1          0.00
## 30  7.100000e-01   0         1    1.0000000           1          0.00
## 31  7.000000e-01   0         1    1.0000000           1          0.00
## 32  6.900000e-01   0         1    1.0000000           1          0.00
## 33  6.800000e-01   0         1    1.0000000           1          0.00
## 34  6.700000e-01   0         1    1.0000000           1          0.00
## 35  6.600000e-01   0         1    1.0000000           1          0.00
## 36  6.500000e-01   0         1    1.0000000           1          0.00
## 37  6.400000e-01   0         1    1.0000000           1          0.00
## 38  6.300000e-01   0         1    1.0000000           1          0.00
## 39  6.200000e-01   0         1    1.0000000           1          0.00
## 40  6.100000e-01   0         1    1.0000000           1          0.00
## 41  6.000000e-01   0         1    1.0000000           1          0.00
## 42  5.900000e-01   0         1    1.0000000           1          0.00
## 43  5.800000e-01   0         1    1.0000000           1          0.00
## 44  5.700000e-01   0         1    1.0000000           1          0.00
## 45  5.600000e-01   0         1    1.0000000           1          0.00
## 46  5.500000e-01   0         1    1.0000000           1          0.00
## 47  5.400000e-01   0         1    1.0000000           1          0.00
## 48  5.300000e-01   0         1    1.0000000           1          0.00
## 49  5.200000e-01   0         1    1.0000000           1          0.00
## 50  5.100000e-01   0         1    1.0000000           1          0.00
## 51  5.000000e-01   0         1    1.0000000           1          0.00
## 52  4.900000e-01   0         1    1.0000000           1          0.00
## 53  4.800000e-01   0         1    1.0000000           1          0.00
## 54  4.700000e-01   0         1    1.0000000           1          0.00
## 55  4.600000e-01   0         1    1.0000000           1          0.00
## 56  4.500000e-01   0         1    1.0000000           1          0.00
## 57  4.400000e-01   0         1    1.0000000           1          0.00
## 58  4.300000e-01   0         1    1.0000000           1          0.00
## 59  4.200000e-01   0         1    1.0000000           1          0.00
## 60  4.100000e-01   0         1    1.0000000           1          0.00
## 61  4.000000e-01   0         1    1.0000000           1          0.00
## 62  3.900000e-01   0         1    1.0000000           1          0.00
## 63  3.800000e-01   0         1    1.0000000           1          0.00
## 64  3.700000e-01   0         1    1.0000000           1          0.00
## 65  3.600000e-01   0         1    1.0000000           1          0.00
## 66  3.500000e-01   0         1    1.0000000           1          0.00
## 67  3.400000e-01   0         1    1.0000000           1          0.00
## 68  3.300000e-01   0         1    1.0000000           1          0.00
## 69  3.200000e-01   0         1    1.0000000           1          0.00
## 70  3.100000e-01   0         1    1.0000000           1          0.00
## 71  3.000000e-01   0         1    1.0000000           1          0.00
## 72  2.900000e-01   0         1    1.0000000           1          0.00
## 73  2.800000e-01   0         1    1.0000000           1          0.00
## 74  2.700000e-01   0         1    1.0000000           1          0.00
## 75  2.600000e-01   0         1    1.0000000           1          0.00
## 76  2.500000e-01   0         1    1.0000000           1          0.00
## 77  2.400000e-01   0         1    1.0000000           1          0.00
## 78  2.300000e-01   0         1    1.0000000           1          0.00
## 79  2.200000e-01   0         1    1.0000000           1          0.00
## 80  2.100000e-01   0         1    1.0000000           1          0.00
## 81  2.000000e-01   0         1    1.0000000           1          0.00
## 82  1.900000e-01   0         1    1.0000000           1          0.00
## 83  1.800000e-01   0         1    1.0000000           1          0.00
## 84  1.700000e-01   0         1    1.0000000           1          0.00
## 85  1.600000e-01   0         1    1.0000000           1          0.00
## 86  1.500000e-01   0         1    1.0000000           1          0.00
## 87  1.400000e-01   0         1    1.0000000           1          0.00
## 88  1.300000e-01   0         1    1.0000000           1          0.00
## 89  1.200000e-01   0         1    1.0000000           1          0.00
## 90  1.100000e-01   0         1    1.0000000           1          0.00
## 91  1.000000e-01   0         1    1.0000000           1          0.00
## 92  9.000000e-02   0         1    1.0000000           1          0.00
## 93  8.000000e-02   0         1    1.0000000           1          0.00
## 94  7.000000e-02   0         1    1.0000000           1          0.00
## 95  6.000000e-02   0         1    1.0000000           1          0.00
## 96  5.000000e-02   0         1    1.0000000           1          0.00
## 97  4.000000e-02   0         1    1.0000000           1          0.00
## 98  3.000000e-02   0         1    1.0000000           1          0.00
## 99  2.000000e-02   0         1    1.0000000           1          0.00
## 100 1.000000e-02   0         1    1.0000000           1          0.00
## 101 2.220446e-16   1         1    0.0000000           0          0.58
## 
## $misclassificationError
## [1] 0
## 
## $TPR
## [1] 1
## 
## $FPR
## [1] 0
## 
## $Specificity
## [1] 1
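
The columns of the sensitivity table can be checked by hand. Youden's index is \(J=\text{TPR}-\text{FPR}\), and with 21 listed and 29 unlisted companies (50 in total, as in the confusion matrix above) the two extreme rows work out as follows:

\[\begin{align} \text{cutoff }1.00:&\quad \text{TPR}=\tfrac{19}{21}\approx 0.9048,\quad \text{FPR}=0,\quad J\approx 0.9048,\quad \text{error}=\tfrac{2}{50}=0.04\\ \text{cutoff }2.2\times 10^{-16}:&\quad \text{TPR}=1,\quad \text{FPR}=\tfrac{29}{29}=1,\quad J=0,\quad \text{error}=\tfrac{29}{50}=0.58 \end{align}\]

Every cutoff from 0.01 to 0.99 gives \(J=1\) and zero misclassification error, so the reported 0.01 is one of many equally good thresholds, not a uniquely best one.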
  • Plot the ROC curve.
plotROC(mojipodaci$d3,mojipodaci$p2)
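
The area under the ROC curve can also be computed without a plot. The helper below is a minimal base R sketch using the rank (Mann-Whitney) formula; the function name auc_rank is hypothetical, and the input is again only the six d3 and p2 values shown by head(mojipodaci) above.

```r
# Hypothetical helper: AUC as the probability that a randomly chosen
# positive case has a higher score than a randomly chosen negative case.
auc_rank <- function(actual, score) {
  r  <- rank(score)            # average ranks are used for ties
  n1 <- sum(actual == 1)
  n0 <- sum(actual == 0)
  (sum(r[actual == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# First six rows of d3 and p2, copied from the head(mojipodaci) output.
actual <- c(0, 0, 1, 0, 1, 1)
prob   <- c(2.220446e-16, 2.220446e-16, 1, 7.536640e-08, 1, 1)

auc_value <- auc_rank(actual, prob)
auc_value
```

An AUC of 1 means every listed company receives a higher fitted probability than every unlisted one, which is what perfect classification implies.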

  • Which logistic regression model is more appropriate? Base your conclusion on the likelihood ratio test (LRT).
anova(logisticka,logisticka2,test="LRT")
## Analysis of Deviance Table
## 
## Model 1: d3 ~ prihod
## Model 2: d3 ~ prihod + zaposleni
##   Resid. Df Resid. Dev Df Deviance Pr(>Chi)  
## 1        48     5.4827                       
## 2        47     0.0000  1   5.4827  0.01921 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
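
The p-value in the table can be reproduced from the two residual deviances. The sketch below uses only the numbers printed above; the variable names are illustrative.

```r
# Residual deviances and degrees of freedom from the analysis of deviance table.
dev_reduced <- 5.4827   # Model 1: d3 ~ prihod
dev_full    <- 0.0000   # Model 2: d3 ~ prihod + zaposleni
df_diff     <- 48 - 47  # one additional parameter

# The LRT statistic is the drop in deviance; under the null hypothesis
# it is compared with a chi-squared distribution with df_diff degrees of freedom.
lr_stat <- dev_reduced - dev_full
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)
p_value
```

Since the p-value is below 0.05, the model that includes zaposleni fits significantly better at the 5% level. Given the signs of complete separation in logisticka2, the chi-squared approximation should be treated with some caution.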