10.3 Generalized linear models

Generalized linear models (GLMs) were introduced by Nelder and Wedderburn (1972), extending linear regression to a more general setting. These models are characterized by: i) a dependent variable y_i whose probability distribution belongs to the exponential family (see Section 3.1); ii) a linear predictor \eta = \boldsymbol{x}^{\top}\boldsymbol{\beta}; and iii) a link function g such that \mathbb{E}[y|\boldsymbol{x}] = g^{-1}(\boldsymbol{x}^{\top}\boldsymbol{\beta}), which implies that g(\mathbb{E}[y|\boldsymbol{x}]) = \boldsymbol{x}^{\top}\boldsymbol{\beta}. GLMs can be extended to the overdispersed exponential family (McCullagh and Nelder 1989).
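As a minimal illustration of the link-function relationship, the following sketch uses the stats helper make.link to map a linear predictor into a conditional mean and back (the value of \eta is arbitrary):

lk <- stats::make.link("log")  # provides linkfun (g) and linkinv (g^{-1})
eta <- 0.4                     # linear predictor x'beta
mu <- lk$linkinv(eta)          # E[y|x] = g^{-1}(eta) = exp(0.4)
lk$linkfun(mu)                 # recovers eta = 0.4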

As we know from Section 3.1, the Poisson distribution belongs to the exponential family, such that p(y|\lambda) = \frac{\exp(-\lambda)\exp(y\log(\lambda))}{y!}, or in the canonical form p(y|\eta) = \frac{\exp(\eta y - \exp(\eta))}{y!}, where \eta = \log(\lambda), which means that \boldsymbol{x}^{\top}\boldsymbol{\beta} = \log(\lambda). Consequently, since the mean equals the derivative of the log-partition function \exp(\eta), we have \mathbb{E}[y|\boldsymbol{x}] = \frac{d}{d\eta}\exp(\eta) = \exp(\eta) = \lambda = \exp(\boldsymbol{x}^{\top}\boldsymbol{\beta}). Therefore, the link function in the Poisson case is the log function. In Exercise 6, we ask you to show that the link function in the Bernoulli case is the logit function. Other examples include the identity function in the case of the Gaussian distribution and the negative inverse in the case of the gamma distribution.
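These links can be checked directly from R's family objects; note that R's Gamma family reports the inverse link, which differs from the canonical negative inverse only in sign:

poisson()$link   # "log"
binomial()$link  # "logit"
gaussian()$link  # "identity"
Gamma()$link     # "inverse"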

We can use the GLM framework to perform Bayesian model averaging (BMA) using the BIC approximation, following Raftery (1995). Specifically, the BIC of model m is \mathrm{BIC}_m = k_m \log(N) - 2 \log(p(\boldsymbol{y} | \hat{\boldsymbol{\theta}}_m)), where \hat{\boldsymbol{\theta}}_m is the maximum likelihood estimator and k_m is the number of parameters of model m. Thus, we simply need to evaluate the likelihood function at the maximum likelihood estimator.
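As a sketch of this idea in a hypothetical two-model comparison (not part of the simulation exercises below), \log p(\boldsymbol{y}|M_m) \approx -\mathrm{BIC}_m/2 up to a constant common to all models, so normalized \exp(-\mathrm{BIC}_m/2) weights give approximate PMPs:

set.seed(10)
x1 <- rnorm(100); x2 <- rnorm(100)
y <- rpois(100, exp(0.5 + 0.8*x1))      # DGP includes x1 only
m1 <- glm(y ~ x1, family = poisson); m2 <- glm(y ~ x1 + x2, family = poisson)
logml <- -c(BIC(m1), BIC(m2))/2         # BIC approximation to the log marginal likelihoods
w <- exp(logml - max(logml)); w/sum(w)  # approximate PMPs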

Example: Simulation exercises

Let’s perform some simulation exercises to assess the performance of the BIC approximation with Occam’s window in GLMs. There are 27 regressors, of which only x_{i1} and x_{i2} are relevant in all exercises, i=1,2,\dots,1000.

  1. Logit: x_k\sim N(0, 1), k =1,\dots,27, and p(y_i=1|\boldsymbol{x}_i)=\exp(0.5+0.8x_{i1}-1.2x_{i2})/(1+\exp(0.5+0.8x_{i1}-1.2x_{i2})).

  2. Gamma: x_k\sim N(0, 0.5^2), k =1,\dots,27, and y_i\sim G(\alpha_i,\delta), where the shape is \alpha_i=(0.5+0.2x_{i1}+0.1x_{i2})^{-1} and the scale is \delta=1, so that \mathbb{E}[y_i|\boldsymbol{x}_i]=\alpha_i (see the sanity check after this list).

  3. Poisson: x_k\sim N(0, 1), k =1,\dots,27, and \mathbb{E}[y_i|\boldsymbol{x}_i]=\lambda_i=\exp(2+1.1x_{i1}+0.7x_{i2}).
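Since the Gamma parameterization is easy to get wrong, the following quick Monte Carlo check (with illustrative values for x_{i1} and x_{i2}) verifies that, with shape \alpha_i and scale \delta=1, the conditional mean is \alpha_i\delta=(0.5+0.2x_{i1}+0.1x_{i2})^{-1}:

set.seed(10)
alphai <- (0.5 + 0.2*0.3 + 0.1*(-0.2))^(-1)              # e.g., x_i1 = 0.3, x_i2 = -0.2
c(mean(rgamma(1e6, shape = alphai, scale = 1)), alphai)  # the two values should agree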

Our GUI uses the command bic.glm from the BMA package to perform BMA using the BIC approximation with Occam’s window in GLMs. The next algorithm shows how to do this in our GUI, and the code below shows how to perform BMA in logit models using the simulation setting.

Algorithm: Bayesian Model Averaging in Generalized Linear Models using BIC

  1. Select Bayesian Model Averaging on the top panel

  2. Select the Generalized Linear Model using the left radio button. Options:

    • Binomial data (Logit)
    • Real positive data (Gamma)
    • Count data (Poisson)
  3. Upload the dataset, first selecting whether the file has a header and the type of separator in the CSV file (comma, semicolon, or tab). Then, use the Browse button under the Choose File legend

  4. Type the OR value of Occam’s window in the box under OR: Number between 5 and 50. This step is optional, as the default value is 50

  5. Type the OL value of Occam’s window in the box under OL: Number between 0.0001 and 1. This step is optional, as the default value is 0.0025

  6. Click the Go! button

  7. Analyze results: After a few seconds or minutes, a table appears showing, for each regressor in the dataset:

    • The posterior inclusion probability (p!=0)
    • The BMA posterior mean (EV)
    • The BMA standard deviation (SD)
    • The coefficient estimates and PMPs of the most relevant models (highest PMPs)
  8. Download posterior results using the Download results using BIC button. Two files are provided:

    • The first file contains the best models by row according to the PMP (last column), indicating variable inclusion with 1 (0 indicates no inclusion)
    • The second file contains the PIP, the BMA expected value, and the standard deviation for each variable in the dataset

The results show that the PIPs of x_{i1} and x_{i2} are equal to 1 in all three settings, the data-generating process gets the highest PMP, and the BMA posterior means are close to the population values in each simulation setting. The other variables get PIPs close to 0, with a few exceptions, and their BMA posterior means are also close to 0. This suggests that the BIC approximation does a good job of finding the data-generating process in generalized linear models.

We can take advantage of the glm function in R to perform BMA by programming an MC3 algorithm. The following code illustrates how to do this in the Poisson simulation. First, we simulate the data; second, we define a function to compute the log marginal likelihood approximation using the results from the glm function. Then, we initialize the models to begin the MC3 algorithm. After that, we implement the MC3 algorithm, which involves small modifications of the code used for MC3 in Gaussian linear models. We can calculate the posterior model probabilities (PMPs), posterior inclusion probabilities (PIPs), BMA means, and standard deviations as we did previously.
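In particular, since the proposal distribution is approximately symmetric, the Metropolis-Hastings acceptance probability in the code reduces to a ratio of BIC-approximated marginal likelihoods, \alpha(M^{(s)}, M^{c}) = \min\left\{1, \exp\left(\log p(\boldsymbol{y}|M^{c}) - \log p(\boldsymbol{y}|M^{(s)})\right)\right\}, where \log p(\boldsymbol{y}|M) \approx -\mathrm{BIC}_M/2 and the candidate M^{c} is obtained from the current model M^{(s)} by keeping, adding, or removing one regressor.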

The simulation setting involves 2^{27} models, that is, approximately 134 million models in the model space. We run our MC3 algorithm using the BIC approximation with 50,000 iterations. This takes considerably more time than the BIC approximation from the BMA package, but it performs well in identifying the data-generating process, as the PMP of this model equals 1. The posterior inclusion probabilities (PIPs) of x_{i1} and x_{i2} are also 1, and the posterior means are 1.1 and 0.7, respectively, equal to the population values. The t-ratios are far greater than 2. However, running 50,000 iterations results in mass concentration on one model, in this case the data-generating process. If we run 25,000 MC3 iterations instead, the highest PMP is 0.8, but it is not associated with the data-generating process. Nonetheless, the PIP is equal to 1 for x_{i1} and x_{i2}, and some other regressors also have high PIPs. The BMA means of x_{i1} and x_{i2} are equal to the population values, and the BMA means of the other regressors are equal to 0. The t-ratios of the regressors in the population statistical model are much greater than 2, whereas the t-ratios of the other regressors are equal to 0. This exercise demonstrates that 25,000 iterations were not sufficient to uncover the data-generating process. However, it also emphasizes an important point: we should analyze all the relevant results of the BMA analysis, not just the PMPs and/or PIPs.

In Exercise 10, we ask you to use this approach to perform a BMA algorithm in the logit regression, using the simulation setting for logit models from this section.

### Logit ###
rm(list = ls()); set.seed(010101)
n <- 1000; B <- c(0.5, 0.8, -1.2)  # population coefficients
X <- matrix(cbind(rep(1, n), rnorm(n, 0, 1), rnorm(n, 0, 1)), n, length(B))
p <- exp(X%*%B)/(1 + exp(X%*%B)); y <- rbinom(n, 1, p)  # logit DGP
nXgar <- 25; Xgar <- matrix(rnorm(nXgar*n), n, nXgar)   # 25 irrelevant regressors
df <- as.data.frame(cbind(y, X[,-1], Xgar))
colnames(df) <- c("y", paste0("x", 1:27))
BMAglmLogit <- BMA::bic.glm(y ~ x1+x2+x3+x4+x5+x6+x7+x8+x9+x10+x11+x12+x13+x14+x15+x16+x17+x18+x19+x20+x21+x22+x23+x24+x25+x26+x27, data = df, glm.family = binomial(link="logit"), strict = FALSE, OR = 50)
summary(BMAglmLogit)
## 
## Call:
## bic.glm.formula(f = y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 +     x9 + x10 + x11 + x12 + x13 + x14 + x15 + x16 + x17 + x18 +     x19 + x20 + x21 + x22 + x23 + x24 + x25 + x26 + x27, data = df,     glm.family = binomial(link = "logit"), strict = FALSE, OR = 50)
## 
## 
##   40  models were selected
##  Best  5  models (cumulative posterior probability =  0.6196 ): 
## 
##            p!=0    EV         SD        model 1     model 2     model 3   
## Intercept  100     4.231e-01  0.076191      0.4212      0.4182      0.4296
## x1.x       100.0   8.512e-01  0.084649      0.8467      0.8541      0.8549
## x2.x       100.0  -1.105e+00  0.089964     -1.1029     -1.1079     -1.1023
## x3.x         2.9   2.382e-03  0.018557       .           .           .    
## x4.x        24.5  -4.203e-02  0.082651       .           .         -0.1699
## x5.x         0.8   1.304e-04  0.006885       .           .           .    
## x6.x         1.8   1.240e-03  0.013934       .           .           .    
## x7.x        14.8   2.273e-02  0.061659       .           .           .    
## x8.x         0.8  -6.231e-05  0.006677       .           .           .    
## x9.x         0.9  -2.308e-04  0.007548       .           .           .    
## x10.x        0.9  -3.683e-04  0.008215       .           .           .    
## x11.x        1.1  -6.107e-04  0.009791       .           .           .    
## x12.x        0.8  -1.680e-05  0.006267       .           .           .    
## x13.x        1.1  -6.397e-04  0.009839       .           .           .    
## x14.x        0.8   1.517e-04  0.007144       .           .           .    
## x15.x        0.9  -1.923e-04  0.007157       .           .           .    
## x16.x        1.0   4.411e-04  0.008425       .           .           .    
## x17.x        1.8   1.298e-03  0.013796       .           .           .    
## x18.x        0.8   4.661e-05  0.006742       .           .           .    
## x19.x        1.0   4.311e-04  0.008746       .           .           .    
## x20.x        0.8  -1.571e-04  0.007049       .           .           .    
## x21.x        7.5   8.922e-03  0.037260       .           .           .    
## x22.x        0.8   4.378e-05  0.006687       .           .           .    
## x23.x       24.3  -4.202e-02  0.083002       .         -0.1718       .    
## x24.x        0.8  -1.384e-04  0.007200       .           .           .    
## x25.x        0.8  -1.363e-05  0.006489       .           .           .    
## x26.x        1.8   1.336e-03  0.014274       .           .           .    
## x27.x        1.9   1.354e-03  0.014298       .           .           .    
##                                                                           
## nVar                                          2           3           3   
## BIC                                     -5812.0652  -5810.3862  -5810.3482
## post prob                                   0.260       0.112       0.110 
##            model 4     model 5   
## Intercept      0.4221      0.4271
## x1.x           0.8481      0.8627
## x2.x          -1.1016     -1.1087
## x3.x            .           .    
## x4.x            .         -0.1784
## x5.x            .           .    
## x6.x            .           .    
## x7.x           0.1575       .    
## x8.x            .           .    
## x9.x            .           .    
## x10.x           .           .    
## x11.x           .           .    
## x12.x           .           .    
## x13.x           .           .    
## x14.x           .           .    
## x15.x           .           .    
## x16.x           .           .    
## x17.x           .           .    
## x18.x           .           .    
## x19.x           .           .    
## x20.x           .           .    
## x21.x           .           .    
## x22.x           .           .    
## x23.x           .         -0.1801
## x24.x           .           .    
## x25.x           .           .    
## x26.x           .           .    
## x27.x           .           .    
##                                  
## nVar             3           4   
## BIC        -5809.6482  -5809.1451
## post prob      0.078       0.060 
## 
##   1  observations deleted due to missingness.
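# The quantities reported by summary() are also stored in the bic.glm object;
# the field names below follow the BMA package documentation (inspect
# str(BMAglmLogit) in case of doubt)
PIPlogit <- BMAglmLogit$probne0   # posterior inclusion probabilities (in %)
EVlogit <- BMAglmLogit$postmean   # BMA posterior means
SDlogit <- BMAglmLogit$postsd     # BMA posterior standard deviations
PMPlogit <- BMAglmLogit$postprob  # PMPs of the selected models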
### Gamma ###
rm(list = ls()); set.seed(010101)
n <- 1000; B <- c(0.5, 0.2, 0.1)  # population coefficients
X <- matrix(cbind(rep(1, n), rnorm(n, 0, 0.5), rnorm(n, 0, 0.5)), n, length(B))
y1 <- (X%*%B)^(-1)                # shape alpha_i = (x_i'B)^(-1)
y <- rgamma(n, y1, scale = 1)     # gamma DGP with scale delta = 1
nXgar <- 25; Xgar <- matrix(rnorm(nXgar*n), n, nXgar)  # 25 irrelevant regressors
df <- as.data.frame(cbind(y, X[,-1], Xgar))
colnames(df) <- c("y", paste0("x", 1:27))
BMAglmGamma <- BMA::bic.glm(y ~ x1+x2+x3+x4+x5+x6+x7+x8+x9+x10+x11+x12+x13+x14+x15+x16+x17+x18+x19+x20+x21+x22+x23+x24+x25+x26+x27, data = df, glm.family = Gamma(link="inverse"), strict = FALSE, OR = 50)
summary(BMAglmGamma)
## 
## Call:
## bic.glm.formula(f = y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 +     x9 + x10 + x11 + x12 + x13 + x14 + x15 + x16 + x17 + x18 +     x19 + x20 + x21 + x22 + x23 + x24 + x25 + x26 + x27, data = df,     glm.family = Gamma(link = "inverse"), strict = FALSE, OR = 50)
## 
## 
##   36  models were selected
##  Best  5  models (cumulative posterior probability =  0.6264 ): 
## 
##            p!=0    EV         SD         model 1     model 2     model 3   
## Intercept  100     4.890e-01  0.0117343   4.885e-01   4.895e-01   4.902e-01
## x1.x       100.0   2.026e-01  0.0187327   2.021e-01   2.033e-01   2.042e-01
## x2.x        99.2   7.552e-02  0.0210472   7.577e-02   7.595e-02   7.820e-02
## x3.x         1.1   3.661e-05  0.0011044       .           .           .    
## x4.x         1.2   7.240e-05  0.0012692       .           .           .    
## x5.x         1.0   1.013e-06  0.0010648       .           .           .    
## x6.x         1.0  -7.413e-06  0.0009672       .           .           .    
## x7.x         2.1   1.787e-04  0.0019443       .           .           .    
## x8.x         2.7   3.187e-04  0.0025652       .           .           .    
## x9.x         2.8   3.287e-04  0.0026193       .           .           .    
## x10.x        2.5  -2.618e-04  0.0023160       .           .           .    
## x11.x        2.2  -1.909e-04  0.0018748       .           .           .    
## x12.x        1.3  -1.021e-04  0.0014787       .           .           .    
## x13.x        1.1   5.812e-05  0.0012447       .           .           .    
## x14.x        1.2  -8.201e-05  0.0013340       .           .           .    
## x15.x        1.1   5.983e-05  0.0012278       .           .           .    
## x16.x        1.0   4.335e-06  0.0010045       .           .           .    
## x17.x        1.0   1.127e-05  0.0010351       .           .           .    
## x18.x        1.0   1.195e-05  0.0010612       .           .           .    
## x19.x        1.2  -8.152e-05  0.0013225       .           .           .    
## x20.x       25.2   5.858e-03  0.0112689       .       2.329e-02       .    
## x21.x       11.5  -2.200e-03  0.0069431       .           .      -1.938e-02
## x22.x        1.1  -3.714e-05  0.0011170       .           .           .    
## x23.x       10.4   1.998e-03  0.0067118       .           .           .    
## x24.x        1.0   8.960e-06  0.0009762       .           .           .    
## x25.x        1.5   1.462e-04  0.0017640       .           .           .    
## x26.x        2.1  -1.856e-04  0.0019576       .           .           .    
## x27.x        1.0   5.144e-06  0.0010106       .           .           .    
##                                                                            
## nVar                                        2           3           3      
## BIC                                      -5.849e+03  -5.848e+03  -5.846e+03
## post prob                                 0.316       0.149       0.072    
##            model 4     model 5   
## Intercept   4.893e-01   4.903e-01
## x1.x        2.019e-01   2.027e-01
## x2.x        7.641e-02   7.639e-02
## x3.x            .           .    
## x4.x            .           .    
## x5.x            .           .    
## x6.x            .           .    
## x7.x            .           .    
## x8.x            .           .    
## x9.x            .           .    
## x10.x           .           .    
## x11.x           .           .    
## x12.x           .           .    
## x13.x           .           .    
## x14.x           .           .    
## x15.x           .           .    
## x16.x           .           .    
## x17.x           .           .    
## x18.x           .           .    
## x19.x           .           .    
## x20.x           .       2.364e-02
## x21.x           .           .    
## x22.x           .           .    
## x23.x       1.914e-02   1.948e-02
## x24.x           .           .    
## x25.x           .           .    
## x26.x           .           .    
## x27.x           .           .    
##                                  
## nVar          3           4      
## BIC        -5.846e+03  -5.845e+03
## post prob   0.060       0.030    
## 
##   1  observations deleted due to missingness.
### Poisson ###
rm(list = ls()); set.seed(010101)
n <- 1000; B <- c(2, 1.1, 0.7)       # population coefficients
X <- matrix(cbind(rep(1, n), rnorm(n, 0, 1), rnorm(n, 0, 1)), n, length(B))
y1 <- exp(X%*%B); y <- rpois(n, y1)  # Poisson DGP
nXgar <- 25; Xgar <- matrix(rnorm(nXgar*n), n, nXgar)  # 25 irrelevant regressors
df <- as.data.frame(cbind(y, X[,-1], Xgar))
colnames(df) <- c("y", paste0("x", 1:27))
BMAglmPoisson <- BMA::bic.glm(y ~ x1+x2+x3+x4+x5+x6+x7+x8+x9+x10+x11+x12+x13+x14+x15+x16+x17+x18+x19+x20+x21+x22+x23+x24+x25+x26+x27, data = df, glm.family = poisson(link="log"), strict = FALSE, OR = 50)
summary(BMAglmPoisson)
## 
## Call:
## bic.glm.formula(f = y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 +     x9 + x10 + x11 + x12 + x13 + x14 + x15 + x16 + x17 + x18 +     x19 + x20 + x21 + x22 + x23 + x24 + x25 + x26 + x27, data = df,     glm.family = poisson(link = "log"), strict = FALSE, OR = 50)
## 
## 
##   26  models were selected
##  Best  5  models (cumulative posterior probability =  0.6228 ): 
## 
##            p!=0    EV         SD         model 1     model 2     model 3   
## Intercept  100     2.004e+00  0.0120917   2.004e+00   2.002e+00   2.003e+00
## x1.x       100.0   1.093e+00  0.0071406   1.093e+00   1.096e+00   1.095e+00
## x2.x       100.0   7.020e-01  0.0076175   7.020e-01   7.023e-01   7.011e-01
## x3.x         1.9  -1.202e-04  0.0013752       .           .           .    
## x4.x         1.4  -2.783e-05  0.0009363       .           .           .    
## x5.x         5.8   7.628e-04  0.0036096       .           .       1.326e-02
## x6.x         1.9  -1.311e-04  0.0014441       .           .           .    
## x7.x         1.5   5.797e-05  0.0010305       .           .           .    
## x8.x         1.4  -2.484e-05  0.0009678       .           .           .    
## x9.x         1.6  -7.084e-05  0.0011074       .           .           .    
## x10.x        3.7  -4.232e-04  0.0026514       .           .           .    
## x11.x        1.4  -3.190e-05  0.0010090       .           .           .    
## x12.x        1.5   4.617e-05  0.0009407       .           .           .    
## x13.x        1.4   2.822e-05  0.0009560       .           .           .    
## x14.x        1.4   2.029e-05  0.0010251       .           .           .    
## x15.x        2.1  -1.445e-04  0.0014510       .           .           .    
## x16.x        1.6   7.135e-05  0.0011593       .           .           .    
## x17.x        1.5  -5.180e-05  0.0010208       .           .           .    
## x18.x        1.6  -8.503e-05  0.0012462       .           .           .    
## x19.x        3.2  -3.442e-04  0.0023704       .           .           .    
## x20.x        5.8  -7.149e-04  0.0033497       .      -1.223e-02       .    
## x21.x        1.4  -2.393e-05  0.0009079       .           .           .    
## x22.x        2.1   1.578e-04  0.0015657       .           .           .    
## x23.x        1.7   9.252e-05  0.0012586       .           .           .    
## x24.x        1.5  -4.766e-05  0.0010436       .           .           .    
## x25.x        2.1  -1.453e-04  0.0014830       .           .           .    
## x26.x        3.4  -3.780e-04  0.0025203       .           .           .    
## x27.x        4.1  -4.169e-04  0.0024626       .           .           .    
##                                                                            
## nVar                                        2           3           3      
## BIC                                      -5.798e+03  -5.794e+03  -5.794e+03
## post prob                                 0.429       0.058       0.058    
##            model 4     model 5   
## Intercept   2.004e+00   2.004e+00
## x1.x        1.094e+00   1.093e+00
## x2.x        7.024e-01   7.024e-01
## x3.x            .           .    
## x4.x            .           .    
## x5.x            .           .    
## x6.x            .           .    
## x7.x            .           .    
## x8.x            .           .    
## x9.x            .           .    
## x10.x           .      -1.139e-02
## x11.x           .           .    
## x12.x           .           .    
## x13.x           .           .    
## x14.x           .           .    
## x15.x           .           .    
## x16.x           .           .    
## x17.x           .           .    
## x18.x           .           .    
## x19.x           .           .    
## x20.x           .           .    
## x21.x           .           .    
## x22.x           .           .    
## x23.x           .           .    
## x24.x           .           .    
## x25.x           .           .    
## x26.x           .           .    
## x27.x      -1.027e-02       .    
##                                  
## nVar          3           3      
## BIC        -5.793e+03  -5.793e+03
## post prob   0.041       0.037    
## 
##   1  observations deleted due to missingness.
### MC3 algorithm: Poisson ###
rm(list = ls()); set.seed(010101)
n <- 1000; B <- c(2, 1.1, 0.7)       # population coefficients
X <- matrix(cbind(rep(1, n), rnorm(n, 0, 1), rnorm(n, 0, 1)), n, length(B))
y1 <- exp(X%*%B); y <- rpois(n, y1)  # Poisson DGP
nXgar <- 25; Xgar <- matrix(rnorm(nXgar*n), n, nXgar)  # 25 irrelevant regressors
df <- as.data.frame(cbind(y, X[,-1], Xgar))
colnames(df) <- c("y", paste0("x", 1:27))
Xnew <- apply(df[,-1], 2, scale)     # standardized regressors for the model search
BICfunt <- function(Model){
    indr <- Model == 1; kr <- sum(indr)  # included regressors
    if(kr > 0){
        Xr <- as.matrix(Xnew[, indr])
        model <- glm(y ~ Xr, family = poisson(link = "log"))
    }else{
        model <- glm(y ~ 1, family = poisson(link = "log"))  # null model
    }
    mllMod <- -BIC(model)/2  # BIC approximation to the log marginal likelihood
    return(mllMod)
}
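# Illustrative sanity check (not required by the algorithm): the approximation
# should favor the data-generating process (x1, x2) over the null model
BICfunt(matrix(c(1, 1, rep(0, 25)), 1, 27))  # model with x1 and x2
BICfunt(matrix(rep(0, 27), 1, 27))           # null model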
M <- 500; K <- dim(df)[2] - 1  # pool size and number of regressors
Models <- matrix(rbinom(K*M, 1, p = 0.5), ncol = K, nrow = M)  # random initial models
mllnew <- sapply(1:M, function(s){BICfunt(matrix(Models[s,], 1, K))})  # their log marginal likelihoods
oind <- order(mllnew, decreasing = TRUE)  # keep the pool sorted from best to worst
mllnew <- mllnew[oind]; Models <- Models[oind, ]
# Hyperparameters MC3
iter <- 25000  # number of MC3 iterations
# winProgressBar is Windows-only; use txtProgressBar on other platforms
pb <- winProgressBar(title = "progress bar", min = 0, max = iter, width = 300)
s <- 1
while(s <= iter){
    ActModel <- Models[M, ]  # current state: worst model in the sorted pool
    idK <- which(ActModel == 1)
    Kact <- length(idK)
    if(Kact < K & Kact > 1){
        CardMol <- K  # cardinality of the proposal neighborhood (not used below)
        opt <- sample(1:3, 1)
        if(opt == 1){ # Same
            CandModel <- ActModel
        }else{
            if(opt == 2){ # Add
                All <- 1:K
                NewX <- sample(All[-idK], 1)
                CandModel <- ActModel
                CandModel[NewX] <- 1
            }else{ # Subtract
                LessX <- sample(idK, 1)
                CandModel <- ActModel
                CandModel[LessX] <- 0
            }
        }
    }else{
        CardMol <- K + 1
        if(Kact == K){ # Full model: stay or subtract
            opt <- sample(1:2, 1)
            if(opt == 1){ # Same
                CandModel <- ActModel
            }else{ # Subtract
                LessX <- sample(1:K, 1)
                CandModel <- ActModel
                CandModel[LessX] <- 0
            }
        }else{
            if(Kact == 1){ # Exactly one regressor included
                opt <- sample(1:3, 1)
                if(opt == 1){ # Same
                    CandModel <- ActModel
                }else{
                    if(opt == 2){ # Add
                        All <- 1:K
                        NewX <- sample(All[-idK], 1)
                        CandModel <- ActModel
                        CandModel[NewX] <- 1
                    }else{ # Subtract
                        LessX <- sample(idK, 1)
                        CandModel <- ActModel
                        CandModel[LessX] <- 0
                    }
                }
            }else{ # Empty model: add a regressor
                NewX <- sample(1:K, 1)
                CandModel <- ActModel
                CandModel[NewX] <- 1
            }
        }
    }
    LogMLact <- BICfunt(matrix(ActModel, 1, K))
    LogMLcand <- BICfunt(matrix(CandModel, 1, K))
    alpha <- min(1, exp(LogMLcand - LogMLact))  # Metropolis-Hastings acceptance probability
    u <- runif(1)
    if(u <= alpha){  # Accept: replace the worst model and re-sort the pool
        mllnew[M] <- LogMLcand
        Models[M, ] <- CandModel
        oind <- order(mllnew, decreasing = TRUE)
        mllnew <- mllnew[oind]
        Models <- Models[oind, ]
    }
    s <- s + 1
    setWinProgressBar(pb, s, title = paste(round(s/iter*100, 0), "% done"))
}
close(pb)
## NULL
ModelsUni <- unique(Models)  # unique models retained in the pool
mllnewUni <- sapply(1:dim(ModelsUni)[1], function(s){BICfunt(matrix(ModelsUni[s,], 1, K))})
StMarLik <- exp(mllnewUni - mllnewUni[1])  # marginal likelihoods standardized by the best model
PMP <- StMarLik/sum(StMarLik)  # PMP based on unique selected models
plot(PMP)

ModelsUni[1,]
##  [1] 1 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0
PIP <- NULL
for(k in 1:K){  # PIP of regressor k: sum of PMPs of the models that include it
    PIPk <- sum(PMP[which(ModelsUni[,k] == 1)])
    PIP <- c(PIP, PIPk)
}
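# Equivalent vectorized computation: PMP recycles down the columns of ModelsUni
PIPvec <- colSums(ModelsUni*PMP)
all.equal(PIP, PIPvec)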
plot(PIP)

Xnew <- df[,-1]  # original (unscaled) regressors for the coefficient estimates
nModels <- dim(ModelsUni)[1]
Means <- matrix(0, nModels, K)  # model-specific posterior means
Vars <- matrix(0, nModels, K)   # model-specific posterior variances
for(m in 1:nModels){
    idXs <- which(ModelsUni[m,] == 1)
    if(length(idXs) == 0){
        Regm <- glm(y ~ 1, family = poisson(link = "log"))  # null model
    }else{
        Xm <- as.matrix(Xnew[, idXs])
        Regm <- glm(y ~ Xm, family = poisson(link = "log"))
        SumRegm <- summary(Regm)
        Means[m, idXs] <- SumRegm[["coefficients"]][-1,1]   # point estimates
        Vars[m, idXs] <- SumRegm[["coefficients"]][-1,2]^2  # squared standard errors
    }
}
BMAmeans <- colSums(Means*PMP)  # BMA posterior means
BMAsd <- (colSums(PMP*Vars) + colSums(PMP*(Means-matrix(rep(BMAmeans, each = nModels), nModels, K))^2))^0.5  # BMA posterior standard deviations
plot(BMAmeans)

plot(BMAsd)

plot(BMAmeans/BMAsd)
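The BMA standard deviations computed above implement the law of total variance across models: \mathrm{Var}[\beta_k|\boldsymbol{y}] = \sum_{m} p(M_m|\boldsymbol{y})\,\mathrm{Var}[\beta_k|\boldsymbol{y}, M_m] + \sum_{m} p(M_m|\boldsymbol{y})\left(\mathbb{E}[\beta_k|\boldsymbol{y}, M_m] - \mathbb{E}[\beta_k|\boldsymbol{y}]\right)^2. Hence, the t-ratios \mathbb{E}[\beta_k|\boldsymbol{y}]/\mathrm{sd}[\beta_k|\boldsymbol{y}] plotted in the last figure account for both within-model and between-model uncertainty.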

References

McCullagh, P., and J. A. Nelder. 1989. Generalized Linear Models. London: Chapman & Hall.
Nelder, J. A., and R. W. M. Wedderburn. 1972. “Generalized Linear Models.” Journal of the Royal Statistical Society: Series A (General) 135 (3): 370–84.
Raftery, A. 1995. “Bayesian Model Selection in Social Research.” Sociological Methodology 25: 111–63.