Chapter 7 Network Analysis
In this chapter, we will cover concepts and procedures related to network analysis in R. “Networks enable the visualization of complex, multidimensional data as well as provide diverse statistical indices for interpreting the resultant graphs” (Jones et al., 2018). Put otherwise, network analysis is a collection of techniques that visualize and estimate relationships among agents in a social context. Furthermore, network analysis is used “to analyze the social structures that emerge from the recurrence of these relations” where “[the] basic assumption is that better explanations of social phenomena are yielded by analysis of the relations among entities” (Science Direct; Linked Below).
Networks are made up of nodes (i.e., individual actors, people, or things within the network) and the ties, edges, or links (i.e., relationships or interactions) that connect them. The extent to which nodes are connected informs interpretations of the measured social context.
“By comparison with most other branches of quantitative social science, network analysts have given limited attention to statistical issues. Most techniques and measures examine the structure of specific data sets without addressing sampling variation, measurement error, or other uncertainties. Such issues are complex because of the dependencies inherent in network data, but they are now receiving increased study. The most widely investigated approach to the statistical analysis of networks stresses the detection of formal regularities in local relational structure.
The figure above illustrates some of the relational structures commonly found in analyses of social networks.
A: Demonstrates a relationship of reciprocity/mutuality.
B: Demonstrates a directed relationship with a common target.
C: Relationships emerge from a common source.
D: Transitive direct relationships with indirect influences.
Another type is homophily, which is present, for example, when same-sex friendships are more common than between-sex friendships. This involves an interaction between a property of units and the presence of relationships” (Peter V. Marsden, in Encyclopedia of Social Measurement, 2005). This sort of model might reflect the tendency of people to seek out those that are similar to themselves.
7.0.0.1 Measures of Centrality
Measures of centrality provide quantitative context regarding the importance of a node within a network. There are four measures of centrality that we will cover.
Degree Centrality: The degree of a node is the number of other nodes to which it is connected. Important nodes tend to have more connections to other nodes. Highly connected nodes are interpreted to have high degree centrality.
Eigenvector Centrality: The extent to which a node’s neighbors are themselves well connected also indicates importance (i.e., important nodes increase the importance of the nodes they are connected to).
Closeness Centrality: Closeness centrality measures how many steps are required to access every other node from a given node. In other words, important nodes have easy access to other nodes given multiple connections.
Betweenness Centrality: This ranks the nodes based on the flow of connections through the network, that is, how often a node lies on the shortest paths between other nodes. Nodes with high levels of betweenness tend to serve as a bridge for multiple sets of other important nodes.
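As a rough illustration of how the four measures differ, the sketch below computes each one on a small made-up network using igraph. The toy graph and every object name beginning with toy are assumptions for illustration only and are not part of the chapter’s data.

```r
library(igraph)

# Hypothetical toy network (not the karate club data):
# node 1 is a hub tied to nodes 2, 3, and 4; node 4 is also tied to node 5
toy <- make_graph(edges = c(1, 2, 1, 3, 1, 4, 4, 5), directed = FALSE)

# Degree: number of ties per node
toy_degree <- degree(toy)

# Eigenvector: ties to well-connected nodes count for more
toy_eigen <- eigen_centrality(toy)$vector

# Closeness: inverse of the summed shortest-path distances to all other nodes
toy_closeness <- closeness(toy)

# Betweenness: how often a node lies on shortest paths between other nodes
toy_between <- betweenness(toy)

# Collect the four measures side by side
data.frame(node = 1:5,
           degree = toy_degree,
           eigen = round(toy_eigen, 2),
           closeness = round(toy_closeness, 3),
           betweenness = toy_between)
```

In this toy network node 1 has the highest degree, while node 4 earns a nonzero betweenness score because it is the only route to node 5.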
See this link for a set of journals and books that cover the topic.
Also, examine this (paid) online tool for text-based network analysis: https://www.infranodus.com
7.1 Zachary’s Karate Club Case Study
We will be working with a dataset called Zachary’s Karate Club, a seminal dataset in the network analysis literature. First we need to install the relevant packages. Today we will need a package called igraph, a package useful for creating, analyzing, and visualizing networks. If you do not have the packages already, install the tidyverse, igraph, ggnetwork, and intergraph. igraph helps us perform network analysis. ggnetwork and intergraph are both packages used for plotting networks in the ggplot framework.
# Load the libraries
library(tidyverse)
library(igraph)
library(ggnetwork)
library(intergraph)
Zachary’s Karate Club Background
Taken from Wikipedia: “A social network of a karate club was studied by Wayne W. Zachary for a period of three years from 1970 to 1972. The network captures 34 members of a karate club, documenting pairwise links between members who interacted outside the club. During the study a conflict arose between the administrator “John A” and instructor “Mr. Hi” (pseudonyms), which led to the split of the club into two. Half of the members formed a new club around Mr. Hi; members from the other group found a new instructor or gave up karate. Based on network analysis Zachary correctly predicted each member’s decision except member #9, who went with Mr. Hi instead of John A.” In this case study, we will try to infer the group split with network analysis techniques.
7.1.0.1 Load Data and Extract Model Features
Now it’s time to extract the relevant information that we need from the dataset. We need the associations between members (edges), the groupings after the split of the network, and the labels of the nodes.
# Load and view the data
members <- read_csv("Zacharies_Karate_Club.csv")
edges <- read_csv("Zacharies_Karate_Club_edges.csv")
# Extract information for nodes
nodes <- members$node
# Extract information on edges
edges <- as.vector(rbind(edges$From, edges$To))
Extract the groups and labels of the vertices and store them in vectors (the code later in this chapter refers to them as groups and people). Make sure that the labels are stored as characters and not factors, which you can check with the “str()” function, as igraph requires character data to cast labels.
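One way to approach this step is sketched below on a small stand-in data frame. The column names group and label, and every object ending in _demo, are assumptions; check the column names in your own copy of the data with names(members) before adapting it.

```r
# Hypothetical stand-in for the `members` data; the real file's columns may differ
members_demo <- data.frame(
  node  = 1:4,
  group = c(1, 1, 2, 2),
  label = factor(c("H", "2", "3", "A"))  # stored as a factor on purpose
)

# Group membership of each vertex
groups_demo <- members_demo$group

# Vertex labels, converted from factor to character
people_demo <- as.character(members_demo$label)

# Confirm that the labels are character data
str(people_demo)
```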
7.1.0.2 Creating Networks From Data
Now that we have extracted the relevant data that we need, let’s construct a network of Zachary’s Karate club.
# Create our network
# Note that this will automatically enumerate the nodes (e.g., 1, 2, 3, ...)
G <- make_empty_graph(n = length(nodes), directed = FALSE) %>%
  add_edges(edges)
We can also create vertex attributes. Let’s make a vertex attribute for each group (Mr. Hi and John A).
# Create a vertex attribute
G <- G %>%
  igraph::set.vertex.attribute('group', index = V(G), value = groups)
Create a vertex attribute for node label. Call the attribute ‘label’.
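The construction steps above can be tried out on a small made-up network before applying them to the karate club data. In this sketch the edge vector, group codes, and labels are all hypothetical, and set_vertex_attr() is used as the current igraph name for set.vertex.attribute().

```r
library(igraph)

# Hypothetical edge list as a flat vector of node pairs: (1,2), (2,3), (3,1), (3,4)
toy_edges <- c(1, 2, 2, 3, 3, 1, 3, 4)

# Start from an empty undirected graph with four nodes, then add the edges
toy <- make_empty_graph(n = 4, directed = FALSE)
toy <- add_edges(toy, toy_edges)

# Attach one value per vertex for each attribute
toy <- set_vertex_attr(toy, "group", value = c(1, 1, 2, 2))
toy <- set_vertex_attr(toy, "label", value = c("H", "2", "3", "A"))

# Inspect the result
vcount(toy)
ecount(toy)
vertex_attr(toy, "label")
```

Because the attribute is named label, plot(toy) would use these values as the node labels.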
7.1.0.3 Visualizing Networks with baseR
Now visualize the network by running the plot function on our network ‘G’.
# Plot igraph object
plot(G)
Let’s change some of the plot aesthetics. We can change the vertex colors, edge colors, vertex sizes, etc. Play around with the arguments for plotting a network.
# Edit baseR plot aesthetics
plot(G, vertex.color="green", # Changes node color
edge.color = 'black', # Changes edge color
vertex.size = 10, # Changes vertex size
vertex.shape = 'circle', # Changes vertex shape
asp = 0, # Spread out nodes
layout = layout_in_circle)# Format nodes in a circle
We can also change the color of our vertices according to group.
plot(G, vertex.color = groups, # Changes node color
edge.color = 'black', # Changes edge color
vertex.size = 10, # Changes vertex size
vertex.shape = 'square', # Changes vertex shape
asp = 0) # Spread out nodes
7.1.0.4 Visualizing Networks with ggnetwork
You can also use ggplot to visualize igraph objects.
# Plot igraph object with ggplot
ggplot(G, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges() +
geom_nodes()
Let’s see if we can make the ggplot version look better.
# Plot igraph object with ggplot
ggplot(G, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges(color = 'grey', size = 1, linetype = 'dashed') + # Alter edge attributes
geom_nodes(size = 10, color = 'red', shape = 'square') + # Alter node attributes
geom_nodetext(label = people, fontface = "bold") + # Add text to nodes
theme_blank() # Remove grid
Using ggnetwork and ggplot, color or shape the nodes by karate group. Also make some other plot aesthetic changes to your liking.
7.1.0.5 Measuring Centrality
# Compute the degree centrality for our graph G
degr_cent <- centr_degree(G, mode = 'all')
degr_cent <- degr_cent$res
# Compute the eigenvector centrality of our network
eign_cent <- eigen_centrality(G)
eign_cent <- eign_cent$vector
# Compute the closeness centrality
clos_cent <- igraph::closeness(G)
# Compute betweenness centrality
betw_cent <- igraph::betweenness(G)
Finally, let’s put all of the centrality measures in one table so that we can compare the outputs.
# Create data frame storing all of the measures of centrality
data <- data.frame(vertex = nodes,
                   label = people,
                   degree = degr_cent,
                   eigen = eign_cent,
                   closeness = clos_cent,
                   betweeness = betw_cent)
# Order the data by degree centrality
data <- data %>% arrange(desc(degree))
# View the head of the data frame
head(data)
## vertex label degree eigen closeness betweeness
## 1 34 A 17 1.0000000 0.01666667 160.551587
## 2 1 H 16 0.9521324 0.01724138 231.071429
## 3 33 33 12 0.8266589 0.01562500 76.690476
## 4 3 3 10 0.8495542 0.01694915 75.850794
## 5 2 2 9 0.7123351 0.01470588 28.478571
## 6 4 4 6 0.5656143 0.01408451 6.288095
It makes sense that the most connected members of the network are indeed John A. and Mr. Hi. We can view the centrality measures from the perspective of the graph. Here, we add the object degr_cent to the vertex size to display the nodes via their degree centrality using baseR.
# Plot ZKC with igraph
plot(G, # Plot igraph object
vertex.color = groups, # Change vertex colors
edge.color = 'black', # Change edge color
vertex.size = 10+degr_cent, # Change node size
vertex.shape = 'circle', # Specify node shape
asp = 0, # Spreads out nodes
layout=layout_with_lgl) # Specify layout
tkplot(G,
vertex.color = groups,
edge.color = 'black',
vertex.label.color = 'white',
vertex.size = 10+degr_cent)
## [1] 1
Now, using the tidyverse! Change the code below to make a graph of our network where node sizes are scaled by the degree centrality.
ggplot(G, aes(x = x, y = y, xend = xend, yend = yend)) + # Do not change
geom_edges(color = 'grey', size = 1, linetype = 'dashed') + # Alter edge attributes
geom_nodes(aes(color = as.factor(group)),
size = 10,
shape = 'circle') + # Alter node attributes
geom_nodetext(label = people, fontface = "bold") + # Add text to nodes
theme_blank() + # Remove grid
scale_color_discrete(name = 'Group', # Edit legend
label= c('Mr. Hi','John A.'))
#Answer: add degr_cent to size under the "geom_nodes" layer.
7.1.0.6 Modularity
Modularity is a measure that describes the extent to which community structure is present within a network when the groups are labeled. A modularity score close to 1 indicates the presence of strong community structure in the network. In other words, nodes in the same group are more likely to be connected than nodes in different groups. A negative modularity score (the lower bound is -0.5) indicates the opposite of community structure. In other words, nodes in different groups are more likely to be connected than nodes in the same group. A modularity score close to 0 indicates that no community structure (or anti-community structure) is present in the network.
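To make the sign of the score concrete, here is a minimal sketch on a made-up network of two triangles joined by a single edge. The graph and both membership vectors are hypothetical and chosen only for illustration.

```r
library(igraph)

# Hypothetical toy network: triangles {1,2,3} and {4,5,6} joined by the edge 3-4
toy <- make_graph(edges = c(1, 2, 2, 3, 1, 3,
                            4, 5, 5, 6, 4, 6,
                            3, 4), directed = FALSE)

# Labels that match the two triangles: most ties fall within groups
mod_matched <- modularity(toy, membership = c(1, 1, 1, 2, 2, 2))

# Labels that alternate across nodes: most ties fall between groups
mod_mixed <- modularity(toy, membership = c(1, 2, 1, 2, 1, 2))

mod_matched  # positive: 5/14, about 0.357
mod_mixed    # negative: -3/14, about -0.214
```

The same network yields a positive or a negative score depending on how well the labels line up with where the ties actually are.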
Compute the modularity of the Zachary’s Karate Club network using the modularity() function, and store the result in an object called ZCCmod, which is used later in this chapter.
## [1] 0.3714661
Higher modularity scores are better; however, modularity should not be used alone to assess the presence of communities in a network. Rather, multiple measures should be used to provide an argument for community in a network.
7.2 Community Detection
Suppose we no longer have the group labels, but we want to infer the existence of groups in our network. This process is known as community detection. There are many different ways to infer the existence of groups in a network.
7.2.0.1 Via Modularity Maximization
The goal here is to find the groupings of nodes that lead to the highest possible modularity score.
# Find communities using the modularity maximization algorithm
mod_groups <- cluster_fast_greedy(G)
mod_groups <- mod_groups$membership
# Plot the computed modularity groupings
par(mfrow=c(1,2))
plot(G, vertex.color = mod_groups, # Changes node color
edge.color = 'black', # Changes edge color
vertex.size = 20, # Changes vertex size
vertex.shape = 'circle', # Changes vertex shape
asp = 0,
layout = layout_in_circle,
main = 'Algorithm')
# Plot the actual modularity groupings
plot(G, vertex.color = groups, # Changes node color
edge.color = 'black', # Changes edge color
vertex.size = 20, # Changes vertex size
vertex.shape = 'circle', # Changes vertex shape
asp = 0,
layout = layout_in_circle,
main = 'Actual')
It turns out that the modularity maximization algorithm finds 3 communities within the Zachary’s Karate Club network. However, if we merge two of those groups so that only two remain, only one node is incorrectly grouped. Let’s try another community detection algorithm.
7.2.0.2 Via Edge Betweenness
Edge betweenness community structure detection is based on the following assumption: edges connecting separate groupings have high edge betweenness, as all the shortest paths from one module to another must traverse them. Practically, this means that if we gradually remove the edge with the highest edge betweenness score, our network will separate into communities.
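The idea can be checked by hand on a made-up network before running it on the karate club data. In this sketch the toy graph is hypothetical: two triangles joined by one bridging edge, which is listed last so that it becomes edge number 7.

```r
library(igraph)

# Hypothetical toy network: triangles {1,2,3} and {4,5,6} joined by the edge 3-4
toy <- make_graph(edges = c(1, 2, 2, 3, 1, 3,
                            4, 5, 5, 6, 4, 6,
                            3, 4), directed = FALSE)

# Edge betweenness: the bridge carries every shortest path between the triangles
toy_eb <- edge_betweenness(toy)
bridge <- which.max(toy_eb)
E(toy)[bridge]

# Removing the bridge splits the network into two components
toy_split <- delete_edges(toy, bridge)
n_parts <- components(toy_split)$no

# cluster_edge_betweenness() automates the repeated removal of edges
toy_groups <- membership(cluster_edge_betweenness(toy))
toy_groups
```

Here a single removal is enough to separate the two triangles; on real networks the algorithm keeps removing edges and then picks the split with the best modularity.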
# Find communities using the edge betweenness algorithm
btw_groups <- cluster_edge_betweenness(G)
btw_groups <- btw_groups$membership
# Plot the computed betweeness groupings
par(mfrow=c(1,2))
plot(G, vertex.color = btw_groups, # Changes node color
edge.color = 'black', # Changes edge color
vertex.size = 20, # Changes vertex size
vertex.shape = 'circle', # Changes vertex shape
asp = 0,
layout = layout_in_circle,
main = 'Algorithm')
# Plot the actual modularity groupings
plot(G, vertex.color = groups, # Changes node color
edge.color = 'black', # Changes edge color
vertex.size = 20, # Changes vertex size
vertex.shape = 'circle', # Changes vertex shape
asp = 0,
layout = layout_in_circle,
main = 'Actual')
7.3 Network Simulation
Say you want to model a new network with no data. It’s possible to simulate a network to find out if it is actually interesting or simply random. If you are familiar with hypothesis testing, we can view these random networks as our “null models”. We assume that our null model is true until there is enough evidence to suggest that our null model does not describe the real-life network. If our null model is a good fit, then we have achieved a good representation of our network. If we don’t have a good fit, then there is likely additional structure in the network that is unaccounted for.
Our Question: How can we explain the group structure of our network? Is it random or can we explain it via the degree sequence?
7.3.0.1 Random Network Generation
Erdos-Renyi random networks in R require that we specify a number of nodes \(n\), and an edge construction probability \(p\). Essentially, for every pair of nodes, we flip a biased coin with the probability of “heads” being \(p\). If we get a “heads”, then we draw an edge between that pair of nodes. This process simulates the social connections rather than plotting them from a dataset.
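The coin-flipping description can be sketched directly in R. This is only an illustration of the idea, not how sample_gnp() is implemented internally; the seed and all object names are assumptions.

```r
library(igraph)
set.seed(42)  # arbitrary seed so the sketch is reproducible

n <- 34     # number of nodes
p <- 0.15   # edge construction probability

# Every unordered pair of nodes, one pair per column
pairs <- combn(n, 2)

# One biased coin flip per pair: TRUE ("heads") with probability p
heads <- runif(ncol(pairs)) < p

# Draw an edge for each pair that came up heads
er_manual <- make_empty_graph(n = n, directed = FALSE)
er_manual <- add_edges(er_manual, as.vector(pairs[, heads, drop = FALSE]))

ecount(er_manual)   # number of edges actually drawn
choose(n, 2) * p    # expected number of edges: 84.15
```

Each run with a different seed gives a different network, but the edge count should stay close to the expected value.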
# Simulate an erdos-renyi random network and display the result
ER <- sample_gnp(n = length(nodes), p = .15, directed = FALSE, loops = FALSE)
# Plot the erdos-renyi random network
ggplot(ER, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges(color = 'black', size = 0) +
geom_nodes(color = 'purple',
size = 10,
shape = 'circle') +
theme_blank()
Is this Erdos-Renyi random network a good representative model of the Zachary’s Karate Club Network? Let’s construct the Erdos-Renyi random network that is most similar to our network.
We can set the parameters of the Erdos-Renyi random graph by specifying the number of nodes and the edge connection probability p. Considering the Zachary’s Karate Club Network, we want to use 34 nodes in our graph. If we change the number of nodes, then we lose the ability to compare our network with the theoretical model. We can estimate a probability value for the simulated network by dividing the mean of degr_cent by the number of nodes minus 1 in the ZKC network.
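The reasoning behind this estimate can be written out as follows, where \(\bar{k}\) denotes the observed mean degree and \(m\) the observed number of edges (both symbols are introduced here for illustration). In an Erdos-Renyi network each node has \(n - 1\) possible neighbors, each tied to it with probability \(p\), so its expected degree is \(p(n - 1)\). Matching this to the observed mean degree, and using \(\bar{k} = 2m/n\), gives

\[
\hat{p} = \frac{\bar{k}}{n - 1} = \frac{2m}{n(n - 1)} = \frac{m}{\binom{n}{2}},
\]

which is the edge density of the observed network.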
# Estimate parameter p for ZKC
pval <- mean(degr_cent)/(length(nodes)-1)
# Simulate an erdos-renyi random network and display the result
ER <- sample_gnp(n = length(nodes),
                 p = pval,
                 directed = FALSE,
                 loops = FALSE)
# Plot the erdos-renyi random network
ggplot(ER, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges(color = 'black', size = 1) +
geom_nodes(color = 'purple',
size = 10,
shape = 'circle') +
geom_nodetext(label = people, fontface = "bold") +
theme_blank()
Let’s check out the degree distribution for our random graph and the actual ZKC graph.
# Compute degree centrality of ER Model
degr_ER <- centr_degree(ER, mode = 'all')$res
# Construct data frame for centrality
centr_compar <- data.frame(Nodes = nodes,
                           ZKC = degr_cent,
                           ER = degr_ER)
# Reformat data frame to tidy
centr_compar <- centr_compar %>%
  gather(key = 'Graph', value = 'Centrality', ZKC, ER)
# Create a bar plot of degree distributions
ggplot(data = centr_compar, aes(x = Centrality, fill = Graph)) +
geom_bar(alpha = .5, position = 'identity') +
ggtitle('Comparison of ZKC to ER random graph instance')
7.3.0.2 Configuration Model
For this kind of random-graph model, we specify the exact degree sequence of all the nodes. We then construct a random graph that has exactly the same degree sequence as the one given.
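A quick way to see what “the same degree sequence” means is to simulate from a short made-up sequence and compare degrees afterwards. This sketch assumes a hypothetical five-node sequence and swaps in the "vl" method, which draws a simple connected graph, in place of the chapter’s "simple.no.multiple" method.

```r
library(igraph)
set.seed(7)  # arbitrary seed

# Hypothetical degree sequence for five nodes (its sum must be even)
deg_seq <- c(3, 2, 2, 2, 1)

# Draw a random simple, connected graph with exactly these degrees
toy_config <- sample_degseq(deg_seq, method = "vl")

degree(toy_config)      # matches deg_seq node by node
is_simple(toy_config)   # no self loops or multiple edges
```

Which nodes end up tied to each other changes from draw to draw; only the number of ties per node is held fixed.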
# Simulate a configuration model
# Note: The method simple.no.multiple prevents self loops and multiple edges
config <- sample_degseq(degr_cent, method = "simple.no.multiple")
# Plot the configuration model
ggplot(config, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges(color = 'black', size = 1) +
geom_nodes(color = 'purple',
size = 10,
shape = 'circle') +
geom_nodetext(label = people, fontface = "bold") +
theme_blank()
Is the configuration model random network a good representative model of the Zachary’s Karate Club Network?
Let’s see if the configuration model captures the group structure of the model. We are going to perform a permutation test in which we generate 1000 different configuration models (with the same degree sequence as ZKC), and then estimate how the actual value of the ZKC modularity lines up with the distribution of configuration model modularities.
# Initialize vector to store values
sims <- 1000
mod_vals <- rep(0, sims)
# Loop through simulations
for (i in c(1:sims)) {
  # Simulate a configuration model
  config <- sample_degseq(degr_cent, method = "simple.no.multiple")
  # Compute the modularity of the network w/ respect to ZKC groups
  mod_score <- modularity(config, groups)
  # Store the modularity value in our vector
  mod_vals[i] <- mod_score
}
Now let’s plot a histogram of these values, with a vertical line representing the modularity of the ZKC network that we computed earlier. This value is stored in the object ZCCmod.
# Plot a histogram modularity values
ggplot(data = as.data.frame(mod_vals), aes(x = mod_vals)) +
geom_histogram(bins = 10, color = 'black', fill = 'blue') +
geom_vline(xintercept = ZCCmod, color = 'purple')
We can see from the above that our computed modularity is extremely improbable. No simulations had a modularity that was as high as the one in ZKC. This tells us that the particular degree sequence of ZKC does not capture the community structure. Put otherwise, the configuration model does a bad job reflecting the community structure captured in the ZKC dataset.
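The same comparison can be summarized with an empirical p-value: the share of simulated networks whose modularity is at least as large as the observed one. The sketch below shows the calculation on a made-up network with planted groups. It uses degree-preserving rewiring (rewire() with keeping_degseq()) as a stand-in for sample_degseq(), and every object name and setting is an assumption for illustration.

```r
library(igraph)
set.seed(123)  # arbitrary seed

# Hypothetical network with planted groups: two 6-node cliques joined by two edges
toy <- disjoint_union(make_full_graph(6), make_full_graph(6))
toy <- add_edges(toy, c(1, 7, 2, 8))
toy_groups <- rep(1:2, each = 6)

# Observed modularity with respect to the planted groups
obs_mod <- modularity(toy, toy_groups)

# Null distribution: shuffle ties while keeping every node's degree fixed
n_sims <- 200
null_mod <- numeric(n_sims)
for (i in seq_len(n_sims)) {
  shuffled <- rewire(toy, with = keeping_degseq(niter = 200))
  null_mod[i] <- modularity(shuffled, toy_groups)
}

# Share of null networks at least as modular as the observed one
p_emp <- mean(null_mod >= obs_mod)
c(observed = obs_mod, null_mean = mean(null_mod), p = p_emp)
```

With the chapter’s objects, the analogous calculation would be mean(mod_vals >= ZCCmod).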
7.3.0.3 Stochastic Block Model
Stochastic block models are similar to the Erdos-Renyi random network but allow us to specify additional parameters. The stochastic block model adds a group structure into the random graph model. We can specify the group sizes and the edge construction probabilities within groups and between groups.
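One way to confirm that a simulated network reflects the probabilities we asked for is to compare how densely ties form within blocks versus between them. This is a minimal sketch with an arbitrary seed; the object names are assumptions, and it relies on sample_sbm() assigning vertices to blocks in order.

```r
library(igraph)
set.seed(1)  # arbitrary seed

# Dense ties within blocks (.5), sparse ties between blocks (.05)
pref <- matrix(c(.5, .05, .05, .5), nrow = 2, ncol = 2)
sizes <- c(18, 16)

toy_sbm <- sample_sbm(n = sum(sizes), pref.matrix = pref, block.sizes = sizes,
                      directed = FALSE, loops = FALSE)

# Vertices 1-18 belong to block 1 and vertices 19-34 to block 2
block <- rep(1:2, times = sizes)

# Classify each simulated edge as within-block or between-block
el <- as_edgelist(toy_sbm)
is_within <- block[el[, 1]] == block[el[, 2]]

# Share of possible ties that were actually drawn
within_density  <- sum(is_within) / (choose(18, 2) + choose(16, 2))
between_density <- sum(!is_within) / (18 * 16)

c(within = within_density, between = between_density)
```

The two densities should land near the diagonal and off-diagonal entries of the probability matrix, respectively.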
# Construct the edge probability matrix and block sizes
pref.matrix <- matrix(c(.5, .05, .05, .5), nrow = 2, ncol = 2)
block.sizes <- c(18, 34-18)
# Simulate a stochastic block model
mySBM <- sample_sbm(n = length(nodes), pref.matrix, block.sizes, directed = FALSE, loops = FALSE)
# Plot the SBM
ggplot(mySBM, aes(x = x, y = y, xend = xend, yend = yend)) +
geom_edges(color = 'black', size = .5) +
geom_nodes(color = 'purple',
size = 10,
shape = 'circle') +
geom_nodetext(label = people, fontface = "bold") +
theme_blank()
Is the stochastic block model a good representative model of the Zachary’s Karate Club Network?
# Compute degree centrality of Stochastic Block Model
degr_SBM <- centr_degree(mySBM, mode = 'all')$res
# Construct data frame for centrality
centr_compar <- data.frame(Nodes = nodes,
                           ZKC = degr_cent,
                           SBM = degr_SBM)
# Reformat data frame to tidy
centr_compar <- centr_compar %>%
  gather(key = 'Graph', value = 'Centrality', ZKC, SBM)
# Create a bar plot of degree distributions
ggplot(data = centr_compar, aes(x = Centrality, fill = Graph)) +
geom_bar(alpha = .5, position = 'identity') +
ggtitle('Comparison of ZKC to SBM random graph instance')
7.4 Advanced Case Study
See this link (https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01742/) to access a paper by Jones, Mair, & McNally (2018), all professors at Harvard University in the Department of Psychology who discuss visualizing psychological networks in R.
See this link (https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01742/full#supplementary-material) to access all supplementary material, including the relevant datasets needed for the code below.
Read the paper and run the code alongside the narrative to get the most out of this case study. For a brief overview of the paper see this abstract:
“Networks have emerged as a popular method for studying mental disorders. Psychopathology networks consist of aspects (e.g., symptoms) of mental disorders (nodes) and the connections between those aspects (edges). Unfortunately, the visual presentation of networks can occasionally be misleading. For instance, researchers may be tempted to conclude that nodes that appear close together are highly related, and that nodes that are far apart are less related. Yet this is not always the case. In networks plotted with force-directed algorithms, the most popular approach, the spatial arrangement of nodes is not easily interpretable. However, other plotting approaches can render node positioning interpretable. We provide a brief tutorial on several methods including multidimensional scaling, principal components plotting, and eigenmodel networks. We compare the strengths and weaknesses of each method, noting how to properly interpret each type of plotting approach.”
## Package installations are included here for convenience
#install.packages("MPsychoR")
#install.packages("qgraph")
#install.packages("smacof")
#install.packages("wordcloud")
#install.packages("psych")
#install.packages("eigenmodel")
#install.packages("networktools")
## Note: The following R code is identical to code found in the manuscript
library("MPsychoR")
data(Rogers)
dim(Rogers)
data(Rogers_Adolescent)
dim(Rogers_Adolescent)
colnames(Rogers) <- colnames(Rogers_Adolescent) <- 1:26
library("qgraph")
adult_zeroorder <- cor(Rogers)
qgraph(adult_zeroorder, layout="spring",
groups = list(Depression = 1:16, "OCD" = 17:26),
color = c("lightblue", "lightsalmon"))
adult_zeroorder <- cor(Rogers)
library("smacof")
dissimilarity_adult <- sim2diss(adult_zeroorder)
adult_MDS <- mds(dissimilarity_adult)
head(round(adult_MDS$conf, 2)) # top of configuration matrix
adult_MDS_ordinal <- mds(dissimilarity_adult, type="ordinal")
plot(adult_MDS_ordinal, plot.type = "Shepard", main="Ordinal")
text(1.1,0.3, paste("Stress =", round(adult_MDS_ordinal$stress,2)))
adult_MDS_ratio <- mds(dissimilarity_adult, type="ratio")
plot(adult_MDS_ratio, plot.type = "Shepard", main="Ratio")
text(1.1,0.3, paste("Stress =", round(adult_MDS_ratio$stress,2)))
adult_MDS_interval <- mds(dissimilarity_adult, type="interval")
plot(adult_MDS_interval, plot.type = "Shepard", main="Interval")
text(1.1,0.3, paste("Stress =", round(adult_MDS_interval$stress,2)))
adult_MDS_mspline <- mds(dissimilarity_adult, type="mspline")
plot(adult_MDS_mspline, plot.type = "Shepard", main="Spline")
text(1.1,0.3, paste("Stress =", round(adult_MDS_mspline$stress,2)))
adult_MDS_mspline$stress
qgraph(adult_zeroorder, layout=adult_MDS_mspline$conf,
groups = list(Depression = 1:16, "OCD" = 17:26),
color = c("lightblue", "lightsalmon"), vsize=4)
text(-1,-1, paste("Stress=", round(adult_MDS_mspline$stress,2)))
library("wordcloud")
qgraph(adult_zeroorder, layout=adult_MDS_mspline$conf,
groups = list(Depression = 1:16, "OCD" = 17:26),
color = c("lightblue", "lightsalmon"),
vsize=0, rescale=FALSE, labels=FALSE)
points(adult_MDS_mspline$conf, pch=16)
textplot(adult_MDS_mspline$conf[,1]+.03,
adult_MDS_mspline$conf[,2]+.03,
colnames(adult_zeroorder),
new=F)
adult_glasso <- EBICglasso(cor(Rogers), n=408)
qgraph(adult_glasso, layout=adult_MDS_mspline$conf,
groups = list(Depression = 1:16, "OCD" = 17:26),
color = c("lightblue", "lightsalmon"), vsize=4)
text(-1,-1, paste("Stress=", round(adult_MDS_mspline$stress,2)))
adolescent_zeroorder <- cor(Rogers_Adolescent)
dissimilarity_adolescent <- sim2diss(adolescent_zeroorder)
adolescent_MDS <- mds(dissimilarity_adolescent, type="mspline")
fit_procrustes <- Procrustes(adult_MDS_mspline$conf, adolescent_MDS$conf)
adolescent_glasso <- EBICglasso(cor(Rogers_Adolescent), n=87, gamma=0)
qgraph(adult_glasso, layout=fit_procrustes$X, groups = list(Depression = 1:16, "OCD" = 17:26),
color = c("lightblue", "lightsalmon"), title= "Adults, n=408", vsize=4)
text(-1,-1, paste("Stress=", round(adult_MDS_mspline$stress,2)))
qgraph(adolescent_glasso, layout=fit_procrustes$Yhat,
groups = list(Depression = 1:16, "OCD" = 17:26),
color = c("lightblue", "lightsalmon"), title="Adolescents, n=87", vsize=4)
text(-1,-1, paste("Stress=", round(adolescent_MDS$stress,2)))
round(fit_procrustes$congcoef, 3)
library("psych")
PCA_adult <- principal(cor(Rogers), nfactors = 2)
qgraph(adult_glasso, layout=PCA_adult$loadings, groups = list(Depression = 1:16, "OCD" = 17:26),
color = c("lightblue", "lightsalmon"), title= "Adults, n=408", layoutOffset=c(.3,.1), vsize=4)
text(1.5,-.8, paste("% var=", round(sum(PCA_adult$values[1:2]/length(PCA_adult$values)),2)))
title(xlab="Component 1", ylab= "Component 2")
library("eigenmodel")
diag(adult_glasso) <- NA ## the function needs NA diagonals
p <- 2 ## 2-dimensional solution
fitEM <- eigenmodel_mcmc(Y = adult_glasso, R = p, S = 1000, burn = 200, seed = 123)
EVD <- eigen(fitEM$ULU_postmean)
evecs <- EVD$vec[, 1:p] ## eigenvectors (coordinates)
qgraph(adult_glasso, layout=evecs, groups = list(Depression = 1:16, "OCD" = 17:26),
color = c("lightblue", "lightsalmon"), title= "Adults, n=408", vsize=4)
title(xlab="Dimension 1", ylab= "Dimension 2")
library("networktools")
adult_glasso <- EBICglasso(cor(Rogers), n=408)
adult_qgraph <- qgraph(adult_glasso)
MDSnet(adult_qgraph, MDSadj=cor(Rogers))
PCAnet(adult_qgraph, cormat = cor(Rogers))
EIGENnet(adult_qgraph)
7.5 Datasets for Network Analysis
There is a package called “igraphdata” that contains many network datasets. Additionally, there are several more datasets at “The Colorado Index of Complex Networks (ICON)”. Here is the link: https://icon.colorado.edu/#!/
7.6 Review
In this chapter we introduced network analysis concepts and methods. To make sure you understand this material, there is a practice assessment to go along with this chapter at https://jayholster1.shinyapps.io/NetworksinRAssessment/
7.7 References
Bojanowski, M. (2015). intergraph: Coercion routines for network data objects. R package version 2.0-2. http://mbojan.github.io/intergraph
Csardi, G., Nepusz, T. (2006). “The igraph software package for complex network research.” InterJournal, Complex Systems, 1695. <https://igraph.org>.
Paranyushkin, D. (2019). InfraNodus: Generating insight using text network analysis. In The World Wide Web Conference (WWW ’19). Association for Computing Machinery, New York, NY, USA, 3584–3589. https://doi.org/10.1145/3308558.3314123
Jones, P. J., Mair, P., & McNally, R. J. (2018). Visualizing psychological networks: A tutorial in R. Frontiers in Psychology, 9, 1742. https://doi.org/10.3389/fpsyg.2018.01742
Tyner, S., Briatte, F., & Hofmann, H. (2017). Network Visualization with ggplot2. The R Journal, 9(1), 27–59. https://briatte.github.io/ggnetwork/
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L.D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T.L., Miller, E., Bache, S.M., Müller, K., Ooms, J., Robinson, D., Seidel, D.P., Spinu, V., Takahashi, K., Vaughan, D., Wilke, C., Woo, K., & Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686.
7.7.1 R Short Course Series
Video lectures of each guidebook chapter can be found at https://osf.io/6jb9t/. For this chapter, follow the folder path Network Analysis in R -> AY 2021-2022 Spring to access the video files, R Markdown documents, and other materials for each short course.
7.7.2 Acknowledgements
This guidebook was created with support from the Center for Research Data and Digital Scholarship and the Laboratory for Interdisciplinary Statistical Analysis at the University of Colorado Boulder, as well as the U.S. Agency for International Development under cooperative agreement #7200AA18CA00022. Individuals who contributed to materials related to this project include Jacob Holster, Eric Vance, Michael Ramsey, Nicholas Varberg, and Nickoal Eichmann-Kalwara.