Introduction (5 pt)
Marriage plays an integral role in our society. Despite recent waves of
societal, cultural, and generational change, the family remains the
bedrock unit for studying the individual and society, as well as a prime
focus of study in its own right (Peterson and Bush 2012). It is widely
believed that marriage is associated with greater overall happiness and
life satisfaction. Indeed, empirical research on marital satisfaction has
shown that spouses in stable marriages are healthier, happier, and live
longer (Abreu-Afonso et al. 2022). Marriage has also been found to
correlate with higher levels of happiness in Taiwan, where in most
samples happiness was significantly above the baseline within 3 years of
marriage (Tao 2019).
However, the increase in cohabitation raises the question of whether
only marriage has beneficial effects (Perelli-Harris et al. 2019). In
addition, with current trends towards self-realization and personal
independence, more and more individuals are opting out of marriage and
prefer to be single and happy. Marriage entails financial costs, effort,
and commitment. Even more important, and more uncertain, is whether
people actually feel happy and satisfied with life while married. Indeed,
several studies report that marital satisfaction decreases over time and
is highest in the first years of the relationship (Abreu-Afonso et al.
2022). Regarding gender differences in the context of happiness and
marriage, one study found a higher separation risk among men who are
happy with their relationship, but not among women (Perelli-Harris and
Blom 2021).
This paper analyzes the connection between marital status and happiness
(including overall life satisfaction) among individuals in Germany, based
on the SOEP teaching data. The main objective is to investigate whether
there is a strong correlation between marriage and the feeling of
happiness.
Data and Method (5 pt)
The data used in this Final Project are selected variables from the SOEP
teaching data.
The Socio-Economic Panel (SOEP) is a German panel dataset that was
introduced in 1984. Individuals across the whole country answer a
questionnaire on various aspects of their lives, such as education,
employment, health, and happiness. Because the same people are surveyed
every year, it is possible to track long-term psychological, economic,
and social developments. The teaching version of SOEP contains about 50%
of the original SOEP data and is thus well suited to a small analysis.
To investigate the correlation between marital status and happiness, we
selected tables from the teaching SOEP database and conducted an
Exploratory Data Analysis (EDA).
Sample
The selected sample contains 128752 observations over 2007-2018. To
obtain this sample we explored the 7 datasets provided in the teaching
data and chose the most suitable ones for further analysis. The basic
exclusion criterion applied was the relevance of the data to the given
topic.
Variables
The selected dataset soep contains eight variables:
- pid: unique person identifier
- syear: year of the observation
- gender
- age
- marital status
- life satisfaction
- family satisfaction
- frequency of being happy in the last 4 weeks
Marital status and the three satisfaction and happiness variables were
filtered to be greater than zero, because negative values encode missing
answers (either not provided or not applicable) and are thus not relevant
for our analysis.
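The handling of negative codes can be illustrated on a toy table. This is only a sketch: the data frame `toy` and its values are made up, and the second option (recoding to NA instead of dropping rows) is an alternative that was not used in this analysis.

```r
# Toy data: SOEP-style negative codes mark missing answers (values are made up).
toy <- data.frame(
  pid               = c(1, 2, 3, 4),
  life_satisfaction = c(8, -1, 6, 7),
  marital           = c(1, 2, -2, 4)
)

# Option 1: drop every row with a negative code, as done in this analysis.
kept <- subset(toy, life_satisfaction > 0 & marital > 0)

# Option 2 (not used here): recode negatives to NA so the rows stay available.
recoded <- toy
cols <- c("life_satisfaction", "marital")
recoded[cols] <- lapply(recoded[cols], function(x) replace(x, x < 0, NA))

nrow(kept)               # number of rows with valid answers on both variables
colSums(is.na(recoded))  # number of missing answers per column
```

Dropping rows is simpler, while recoding keeps the remaining answers of a person-year available for analyses that do not need the missing variable.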
In our view, the selected variables are the most suitable for studying
the correlation between marriage and the level of happiness. In this
analysis, we assume that happiness can be measured by life satisfaction
together with family satisfaction and the frequency of being happy in
the last 4 weeks.
Empirical Model
In this analysis two multiple regression models were applied:
The first one: LifeSatisfaction(Happiness) = MarriageStatus + Gender + MarriageStatus × Gender, to see whether males or females have higher
life satisfaction, and at which marriage status.
The second one: LifeSatisfaction(Happiness) = MarriageStatus + Age + MarriageStatus × Age, to see how life satisfaction changes with age
for people of different marriage statuses.
Furthermore, we calculated Fixed Effects (FE) and First Difference
(FD) estimators of how marriage status affects life satisfaction.
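Written out formally, the two regression specifications can be sketched as follows. The notation is our own assumption for illustration: $y_i$ is life satisfaction, $M_{ik}$ is a dummy for marital category $k \in \{\text{single}, \text{widowed}, \text{divorced}, \text{separated}\}$ with married as the reference category, $F_i$ is a dummy for female, and $A_i$ is age.

```latex
\begin{align}
y_i &= \beta_0 + \sum_{k} \beta_k M_{ik} + \gamma F_i
       + \sum_{k} \delta_k M_{ik} F_i + \varepsilon_i \\
y_i &= \alpha_0 + \sum_{k} \alpha_k M_{ik} + \theta A_i
       + \sum_{k} \lambda_k M_{ik} A_i + u_i
\end{align}
```

In the first model $\delta_k$ measures how much the gap to married people differs between women and men; in the second model $\lambda_k$ measures how much the age slope of category $k$ differs from the slope $\theta$ of married people.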
Data Analysis (10 pt)
Load data
Firstly, we need to load all the necessary libraries for our data
analysis.
library(tidyverse)
library(sjlabelled)
library(huxtable)
library(stargazer)
library(gtsummary)
library(plm)
library(haven)
library(interactions)
We will use the tidyverse package to get access to a variety of packages
such as dplyr and ggplot2, which help us to clean and visualize the data.
The sjlabelled package is useful for working with labelled data.
gtsummary, huxtable and stargazer allow us to report the model estimates
in regression-specific table formats. plm makes the estimation of linear
panel models straightforward. haven is used for reading .dta files.
Finally, the interactions package adds functions for plotting interaction
effects.
Now we can import the necessary data.
pequiv <- read_dta("pequiv.dta",
                   col_select = c("pid", "syear", "d11101",
                                  "d11102ll", "d11104", "p11101"))
pl <- read_dta("pl.dta",
               col_select = c("pid", "syear", "plh0186", "plh0180"))
Data Management
Here we merge the two tables into one, matching rows on pid and syear,
to prepare the data for analysis.
master <- merge(pequiv, pl, by = c("pid", "syear"))
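Note that merge() with default arguments performs an inner join, so person-years that appear in only one of the two tables are dropped. A minimal sketch on made-up toy tables (the data frames `a` and `b` are hypothetical):

```r
# Toy person-year tables (made-up values) sharing the keys pid and syear.
a <- data.frame(pid = c(1, 1, 2), syear = c(2007, 2008, 2007), age = c(30, 31, 45))
b <- data.frame(pid = c(1, 2, 3), syear = c(2007, 2007, 2007), happy = c(4, 3, 5))

# Default merge() is an inner join: only pid/syear pairs found in both tables survive.
inner <- merge(a, b, by = c("pid", "syear"))

# all.x = TRUE keeps every row of the first table and fills the gaps with NA.
left <- merge(a, b, by = c("pid", "syear"), all.x = TRUE)

nrow(inner)  # 2
nrow(left)   # 3
```

Comparing the row counts before and after a merge is a quick way to see how many observations were lost.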
After that, the Stata labels are dropped, the variables are renamed, and
factor variables are created. These steps are necessary for the further
analysis.
soep <- remove_all_labels(master)
soep <- soep %>%
  rename(age = d11101,
         gender = d11102ll,
         marital = d11104,
         life_satisfaction = p11101,
         freq_happy_4_w = plh0186,
         family_satisfaction = plh0180) %>%
  # keep only valid answers: negative codes mark missing values
  filter(marital > 0, life_satisfaction > 0, freq_happy_4_w > 0,
         family_satisfaction > 0) %>%
  mutate(marital = factor(marital, levels = c(1, 2, 3, 4, 5)),
         gender = factor(gender, levels = c(1, 2)),
         freq_happy_4_w = factor(freq_happy_4_w, levels = c(1, 2, 3, 4, 5)))
levels(soep$marital) <- c("married", "single", "widowed", "divorced", "separated")
levels(soep$gender) <- c("male", "female")
levels(soep$freq_happy_4_w) <- c("very seldom", "seldom", "sometimes", "often", "very often")
Summary statistics
Let’s look at the general statistics of our data set, without the columns
pid and syear, with the help of the gtsummary library.
tbl_summary(soep[-c(1,2)])
Characteristic | N = 128,752 |
gender | |
male | 59,922 (47%) |
female | 68,830 (53%) |
age | 49 (36, 62) |
marital | |
married | 77,533 (60%) |
single | 29,925 (23%) |
widowed | 7,584 (5.9%) |
divorced | 10,730 (8.3%) |
separated | 2,980 (2.3%) |
life_satisfaction | 8.00 (6.00, 8.00) |
family_satisfaction | 8.00 (7.00, 9.00) |
freq_happy_4_w | |
very seldom | 2,348 (1.8%) |
seldom | 10,008 (7.8%) |
sometimes | 39,986 (31%) |
often | 64,033 (50%) |
very often | 12,377 (9.6%) |
In the dataset, 47% of individuals are male and 53% are female. The
majority are married (60%), followed by single people (23%), so both
groups are large enough for the comparisons that follow.
Moreover, respondents report high satisfaction with family life (median
of 8), and half of them (50%) reported being often happy over the last
4 weeks.
To understand the link between marriage and happiness in depth, we
explore these data further and analyze whether the level of happiness is
related to marital status.
Data Visualizations
In this section we visualize the given data and find some insights
regarding the effect of marriage on happiness among German
individuals.
soep %>%
group_by(marital) %>%
summarise(life_satisfaction = round(mean(life_satisfaction), 2)) %>%
ggplot(aes(x = marital, y = life_satisfaction, group = marital)) +
geom_col(fill = "lightblue", width = 0.5) +
geom_text(aes(label = life_satisfaction), vjust = 3) +
ggtitle("Average level of life satisfaction among people of different marriage statuses") +
xlab("Marriage Status") +
ylab("Life Satisfaction")

On this bar chart we can clearly see that married people have the
highest level of life satisfaction, followed closely by single
individuals. Strikingly, the difference between these two groups is
small, only 0.13.
Nevertheless, it is worth mentioning that these are averages over the
whole observed period, which give only a rough picture of the data. The
chart is still useful, as the differences in satisfaction between
marriage statuses are clearly visible; however, to understand the
relationship in detail we need to investigate further.
soep %>%
  # %in% keeps every row whose status is in the set; == would recycle the vector
  filter(marital %in% c("married", "single", "divorced")) %>%
  group_by(syear, marital) %>%
  summarise(life_satisfaction = round(mean(life_satisfaction), 2),
            family_satisfaction = round(mean(family_satisfaction), 2)) %>%
  ggplot(aes(x = syear, y = life_satisfaction, color = marital)) +
  ggtitle("Comparison of life satisfaction level of married, single and divorced people") +
  geom_line() +
  labs(x = "Year", y = "Life Satisfaction", color = "Marriage Status")

In the line graph, a marked difference between divorced and non-divorced
people can be observed. Although overall life satisfaction has increased
in all three groups over time, divorced people have a considerably lower
level of life satisfaction. More importantly, although single people
initially had slightly higher life satisfaction than married people, this
changed after less than a year. This may hint that marriage becomes more
important for happiness at a later stage of life: young people may be
happy without being married, but this is no longer the case after a
couple of years. Since the plot tracks survey years rather than age, this
reading is tentative.
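When a plot should keep only some marital statuses, membership in the set is best tested with %in%. Comparing a column with == against a vector of labels recycles that vector element by element and silently drops rows. A small sketch on a made-up factor (the objects `marital` and `keep` here are toy values, not the SOEP data):

```r
# Toy factor of marital statuses (made-up values).
marital <- factor(c("married", "single", "divorced", "widowed", "single", "married"))
keep <- c("married", "single", "divorced")

# == recycles `keep` along the vector, so each element is compared with only
# one of the three labels, depending on its position.
by_equal <- marital == keep

# %in% checks each element against the whole set, which is what a filter needs.
by_in <- marital %in% keep

sum(by_equal)  # rows kept by the element-wise comparison
sum(by_in)     # rows kept by the membership test
```

In this toy vector the last "married" entry is lost by the == comparison because it happens to be lined up with "divorced".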
soep %>%
ggplot(aes(x=marital,y=life_satisfaction,fill=marital)) +
geom_boxplot() +
ggtitle("Life Satisfaction level among marriage statuses") +
labs(x = "Marriage Status", y = "Life Satisfaction", fill = "Marriage Status")

Here we see further evidence that married and single people are happier
than widowed, divorced, or separated people. This suggests that being
either single or married is associated with higher happiness. Also of
interest is the smaller interquartile range of life satisfaction among
married and single people, which means that their answers are less
spread out and concentrated at high levels of satisfaction.
Main analysis
Now we will move on to the main analysis and estimate our model.
Interaction (categorical * dummy)
interact_life_marital_gender <- lm(life_satisfaction ~ marital * gender, data = soep)
interact_life_marital_gender
##
## Call:
## lm(formula = life_satisfaction ~ marital * gender, data = soep)
##
## Coefficients:
## (Intercept) maritalsingle
## 7.31190 -0.09228
## maritalwidowed maritaldivorced
## -0.34467 -0.46368
## maritalseparated genderfemale
## -0.47829 0.07093
## maritalsingle:genderfemale maritalwidowed:genderfemale
## -0.05053 -0.14084
## maritaldivorced:genderfemale maritalseparated:genderfemale
## -0.13837 -0.13604
Distribution of life satisfaction level among males and females of
different marriage statuses
cat_plot(interact_life_marital_gender, pred = gender, modx = marital, legend.main = "Marriage Status") +
labs(x = "Gender", y = "Life Satisfaction")

Here we see a visualization of the model results. It agrees with the
previous finding that married and single people are happier than the rest
of the observed groups. Moreover, the graph gives additional information
regarding gender: married and single women are generally slightly happier
than men. In contrast, widowed, divorced, and separated men generally
have higher life satisfaction than women. One possible reason is the
presence of children, who often stay with the woman after a divorce.
Interaction (categorical * continuous)
interact_life_marital_age <- lm(life_satisfaction ~ marital * age, data = soep)
interact_life_marital_age
##
## Call:
## lm(formula = life_satisfaction ~ marital * age, data = soep)
##
## Coefficients:
## (Intercept) maritalsingle maritalwidowed
## 7.840417 0.021497 -0.683758
## maritaldivorced maritalseparated age
## -1.082701 -1.093432 -0.009217
## maritalsingle:age maritalwidowed:age maritaldivorced:age
## -0.011025 0.005869 0.010127
## maritalseparated:age
## 0.010171
Change of life satisfaction level of people with marriage statuses
with age
interact_plot(interact_life_marital_age, pred = age, modx = marital, legend.main = "Marriage Status") +
labs(x = "Age", y = "Life Satisfaction")

The line graph again highlights an interesting relationship between life
satisfaction and age among Germans grouped by marital status. Life
satisfaction decreases sharply, by around 25%, among single individuals;
married individuals show a similar but slower decline (~ 11%). Notably,
the changes in life satisfaction in the other groups are negligible, and
it remains at roughly the same level overall. In our view, this shows
that single people do become less happy with age than married people,
possibly because they feel the need to be with somebody else.
Models comparison
huxreg("interact_life_marital_gender" = interact_life_marital_gender,
"interact_life_marital_age" = interact_life_marital_age,
coefs = c("married" = "(Intercept)", "single" = "maritalsingle", "widowed" = "maritalwidowed",
"divorced" = "maritaldivorced", "separated" = "maritalseparated",
"female" = "genderfemale", "single female" = "maritalsingle:genderfemale",
"widowed female" = "maritalwidowed:genderfemale",
"divorced female" = "maritaldivorced:genderfemale",
"separated female" = "maritalseparated:genderfemale", "age" = "age",
"age single" = "maritalsingle:age", "age widowed" = "maritalwidowed:age",
"age divorced" = "maritaldivorced:age", "age separated" = "maritalseparated:age"),
statistics = c("N. obs." = "nobs", "R squared" = "r.squared", "F statistic" = "statistic"),
align = "center")
| interact_life_marital_gender | interact_life_marital_age |
married | 7.312 *** | 7.840 *** |
| (0.009) | (0.023) |
single | -0.092 *** | 0.021 |
| (0.016) | (0.034) |
widowed | -0.345 *** | -0.684 *** |
| (0.041) | (0.121) |
divorced | -0.464 *** | -1.083 *** |
| (0.028) | (0.079) |
separated | -0.478 *** | -1.093 *** |
| (0.049) | (0.119) |
female | 0.071 *** | |
| (0.012) | |
single female | -0.051 * | |
| (0.023) | |
widowed female | -0.141 ** | |
| (0.047) | |
divorced female | -0.138 *** | |
| (0.035) | |
separated female | -0.136 * | |
| (0.063) | |
age | | -0.009 *** |
| | (0.000) |
age single | | -0.011 *** |
| | (0.001) |
age widowed | | 0.006 *** |
| | (0.002) |
age divorced | | 0.010 *** |
| | (0.001) |
age separated | | 0.010 *** |
| | (0.002) |
N. obs. | 128752 | 128752 |
R squared | 0.012 | 0.021 |
F statistic | 177.538 | 311.477 |
*** p < 0.001; ** p < 0.01; * p < 0.05. |
From the first column (first regression model) of the table above we
can say that for a married male the expected level of life satisfaction
is 7.312. For a married female it is higher by 0.071. Compared with
married people of the same gender, the expected level of life
satisfaction is lower by 0.092 for a single male and by 0.143
(0.092 + 0.051) for a single female. For a widowed male it is lower by
0.345 and for a widowed female by 0.486; for a divorced man it is lower
by 0.464 and for a divorced woman by 0.602. Finally, for a separated man
it is lower by 0.478 and for a separated woman by 0.614. So, from the
first regression model we see that married or single females are more
satisfied with their lives than males and are therefore happier.
However, divorced, widowed, or separated males have a higher level of
life satisfaction than females in the same group. This coincides with
the previous results of the analysis.
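The group comparisons from the first model can be reproduced by adding up the relevant coefficients. The sketch below copies the rounded estimates from the first column of the regression table; the object names (`b0`, `marital`, `female`, `inter`, `pred`) are chosen here for illustration only.

```r
# Rounded coefficients copied from the first column of the regression table.
b0      <- 7.312                                          # married men (intercept)
marital <- c(married = 0, single = -0.092, widowed = -0.345,
             divorced = -0.464, separated = -0.478)       # shifts for men
female  <- 0.071                                          # married women vs. married men
inter   <- c(married = 0, single = -0.051, widowed = -0.141,
             divorced = -0.138, separated = -0.136)       # extra shifts for women

# Predicted life satisfaction for every marital status and gender.
pred <- data.frame(
  male   = b0 + marital,
  female = b0 + marital + female + inter
)
pred$female_minus_male <- pred$female - pred$male
round(pred, 3)
```

The last column is positive for married and single people and negative for widowed, divorced, and separated people, which matches the pattern described in the text.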
From the second column (second regression model) of the table we can
conclude that for a married person the intercept, i.e. the predicted
life satisfaction at age 0, is 7.84; in contrast to the first regression
model, this figure is higher for single people by 0.021, although this
difference is not statistically significant. For a widowed person the
intercept is lower by 0.684, for a divorced person by 1.083, and for a
separated person by 1.093. Moreover, we can see that life satisfaction
decreases as a person gets older. For instance, the predicted life
satisfaction of a married 25-year-old is 7.84 - 0.009 * 25 = 7.615,
while for a married 80-year-old it is 7.84 - 0.009 * 80 = 7.12. Life
satisfaction also declines with age for widowed people (by 0.003 per
year, -0.009 + 0.006) and declines more steeply for singles, by 0.02 per
year (-0.009 - 0.011). Unlike single and married people, the life
satisfaction of divorced and separated people tends to rise with age,
but only marginally, by 0.001 per year. In short, the second model tells
us that the life satisfaction of married, single, and widowed people
tends to decrease over their lifetime, while for divorced and separated
people it increases very slightly.
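The age slopes and predicted values for each group follow from adding the interaction terms to the baseline coefficients. The sketch below uses the rounded estimates from the second column of the regression table, so results may differ slightly from the unrounded fit; the names `intercept`, `slope`, and `predict_at` are our own.

```r
# Rounded coefficients copied from the second column of the regression table.
intercept <- c(married = 7.840, single = 7.840 + 0.021, widowed = 7.840 - 0.684,
               divorced = 7.840 - 1.083, separated = 7.840 - 1.093)
slope     <- c(married = -0.009, single = -0.009 - 0.011, widowed = -0.009 + 0.006,
               divorced = -0.009 + 0.010, separated = -0.009 + 0.010)

# Predicted life satisfaction at a given age for every marital status.
predict_at <- function(age) intercept + slope * age

round(slope, 3)
round(rbind(age_25 = predict_at(25), age_80 = predict_at(80)), 3)
```

Comparing the two rows shows how far each group moves between ages 25 and 80: the single group falls the most, while the divorced and separated groups barely change.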
FE and FD estimators of how the marital category affects life
satisfaction
# pid and syear identify the person and the survey year in the panel
FE <- plm(life_satisfaction ~ marital, data = soep,
          index = c("pid", "syear"), model = "within")
FD <- plm(life_satisfaction ~ marital, data = soep,
          index = c("pid", "syear"), model = "fd")
stargazer(FE, FD,
          type = "text",
          column.labels = c("FE", "FD"),
          dep.var.labels.include = FALSE,
          omit = "Constant",
          covariate.labels = c("single", "widowed", "divorced", "separated"),
          omit.stat = "F",
          model.numbers = FALSE)
##
## =========================================
## Dependent variable:
## ----------------------------
## FE FD
## -----------------------------------------
## single -0.169*** -0.100**
## (0.031) (0.050)
##
## widowed -0.389*** -0.749***
## (0.045) (0.075)
##
## divorced -0.001 -0.105*
## (0.038) (0.060)
##
## separated -0.311*** -0.243***
## (0.039) (0.051)
##
## -----------------------------------------
## Observations 128,752 104,414
## R2 0.002 0.001
## Adjusted R2 -0.231 0.001
## =========================================
## Note: *p<0.1; **p<0.05; ***p<0.01
From this table we see that the immediate change in life satisfaction
on becoming single or separated (FD: -0.100 and -0.243) is smaller than
the overall within-person level difference of being single or separated
(FE: -0.169 and -0.311), both relative to being married. Thus, a person
who has just become single or separated is on average somewhat more
satisfied than in the single or separated state overall. The opposite
holds for widowed and divorced people: the drop at the moment of
becoming widowed or divorced (FD: -0.749 and -0.105) is larger than the
overall level difference of being widowed or divorced (FE: -0.389 and
-0.001).
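The difference between the two estimators can be sketched formally. The notation is an assumption made for illustration: $y_{it}$ is the life satisfaction of person $i$ in year $t$, $M_{itk}$ is a dummy for marital category $k$ (married is the reference), and $a_i$ is a time-constant person effect.

```latex
\begin{align}
\text{Model:}\quad & y_{it} = a_i + \sum_{k} \beta_k M_{itk} + \varepsilon_{it} \\
\text{FE (within):}\quad & y_{it} - \bar{y}_{i}
   = \sum_{k} \beta_k \left( M_{itk} - \bar{M}_{ik} \right)
   + \left( \varepsilon_{it} - \bar{\varepsilon}_{i} \right) \\
\text{FD:}\quad & y_{it} - y_{i,t-1}
   = \sum_{k} \beta_k \left( M_{itk} - M_{i,t-1,k} \right)
   + \left( \varepsilon_{it} - \varepsilon_{i,t-1} \right)
\end{align}
```

Both transformations remove $a_i$. FE compares each year with the person's own average over all years, whereas FD compares each year only with the previous one, which is why FD is read here as the immediate change around a transition.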
Conclusion (10 pt)
Marriage is known to be an essential part of life for the majority of
people. Although it is generally believed that marriage makes life
better and contributes to greater overall happiness, some studies argue
against this view and provide valid arguments. In this study we analyzed
real data provided by SOEP and found the following insights:
- married and single people have a higher level of life satisfaction
than the rest of the observed groups;
- divorced people have a considerably lower level of life satisfaction
over the years;
- married and single females are happier than males; the opposite can be
seen among widowed, divorced, and separated individuals;
- single people become far less happy with age than married people.
To summarize, we can say that marital status is clearly related to the
level of happiness. Although this was not obvious at first, as the
levels of happiness among married and single people were similar, after
model estimation and visualization it became evident that single people
become far less happy with age than married ones. Therefore, we consider
the results of the analysis meaningful.
However, there are certain limitations to this research. We used the
lm() function, which fits an OLS regression. OLS gives reliable results
only under certain conditions, in particular when the OLS assumptions
are fulfilled and the relationship is indeed linear. We did not conduct
a preliminary analysis or a model comparison, so this is left for future
research.
The next steps could be to split the data into training and test sets,
calculate the mean squared error (MSE) on both, and compare the results
with, for instance, Ridge, Lasso or Elastic Net regression, or a neural
network.
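The train/test comparison could look roughly like the sketch below. It runs on simulated data rather than on SOEP: the data frame `toy`, its coefficients, the 80/20 split, and the helper `mse()` are all assumptions made for illustration.

```r
set.seed(1)  # reproducible toy example

# Simulated stand-in for the SOEP sample (all values are made up).
n   <- 1000
toy <- data.frame(age     = sample(18:90, n, replace = TRUE),
                  married = rbinom(n, 1, 0.6))
toy$life_satisfaction <- 7.5 - 0.01 * toy$age + 0.3 * toy$married + rnorm(n)

# 80/20 split into training and test rows.
train_idx <- sample(n, size = 0.8 * n)
train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

# Fit on the training data only, then compare in-sample and out-of-sample error.
fit <- lm(life_satisfaction ~ married * age, data = train)
mse <- function(model, data) {
  mean((data$life_satisfaction - predict(model, newdata = data))^2)
}

c(train = mse(fit, train), test = mse(fit, test))
```

A test MSE close to the training MSE would suggest that the model is not overfitting; the same `mse()` helper could then be applied to any alternative model fitted on the same training rows.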
References (5 pt)
Abreu-Afonso, José, Maria Meireles Ramos, Inês Queiroz-Garcia, and
Isabel Leal. 2022. “How Couple’s Relationship Lasts over Time? A
Model for Marital Satisfaction.” Psychological Reports
125 (3): 1601–27.
Perelli-Harris, Brienna, and Niels Blom. 2021. “So Happy Together…
Examining the Association Between Relationship Happiness, Socio-Economic
Status, and Family Transitions in the UK.” Population
Studies, 1–18.
Perelli-Harris, Brienna, Stefanie Hoherz, Trude Lappegård, and Ann
Evans. 2019. “Mind the ‘Happiness’ Gap: The
Relationship Between Cohabitation, Marriage, and Subjective Well-Being
in the United Kingdom, Australia, Germany, and Norway.”
Demography 56 (4): 1219–46.
Peterson, G. W., and K. R. Bush. 2012. Handbook of Marriage and the
Family. Springer US. https://books.google.de/books?id=7c3-r5QmAn0C.
Tao, Hung-Lin. 2019. “Marriage and Happiness: Evidence from
Taiwan.” Journal of Happiness Studies 20 (6): 1843–61.