Chapter 3 Botanic Gardens Survey Analysis
The primary concern of this research is to explore how and why people use the Singapore Botanic Gardens.
An intercept survey was conducted among visitors to the Gardens to capture how they use the Gardens and how they perceive them. Visitors were approached at random and invited to participate. Surveys were conducted over the course of one week, across both weekdays and weekends, in continuous time periods from 6AM to 10PM, and at different locations within the Northern (67 surveys) and Southern (66 surveys) cores of the Gardens (Figure 1). Responses were collected during the visit to reflect visitors’ ‘in-the-moment’ experiences. Surveys were either filled in by the surveyors through a ‘walk and talk’ approach or completed by the visitors themselves on the digital survey.
3.1 Loading data from googlesheets
library(googlesheets4)  # provides read_sheet()
data <- read_sheet("1BFAqMUUn3mDq0OdStNYnWZh0Rr392LXfPMJuXLKMXr0")
## Reading from 'Botanic Gardens Intercept Survey'
## Range "'Singapore Botanic Gardens Survey'"
A preview of the dataset:
When was the last time you have visited the Singapore Botanic Gardens? | How often do you normally visit this place? | How long do you plan to stay here today? | What is your reason for visiting the Singapore Botanic Gardens today? | Who are you here with today? | How did you get here today? | What do you like about this place? | What do you not like about this place? | If you could change or add something to the park, what would that be? | I am satisfied with the layout of the Singapore Botanical Gardens. | It is easy to find my way around the gardens. | It is easy to get to the gardens. | I am satisfied with the overall flora and fauna of the gardens. | There is a variety of food available at the gardens. | The food available at the gardens is affordable. | The garden is crowded. | Seating areas are available (e.g. benches and chairs) across the gardens. | Sheltered areas area available across the gardens. | There are many different activities to do within the gardens. | I feel safe within the gardens. | What is your age? | What gender do you identify as? | What neighbourhood do you live in? | Are there enough easily accessible green areas in the city? | Last one! Is there anything else you would like to share about the Singapore Botanic Gardens - something you absolutely love or hate about this park? | (Internal) What section of the park was this survey taken in? | (Internal) What is your (interviewer) name? | (Internal) Any other remarks you have as an interviewer for this particular survey | Submitted At | Token |

[Sample rows showing survey structure - full dataset contains 133 responses]
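The code in the following sections refers to short column names such as `last_visit`, `frequency_visit`, `transport_mode` and `age` rather than the full question text. The renaming step is not shown in this chapter, so the sketch below is only an assumption about how it could be done with `dplyr::rename()`. The pairing of short names to questions is inferred from how the names are used later, and the toy tibble `raw` with its answer values is hypothetical.

```r
library(dplyr)
library(tibble)

# Hypothetical two-row stand-in for the sheet; column names are the real question text
raw <- tibble(
  `When was the last time you have visited the Singapore Botanic Gardens?` =
    c("Less than a week ago", "This is my first time"),
  `How often do you normally visit this place?` = c("More than once a week", NA),
  `How did you get here today?` = c("MRT, Walk", "Taxi"),
  `What is your age?` = c(34, 61)
)

# rename() takes new_name = `old name`; the short names are assumed from later sections
renamed <- raw %>%
  rename(
    last_visit      = `When was the last time you have visited the Singapore Botanic Gardens?`,
    frequency_visit = `How often do you normally visit this place?`,
    transport_mode  = `How did you get here today?`,
    age             = `What is your age?`
  )

names(renamed)
```

Renaming once, immediately after loading, keeps the later plotting code free of backtick-quoted question text.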
3.3 Is there any relation between frequency_visit and last_visit?
As expected, the highest counts are people who visit the garden more than once a week and last visited the garden less than a week ago.
data %>%
  filter(last_visit != "This is my first time") %>%
  ggplot(aes(frequency_visit, last_visit)) +
  geom_count() +  # assumed: the geom layer is missing in the source; geom_count() sizes points by count
  theme_fivethirtyeight() +
  labs(title="Relation between frequency_visit and last_visit", y="",x="") +
  theme(plot.title = element_text(size=14, hjust=0.5))+
  theme(axis.text.x = element_text(angle = 20, hjust = 1))
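The relation described above can also be checked numerically with a cross-tabulation. The following is a minimal sketch on made-up responses (the tibble `visits` is hypothetical; only the answer labels come from the survey), using `dplyr::count()` to tally each combination.

```r
library(dplyr)

# Hypothetical responses; category labels follow the survey's answer options
visits <- tibble::tibble(
  frequency_visit = c("More than once a week", "More than once a week",
                      "Once a week", "Less than every month", NA),
  last_visit = c("Less than a week ago", "Less than a week ago",
                 "Less than a week ago", "Within the last year",
                 "This is my first time")
)

# Drop first-time visitors, then count each (frequency, last visit) pair,
# largest count first
pair_counts <- visits %>%
  filter(last_visit != "This is my first time") %>%
  count(frequency_visit, last_visit, sort = TRUE)

pair_counts
```

The first row of `pair_counts` is the most common combination, which is what the plot shows as the largest point.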
3.4 Counts of transport modes taken to the garden
We can plot a frequency bar chart to find out how many people got to the gardens by each transport mode.
data %>%
  separate_rows(transport_mode,sep=", ") %>%
  ggplot(aes(x = transport_mode)) +  # assumed: the ggplot() and geom_bar() calls are missing in the source
  geom_bar() +
  theme_fivethirtyeight() +
  labs(title="How did you get here today?", y="",x="") +
  theme(plot.title = element_text(size=14, hjust=0.5))+
  theme(axis.text.x = element_text(angle = 20, hjust = 1))
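Because respondents could name more than one transport mode, each answer is split into one row per mode before counting. A small sketch of that step on invented data (the tibble `answers` and the mode names are illustrative, not the survey's actual options):

```r
library(dplyr)
library(tidyr)

# Hypothetical multi-select answers
answers <- tibble::tibble(
  respondent = 1:3,
  transport_mode = c("MRT, Walk", "Bus", "MRT")
)

mode_counts <- answers %>%
  separate_rows(transport_mode, sep = ", ") %>%  # one row per respondent-mode pair
  count(transport_mode, name = "n_respondents")

mode_counts
```

The counts sum to more than the number of respondents whenever someone selects several modes, so the bars show mentions per mode rather than shares of respondents.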
3.5 What is the age distribution of the respondents?
data %>%
ggplot(aes(x=age)) + geom_histogram(binwidth=10)+
theme_fivethirtyeight() +
theme(plot.title = element_text(size=14, hjust=0.5))+
labs(title="Age distribution of respondents") +
theme(axis.title = element_text()) + ylab("Frequency")+xlab("Age")
3.6 Are there any differences between how different age groups reach SBG?
To create a facet plot of the travel modes per age group, the ages have to be grouped into bins first; otherwise we would get one plot per age.
With the age groups formed, we expect 14 plots.
setDT(data)  # the := assignments below require a data.table
data[age <1, agegroup := "0-1"]
data[age >0 & age <5, agegroup := "1-4"]
data[age >4 & age <10, agegroup := "5-9"]
data[age >9 & age <15, agegroup := "10-14"]
data[age >14 & age <20, agegroup := "15-19"]
data[age >19 & age <25, agegroup := "20-24"]
data[age >24 & age <30, agegroup := "25-29"]
data[age >29 & age <35, agegroup := "30-34"]
data[age >34 & age <40, agegroup := "35-39"]
data[age >39 & age <45, agegroup := "40-44"]
data[age >44 & age <50, agegroup := "45-49"]
data[age >49 & age <55, agegroup := "50-54"]
data[age >54 & age <60, agegroup := "55-59"]
data[age >59 & age <65, agegroup := "60-64"]
data[age >64 & age <70, agegroup := "65-69"]
data[age >69 & age <75, agegroup := "70-74"]
data[age >74 & age <80, agegroup := "75-79"]
data[age >79 & age <85, agegroup := "80-84"]
data[age >84, agegroup := "85+"]
data %>%
  separate_rows(transport_mode,sep=", ") %>%
  filter(transport_mode!="NA") %>%
  filter(!is.na(agegroup)) %>%  # assumed: this condition is garbled in the source
  filter(transport_mode!="") %>%
  ggplot() +  # assumed: the ggplot() call is missing in the source
  geom_bar(aes(x = transport_mode, fill=agegroup))+
  theme_fivethirtyeight() +
  labs(title="Count of transport modes by age groups", y="",x="") +
  theme(plot.title = element_text(size=14, hjust=0.5)) +
  theme(axis.text.y = element_text(hjust=0)) +
  theme(axis.text.x = element_text(angle = 20, hjust = 1))+
  ylab("frequency")+xlab("transport mode")
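The age binning in this section is written as one assignment per bin. As a more compact alternative (not the approach used above), base R's `cut()` can produce the same labels in one call. This is a sketch on hypothetical ages, assuming whole-number ages as in the survey.

```r
# Hypothetical ages
age <- c(0, 4, 17, 29, 30, 64, 85, 92)

# Lower edges of the bins, plus an open-ended top bin
breaks <- c(0, 1, seq(5, 85, by = 5), Inf)
labels <- c("0-1", "1-4",
            paste(seq(5, 80, by = 5), seq(9, 84, by = 5), sep = "-"),
            "85+")

# right = FALSE makes each bin [lower, upper), which matches conditions
# such as age > 4 & age < 10 for whole-number ages
agegroup <- cut(age, breaks = breaks, labels = labels, right = FALSE)

data.frame(age, agegroup)
```

`cut()` returns an ordered set of factor levels, so the age groups also appear in numeric order in plot legends rather than in alphabetical order.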
3.7 How do different age groups rate the gardens’ accessibility?
data %>%
  group_by(agegroup) %>%
  summarise(
    count = n(),
    mean = mean(satisfaction_access, na.rm = TRUE),
    sd = sd(satisfaction_access, na.rm = TRUE),
    median = median(satisfaction_access, na.rm = TRUE),
    IQR = IQR(satisfaction_access, na.rm = TRUE)
  )
## # A tibble: 14 x 6
## agegroup count mean sd median IQR
## <chr> <int> <dbl> <dbl> <dbl> <dbl>
## 1 15-19 3 5 0 5 0
## 2 20-24 11 4.18 0.751 4 1
## 3 25-29 16 4.44 0.814 5 1
## 4 30-34 25 4.24 0.970 4 1
## 5 35-39 20 4.65 0.489 5 1
## 6 40-44 13 4.15 1.14 4 1
## 7 45-49 4 4.25 0.957 4.5 1.25
## 8 50-54 12 4.08 1.31 4.5 1
## 9 55-59 6 4.5 0.837 5 0.75
## 10 60-64 8 4.75 0.463 5 0.25
## 11 65-69 6 4.83 0.408 5 0
## 12 70-74 7 4.86 0.378 5 0
## 13 75-79 1 5 NaN 5 0
## 14 80-84 1 5 NaN 5 0
3.8 Are people who visit the gardens more often, more satisfied with it?
Six of the eleven attributes (“layout”, “wayfinding”, “access”, “flora and fauna”, “crowdedness” and “activities”) had a higher mean Likert rating at the highest visit frequency (“More than once a week”) than at the lowest (“Less than every month”); “food variety”, “food affordability”, “seating”, “shelter” and “safety” were slightly lower. For “crowdedness”, a higher rating means stronger agreement that the garden is crowded rather than higher satisfaction. Only “layout” had a steadily increasing mean as the visit frequency increases. Six attributes (“food variety”, “food affordability”, “crowdedness”, “seating”, “wayfinding” and “flora and fauna”) had their highest mean Likert rating at “1-3 times a month”.
Step 1. For each frequency of visit, what is the mean, sd and median of the likert ratings for each attribute?
likert_visitfreq <- data %>%
  group_by(frequency_visit) %>%
  # a named list replaces the deprecated funs(); the output column suffixes (_mean, _sd, _median) are unchanged
  summarise_at(c("satisfaction_layout", "satisfaction_wayfinding", "satisfaction_access","satisfaction_florafauna","satisfaction_food_variety","satisfaction_food_affordability","satisfaction_crowdedness","satisfaction_seating","satisfaction_shelter","satisfaction_activities","satisfaction_safety"), list(mean = mean, sd = sd, median = median), na.rm = TRUE)
frequency_visit | satisfaction_layout_mean | satisfaction_wayfinding_mean | satisfaction_access_mean | satisfaction_florafauna_mean | satisfaction_food_variety_mean | satisfaction_food_affordability_mean | satisfaction_crowdedness_mean | satisfaction_seating_mean | satisfaction_shelter_mean | satisfaction_activities_mean | satisfaction_safety_mean | satisfaction_layout_sd | satisfaction_wayfinding_sd | satisfaction_access_sd | satisfaction_florafauna_sd | satisfaction_food_variety_sd | satisfaction_food_affordability_sd | satisfaction_crowdedness_sd | satisfaction_seating_sd | satisfaction_shelter_sd | satisfaction_activities_sd | satisfaction_safety_sd | satisfaction_layout_median | satisfaction_wayfinding_median | satisfaction_access_median | satisfaction_florafauna_median | satisfaction_food_variety_median | satisfaction_food_affordability_median | satisfaction_crowdedness_median | satisfaction_seating_median | satisfaction_shelter_median | satisfaction_activities_median | satisfaction_safety_median |
Less than every month | 4.268293 | 3.682927 | 4.170732 | 4.219512 | 2.902439 | 2.634146 | 2.804878 | 3.585366 | 3.292683 | 3.219512 | 4.536585 | 0.6334189 | 0.9858759 | 0.9461088 | 0.7909550 | 1.019923 | 0.9938837 | 0.9278877 | 0.9212928 | 1.0060791 | 0.9620861 | 0.6744465 | 4.0 | 4.0 | 4 | 4 | 3 | 3 | 3 | 4.0 | 3 | 3.0 | 5 |
1-3 times a month | 4.285714 | 4.428571 | 4.428571 | 4.642857 | 3.500000 | 3.071429 | 3.357143 | 3.857143 | 3.357143 | 3.571429 | 4.500000 | 1.0690450 | 0.6462062 | 0.7559289 | 0.6333237 | 1.091928 | 0.7300459 | 0.8418974 | 0.7703289 | 0.8418974 | 1.0163499 | 0.6504436 | 4.5 | 4.5 | 5 | 5 | 3 | 3 | 3 | 4.0 | 3 | 3.5 | 5 |
Once a week | 4.368421 | 4.263158 | 4.684210 | 4.631579 | 3.052632 | 2.578947 | 2.947368 | 3.578947 | 3.368421 | 3.842105 | 4.473684 | 0.6839856 | 0.7334928 | 0.4775669 | 0.4955946 | 1.129094 | 1.0173926 | 0.7798635 | 0.9612370 | 0.8306976 | 0.6882472 | 0.6117753 | 4.0 | 4.0 | 5 | 5 | 3 | 2 | 3 | 3.0 | 3 | 4.0 | 5 |
More than once a week | 4.416667 | 4.138889 | 4.472222 | 4.472222 | 2.722222 | 2.555556 | 3.111111 | 3.555556 | 3.277778 | 3.500000 | 4.444444 | 0.9673233 | 1.0461570 | 1.0277885 | 0.9098229 | 1.322576 | 0.9394358 | 1.0358648 | 1.1574466 | 1.1615534 | 1.0000000 | 0.9085135 | 5.0 | 4.5 | 5 | 5 | 3 | 3 | 3 | 3.5 | 3 | 3.5 | 5 |
NA | 4.478261 | 4.260870 | 4.608696 | 4.565217 | 2.913044 | 2.739130 | 2.304348 | 4.130435 | 3.826087 | 3.347826 | 4.739130 | 0.6653478 | 0.9153932 | 0.5830274 | 0.6623709 | 1.040675 | 1.2510865 | 1.2945614 | 0.9678631 | 1.1140497 | 0.9820524 | 0.4489778 | 5.0 | 4.0 | 5 | 5 | 3 | 3 | 2 | 4.0 | 4 | 3.0 | 5 |
Step 2. To transform the datatable from wide to long:
likert_visitfreq <- melt(setDT(likert_visitfreq),
  measure = patterns("mean", "sd","median"), variable.name = 'var', value.name = c('avg', 'sd','median'))
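likert_visitfreq<-likert_visitfreq %>%
  mutate(var=recode_factor(var, `1` = "layout", `2` = "wayfinding", `3` = "access",`4`="flora&fauna",`5`="food_variety",`6`="food_affordability",`7`="crowdedness",`8`="seating",`9`="shelter",`10`="activities",`11`="safety")) %>%
  filter(frequency_visit!="NA")  # assumed: this step is missing in the source; the printed output has no NA rows
freq_visit_levels <- c("Less than every month","1-3 times a month","Once a week","More than once a week")
likert_visitfreq <- likert_visitfreq %>%
  mutate(frequency_visit = factor(frequency_visit, levels = freq_visit_levels))  # assumed: this step is missing in the source
likert_visitfreq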
## frequency_visit var avg sd median
## 1 Less than every month layout 4.268293 0.6334189 4.0
## 2 1-3 times a month layout 4.285714 1.0690450 4.5
## 3 Once a week layout 4.368421 0.6839856 4.0
## 4 More than once a week layout 4.416667 0.9673233 5.0
## 5 Less than every month wayfinding 3.682927 0.9858759 4.0
## 6 1-3 times a month wayfinding 4.428571 0.6462062 4.5
## 7 Once a week wayfinding 4.263158 0.7334928 4.0
## 8 More than once a week wayfinding 4.138889 1.0461570 4.5
## 9 Less than every month access 4.170732 0.9461088 4.0
## 10 1-3 times a month access 4.428571 0.7559289 5.0
## 11 Once a week access 4.684211 0.4775669 5.0
## 12 More than once a week access 4.472222 1.0277885 5.0
## 13 Less than every month flora&fauna 4.219512 0.7909550 4.0
## 14 1-3 times a month flora&fauna 4.642857 0.6333237 5.0
## 15 Once a week flora&fauna 4.631579 0.4955946 5.0
## 16 More than once a week flora&fauna 4.472222 0.9098229 5.0
## 17 Less than every month food_variety 2.902439 1.0199235 3.0
## 18 1-3 times a month food_variety 3.500000 1.0919284 3.0
## 19 Once a week food_variety 3.052632 1.1290942 3.0
## 20 More than once a week food_variety 2.722222 1.3225756 3.0
## 21 Less than every month food_affordability 2.634146 0.9938837 3.0
## 22 1-3 times a month food_affordability 3.071429 0.7300459 3.0
## 23 Once a week food_affordability 2.578947 1.0173926 2.0
## 24 More than once a week food_affordability 2.555556 0.9394358 3.0
## 25 Less than every month crowdedness 2.804878 0.9278877 3.0
## 26 1-3 times a month crowdedness 3.357143 0.8418974 3.0
## 27 Once a week crowdedness 2.947368 0.7798635 3.0
## 28 More than once a week crowdedness 3.111111 1.0358648 3.0
## 29 Less than every month seating 3.585366 0.9212928 4.0
## 30 1-3 times a month seating 3.857143 0.7703289 4.0
## 31 Once a week seating 3.578947 0.9612370 3.0
## 32 More than once a week seating 3.555556 1.1574466 3.5
## 33 Less than every month shelter 3.292683 1.0060791 3.0
## 34 1-3 times a month shelter 3.357143 0.8418974 3.0
## 35 Once a week shelter 3.368421 0.8306976 3.0
## 36 More than once a week shelter 3.277778 1.1615534 3.0
## 37 Less than every month activities 3.219512 0.9620861 3.0
## 38 1-3 times a month activities 3.571429 1.0163499 3.5
## 39 Once a week activities 3.842105 0.6882472 4.0
## 40 More than once a week activities 3.500000 1.0000000 3.5
## 41 Less than every month safety 4.536585 0.6744465 5.0
## 42 1-3 times a month safety 4.500000 0.6504436 5.0
## 43 Once a week safety 4.473684 0.6117753 5.0
## 44 More than once a week safety 4.444444 0.9085135 5.0
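The wide-to-long step can also be expressed with `tidyr::pivot_longer()`, which reads the attribute name and the statistic directly from the column names instead of recoding numeric indices afterwards. This is an alternative to the `melt()` call used in this chapter, sketched on a hypothetical two-attribute summary (the values in `wide` are made up).

```r
library(dplyr)
library(tidyr)

# Hypothetical wide summary with the same column naming scheme
wide <- tibble::tibble(
  frequency_visit = c("Once a week", "More than once a week"),
  satisfaction_layout_mean = c(4.4, 4.5),
  satisfaction_food_variety_mean = c(3.1, 2.7),
  satisfaction_layout_sd = c(0.7, 0.9),
  satisfaction_food_variety_sd = c(1.1, 1.3),
  satisfaction_layout_median = c(4, 5),
  satisfaction_food_variety_median = c(3, 3)
)

# The first capture group becomes the `var` column; the second (".value")
# becomes one output column per statistic
long <- wide %>%
  pivot_longer(
    cols = -frequency_visit,
    names_to = c("var", ".value"),
    names_pattern = "satisfaction_(.*)_(mean|sd|median)"
  ) %>%
  rename(avg = "mean")

long
```

Because the attribute label comes from the column name itself, no separate recoding of `1`, `2`, ... into attribute names is needed.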
Step 3. To visualise:
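ggplot(data=likert_visitfreq, aes(x=frequency_visit, y = avg, colour = var, group = var)) +
  geom_line() +  # assumed: the line layer is missing in the source; group = var implies one line per attribute
  geom_dl(aes(label = var), method = list(dl.combine("first.points"), cex = 0.7))+
  theme_fivethirtyeight() +
  labs(title="Mean of likert rating per attribute by frequency of visit", y="",x="") +
  theme(plot.title = element_text(size=14, hjust=0.5)) +
  theme(axis.text.y = element_text(hjust=0)) +
  theme(axis.text.x = element_text(angle = 20, hjust = 1))  # assumed: final layer lost in the source, copied from the matching plot in 3.8.1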
3.8.1 Is there any difference in satisfaction for northern and southern section?
The responses showed higher satisfaction (higher mean of Likert rating) in the southern section for:
* layout
* food variety
* flora and fauna
* seating
* shelter
* safety
The responses showed higher satisfaction (higher mean of Likert rating) in the northern section for:
* wayfinding
* access
* food affordability
* crowdedness
* activities
likert_visitfreq_northsouth <- data %>%
  group_by(frequency_visit,park_section) %>%
  summarise_at(c("satisfaction_layout", "satisfaction_wayfinding", "satisfaction_access","satisfaction_florafauna","satisfaction_food_variety","satisfaction_food_affordability","satisfaction_crowdedness","satisfaction_seating","satisfaction_shelter","satisfaction_activities","satisfaction_safety"), list(mean = mean, sd = sd, median = median), na.rm = TRUE) %>%
  filter(frequency_visit!="NA")
likert_visitfreq_northsouth <- melt(setDT(likert_visitfreq_northsouth),
  measure = patterns("mean", "sd","median"), variable.name = 'var', value.name = c('avg', 'sd','median'))
likert_visitfreq_northsouth<-likert_visitfreq_northsouth %>%
  mutate(var=recode_factor(var, `1` = "layout", `2` = "wayfinding", `3` = "access",`4`="flora&fauna",`5`="food_variety",`6`="food_affordability",`7`="crowdedness",`8`="seating",`9`="shelter",`10`="activities",`11`="safety"))
freq_visit_levels <- c("Less than every month","1-3 times a month","Once a week","More than once a week")
likert_visitfreq_northsouth <- likert_visitfreq_northsouth %>%
  mutate(frequency_visit = factor(frequency_visit, levels = freq_visit_levels))  # assumed: this step is missing in the source
knitr::kable(likert_visitfreq_northsouth %>%
  group_by(park_section, var) %>%
  summarise(avg=mean(avg)) %>%
  arrange(var))  # assumed: the closing step is missing in the source; the table is ordered by attribute
park_section | var | avg |
Northern (nearest to Botanic Gardens MRT) | layout | 4.200000 |
Southern (nearest to Gleneagles Hospital) | layout | 4.462744 |
Northern (nearest to Botanic Gardens MRT) | wayfinding | 4.180357 |
Southern (nearest to Gleneagles Hospital) | wayfinding | 4.120860 |
Northern (nearest to Botanic Gardens MRT) | access | 4.461607 |
Southern (nearest to Gleneagles Hospital) | access | 4.450122 |
Northern (nearest to Botanic Gardens MRT) | flora&fauna | 4.441071 |
Southern (nearest to Gleneagles Hospital) | flora&fauna | 4.527800 |
Northern (nearest to Botanic Gardens MRT) | food_variety | 2.936607 |
Southern (nearest to Gleneagles Hospital) | food_variety | 3.141802 |
Northern (nearest to Botanic Gardens MRT) | food_affordability | 2.746429 |
Southern (nearest to Gleneagles Hospital) | food_affordability | 2.679627 |
Northern (nearest to Botanic Gardens MRT) | crowdedness | 3.177679 |
Southern (nearest to Gleneagles Hospital) | crowdedness | 2.949188 |
Northern (nearest to Botanic Gardens MRT) | seating | 3.255357 |
Southern (nearest to Gleneagles Hospital) | seating | 3.978044 |
Northern (nearest to Botanic Gardens MRT) | shelter | 3.085714 |
Southern (nearest to Gleneagles Hospital) | shelter | 3.530073 |
Northern (nearest to Botanic Gardens MRT) | activities | 3.541071 |
Southern (nearest to Gleneagles Hospital) | activities | 3.525041 |
Northern (nearest to Botanic Gardens MRT) | safety | 4.321429 |
Southern (nearest to Gleneagles Hospital) | safety | 4.643912 |
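The `avg` column in the table above is the mean of the per-frequency group means, so each visit-frequency group counts equally regardless of how many respondents it contains. The sketch below uses made-up ratings (the data frame `ratings` is hypothetical) to show how this can differ from the mean over all respondents.

```r
# Hypothetical ratings for one attribute in one park section
ratings <- data.frame(
  frequency_visit = c(rep("Less than every month", 4), "Once a week"),
  rating = c(3, 3, 4, 4, 5)
)

# Mean of the group means: each frequency group has equal weight
group_means <- tapply(ratings$rating, ratings$frequency_visit, mean)
mean_of_means <- mean(group_means)

# Pooled mean: each respondent has equal weight
pooled_mean <- mean(ratings$rating)

c(mean_of_means = mean_of_means, pooled_mean = pooled_mean)
```

Here the group means are 3.5 and 5, giving a mean of means of 4.25, while the pooled mean is 19 / 5 = 3.8. When group sizes are unequal, as they are in this survey, small differences between sections should therefore be read with care.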
ggplot(data=likert_visitfreq_northsouth, aes(x=frequency_visit, y = avg, colour = var, group = var)) +
  geom_line() +  # assumed: the line layer is missing in the source
  facet_wrap(~park_section) +  # assumed: one panel per park section, so lines do not mix the two sections
  geom_dl(aes(label = var), method = list(dl.combine("first.points"), cex = 0.7))+
  theme_fivethirtyeight() +
  labs(title="Mean of likert rating per attribute by frequency of visit", y="",x="") +
  theme(plot.title = element_text(size=14, hjust=0.5)) +
  theme(axis.text.y = element_text(hjust=0)) +
  theme(axis.text.x = element_text(angle = 20, hjust = 1))
3.9 Is there any difference in age distribution for northern and southern section?
Respondents in both sections were mostly around 30 years old. The southern section had a higher proportion of older respondents (over 50 years old).
gg <- data %>%
filter(park_section!="NA") %>%
ggplot(aes(x=age)) + geom_histogram(binwidth=10)+facet_wrap(~park_section)+
theme_fivethirtyeight() +
theme(plot.title = element_text(size=14, hjust=0.5))+
labs(title="Age Distribution in different park sections") +
theme(axis.title = element_text()) + ylab("Frequency")+xlab("Age")
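The statement about older respondents can be quantified as the share of respondents over 50 in each section. A minimal sketch on invented data (the tibble `respondents`, its ages and the shortened section labels are hypothetical):

```r
library(dplyr)

# Hypothetical respondents; section labels shortened for the example
respondents <- tibble::tibble(
  park_section = c("Northern", "Northern", "Northern",
                   "Southern", "Southern", "Southern"),
  age = c(28, 31, 55, 30, 62, 70)
)

# mean() of a logical vector gives the proportion of TRUE values
share_by_section <- respondents %>%
  group_by(park_section) %>%
  summarise(n = n(), share_over_50 = mean(age > 50))

share_by_section
```

Reporting the share next to the group size `n` makes it easier to judge whether a difference between the two histograms is more than a handful of respondents.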
3.10 Botanic Gardens Survey Analysis (previous try)
3.10.1 Loading the data
botanic_surveys <- read_sheet("1BFAqMUUn3mDq0OdStNYnWZh0Rr392LXfPMJuXLKMXr0")
## Reading from 'Botanic Gardens Intercept Survey'
## Range "'Singapore Botanic Gardens Survey'"
3.10.2 Lots of data cleaning
botanic_surveys <- subset(botanic_surveys,botanic_surveys$`(Internal) What is your (interviewer) name?`!="Test")
botanic_surveys <- subset(botanic_surveys,botanic_surveys$`(Internal) What is your (interviewer) name?`!="Test Kana")
botanic_surveys <- subset(botanic_surveys,botanic_surveys$`(Internal) What is your (interviewer) name?`!="Atr")
botanic_surveys <- subset(botanic_surveys,botanic_surveys$`(Internal) What is your (interviewer) name?`!="Mister")
table(botanic_surveys$`What gender do you identify as?`)
## Couple Female Male
## 2 63 68
botanic_surveys$`What neighbourhood do you live in?`<-toupper(botanic_surveys$`What neighbourhood do you live in?`)
botanic_surveys <- botanic_surveys %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "AMK", "ANG MO KIO")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "BY TIMAH","BUKIT TIMAH")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "NEARBY BUKIT TIMAH", "BUKIT TIMAH")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "BUKIT TIMAH ROAD", "BUKIT TIMAH")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "NEARBY NUKIT TIMAH", "BUKIT TIMAH")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "UPP BUKIT TIMAH", "BUKIT TIMAH")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "JURONG WEST.", "JURONG WEST")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "NOVENA.", "NOVENA")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "WATTEN ESTATE.", "WATTEN ESTATE")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "MALAYSIA.", "MALAYSIA")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "HOLLAND ROAD", "HOLLAND V")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "TELOK BELANGAH", "TELOK BLANGAH")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "YCK", "YIO CHU KANG")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "WOODLAND", "WOODLANDS")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "QUEENSTOWN AND CCK", "QUEENSTOWN")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "WEST", "OTHERS")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "NOT MENTIONED", "OTHERS")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "NORTH EAST", "OTHERS")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "JERVOIS ROAD", "OTHERS")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "SHERATON TOWER", "OTHERS")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "YOTEL", "OTHERS")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "STEVEN ROAD", "STEVENS ROAD")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "YOU PANDAN/FARREER MARKET", "FARRER ROAD")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "FARRER RD", "FARRER ROAD")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "CLEMENT", "CLEMENTI")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "JURONG EAST", "JURONG")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "JURONG WEST", "JURONG")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "JURONG OTHERS", "JURONG")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "HOUGANG AREA", "HOUGANG")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "MALAYSIASABAH", "MALAYSIA"))
botanic_countries <- botanic_surveys %>%
mutate(Overseas=`What neighbourhood do you live in?`%!in%c("ANG MO KIO","BEDOK","BISHAN","BOTANIC AREA","BUKIT BATOK","BUKIT MERAH","BUKIT PANJANG","BUKIT TIMAH",
botanic_surveys <- botanic_surveys %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "CLEMENTII", "CLEMENTI")) %>%
mutate(`What neighbourhood do you live in?` = str_replace(`What neighbourhood do you live in?`, "WOODLANDSS","WOODLANDS"))
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="ANG MO KIO"] <- "North-east"
## Warning: Unknown or uninitialised column: 'Region'.
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="BEDOK"] <- "East"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="BISHAN"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="BOTANIC AREA"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="BUKIT BATOK"] <- "West"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="BUKIT MERAH"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="BUKIT PANJANG"] <- "West"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="CHOA CHU KANG"] <- "West"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="CLEMENTI"] <- "West"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="CORONATION ROAD"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="EAST COAST"] <- "East"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="ENGLAND"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="FARRER ROAD"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="FRANCE"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="HAVELOCK"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="HO CHIH MINH"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="HOLLAND"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="HOLLAND V"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="HOUGANG"] <- "North-east"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="INDIA"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="INDONESIA"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="ITALY"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="JURONG"] <- "West"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="KAKI BUKIT"] <- "East"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="KATONG"] <- "East"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="KENT RIDGE"] <- "West"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="KOVAN"] <- "North-east"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="LONDON, UK"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="MALAYSIA"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="MALAYSIA SABAH"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="OTHERS"] <- "Others"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="NEWTON"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="ORCHARD"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="NOVENA"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="PASIR RIS"] <- "East"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="PAYA LEBAR"] <- "East"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="PENANG"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="POLAND"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="PUNGGOL"] <- "North-east"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="QUEENSTOWN"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="REDHILL"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="SELETAR"] <- "North-east"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="SENGKANG"] <- "North-east"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="SERANGOON"] <- "North-east"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="SINGAPORE"] <- "Others"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="STEVENS ROAD"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="TAIWAN"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="TAN KAH KEE"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="TANGLIN"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="TELOK BLANGAH"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="THAILAND"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="THOMSON"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="TIONG BAHRU"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="TOA PAYOH"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="USA"] <- "Overseas"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="WATTEN ESTATE"] <- "Central"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="WOODLANDS"] <- "North"
botanic_surveys$Region[botanic_surveys$`What neighbourhood do you live in?`=="YIO CHU KANG"] <- "North-east"
botanic_surveys <- mutate(botanic_surveys,`What is your age?`=5*round(`What is your age?`/5))
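The region assignments above repeat one statement per neighbourhood. The same mapping could be held in a named vector and applied in one step; this is a sketch of that alternative, not the method used in this chapter. The lookup below copies only a few of the pairs above, and the `neighbourhood` vector (including the unmatched value "ATLANTIS") is hypothetical.

```r
# Partial lookup copied from a few of the assignments above;
# it would need extending to cover every neighbourhood
region_lookup <- c(
  "ANG MO KIO" = "North-east",
  "BEDOK"      = "East",
  "BISHAN"     = "Central",
  "CLEMENTI"   = "West",
  "WOODLANDS"  = "North",
  "MALAYSIA"   = "Overseas"
)

# Hypothetical cleaned neighbourhood values
neighbourhood <- c("BEDOK", "WOODLANDS", "MALAYSIA", "ATLANTIS")

# Indexing a named vector by name matches whole strings only and
# returns NA for values that are not in the lookup
region <- unname(region_lookup[neighbourhood])
region
```

Exact matching also avoids the side effects of pattern-based replacement seen in the spelling fixes, where `str_replace()` treats the pattern as a regular expression (so the `.` in "NOVENA." matches any character) and a partial match such as "WOODLAND" turns "WOODLANDS" into "WOODLANDSS". Unmatched values surface as `NA`, which makes gaps in the mapping easy to find.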
Data parsing for multiple responses: reasons for visiting, with open-ended reasons grouped as “Others”
botanic_reasons <- botanic_surveys %>%
  select(`What is your reason for visiting the Singapore Botanic Gardens today?`) %>%
  separate(`What is your reason for visiting the Singapore Botanic Gardens today?`, c('1','2','3','4','5','6','7','8','9','10','11','12'), sep =',', remove = FALSE, fill="right")
botanic_reasons <- stack(botanic_reasons, select=-`What is your reason for visiting the Singapore Botanic Gardens today?`)
# assumed reconstruction: the call is garbled in the source; a frequency table is implied by the later use of Freq, and trimws() is a guess
botanic_reasons <- as.data.frame(table(trimws(botanic_reasons$values)))
names(botanic_reasons)[1] = 'Reasons'
'%!in%' <- function(x,y)!('%in%'(x,y))
botanic_reasons <- botanic_reasons %>%
mutate(Others=(Reasons%!in%c("To walk the dog","To exercise","To relax","To listen to and observe nature","To escape from the city","To take photographs","To meet others","To be with my children","To get inspiration","To meditate","To attend events (eg. movies in the park, symphony performances, and tours)")))
botanic_reasons_others <- botanic_reasons %>%
  group_by(Others) %>%
  summarise(Freq = sum(Freq))  # assumed: this step is missing in the source
botanic_reasons_others <- botanic_reasons_others %>%
  filter(Others=="TRUE") %>%
  mutate(Reasons = "Others")  # assumed: this step is missing in the source
botanic_reasons <- rbind(botanic_reasons,botanic_reasons_others)
botanic_reasons_less <- botanic_reasons %>%
  filter(Others==FALSE | Reasons=="Others")
botanic_who <- botanic_surveys %>%
  select(`Who are you here with today?`) %>%
  separate(`Who are you here with today?`, c('1','2','3','4','5','6','7'), sep =',', remove = FALSE, fill="right")
botanic_who <- stack(botanic_who, select=-`Who are you here with today?`)
# assumed reconstruction: the call is garbled in the source; trimws() is a guess
botanic_who <- as.data.frame(table(trimws(botanic_who$values)))
names(botanic_who)[1] = 'Who'
Data parsing for multiple response: commute
botanic_commute <- botanic_surveys %>%
  select(`How did you get here today?`) %>%
  separate(`How did you get here today?`, c('1','2','3','4','5','6','7'), sep =',', remove = FALSE, fill="right")
botanic_commute <- stack(botanic_commute, select=-`How did you get here today?`)
# assumed reconstruction: the call is garbled in the source; trimws() is a guess
botanic_commute <- as.data.frame(table(trimws(botanic_commute$values)))
names(botanic_commute)[1] = 'mode'
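The multiple-response parsing above follows the same pattern each time: split the answer on commas, count each option, and (for the reasons) group free-text answers as "Others". A compact sketch of that pattern with `separate_rows()`, `if_else()` and `count()` is shown below as an alternative. The `answers` tibble and the free-text reason are hypothetical, and `listed` holds only three of the predefined options from the full list in the code above.

```r
library(dplyr)
library(tidyr)

# A few of the predefined options from the survey
listed <- c("To exercise", "To relax", "To take photographs")

# Hypothetical multi-select answers, including one free-text reason
answers <- tibble::tibble(
  reason = c("To exercise, To relax",
             "To relax",
             "Birdwatching tour, To take photographs")
)

reason_counts <- answers %>%
  separate_rows(reason, sep = ",\\s*") %>%   # split on commas and drop the spaces after them
  mutate(reason = if_else(reason %in% listed, reason, "Others")) %>%
  count(reason, name = "Freq")

reason_counts
```

One caveat applies to any comma-based split of this survey: the option "To attend events (eg. movies in the park, symphony performances, and tours)" itself contains commas, so it would be broken into several pieces and counted under "Others" unless it is handled before splitting.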
3.10.3 Frequency plots
ggplot(data=botanic_surveys, aes(botanic_surveys$`What is your age?`)) +
geom_histogram(breaks=seq(15, 80, by = 10),
col="black") +
theme(axis.title = element_text()) + ylab("Frequency")+xlab("Age")+
ggtitle("Histogram for age of respondents")
positions <- c("This is my first time", "Longer than a year ago", "Within the last year", "Within the last month","Less than a week ago")
ggplot(botanic_surveys, aes(botanic_surveys$`When was the last time you have visited the Singapore Botanic Gardens?`))+
ggtitle("When was the last time you have visited \n the Singapore Botanic Gardens?")+
geom_bar(fill=c("#6baed6", "#bdd7e7","#9ecae1","#c6dbef","#08519c"))+
geom_text(stat='count', aes(label=..count..), vjust=0, hjust=-2,color="black")+
scale_x_discrete(limits = positions)+
theme(plot.title = element_text(size=14, hjust=0.5)) +
theme(axis.text.y = element_text(hjust=0))+
theme(axis.title = element_text()) + ylab("Frequency")+xlab(" ")
positions <- c("Less than every month", "1-3 times a month", "Once a week","More than once a week")
botanic_surveys %>%
filter(`How often do you normally visit this place?`!="NA") %>%
ggplot(aes(`How often do you normally visit this place?`))+
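ggtitle("How often do you normally visit this place?")+
geom_bar()+ # bar layer assumed: the labels below use stat='count'
geom_text(stat='count', aes(label=after_stat(count)), vjust=0, hjust=-1,color="black")+
scale_x_discrete(limits = positions)+ # axis order assumed from the positions vector defined above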
theme(plot.title = element_text(size=14, hjust=0.5)) +
theme(axis.text.y = element_text(hjust=0))+
theme(axis.title = element_text()) + ylab("Frequency")+xlab(" ")
a <- table(botanic_surveys$`When was the last time you have visited the Singapore Botanic Gardens?`,botanic_surveys$`How often do you normally visit this place?`)
a <- as.data.frame(a)
names(a)[1] = 'Last Visit'
names(a)[2] = 'Visit Frequency'
positions <- c("Less than every month", "1-3 times a month", "Once a week","More than once a week")
# Filter the cross-tabulation itself: the bars are drawn from a, so filtering botanic_surveys would have no effect
a %>%
filter(`Last Visit`!="This is my first time") %>%
ggplot() + geom_bar(aes(y = Freq, x = `Visit Frequency`, fill = `Last Visit`),
stat="identity") +
theme_fivethirtyeight() +
theme(plot.title = element_text(size=14, hjust=0.5))+
labs(title="Count of visit frequency by last visit") +
theme(axis.title = element_text()) + ylab("Frequency")+xlab(" ")+
theme(legend.spacing.x = unit(0.3, 'cm'))+
scale_x_discrete(limits = positions) # axis order assumed from the positions vector defined above
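The stacked bar chart is drawn from a cross-tabulation reshaped to long format. As a small sketch of that reshaping step on made-up answers (the data frame `answers` and its column names are hypothetical): `table()` called on two `$` columns gives an unnamed two-way table, and `as.data.frame()` turns it into one row per combination with the default column names `Var1`, `Var2` and `Freq`.

```r
# Hypothetical answers to two single-choice questions
answers <- data.frame(
  last_visit = c("Within the last month", "Within the last year",
                 "Within the last month", "Less than a week ago"),
  visit_freq = c("Once a week", "Less than every month",
                 "1-3 times a month", "Once a week"),
  stringsAsFactors = FALSE
)

# Two-way contingency table, then long format: Var1, Var2, Freq
xtab <- as.data.frame(table(answers$last_visit, answers$visit_freq))

# Give the two classification columns readable names
names(xtab)[1:2] <- c("Last Visit", "Visit Frequency")

# Every combination is present, including those with a count of zero
print(xtab)
```

Zero-count combinations stay in the long table, which is harmless for a stacked bar chart but worth filtering out before printing.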
positions <- c("< 15 minutes", "15 - 30 minutes", "30 minutes to an hour","> 1 hour")
ggplot(botanic_surveys, aes(`How long do you plan to stay here today?`))+
ggtitle("How long do you plan to stay here today?")+
geom_bar()+ # bar layer assumed: the labels below use stat='count'
scale_x_discrete(limits = positions)+
geom_text(stat='count', aes(label=after_stat(count)), vjust=-1)+
theme(plot.title = element_text(size=14, hjust=0.5)) +
theme(axis.text.y = element_text(hjust=0))+
theme(axis.title = element_text()) + ylab("Frequency")+xlab(" ")
botanic_reasons_less %>%
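ggplot(aes(x = Reasons, y = Freq))+ # plot call and bar layer assumed from the Freq labels below
geom_col()+
coord_flip()+ # horizontal bars assumed from the hjust settings
geom_text(stat='identity', aes(label=Freq), vjust=0, hjust=-1,color="black")+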
theme_fivethirtyeight() +
labs(title="What is your reason for visiting the \n Singapore Botanic Gardens today?")+
theme(plot.title = element_text(size=14, hjust=0.5)) +
theme(axis.text.y = element_text(hjust=0))+
theme(axis.title = element_text()) + ylab("Frequency")+xlab(" ")
botanic_who %>%
filter(Who!="") %>%
ggplot(aes(x = Who, y = Freq))+ # plot call and bar layer assumed from the Freq labels below
geom_col(aes(fill = Who))+
coord_flip()+ # horizontal bars assumed from the hjust settings
geom_text(stat='identity', aes(label=Freq), vjust=0, hjust=-1,color="black")+
scale_fill_brewer(palette = "Blues")+
theme_fivethirtyeight() +
labs(title="Who are you here with today?")+
theme(plot.title = element_text(size=14, hjust=0.5)) +
theme(axis.text.y = element_text(hjust=0))+
theme(axis.title = element_text()) + ylab("Frequency")+xlab(" ")
botanic_commute %>%
filter(mode!="") %>%
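ggplot(aes(x = mode, y = Freq))+ # plot call and bar layer assumed from the Freq labels below
geom_col()+
coord_flip()+ # horizontal bars assumed from the hjust settings
geom_text(stat='identity', aes(label=Freq), vjust=0, hjust=-1,color="black")+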
theme_fivethirtyeight() +
theme(plot.title = element_text(size=14, hjust=0.5)) +
theme(axis.text.y = element_text(hjust=0))+
labs(title="How did you get here today?")+
theme(axis.title = element_text()) + ylab("Frequency")+xlab(" ")
library(ggridges)
## Attaching package: 'ggridges'
## The following object is masked from 'package:ggplot2':
## scale_discrete_manual
botanic_surveys %>%
filter(`What gender do you identify as?`!="Couple") %>%
ggplot(aes(x = `What is your age?`, y = `What gender do you identify as?`)) +
geom_density_ridges(aes(fill = `What gender do you identify as?`),scale=0.6) +
scale_fill_manual(values = c("#c60d11", "#1c2e7c"))+
theme_fivethirtyeight() +
theme(plot.title = element_text(size=14, hjust=0.5))+
labs(title="Age Distribution by Gender") +
theme(axis.title = element_text()) + ylab(" ")+xlab("Age")
## Picking joint bandwidth of 5.31
3.10.4 Inter-variable frequency plots
botanic_mode_neighbourhood_part1 <- botanic_surveys %>%
select(Region, `How did you get here today?`) %>%
separate(`How did you get here today?`, c('1','2'), sep =',', remove = FALSE, fill="right") %>%
select(Region, `1`) # keep the first listed mode
botanic_mode_neighbourhood_part2 <- botanic_surveys %>%
select(Region, `How did you get here today?`) %>%
separate(`How did you get here today?`, c('1','2'), sep =',', remove = FALSE, fill="right") %>%
select(Region, `2`) # keep the second listed mode, if any
names(botanic_mode_neighbourhood_part2)[2] = '1'
botanic_mode_neighbourhood <- na.omit(rbind(botanic_mode_neighbourhood_part1,botanic_mode_neighbourhood_part2))
names(botanic_mode_neighbourhood)[2] = 'Mode'
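# Trim the space left after each comma and drop empty answers
# (x[-which(cond), ] would return zero rows when nothing matches, so filter() is used instead)
botanic_mode_neighbourhood <- botanic_mode_neighbourhood %>%
mutate(Mode = trimws(Mode)) %>%
filter(Mode != "")
knitr::kable(table(botanic_mode_neighbourhood$Region,botanic_mode_neighbourhood$Mode))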
Region | Bicycle/Personal mobility device | Bus | Car | Grab | Lorry | Motorbike | MRT | On foot | Taxi |
Central | 0 | 4 | 22 | 0 | 0 | 0 | 7 | 11 | 2 |
East | 0 | 2 | 4 | 0 | 1 | 0 | 3 | 1 | 0 |
North | 1 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0 |
North-east | 0 | 0 | 7 | 0 | 0 | 0 | 9 | 2 | 1 |
Others | 1 | 1 | 1 | 0 | 0 | 0 | 2 | 1 | 0 |
Overseas | 0 | 4 | 0 | 0 | 0 | 1 | 12 | 2 | 3 |
West | 0 | 0 | 6 | 1 | 0 | 0 | 7 | 1 | 0 |
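Raw counts are hard to compare across regions of very different size, so it can help to convert each row to percentages. The sketch below re-enters the counts from the table above by hand (the object names `counts` and `shares` and the shortened label "Bicycle/PMD" are illustrative) and uses `prop.table()` with `margin = 1` to compute row shares.

```r
modes <- c("Bicycle/PMD", "Bus", "Car", "Grab", "Lorry",
           "Motorbike", "MRT", "On foot", "Taxi")
regions <- c("Central", "East", "North", "North-east",
             "Others", "Overseas", "West")

# Counts copied from the region by mode table, one row per region
counts <- matrix(c(
  0, 4, 22, 0, 0, 0,  7, 11, 2,
  0, 2,  4, 0, 1, 0,  3,  1, 0,
  1, 1,  1, 0, 0, 0,  2,  0, 0,
  0, 0,  7, 0, 0, 0,  9,  2, 1,
  1, 1,  1, 0, 0, 0,  2,  1, 0,
  0, 4,  0, 0, 0, 1, 12,  2, 3,
  0, 0,  6, 1, 0, 0,  7,  1, 0
), nrow = 7, byrow = TRUE, dimnames = list(Region = regions, Mode = modes))

# Share of each mode within a region, in percent (rows sum to 100)
shares <- round(100 * prop.table(counts, margin = 1), 1)
print(shares[, c("Car", "MRT", "On foot")])
```

On these counts, Car accounts for roughly 48% of the modes mentioned by Central residents (22 of 46), while MRT accounts for about 55% of those mentioned by overseas visitors (12 of 22). Since a respondent can name more than one mode, the shares describe mentions rather than respondents.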
botanic_mode_neighbourhood_part1 <- botanic_surveys %>%
select(`What neighbourhood do you live in?`, `How did you get here today?`) %>%
separate(`How did you get here today?`, c('1','2'), sep =',', remove = FALSE, fill="right") %>%
filter(`What neighbourhood do you live in?`!="NA") %>%
select(`What neighbourhood do you live in?`, `1`) # keep the first listed mode
botanic_mode_neighbourhood_part2 <- botanic_surveys %>%
select(`What neighbourhood do you live in?`, `How did you get here today?`) %>%
separate(`How did you get here today?`, c('1','2'), sep =',', remove = FALSE, fill="right") %>%
filter(`What neighbourhood do you live in?`!="NA") %>%
select(`What neighbourhood do you live in?`, `2`) # keep the second listed mode, if any
names(botanic_mode_neighbourhood_part2)[2] = '1'
botanic_mode_neighbourhood <- na.omit(rbind(botanic_mode_neighbourhood_part1,botanic_mode_neighbourhood_part2))
names(botanic_mode_neighbourhood)[2] = 'Mode'
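# Trim the space left after each comma and drop empty answers
# (x[-which(cond), ] would return zero rows when nothing matches, so filter() is used instead)
botanic_mode_neighbourhood <- botanic_mode_neighbourhood %>%
mutate(Mode = trimws(Mode)) %>%
filter(Mode != "")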
knitr::kable(table(botanic_mode_neighbourhood$`What neighbourhood do you live in?`,botanic_mode_neighbourhood$Mode))
Neighbourhood | Bicycle/Personal mobility device | Bus | Car | Grab | Lorry | Motorbike | MRT | On foot | Taxi |
ANG MO KIO | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 0 |
BEDOK | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
BISHAN | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 0 |
BOTANIC AREA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
BUKIT BATOK | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
BUKIT MERAH | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
BUKIT PANJANG | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
BUKIT TIMAH | 0 | 0 | 7 | 0 | 0 | 0 | 3 | 8 | 0 |
CHOA CHU KANG | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
CLEMENTI | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0 |
CORONATION ROAD | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
EAST COAST | 0 | 1 | 3 | 0 | 0 | 0 | 1 | 0 | 0 |
ENGLAND | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
FARRER ROAD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 |
FRANCE | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
HAVELOCK | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
HO CHIH MINH | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
HOLLAND | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
HOLLAND V | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 0 |
HOUGANG | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 1 |
INDIA | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 |
INDONESIA | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
ITALY | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
JURONG | 0 | 0 | 3 | 0 | 0 | 0 | 3 | 1 | 0 |
KAKI BUKIT | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
KATONG | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
KENT RIDGE | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 |
KOVAN | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
LONDON, UK | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
MALAYSIA | 0 | 1 | 0 | 0 | 0 | 1 | 2 | 0 | 0 |
MALAYSIA SABAH | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
NEWTON | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0 |
NOVENA | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
ORCHARD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 |
OTHERS | 1 | 1 | 0 | 0 | 0 | 0 | 2 | 1 | 0 |
PASIR RIS | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
PAYA LEBAR | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
PENANG | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
POLAND | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
PUNGGOL | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
QUEENSTOWN | 0 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 1 |
REDHILL | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
SELETAR | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
SENGKANG | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 |
SERANGOON | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
SINGAPORE | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
STEVENS ROAD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
TAIWAN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
TAN KAH KEE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
TANGLIN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 1 |
TELOK BLANGAH | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
THAILAND | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
THOMSON | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
TIONG BAHRU | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
TOA PAYOH | 0 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0 |
USA | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
WATTEN ESTATE | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
WOODLANDS | 1 | 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0 |
YIO CHU KANG | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |
botanic_age_reason_part1 <- botanic_surveys %>%
select(`What is your age?`, `What is your reason for visiting the Singapore Botanic Gardens today?`) %>%
separate(`What is your reason for visiting the Singapore Botanic Gardens today?`, c('1','2','3','4','5','6'), sep =',', remove = FALSE, fill="right") %>%
filter(`What is your age?`!="NA") %>%
select(`What is your age?`, `1`) # keep the first listed reason
botanic_age_reason_part2 <- botanic_surveys %>%
select(`What is your age?`, `What is your reason for visiting the Singapore Botanic Gardens today?`) %>%
separate(`What is your reason for visiting the Singapore Botanic Gardens today?`, c('1','2','3','4','5','6'), sep =',', remove = FALSE, fill="right") %>%
filter(`What is your age?`!="NA") %>%
select(`What is your age?`, `2`) # keep the second listed reason
botanic_age_reason_part3 <- botanic_surveys %>%
select(`What is your age?`, `What is your reason for visiting the Singapore Botanic Gardens today?`) %>%
separate(`What is your reason for visiting the Singapore Botanic Gardens today?`, c('1','2','3','4','5','6'), sep =',', remove = FALSE, fill="right") %>%
filter(`What is your age?`!="NA") %>%
select(`What is your age?`, `3`) # keep the third listed reason
botanic_age_reason_part4 <- botanic_surveys %>%
select(`What is your age?`, `What is your reason for visiting the Singapore Botanic Gardens today?`) %>%
separate(`What is your reason for visiting the Singapore Botanic Gardens today?`, c('1','2','3','4','5','6'), sep =',', remove = FALSE, fill="right") %>%
filter(`What is your age?`!="NA") %>%
select(`What is your age?`, `4`) # keep the fourth listed reason
botanic_age_reason_part5 <- botanic_surveys %>%
select(`What is your age?`, `What is your reason for visiting the Singapore Botanic Gardens today?`) %>%
separate(`What is your reason for visiting the Singapore Botanic Gardens today?`, c('1','2','3','4','5','6'), sep =',', remove = FALSE, fill="right") %>%
filter(`What is your age?`!="NA") %>%
select(`What is your age?`, `5`) # keep the fifth listed reason
botanic_age_reason_part6 <- botanic_surveys %>%
select(`What is your age?`, `What is your reason for visiting the Singapore Botanic Gardens today?`) %>%
separate(`What is your reason for visiting the Singapore Botanic Gardens today?`, c('1','2','3','4','5','6'), sep =',', remove = FALSE, fill="right") %>%
filter(`What is your age?`!="NA") %>%
select(`What is your age?`, `6`) # keep the sixth listed reason
names(botanic_age_reason_part2)[2] = '1'
names(botanic_age_reason_part3)[2] = '1'
names(botanic_age_reason_part4)[2] = '1'
names(botanic_age_reason_part5)[2] = '1'
names(botanic_age_reason_part6)[2] = '1'
botanic_age_reason <- na.omit(rbind(botanic_age_reason_part1,botanic_age_reason_part2,botanic_age_reason_part3,botanic_age_reason_part4,botanic_age_reason_part5,botanic_age_reason_part6))
names(botanic_age_reason)[2] = 'Activity'
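# Trim the space left after each comma and drop empty answers
# (x[-which(cond), ] would return zero rows when nothing matches, so filter() is used instead)
botanic_age_reason <- botanic_age_reason %>%
mutate(Activity = trimws(Activity)) %>%
filter(Activity != "")
a <- table(botanic_age_reason$`What is your age?`,botanic_age_reason$Activity)
a <- as.data.frame(a)
# Column names assumed from how Age and Activity are used in the plot below
names(a)[1] = 'Age'
names(a)[2] = 'Activity'
a <- a %>%
filter(Activity%in%c("To relax","To exercise","To listen to and observe nature","To escape from the city","To take photographs","To meet others","To be with my children","To get inspiration","To walk the dog","To meditate")) %>%
group_by(Activity) %>%
# Share of each activity's mentions that comes from each age, in percent
mutate(per=(round(Freq/sum(Freq)*100, 2))) %>%
as.data.frame()
a$per <- as.numeric(as.character(a$per))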
botanic_age_reason_elderly <- botanic_age_reason %>%
filter(`What is your age?`>=60)
botanic_surveys_elderly <- botanic_surveys %>%
filter(`What is your age?`>=60)
#table(botanic_surveys_elderly$`Who are you here with today?`)
ggplot(a,aes(Age,Activity)) + geom_tile(aes(alpha = per, fill = Activity)) +
scale_alpha(range = c(0.1, 1))+theme_fivethirtyeight()+
labs(title="Count of age group and activity") +
theme(axis.title = element_text("")) + ylab("Activity")+xlab("Age")
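In the tile plot, opacity is mapped to `per`, the percentage of an activity's mentions that falls in each age, rather than to the raw count. A minimal base R sketch of that within-group percentage, using made-up counts and hypothetical age bands, is:

```r
# Hypothetical long-format counts of age band by activity
counts <- data.frame(
  Age = rep(c("15-29", "30-59", "60+"), times = 2),
  Activity = rep(c("To relax", "To exercise"), each = 3),
  Freq = c(10, 25, 15, 6, 9, 5),
  stringsAsFactors = FALSE
)

# ave() returns the activity total on every row, so each Freq is divided
# by the total of its own activity
activity_total <- ave(counts$Freq, counts$Activity, FUN = sum)
counts$per <- round(100 * counts$Freq / activity_total, 2)

print(counts)
```

Within each activity the percentages sum to 100, so a dark tile indicates an age band that dominates that activity, not necessarily one with a large absolute count.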
3.10.5 Word clouds for likes and dislikes of the garden
slam_url <- ""
Likes of the gardens
like <- botanic_surveys$`What do you like about this place?` #open-response variable to a new variable for text transformations
like <- gsub("[^[:graph:]]", " ", like) #get rid of non graphical characters
like <- gsub("[[:punct:]]", "", like)# Remove punctuation
like <- gsub("\\brt\\b", "", like)# Remove the standalone token "rt"; an unanchored "rt" pattern would also delete those letters inside words
like <- gsub("\\s+"," ",like)
like <- gsub("[ |\t]{3,}", "", like)# Remove tabs
like <- gsub("^ ", "", like)# Remove blank spaces at the beginning
like <- gsub(" $", "", like)# Remove blank spaces at the end
like <- tolower(like)#convert all text to lower case
## Warning in like == c("no", "nil", "none"): longer object length is not a
## multiple of shorter object length
## 133
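The warning above is what R emits when `==` compares a long vector against a shorter one whose length does not divide it evenly: the shorter vector is recycled position by position instead of being treated as a set. A small sketch of the difference, using made-up answers (the vectors `answers` and `placeholders` are hypothetical):

```r
answers <- c("nil", "nice trees", "none", "no", "quiet")
placeholders <- c("no", "nil", "none")

# == recycles placeholders along answers: "nil" is compared with "no",
# "nice trees" with "nil", "none" with "none", "no" with "no", "quiet" with "nil"
recycled <- suppressWarnings(answers == placeholders)

# %in% asks whether each answer matches any of the placeholders
member <- answers %in% placeholders

print(data.frame(answers, recycled, member))
```

Here the first answer, "nil", is a placeholder but is missed by `==` because it happens to be lined up against "no". Membership tests of this kind are therefore usually written with `%in%`.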
myCorpus <- Corpus(VectorSource(like))
myCorpus <- tm_map(myCorpus, removeNumbers)
myCorpus <- tm_map(myCorpus, removeWords, stopwords("english")) #removes common english stopwords
myCorpus <- tm_map(myCorpus, removeWords, c("don?t","nil","none","can?t","it?")) #You can specify words to remove
#build a term-document matrix
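myTDM = TermDocumentMatrix(myCorpus, control = list(wordLengths = c(1, Inf))) # wordLengths is the control option for minimum term length; minWordLength is not a recognised option and is ignored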
m = as.matrix(myTDM)
v = sort(rowSums(m), decreasing = TRUE)
d = data.frame(word = names(v),freq=v)
wordcloud(d$word, freq=d$freq, min.freq=1,scale=c(3,0.5), max.words=500, random.order=FALSE, rot.per=0.35,
use.r.layout=FALSE, colors=brewer.pal(8, "Dark2"))
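The matrix steps above (`as.matrix`, `rowSums`, `sort`) amount to counting how often each term occurs across all responses. As a rough base R sketch of the same idea on three made-up responses, without the tm package (all object names here are illustrative):

```r
# Hypothetical cleaned responses
docs <- c("fresh air and greenery", "quiet greenery", "fresh air")

# Tokenise on single spaces and collect the vocabulary
tokens <- strsplit(docs, " ", fixed = TRUE)
terms <- sort(unique(unlist(tokens)))

# Term-document matrix: one row per term, one column per response
tdm <- vapply(
  tokens,
  function(tok) as.integer(table(factor(tok, levels = terms))),
  integer(length(terms))
)
rownames(tdm) <- terms

# Total occurrences per term, most frequent first
freq <- sort(rowSums(tdm), decreasing = TRUE)
d <- data.frame(word = names(freq), freq = as.vector(freq))
print(d)
```

The resulting `d` has the same shape as the data frame passed to `wordcloud()`: one row per word with its overall frequency.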
#terms appearing at least 6 times, then word associations of "green"
findFreqTerms(myTDM, lowfreq =6)
## [1] "air" "beautiful" "city" "fresh" "garden"
## [6] "good" "green" "greenery" "nature" "nice"
## [11] "peaceful" "place" "plants" "quiet" "trees"
findAssocs(myTDM,terms="green",corlimit = 0.1)
## green
## near 0.35
## ample 0.24
## better 0.24
## centre 0.24
## compared 0.24
## conveniently 0.24
## day 0.24
## field 0.24
## greeneries 0.24
## high 0.24
## hot 0.24
## keep 0.24
## located 0.24
## oasis 0.24
## office 0.24
## old 0.24
## parking 0.24
## parks 0.24
## quite 0.24
## rise 0.24
## road 0.24
## surroundings 0.24
## time 0.24
## whole 0.24
## great 0.21
## spaces 0.18
## cool 0.15
## free 0.15
## hard 0.15
## home 0.15
## large 0.15
## like 0.15
## pleasant 0.15
## fauna 0.11
## flora 0.11
## get 0.11
## space 0.11
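The scores listed above can be read as correlations between the per-response counts of "green" and those of each other term. The sketch below illustrates that calculation with base R `cor()` on a small made-up term-document matrix; it is meant as an illustration of the idea, not as the tm implementation, and the matrix values are hypothetical.

```r
# Hypothetical term-document matrix: rows are terms, columns are responses
tdm <- rbind(
  green = c(1, 0, 1, 0, 0, 1),
  near  = c(1, 0, 1, 0, 0, 0),
  quiet = c(0, 1, 0, 1, 0, 0)
)

# Pearson correlation between the "green" row and every other term row
others <- tdm[rownames(tdm) != "green", , drop = FALSE]
assoc <- apply(others, 1, function(term) cor(tdm["green", ], term))
assoc <- round(assoc, 2)

# Keep only the terms at or above the chosen correlation limit
print(assoc[assoc >= 0.1])
```

A term that tends to occur in the same responses as "green" gets a positive score, while a term that occurs mainly in other responses gets a negative one and is dropped by the limit.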
Dislikes of the gardens
#dislike wordcloud
dislike <- botanic_surveys$`What do you not like about this place?` #open-response variable to a new variable for text transformations
dislike <- gsub("[^[:graph:]]", " ", dislike) #get rid of non graphical characters
dislike <- gsub("[[:punct:]]", "", dislike)# Remove punctuation
dislike <- gsub("\\brt\\b", "", dislike)# Remove the standalone token "rt"; an unanchored "rt" pattern would also delete those letters inside words
dislike <- gsub("\\s+"," ",dislike)
dislike <- gsub("[ |\t]{3,}", "", dislike)# Remove tabs
dislike <- gsub("^ ", "", dislike)# Remove blank spaces at the beginning
dislike <- gsub(" $", "", dislike)# Remove blank spaces at the end
dislike <- tolower(dislike)#convert all text to lower case
## Warning in dislike == c("no", "nil", "none", "NA"): longer object length is
## not a multiple of shorter object length
## 116 9
myCorpus <- Corpus(VectorSource(dislike))
myCorpus <- tm_map(myCorpus, removeNumbers)
myCorpus <- tm_map(myCorpus, removeWords, stopwords("english")) #removes common english stopwords
myCorpus <- tm_map(myCorpus, removeWords, c("don?t","nil","none","can?t","it?","don?","can?","ceain","tre","tshi")) #You can specify words to remove
#build a term-document matrix
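myTDM = TermDocumentMatrix(myCorpus, control = list(wordLengths = c(1, Inf))) # wordLengths is the control option for minimum term length; minWordLength is not a recognised option and is ignored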
m = as.matrix(myTDM)
v = sort(rowSums(m), decreasing = TRUE)
d = data.frame(word = names(v),freq=v)
wordcloud(d$word, freq=d$freq, min.freq=1,scale=c(2,0.5), max.words=500, random.order=FALSE, rot.per=0.35,
use.r.layout=FALSE, colors=brewer.pal(8, "Dark2"))
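Removing the token "rt" needs a word-boundary pattern. Without one, the two letters are deleted wherever they occur, which is consistent with fragments such as "ceain" and "tshi" in the custom stop-word list above. A short sketch on made-up words:

```r
words <- c("certain", "tshirt", "rt", "art gallery rt")

# Unanchored pattern: removes "rt" inside ordinary words as well
unanchored <- gsub("rt", "", words)

# \\b anchors the match to word edges, so only the standalone token is removed
anchored <- gsub("\\brt\\b", "", words)

print(data.frame(words, unanchored, anchored))
```

With the anchored pattern, "certain" and "tshirt" survive intact, so no extra stop words are needed to clean up the fragments.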
Desired changes of the gardens
change <- botanic_surveys$`If you could change or add something to the park, what would that be?` #open-response variable to a new variable for text transformations
change <- gsub("[^[:graph:]]", " ", change) #get rid of non graphical characters
change <- gsub("[[:punct:]]", "", change)# Remove punctuation
change <- gsub("\\brt\\b", "", change)# Remove the standalone token "rt"; an unanchored "rt" pattern would also delete those letters inside words
change <- gsub("\\s+"," ",change)
change <- gsub("[ |\t]{3,}", "", change)# Remove tabs
change <- gsub("^ ", "", change)# Remove blank spaces at the beginning
change <- gsub(" $", "", change)# Remove blank spaces at the end
change <- tolower(change)#convert all text to lower case
## Warning in change == c("no", "nil", "none"): longer object length is not a
## multiple of shorter object length
## 123 6
myCorpus <- Corpus(VectorSource(change))
myCorpus <- tm_map(myCorpus, removeNumbers)
myCorpus <- tm_map(myCorpus, removeWords, stopwords("english")) #removes common english stopwords
myCorpus <- tm_map(myCorpus, removeWords, c("don?t","nil","none","can?t","it?")) #You can specify words to remove
#build a term-document matrix
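myTDM = TermDocumentMatrix(myCorpus, control = list(wordLengths = c(1, Inf))) # wordLengths is the control option for minimum term length; minWordLength is not a recognised option and is ignored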
m = as.matrix(myTDM)
v = sort(rowSums(m), decreasing = TRUE)
d = data.frame(word = names(v),freq=v)
wordcloud(d$word, freq=d$freq, min.freq=1,scale=c(3,0.5), max.words=500, random.order=FALSE, rot.per=0.35,
use.r.layout=FALSE, colors=brewer.pal(8, "Dark2"))
3.10.6 Likert rating frequency plot
# packages needed for the nice likert plot
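likert_data <- as.data.frame(botanic_surveys[,10:20])
# response.frequencies() from the psych package is assumed here: it accepts uniqueitems and returns the 1 to 5 and miss columns used below
likert_data_proportions <- as.data.frame(psych::response.frequencies(likert_data, uniqueitems = 1:5))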
likert_data_proportions <- likert_data_proportions %>%
mutate(Question=c("Layout of botanic gardens","Wayfinding within botanic gardens","Accessibility of botanic gardens","Satisfaction of flora and fauna", "Variety of food","Affordability of food","Perception of crowdedness","Availability of seating areas","Availability of sheltered areas","Range of activities","Perception of safety"))
likert_data_proportions <- likert_data_proportions[c("Question","1","2","3","4","5","miss")]
colnames(likert_data_proportions)<-c("Question","Strongly Disagree","Disagree","Neutral","Agree","Strongly Agree","Unanswered")
# Append missing levels
#likert_data_proportions %>% group_by(Question) %>% mutate(value = value / sum(value)) %>%
#ggplot(aes(x = Question, y = ifelse(ind %in% 1:2, -value, value), fill = ind)) +
#geom_col() +
mytitle<-"Perception of Botanic Gardens"
mylevels<-c("Strongly Disagree","Disagree","Neutral","Agree","Strongly Agree")
tab <- likert_data_proportions[-7]
pal<-c(pal[1:(ceiling(numlevels/2)-1)], pal[ceiling(numlevels/2)],
pal[ceiling(numlevels/2)], pal[(ceiling(numlevels/2)+1):(numlevels-1)])
tab3$Aspect<-str_wrap(tab3$Aspect, width = 40)
tab3$Aspect<-factor(tab3$Aspect, levels = tab2$Aspect[order(-(tab2[,5]+tab2[,6]+tab2[,7]))])
lows <- lows[rev(rownames(lows)),]
lows$col <- factor(lows$col, levels = c("#CA0020","#F4A582", "#DFDFDF"))
ggplot() + geom_bar(data=highs, aes(x = Aspect, y=value, fill=col), position="stack", stat="identity") +
geom_bar(data=lows, aes(x = Aspect, y=-value, fill=col), position="stack", stat="identity") +
geom_hline(yintercept = 0, color =c("white")) +
scale_fill_identity("Percent", labels = mylevels, breaks=legend.pal, guide="legend") +
theme_fivethirtyeight() +
coord_flip() +
labs(title=mytitle, y="",x="") +
theme(plot.title = element_text(size=14, hjust=0.5)) +
theme(axis.text.y = element_text(hjust=0)) +
theme(legend.position = "bottom") +
scale_y_continuous(breaks=seq(mymin,mymax,25), limits=c(mymin,mymax))
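A diverging Likert chart of this kind splits the neutral category in half, so that disagreement plus half of the neutral answers is drawn to the left of zero and agreement plus the other half to the right. The sketch below works through that arithmetic on made-up percentages; the data frame `props` and its values are hypothetical, and `mymin` and `mymax` are computed here only to show how axis limits in steps of 25 could be derived.

```r
# Hypothetical response percentages for two questions (each row sums to 100)
props <- data.frame(
  Question = c("Layout", "Food variety"),
  `Strongly Disagree` = c(5, 20),
  Disagree = c(10, 30),
  Neutral = c(20, 30),
  Agree = c(40, 15),
  `Strongly Agree` = c(25, 5),
  check.names = FALSE
)

# Half of the neutral answers goes to each side of zero
half_neutral <- props$Neutral / 2
left <- props$`Strongly Disagree` + props$Disagree + half_neutral
right <- props$Agree + props$`Strongly Agree` + half_neutral

# Axis limits rounded outwards to the next multiple of 25
mymin <- -25 * ceiling(max(left) / 25)
mymax <- 25 * ceiling(max(right) / 25)

print(data.frame(Question = props$Question, left, right))
```

For "Layout" this gives 25 to the left and 75 to the right, and for "Food variety" 65 and 35, so limits of -75 and 75 would contain both bars.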
3.10.7 Likert analysis for age and gender differences
3.10.8 Correlation analysis of attributes
#correlation matrix of factors
botanic_surveys_shorter <- botanic_surveys[,10:20]
colnames(botanic_surveys_shorter) = c("layout","wayfinding","accessibility","flora & fauna","food variety","food affordability","crowd level","seating areas","sheltered areas","activities range","safety")
# 1. Compute correlation
cormat <- round(cor(botanic_surveys_shorter),2)
# 2. Reorder the correlation matrix by
# Hierarchical clustering
hc <- hclust(as.dist(1-cormat)/2)
cormat.ord <- cormat[hc$order, hc$order]
# 3. Get the upper triangle
cormat.ord[lower.tri(cormat.ord)]<- NA
# 4. Melt the correlation matrix
melted_cormat <- melt(cormat.ord, na.rm = TRUE)
# Create the heatmap
ggplot(melted_cormat, aes(Var2, Var1, fill = value))+
geom_tile(color = "white")+
scale_fill_gradient2(low = "blue", high = "red", mid = "white",
midpoint = 0, limit = c(-1,1), space = "Lab",
name="Pearson\nCorrelation") + # Change gradient color
theme_fivethirtyeight()+ # fivethirtyeight theme from ggthemes
theme(axis.text.x = element_text(angle = 45, vjust = 1,
size = 12, hjust = 1))
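Reordering the correlation matrix by hierarchical clustering places strongly correlated attributes next to each other in the heatmap. A minimal sketch of the reordering step on a hand-written 3 by 3 correlation matrix (the values are made up for illustration):

```r
items <- c("layout", "crowd level", "wayfinding")

# Hypothetical correlations: layout and wayfinding are strongly related
cormat <- matrix(
  c(1.0, 0.1, 0.8,
    0.1, 1.0, 0.2,
    0.8, 0.2, 1.0),
  nrow = 3, byrow = TRUE, dimnames = list(items, items)
)

# Convert correlation to a distance: 0 when r = 1, 1 when r = -1
dd <- as.dist((1 - cormat) / 2)

# Cluster the attributes and reorder rows and columns by the dendrogram
hc <- hclust(dd)
cormat_ord <- cormat[hc$order, hc$order]

print(cormat_ord)
```

After reordering, "layout" and "wayfinding" sit in adjacent rows and columns, which is what makes blocks of related attributes visible in the heatmap.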
3.11 Reflections on survey analysis
Comparison of methods for survey data cleaning
Data cleaning | Bad practice | Good practice |
Working with long column titles containing spaces | Leaving the titles as they are and wrapping them in backticks (`) throughout | Using clean_names() from the janitor package to replace spaces with underscores, and rename() to shorten the column names |
Dealing with multiple-response questions | Creating a separate table from each multiple-response column, which breaks the link to the other columns in the original dataset | Using group_by(), count() and spread() to split the column and append the resulting columns to the original dataset |