Chapter 2 Ramen Analysis

2.0.1 Data Set-up

In this analysis, we try to answer the question: “Which instant noodles are the best in the world?”

I first read in the ramen-ratings.csv data and cleaned it by removing entries whose rating is non-numeric or zero.
After cleaning, the data has 2551 ramen products with ratings in \((0, 5]\).

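ramen_data_orig <- read.csv("ramen-ratings.csv")
# Convert Stars to numeric; entries that are not numbers become NA.
stars_num <- suppressWarnings(as.numeric(as.character(ramen_data_orig$Stars)))
# Keep only rows with a numeric, non-zero rating.
keep <- !is.na(stars_num) & stars_num != 0
ramen_data <- ramen_data_orig[keep, ]
ramen_data$Stars <- stars_num[keep]
head(ramen_data)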
##   Review..   Brand                                    Variety Style
## 4      734 Indomie Mi Goreng Rasa Ayam Panggang Jumbo (Local)  Pack
## 5       45 Indomie                             Mi Goreng Sate  Pack
## 6      105 Indomie                 Special Fried Curly Noodle  Pack
## 7      608    Koka                         Spicy Black Pepper  Pack
## 8       47 Indomie           Mi Goreng Jumbo Barbecue Chicken  Pack
## 9      392  Nissin                   Yakisoba Noodles Karashi  Tray
##     Country Stars  Top.Ten
## 4 Indonesia     5       \n
## 5 Indonesia     5       \n
## 6 Indonesia     5  2012 #1
## 7 Singapore     5 2012 #10
## 8 Indonesia     5  2012 #2
## 9     Japan     5  2012 #3
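The cleaning step can be sketched on a small made-up data frame. The rows below, including the "Unrated" entry, are invented for illustration and are not taken from ramen-ratings.csv; the point is only to show how non-numeric and zero ratings drop out and how the \((0, 5]\) range can be checked afterwards.

```r
# Toy stand-in for the ratings file (hypothetical values, illustration only).
toy <- data.frame(
  Brand = c("A", "B", "C", "D"),
  Stars = c("5", "Unrated", "0", "3.75"),
  stringsAsFactors = FALSE
)

# Non-numeric entries become NA; suppressWarnings() hides the coercion warning.
stars_num <- suppressWarnings(as.numeric(toy$Stars))

# Keep rows with a numeric, strictly positive rating.
keep <- !is.na(stars_num) & stars_num > 0
toy_clean <- toy[keep, ]
toy_clean$Stars <- stars_num[keep]

# Sanity checks: number of remaining rows and the range of ratings.
nrow(toy_clean)
range(toy_clean$Stars)
```

Running the same two checks on ramen_data is a quick way to confirm the row count and rating range quoted above.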

2.0.2 General Statistics from Data

Let’s take a look at the mean rating for each country.

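# Mean rating per country.
ramen_summary <- aggregate(ramen_data$Stars, list(ramen_data$Country), mean)
# Note: despite its name, uniq_cnt holds the country label, not a count.
colnames(ramen_summary) <- c("uniq_cnt", "mean_rates")
# Round the numeric column (everything except the country label) to 2 decimals.
ramen_summary[, -1] <- round(ramen_summary[, -1], 2)
ramen_summary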
##         uniq_cnt mean_rates
## 1      Australia       3.14
## 2     Bangladesh       3.71
## 3         Brazil       4.35
## 4       Cambodia       4.20
## 5         Canada       2.49
## 6          China       3.55
## 7       Colombia       3.29
## 8          Dubai       3.58
## 9        Estonia       3.50
## 10          Fiji       3.88
## 11       Finland       3.58
## 12       Germany       3.64
## 13         Ghana       3.50
## 14       Holland       3.56
## 15     Hong Kong       3.80
## 16       Hungary       3.61
## 17         India       3.40
## 18     Indonesia       4.07
## 19         Japan       3.99
## 20      Malaysia       4.15
## 21        Mexico       3.73
## 22       Myanmar       3.95
## 23         Nepal       3.55
## 24   Netherlands       2.48
## 25       Nigeria       1.50
## 26      Pakistan       3.00
## 27   Philippines       3.33
## 28        Poland       3.62
## 29       Sarawak       4.33
## 30     Singapore       4.13
## 31   South Korea       3.82
## 32        Sweden       3.25
## 33        Taiwan       3.73
## 34      Thailand       3.38
## 35            UK       3.04
## 36 United States       3.75
## 37           USA       3.53
## 38       Vietnam       3.22

2.0.3 Select Country of Interest

I then looked for countries whose instant noodles are highly rated and whose products can be purchased easily through an online shop. The code below draws a bar chart comparing the mean rating across countries.

library(ggplot2)
plot_ramen <- ggplot(ramen_summary, aes(uniq_cnt, mean_rates)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
plot_ramen
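The selection of highly rated countries could be made explicit with a threshold on the mean rating. The sketch below uses a few rows copied from the summary table above; the cut-off min_mean and the names top_countries and plot_top are hypothetical choices for illustration, not part of the original analysis.

```r
# A few rows copied from the summary table; column names follow the chapter.
ramen_summary <- data.frame(
  uniq_cnt   = c("Brazil", "Cambodia", "Canada", "Malaysia", "Nigeria"),
  mean_rates = c(4.35, 4.20, 2.49, 4.15, 1.50),
  stringsAsFactors = FALSE
)

min_mean <- 4.0  # hypothetical cut-off for "highly rated"

# Keep countries at or above the cut-off, highest mean first.
top_countries <- ramen_summary[ramen_summary$mean_rates >= min_mean, ]
top_countries <- top_countries[order(-top_countries$mean_rates), ]
top_countries$uniq_cnt

# Optional: plot the selection with bars sorted by mean rating.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  plot_top <- ggplot(top_countries,
                     aes(reorder(uniq_cnt, -mean_rates), mean_rates)) +
    geom_bar(stat = "identity", fill = "steelblue") +
    labs(x = "Country", y = "Mean rating")
}
```

Whether a product can be bought online is not recorded in the data, so that part of the selection would still have to be done by hand.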

2.0.4 Conclusion

This is a very rough analysis that only considers ratings by country. As the deadline is approaching, I will continue with a more detailed analysis next week. This includes searching for keywords such as “chicken” and “curry”, provided the sample sizes are large enough to keep the variance low, and analyzing the packaging of instant noodle products.
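As a rough idea of what the keyword search might look like, the sketch below counts matches and averages the ratings per keyword. The six rows are copied from the head() output earlier in this chapter; the names sample_rows and keyword_stats are hypothetical, and a real run would use the full ramen_data instead.

```r
# Varieties and ratings copied from the head() output earlier in this chapter.
sample_rows <- data.frame(
  Variety = c("Mi Goreng Rasa Ayam Panggang Jumbo (Local)",
              "Mi Goreng Sate",
              "Special Fried Curly Noodle",
              "Spicy Black Pepper",
              "Mi Goreng Jumbo Barbecue Chicken",
              "Yakisoba Noodles Karashi"),
  Stars = c(5, 5, 5, 5, 5, 5),
  stringsAsFactors = FALSE
)

keywords <- c("chicken", "curry")

# For each keyword: number of matching varieties and their mean rating.
keyword_stats <- do.call(rbind, lapply(keywords, function(kw) {
  hit <- grepl(kw, sample_rows$Variety, ignore.case = TRUE)
  data.frame(
    keyword    = kw,
    n          = sum(hit),
    mean_stars = if (any(hit)) mean(sample_rows$Stars[hit]) else NA_real_
  )
}))
keyword_stats
```

Reporting n next to the mean shows directly whether a keyword has enough matching products for its average to be meaningful.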