2.6 Big data: The Vs
- 3 Vs go back to Laney (2001)
- Volume: e.g. in 2014, Facebook’s Hive data warehouse held 300 petabytes (300 million GB) of data in 800,000 tables (Source 2014)
- Variety: Can you think of examples?
- Velocity
- e.g. 2012: Facebook generates 4 new petabytes of data per day
- e.g. Facebook: 2.7 billion Likes and 300 million photos per day (Source 2012), i.e. around 31,250 Likes and 3,472 photos per second
- e.g. searches per second on Google (hard to estimate)
- Additional Vs
- Veracity/validity: Measurement quality/error (Monroe 2013, 5)
- Value: What is the value of such data?
- Vinculation:
- to vinculate = “bind together”; emphasizes interdependent nature of social data (Monroe 2013, 4–5)
- Q: Who has/owns the data, e.g., Facebook data?
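The per-second figures under Velocity and the unit conversion under Volume are simple arithmetic and can be reproduced in a few lines. The sketch below is only illustrative: the function names are made up for this example, the daily counts are the ones quoted above, and it assumes decimal units (1 PB = 1,000,000 GB) and a day of 86,400 seconds.

```python
# Reproduce the back-of-the-envelope numbers from the Volume and Velocity bullets.
# Assumptions: decimal (SI) units, i.e. 1 PB = 1,000,000 GB; 1 day = 86,400 s.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
GB_PER_PB = 1_000_000


def per_second(events_per_day: int) -> float:
    """Average number of events per second, given a daily total."""
    return events_per_day / SECONDS_PER_DAY


def petabytes_to_gigabytes(petabytes: float) -> float:
    """Convert petabytes to gigabytes using decimal units."""
    return petabytes * GB_PER_PB


likes_per_day = 2_700_000_000   # figure quoted for 2012
photos_per_day = 300_000_000    # figure quoted for 2012
warehouse_pb = 300              # figure quoted for 2014

print(f"Likes per second:  {per_second(likes_per_day):,.0f}")
print(f"Photos per second: {per_second(photos_per_day):,.0f}")
print(f"Warehouse size:    {petabytes_to_gigabytes(warehouse_pb):,.0f} GB")
```

These are daily averages; actual traffic is bursty, so peak rates would be higher than the per-second values printed here.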
References
Laney, Doug. 2001. “3D Data Management: Controlling Data Volume, Velocity and Variety.” META Group Research Note 6 (70): 1.
Monroe, Burt L. 2013. “The Five Vs of Big Data Political Science: Introduction to the Virtual Issue on Big Data in Political Science.” Political Analysis 21 (V5): 1–9.