3.5 Genres from Rom-Com to Film Noir
Films may be described as being of a type or genre. There is no uniform way of doing this, and IMDb describes each film with up to three genres. When more than one genre is listed for a film, they are listed alphabetically. In this dataset of films with more than 100 ratings there are almost 1250 different descriptions, i.e., combinations of up to three genres. The 20 most common combinations, with cumulative percentages, were:
genres | cumPerc |
---|---|
Drama | 11.9 |
Comedy | 18.8 |
Documentary | 22.8 |
Comedy,Drama | 26.7 |
Drama,Romance | 30.1 |
Comedy,Romance | 32.4 |
Comedy,Drama,Romance | 34.7 |
Horror | 36.9 |
Animation,Comedy,Family | 38.6 |
Crime,Drama | 40.0 |
Action,Crime,Drama | 41.3 |
Drama,Thriller | 42.6 |
Thriller | 43.9 |
Horror,Thriller | 45.0 |
Comedy,Short | 46.1 |
Crime,Drama,Thriller | 47.2 |
Short | 48.1 |
Drama,Short | 49.1 |
Western | 50.1 |
Documentary,Short | 51.0 |
Figure 3.11 shows the corresponding cumulative distribution.
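The way such a table of cumulative percentages can be built may be sketched in a few lines of Python. This is only an illustration on invented toy data, not the code used for the book: the list `films` and the function name `cumulative_percentages` are assumptions made for the example.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical toy data: one genre description per film, written in the
# comma-separated, alphabetically ordered style described in the text.
films = [
    "Drama", "Drama", "Comedy", "Comedy,Drama",
    "Drama", "Horror", "Comedy", "Drama,Romance",
]

def cumulative_percentages(genre_strings, top=20):
    """Return (description, cumulative %) pairs for the most common descriptions."""
    counts = Counter(genre_strings)
    total = len(genre_strings)
    ranked = counts.most_common(top)              # most frequent first
    running = accumulate(n for _, n in ranked)    # running total of counts
    return [(g, round(100 * c / total, 1)) for (g, _), c in zip(ranked, running)]

table = cumulative_percentages(films)
for genres, cum_perc in table:
    print(f"{genres} | {cum_perc} |")
```

The percentages are taken over all films, so with the real data the top 20 rows stop at about 51% rather than reaching 100%.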
There are 22 individual genre descriptors that are each mentioned over 3000 times in the dataset. This excludes films with no description, Adult films, Film Noir, and four other minor categories. Film Noir is a well-known term referring to cynical crime dramas filmed in black and white, but IMDb has not used it for any film made after 1958. The last Film Noir listed is the famous Orson Welles film “Touch of Evil”.
Another version of the dataset has been constructed with up to 3 records per film, listing the genres separately in the same column. This allows grouping by genre, but means that some films may appear in up to 3 groups. Figure 3.12 plots time series of the percentages of films for which three particular genre descriptors were used. In the silent era the descriptor Comedy was used frequently, but there were far fewer films and they were shorter. Romance had a peak when sound came in. Drama has been used the most for many years. As around two descriptors were used for each film on average, the percentages summed over all genres would come to about 200.
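The reshaping to one record per genre descriptor, and the reason the percentages can sum to more than 100, can be sketched as follows. The records, titles and function names here are hypothetical and only illustrate the idea. The percentages are calculated on the number of films in each year, not on the number of descriptors.

```python
from collections import defaultdict

# Hypothetical toy records: (title, year, genre description).
films = [
    ("A", 1925, "Comedy"),
    ("B", 1925, "Comedy,Romance"),
    ("C", 1931, "Drama,Romance"),
    ("D", 1931, "Crime,Drama,Thriller"),
]

def to_long(records):
    """One (title, year, genre) row per descriptor, so up to three rows per film."""
    return [(title, year, genre)
            for title, year, description in records
            for genre in description.split(",")]

def genre_percentages(records):
    """Percentage of each year's films for which each descriptor was used."""
    films_per_year = defaultdict(int)
    for _, year, _ in records:
        films_per_year[year] += 1
    uses = defaultdict(int)
    for _, year, genre in to_long(records):
        uses[(year, genre)] += 1
    return {key: 100 * n / films_per_year[key[0]] for key, n in uses.items()}

long_rows = to_long(films)
perc = genre_percentages(films)
total_1931 = sum(p for (year, _), p in perc.items() if year == 1931)
```

In this toy example the two films from 1931 carry five descriptors between them, so the percentages for that year sum to 250.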
Instead of calculating percentages on the total numbers of films, they could have been calculated on the total numbers of genre descriptors used. As an alternative to percentages, the absolute numbers of films for which particular descriptors were used could be plotted, and this is done in Figure 3.13 for the 22 main genres by year from 1901 to 2019. The genres have been put into groups with roughly the same maximum number of films in a year. Each group is plotted on a separate row with its own vertical scale, so that the top row has a range over ten times that of the bottom row. Otherwise the Drama genre would determine the scale and the development over time of the other genres would hardly be visible.
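One possible way of forming groups with roughly the same maximum is sketched below. The text does not say how the groups for Figure 3.13 were chosen, so the rule used here (start a new row when a genre's peak falls below half that of the row's largest genre) is purely an assumption, and the peak counts are invented.

```python
# Hypothetical peak annual counts per genre (invented, not taken from the dataset).
peaks = {
    "Drama": 5000, "Comedy": 3200, "Thriller": 1300, "Horror": 1200,
    "Romance": 700, "Western": 150, "War": 120,
}

def group_by_peak(peak_counts, ratio=2.0):
    """Group genres into rows whose peaks differ by at most a factor of `ratio`."""
    rows = []
    # Work from the largest peak down; the first genre in a row is its leader.
    for genre, peak in sorted(peak_counts.items(), key=lambda kv: -kv[1]):
        if rows and rows[-1][0][1] <= ratio * peak:
            rows[-1].append((genre, peak))   # close enough to the row leader
        else:
            rows.append([(genre, peak)])     # start a new row with its own scale
    return [[g for g, _ in row] for row in rows]

rows = group_by_peak(peaks)
for row in rows:
    print(row)
```

Each resulting row would then be drawn with its own vertical scale, so that genres with small counts are not flattened by Drama.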
Drama is the genre listed most often, whether alone or as one of two or one of three descriptors, and its number increased steadily until around the year 2000, when the increase became much more rapid. The Comedy genre followed a related pattern, but with a slower increase since 2000. Of the four genres in the second row, Action and Thriller rose earlier, and Horror even fell in the 1990s, while Documentary only took off after 2000. All four are at about the same annual level in 2019. In the third row, the peak in Romance films after sound films began stands out. A related feature can be seen in the fourth row, where both the Family and Animation genres had peaks across the 1930s and 1940s. The numbers for the genres in the fifth and final row are lower. There were more musicals made shortly after sound was introduced than at any time afterwards. War films peaked in World War II and Westerns had their last peak in the 1960s.
The increasing number of films in more recent years is probably due to the development of film industries in countries like India. Lower numbers in earlier years may reflect the restriction to films having more than 100 ratings. Comparing the old dataset referred to in §3.1 with the new one, numbers of ratings have increased. This is likely to have favoured newer films. IMDb first moved to the web in 1993 and is relatively young compared to the film industry itself. This doubtless also makes new films more likely than old ones to have been rated at all.
There are a large number of genres and combinations of genres. The accuracy or otherwise of the classifications of films may be debatable, but, if nothing else, these kinds of graphics offer much opportunity for entertaining discussion. Experts in film may be able to offer explanations for some of the features on view. As always, good graphics can stimulate more involvement with the data than tables or text alone.