Sooner or later, every film fan who engages in discussion of movies will encounter both of the following disparaging remarks:
- Those snooty Academy Awards! Why do they always nominate obscure art films? In 20 years, nobody will remember them. When the films of today have stood the test of time, they’ll have been forgotten. The films people will remember, see, and cherish are the ones that dominate the box office!
- The public is stupid! Every year they turn the most mindless drivel into cash cows. But the fame these movies have will not endure. When the films of today have stood the test of time, they’ll have been forgotten. The films people will remember, see, and cherish are the ones that win Academy Awards!
These claims are obviously at odds with each other. They can both be false, but they cannot both be true, except in cases where the movies that win at the box office are also the ones garnering awards attention. In all probability, both statements are partially true: that is, some of the old movies we remember were box office hits and some were award winners. And some were both and some neither, but never mind. What I am interested in exploring is which of these two statements is more true. That is, if we are to use either the box office or the Academy Awards to predict which of today’s movies will be seen and loved in 20 or more years, which would yield the more accurate forecast?
Technical details follow; skip to the pretty graphs at the bottom if you so desire.
It’s important to understand that I have just asked two different questions:
- Which movies will be seen in 20+ years?
- Which movies will be loved in 20+ years?
Unfortunately, there is no good objective way to measure how much a 20+ year old movie has been seen or loved. You can look at DVD sales and rental figures, ratings of television airings, and so on, but this is a lot of data and will still produce a potentially error-laden result.
For this research, I have used the IMDb ratings as my metric for both questions. The number of users who have submitted an IMDb rating for a film is my metric for determining how much a movie is seen. The average rating is my metric for determining how much a movie is loved. I recognize that this is absolutely an imperfect scheme. The biggest problem is selection bias: the people who submit ratings to the IMDb are far from a demographically representative sample of the country as a whole.
But there are four things that make this metric attractive:
- The data requires little to no interpretation. An IMDb rating states how much someone liked the movie, and its existence indicates that the movie has been seen. Other metrics require a degree of extrapolation: revenue figures must take into account average cost; television ratings do not take into account the average number of people watching one set; and so on.
- The data is uniform across all movies. DVD sales wouldn’t take into account things like how much marketing the release got, how much restoration work was done, how many extras there are, and how many prior home video releases there were. Television ratings wouldn’t take into account how often a movie gets aired generally, when the airings were, on what channel, etc.
- The IMDb has built up an impressively large collection of ratings, so errors due to sample size will be minimal.
- It was simple, cheap, and easy to obtain the data.
Range of Study
For this study, I decided to go back to 1944. Why? 1944 was the first year when the Academy Awards standardized on five Best Picture nominees. Considering the set of Oscar nominees prior to 1944 would mean considering anywhere from 3 to 10 nominees per year. It might not have made a difference, but I felt it best to avoid the possibility of this skewing the data.
On the other end of the spectrum, I only look at movies up through 1999. Why? As one gets closer to the present day, it becomes harder to know if a movie has truly “stood the test of time.” 1999 is still probably too late. Furthermore, the IMDb’s collection of ratings dates back to the mid-1990s, so it’s possible for a mid-1990s film to have a lot of ratings and yet be relatively unseen today. My data for the 1990s should be considered with this caveat in mind.
The next question: Within the years of 1944-1999, what have I studied? I considered (1) Oscar’s 5 Best Picture nominees for the year, and (2) the Top 5 domestic box office grossers for the year. I took the average IMDb rating for each group and the average number of votes for each group and compared these numbers.
Of course, there is overlap between these two groups of films. Each group contains 280 films (5 per year over 56 years), and 104 of them (37% of 280) appear in both groups.
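The comparison described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the actual procedure or data used for this study: the film records are invented placeholders, the function names are hypothetical, and grouping by decade is an assumption based on how the results are discussed further down.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (title, year, IMDb rating, number of IMDb votes).
# All values are invented placeholders, not data from this study.
best_picture_nominees = [
    ("Film A", 1944, 8.0, 12000),
    ("Film B", 1947, 7.0, 4000),
    ("Film C", 1985, 7.5, 30000),
]
top_grossers = [
    ("Film A", 1944, 8.0, 12000),
    ("Film D", 1946, 6.0, 2000),
    ("Film E", 1985, 6.5, 90000),
]

def decade_averages(films):
    """Return {decade: (average rating, average vote count)}."""
    by_decade = defaultdict(list)
    for _title, year, rating, votes in films:
        by_decade[year // 10 * 10].append((rating, votes))
    return {
        decade: (mean(r for r, _ in rows), mean(v for _, v in rows))
        for decade, rows in sorted(by_decade.items())
    }

def overlap_share(group_a, group_b):
    """Fraction of group_a's films that also appear in group_b."""
    films_a = {(title, year) for title, year, *_ in group_a}
    films_b = {(title, year) for title, year, *_ in group_b}
    return len(films_a & films_b) / len(films_a)

print(decade_averages(best_picture_nominees))
print(decade_averages(top_grossers))
print(overlap_share(best_picture_nominees, top_grossers))
```

The same overlap calculation applied to the real counts is simply `104 / 280`, which rounds to 37%.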
The graph below shows the average IMDb ratings for the Best Picture nominees and the top grossers. As the graph shows, there is a clear winner: being a Best Picture nominee is an unequivocally better predictor than box office performance of how well a movie will be loved in time.
On the other hand, the second graph, which charts the number of IMDb ratings these same films have, paints a less clear picture. It seems that being an Oscar nominee and being a top 5 domestic grosser are roughly equally predictive of how much a movie is seen in time.
There are two outliers. One is the 1980s, where there is a severe discrepancy between the two. The 1980s films widely seen today are more often the box office hits of the day than the Best Picture nominees. But this gap closed (and then some) by the 1990s.
The other outlier is the 1940s, which isn’t as obvious on the graph but actually represents a larger percentage difference than the 1980s does. Here, it is the Best Picture nominees, rather than the box office winners, that are more widely seen today.
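To see why a gap can look small on a graph yet be large in percentage terms, here is a minimal sketch. The vote counts are invented for illustration only, `percent_gap` is a hypothetical helper, and measuring the gap relative to the smaller of the two values is my assumption about what "percentage difference" means here.

```python
def percent_gap(a, b):
    """Gap between two positive values, as a percentage of the smaller one."""
    low, high = sorted((a, b))
    return (high - low) / low * 100

# Hypothetical average vote counts per decade: (nominees, top grossers).
# These numbers are invented and are not the study's data.
avg_votes = {
    "1940s": (3000, 1500),
    "1980s": (40000, 60000),
}

for decade, (nominees, grossers) in avg_votes.items():
    absolute_gap = abs(nominees - grossers)
    print(decade, absolute_gap, percent_gap(nominees, grossers))
```

With these made-up figures, the 1980s gap is far larger in absolute votes, which is what a graph shows, while the 1940s gap is larger relative to the size of the numbers involved.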
It’s important to recognize that there may be a level of causation, not just correlation, in these results. A particular film may be widely seen in later years because it was an Oscar nominee or a box office hit, and not just because it inherently continues to interest audiences. Likewise, a film may be loved because its awards or box office success predisposes audiences to admire it.
I regard this issue of causation vs. correlation as irrelevant to the question at hand, which is not why a movie might be seen or loved but whether it is. Nonetheless, it’s a matter worth keeping in mind.
Some of the data points are interesting to consider. A great many animated Disney films, such as Cinderella and Lady and the Tramp, appear in the box office group, but only Beauty and the Beast was also a Best Picture nominee. These films remain widely seen and widely loved.
In the 1980s, some conspicuously campy titles start to appear at the top of the box office, such as Porky’s and Rambo: First Blood, Part II. In some cases they remain widely seen, but they are not even very well liked, let alone loved.
Not surprisingly, the further back one goes, and especially once one hits the 1940s, the more titles there are amongst the box office hits that are utterly unrecognized today.
Here are the films included in this study, including the IMDb rating data as it was when I captured it (on June 25, 2010).