In the last post I wrote about my perception that video game journalism is little more than advertising in disguise. In this post, I’ll discuss numbers that may support my opinion. I use data gathered by the website Metacritic. For those who are not familiar with it: Metacritic collects “expert” reviews of various entertainment media, including video games, normalizes their scores, and posts an average. Users can rate games as well, so every game in their database ends up with both an expert average and a user average.
One objection you could voice is that user opinion may be unreliable because the ratings might come primarily from people who either dislike a game and want to drag its score down, or who love a game and like to see it collect glowing recommendations. Sure, both camps may have their agenda. However, there are also PR people who pose as users and try to inflate scores, as happened with the recent Star Trek game. I don’t think any company would pay a PR agency to plant negative reviews, so while you might question the legitimacy of user scores, it’s probably better to view them as slightly inflated due to corporate meddling. Therefore, “true” user scores for big-budget games would probably be slightly lower than they are. I do agree that the metrics are hardly reliable, and that it’s questionable whether you could objectively grade a game on a ten-point scale anyway. People have different tastes, and some people are more forgiving of gameplay flaws than others. Still, as a voice of “the people”, it may be interesting to see how their scores differ from those of the alleged experts.
I conducted my analysis with data that was current as of August 2, 2013. Games released since then are not considered. Further, there may have been slight changes in some of the user scores, simply due to ratings that were submitted to Metacritic in the meantime. I only look at Xbox 360 games, but I don’t think there would be a fundamental difference when analyzing ratings for PlayStation 3 or PC games.
Metacritic holds records for 1505 Xbox 360 games. I started with some rather basic analysis, with the aim of uncovering discrepancies between users and “experts”. The first few examples may not be overly interesting, but there is a climax towards the end, so please keep reading.
Here’s a top-level overview of the data:
- user rating < expert rating: 760 games (50.5 %)
- user rating > expert rating: 683 games (45.4 %)
- user rating = expert rating: 63 games (4.1 %)
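If you’re curious how numbers like these come about, here’s a simplified sketch of the kind of counting involved. The file name, the column layout (title, expert, user, both already on a 0–100 scale), and the helper names are placeholders of my own, not the actual data format or script:

```python
import csv

def load_scores(path="xbox360_scores.csv"):
    """Read the scraped data. Assumed columns: title, expert, user,
    with both scores already normalized to a 0-100 scale."""
    with open(path, newline="") as f:
        return [(row["title"], float(row["expert"]), float(row["user"]))
                for row in csv.DictReader(f)]

def count_comparisons(games, tolerance=0):
    """Count how many games users rate lower than, higher than, or
    (within the given tolerance) equal to the experts."""
    lower = higher = equal = 0
    for _, expert, user in games:
        if abs(user - expert) <= tolerance:
            equal += 1
        elif user < expert:
            lower += 1
        else:
            higher += 1
    return lower, higher, equal

games = load_scores()
print(count_comparisons(games))  # strict comparison, i.e. tolerance 0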
This doesn’t look like much of a discrepancy, though, so I adjusted my parameters to check how the picture changes if I define “equal” to be a difference of +/- 5 %:
- user rating < expert rating: 494 games (32.8 %)
- user rating > expert rating: 443 games (29.4 %)
- user rating = expert rating: 568 games (37.8 %)
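Reading the ±5 % as ±5 points on the 100-point scale, this second breakdown is just the same hypothetical helper with a different tolerance:

```python
print(count_comparisons(games, tolerance=5))  # "equal" now means within 5 points
```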
Since this wasn’t particularly revealing either, I then moved on to have a look at the average user and “expert” ratings, considering all 1505 games:
- expert average: 69.0
- user average: 68.3
The difference looks minuscule. However, the video game industry is focusing on “AAA games”, i.e. games that cost a lot of money to make, and whose advertising budget is even higher than their astronomical production costs. Since I had the impression that it is primarily this category of games that receives inflated ratings, I changed my script to only consider games that were rated 90 or higher by the “experts”. This applies to just 33 games. Suddenly, the picture gets much more interesting:
- expert average: 93.3
- user average: 81.1
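In the same sketch, these subset averages boil down to an optional threshold on the expert score; the subset_averages helper and its min_expert parameter are again placeholders rather than the real script:

```python
def subset_averages(games, min_expert=None):
    """Average expert and user scores, optionally restricted to games
    whose expert score reaches a minimum threshold."""
    subset = [(e, u) for _, e, u in games
              if min_expert is None or e >= min_expert]
    n = len(subset)
    expert_avg = sum(e for e, _ in subset) / n
    user_avg = sum(u for _, u in subset) / n
    return n, round(expert_avg, 1), round(user_avg, 1)

print(subset_averages(games))                 # all 1505 games
print(subset_averages(games, min_expert=90))  # the 33 top-rated games
```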
I’ll let you draw your own conclusions, keeping in mind that across all 1505 games the difference between the user and expert averages is a mere 0.7 points. It seems that as soon as big advertising money comes into play, the “experts” identify greatness where users see just another okay game. A difference of 12.2 points is quite staggering.
If I run the same script but limit the scope to games that were rated 85 or higher by the “experts”, a total of 137 games, the discrepancy is still quite startling:
- expert average: 89.3
- user average: 79.5
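In the sketch from above, the 85-plus subset is the same hypothetical call with a lower threshold:

```python
print(subset_averages(games, min_expert=85))  # 137 games in this subset
```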
A difference of about 10 points is not to be ignored either, though it is not quite as damning as in the previous subset. Let’s now have a look at those fabulous 33 games that the “experts” thought were marvels of digital entertainment. The columns are to be read as follows:
- column 1: user rating minus expert rating
- column 2: expert rating
- column 3: user rating
- column 4: game title
To make it perfectly clear what the data means, look at this line: “-38 93 55 Mass Effect 3”. This means that the users gave the game a 55, the experts a 93, and that the user score is thus 38 points lower than the expert score.
Here is the entire list:
-38 93 55 Mass Effect 3
-33 94 61 Call of Duty: Modern Warfare 2
-20 93 73 Street Fighter IV
-19 98 79 Grand Theft Auto IV
-18 94 76 Halo 3
-17 93 76 Gears of War 2
-15 91 76 Gears of War 3
-15 91 76 Super Street Fighter IV
-14 91 77 Halo: Reach
-13 92 79 Forza Motorsport 3
-12 93 81 Pac-Man Championship Edition DX
-12 93 81 Rock Band 3
-12 96 84 The Elder Scrolls V: Skyrim
-11 91 80 Forza Motorsport 4
-11 92 81 Guitar Hero II
-11 92 81 Rock Band
-11 94 83 Gears of War
-11 95 84 Portal 2
-10 94 84 Call of Duty 4: Modern Warfare
-9 92 83 Rock Band 2
-9 94 85 Batman: Arkham City
-8 92 84 The Walking Dead: A Telltale Games Series
-8 93 85 BioShock Infinite
-8 93 85 Fallout 3
-8 96 88 BioShock
-7 93 86 Braid
-7 94 87 The Elder Scrolls IV: Oblivion
-7 96 89 Mass Effect 2
-7 96 89 The Orange Box
-6 91 85 Far Cry 3
-6 95 89 Red Dead Redemption
-5 92 87 Batman: Arkham Asylum
-4 91 87 Mass Effect
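For completeness, a ranking like this one boils down to filtering on the expert score and sorting on the difference between user and expert score. Here is the same kind of simplified sketch, built on the placeholder data from above, not the actual script:

```python
def rank_by_difference(games, min_expert=None):
    """Sort games by (user - expert), most negative difference first,
    optionally restricted to a minimum expert score."""
    subset = [(u - e, e, u, title) for title, e, u in games
              if min_expert is None or e >= min_expert]
    return sorted(subset)

# the 90-plus games, from the largest negative gap to the smallest
for diff, expert, user, title in rank_by_difference(games, min_expert=90):
    print(f"{diff:+.0f} {expert:.0f} {user:.0f} {title}")
```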
It turned out that across the board users were less impressed than professional reviewers, and oftentimes dramatically so. Maybe you have played some of those games and asked yourself why they don’t live up to their glowing reviews. For instance, I was thoroughly unimpressed by Halo 3, and I view Grand Theft Auto IV as a game with many issues, but with plenty of fun moments nonetheless. The “experts” gave the game a 98, the users a 79. The user score is closer to my perception of that game’s quality. The Gears of War games have a ham-fisted plot and okay gunplay, and I don’t think they deserved the praise they got. The third installment had some really great moments, though, but also even more ham. By far the best third-person shooter I played on the Xbox 360 was Binary Domain. The critics gave it a 74, the users an 81.
If you wonder whether you should trust video game reviews, then you now have some good justification why you shouldn’t. Nowadays I ignore mainstream reviews and look for the consensus in niche communities, like message boards for people who play STGs. This is much more helpful than the write-up of some “expert reviewer” who is forcing himself to type a few hundred words about a game he may not even have finished, or whose controls he may not even have understood. The consequence for me personally is that I buy fewer games, and if I want to play some “AAA title”, I wait until I can pick it up for cheap to minimize the amount of money I might potentially waste. If there are more people like me, then the money-hatting strategy of the big publishers might not be as effective as the MBA types who dreamt it up think. The rise of indie games might suggest that people are getting fed up with “AAA gaming”.
For a more uplifting conclusion, one that supports the view that fan feedback can be very valuable with regard to games that are under-appreciated by professional reviewers, let’s now look at 20 titles fans enjoyed but critics despised:
33 48 81 Otomedius Excellent
33 44 77 Warriors Orochi 2
31 52 83 Samurai Warriors 2
30 57 87 World of Outlaws: Sprint Cars
29 53 82 Tetris Splash
29 42 71 Venetica
27 49 76 Dynasty Warriors: Gundam 2
26 63 89 Triggerheart Exelica
26 53 79 Samurai Warriors 2 Empires
26 49 75 Puzzle Arcade
25 58 83 Vigilante 8: Arcade
25 53 78 Ecco the Dolphin
24 64 88 Project Sylpheed: Arc of Deception
24 61 85 Dynasty Warriors 5 Empires
24 60 84 Sonic Adventure 2
23 63 86 Serious Sam 3: BFE
23 62 85 WRC 3
23 54 77 Rise of Nightmares
22 53 75 Warriors Orochi
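In the sketch from above, this list is simply the other end of the full ranking, largest positive gap first:

```python
# the 20 games users rated most strongly above the "experts"
for diff, expert, user, title in rank_by_difference(games)[-20:][::-1]:
    print(f"{diff:+.0f} {expert:.0f} {user:.0f} {title}")
```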
I recognize two STGs, Otomedius Excellent and Triggerheart Exelica, plenty of samurai games, and a couple of classic games or reinterpretations of classic games. People seem to have a lot of fun with those, even though the “experts” don’t get them. As a cynic, I’m tempted to conclude from these data that it doesn’t make a lot of sense to let hats full of money go around when the game you’re about to release only appeals to a more “hardcore” audience and its sales potential is comparatively low. On the other hand, if you are marketing Halo, then you can be a bit more generous and discredit video game journalism even more. Just look at Geoff Keighley.
But, hey, the man’s got to eat, too! Who knows, maybe he’ll one day figure out how to play a video game despite having one hand buried in a bowl full of Doritos and a Mountain Dew in the other.