Metacritic and the Legitimacy of Video Game Journalism, Part II

In the last post I wrote about my perception that video game journalism is little more than advertising in disguise. In this post, I’ll discuss numbers that may support my opinion. I use data gathered by the website Metacritic. For those who are not familiar with it: Metacritic collects “expert” reviews of various entertainment media, including video games, normalizes their scores, and posts an average. Users are also able to rate games, which then leads to the games in their database having both an expert and a user average.

One objection you could now voice is that user opinion may be unreliable, because reviews might primarily come from people who either dislike a game and give it bad scores, or who love a game and want to see it highly recommended. Sure, both camps may have their agenda. However, there are PR guys who pose as users and try to inflate scores, as happened with the recent Star Trek game. I don’t think any company would pay a PR agency to plant negative reviews, so while you might question the legitimacy of user scores, it’s probably better to view them as slightly inflated due to corporate meddling. Therefore, “true” user scores for big-budget games would probably be slightly lower than they are. I do agree that the metrics are hardly reliable, and that it’s questionable whether you could objectively grade a game on a ten-point scale anyway. People have different tastes, and some people are more forgiving of gameplay flaws than others. Still, as a voice of “the people”, it may be interesting to see how user scores differ from those of the alleged experts.

I’ve conducted my analysis with data that was current as of August 2, 2013. Games that were released since then are not considered. Further, there may have been slight changes in some of the user scores, simply due to ratings that were submitted to Metacritic in the meantime. I only focus on Xbox 360 games, but I don’t think there would be a fundamental difference when analyzing ratings for PlayStation 3 or PC games.

Metacritic holds records for 1505 Xbox 360 games. I started with some rather basic analysis. My aim was to uncover discrepancies between users and “experts”. The first few examples may not be overly interesting, but there is a climax towards the end, so please keep reading.

Here’s a top-level overview of the data:

  • user rating < expert rating: 760 games (50.5 %)
  • user rating > expert rating: 683 games (45.4 %)
  • user rating = expert rating: 63 games (4.1 %)

This doesn’t look like much of a discrepancy, though, so I adjusted my parameters to check how the picture changes if I define “equal” to be a difference of +/- 5 %:

  • user rating < expert rating: 494 games (32.8 %)
  • user rating > expert rating: 443 games (29.4 %)
  • user rating = expert rating: 568 games (37.8 %)
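The script behind these counts is not shown here, so the following is only a minimal sketch of how such a comparison could be done. It assumes that both scores are on the same 0 to 100 scale and that “+/- 5 %” means a gap of at most 5 points; the names `classify`, `tally` and `SAMPLE` are made up for illustration, and the three sample rows are scores that appear in the lists in this post.

```python
from collections import Counter

# (expert, user, title) rows; these three are taken from the lists in this post.
SAMPLE = [
    (93, 55, "Mass Effect 3"),
    (91, 87, "Mass Effect"),
    (48, 81, "Otomedius Excellent"),
]


def classify(expert, user, tolerance=0):
    """Label the user score as 'lower', 'higher' or 'equal' relative to the expert score."""
    diff = user - expert
    # Assumption: the tolerance is measured in points on a 0-100 scale.
    if abs(diff) <= tolerance:
        return "equal"
    return "lower" if diff < 0 else "higher"


def tally(games, tolerance=0):
    """Count how many games fall into each of the three buckets."""
    return Counter(classify(expert, user, tolerance) for expert, user, _ in games)


if __name__ == "__main__":
    print(tally(SAMPLE))               # strict comparison
    print(tally(SAMPLE, tolerance=5))  # scores within 5 points count as equal
```

With a tolerance of 5, Mass Effect (91 versus 87) moves from the “lower” bucket into the “equal” bucket, which is the same shift that shrinks the first two groups in the overview above.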

Since this wasn’t particularly revealing either, I then moved on to have a look at the average user and “expert” ratings, considering all 1505 games:

  • expert average: 69.0
  • user average: 68.3

The difference looks minuscule. However, the video game industry is focussing on “AAA games”, i.e. games that cost a lot of money to make, and whose advertising budget is even higher than their astronomical production costs. Since I had the impression that it is primarily this category of games that receives inflated ratings, I changed my script to only consider games that were rated with at least 90 by the “experts”. This applies to just 33 games. Suddenly, the picture gets much more interesting:

  • expert average: 93.3
  • user average: 81.1

I’ll let you draw your conclusions about that when you consider that as a whole the difference between user and expert ratings is 0.7. It seems that as soon as big advertising money comes into play, the “experts” identify greatness where users see just another okay game. A difference of 12.2 points is quite staggering.
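For readers who want to try this kind of filtering themselves, here is a rough sketch of averaging both scores over the games at or above an expert threshold. It is not the original script, and `averages`, `min_expert` and `SAMPLE` are illustrative names; the four sample rows use scores quoted elsewhere in this post, so the output only describes that tiny sample and not the full data set.

```python
from statistics import mean

# (expert, user, title) rows; the scores are quoted elsewhere in this post.
SAMPLE = [
    (98, 79, "Grand Theft Auto IV"),
    (93, 55, "Mass Effect 3"),
    (91, 87, "Mass Effect"),
    (74, 81, "Binary Domain"),
]


def averages(games, min_expert=0):
    """Return (count, expert average, user average) for games rated >= min_expert."""
    subset = [(expert, user) for expert, user, _ in games if expert >= min_expert]
    if not subset:
        # Nothing passed the filter, so there is no average to report.
        return 0, None, None
    expert_avg = mean(expert for expert, _ in subset)
    user_avg = mean(user for _, user in subset)
    return len(subset), expert_avg, user_avg


if __name__ == "__main__":
    for threshold in (0, 85, 90):
        count, expert_avg, user_avg = averages(SAMPLE, min_expert=threshold)
        print(f"expert >= {threshold}: {count} games, "
              f"expert {expert_avg:.1f}, user {user_avg:.1f}")
```

Changing only `min_expert` keeps the rest of the calculation identical, which is what makes the averages at different thresholds comparable.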

If I run the same script but limit the scope so that it only counts games that were rated with 85 or more by the “experts”, a total of 137 games, the discrepancy is still quite startling:

  • expert average: 89.3
  • user average: 79.5

A difference of about 10 points is not to be ignored either, but not quite as damning as the previous subset of games. Let’s now have a look at those fabulous 33 games that the “experts” thought were marvels of digital entertainment. The columns are to be read as follows:

  • column 1: user rating minus expert rating
  • column 2: expert rating
  • column 3: user rating
  • column 4: game title

To make it perfectly clear what the data means, look at this line: “-38 93 55 Mass Effect 3”. This means that the users gave the game a 55, the experts a 93, and that the user score is 38 lower than the expert score.
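If you want to work with rows in this format yourself, a small parser is enough. The sketch below is an illustration rather than part of the original analysis, and `parse_row` is a made-up name. It splits a row into the four columns and checks that column 1 really equals the user rating minus the expert rating.

```python
def parse_row(line):
    """Split a row such as '-38 93 55 Mass Effect 3' into its four columns."""
    # Only split on the first three gaps, so titles containing spaces or
    # digits ("Mass Effect 3") stay in one piece.
    diff, expert, user, title = line.strip().split(maxsplit=3)
    row = (int(diff), int(expert), int(user), title)
    # Column 1 should always equal user rating minus expert rating.
    if row[0] != row[2] - row[1]:
        raise ValueError(f"inconsistent row: {line!r}")
    return row


if __name__ == "__main__":
    print(parse_row("-38 93 55 Mass Effect 3"))
```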

Here is the entire list:

-38 93 55 Mass Effect 3
-33 94 61 Call of Duty: Modern Warfare 2
-20 93 73 Street Fighter IV
-19 98 79 Grand Theft Auto IV
-18 94 76 Halo 3
-17 93 76 Gears of War 2
-15 91 76 Gears of War 3
-15 91 76 Super Street Fighter IV
-14 91 77 Halo: Reach
-13 92 79 Forza Motorsport 3
-12 93 81 Pac-Man Championship Edition DX
-12 93 81 Rock Band 3
-12 96 84 The Elder Scrolls V: Skyrim
-11 91 80 Forza Motorsport 4
-11 92 81 Guitar Hero II
-11 92 81 Rock Band
-11 94 83 Gears of War
-11 95 84 Portal 2
-10 94 84 Call of Duty 4: Modern Warfare
-9 92 83 Rock Band 2
-9 94 85 Batman: Arkham City
-8 92 84 The Walking Dead: A Telltale Games Series
-8 93 85 BioShock Infinite
-8 93 85 Fallout 3
-8 96 88 BioShock
-7 93 86 Braid
-7 94 87 The Elder Scrolls IV: Oblivion
-7 96 89 Mass Effect 2
-7 96 89 The Orange Box
-6 91 85 Far Cry 3
-6 95 89 Red Dead Redemption
-5 92 87 Batman: Arkham Asylum
-4 91 87 Mass Effect

It turned out that across the board users were less impressed than professional reviewers, and oftentimes dramatically less. Maybe you have played some of those games and asked yourself why they don’t live up to their glowing reviews. For instance, I was thoroughly unimpressed by Halo 3, and Grand Theft Auto IV I view as a game with many issues, but with plenty of fun moments nonetheless. The “experts” give the game a 98, the users a 79. The user score is closer to my perception of that game’s quality. The Gears of War games have a ham-fisted plot and okay gunplay, and I don’t think they deserved the praise they got. The third part had some really great moments, though, but also much more ham. By far the best third-person shooter I played on the Xbox 360 was Binary Domain. The critics gave it a 74, the users an 81.

If you wonder whether you should trust video game reviews, then you now have some good justification why you shouldn’t. Nowadays I ignore mainstream reviews and look for the consensus in niche communities, like on message boards for people who play STGs. This is much more helpful than the writeup of some “expert reviewer” who is forcing himself to type a few hundred words about a game he may not even have finished or hasn’t even understood the controls of. My personal consequence is that I buy fewer games, and if I want to play some “AAA title”, I wait until I can pick it up for cheap to minimize the amount of money I might potentially waste. If there are more people like me, then the money hatting strategy of the big publishers might not be as effective as the MBA types who dreamt it up think. The rise of indie games might suggest that people are getting fed up with “AAA gaming”.

For a more uplifting conclusion that supports the view that fan feedback can be very valuable with regard to games that are under-appreciated by professional reviewers, let’s now look at 20 titles fans enjoyed but critics despised:

33 48 81 Otomedius Excellent
33 44 77 Warriors Orochi 2
31 52 83 Samurai Warriors 2
30 57 87 World of Outlaws: Sprint Cars
29 53 82 Tetris Splash
29 42 71 Venetica
27 49 76 Dynasty Warriors: Gundam 2
26 63 89 Triggerheart Exelica
26 53 79 Samurai Warriors 2 Empires
26 49 75 Puzzle Arcade
25 58 83 Vigilante 8: Arcade
25 53 78 Ecco the Dolphin
24 64 88 Project Sylpheed: Arc of Deception
24 61 85 Dynasty Warriors 5 Empires
24 60 84 Sonic Adventure 2
23 63 86 Serious Sam 3: BFE
23 62 85 WRC 3
23 54 77 Rise of Nightmares
22 53 75 Warriors Orochi
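Both lists in this post can be sketched as one sort over the gap between user and expert rating. The order of tied rows in the lists appears consistent with a plain tuple sort on (gap, expert, user, title), ascending for the first list and descending for this one, but that is an inference from the output and not a description of the original script. `rank_by_gap` and `GAMES` are illustrative names, and the sample rows are taken from the lists above.

```python
# (expert, user, title) rows taken from the lists in this post.
GAMES = [
    (44, 77, "Warriors Orochi 2"),
    (48, 81, "Otomedius Excellent"),
    (93, 55, "Mass Effect 3"),
    (52, 83, "Samurai Warriors 2"),
]


def rank_by_gap(games, reverse=False, limit=None):
    """Sort rows by user-minus-expert gap and format them like the lists above."""
    rows = sorted(
        ((user - expert, expert, user, title) for expert, user, title in games),
        reverse=reverse,  # True puts the games that users liked more first
    )
    # A limit of None keeps every row; an integer keeps only the first rows.
    return [f"{gap} {expert} {user} {title}" for gap, expert, user, title in rows[:limit]]


if __name__ == "__main__":
    for line in rank_by_gap(GAMES, reverse=True, limit=3):
        print(line)
```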

I recognize two STGs, Otomedius Excellent and Triggerheart Exelica, plenty of samurai games, and a couple of classic games, or reinterpretations of classic games. People seem to have a lot of fun with those, even though the “experts” don’t get them. As a cynic, I’m tempted to conclude from those data that it doesn’t make a lot of sense to let hats full of money go around when the game you’re about to release only appeals to a more “hardcore” audience, and its sales potential is comparatively low. On the other hand, if you are marketing Halo, then you can be a bit more generous and discredit video game journalism even more. Just look at Geoff Keighley:

[Image: Geoff Keighley, a paragon of integrity in video games journalism]

But, hey, the man’s got to eat, too! Who knows, maybe he’ll one day figure out how to play a video game despite having one hand buried in a bowl full of Doritos, and holding a Mountain Dew in the other.

Metacritic and the Legitimacy of Video Game Journalism, Part I

I recently bought an Xbox 360 since I was interested in playing a couple of recent games. Some of my buying choices were influenced by reviews on websites. Games like Red Dead Redemption were great, but others didn’t quite live up to my expectations. I’ve now had a handful of experiences where the glowing reception of the games by reviewers didn’t at all match my subjective experience. Thus, I wondered how legitimate video game journalism really is.

I also noticed that some games I tremendously enjoyed, such as Vanquish, received a rather lukewarm reception. Also, I couldn’t help but notice that gaming websites are plastered with ads, so the obvious assumption is that those sites don’t want to bite the hand that feeds them and therefore promote “AAA titles”, while they pay little attention to games that cater to a niche audience. It’s also quite obvious that many mainstream reviewers don’t understand particular genres or, well, just plain suck at playing video games.

Here is a prime example: the Destructoid review of Vanquish by Platinum Games, written by Jim Sterling. He gave the game a 5 out of 10, and every single word he wrote indicates that he just didn’t get how to play this game. Vanquish isn’t a “cover-based shooter” in the vein of Gears of War but instead puts heavy emphasis on offensive play. The game is a bit more complex than, say, Call of Duty, but you’ll get amply rewarded if you spend a few minutes learning the controls. This was seemingly too much effort for Jim Sterling, so he wrote:

Sam [the protagonist] actually needs energy to punch his opponents, and once he’s landed a single successful punch, he can’t glide away since the energy meter completely drains. Several times, I punched an enemy, failed to kill it thanks to Sam’s inability to aim his punches properly, and was killed because I could neither defend myself or swiftly escape.

The issue is that the energy meter that enables you to perform more powerful attacks depletes as you do your little tricks. However, there is a risk/reward mechanism built in. If you completely deplete the energy meter, your combat suit overheats. You then have to allow it to cool down, which makes you vulnerable to enemy attacks since you can neither defend yourself properly nor quickly evade. The core mechanic is therefore to find a rhythm for your attacks. This is not as bad as it may sound since the energy meter replenishes very quickly. I found the game mechanic to be highly satisfying, and with some practice, it’s quite easy to get into a state of flow. Frankly, I thought that Vanquish was absolutely fantastic and that it reaches a high-water mark for action games. Enthusiastic reception of this game by actual players, like on NeoGAF, seems to indicate that I’m not the only one who was very impressed by it.

Jim Sterling’s review of Vanquish may be an egregious example, but the average games journalist is hardly an expert. Especially when it comes to niche games, they don’t seem to know what they are looking at. For instance, one of my favorite genres is shooting games (STGs). No, not the Call of Duty kind, but the modern descendants of Space Invaders. While most games strive to be “entertainment”, and therefore offer at best a moderate challenge, STGs are designed for repeated playthroughs, with the goal being mastery so that you can eventually clear the entire game on just one credit. Depending on your skill level, this may take many months, and with the harder games you may never even get there because you’re just not good enough. This can be a humbling experience, but if you master a game like that, you’ll feel a sense of achievement which you just don’t get from games that hold your hand all the way through. Sure, it’s not for everyone, but those games have enthusiastic fans. Yet, the typical mainstream reviewer is quick to dismiss those games because you can just hit “start” again and “see everything in 15 minutes”. The thread Amusingly bad reviews on collects statements like that. You can only shake your head.

I’ve now mentioned some examples of games or genres that tend to get short shrift from mainstream reviewers. Now let’s look at games that receive lavish praise, and whose faults either get ignored or justified. One prime example is one of the greatest commercial successes in recent years: Grand Theft Auto IV. It sold 25 million copies, and it’s the top-rated Xbox 360 game on Metacritic. I think it’s a decent game, but it has its flaws, like a repetitive mission structure, poor driving, and clunky weapon mechanics. It’s not a bad game, but hardly the masterpiece it is claimed to be. Less than a quarter of the people who bought the game finished it. The other three quarters probably got bored or frustrated.

Another example is Resident Evil 5. It’s probably not a bad game once you get into it, but it doesn’t make it easy for you to like it. My main gripe is that your character controls like a tank. One of the very first scenes has you enter a shack. Then zombies start attacking you from two and then three sides. They first come running at you, but before they reach you, they seem to hit an invisible wall that makes them stop. From this point onward, they take turns attacking you, which results in incredibly awkward gameplay. This isn’t my idea of having fun, so I have yet to return to this game. When Resident Evil 5 came out, reviewers were defending the controls as “traditional Resident Evil gameplay”, and hordes of obnoxious gaming fanboys were eager to tell anybody who dared to criticize their favorite franchise a variation of “the controls are fine, maybe you just suck at the game.”

Seeing that some of the biggest games have quite startling flaws, I ended up wondering whether “AAA games” that are backed by multi-million advertising campaigns get much more praise than they deserve, and not because they are so great, but because “money hatting” buys good review scores. The big games normally don’t dare to be challenging, so even your average video game reviewer can play them. On the other hand, it seems that many journalists lack the knowledge and skills to appraise niche games, and are therefore quick to dismiss them. A glance at review scores on Metacritic, which contrasts user and “expert” opinions, seemed to support this hypothesis. Looking for hard facts, I then made use of my programming skills and analyzed their data. I will share the results with you in the next post.