
Not A Man, A Number: Metacritic Rates Devs

If it exists, it must have a number stuck to it: this is the Metacritic way. Its voracious maw of review aggregation has now expanded to include individual developers - which means actual people are now being given a personal numerical rating. This is an average number (out of 100) based on the various reviews of games they've worked on. I must confess I find this concept mildly sinister, but maybe I'd feel differently if I was able to go around telling people I was worth 91%.

So, which of the many alumni of PC gaming have been treated favourably by this new system? And who's fallen foul of it? Who is empirically proven to be the better man: Warren Spector or Ken Levine? Richard Garriott or Derek Smart? John Carmack, John Romero or American McGee? And Portal's Erik Wolpaw or Portal's Chet Faliszek? And which poor bastard was deemed to be worth just 8%?

Here's who I've looked up so far - I am, of course, expecting you to suggest further honourable gentlebeings below. You can do this yourself simply by typing dudes' and dudettes' names into the main search box on Metacritic.

  • Cevat Yerli (Crytek): 91
  • Bill Roper (Blizzard/Flagship/Cryptic): 90
  • Ken Levine (Irrational): 89
  • Chet Faliszek (Valve): 89
  • Erik Wolpaw (Valve/Double Fine): 88
  • Sid Meier (Firaxis/Microprose): 86
  • American McGee (id/EA/Spicy Horse): 86
  • Cliff Bleszinski (Epic): 86
  • Doug Church (Looking Glass/Ion Storm/EA/Valve): 84
  • Warren Spector (Origin/Looking Glass/Ion Storm/Junction Point): 82
  • Peter Molyneux (Bullfrog/Lionhead): 82
  • Chris Avellone (Black Isle/Obsidian): 81
  • Chris Taylor (Gas Powered Games/Cavedog): 80
  • Brian Reynolds (Microprose/Firaxis/Big Huge Games/Zynga): 79
  • John Carmack (id): 78
  • Ruslan Didenko (GSC Gameworld): 77
  • Julian Gollop (Mythos/Ubisoft): 76
  • John Romero (Origin/id/Ion Storm/Midway/Monkeystone/Gazillion/Loot Drop): 75
  • Richard Garriott (Origin/NCsoft/Portalarium): 66
  • Derek Smart (3000AD): 61
  • Oleksandr Khrutskyy (Deep Shadows): 61
  • Sergey Titov (Stellar Stone): 8

Phew! Much to comment on there (Garriott and Roper make for particularly interesting discussion points in and of themselves; the former's recent games have soured his mighty past achievements, while the latter's Blizzard mega-scores mask troubled times such as Hellgate and Champions), but let's use as our launching point Valve's Wolpaw and Faliszek.

The ex-Old Man Murray pair have worked together for most of their adult lives, but have different scores. Chet is officially one better than Erik. Will he gloat and chuckle and sneer about this every day? Will Erik fall into a deep, dark depression that results in him carving '88' into every tree trunk, car door and human face in Seattle? Who knows.

What we do know is that the disparity results from Wolpaw being credited on one game that Faliszek isn't - Psychonauts - while Faliszek, unlike Wolpaw, is credited on the Left 4 Dead games. The latter were generally better-reviewed than Psychonauts: hence the mystery 1%.
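Metacritic doesn't spell out its sums, but the obvious reading is a plain average of the metascores of every game a person is credited on. For the curious, here's a quick sketch of that arithmetic in Python - the credit lists and scores below are invented placeholders rather than either writer's actual Metacritic record - just to show how one differing credit is enough to open up a one-point gap.

```python
# A rough, hypothetical sketch of the arithmetic that appears to be at work:
# a straight average of the metascores of every game a person is credited on.
# All credit lists and scores below are invented placeholders, NOT the real
# Metacritic data for either writer.

from statistics import mean

shared_credits = {"Portal": 90, "Half-Life 2: Episode Two": 90}

wolpaw_extra = {"Psychonauts": 84}                         # made-up score
faliszek_extra = {"Left 4 Dead": 89, "Left 4 Dead 2": 87}  # made-up scores

def dev_rating(*credit_sets):
    """Average the metascores across every game in the given credit sets."""
    scores = [s for credits in credit_sets for s in credits.values()]
    return round(mean(scores))

print(dev_rating(shared_credits, wolpaw_extra))    # -> 88
print(dev_rating(shared_credits, faliszek_extra))  # -> 89
```

Whether Metacritic weights, filters or rounds anything more cleverly than that is anyone's guess.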

My fear is that the games industry might use this system as a factor when looking to recruit people or decide on pay rises. Metacritic numbers have already been known to affect the likes of retailer stock buys, publishers' and studios' public profiles ("we are/want to be a 90% Metacritic average company" is a refrain I've heard many times of late, especially from larger devs and publishers) and even developer bonuses. I can all too easily imagine a firm deciding it will only hire or promote staff of a certain score or higher, believing that to be the benchmark of their aptitude, and therefore an indicator of how likely their work is to bring in the high scores and, with them, the high revenues.

"Only 88%? Not good enough, Mr Wolpaw. We're after an 89%er at the very least." "If you're not worth 91% by Christmas you won't get that payrise you need to feed your 18 children, Meier." And pity the guys with a 60 or 70 albatross hung around their necks. What if they've just not been able to be attached to a big, expensive manshoot project that attracts enough breathless, drooling reviews from shooter-hungry review sites? Cutting one's teeth on lower-key projects is no bad thing, and frankly learning hard lessons on shitty, underfundedgames is only going to increase your skills. This doesn't meaningfully reflect any of that, or a whole lot more.

For instance, let's try Ruslan Didenko, lead designer on the last two STALKER games. He's a 77, which by the games industry's current perception of scores means mediocrity. Yet we know the STALKER titles are amongst the most technically and creatively ambitious, clever and atmospheric videogames of recent years; their relative inaccessibility (by mainstream standards) simply means they'll never score the big numbers on some of the biggest sites/publications. Not to mention that STALKER: Clear Sky was a bit of a boo-boo (at least at launch), so it's understandable that its numbers are a bit lower than those of its successor, Call of Pripyat. But Pripyat demonstrated deftly that the devs had learned their lesson and honed their craft. Does a personal rating of 77 reflect that accurately?

Similarly, there's Deep Shadows' Oleksandr Khrutskyy, one of the lead designers on Boiling Point. That might be a game of legendary comedy value, but in a very real way it's an incredible technical achievement from a tiny team with a tiny budget. 61% doesn't exactly convey how driven Khrutskyy surely is.

Then there's poor, poor Sergey Titov, who was producer and programmer on Big Rigs: Over the Road Racing. Apparently he is only 8% good enough. There is 92% wrong with this person. How's that going to look on the CV? Quite clearly Metacritic scores will not be the only factor involved in hiring and salary decisions, not by a long way: my feeling is simply that they perhaps shouldn't be involved at all. I'm not sure what other purpose this new rating has, however. Theories?

Sure, it's an entertaining, perhaps even passingly informative exercise to look up and compare these kinds of rankings, but I'm concerned they paint an incredibly inaccurate picture of a developer's achievements and skills. Aggregating game review scores might be a useful touchstone, but it unavoidably loses nuance - do we really want that for people too?
