the median is the message

June 2, 2009

Over at Wowenomics, Gevlon from the Greedy Goblin left a comment about datamining that is worth replying to. Basically, his point is that sites like this one report on what the average player is doing, but that is not much use because the average player is only making average choices. (…or at least that’s the polite paraphrase of his point.)

In fact, the data shows that this argument does not hold. Let’s start from what the data actually looks like. Here, I’ve taken an example picked at random: all choices made by level-69 Death Knights (69 DKs) for the chest slot:

Item Count
Mightstone Breastplate 2421
Battlemaster’s Breastplate 1010
Scavenged Tirasian Plate 996
Adamantite Breastplate 898
Murkblood Avenger’s Chestplate 655
Gorge’s Breastplate of Bloodrage 309
Battle Leader’s Breastplate 208
Saronite War Plate 197
Fel Iron Breastplate 141
Unscarred Breastplate 136
Westguard Armor 78
Coldrock Breastplate 75
Azure Chain Hauberk 71
Durotan’s Battle Harness 70
Baleheim Armor 68
Light-Touched Breastplate 63
Breastplate of the Warbringer 48
Andrethan’s Masterwork 46
Bone-Threaded Harness 44
Segmented Breastplate 32
Conqueror’s Breastplate 30
Blacksoul Protector’s Hauberk 26
Bloodfist Breastplate 23
Chestguard of Illumination 21
Nether Protector’s Chest 21
Vest of Vengeance 21
Soul Saver’s Chest Plate 21
Scavenged Breastplate 18
Chestguard of Salved Wounds 18
Breastplate of Blade Turning 18
Light-Bound Chestguard 17
Warmaul Breastplate 17
Breastplate of Retribution 15
Blackened Chestplate 15
Boulderfist Armor 14
Heavy Earthforged Breastplate 14
Shattered Hand Breastplate 12
Talonguard Armor 12
Reaver Armor 11
The Exarch’s Protector 9
Lost Chestplate of the Reverent 8
Bloodscale Breastplate 8
Gilded Crimson Chestplate 7
Chestplate of A’dal 7
Jerkin of the Untamed Spirit 7
Khan’aish Breastplate 7
Redeemer’s Plate 7
Torn-heart Family Tunic 7
Protectorate Breastplate 6
Marshwalker Chestpiece 4
Bogslayer Breastplate 4
Demon-Forged Chestguard 4
Warden’s Hauberk 4
Shamblehide Chestguard 3
Garmaul Chestpiece 3
Elegant Dress 2
Crimson Mail Hauberk 2
Cenarion Thicket Jerkin 2
Ango’rosh Breastplate 2
Azure Silk Vest 1
Acherus Knight’s Tunic 1
Black Mageweave Vest 1
Bonechewer Berserker’s Vest 1
Corsair’s Overshirt 1
Darkcrest Breastplate 1
Demon-Forged Hauberk 1
Drakescale Breastplate 1
Breastplate of Many Graces 1
Chestguard of the Dark Stalker 1
Chestguard of the Stormspire 1
Chestguard of the Talon 1
Farshire Robe 1
Flimsy Chain Vest 1
Lovely Black Dress 1
Lovely Blue Dress 1
Simple Black Dress 1
Skom Chain Vest 1
Scale Brand Breastplate 1
Refuge Armor 1
Runecloth Robe 1
Nexus-Strider Breastplate 1
Spring Robes 1
Tuxedo Jacket 1
Twilight Cultist Robe 1
Warrior’s Embrace 1
Worgblood Berserker’s Hauberk 1
Wrathfin Armor 1

It’s worth charting this distribution too, since the shape of the distribution curve is important:

[Chart: 69 DK chest item choice distribution.]

[Chart: 69 DK chest item choice distribution, log scale.]

The error in Gevlon’s argument stems from our common-sense understanding of average. Most often we think in terms of a Gaussian distribution – so often that it is actually called a normal distribution. When events or things are distributed normally, then the average outcome is, well, average. But, as various people have observed, when network effects come into the picture, the typical distribution is not Gaussian but power-law-like. The majority cluster around a very few choices, with a rapid fall-off into a long tail of funkier choices, each of which is made by only a few individuals.
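One rough way to check the “power-law-like” claim against the table above is to fit a straight line to log(count) versus log(rank): a power law shows up as a roughly straight, downward-sloping line on a log-log plot. The sketch below does this in plain Python with the counts copied from the table. The `loglog_slope` helper is a name made up for this illustration, and a least-squares line is only a crude diagnostic, not a rigorous test for a power law.

```python
import math

# Per-item counts copied from the table above, most popular first.
counts = [
    2421, 1010, 996, 898, 655, 309, 208, 197, 141, 136, 78, 75, 71, 70, 68,
    63, 48, 46, 44, 32, 30, 26, 23, 21, 21, 21, 21, 18, 18, 18, 17, 17, 15,
    15, 14, 14, 12, 12, 11, 9, 8, 8, 7, 7, 7, 7, 7, 7, 6, 4, 4, 4, 4, 3, 3,
    2, 2, 2, 2,
] + [1] * 28  # the 28 items that appear exactly once in the table


def loglog_slope(values):
    """Least-squares slope of log(count) against log(rank), with rank starting at 1."""
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


slope = loglog_slope(counts)
print(f"{len(counts)} items, log-log slope = {slope:.2f}")
```

A clearly negative slope is consistent with the steep head and long tail described above; it does not by itself rule out other heavy-tailed shapes.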

Now WTF does all that mean in plain English? Simple. If WoW gear, gem or enchant choices were normally distributed, a few people would make the best choice, a few people would make the worst choice and most would make a so-so choice. We would expect to see a few 69 DKs with the Uber Breastplate of Pwnage, a few with the Scruffy Tunic of Suckage, but most would be wearing the Mediocre Breastplate of, um…, Mediocrity. And that’s what my report would find for you.

But you can see from the charts that the data looks nothing like a normal distribution. Most players have in fact made the same few choices – which generally represent a trade-off between how powerful the item is and how easy it is to get hold of. Those people who don’t follow the crowd are out in the long tail. Here the picture is murky, because we don’t know whether they are there out of ignorance or because they have hit on some effective but as-yet unknown (or difficult to obtain) solution to the problem. (And some are out there because the data is capturing multiple playstyles – no doubt those toons wearing tuxedo jackets and lovely blue dresses are being played by people who know exactly what they are doing.)
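To put numbers on “the same few choices”, here is a minimal sketch that summarises the table above: the share of the sample wearing one of the five most popular items, the number of one-off items in the tail, and the gap between the mean and median count per item. It assumes each count in the table stands for one sampled character; the variable names are illustrative only.

```python
from statistics import mean, median

# Per-item counts copied from the table above, most popular first.
counts = [
    2421, 1010, 996, 898, 655, 309, 208, 197, 141, 136, 78, 75, 71, 70, 68,
    63, 48, 46, 44, 32, 30, 26, 23, 21, 21, 21, 21, 18, 18, 18, 17, 17, 15,
    15, 14, 14, 12, 12, 11, 9, 8, 8, 7, 7, 7, 7, 7, 7, 6, 4, 4, 4, 4, 3, 3,
    2, 2, 2, 2,
] + [1] * 28  # the 28 items that appear exactly once in the table

total = sum(counts)                             # assumed: one character per count
top5_share = sum(counts[:5]) / total            # head of the distribution
singletons = sum(1 for c in counts if c == 1)   # far end of the long tail

print(f"characters sampled: {total}")
print(f"share wearing a top-5 item: {top5_share:.1%}")
print(f"items worn by exactly one character: {singletons} of {len(counts)}")
print(f"mean wearers per item: {mean(counts):.1f}, median: {median(counts)}")
```

On these counts, roughly three-quarters of the sample sits on the top five chest pieces, while the median item is worn by only a handful of characters. A normal distribution would put the mean and the median close together; here they are far apart.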

But for our purposes, averages are just what we want – they show the consensus view across the player base on what are the reasonably sensible and effective choices.

To me, the interesting question is how this consensus forms. Undoubtedly Gevlon’s point has an element of truth – the average WoW player is no theorycrafter. But there are feedback mechanisms that shape their choices. They have the game itself. And they have instant access to the collective wisdom of the player-borg’s vast hive mind thanks to all the commentary and guides here in cyberspace. It is these network effects that make the distribution take the shape that it has.

9 Responses to “the median is the message”

  1. Furl, Cairne-US Says:

    I would note that in this specific case the Mightstone Breastplate is so especially prevalent on account of it being the quest reward chest available at 68 that upgrades the chest directly below it in popularity.

    Also, I think that if you were to grade each item here by, say, whether it upgrades the standard–Saronite War Plate from the DK questline–you would find the distribution more gaussian than logarithmic… id est:

    http://bayimg.com/image/maaedaach.jpg

    I used the same numerical data, used numbers from Wowhead’s weight scale system (took the DK preset and removed hit rating). That looks a bit more gaussian, eh? (Very, very noisy, but that’s halfway accountable to my laziness in applying numbers to the BoE greens and the robes.)

  2. jederus Says:

    Well said Zardoz. This was, in fact, the point of our post (although I probably mangled it in my writing style or lack thereof). The point being that what is most popular is often most profitable. Not talking about what’s best or what players ‘should’ be doing… that’s a discussion best left to the theory crafters. All we care about is making money and, as I hoped to point out, your site can be used as an excellent resource for this line of thinking. Thanks again for such a powerful tool.

  3. zardoz Says:

    To Jederus:
    Thanks. I’m having fun with this site so I’m glad it’s useful to a broad range of players.

    To Furl:

    Now that is an interesting chart! I’ll see if I can reproduce it here with my data.

    The term “average” takes on more than one meaning in this discussion which makes it a bit hard to be clear.

    In terms of item quality, I don’t expect the average leveling player to pick the uber gear (because the game does not force them to do so). The only thing that can be said is that the most popular choices are likely to be reasonable choices for the specific class and level. By “reasonable”, I mean some trade-off between a) how easy it is to obtain an item and b) how effective the item is given the demands set by the game. The most popular items (especially for leveling players) will be the ones that are the easiest to obtain but give enough bonuses to handle bog-standard PvE questing which is what the majority of players do.

    So I’d expect a similar story to be told for other classes and levels – the more popular items may be those that flow from specific quest chains etc, so long as they are sufficiently effective for the level.

    But gear scoring metrics do add another dimension to the picture – a dimension that is certainly missing from the random sampling data I provide here. I am looking at ways to address that issue.


  4. [...] Zardoz from Armory Data Mine explains the reasoning wonderfully in his post about median vs average. [...]


  5. Very interesting post. As someone who works at the intersection of rhetoric and network/complexity theory, I think you’ve hit this one out of the park, so to speak. If anything, Gevlon would want to argue that the “hive mind” mentality impinges upon experimentation (making it difficult to discover innovations deep in the long tail–those conversations do take place on the web, but only the most dedicated junkies will try experiments such as Paladins wearing tanking gear while Ret spec’d for PvP). However, the kind of socially regulated norming (you can’t wear X if you are Y) in WoW does create stable expectations for performance and make it easier for more players to enter into high level content.

    I wonder if there is a way for you to collect data on characters based on achievements? Could you, for instance, look at all the Paladin tanks with the Siege of Ulduar vs. a much more difficult achievement, such as “He’s not Getting Any Older”?

  6. zardoz Says:

    Achievements tell an interesting story but they’re lower down my to-do list at the moment. There are a few sites that cover them from various angles, but there’s a lot more to be said about them.

  7. blee Says:

    Good job on the data, by the way. Most of the data here appears to be twink-oriented. It would be interesting and nice to know the actual counts behind these data instead of percentages (or, even better, a breakdown for each battlegroup too); that way, we could have a better analysis of the data. The problem with this is that your sample is just a fraction of the entire data. Statistically, however, the ratios should be pretty close if your sample size is large (which it appears to be, since you have about 100k).

  8. mister six Says:

    I was wondering the same as insignificantwrangler. Any news about where they are in your to-do list? It’s basically a potentially interesting filter to bring more nuance to the level 80 swath of data you’re already so marvelously providing for us.

  9. zardoz Says:

    Ah, sorry, but I doubt I’ll be doing anything about achievements any time soon. I agree that there’s a lot of material worth looking at there, but I’m moving on to other projects for the foreseeable future.

