back to the future

February 2, 2011

Just a brief note to say that I haven’t abandoned all hope of getting this site going again. Even though I’m not playing MMOs at the moment, it seems a shame to just let everything sit idle. My basic infrastructure runs without too much effort, so it is no great problem to refresh the data every couple of months.

The main obstacle is that Blizz is now serving the up-to-date data from battle.net in HTML format rather than XML. My page-scraping code needs to change to cope with that. Fortunately, however, the Blizz engineers are serving up valid XHTML, which means that XPath expressions can still be used to extract the data we need.

If I’ve been a good little engineer then only my XPaths need to change and nothing else…

There is a danger that the XPath expressions can become more than a bit baroque because they have to navigate through all the HTML markup to get to the data nodes, although there are tricks to get the XPath engine to do a lot of the searching.
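Here is a minimal sketch of the idea in Python, using only the standard library. The XHTML fragment, its class names and its values are all invented for illustration and are not battle.net’s real markup. The point is just that a descendant search (`.//`) lets the engine do the hunting, so the path does not have to spell out every wrapper element.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment: element names, class names and values are
# invented for this sketch and do not reflect battle.net's actual pages.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="profile">
      <div class="name">Exampletoon</div>
      <ul class="stats">
        <li class="stat" id="stamina">1234</li>
        <li class="stat" id="strength">567</li>
      </ul>
    </div>
  </body>
</html>"""

# XHTML elements live in a namespace, so the paths need a prefix mapping.
NS = {"x": "http://www.w3.org/1999/xhtml"}


def extract_stats(xhtml_text):
    """Return {stat id: integer value} for every li with class 'stat'."""
    root = ET.fromstring(xhtml_text)
    # The leading .// searches all descendants, so the intermediate
    # div and ul wrappers never appear in the expression.
    nodes = root.findall(".//x:li[@class='stat']", NS)
    return {node.get("id"): int(node.text) for node in nodes}


stats = extract_stats(SAMPLE_XHTML)
name = ET.fromstring(SAMPLE_XHTML).find(".//x:div[@class='name']", NS).text
print(name, stats)
```

ElementTree only understands a subset of XPath, so anything fancier would need a fuller XPath engine. If the page layout changes, only the two path strings should need touching.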

Anybody looking for inspiration on how to parse the battle.net XHTML should check out these posts by a geek blogger called Kastang. That’s the method I’ll be using when I get back to all this.

Things may look quiet here but behind the scenes I’ve been working on my project to get up to speed with modern datamining algorithms. The first step has been to assemble some sources of information and some tools for the job.

For a textbook, I’ve chosen Introduction To Data Mining by Tan, Steinbach and Kumar. It provides a good overview of the key algorithms, along with important issues like data quality and consistency. It also introduces the maths in a reasonably gentle way.

Fortunately, while it is important to understand how the algorithms work, it is not necessary to work the maths by hand. There are some first-class freeware datamining programs available that do all the heavy lifting, so long as you know how to prepare the data and how to set the parameters of the algorithms so they produce valid results.

Three datamining packages in particular are worth noting: Weka, RapidMiner and R.

Weka and RapidMiner are GUI-driven toolsets, whereas R is more command-line oriented. You can download all of these and play around with them at home. They’re not toys, so you need to have some confidence about plowing through the user guides and technical manuals, but they are easy enough to get up and running.

The choice between Weka and RapidMiner is a difficult one. At the moment I’m working with Weka but that is mainly because it was the first one I started experimenting with.

The other crucial thing to have at hand is some test data. Of course you may want to remind me that I have several GB of WoW-related data right here. But that is not the place to start. The first step is to learn how the tools work and how to use them. For that we need data where we know the expected results – so that when the tool doesn’t produce the right result we know to look again at how we have applied the algorithm.

The data has to be challenging enough to put the algorithms to a test, but not so challenging that we are left wondering whether the wrong answer is due to the complexity of the data or just due to some dumb mistake.

Over at the Expressive Intelligence Studio blog, Chris Lewis posted an interesting report about using a toon’s gear to predict its class. That’s exactly the sort of place we want to start since all sorts of classifying and clustering algorithms could be tested on a data set like that. The other idea that occurred to me is to use talent builds to predict the spec of the toon. Of course you can do that in a very simple way by just adding up the points spent in each of the three trees: a paladin that has most points in the holy tree is a holy paladin.
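The “just add up the points” approach is only a couple of lines of Python. This is a sketch: the build below is made up (it simply sums to the 71 points a level 80 toon has to spend), and the `dominance` helper is my own name for the share of points sitting in the biggest tree.

```python
def predict_spec(points_by_tree):
    """Return the name of the tree with the most talent points in it."""
    return max(points_by_tree, key=points_by_tree.get)


def dominance(points_by_tree):
    """Fraction of all spent points that sit in the biggest tree."""
    total = sum(points_by_tree.values())
    return max(points_by_tree.values()) / total


# Hypothetical level 80 paladin build, invented for illustration.
paladin = {"holy": 51, "protection": 5, "retribution": 15}

print(predict_spec(paladin), round(dominance(paladin), 2))
```

A dominance figure close to 1 means the simple rule is on safe ground. The lower it drops, the more the build is spread around and the less the biggest tree tells you.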

Where the problem becomes interesting is with those classes where there is a tendency to spread talent points across more than one tree. I’m thinking mainly of mages and warlocks but any class where the three trees don’t map straight onto the tank/healz/dps holy trinity should see some points spread across multiple trees. Can datamining algorithms handle “fuzzy” data like that?

To make this discussion more concrete, let’s have a quick look at that very question. We can fire up Weka and feed in a sample of level 80 paladin talent builds. To keep it simple, I’m using a toy data set of only 150 paladins with 50 from each of the 3 trees. We can run a basic k-means clustering algorithm over the data, which we hope should produce 3 clusters: one each for the holy, protection and retribution trees. And voilà…

Paladins clustered

That works because holy paladins don’t spend many points in protection or retribution talents. But for mages, where there is a significant tendency to spend points in more than one tree we get this:

Mages clustered.

Now the algorithm is flummoxed – putting arcane and frost mages in the same cluster and splitting fire mages into two clusters. So we have a simple data set that is also challenging enough to put these tools to a bit of a test.
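For anyone curious about what is going on under the hood, here is a bare-bones k-means (Lloyd’s algorithm) in plain Python. It is a sketch and not Weka’s implementation: the nine builds are invented, and the starting centroids are hand-picked (one from each tree) so that the run is repeatable.

```python
import math


def kmeans(points, start_centroids, max_iter=100):
    """Lloyd's algorithm: assign each point to its nearest centroid,
    move each centroid to the mean of its members, repeat until stable."""
    centroids = list(start_centroids)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # nothing moved, so we have converged
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its old centroid
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centroids


# Hypothetical builds as (holy, protection, retribution) points, 71 each.
builds = [
    (51, 5, 15), (53, 0, 18), (56, 10, 5),   # mostly holy
    (5, 51, 15), (0, 53, 18), (10, 56, 5),   # mostly protection
    (5, 15, 51), (0, 18, 53), (10, 5, 56),   # mostly retribution
]

# Hand-picked seeds (an assumption made for repeatability).
seeds = [builds[0], builds[3], builds[6]]
labels, centres = kmeans(builds, seeds)
print(labels)
```

With builds this lopsided, each point sits far closer to its own group than to the others, which is the paladin situation. The mage result comes from points that sit between the groups, where the nearest-centroid rule has much less to work with.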

I’ve also made a third data set using priest builds. Priests have more talent points invested across the trees than paladins but fewer than mages. Clustering this data set is left as an exercise for the reader…

No, seriously… Anybody who’d like to experiment with these data sets can download them from the links here. They’re in a standard .arff format (really just an annotated csv file) that Weka and RapidMiner both know how to load. Note that I’ve used a “.pdf” extension since WordPress will not allow me to upload arbitrary file types. But if you open them in a text editor you’ll see they are just csv data. Rename the extension to “.arff” and they’re ready to go.

ClassByGear.arff
PaladinBuilds.arff
MageBuilds.arff
PriestBuilds.arff
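If you would rather poke at the files from a script than from Weka, a simple .arff file is easy to read by hand. The sketch below assumes the plain case only: unquoted attribute names, comma-separated rows and no sparse data. The sample text and its attribute names are invented and are not the contents of the files above.

```python
# Hypothetical file contents; the attribute names are invented for this sketch.
SAMPLE_ARFF = """% comment lines start with a percent sign
@relation paladin_builds
@attribute holy numeric
@attribute protection numeric
@attribute retribution numeric
@attribute spec {holy,protection,retribution}
@data
51,5,15,holy
0,53,18,protection
"""


def read_arff(text):
    """Parse a simple ARFF file into (attribute names, rows of strings)."""
    names, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            rows.append([cell.strip() for cell in line.split(",")])
        elif line.lower().startswith("@attribute"):
            names.append(line.split()[1])  # second token is the name
        elif line.lower().startswith("@data"):
            in_data = True  # everything after this is csv rows
    return names, rows


names, rows = read_arff(SAMPLE_ARFF)
print(names, len(rows))
```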

I’ll have a lot more to say about these little data sets in the next few posts.

pet sounds

July 3, 2009

Here’s news for Hunters:  Blizz has added your pets and their talents to your character talent page in the armoury.

That will give us a lot of new insight into the types of creatures that make the most popular pets, and how players spend pet talent points. The XML is straightforward and contains no quirks and so I’m cranking up the SQL editor as we speak.

I’ll try and have the new reports ready before the next refresh of my data. As to when that will be, well, about a week after Patch 3.2 drops. As to when that will be… well…

Over at Wowenomics, Gevlon from the Greedy Goblin left a comment about datamining that is worth replying to. Basically, his point is that sites like this one report on what the average player is doing, but that is not much use because the average player is only making average choices. (…or at least that’s the polite paraphrase of his point.)

In fact the data shows that this sort of argument is not true. Let’s start from what the data actually looks like. Here, I’ve taken an example picked at random: all choices made by 69 DKs for the chest slot:

Item Count
Mightstone Breastplate 2421
Battlemaster’s Breastplate 1010
Scavenged Tirasian Plate 996
Adamantite Breastplate 898
Murkblood Avenger’s Chestplate 655
Gorge’s Breastplate of Bloodrage 309
Battle Leader’s Breastplate 208
Saronite War Plate 197
Fel Iron Breastplate 141
Unscarred Breastplate 136
Westguard Armor 78
Coldrock Breastplate 75
Azure Chain Hauberk 71
Durotan’s Battle Harness 70
Baleheim Armor 68
Light-Touched Breastplate 63
Breastplate of the Warbringer 48
Andrethan’s Masterwork 46
Bone-Threaded Harness 44
Segmented Breastplate 32
Conqueror’s Breastplate 30
Blacksoul Protector’s Hauberk 26
Bloodfist Breastplate 23
Chestguard of Illumination 21
Nether Protector’s Chest 21
Vest of Vengeance 21
Soul Saver’s Chest Plate 21
Scavenged Breastplate 18
Chestguard of Salved Wounds 18
Breastplate of Blade Turning 18
Light-Bound Chestguard 17
Warmaul Breastplate 17
Breastplate of Retribution 15
Blackened Chestplate 15
Boulderfist Armor 14
Heavy Earthforged Breastplate 14
Shattered Hand Breastplate 12
Talonguard Armor 12
Reaver Armor 11
The Exarch’s Protector 9
Lost Chestplate of the Reverent 8
Bloodscale Breastplate 8
Gilded Crimson Chestplate 7
Chestplate of A’dal 7
Jerkin of the Untamed Spirit 7
Khan’aish Breastplate 7
Redeemer’s Plate 7
Torn-heart Family Tunic 7
Protectorate Breastplate 6
Marshwalker Chestpiece 4
Bogslayer Breastplate 4
Demon-Forged Chestguard 4
Warden’s Hauberk 4
Shamblehide Chestguard 3
Garmaul Chestpiece 3
Elegant Dress 2
Crimson Mail Hauberk 2
Cenarion Thicket Jerkin 2
Ango’rosh Breastplate 2
Azure Silk Vest 1
Acherus Knight’s Tunic 1
Black Mageweave Vest 1
Bonechewer Berserker’s Vest 1
Corsair’s Overshirt 1
Darkcrest Breastplate 1
Demon-Forged Hauberk 1
Drakescale Breastplate 1
Breastplate of Many Graces 1
Chestguard of the Dark Stalker 1
Chestguard of the Stormspire 1
Chestguard of the Talon 1
Farshire Robe 1
Flimsy Chain Vest 1
Lovely Black Dress 1
Lovely Blue Dress 1
Simple Black Dress 1
Skom Chain Vest 1
Scale Brand Breastplate 1
Refuge Armor 1
Runecloth Robe 1
Nexus-Strider Breastplate 1
Spring Robes 1
Tuxedo Jacket 1
Twilight Cultist Robe 1
Warrior’s Embrace 1
Worgblood Berserker’s Hauberk 1
Wrathfin Armor 1

It’s worth charting this distribution too, since the shape of the distribution curve is important:

69 DK chest items - distribution.

69 DK item distribution - log scale

The error in Gevlon’s argument stems from our common-sense understanding of average. Most often we think in terms of a Gaussian distribution – so often that it is actually called a normal distribution. When events or things are distributed normally, then the average outcome is, well, average. But, as various people have observed, when network effects come into the picture, the typical distribution is not Gaussian but power-law-like. The majority cluster around a very few choices, with a rapid fall-off into a long tail of more funky choices, where each choice in the long tail is made by only a few individuals.

Now WTF does all that mean in plain English? Simple. If WoW gear, gems or enchant choices were normally distributed, a few people would make the best choice, a few people would make the worst choice and most would make a so-so choice. We would expect to see a few 69 DKs with the Uber Breastplate of Pwnage, a few with the Scruffy Tunic of Suckage, but most would be wearing the Mediocre Breastplate of, um…, Mediocrity. And that’s what my report would find for you.

But you can see from the charts that the data looks nothing like a normal distribution. Most players have in fact made the same few choices – which generally represent a trade-off between how powerful the item is and how easy it is to get hold of.  Those people who don’t follow the crowd are out in the long tail – here the picture is murky because we don’t know whether they are there because of ignorance or whether they have hit on some effective but as-yet unknown (or difficult to obtain) solution to the problem. (And some are out there because the data is capturing multiple playstyles – no doubt those toons wearing tuxedo jackets and lovely blue dresses are being played by people who know exactly what they are doing.)
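A quick way to put numbers on “most players have made the same few choices” is to look at the cumulative share of the top few items and at the slope of the rank/count line on log-log axes. The sketch below uses only the ten most popular rows from the table above, so the shares are relative to those ten rows and not to the whole table. The function names are my own.

```python
import math

# Counts for the ten most popular chest items, copied from the table above.
top_counts = [2421, 1010, 996, 898, 655, 309, 208, 197, 141, 136]


def cumulative_share(counts, n):
    """Fraction of the listed wearers covered by the n most popular items."""
    ordered = sorted(counts, reverse=True)
    return sum(ordered[:n]) / sum(ordered)


def loglog_slope(counts):
    """Least-squares slope of log(count) against log(rank)."""
    ordered = sorted(counts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(c) for c in ordered]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


print(f"top 5 share of these ten rows: {cumulative_share(top_counts, 5):.1%}")
print(f"log-log slope: {loglog_slope(top_counts):.2f}")
```

A roughly straight, downward-sloping line on the log-log chart is the usual rough sign of power-law-like behaviour, although a slope fitted to ten points is an eyeball check and not a proper statistical test.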

But for our purposes, averages are just what we want – they show the consensus view across the player base on what are the reasonably sensible and effective choices.

To me, the interesting question is how this consensus forms. Undoubtedly Gevlon’s point has an element of truth – the average WoW player is no theorycrafter. But there are feedback mechanisms that shape their choices. They have the game itself. And they have instant access to the collective wisdom of the player-borg’s vast hive mind thanks to all the commentary and guides here in cyberspace. It is these network effects that make the distribution take the shape that it has.

You may not have noticed yet, given all the 3.1 (and now 3.1.1!) fun, but a whole slab of character stats have disappeared from the armoury. The most interesting ones that have gone are the detailed BG performance stats. Strangely enough, the raiding performance numbers are still there – the missing ones all seem to be PvP related.

Now if I were a, like, y’know, paranoid kinda person, I’d be putting forward the following conspiracy theory. PvP stats are about class performance and raid stats are about group performance. Raid stats tell you something about a guild because performance depends on the ability of the guild to coordinate, to lead, to control the Leeroy Jenkins element, etc. And indeed there are websites out there that do exactly that: rate guilds by how many, and which, raid dungeon bosses they’ve downed.

But PvP stats, being all about the mano a mano thing, tell you something about the relative performance of classes. And that subject really does seem to be Blizz’s bête noire these days. Just exactly why they’re so focused on it escapes me – but then I never go anywhere near the official fora so maybe that’s why I’m in the dark.

My conspiracy theory would be that they don’t want anybody to be able to just run class balance through the ol’ spreadsheet to see what comes out the other end.

I made a modest contribution to using the BG data to look at class balance here on this blog. But I believe I was pretty careful to say, like dude, we don’t expect the classes to be balanced across any narrow set of performance measures. Classes that can tank and CC and heal are people too.

But anyway I’m not the paranoid type, so that’s enough of that. It’s just a silly game. Let’s move on.

Just to complete the picture, here is a set of battleground class performance charts for each of the x9 BG levels plus level 80. The y-axis now shows average deaths per game and not the inverse, so the sweet spot of high-kills-low-deaths is in the bottom right-hand corner of the chart.

The sample consists of all players at each level who have played 100 or more BGs. The data is from patch 3.0.9.
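For the record, the numbers behind each chart point boil down to something like the sketch below. The records are invented, and pooling the totals per class (total kills over total games) is an assumption of the sketch; averaging each player’s own ratio would be an equally plausible reading.

```python
from collections import defaultdict

# Hypothetical records: (class, games played, kills, deaths).
players = [
    ("Hunter", 250, 1500, 500),
    ("Hunter", 120, 600, 360),
    ("Warrior", 300, 900, 1200),
    ("Warrior", 40, 400, 20),  # fewer than 100 games, so left out
]


def class_averages(records, min_games=100):
    """Return {class: (kills per game, deaths per game)} for qualifying players."""
    totals = defaultdict(lambda: [0, 0, 0])  # games, kills, deaths
    for cls, games, kills, deaths in records:
        if games < min_games:
            continue  # apply the 100-game sample threshold
        totals[cls][0] += games
        totals[cls][1] += kills
        totals[cls][2] += deaths
    return {cls: (k / g, d / g) for cls, (g, k, d) in totals.items()}


averages = class_averages(players)
print(averages)
```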

There are a lot of interesting things to note in those charts, especially when you compare the same class at different levels. Some are effective at all levels, others appear to change roles as they level up.

If you want the executive summary, these are the points that strike me:

  • DKs are OP
  • Rogues aren’t, even though people think they are
  • Warriors are fragile, despite all that armour
  • Warlocks are still a force in PvP as long as you don’t mind dying a lot
  • Baby Paladins may be easy meat but the adult of the species sure isn’t
  • Hunters seem to be the consistent high performer, but that is probably because they just play the same role (of ranged attacker) at every level

Do exercise some caution when interpreting these results. In particular remember:

  1. Some classes have fewer attacking players and therefore a lower average kill rate just because they have a healing tree. Other classes may spend a lot of time CC-ing instead of attacking.
  2. Some BGs have objectives that conflict with straight PvP. For example in Warsong Gulch, the classes that spend most time running the flag will have a lower kill rate because of that.
  3. This is data aggregated across every BG accessible at the level. There may be specific features of individual BGs that make certain classes more effective there, despite these charts.

19

BG Class Effectiveness, Level 19

29

BG Class Effectiveness, Level 29

39

BG Class Effectiveness, Level 39

49

BG Class Effectiveness, Level 49

59

BG Class Effectiveness, Level 59

69

BG Class Effectiveness, Level 69

79

BG Class Effectiveness, Level 79

80

BG Class Effectiveness, Level 80

One annoying problem for armoury mining is that the armoury servers do not give the name of the enchant attached to an enchanted item. What you get are the bonuses granted by the enchant, along with a magic number key that references some internal Blizz data source.

The trick is to find the enchant that goes with the bonus, since the thingy that grants the bonus is what matters to players.

So we get bonus strings like +10 Defense Rating/+10 Stamina/+15 Block Value. But that’s not a lot of use unless I can tell you how to get these bonuses. What you need to know is that Presence of Might grants this. But nothing in the armoury gives us that link.
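Once the link is known, the lookup itself is trivial. The sketch below assumes only that a bonus string is a set of pieces separated by slashes; it normalises the string so that spacing, letter case and the order of the pieces do not matter. The single table entry is the example from this post, and the function names are my own.

```python
def bonus_key(bonus):
    """Canonical key for a bonus string: pieces trimmed, lower-cased, sorted."""
    return tuple(sorted(piece.strip().lower() for piece in bonus.split("/")))


# One known mapping, taken from the example in the post.
ENCHANTS = {
    bonus_key("+10 Defense Rating/+10 Stamina/+15 Block Value"): "Presence of Might",
}


def enchant_for(bonus):
    """Return the enchant name for a bonus string, or None if unknown."""
    return ENCHANTS.get(bonus_key(bonus))


print(enchant_for("+10 Stamina / +10 Defense Rating / +15 Block Value"))
print(enchant_for("+2 Fishing"))
```

The hard part, of course, is filling in the table, not reading from it.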

I must admit that I’d basically filed this one away in the too-hard basket – that’s why there is no enchant data on this site. But now my plan to do twink analysis means that I can no longer ignore the problem – it’d be a pretty tenth-rate twink that wasn’t enchanted up to the gills.

Fortunately, over at Armory Musings, Okoloth came up with an ingenious solution to this problem. You can read all about it here, but basically what he’s doing is searching various sites in the WoW datasphere for the bonus strings, and finding the matching enchant or item. Brilliant!

Okoloth produced an XML file that maps the enchants-to-bonuses he was able to discover. You can get that XML file from here. But being a do-it-yourself kinda bloke, I thought I’d have a go at the problem myself and see what I could do.

Now, when you mention search, one word pops into my mind straight away – starts with ‘G’… Fortunately, Google offers a way of using their data from software. It is possible to write a program that will get Google to search for a phrase, and return the search results in a structured form that in turn can be processed locally. In particular, you get the URL of the page, which is what we want for the link, and the title of the page – which should, with a bit of luck, be the name of the enchant spell or item. The rest of the page indexed by Google can be ignored – those are the two pieces of data we’re after.

A bit of string-searching on the returned results for a well-known WoW database site like Wowhead or Thottbot is all that is required. The correct search results can be identified and written out to XML or whatever.
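The filtering step might look like the sketch below. It deliberately leaves out the search call itself: the shape of the results (a list of dicts with `title` and `url` keys), the sample titles and the sample URLs are all assumptions made for illustration, not the format any real search API returns.

```python
from urllib.parse import urlparse

# WoW database sites whose hits we trust (the two named in the post).
KNOWN_SITES = ("wowhead.com", "thottbot.com")


def pick_database_hits(results):
    """Keep (title, url) pairs whose host is a known WoW database site."""
    hits = []
    for result in results:
        host = urlparse(result["url"]).hostname or ""
        # Match the bare domain or any subdomain such as www.
        if any(host == site or host.endswith("." + site) for site in KNOWN_SITES):
            hits.append((result["title"], result["url"]))
    return hits


# Hypothetical search results, invented for this sketch.
sample_results = [
    {"title": "Presence of Might", "url": "http://www.wowhead.com/example-item"},
    {"title": "Bob's Bait and Tackle", "url": "http://www.example.com/fishing"},
]

hits = pick_database_hits(sample_results)
print(hits)
```

Checking the parsed hostname instead of searching the whole URL string avoids false matches where a site name merely appears somewhere in the path.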

Google has a nice REST API, intended for embedding in webpages, but easily callable from your favourite programming or scripting language. There is even an open-source C# wrapper for those of us trapped in Billg-land. If you want to repeat this exercise from an MS environment, I highly recommend this, which hides all the googly grunge.

It didn’t take long to write the code and I was ready to boldly go where only one armoury datamining site had gone before. And it worked pretty well too; with a being as omniscient as Google on my side, how could I possibly lose?

Not every enchant bonus string is accurately found. Some are too vague. +2 Fishing gets a lot of hits on bait-and-tackle shops, for example. A few end up finding no matches for reasons I can’t quite fathom. But the vast majority of bonuses find a matching enchant spell or item, and the rest can be fixed up by hand.

The ones that my software couldn’t match can generally be found by a manual search because they are buried away in tables such as those at WoWWiki.

The other problem is that a lot of enchants have links to both spells and items. Generally what we want is an item, if one exists, since that is what you have to obtain ingame to start the process. But that is not a fatal problem, since the spell-to-item cross-links can be found in the WoW database sites themselves. All I need to give you is one link to Wowhead. When you get there, you’ll find the spells and items linked together and you can sort it all out quickly.

Okoloth made his XML file freely available so it would be remiss of me not to do the same. For some mad reason I can’t post XML here, so I’ve renamed the extension to .doc. But if you look inside you’ll see it is well-formed XML; just remove the .doc extension and you’re good to go. You can download it here: {link temporarily removed by me – see update}.

Thanks again to Okoloth for solving this irksome little problem!

UPDATE: Not quite there yet – I’ve removed the xml file download as the quality isn’t quite up to scratch. I’m still confident that this method will work but there are a couple of issues to work through before I release it.

Warsong Gulch chart porn

March 20, 2009

I’ve been having a bit of fun with the battleground stats data from the Armoury. There are plenty of good strategy guides on the net for each battleground, but it’s interesting to see how well the strategy is reflected in the data. A good chart really is worth a thousand words…

Warsong Gulch is our example here. You probably know that there are a number of traps in what seems like a simple game. Trap number one is to be on a team where everyone wants to run to the middle of the field and have a big punch-up while the flag carriers run by unchallenged.

The following charts are for level 19 characters who have played at least 100 games, so we’re not talking about noobs here. But you can still see all the nuances of the game in the data.

For a start, teamwork and focus on the flags rather than on scoring kills is vital to winning. So, if we chart games won to killing blows struck by individuals, we don’t find a strong correlation. You can help to win by doing other things than killing – CC-ing, running interference, guarding the flag rooms.

Killing blows vs games won.

Being able to drop the opponent does have one important role:  getting the flag back when the other side is running with it. We can chart flags-returned to killing blows, and we get a stronger correlation:

Flags returned vs killing blows

But again, capturing the flag from the other side is about skills other than fighting and the data proves that. In fact, this is my favourite chart since the whole ding an sich of the game is in there. The best characters at capturing the flag are generally those with lower killing blow scores. They’re too busy running and hiding to be killing. So you get a gentle negative correlation as the chart shows:

Flags captured vs killing blows.

The purpose of this exercise is to see if these indicators can be used to pick out twinks from the data. I’m confident that they can. So, the next chart shows deaths per game vs games won, and suggests what we all know – that the character with the best gear and enchants stays upright for longer. They spend less time running back from the spawn point and more time on-target.

Deaths per game vs games won.

Dead toons don’t win games – it’s not rocket science.
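The correlations being eyeballed in these charts can be put on a numeric footing with a Pearson coefficient. Here is a plain-Python sketch; the two little data series are invented to show a negative relationship and are not taken from the armoury data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical per-character totals, invented for illustration.
killing_blows = [100, 200, 300, 400, 500]
flags_captured = [30, 28, 20, 15, 10]

r = pearson(killing_blows, flags_captured)
print(f"r = {r:.2f}")
```

A value near +1 or -1 means the points hug a straight line, and a value near 0 means a shapeless cloud. A single extreme character can drag the coefficient around, so it is worth reading alongside the chart and not instead of it.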

One interesting point is that there is a poor correlation between deaths per game and killing blows per game, which surprises me. I would have thought that twinked characters both killed more and died less. Perhaps there is a kamikaze style of player that kills a lot and dies a lot, and a more um… tactical… toon that can dish it out without getting too much in return.

Deaths per game vs kills per game.

One final point – if you look carefully at the last chart, you will see the bane of the armoury data miner – the annoying little outlier that makes all our averages skew away from the median. That character circled on the bottom right of the chart is one lean mean killin’ machine, a real leader of the pack. Here’s his armoury profile – check him out. He’s a twink, no surprise there, but he doesn’t seem out of the general range of twink stats. To me, he is a reminder that skill does play a part in the game too. Mind you, he played nearly 900 games to get that good.

twinks inc

March 18, 2009

While we’re waiting for patch 3.1, I’ve gone back to my pet project from a month or so ago – looking for ways to extract information on x9 battleground twinks from the armoury. Building BG twinks seems to be something that a lot of people have thought about doing at one time or another, but have run into difficulties finding good information on how to go about it.

There are a couple of good sites for twink guides, and a few individuals who have created guides for individual classes at specific levels (like this one for 19 warlock twinks). But there seem to me to be a lot of gaps in our knowledge of what people are doing at the various x9 levels. Let’s see if we can improve the situation.

The fundamental problem is that there is no IsTwink() function in the armoury. We have to identify twinked characters from amongst the general population. To do that, two other issues have to be addressed.

The first problem is simply one of getting enough characters into the database. Only about 15% of leveling characters do any BGs at all. The vast majority of those try only a few games. A sizeable majority of the rest are… well… suboptimal… PVPers, so it isn’t likely that they’re twinked.

In other words, the characters we are after represent a tiny fraction of the total number of toons in the armoury. There is no way that an armoury crawler is going to find all of them in any reasonable time frame. I’ve really only got a partial solution to this one. I’ve modified my armoury crawler so that it switches over to extracting just characters in the x9 levels after it has built up a reasonable sample of characters from every level.

I spent a bit of time today watching the algorithm run (via a debugging dump) and it is clear what is happening – the crawler spends a lot of time rejecting characters that are not at an x9 level, but then finds a big guild list of those who are. Clearly there are some sizeable BG PvP guilds out there, or a lot of guilds that have little BG-ing armies inside them. This keeps the fetch queues surprisingly full, on average, which means the crawler is mining characters at a not-too-bad rate.
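The behaviour described above can be mimicked with a toy crawler over an in-memory stand-in for the armoury. Everything here is hypothetical: the character names, the guilds, and the choice to follow the guild roster of every character visited, including the ones that get rejected. It is a sketch of the queueing idea and not my actual crawler.

```python
from collections import deque

# Hypothetical stand-in for armoury lookups: name -> (level, guild).
CHARACTERS = {
    "Alpha": (19, "Tiny Terrors"),
    "Bravo": (19, "Tiny Terrors"),
    "Charlie": (23, "Tiny Terrors"),
    "Delta": (80, "Raiders"),
    "Echo": (29, "Raiders"),
}
GUILDS = {
    "Tiny Terrors": ["Alpha", "Bravo", "Charlie"],
    "Raiders": ["Delta", "Echo"],
}


def crawl(seeds):
    """Breadth-first crawl that keeps only characters at an x9 level."""
    queue = deque(seeds)
    seen = set(seeds)
    kept = []
    while queue:
        name = queue.popleft()
        level, guild = CHARACTERS[name]
        if level % 10 == 9:
            kept.append(name)  # 19, 29, ... 79
        # Assumption: guild-mates are queued even when this toon is rejected.
        for mate in GUILDS[guild]:
            if mate not in seen:
                seen.add(mate)
                queue.append(mate)
    return kept


found = crawl(["Charlie", "Delta"])
print(found)
```

Both seeds are rejected, yet their guild lists are what keep the queue fed with x9 characters.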

Problem number two is to identify indicators of twinkyness so I can write some database queries to find the little buggers. Of course you might say, well, why not just use gear. After all, a twink by definition has better gear than would be expected for a toon of that level.

Yes, we might come to that. But the idea I want to try first is to use the new character achievements and statistics data to identify the most effective battleground toons. Of course the most powerful level 19 WSG player might have got there by skill alone, equipped with nothing other than normal whites and questing greens. But, somehow, I doubt it…

The hypothesis is that the most effective characters in the BGs will turn out to be the twinks.

So which character stats do we look at? For example, the armoury tells us the number of BGs that the character won – we can find out that a toon has played, say, 80 WSG BGs and won 75. That character has won 94% of their games. Does that make them likely to be a twink?

Well maybe… That’s the point I’m up to at the moment. I’m looking at the spread of these stats across the population base, then running various queries based on the available data to see how selective they really are. My guess is that stats like the number of BGs won are less likely to be selective for what we want than stats related to the individual performance of the character.

After all, whether you win depends on the quality of the characters around you – a cr@p team on your side or a fully twinked pre-made on t’other can make all the difference to victory or defeat. What interests me are stats like the number of killing blows landed or the number of deaths, since these are indicators of personal survivability – which is the real hallmark of the twink. If you have the health, the mana and the armour, your death rate will be lower and your time-on-target higher.
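The candidate indicators are all simple per-game ratios. In the sketch below, the 80 played and 75 won come from the example above, while the killing blow and death totals are invented. The function name is my own.

```python
def bg_indicators(played, won, killing_blows, deaths):
    """Per-game indicators for one toon in one battleground."""
    if played == 0:
        raise ValueError("no games played")
    return {
        "win_rate": won / played,
        "kills_per_game": killing_blows / played,
        "deaths_per_game": deaths / played,
    }


# 80 played and 75 won is the example from the post;
# the killing blow and death totals are invented for illustration.
example = bg_indicators(played=80, won=75, killing_blows=640, deaths=120)
print(f"win rate {example['win_rate']:.0%}, "
      f"deaths per game {example['deaths_per_game']}")
```

Whether any of these ratios actually picks out twinks is exactly the open question; the code only computes them.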

But that’s enough for one post. Stand by for some serious chart porn on battleground character stats…

kindergarden cop

February 4, 2009

A blog reader (hi Jess…) made a couple of good suggestions for data mining the more youthful part of the Azeroth population:  toons in the 10 to 20 bracket.

I posted a chart a little while ago on the number of characters at each level from 10 to 80. There is a spike in numbers at the low end and I guessed that that represented a bit of a surge in players rolling new characters. But there is an alternative explanation – that there are just a lot of abandoned toons here. Players pick a class, level the character for a while then decide they don’t like playing that class. But they don’t delete the character and the little rug rat just lingers on in toon limbo.

Unfortunately it’s hard to tell from a single scan of the armoury which is the correct explanation. I’ve set up a couple of queries that may help produce an answer but it’ll take a while to collect the data.

The other suggestion was to see if the class composition of the ankle-biter toons is being influenced by the nerf wars. At the top end, paladins are flavour of the month; that’s clear enough. But does that mean a lot of players are running out and rolling new pallys to get in on the act?

The answer from the data is “no”. Surprisingly, the numbers in each class in the 10-20 bracket are just about evenly balanced – around 10-12% for each of the nine classes allowed at these levels.

There’s no sign of any hard swings between classes, or between tank/healer/dps playstyles for that matter. The healer classes are a bit under-represented at these levels but then isn’t that true at all levels?