back to the future
February 2, 2011
Just a brief note to say that I haven’t abandoned all hope of getting this site going again. Even though I’m not playing MMOs at the moment, it seems a shame to just let everything sit idle. My basic infrastructure runs without too much effort, so it is no great problem to refresh the data every couple of months.
The main obstacle is that Blizz is now serving the up-to-date data from battle.net in HTML format rather than XML. My page-scraping code needs to change to cope with that. Fortunately, the Blizz engineers are serving up valid XHTML, which means that XPath expressions can still be used to extract the data we need.
If I’ve been a good little engineer then only my XPaths need to change and nothing else…
There is a danger that the XPath expressions can become more than a bit baroque because they have to navigate through all the HTML markup to get to the data nodes, although there are tricks to get the XPath engine to do a lot of the searching.
Anybody looking for inspiration on how to parse the battle.net XHTML should check out these posts by a geek blogger called Kastang. That’s the method I’ll be using when I get back to all this.
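To show what "getting the engine to do the searching" can look like, here is a minimal sketch in Python. The XHTML fragment, the class names and the extract_profile function are all made up for illustration; the real battle.net markup will differ. It uses the standard library's ElementTree, which only supports a subset of XPath, and it assumes the page declares the usual XHTML namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment; the real battle.net markup will differ.
SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div id="wrapper">
      <div class="profile">
        <div class="name">Exampletoon</div>
        <span class="level">80</span>
      </div>
    </div>
  </body>
</html>"""

# XHTML elements live in this namespace, so each step in the path needs a prefix.
NS = {"x": "http://www.w3.org/1999/xhtml"}


def extract_profile(xhtml_text):
    """Return (name, level) from the hypothetical sample markup."""
    root = ET.fromstring(xhtml_text)
    # ".//" searches all descendants, so the path does not have to spell out
    # html/body/div/div/... and is less sensitive to wrapper markup changes.
    name = root.find(".//x:div[@class='name']", NS)
    level = root.find(".//x:span[@class='level']", NS)
    return name.text, int(level.text)


print(extract_profile(SAMPLE_PAGE))
```

The trade-off is that a descendant search with an attribute test is short and robust, but it will happily match the wrong node if the same class name turns up elsewhere on the page.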
teach yourself datamining in 21 days
July 5, 2010
Things may look quiet here but behind the scenes I’ve been working on my project to get up to speed with modern datamining algorithms. The first step has been to assemble some sources of information and some tools for the job.
For a textbook, I’ve chosen Introduction To Data Mining by Tan, Steinbach and Kumar. It provides a good overview of the key algorithms, along with important issues like data quality and consistency. It also introduces the maths in a reasonably gentle way.
Fortunately, while it is important to understand how the algorithms work, it is not necessary to work the maths by hand. There are some first-class freeware datamining programs available that do all the heavy lifting, so long as you know how to prepare the data and how to set the parameters of the algorithms so they produce valid results.
Three datamining packages in particular are worth noting: Weka, RapidMiner and R.
Weka and RapidMiner are GUI-driven toolsets whereas R is more command-line oriented. You can download all of these and play around with them at home. They’re not toys, so you need to have some confidence about plowing through the user guides and technical manuals, but they are easy enough to get up and running.
The choice between Weka and RapidMiner is a difficult one. At the moment I’m working with Weka but that is mainly because it was the first one I started experimenting with.
The other crucial thing to have at hand is some test data. Of course you may want to remind me that I have several GB of WoW-related data right here. But that is not the place to start. The first step is to learn how the tools work and how to use them. For that we need data where we know the expected results – so that when the tool doesn’t produce the right result we know to look again at how we have applied the algorithm.
The data has to be challenging enough to put the algorithms to a test, but not so challenging that we are left wondering whether the wrong answer is due to the complexity of the data or just due to some dumb mistake.
Over at the Expressive Intelligence Studio blog, Chris Lewis posted an interesting report about using a toon’s gear to predict its class. That’s exactly the sort of place we want to start since all sorts of classifying and clustering algorithms could be tested on a data set like that. The other idea that occurred to me is to use talent builds to predict the spec of the toon. Of course you can do that in a very simple way by just adding up the points spent in each of the three trees: a paladin that has most points in the holy tree is a holy paladin.
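The simple point-counting approach fits in a couple of lines. This is only a sketch: the function name and the sample build are made up for illustration.

```python
def spec_by_points(points_per_tree):
    """Return the name of the tree with the most talent points spent in it."""
    return max(points_per_tree, key=points_per_tree.get)


# Hypothetical level 80 paladin build (71 points in total).
build = {"holy": 51, "protection": 5, "retribution": 15}
print(spec_by_points(build))
```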
Where the problem becomes interesting is with those classes where there is a tendency to spread talent points across more than one tree. I’m thinking mainly of mages and warlocks but any class where the three trees don’t map straight onto the tank/healz/dps holy trinity should see some points spread across multiple trees. Can datamining algorithms handle “fuzzy” data like that?
To make this discussion more concrete, let’s have a quick look at that very question. We can fire up Weka and feed in a sample of level 80 paladin talent builds. To keep it simple, I’m using a toy data set of only 150 paladins with 50 from each of the 3 trees. We can run a basic k-means clustering algorithm over the data, which we hope should produce 3 clusters: one each for the holy, protection and retribution trees. And voilà…
That works because holy paladins don’t spend many points in protection or retribution talents. But for mages, where there is a significant tendency to spend points in more than one tree we get this:
Now the algorithm is flummoxed – putting arcane and frost mages in the same cluster and splitting fire mages into two clusters. So we have a simple data set that is also challenging enough to put these tools to a bit of a test.
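For anyone who wants to see what k-means is doing under the hood before loading the files into Weka, here is a minimal sketch of the algorithm in plain Python. It is not Weka's implementation. The nine builds are invented for illustration (they are not rows from my data sets), and the starting centroids are fixed at the three "pure" builds so the run is repeatable, which is an assumption of this sketch rather than how a real tool would seed the clusters.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) from the given starting centroids."""
    centroids = [tuple(c) for c in centroids]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # an empty cluster keeps its old centroid
                centroids[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centroids


# Hypothetical builds as (holy, protection, retribution) points, 71 points each.
builds = [
    (51, 5, 15), (53, 18, 0), (51, 0, 20),   # mostly holy
    (0, 53, 18), (5, 51, 15), (0, 56, 15),   # mostly protection
    (0, 18, 53), (5, 11, 55), (11, 5, 55),   # mostly retribution
]
seeds = [(71, 0, 0), (0, 71, 0), (0, 0, 71)]  # assumed starting centroids

clusters, centres = kmeans(builds, seeds)
print(clusters)
```

With builds as clear-cut as these, each group lands in its own cluster. The interesting cases are the ones where points are spread more evenly across trees, so that the nearest centroid is no longer obvious.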
I’ve also made a third data set using priest builds. Priests have more talent points invested across the trees than paladins but fewer than mages. Clustering this data set is left as an exercise for the reader…
No, seriously… Anybody who’d like to experiment with these data sets can download them from the links here. They’re in a standard .arff format (really just an annotated csv file) that Weka and RapidMiner both know how to load. Note that I’ve used a “.pdf” extension since WordPress will not allow me to upload arbitrary file types. But if you open them in a text editor you’ll see they are just csv data. Rename the extension to “.arff” and they’re ready to go.
ClassByGear.arff
PaladinBuilds.arff
MageBuilds.arff
PriestBuilds.arff
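If you'd like to peek at the data outside Weka, the format is simple enough to read by hand. Here is a minimal sketch: the sample content and attribute names are made up for illustration and will not match the real files exactly, and the reader only copes with numeric attributes, so it is not a full ARFF parser.

```python
SAMPLE_ARFF = """% Hypothetical example; attribute names are made up for illustration.
@relation paladin_builds
@attribute holy numeric
@attribute protection numeric
@attribute retribution numeric
@data
51,5,15
0,53,18
0,18,53
"""


def read_simple_arff(text):
    """Parse a minimal ARFF file that has numeric attributes only."""
    names, rows, in_data = [], [], False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            rows.append([float(v) for v in line.split(",")])
        elif line.lower().startswith("@attribute"):
            names.append(line.split()[1])  # second token is the attribute name
        elif line.lower().startswith("@data"):
            in_data = True  # everything after this is csv rows
    return names, rows


names, rows = read_simple_arff(SAMPLE_ARFF)
print(names)
print(rows)
```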
I’ll have a lot more to say about these little data sets in the next few posts.
the median is the message
June 2, 2009
Over at Wowenomics, Gevlon from the Greedy Goblin left a comment about datamining that is worth replying to. Basically, his point is that sites like this one report on what the average player is doing, but that is not much use because the average player is only making average choices. (…or at least that’s the polite paraphrase of his point.)
In fact the data shows that this sort of argument is not true. Let’s start from what the data actually looks like. Here, I’ve taken an example picked at random: all choices made by level 69 DKs for the chest slot:
Item | Count |
Mightstone Breastplate | 2421 |
Battlemaster’s Breastplate | 1010 |
Scavenged Tirasian Plate | 996 |
Adamantite Breastplate | 898 |
Murkblood Avenger’s Chestplate | 655 |
Gorge’s Breastplate of Bloodrage | 309 |
Battle Leader’s Breastplate | 208 |
Saronite War Plate | 197 |
Fel Iron Breastplate | 141 |
Unscarred Breastplate | 136 |
Westguard Armor | 78 |
Coldrock Breastplate | 75 |
Azure Chain Hauberk | 71 |
Durotan’s Battle Harness | 70 |
Baleheim Armor | 68 |
Light-Touched Breastplate | 63 |
Breastplate of the Warbringer | 48 |
Andrethan’s Masterwork | 46 |
Bone-Threaded Harness | 44 |
Segmented Breastplate | 32 |
Conqueror’s Breastplate | 30 |
Blacksoul Protector’s Hauberk | 26 |
Bloodfist Breastplate | 23 |
Chestguard of Illumination | 21 |
Nether Protector’s Chest | 21 |
Vest of Vengeance | 21 |
Soul Saver’s Chest Plate | 21 |
Scavenged Breastplate | 18 |
Chestguard of Salved Wounds | 18 |
Breastplate of Blade Turning | 18 |
Light-Bound Chestguard | 17 |
Warmaul Breastplate | 17 |
Breastplate of Retribution | 15 |
Blackened Chestplate | 15 |
Boulderfist Armor | 14 |
Heavy Earthforged Breastplate | 14 |
Shattered Hand Breastplate | 12 |
Talonguard Armor | 12 |
Reaver Armor | 11 |
The Exarch’s Protector | 9 |
Lost Chestplate of the Reverent | 8 |
Bloodscale Breastplate | 8 |
Gilded Crimson Chestplate | 7 |
Chestplate of A’dal | 7 |
Jerkin of the Untamed Spirit | 7 |
Khan’aish Breastplate | 7 |
Redeemer’s Plate | 7 |
Torn-heart Family Tunic | 7 |
Protectorate Breastplate | 6 |
Marshwalker Chestpiece | 4 |
Bogslayer Breastplate | 4 |
Demon-Forged Chestguard | 4 |
Warden’s Hauberk | 4 |
Shamblehide Chestguard | 3 |
Garmaul Chestpiece | 3 |
Elegant Dress | 2 |
Crimson Mail Hauberk | 2 |
Cenarion Thicket Jerkin | 2 |
Ango’rosh Breastplate | 2 |
Azure Silk Vest | 1 |
Acherus Knight’s Tunic | 1 |
Black Mageweave Vest | 1 |
Bonechewer Berserker’s Vest | 1 |
Corsair’s Overshirt | 1 |
Darkcrest Breastplate | 1 |
Demon-Forged Hauberk | 1 |
Drakescale Breastplate | 1 |
Breastplate of Many Graces | 1 |
Chestguard of the Dark Stalker | 1 |
Chestguard of the Stormspire | 1 |
Chestguard of the Talon | 1 |
Farshire Robe | 1 |
Flimsy Chain Vest | 1 |
Lovely Black Dress | 1 |
Lovely Blue Dress | 1 |
Simple Black Dress | 1 |
Skom Chain Vest | 1 |
Scale Brand Breastplate | 1 |
Refuge Armor | 1 |
Runecloth Robe | 1 |
Nexus-Strider Breastplate | 1 |
Spring Robes | 1 |
Tuxedo Jacket | 1 |
Twilight Cultist Robe | 1 |
Warrior’s Embrace | 1 |
Worgblood Berserker’s Hauberk | 1 |
Wrathfin Armor | 1 |
It’s worth charting this distribution too, since the shape of the distribution curve is important:
The error in Gevlon’s argument stems from our common-sense understanding of average. Most often we think in terms of a Gaussian distribution – so often that it is actually called a normal distribution. When events or things are distributed normally, then the average outcome is, well, average. But, as various people have observed, when network effects come into the picture, the typical distribution is not Gaussian but power-law-like. The majority cluster around a very few choices, with a rapid fall-off into a long tail of more funky choices, where each choice is made by only a few individuals.
Now WTF does all that mean in plain English? Simple. If WoW gear, gems or enchant choices were normally distributed, a few people would make the best choice, a few people would make the worst choice and most would make a so-so choice. We would expect to see a few 69 DKs with the Uber Breastplate of Pwnage, a few with the Scruffy Tunic of Suckage, but most would be wearing the Mediocre Breastplate of, um…, Mediocrity. And that’s what my report would find for you.
But you can see from the charts that the data looks nothing like a normal distribution. Most players have in fact made the same few choices – which generally represent a trade-off between how powerful the item is and how easy it is to get hold of. Those people who don’t follow the crowd are out in the long tail – here the picture is murky because we don’t know whether they are there because of ignorance or whether they have hit on some effective but as-yet unknown (or difficult to obtain) solution to the problem. (And some are out there because the data is capturing multiple playstyles – no doubt those toons wearing tuxedo jackets and lovely blue dresses are being played by people who know exactly what they are doing.)
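To put some numbers on that shape, here is a small sketch that works directly from the counts in the chest-slot table above. The top_share helper is just something I've named for this example.

```python
import statistics

# Counts copied from the chest-slot table above, in the same order.
counts = (
    [2421, 1010, 996, 898, 655, 309, 208, 197, 141, 136,
     78, 75, 71, 70, 68, 63, 48, 46, 44, 32, 30, 26, 23]
    + [21] * 4 + [18] * 3 + [17] * 2 + [15] * 2 + [14] * 2 + [12] * 2
    + [11, 9] + [8] * 2 + [7] * 6 + [6] + [4] * 4 + [3] * 2 + [2] * 4
    + [1] * 28
)


def top_share(values, n):
    """Fraction of the total accounted for by the n largest values."""
    ordered = sorted(values, reverse=True)
    return sum(ordered[:n]) / sum(ordered)


print("characters:", sum(counts))
print("distinct items:", len(counts))
print("mean count per item:", round(statistics.mean(counts), 1))
print("median count per item:", statistics.median(counts))
print("share wearing a top-5 item:", round(top_share(counts, 5), 3))
```

On this table, roughly three quarters of the characters are wearing one of the top five items, while the median item is worn by only 7 characters against a mean of about 92. That gap between mean and median is the long tail in a nutshell.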
But for our purposes, averages are just what we want – they show the consensus view across the player base on what are the reasonably sensible and effective choices.
To me, the interesting question is how this consensus forms. Undoubtedly Gevlon’s point has an element of truth – the average WoW player is no theorycrafter. But there are feedback mechanisms that shape their choices. They have the game itself. And they have instant access to the collective wisdom of the player-borg’s vast hive mind thanks to all the commentary and guides here in cyberspace. It is these network effects that make the distribution take the shape that it has.
more information than you require
April 22, 2009
You may not have noticed yet, given all the 3.1 (and now 3.1.1!) fun, but a whole slab of character stats have disappeared from the armoury. The most interesting ones that have gone are the detailed BG performance stats. Strangely enough, the raiding performance numbers are still there – the missing ones all seem to be PvP related.
Now if I were a, like, y’know, paranoid kinda person, I’d be putting forward the following conspiracy theory. PvP stats are about class performance and raid stats are about group performance. Raid stats tell you something about a guild because performance depends on the ability of the guild to coordinate, to lead, to control the Leeroy Jenkins element etc etc. And indeed there are websites out there that do exactly that: rate guilds by how many, and which, raid dungeon bosses they’ve downed.
But PvP stats, being all about the mano a mano thing, tell you something about the relative performance of classes. And that subject really does seem to be Blizz’s bête noire these days. Just exactly why they’re so focused on it escapes me – but then I never go anywhere near the official fora so maybe that’s why I’m in the dark.
My conspiracy theory would be that they don’t want anybody to be able to just run class balance through the ol’ spreadsheet to see what comes out the other end.
I made a modest contribution to using the BG data to look at class balance here on this blog. But I believe I was pretty careful to say, like dude, we don’t expect the classes to be balanced across any narrow set of performance measures. Classes that can tank and CC and heal are people too.
But anyway I’m not the paranoid type, so that’s enough of that. It’s just a silly game. Let’s move on.
more on classes and battlegrounds
April 2, 2009
Just to complete the picture, here is a set of battleground class performance charts for each of the x9 BG levels plus level 80. The y-axis now shows average deaths per game and not the inverse, so the sweet spot of high-kills-low-deaths is in the bottom right hand corner of the chart.
The sample consists of all players at each level who have played 100 or more BGs. The data is from patch 3.0.9.
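For the curious, the numbers behind each point on these charts amount to something like the following sketch. The records and field layout are made up for illustration, and averaging each player's per-game rate (rather than pooling all kills and games for the class) is an assumption of this example.

```python
from collections import defaultdict

# Hypothetical records: (class, games played, total kills, total deaths).
players = [
    ("Hunter", 250, 1500, 500),
    ("Hunter", 120, 600, 360),
    ("Priest", 300, 600, 1200),
    ("Priest", 40, 200, 40),  # fewer than 100 games, so excluded
]


def class_averages(records, min_games=100):
    """Average kills and deaths per game for each class."""
    per_class = defaultdict(list)
    for cls, games, kills, deaths in records:
        if games >= min_games:  # only keep players with enough games
            per_class[cls].append((kills / games, deaths / games))
    return {
        cls: (
            sum(k for k, _ in rates) / len(rates),  # mean kills per game
            sum(d for _, d in rates) / len(rates),  # mean deaths per game
        )
        for cls, rates in per_class.items()
    }


print(class_averages(players))
```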
There are a lot of interesting things to note in those charts, especially when you compare the same class at different levels. Some are effective at all levels, others appear to change roles as they level up.
If you want the executive summary, these are the points that strike me:
- DKs are OP
- Rogues aren’t, even though people think they are
- Warriors are fragile, despite all that armour
- Warlocks are still a force in PvP as long as you don’t mind dying a lot
- Baby Paladins may be easy meat but the adult of the species sure isn’t
- Hunters seem to be the consistent high performer, but that is probably because they just play the same role (of ranged attacker) at every level
Do exercise some caution when interpreting these results. In particular remember:
- Some classes have fewer attacking players and therefore a lower average kill rate just because they have a healing tree. Other classes may spend a lot of time CC-ing instead of attacking.
- Some BGs have objectives that conflict with straight PvP. For example in Warsong Gulch, the classes that spend most time running the flag will have a lower kill rate because of that.
- This is data aggregated across every BG accessible at the level. There may be specific features of individual BGs that make certain classes more effective there, despite these charts.
[Charts: battleground class performance at levels 19, 29, 39, 49, 59, 69, 79 and 80]
Warsong Gulch chart porn
March 20, 2009
I’ve been having a bit of fun with the battleground stats data from the Armoury. There are plenty of good strategy guides on the net for each battleground, but it’s interesting to see how well the strategy is reflected in the data. A good chart really is worth a thousand words…
Warsong Gulch is our example here. You probably know that there are a number of traps in what seems like a simple game. Trap number one is to be on a team where everyone wants to run to the middle of the field and have a big punch-up while the flag carriers run by unchallenged.
The following charts are for level 19 characters who have played at least 100 games, so we’re not talking about noobs here. But you can still see all the nuances of the game in the data.
For a start, teamwork and focus on the flags rather than on scoring kills is vital to winning. So, if we chart games won to killing blows struck by individuals, we don’t find a strong correlation. You can help to win by doing other things than killing – CC-ing, running interference, guarding the flag rooms.
Being able to drop the opponent does have one important role: getting the flag back when the other side is running with it. We can chart flags-returned to killing blows, and we get a stronger correlation:
But again, capturing the flag from the other side is about skills other than fighting and the data bears that out. In fact, this is my favourite chart since the whole ding an sich of the game is in there. The best characters at capturing the flag are generally those with lower killing blow scores. They’re too busy running and hiding to be killing. So you get a gentle negative correlation as the chart shows:
The purpose of this exercise is to see if these indicators can be used to pick out twinks from the data. I’m confident that they can. So, the next chart shows deaths per game vs games won, and suggests what we all know – that the character with the best gear and enchants stays upright for longer. They spend less time running back from the spawn point and more time on-target.
Dead toons don’t win games – it’s not rocket science.
One interesting point is that there is a poor correlation between deaths per game and killing blows per game, which surprises me. I would have thought that twinked characters both killed more and died less. Perhaps there is a kamikaze style of player that kills a lot and dies a lot, and a more um… tactical… toon that can dish it out without getting too much in return.
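When I talk about correlation in these charts, I mean the ordinary Pearson coefficient. Here is a minimal sketch of how it is calculated. The per-character numbers are invented purely to show a positive and a negative case; they are not taken from the armoury data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-game averages for five characters.
killing_blows = [1.0, 2.0, 3.0, 4.0, 5.0]
flags_returned = [0.2, 0.4, 0.5, 0.9, 1.0]   # rises with killing blows
flags_captured = [0.9, 0.8, 0.6, 0.5, 0.2]   # falls with killing blows

print(pearson(killing_blows, flags_returned))
print(pearson(killing_blows, flags_captured))
```

A value near +1 or -1 means the points fall close to a straight line, and a value near 0 means there is no linear relationship to speak of. A single extreme character can drag the coefficient around quite a bit.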
One final point – if you look carefully at the last chart, you will see the bane of the armoury data miner – the annoying little outlier that makes all our averages skew away from the median. That character circled on the bottom right of the chart is one lean mean killin’ machine, a real leader of the pack. Here’s his armoury profile – check him out. He’s a twink, no surprise there, but he doesn’t seem out of the general range of twink stats. To me, he is a reminder that skill does play a part in the game too. Mind you, he played nearly 900 games to get that good.
kindergarten cop
February 4, 2009
A blog reader (hi Jess…) made a couple of good suggestions for data mining the more youthful part of the Azeroth population: toons in the 10 to 20 bracket.
I posted a chart a little while ago on the number of characters at each level from 10 to 80. There is a spike in numbers at the low end and I guessed that that represented a bit of a surge in players rolling new characters. But there is an alternative explanation – that there are just a lot of abandoned toons here. Players pick a class, level the character for a while then decide they don’t like playing that class. But they don’t delete the character and the little rug rat just lingers on in toon limbo.
Unfortunately it’s hard to tell from a single scan of the armoury which is the correct explanation. I’ve set up a couple of queries that may help produce an answer but it’ll take a while to collect the data.
The other suggestion was to see if the class composition of the ankle-biter toons is being influenced by the nerf wars. At the top end, paladins are flavour of the month; that’s clear enough. But does that mean a lot of players are running out and rolling new pallys to get in on the act?
The answer from the data is “no”. Surprisingly, the numbers in each class in the 10-20 bracket are just about evenly balanced – around 10-12% for each of the nine classes allowed at these levels.
There’s no sign of any hard swings between classes, or between tank/healer/dps playstyles for that matter. The healer classes are a bit under-represented at these levels but then isn’t that true at all levels?