back to the future

February 2, 2011

Just a brief note to say that I haven’t abandoned all hope of getting this site going again. Even though I’m not playing MMOs at the moment, it seems a shame to just leave everything sit idle. My basic infrastructure runs without too much effort, so it is no great problem to refresh the data every couple of months.

The main obstacle is that Blizz is now serving the up-to-date data from battle.net in HTML format rather than XML. My page-scraping code needs to change to cope with that. Fortunately, however, the Blizz engineers are serving up valid XHTML, which means that XPath expressions can still be used to extract the data we need.

If I’ve been a good little engineer then only my XPaths need to change and nothing else…

There is a danger that the XPath expressions can become more than a bit baroque because they have to navigate through all the HTML markup to get to the data nodes, although there are tricks to get the XPath engine to do a lot of the searching.
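As a rough illustration of letting the XPath engine do the searching, here is a minimal sketch using Python's standard library. The markup, class names and values are invented for the example and are not taken from battle.net; a real page would need its own paths.

```python
import xml.etree.ElementTree as ET

# Invented XHTML fragment; the element ids and class names are assumptions,
# not the real battle.net markup.
PAGE = """
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div id="wrapper">
      <div class="profile">
        <ul>
          <li class="stat" id="health"><span class="value">25000</span></li>
          <li class="stat" id="mana"><span class="value">8000</span></li>
        </ul>
      </div>
    </div>
  </body>
</html>
"""

# XHTML elements live in this namespace, so the paths need a prefix for it.
NS = {"x": "http://www.w3.org/1999/xhtml"}


def extract_stats(xhtml):
    """Return {stat id: integer value} for every li with class 'stat'."""
    root = ET.fromstring(xhtml)
    stats = {}
    # './/' searches all descendants, so the path does not have to spell out
    # every wrapper div between the root and the data nodes.
    for li in root.findall(".//x:li[@class='stat']", NS):
        value = li.find("x:span[@class='value']", NS)
        stats[li.get("id")] = int(value.text)
    return stats


print(extract_stats(PAGE))
```

The short descendant search keeps the expression readable, at the cost of matching any similarly marked-up node elsewhere on the page.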

Anybody looking for inspiration on how to parse the battle.net XHTML should check out these posts by a geek blogger called Kastang. That’s the method I’ll be using when I get back to all this.

blog status

October 29, 2010

Sigh. You can see that things have gone quiet here. I’m moving on to new projects and interests. But I’m hoping to set aside a bit of time to see how… um… cataclysmic… the changes in the game are going to be for my database.

Everything is set up to keep the data pages up to date without a lot of effort on my part. So long as the armoury data doesn’t change too much, that is! If I can find a way to keep the site current I’ll do that.

But no promises as to timescale. Sorry.

updated for patch 3.3.5

August 13, 2010

I’ve refreshed all the reports over at the Google Appengine site to bring them up to date, except for the ones related to twinks and bg performance. I should have enough data to refresh them early next week.

No dramatic changes meet my eye, but alas I’m a bit busy with other things so I’ve only given the reports a quick once-over.

druid forms at last!

July 29, 2010

I love the smell of data in the morning. It smells like… victory!

Druid forms; phew! It’s taken a while. The count of moonkin and tree forms was always straightforward since these forms derive from a talent. But now, thanks to all my correspondents on the question of feral druids and thanks to the power of modern datamining packages, I’m happy that I’ve got a reasonable estimate of those level 80 feral raiders who favour bear form and those who favour cat form.

The number of unclassifiable ferals is still a bit high but that is mainly due to the number of feral druids who do not raid. Only a tiny number of feral raiders really have an equal investment in cattish and bearish talents and glyphs. There may also be roles for cats and bears in PvP, but I don’t have a solution for estimating those.

Still we’ve got most of the answers we were after.

These numbers are for patch 3.3.3 and are the percentages based on all level 80 druids. So without further ado:

Form Popularity
Moonkin 26%
Tree 40%
Feral Raiders Cat-oriented 11%
Feral Raiders Bear-oriented 17%
Unclassifiable Ferals 6%

If we estimate cats and bears as a percentage of feral combat druids only, we get this:

Form Popularity
Feral Raiders Cat-oriented 33%
Feral Raiders Bear-oriented 50%
Unclassifiable ferals 16%

Percentages are based on active specs only, to keep things simple. That still strikes me as reasonable, over a large sample, since the percentages reflect what you’d see in-game on average.

I’ve got a couple more posts to come which explain in detail how the feral estimates are derived. And I’ll put up the data set so interested people can have a play around with it and see if there is any better way to cluster the bears and the cats. I’ll also add these tables to my druids reports over at the Google appengine site.

But that’s as good an estimate as I know how to get. And it was a fun ride getting to this point too.

the talented Mr Druid

July 28, 2010

Now that we’ve got a dataset which can give us feral druids who:

  • are consistently geared and spec-ed and
  • are serious participants in instance-running and raiding

then we can move to the next stage: trying to find ways to partition that set into bears and cats.

This is where the visualization tools in a datamining package really come into their own. We can add a third dimension to any cluster by using colour. And we can quickly iterate through all the data dimensions to see which ones produce the best clusters. In these charts I’m filtering out all feral druids with spellpower gear and all who have run fewer than 75 instances or raids.
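For anyone who wants to reproduce that filter on the downloadable data, a minimal sketch follows. The field names and the spellpower cut-off are assumptions made for the example; only the 75-run threshold comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Toon:
    name: str
    spellpower: int  # hypothetical column name
    pve_runs: int    # instances plus raids completed (hypothetical column name)


def keep_for_charts(toons, max_spellpower=0, min_runs=75):
    """Drop toons with spellpower gear and toons with fewer than min_runs runs.

    max_spellpower is an assumed placeholder for "has spellpower gear".
    """
    return [
        t for t in toons
        if t.spellpower <= max_spellpower and t.pve_runs >= min_runs
    ]


# Made-up sample rows, purely for illustration.
SAMPLE = [
    Toon("Bearface", spellpower=0, pve_runs=210),
    Toon("Mismatch", spellpower=1800, pve_runs=150),
    Toon("Newbie", spellpower=0, pve_runs=12),
]

print([t.name for t in keep_for_charts(SAMPLE)])
```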

I’m still plotting health vs mana, to keep the charts consistent across posts, but we’re getting close to the point where we will have to find different stats to graph. We know now that mana is irrelevant and health only a partial indicator of tank-ness. But for the time being, the main cluster that results from that plot is good enough.

Now we want to know which talents can partition the cluster. (And we could ask the same question of glyphs or character stats too.) How about Primal Gore? This is the result – not a lot of partitioning going on there:

Primal Gore - not effective in clustering

Thanks to various comments, it’s clear that there is a set of talents which people expect to effectively partition the cluster. Popular suggestions have included: Thick Hide, Natural Reaction and Protector of the Pack for bears and Shredding Attacks, Predatory Instincts, King of the Jungle, Survival Instincts and Natural Shapeshifter for cats.

Now we can look at each of those in detail:

Bear:

Protector of the Pack cluster

Natural Reaction cluster

Thick Hide cluster

Cat:

Survival Instincts Cluster

Shredding Attacks cluster

Predatory Instincts cluster

Natural Shapeshifter cluster

King of the Jungle cluster

You can see that some talents appear to be better than others at defining two distinct clusters. They all have a bit of partitioning effect, but some are better than others at producing the largest “distance” between the two clusters. Predatory Instincts produces clear gold and light blue clusters but Natural Shapeshifter produces more of a greenish middle ground which means that many players in both camps have put a point or two into it.

Datamining clustering algorithms work by calculating “distances” between data points along each of the data dimensions then aggregating those distance measures across all the dimensions. For example, the distance between a toon which has 3 points in Thick Hide and a toon which has zero points in the talent could be measured as “3” and then a sum of all distances could produce a measure of how distinct one toon is from another (although the algorithms generally use more sophisticated maths than just that.)
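The simple version of that distance measure can be sketched in a few lines. This is only the plain sum of per-talent differences described above, not the metric any particular package uses, and the two builds are made up.

```python
def talent_distance(build_a, build_b):
    """Sum of absolute point differences across every talent in either build."""
    talents = set(build_a) | set(build_b)
    return sum(abs(build_a.get(t, 0) - build_b.get(t, 0)) for t in talents)


# Made-up builds; a missing talent counts as zero points.
bearish = {"Thick Hide": 3, "Natural Reaction": 3}
cattish = {"Thick Hide": 0, "Predatory Instincts": 3}

print(talent_distance({"Thick Hide": 3}, {"Thick Hide": 0}))  # one talent only
print(talent_distance(bearish, cattish))                      # summed over three talents
```

The first call gives the “3” from the example in the text; the second adds the differences for all three talents and gives 9.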

So we want the talents with the greatest distance between the two clusters. You can have a look at the charts and see which ones you think are the best ones. I’ll put up my numbers on that in the next post.

Now if we use the better of those talent dimensions as inputs to our clustering algorithm we get this:

Clusters in five talent dimensions.

The crucial thing here is that the blue cluster, which contains the toons with bear-ish talents, extends right along the health x-axis. No doubt serious tanks are picking gear, gems and enchants that boost health. But since we are looking for a count of all tanks, all the way from those running 5-toon instances to those in the endgame raids, we should expect that there will be a wide spread of health between those just starting out and those nearer the end of the raiding dungeon chain.

That’s one reason why I’m about to abandon the health vs mana thing and move onto other character stats. More about that in the next post. But the reason we can make decisions like that is due to the insights that this data visualization gives us.

the truth is in there

July 20, 2010

Thanks to all the correspondents who commented on my feral druids datamining experiment. I’m happy that I’ve got a reasonable estimate for the number of bear tanks now. But I’m holding back from putting up my final word on the subject since I’m trying to encourage a couple of people to write up their own analysis first.

You may recall we left off with a simple graph of health vs mana for level 80 feral druids that produced two very distinct clusters – a red and a blue one – sorta like one of those political maps of the USA except with all the Republicans and Democrats clumped together in separate parts of the country.

And the key question was… um… While we’re on that subject… Can anybody explain to me why those political maps always colour the conservatives red and the liberals blue? It’s very confusing to a foreigner since just about everywhere else in the world, red is associated with the left or progressive side and blue with the Tory or conservative side.

Remember that great movie from the Reaganite ’80s? It was Red Dawn, not Blue Dawn. But I digress…

And the key question was: what were those red ferals doing at the high mana end of the scale? I had my doubts that there could be so many toons carrying mismatched specs and gear. But I’ve been convinced that, yes, there is something not quite right there. A simple filter that drops those toons in the sample with significant spellpower gear basically makes the red cluster disappear.

Now that might not sound like progress – ending up with one cluster – but don’t forget that the power of these datamining algorithms is that they cluster in multiple data “dimensions”. To the eye, there is one cluster, because we are drawing the graph in two “dimensions”: health and mana.

But as soon as we add some talent and glyph dimensions, the big blue blob starts to break up into separate clusters. And this time, there is a good match between the talents and glyphs that we expect to distinguish cats from bears and the actual location of each cluster in the multi-dimensional space.

But it’s a whole lot easier to show you than to tell you, so I’ll leave you with a simple illustration of how that all works. We can add a third dimension to the graph by using colour. The datamining packages that I’m playing with are very good at that sort of visualization, as you can see here.

With the spellpower toons gone, the high mana group has also mostly gone and the shape of the blue cluster has become clearer as the graph scale has changed. Then we overlay, say, a cat glyph:

Feral Druids with Glyph of Shred

and a bear glyph:

Feral Druids with Glyph of Maul

and the clusters within the cluster become pretty clear. Thanks again to Narkondas for the key clues that inspired those graphs.

The datamining algorithms will generate a count of the toons in each cluster, but I’ll leave that till the next post. But as you can imagine, with a big clump of ferals filtered out, then the percentage of bear tanks in the overall mix is getting smaller.

I should also say that I’m about to collect a new data set and update my armoury reports since the data is getting a bit old and stale. As usual that will take a week or so.

practical cats

July 10, 2010

Thanks to various people for input on that last post. I’m still happy that the blue cluster represents the feral druid bear tanks. Characters in that cluster are stacking all the stats recommended by the various bear tanking blogs, including agility.

But I’m happy to admit that the red cluster is more of a mystery – something at least partly to do with cat form, but there are some oddities there.

All the druids in the sample are feral druids – the balance and resto ones were all filtered out in the database query. Undoubtedly there are druids with two specs who forget to swap gear when they swap specs, but could there really be so many? Those two clusters are pretty dense – to me they represent lots of players following a standard pattern rather than something that could be the result of mistakes.

I’ve got a lot more charts to post on this question, but I see from the comments that I haven’t quite selected all the right stats. Let me fix that on Monday and we’ll have another look at what’s going on.

UPDATE: More interesting comments. Thanks all. Unfortunately RL affairs are diverting me for the next couple of days but I’ll get back to it as soon as I can.

Just briefly, though:

I have done an analysis of talent distribution and yes, talents seem to be very poor predictors of raiding roles.

I agree that the red cluster may represent hybrid behaviour and I suspect now that there is possibly no way to get a sense of how many players heavily lean towards cat over bear.

Thanks especially to Narkondas who pointed out something that I hadn’t properly considered. I talked about the blobs representing players who were “stacking” certain stats. But that has to be proved; it is not a starting point. The mana available to the red blob toons may simply be the default mana values granted by the gear etc that the toon is wearing. Unfortunately the armoury picture is static and doesn’t replace mana with energy when the toon is in cat form (otherwise we’d have a foolproof way of counting cats).

UPDATE 2: D’Oh! Yes there is an error in the talents data. The database query was picking up some talents from the inactive spec. Thanks to Narkondas for spotting that. I’ve replaced the files in the previous post with corrected versions. Every entry now adds up to 71 points or less. Also I’ve now become convinced that all those high spellpower/high mana toons really are running with gear that is not ideal for feral builds – either by accident or by design. So I’ve added a spellpower column into the talents data to see if we can see any patterns there. Or the spellpower column can be used to filter out those toons that are um… trying to subvert the dominant feral paradigm…

A long, long time ago on a website far, far away there appeared a post which argued that feral druid bear tanks were in short supply. The article also made the perfectly reasonable point that armoury datamining sites had no data on the popularity of the various druid forms.

Now, being the kind of nerd who likes a challenge (is there any other kind?) I couldn’t let that one go past. But how to solve the problem? Druids get some of their forms through talents; easy enough to get a count of the toons invested in those talents. But cats and bears were not so straightforward.

As an old SQL hacker of the “in Codd we trust” school, my first cunning plan to get the numbers on feral druid bear tanks basically went like this:

  1. Mine data
  2. SELECT level 80 feral druids FROM toons
  3. GROUP BY ???
  4. Profit!

Unfortunately, that didn’t work so well, for a number of reasons. I now understand one very interesting reason why it didn’t work: the feral talents that we expected to use to identify cats and bears are not actually distributed that way. But more on that later.

The solution involved spending some time getting up to speed with more sophisticated datamining algorithms. These algorithms are also based on a sort of GROUP BY principle, but are capable of grouping (or “clustering”, as the datamining jargon has it) across multiple data dimensions. They can easily handle the 85 talents in the three druid trees and, in essence, can group samples of druids into clusters of toons in an 85-dimensional space. Alternatively, we can cluster on character stats – health, mana, strength, agi etc – or on any combination of talents, stats and playstyle numbers that interest us.

They also use calculus techniques to find the borders of each cluster in a way that can tolerate outliers. This is important in real life data, but also important in WoW data where there is always a small but significant number of players who insist on being… individuals…

Up until a few days ago, I was expecting to have to put off working on the druid forms question until I had a really good understanding of these algorithms. But that is not necessary, for the simple reason that the data we are dealing with is not all that complex. Consider the following graph:

Health vs Mana for 80 Feral Druid Raiders

Here we have selected for toons that have a history of running instances and raid dungeons, and have filtered out the toons that are serious PvPers. What we have are two groups that in essence are clustering themselves – no real datamining required.

The bottom, horizontal, group is emphasizing health over mana. This group is stacking stamina, agility, armour and dodge (I’ll prove all those things in the next post) and selecting some of the talents we’d expect for bears. In other words, Tanking 101. The top group is selecting for mana over health and is stacking intellect and spirit. This group has taken some of the cat talents too, although as we’ll see in the next post, talents do not seem to be a great predictor of role.

That graph was generated from a 500-toon data set. So we’re ready to cluster and count the full sample. And voilà:

Feral Druid Raiding Tanks and DPSers.

You can see there are some outliers, but the bulk of the sample falls neatly into the two clusters. What you can’t see properly from the chart is that the blue tank cluster is in fact more populous than the red DPSers. It’s just that their health and mana stats don’t vary much, so the cluster is more dense. That’s where a clustering algorithm is needed to get a count of the population of each blob.

And the answer? There are 13,187 level 80 feral druids in the sample who have done more than 75 instances and raids and have done no arenas. That’s my (somewhat generous) working definition of a PvE raider. It’s also a problematic definition because the arena stats are historical – they don’t prove that the toon was not geared up for raiding when the armoury snapshot was taken.

Of those raiders, 60% are in the blue tank cluster and 40% are in the red DPS cluster. So that’s one useful piece of information: feral druid raiders do seem to prefer to tank rather than DPS by a narrow majority.

But the PvE raiders are less than 1/2 of the total sample. So, the worst case scenario is that on any given day, only 30% of level 80 feral druids are set up for tanking (although, to repeat, it is not likely that every arena player is geared for PvP all the time).
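The arithmetic behind that 30% figure is just the product of the two shares. In this small sketch the one-half raider share is an assumption standing in for “less than 1/2”, so the result is an approximation rather than an exact count.

```python
def tank_share_of_all_ferals(raider_share, tank_share_of_raiders):
    """Share of the whole sample that is both a PvE raider and in the tank cluster."""
    return raider_share * tank_share_of_raiders


# 0.5 stands in for "less than 1/2" of the sample being PvE raiders;
# 0.60 is the blue tank cluster's share of those raiders.
estimate = tank_share_of_all_ferals(0.5, 0.60)
print(f"{estimate:.0%}")
```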

The data is from patch 3.3.3.

Then there is the question of effective tanks. Some of those blue crosses in the bottom left hand corner of the chart are probably not seriously gearing up for very much at all. That will be the subject of the next post, when we will use some of the wonderful data visualization tools in these datamining packages to look much more closely into the dark heart of that big blue blob.

Meanwhile, if you’d like to play around with the data for yourself, here are the data sets I’m using. I’ve got a small set of feral talent builds, a larger set of builds and a large set of character stats. Each data set contains counts of instances and raids, battlegrounds and arenas played so you can filter on the raiders.

NB these data sets have been corrected and updated on 12 July. If you downloaded them before that, apologies for the error, and please download them again:

Things may look quiet here but behind the scenes I’ve been working on my project to get up to speed with modern datamining algorithms. The first step has been to assemble some sources of information and some tools for the job.

For a textbook, I’ve chosen Introduction To Data Mining by Tan, Steinbach and Kumar. It provides a good overview of the key algorithms, along with important issues like data quality and consistency. It also introduces the maths in a reasonably gentle way.

Fortunately, while it is important to understand how the algorithms work, it is not necessary to work the maths by hand. There are some first class freeware datamining programs available that do all the heavy lifting, so long as you know how to prepare the data and how to set the parameters of the algorithms so they produce valid results.

Three datamining packages in particular are worth noting: Weka, RapidMiner and R.

Weka and RapidMiner are GUI-driven toolsets whereas R is more command-line oriented. You can download all of these and play around with them at home. They’re not toys, so you need to have some confidence about plowing through the user guides and technical manuals, but they are easy enough to get up and running.

The choice between Weka and RapidMiner is a difficult one. At the moment I’m working with Weka but that is mainly because it was the first one I started experimenting with.

The other crucial thing to have at hand is some test data. Of course you may want to remind me that I have several GB of WoW-related data right here. But that is not the place to start. The first step is to learn how the tools work and how to use them. For that we need data where we know the expected results – so that when the tool doesn’t produce the right result we know to look again at how we have applied the algorithm.

The data has to be challenging enough to put the algorithms to a test, but not so challenging that we are left wondering whether the wrong answer is due to the complexity of the data or just due to some dumb mistake.

Over at the Expressive Intelligence Studio blog, Chris Lewis posted an interesting report about using a toon’s gear to predict its class. That’s exactly the sort of place we want to start since all sorts of classifying and clustering algorithms could be tested on a data set like that. The other idea that occurred to me is to use talent builds to predict the spec of the toon. Of course you can do that in a very simple way by just adding up the points spent in each of the three trees: a paladin that has most points in the holy tree is a holy paladin.
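The simple add-up-the-points approach needs no datamining at all. A minimal sketch, using a made-up 71-point paladin build:

```python
def spec_from_build(points_by_tree):
    """Return the name of the tree with the most talent points."""
    return max(points_by_tree, key=points_by_tree.get)


# Made-up build: the point totals are illustrative, not from the data sets.
build = {"holy": 51, "protection": 5, "retribution": 15}
print(spec_from_build(build))
```

If two trees tie, this returns whichever comes first in the dictionary, which is one reason the evenly spread builds are the interesting case.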

Where the problem becomes interesting is with those classes where there is a tendency to spread talent points across more than one tree. I’m thinking mainly of mages and warlocks but any class where the three trees don’t map straight onto the tank/healz/dps holy trinity should see some points spread across multiple trees. Can datamining algorithms handle “fuzzy” data like that?

To make this discussion more concrete, let’s have a quick look at that very question. We can fire up Weka and feed in a sample of level 80 paladin talent builds. To keep it simple, I’m using a toy data set of only 150 paladins with 50 from each of the 3 trees. We can run a basic k-means clustering algorithm over the data, which we hope should produce 3 clusters: one each for the holy, protection and retribution trees. And voilà…

Paladins clustered

That works because holy paladins don’t spend many points in protection or retribution talents. But for mages, where there is a significant tendency to spend points in more than one tree we get this:

Mages clustered.

Now the algorithm is flummoxed – putting arcane and frost mages in the same cluster and splitting fire mages into two clusters. So we have a simple data set that is also challenging enough to put these tools to a bit of a test.
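For readers who want to see what a basic k-means run does under the hood, here is a small self-contained sketch. It is not Weka's implementation: the six builds are made up, and the starting centroids are picked with a simple farthest-point rule (an assumption chosen to keep the example deterministic) rather than at random.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def initial_centroids(points, k):
    """Start with the first point, then repeatedly add the point that is
    farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        ))
    return centroids


def kmeans(points, k, max_iterations=50):
    centroids = initial_centroids(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # nothing moved, so the clustering has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids


# Made-up builds as (holy, protection, retribution) points.
BUILDS = [
    (51, 5, 15), (54, 17, 0),   # mostly holy
    (0, 53, 18), (5, 51, 15),   # mostly protection
    (11, 5, 55), (0, 18, 53),   # mostly retribution
]

labels, centroids = kmeans(BUILDS, k=3)
print(labels)
```

With builds this lopsided the three clusters line up with the three trees. Builds that spread points more evenly, as the mage ones do, give the algorithm far less to work with.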

I’ve also made a third data set using priest builds. Priests have more talent points invested across the trees than paladins but fewer than mages. Clustering this data set is left as an exercise for the reader…

No, seriously… Anybody who’d like to experiment with these data sets can download them from the links here. They’re in a standard .arff format (really just an annotated csv file) that Weka and RapidMiner both know how to load. Note that I’ve used a “.pdf” extension since WordPress will not allow me to upload arbitrary file types. But if you open them in a text editor you’ll see they are just csv data. Rename the extension to “.arff” and they’re ready to go.
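To show what “annotated csv” means in practice, here is a minimal reader for simple, dense ARFF text. The sample content is invented and is not one of the files above. The reader ignores attribute types and does not handle quoted attribute names or sparse data, so treat it as a sketch rather than a full parser.

```python
import csv

# Invented sample, not one of the real data sets.
SAMPLE = """% toy example
@relation paladins
@attribute holy numeric
@attribute protection numeric
@attribute retribution numeric
@attribute spec {holy,protection,retribution}
@data
51,5,15,holy
0,53,18,protection
"""


def read_arff(text):
    """Return (attribute names, rows of strings) from simple ARFF text."""
    names, rows, in_data = [], [], False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            # Everything after @data is ordinary comma-separated values.
            rows.append(next(csv.reader([line])))
        elif line.lower().startswith("@attribute"):
            names.append(line.split()[1])
        elif line.lower().startswith("@data"):
            in_data = True
    return names, rows


names, rows = read_arff(SAMPLE)
print(names)
print(rows)
```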

ClassByGear.arff
PaladinBuilds.arff
MageBuilds.arff
PriestBuilds.arff

I’ll have a lot more to say about these little data sets in the next few posts.

phone phreaks

June 10, 2010

W00t! My HTC Desire has arrived at last. It took a bit longer than I’d hoped, but that was just because the UK suppliers ran out of stock. It clearly is a phenomenally popular phone.

The Voight-Kampff Test

And my cunning plan to defeat a certain evil Oz telco has worked perfectly too. Seriously, any Australian readers who want a non-Telstra Desire, all you have to do is order it over the internet from here or here. A hundred Oz dollars cheaper and FedEx will get it to you in under five days. The British phones run on the right frequencies for the Optus and Vodafone 3G networks here.

My phone configured itself correctly from the Optus sim card; 3G voice and data came up without a hitch when I turned the phone on. WiFi and GPS, of course, are true world standards, so no problems there either.

And would you like a review of the phone? Remember Arthur C Clarke, the science fiction writer, who once made the remark that “Any sufficiently advanced technology is indistinguishable from magic”? I guess these phones must be pretty advanced then; they do have a magical quality as far as I’m concerned. It’s the first thing I’ve yet encountered that really makes me feel that I’m living in the 21st century.

I’m still adding apps and sorting out the configuration of the thing. Trouble is, everything I install I then want to play with. I spent a couple of hours yesterday playing with Google Goggles. Not perfect yet by any means, but insanely great, nonetheless. I’ll post a list of the apps that I consider essential, just as soon as I’ve figured out what they are.

And I want to try a mobile blog post too. The phone does voice-to-text that works pretty well; that should save a lot of the typing. One of my main concerns was that I’d need a physical keyboard. And indeed, if you’re planning to write War and Peace II on the thing then a keyboard is a must. But the beauty of Android is the close integration with the Google cloud. A lot of data can be set up in the cloud using your PC, and sync-ed over to the phone. I got all my browser bookmarks, RSS subscriptions and personal contacts onto the phone without typing a line.

Other options include cloud-based note-taking services like Evernote.

Meanwhile I hear that another company has just released an upgraded iProduct thingy. Yawn… Why are they even bothering? Aside from anything else, it’s Android that has the geek street cred now. Remember back in the Mac vs PC days, when the Apple computer was the minority taste and you got certain bragging rights by having one?

Now, in Australia at least, everyone has an iPhone and the Android phones are the minority. How good will it be when you pull your new Desire out of your pocket and all your friends say “Oh, is that the latest iPhone”?

Of course you’ll have your response carefully rehearsed – the quick pitying glance, the condescending smirk, the just-so tone of voice as you reply “Oh no. I’ve gone with Google. I didn’t want to be… like… evil.”

I think you can imagine how good that will feel. Hop to it then.

UPDATE: There is an issue with the phone picking up the 3G data link after turning off the WiFi. This seems to be common to all Desires and not just the UK ones. The workaround is here.