July 28, 2010
Now that we’ve got a dataset which can give us feral druids who:
- are consistently geared and spec-ed and
- are serious participants in instance-running and raiding
we can move to the next stage: trying to find ways to partition that set into bears and cats.
This is where the visualization tools in a datamining package really come into their own. We can add a third dimension to any cluster by using colour. And we can quickly iterate through all the data dimensions to see which ones produce the best clusters. In these charts I’m filtering out all feral druids with spellpower gear and all who have run fewer than 75 instances or raids.
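As a rough illustration of that filter, here is a minimal Python sketch. The record layout, the field names (`spellpower`, `runs`) and the spellpower cutoff are all assumptions made up for the example, not the way the charts were actually produced.

```python
# Hypothetical toon records; field names and values are made up for illustration.
toons = [
    {"name": "A", "spellpower": 0, "runs": 120},
    {"name": "B", "spellpower": 1500, "runs": 200},
    {"name": "C", "spellpower": 0, "runs": 40},
]

MAX_SPELLPOWER = 0  # assumed cutoff for "has spellpower gear"
MIN_RUNS = 75       # instances or raids completed


def is_serious_feral(toon):
    """Keep toons with no spellpower gear and at least MIN_RUNS runs."""
    return toon["spellpower"] <= MAX_SPELLPOWER and toon["runs"] >= MIN_RUNS


kept = [t["name"] for t in toons if is_serious_feral(t)]
print(kept)  # ['A']
```

In practice the cutoff would need tuning, since a feral druid might carry a stray point or two of spellpower without being geared as a caster.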
I’m still plotting health vs mana, to keep the charts consistent across posts, but we’re getting close to the point where we will have to find different stats to graph. We know now that mana is irrelevant and health only a partial indicator of tank-ness. But for the time being, the main cluster that results from that plot is good enough.
Now we want to know which talents can partition the cluster. (And we could ask the same question of glyphs or character stats too.) How about Primal Gore? This is the result – not a lot of partitioning going on there:
Thanks to various comments, it’s clear that there are a set of talents which people expect to effectively partition the cluster. Popular suggestions have included: Thick Hide, Natural Reaction and Protector of the Pack for bears and Shredding Attacks, Predatory Instincts, King of the Jungle, Survival Instincts and Natural Shapeshifter for cats.
Now we can look at each of those in detail:
You can see that some talents appear to be better than others at defining two distinct clusters. They all have a bit of partitioning effect, but some are better than others at producing the largest “distance” between the two clusters. Predatory Instincts produces clear gold and light blue clusters but Natural Shapeshifter produces more of a greenish middle ground which means that many players in both camps have put a point or two into it.
Datamining clustering algorithms work by calculating “distances” between data points along each of the data dimensions, then aggregating those distance measures across all the dimensions. For example, the distance between a toon which has 3 points in Thick Hide and a toon which has zero points in the talent could be measured as “3”, and then a sum of all distances could produce a measure of how distinct one toon is from another (although the algorithms generally use more sophisticated maths than just that).
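To make the simple version of that idea concrete, here is a small Python sketch of a sum-of-differences distance (often called Manhattan distance). The two builds are invented for illustration, and real packages typically use more refined measures.

```python
def talent_distance(toon_a, toon_b):
    """Sum of absolute per-talent point differences between two builds."""
    talents = set(toon_a) | set(toon_b)
    return sum(abs(toon_a.get(t, 0) - toon_b.get(t, 0)) for t in talents)


# Hypothetical builds: talent name -> points spent
bearish = {"Thick Hide": 3, "Natural Reaction": 3, "Predatory Instincts": 0}
cattish = {"Thick Hide": 0, "Natural Reaction": 0, "Predatory Instincts": 3}

print(talent_distance(bearish, cattish))  # 3 + 3 + 3 = 9
```

A talent that both camps take, like the "greenish middle ground" case, contributes little or nothing to this sum, which is why it does a poor job of separating the clusters.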
So we want the talents with the greatest distance between the two clusters. You can have a look at the charts and see which ones you think are the best ones. I’ll put up my numbers on that in the next post.
Now if we use the better of those talent dimensions as inputs to our clustering algorithm we get this:
The crucial thing here is that the blue cluster, which are the toons with bear-ish talents, extends right along the health x-axis. No doubt serious tanks are picking gear, gems and enchants that boost health. But since we are looking for a count of all tanks, all the way from those running 5-toon instances to those in the endgame raids, we should expect that there will be a wide spread of health between those just starting out and those nearer the end of the raiding dungeon chain.
That’s one reason why I’m about to abandon the health vs mana thing and move onto other character stats. More about that in the next post. But the reason we can make decisions like that is due to the insights that this data visualization gives us.
July 20, 2010
Thanks to all the correspondents who commented on my feral druids datamining experiment. I’m happy that I’ve got a reasonable estimate for the number of bear tanks now. But I’m holding back from putting up my final word on the subject since I’m trying to encourage a couple of people to write up their own analysis first.
You may recall we left off with a simple graph of health vs mana for level 80 feral druids that produced two very distinct clusters – a red and a blue one – sorta like one of those political maps of the USA except with all the republicans and democrats clumped together in separate parts of the country.
And the key question was… um… While we’re on that subject… Can anybody explain to me why those political maps always colour the conservatives red and the liberals blue? It’s very confusing to a foreigner since just about everywhere else in the world, red is associated with the left or progressive side and blue with the Tory or conservative side.
And the key question was: what were those red ferals doing at the high mana end of the scale? I had my doubts that there could be so many toons carrying mismatched specs and gear. But I’ve been convinced that, yes, there is something not quite right there. A simple filter that drops those toons in the sample with significant spellpower gear basically makes the red cluster disappear.
Now that might not sound like progress – ending up with one cluster – but don’t forget that the power of these datamining algorithms is that they cluster in multiple data “dimensions”. To the eye, there is one cluster, because we are drawing the graph in two “dimensions”: health and mana.
But as soon as we add some talent and glyph dimensions, the big blue blob starts to break up into separate clusters. And this time, there is a good match between the talents and glyphs that we expect to distinguish cats from bears and the actual location of each cluster in the multi-dimensional space.
But it’s a whole lot easier to show you than to tell you, so I’ll leave you with a simple illustration of how that all works. We can add a third dimension to the graph by using colour. The datamining packages that I’m playing with are very good at that sort of visualization, as you can see here.
With the spellpower toons gone, the high mana group has also mostly gone and the shape of the blue cluster has become clearer as the graph scale has changed. Then we overlay, say, a cat glyph:
and a bear glyph:
and the clusters within the cluster become pretty clear. Thanks again to Narkondas for the key clues that inspired those graphs.
The datamining algorithms will generate a count of the toons in each cluster, but I’ll leave that till the next post. As you can imagine, though, with a big clump of ferals filtered out, the percentage of bear tanks in the overall mix is getting smaller.
I should also say that I’m about to collect a new data set and update my armoury reports since the data is getting a bit old and stale. As usual that will take a week or so.
July 9, 2010
A long, long time ago on a website far, far away there appeared a post which argued that feral druid bear tanks were in short supply. The article also made the perfectly reasonable point that armoury datamining sites had no data on the popularity of the various druid forms.
Now, being the kind of nerd who likes a challenge (is there any other kind?) I couldn’t let that one go past. But how to solve the problem? Druids get some of their forms through talents; easy enough to get a count of the toons invested in those talents. But cats and bears were not so straightforward.
As an old SQL hacker of the “in Codd we trust” school, my first cunning plan to get the numbers on feral druid bear tanks basically went like this:
- Mine data
- SELECT level 80 feral druids FROM toons
- GROUP BY ???
Unfortunately, that didn’t work so well, for a number of reasons. I now understand one very interesting reason why it didn’t work: the feral talents that we expected to use to identify cats and bears are not actually distributed that way. But more on that later.
The solution involved spending some time getting up to speed with more sophisticated datamining algorithms. These algorithms are also based on a sort of GROUP BY principle, but are capable of grouping (or “clustering”, as the datamining jargon has it) across multiple data dimensions. They can easily handle the 85 talents in the three druid trees and, in essence, can group samples of druids into clusters of toons in an 85-dimensional space. Alternatively, we can cluster on character stats – health, mana, strength, agi etc – or on any combination of talents, stats and playstyle numbers that interest us.
They also use calculus techniques to find the borders of each cluster in a way that can tolerate outliers. This is important in real life data, but also important in WoW data where there is always a small but significant number of players who insist on being… individuals…
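For readers who want to see the "GROUP BY across many dimensions" idea in code, here is a minimal sketch using k-means. That choice is an assumption: it is one common clustering algorithm, not necessarily the one in the datamining packages, and it makes no attempt at the outlier handling mentioned above. The six three-talent builds and the starting centroids are made up.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centroids, iterations=10):
    """Repeatedly assign points to the nearest centroid, then recentre."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if empty)
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centroids


# Hypothetical builds: points in (Thick Hide, Natural Reaction, Predatory Instincts)
builds = [(3, 3, 0), (3, 2, 0), (2, 3, 0), (0, 0, 3), (0, 1, 3), (1, 0, 3)]
clusters, centroids = k_means(builds, centroids=[builds[0], builds[3]])
print([len(c) for c in clusters])  # [3, 3]
```

The same function works unchanged on 85-element tuples, which is the point: the grouping does not care how many dimensions there are.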
Up until a few days ago, I was expecting to have to put off working on the druid forms question until I had a really good understanding of these algorithms. But that is not necessary, for the simple reason that the data we are dealing with is not all that complex. Consider the following graph:
Here we have selected for toons that have a history of running instances and raid dungeons, and have filtered out the toons that are serious PvPers. What we have are two groups that in essence are clustering themselves – no real datamining required.
The bottom, horizontal, group is emphasizing health over mana. This group is stacking stamina, agility, armour and dodge (I’ll prove all those things in the next post) and selecting some of the talents we’d expect for bears. In other words, Tanking 101. The top group is selecting for mana over health and is stacking intellect and spirit. This group has taken some of the cat talents too, although as we’ll see in the next post, talents do not seem to be a great predictor of role.
That graph was generated from a 500-toon data set. So we’re ready to cluster and count the full sample. And voilà:
You can see there are some outliers, but the bulk of the sample falls neatly into the two clusters. What you can’t see properly from the chart is that the blue tank cluster is in fact more populous than the red DPSers. It’s just that their health and mana stats don’t vary much, so the cluster is more dense. That’s where a clustering algorithm is needed to get a count of the population of each blob.
And the answer? There are 13,187 level 80 feral druids in the sample who have done more than 75 instances and raids and have done no arenas. That’s my (somewhat generous) working definition of a PvE raider. It’s also a problematic definition because the arena stats are historical – they don’t prove that the toon was not geared up for raiding when the armoury snapshot was taken.
Of those raiders, 60% are in the blue tank cluster and 40% are in the red DPS cluster. So that’s one useful piece of information: feral druid raiders do seem to prefer to tank rather than DPS by a narrow majority.
But the PvE raiders are less than half of the total sample. So, the worst case scenario is that on any given day, only about 30% of level 80 feral druids (60% of at most half) are set up for tanking (although, to repeat, it is not likely that every arena player is geared for PvP all the time).
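The arithmetic behind that figure can be checked in a few lines of Python. The only assumption is treating "less than 1/2" as exactly one half, which makes 30% an upper bound on the worst case rather than an exact value.

```python
raiders = 13187      # PvE raiders in the sample, from the count above
tank_share = 0.60    # share of raiders in the blue tank cluster
raider_share = 0.50  # assumed upper bound: raiders are under half of the sample

tanks_among_raiders = round(raiders * tank_share)
tank_share_of_sample = tank_share * raider_share

print(tanks_among_raiders)            # 7912
print(f"{tank_share_of_sample:.0%}")  # 30%
```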
The data is from patch 3.3.3.
Then there is the question of effective tanks. Some of those blue crosses in the bottom left hand corner of the chart are probably not seriously gearing up for very much at all. That will be the subject of the next post, when we will use some of the wonderful data visualization tools in these datamining packages to look much more closely into the dark heart of that big blue blob.
Meanwhile, if you’d like to play around with the data for yourself, here are the data sets I’m using. I’ve got a small set of feral talent builds, a larger set of builds and a large set of character stats. Each data set contains counts of instances and raids, battlegrounds and arenas played so you can filter on the raiders.
NB these data sets have been corrected and updated on 12 July. If you downloaded them before that, apologies for the error, and please download them again:
March 2, 2010
This is a guest post by Darush.
A while back Zardoz posted an estimate of feral druid builds (bears vs. cats). The original analysis was based on human expertise: bears usually choose certain talents while cats choose others. However, one can take another approach. There are inherent differences in the builds for each spec, and computational methods that know nothing about World of Warcraft might be able to identify these differences by examining the raw data. Armed with Matlab and some free time, I tried to explore the great kitty question.
A quick note about myself: I have been playing World of Warcraft since Wrath of the Lich King was released. I currently have two characters on Cairne, Darush (a hunter) and Azabroth (a priest). Outside of the game, I’m a PhD student in computational biology, and my thesis deals with identifying subpopulations in cell data. Interestingly, one can easily replace “cell data” with the druid data Zardoz mined from the armory.
For those who would just like the executive summary: Out of the data Zardoz sent me, 27% of the toons could be bears and 65% could be cats. Be aware that there are some ifs-and-buts associated with these numbers and they should be treated with due caution.
Now for the details:
I started with a very large table that included the agility, stamina, dodge rating, and talent points allocation for 28,970 toons who had a Feral build (either active or not). I initially wanted to assess how well agility and stamina predict whether a toon is a cat or a bear. Plotting stamina vs. agility, I see the following:
First of all, notice that annoying flat line at the bottom. These are toons with low agility, which means they’re not wearing feral gear in their feral spec. Since I don’t want to confuse my algorithm, I removed these. I ended up with 20,365 toons and the following plot:
Two separate populations emerge. There are the high stamina/low agility toons, which I’m guessing are mostly bears, and the high agility/low stamina toons, which are probably mostly cats. Unfortunately, there is a huge overlap, especially in the lower attributes range. It appears that stamina and agility will not be enough to decide who is what.
Now, for some magic! I used an algorithm called PCA (principal component analysis). In a nutshell, PCA attempts to score the variability in the data. The algorithm takes a long list of numbers (in this case, I used the talent builds for each toon) and outputs a series of scores for each toon. Each number in the series is a component; the first number for each toon is the first component, the second is the second component, etc. Intuitively, PCA compresses the data by discarding less variable information.
(Please notice that this is a very hand wavy explanation. Apologies to all mathematicians, physicists and computer scientists in the crowd.)
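For the curious, here is a toy Python sketch of how a first-component score can be computed. It is not the Matlab routine used for the plots: it uses plain power iteration on the covariance matrix, only recovers the first component, and assumes the starting vector is not orthogonal to that component. The six three-talent builds are made up.

```python
def first_component_scores(rows, iterations=100):
    """Score each row along the direction of greatest variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centred data
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication converges on the top eigenvector
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Project every centred row onto that direction
    return [sum(c[j] * v[j] for j in range(d)) for c in centred]


# Hypothetical builds: points in (Thick Hide, Natural Reaction, Predatory Instincts)
builds = [(3, 3, 0), (3, 2, 0), (2, 3, 0), (0, 0, 3), (0, 1, 3), (1, 0, 3)]
scores = first_component_scores(builds)
print([round(s, 2) for s in scores])
```

In this toy data the three bearish builds get positive scores and the three cattish builds get negative ones, so a single number per toon already separates the two groups.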
After running PCA using just the talent specs, I looked at the first and second components, and ended up with this plot:
We can clearly see two populations. I guessed that one was bears, the other cats. Usually at this step I will run another fancy algorithm to actually identify these automatically, and score how different they are; such algorithms are called clustering algorithms, and each group is a cluster. I have chosen the lazier path for the purpose of this analysis, and decided on two arbitrary lines:
The top green cluster has 5,571 toons. The bottom one has 13,226. There are 1,568 blue toons, which are undecided (7.7% of data).
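As a quick sanity check on those counts, this short Python snippet adds them up and converts them to shares of the filtered data set. The dictionary labels are just descriptive names for this example.

```python
clusters = {
    "c1 (top green)": 5571,
    "c2 (bottom green)": 13226,
    "undecided (blue)": 1568,
}

total = sum(clusters.values())
shares = {name: 100 * count / total for name, count in clusters.items()}

print(total)  # 20365
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

The total matches the 20,365 toons left after removing the low-agility line, and the two green shares round to the 27% and 65% quoted in the executive summary.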
It’s time for a guessing game. I am guessing one of these groups is bears and the other is cats. It is quite possible that I am completely mistaken, but since this is one of the major differences between feral druids (and the initial purpose of my analysis), I decided to give it a shot. If I am correct, then which one of these is the bear cluster, and which one is the cat cluster? I decided to do some more speculation. Let’s examine the stamina vs. agility plot again:
It appears that toons around black line 1 are mostly bears and toons around black line 2 are mostly cats. Again, as mentioned previously, there is a huge overlap. Fortunately, the overlap ends at some point. We can guess that points to the right of red line 1 are “definitely bears”, since they have very high stamina, while points above red line 2 are “definitely cats”, since they have very high agility.
What shall we do next? On one hand, from the PCA analysis (only talent spec) we have the top green cluster (which I called c1) and the bottom green cluster (c2). From the stamina vs. agility plot, we have the “definitely bear” and the “definitely cat” toons. We can now ask the following four questions:
- Does c1 have many “definitely bears”?
- Does c1 have many “definitely cats”?
- Does c2 have many “definitely bears”?
- Does c2 have many “definitely cats”?
We answer these using a statistical test called a hypergeometric test, also called a one-tailed Fisher’s exact test. The answers are yes, no, no, yes. Therefore, we can say that c1 is highly enriched for bears while c2 is highly enriched for cats.
(For the statistically inclined, all four p-values are lower than 10⁻²⁰.)
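Here is a minimal Python sketch of the enrichment question such a test answers: if we drew a cluster's worth of toons at random, how likely is it that we would see at least this many "definitely bears"? The function name and the small numbers in the example are invented for illustration; they are not the counts from the analysis.

```python
from math import comb


def enrichment_p_value(population, marked, drawn, observed):
    """P(at least `observed` marked items in `drawn` draws without replacement)."""
    upper = min(marked, drawn)
    total = comb(population, drawn)
    tail = sum(comb(marked, k) * comb(population - marked, drawn - k)
               for k in range(observed, upper + 1))
    return tail / total


# Hypothetical: 20 toons, 5 "definitely bears", a cluster of 6 containing 4 of them
p = enrichment_p_value(population=20, marked=5, drawn=6, observed=4)
print(round(p, 4))  # 0.0139
```

A small p-value says the cluster holds more "definitely bears" than chance would explain, which is what "enriched" means here.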
It is time for some truth in advertising. If I presented my thesis adviser with this analysis, she would probably hang me, rez me, hang me again, and then /gkick me out of my PhD program. There is much unsubstantiated guesswork involved, a mix of rules of thumb and hunches. The good news is that Zardoz manually examined some of the toons in c1 and c2, and did not identify any contradictions with my analysis.
And, of course, both of us will be glad to hear any remarks and comments.
November 23, 2009
Well only just… I’ve got some results from my attempt to divide up the feral druid population into cats and bears. We started from the fact that there is no “form” tag in the armoury XML – no direct way to count the thing we want to count. The only way to get an insight into this is to find a proxy for each of the forms – something that is in the data which can be used to separate the sheep from the goats, if you’ll pardon the mixed metaphor.
Talents seem to be the obvious choice, so long as there is one talent that bears will take and cats not and another talent that is vice versa. Glyphs are the other possibility. Whatever we choose just has to be i) something that players are highly likely to take and ii) something that is orthogonal; something that definitely points in one direction for bears and another for cats.
But the basic problem is that there are a lot of um… how to put this politely… there are a lot of left-of-centre specs out there. Talents and glyphs are both less orthogonal than I was hoping for – many specs look a bit bearish and a bit cattish at the same time. And there is a big group that takes none of the talents or glyphs that we want to use.
That’s why I decided not to make the queries very complex – adding more talents or glyphs into the selection criteria just increases the number of toons that fall into the grey area. Also I’ve counted specs and not toons since the original question was related to the number of druids specced for tanking.
Thanks to the commenters who made suggestions on possible talents and glyphs that might fit these criteria. I’ve run two queries against the data.
The first query counts feral druids who have Natural Reaction versus those who have Predatory Instincts. A druid with some points in Natural Reaction and none at all in Predatory Instincts might be a bear; t’other way round for cats. Those with points in neither are marked as “unknown”; those with some points in both are the “could be either” group.
The second query counts druids who have a Glyph of Maul versus those who have either a Glyph of Shred and/or a Glyph of Rip. Equipping Maul but not Shred or Rip indicates bear; Shred or Rip but no Maul indicates cat. Again we have groups with a mix of these glyphs, and, unfortunately, a huge group with none of them.
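The logic of both queries boils down to the same four-way split, which can be sketched in Python as follows. This is an illustration of the rule only, not the actual queries; the function name and the sample specs are hypothetical.

```python
from collections import Counter


def classify(bear_marker, cat_marker):
    """Label a spec from its bear-ish and cat-ish markers (points or glyph counts)."""
    if bear_marker > 0 and cat_marker > 0:
        return "could be either"
    if bear_marker > 0:
        return "bear"
    if cat_marker > 0:
        return "cat"
    return "unknown"


# Hypothetical specs: (points in Natural Reaction, points in Predatory Instincts)
specs = [(3, 0), (0, 3), (2, 1), (0, 0), (3, 0)]
counts = Counter(classify(bear, cat) for bear, cat in specs)
print(counts)
```

For the glyph query the two markers would be "has Glyph of Maul" and "has Glyph of Shred or Rip", passed in as 0 or 1.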
Anyway this is what we’ve got:
(Patch 3.2.2 data; sample size 16327 level 80 feral druids with 28970 specs).
Talent-based spec count:
- Bear: 30%
- Cat: 33%
- Could be either: 5%
- Unknown: 31%
Glyph-based spec count:
- Bear: 18%
- Cat: 9%
- Could be either: 15%
- Unknown: 58%
Frankly I’m still not sure how valid these numbers are, but I hope they provide a bit of insight. The talent-based count may at least provide a low-water-mark indication of the number of bearish specs in there.
October 30, 2009
There’s a post over at wow.com that has set me a bit of a challenge. The post is about bear tanks, but makes the valid point that we don’t have any clear data on the popularity of the various druid forms. There’s a pretty simple reason for that – the armoury data doesn’t provide any direct way of getting such a count.
Still, we don’t let little obstacles like that get in our way. What we need are some data items that can be used as proxies for what we want to count. Unfortunately I’m far from being a druid expert, so I’m looking for suggestions on what items to use.
What we basically want is a talent, or a glyph (or maybe a gem) that bears will want to equip and cats not. And then something that’s vice versa – something that cats will have and bears not. One talent or glyph, or several… whatever makes the most sense. All suggestions on this are most welcome.
(Thanks to the commenters who have already made suggestions on other threads; I’ll be taking those comments on board.)
If I can get suggestions for both talents and glyphs then I can run more than one query and see how well the numbers match up.
I’d imagine that, with dual specs, players who liked both forms would have a spec for each. In any case, going from specs to forms and getting a count against the total druid population should tell us something interesting.