feral druid forms yet again

July 9, 2010

A long, long time ago on a website far, far away there appeared a post which argued that feral druid bear tanks were in short supply. The article also made the perfectly reasonable point that armoury datamining sites had no data on the popularity of the various druid forms.

Now, being the kind of nerd who likes a challenge (is there any other kind?) I couldn’t let that one go past. But how to solve the problem? Druids get some of their forms through talents; easy enough to get a count of the toons invested in those talents. But cats and bears were not so straightforward.

As an old SQL hacker of the “in Codd we trust” school, my first cunning plan to get the numbers on feral druid bear tanks basically went like this:

  1. Mine data
  2. SELECT level 80 feral druids FROM toons
  3. GROUP BY ???
  4. Profit!

Unfortunately, that didn’t work so well, for a number of reasons. I now understand one very interesting reason why it didn’t work: the feral talents that we expected to use to identify cats and bears are not actually distributed that way. But more on that later.

The solution involved spending some time getting up to speed with more sophisticated datamining algorithms. These algorithms are also based on a sort of GROUP BY principle, but are capable of grouping (or “clustering”, as the datamining jargon has it) across multiple data dimensions. They can easily handle the 85 talents in the three druid trees and, in essence, can group samples of druids into clusters of toons in an 85-dimensional space. Alternatively, we can cluster on character stats – health, mana, strength, agi etc – or on any combination of talents, stats and playstyle numbers that interest us.
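To make the clustering idea concrete, here is a minimal sketch in Python. It uses plain k-means, which is a much simpler stand-in and not the algorithm or package used for this post, and it runs on a handful of made-up (health, mana) points. The function name, the numbers and the starting centroids are all assumptions for illustration only.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: mean of each cluster's members, dimension by dimension.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # nothing moved, so the assignment is stable
        centroids = new_centroids
    return labels, centroids

# Hypothetical (health, mana) pairs, NOT taken from the real data sets.
toons = [
    (38000, 6000), (41000, 6100), (45000, 5900),     # high health, flat mana
    (22000, 11000), (23000, 14000), (21000, 18000),  # lower health, more mana
]
labels, centroids = kmeans(toons, centroids=[toons[0], toons[3]])
counts = {k: labels.count(k) for k in set(labels)}
print(labels, counts)
```

The same function works unchanged on tuples of any length, so an 85-element talent vector per toon would go through the same two steps. In practice the columns would probably need rescaling first, and plain k-means is not robust to outliers in the way the algorithms described here are.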

They also use calculus techniques to find the borders of each cluster in a way that can tolerate outliers. This is important in real life data, but also important in WoW data where there is always a small but significant number of players who insist on being… individuals…

Up until a few days ago, I was expecting to have to put off working on the druid forms question until I had a really good understanding of these algorithms. But that is not necessary, for the simple reason that the data we are dealing with is not all that complex. Consider the following graph:

Health vs Mana for 80 Feral Druid Raiders

Here we have selected for toons that have a history of running instances and raid dungeons, and have filtered out the toons that are serious PvPers. What we have are two groups that in essence are clustering themselves – no real datamining required.

The bottom, horizontal, group is emphasizing health over mana. This group is stacking stamina, agility, armour and dodge (I’ll prove all those things in the next post) and selecting some of the talents we’d expect for bears. In other words, Tanking 101. The top group is selecting for mana over health and is stacking intellect and spirit. This group has taken some of the cat talents too, although as we’ll see in the next post, talents do not seem to be a great predictor of role.

That graph was generated from a 500-toon data set. So we’re ready to cluster and count the full sample. And voilà:

Feral Druid Raiding Tanks and DPSers.

You can see there are some outliers, but the bulk of the sample falls neatly into the two clusters. What you can’t see properly from the chart is that the blue tank cluster is in fact more populous than the red DPSers. It’s just that their health and mana stats don’t vary much, so the cluster is more dense. That’s where a clustering algorithm is needed to get a count of the population of each blob.

And the answer? There are 13,187 level 80 feral druids in the sample who have done more than 75 instances and raids and have done no arenas. That’s my (somewhat generous) working definition of a PvE raider. It’s also a problematic definition because the arena stats are historical – they don’t prove that the toon was not geared up for raiding when the armoury snapshot was taken.
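The working definition of a PvE raider can be written as a simple filter. This is only a sketch: the `Toon` record, its field names and the sample rows are hypothetical and do not reflect the actual column names in the data sets.

```python
from dataclasses import dataclass

@dataclass
class Toon:
    name: str
    level: int
    pve_runs: int  # instances and raids completed (hypothetical field name)
    arenas: int    # arena matches played (hypothetical field name)

def is_pve_raider(toon, min_runs=75):
    """Level 80, more than `min_runs` instances and raids, and no arenas."""
    return toon.level == 80 and toon.pve_runs > min_runs and toon.arenas == 0

# Made-up rows for illustration only.
sample = [
    Toon("Ursa", 80, 210, 0),
    Toon("Claws", 80, 75, 0),      # exactly 75 runs does not qualify
    Toon("Gladiator", 80, 300, 42),
    Toon("Cub", 72, 120, 0),
]
raiders = [t.name for t in sample if is_pve_raider(t)]
print(raiders)
```

Because the arena count is historical, a toon like the hypothetical "Gladiator" is excluded even if it was in full raiding gear when the snapshot was taken, which is the weakness noted above.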

Of those raiders, 60% are in the blue tank cluster and 40% are in the red DPS cluster. So that’s one useful piece of information: feral druid raiders do seem to prefer to tank rather than DPS by a narrow majority.

But the PvE raiders are less than half of the total sample. So the worst-case scenario is that, on any given day, only about 30% of level 80 feral druids (60% of roughly half the sample) are set up for tanking (although, to repeat, it is not likely that every arena player is geared for PvP all the time).
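The arithmetic behind that worst case, as a short sketch. The 60% and 13,187 figures come from the text above; the 0.5 raider share is an assumed upper bound (the post only says "less than half"), and the calculation assumes that no toon outside the raider group is geared for tanking.

```python
raiders_in_sample = 13187        # from the post
tank_share_of_raiders = 0.60     # blue cluster, from the post
raider_share_of_sample = 0.50    # assumed upper bound: "less than half"

# Approximate head count of raiding tanks (the 60% figure is itself rounded).
approx_raiding_tanks = round(raiders_in_sample * tank_share_of_raiders)

# Worst case: every non-raider is assumed NOT to be set up for tanking.
worst_case_tank_share = tank_share_of_raiders * raider_share_of_sample

print(approx_raiding_tanks)
print(f"{worst_case_tank_share:.0%}")
```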

The data is from patch 3.3.3.

Then there is the question of effective tanks. Some of those blue crosses in the bottom left hand corner of the chart are probably not seriously gearing up for very much at all. That will be the subject of the next post, when we will use some of the wonderful data visualization tools in these datamining packages to look much more closely into the dark heart of that big blue blob.

Meanwhile, if you’d like to play around with the data for yourself, here are the data sets I’m using. I’ve got a small set of feral talent builds, a larger set of builds and a large set of character stats. Each data set contains counts of instances and raids, battlegrounds and arenas played so you can filter on the raiders.

NB these data sets have been corrected and updated on 12 July. If you downloaded them before that, apologies for the error, and please download them again:


9 Responses to “feral druid forms yet again”

  1. zardoz Says:

    Enjoy the data. Let me know if you find anything interesting.

  2. zardoz Says:

    Hi Narkondas:

    Yes, unfortunately the possibility that the toon’s gear and spec don’t match occurred to me as well. I wasn’t sure how many cases like that were in the data set. Your point about spellpower is a good one – I’ve noticed the sparse >0 spellpower thing too but couldn’t make up my mind what it meant.

    Those two clusters are pretty dense, and it seems to me that they do represent something valid that cats and bears are really doing.

    My reading of the data is that bears are stacking agi (and dodge too) for damage avoidance. I’m certainly no druid expert but the key druid tank bloggers recommend basically what we see the druids in the blue cluster doing.

    I can make a data set with crit and armour penetration, no problem. Also average gear level; I’m away from the database but from memory I’ve got that in there somewhere. I’ll post it up on Monday and you’re welcome to play with it and tell us what you find.

    Meanwhile I’ve got a set of charts showing the distribution of the key stats and talents across the two clusters. I’ll try to get the next post up tonight.

  3. zardoz Says:

    Hi Nelson.

    Every druid in the sample is a Feral Combat one; I filtered the others out at the database query end. And I’ll take that bet on talents – you can buy me a coffee next time I’m in San Francisco! I’m about to put up a post on the distribution of the key cat and bear talents across the clusters; it makes an interesting story.

    Your last point is right on the money – talents like Survival of the Fittest are taken by the bears but by lots of cats too. I must admit to being not entirely sure why the talents are poor predictors of role? Do druids just tank one run and DPS the next? Do they all forget to swap out the gear with the specs? Anyway I’ll put the charts up for comment as soon as I can.

    They’re first class chart porn if nothing else.

  4. Mike Says:

    You have no idea how excited I am about this! Thank you for providing the arff files!

  5. Narkondas Says:

    Interesting.

    I fail to understand, however, why feral cats would ever stack intellect or spirit – since that would lend itself more to restoration (Tree) or balance (Moonkin).

    Playing a feral druid myself – with one bear and one cat spec – I certainly don’t. (In fact my cat gear is awful since I reuse a lot of the bear pieces)

    I’ve looked over the dataset (FeralDruidStatsData-arff) and in almost all the cases where mana is high, agility is extremely low. Agility (Along with Crit and ArP) are KEY cat-talents.
    Another common trend is that when Mana is high, Spellpower is also high. Spellpower is also equally useless for cats.

    So my conclusion on those would be that some people have logged out in their feral spec, but with their resto or balance gear equipped.

    Of the almost 20000 toons – only around 1600 of those have more than 0 spellpower.

    The average manapool of those 0 Spellpower toons is 6087, max 8841, min 5391.

    The average manapool of those >0 spellpower toons is 10375, max 30666, min 5451.

    These numbers are just crunched in a spreadsheet (I have a desire to start doing datamining, but have not gotten “into it” yet)

    I tried to set up something simple – but without crit and ArP it’s really hard.

    I’ll try to get some more into it.. I have a feeling that if you clustered by Stamina divided by avg. ILevel – you would get some fun numbers – or perhaps AGI divided by Dodge (That would essentially give you a cluster of those that got all their dodge from AGI (cats), and another of those that got extra dodge on gear (bears))
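The spreadsheet split described in this comment (mana pool for toons with zero spellpower versus those with more than zero) could be reproduced along these lines. This is a sketch only: the rows are invented, and the `mana` and `spellpower` keys are assumed names rather than the real column names in the arff files.

```python
from statistics import mean

# Invented rows for illustration; not taken from the real data set.
rows = [
    {"mana": 5400, "spellpower": 0},
    {"mana": 6000, "spellpower": 0},
    {"mana": 8700, "spellpower": 0},
    {"mana": 5500, "spellpower": 120},
    {"mana": 30000, "spellpower": 2400},
]

def mana_summary(subset):
    """Count, average, maximum and minimum mana pool for a list of rows."""
    manas = [r["mana"] for r in subset]
    return {"count": len(manas), "avg": mean(manas),
            "max": max(manas), "min": min(manas)}

no_spellpower = mana_summary([r for r in rows if r["spellpower"] == 0])
some_spellpower = mana_summary([r for r in rows if r["spellpower"] > 0])
print(no_spellpower)
print(some_spellpower)
```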

  6. Nelson Minar Says:

    Nice graphs! But I’m not sure it tells the full story. I think your red vs. blue is basically Balance druid vs Feral druid, and I bet those clusters would look identical if you just plotted # of talent points in each tree. But what about separating Feral tank druid from Feral DPS druid?

    What’s confusing about feral druids is there’s two different roles in a single talent tree: cat DPS vs. bear tanking. Have you tried looking at attack power vs. stamina, or maybe critical chance vs. stamina? There’s not an enormous difference between cat and bear gear, but there is a little.

    BTW, the defining talent for bear tanks is Survival of the Fittest. You can’t tank without it. OTOH I bet most cat DPS druids take it too, so it may not be a good signifier.

  7. Antje Says:

    Greetings from a feral druid!

    I don’t see what mana has to do with cat DPS. As a feral druid, I have identical amounts of mana in both cat and bear gear: a whopping 7011.

    Looking at agility will not help you separate bears from cats, as the stat is useful to both. I believe the defining stats will be stamina for bears and armor pen for cats. I would particularly look at gem choice, since many items of gear work well enough for both cat and bear roles, but it’s the gems that will give away what role you’re preparing for.

    Of course, there are some cats out there who will be stacking agility instead of armor pen, so that will mess with things. But those cats shouldn’t have any stamina gems, whereas bears will have 50%-100% stamina gems.

    As for talents, players such as myself will make this very difficult for you. I raid in a hybrid bear/cat spec, with emphasis on cat. (My offspec is for feral PvP.) I regularly top the damage meters of my 10-man raid, but if our MT suddenly dies it’s my job to go bear and save the day. A hybrid spec is also useful for those fights where I might be assigned a tanking role for one phase, but I want to have meaningful DPS for the remainder of the fight. I suspect many 10-man raiding ferals will have at least somewhat hybridized specs, so you may want to consider us as a third category alongside pure cat and pure bear.

    Perhaps the best way to tell a bear by talents is to see if he’s missing the key cat talents, and vice versa. If someone is missing Omen of Clarity, Shredding Attacks, King of the Jungle, or Primal Gore then they cannot be a serious cat. In the same way, someone who is missing Natural Reaction, Survival of the Fittest, or Protector of the Pack will not be a serious bear. Of course, hybrids like myself will have all (or at least most) of the above, and for that matter many pure cats will nab Survival of the Fittest just in case.

    But you’re right, separating cats from bears by talents will be a very imprecise science. My hunch is that when all is said and done, you’ll only find 15% pure cats, 25% pure bears, and the remaining 60% will span the entire hybrid spectrum in between.
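The missing-talent rule suggested in this comment could be sketched as follows. The talent names are the ones listed in the comment; the function, its labels and the strict "must have every talent" test are assumptions, and a real classifier would likely need to allow for hybrids who hold most but not all of a list.

```python
# Talent lists as named in the comment above.
CAT_TALENTS = {"Omen of Clarity", "Shredding Attacks",
               "King of the Jungle", "Primal Gore"}
BEAR_TALENTS = {"Natural Reaction", "Survival of the Fittest",
                "Protector of the Pack"}

def classify(talents):
    """Label a build by which key talent lists it holds in full."""
    talents = set(talents)
    serious_cat = CAT_TALENTS <= talents    # holds every key cat talent
    serious_bear = BEAR_TALENTS <= talents  # holds every key bear talent
    if serious_cat and serious_bear:
        return "hybrid"
    if serious_cat:
        return "cat"
    if serious_bear:
        return "bear"
    return "unclear"

# A pure cat that also took Survival of the Fittest "just in case".
cat_build = CAT_TALENTS | {"Survival of the Fittest"}
print(classify(cat_build))
print(classify(BEAR_TALENTS))
print(classify(CAT_TALENTS | BEAR_TALENTS))
```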

  8. hoeding_azgalor Says:

    Very interesting stuff here 🙂

    One dataset I would be interested in looking at would be which enchants are popular for leveling characters and/or statistics on what people are using on their BOA gear. I don’t really have the know-how to scrape the armory myself and process the data but with a dataset I think I can PROBABLY figure out how to wiggle the data properly with rapidminer 😛

  9. zardoz Says:

    They’re interesting ideas but I may not have the data for everything there. I’ll email you.

