the truth is in there
July 20, 2010
Thanks to all the correspondents who commented on my feral druids datamining experiment. I’m happy that I’ve got a reasonable estimate for the number of bear tanks now. But I’m holding back from putting up my final word on the subject since I’m trying to encourage a couple of people to write up their own analysis first.
You may recall we left off with a simple graph of health vs mana for level 80 feral druids that produced two very distinct clusters – a red and a blue one – sorta like one of those political maps of the USA except with all the Republicans and Democrats clumped together in separate parts of the country.
And the key question was… um… While we’re on that subject… Can anybody explain to me why those political maps always colour the conservatives red and the liberals blue? It’s very confusing to a foreigner since just about everywhere else in the world, red is associated with the left or progressive side and blue with the Tory or conservative side.
And the key question was: what were those red ferals doing at the high mana end of the scale? I had my doubts that there could be so many toons carrying mismatched specs and gear. But I’ve been convinced that, yes, there is something not quite right there. A simple filter that drops those toons in the sample with significant spellpower gear basically makes the red cluster disappear.
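A minimal sketch of that kind of filter, in Python. The sample rows, the field names and the cutoff of 1000 are all made up for illustration; the real sample and the threshold for "significant" spellpower are not given here.

```python
# Made-up sample rows; the field names and the cutoff are assumptions,
# not values taken from the armoury data.
toons = [
    {"name": "Clawz", "health": 41000, "mana": 8500, "spellpower": 0},
    {"name": "Bearly", "health": 52000, "mana": 8700, "spellpower": 120},
    {"name": "Moonish", "health": 24000, "mana": 21000, "spellpower": 2400},
    {"name": "Treeish", "health": 23000, "mana": 23000, "spellpower": 2100},
]

SPELLPOWER_CUTOFF = 1000  # hypothetical threshold for "significant"


def drop_spellpower_toons(rows, cutoff=SPELLPOWER_CUTOFF):
    """Keep only toons whose spellpower is below the cutoff."""
    return [row for row in rows if row["spellpower"] < cutoff]


kept = drop_spellpower_toons(toons)
print([row["name"] for row in kept])
```

In this toy sample the two toons that survive are also the two low-mana ones, which is the effect described above: drop the spellpower gear and the high mana group goes with it.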
Now that might not sound like progress – ending up with one cluster – but don’t forget that the power of these datamining algorithms is that they cluster in multiple data “dimensions”. To the eye, there is one cluster, because we are drawing the graph in two “dimensions”: health and mana.
But as soon as we add some talent and glyph dimensions, the big blue blob starts to break up into separate clusters. And this time, there is a good match between the talents and glyphs that we expect to distinguish cats from bears and the actual location of each cluster in the multi-dimensional space.
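To make the idea of extra "dimensions" concrete, here is a rough sketch using a bare-bones k-means. That choice of algorithm is an assumption for illustration, not necessarily what the datamining packages run, and the six data points are invented. Each toon is a tuple of (scaled health, scaled mana, has cat glyph, has bear glyph). The health and mana values overlap, so it is the two glyph columns that pull the groups apart.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Very small k-means: returns a cluster label for each point."""
    labels = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid (Euclidean distance).
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        updated = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster where it was
        centroids = updated
    return labels


# Invented toons: (health, mana, cat_glyph, bear_glyph), health/mana scaled to 0..1.
points = [
    (0.40, 0.50, 1, 0),
    (0.45, 0.52, 1, 0),
    (0.50, 0.48, 1, 0),
    (0.55, 0.50, 0, 1),
    (0.60, 0.51, 0, 1),
    (0.65, 0.49, 0, 1),
]

# Deterministic start: seed the two centroids with the first and last toon.
labels = kmeans(points, [points[0], points[-1]])
print(labels)
```

Seeding the centroids by hand keeps the toy example repeatable; a real package would normally pick its own starting points.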
But it’s a whole lot easier to show you than to tell you, so I’ll leave you with a simple illustration of how that all works. We can add a third dimension to the graph by using colour. The datamining packages that I’m playing with are very good at that sort of visualization, as you can see here.
With the spellpower toons gone, the high mana group has also mostly gone and the shape of the blue cluster has become clearer as the graph scale has changed. Then we overlay, say, a cat glyph:
and a bear glyph:
and the clusters within the cluster become pretty clear. Thanks again to Narkondas for the key clues that inspired those graphs.
The datamining algorithms will generate a count of the toons in each cluster, but I’ll leave that till the next post. As you can imagine, though, with a big clump of ferals filtered out, the percentage of bear tanks in the overall mix is getting smaller.
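For what it's worth, turning cluster labels into counts and percentages is only a few lines of Python. The labels below are invented placeholders, not results from the real data set.

```python
from collections import Counter


def cluster_shares(labels):
    """Return each cluster's share of the sample as a percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Invented labels, one per toon, as a clustering run might assign them.
labels = ["cat"] * 6 + ["bear"] * 4

counts = Counter(labels)
shares = cluster_shares(labels)
print(counts)
print(shares)
```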
I should also say that I’m about to collect a new data set and update my armoury reports since the data is getting a bit old and stale. As usual that will take a week or so.