September 26, 2008

OK… the data mining train is leaving the station at last. My first set of posts will be basic stats for all the classes. You will see the information on the class-specific pages linked in ‘Pages’. The initial versions of these pages will all be generated from a test database that I’ve built over the past day or so. It only contains about 50K characters, drawn 50:50 from the US and EU armouries. Of course that is just a drop in the ocean but have no fear, my goal is to scan the whole of both armouries. And soon too; a certain Lich King is probably going to do some serious damage to my lovely XSLT in the not-too-distant future.

At the moment I want to write the key queries that I need against this database schema, just to test that… well… we can get some actual information out of the damn thing. If all goes well, I should be able to start the full scan in the next week or so. My armoury crawler does about 5000 characters per hour so I’ll leave the calculation of how long a complete scan might take as an exercise for the reader.
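For anyone who would rather not do the arithmetic by hand, here is a minimal sketch of the calculation. The only figures taken from this post are the crawl rate (about 5000 characters per hour) and the size of the test database (about 50K characters). The function name `scan_hours` is made up for illustration, and the total size of the two armouries is deliberately left as a parameter because I have not stated it here.

```python
def scan_hours(total_characters: int, rate_per_hour: int = 5000) -> float:
    """Hours needed to crawl `total_characters` at a steady crawl rate.

    The default rate is the ~5000 characters/hour quoted in the post.
    Assumes the rate stays constant, which a real crawl may not manage.
    """
    if rate_per_hour <= 0:
        raise ValueError("rate_per_hour must be positive")
    return total_characters / rate_per_hour


def scan_days(total_characters: int, rate_per_hour: int = 5000) -> float:
    """Same estimate expressed in days of round-the-clock crawling."""
    return scan_hours(total_characters, rate_per_hour) / 24


# The ~50K-character test database, at the quoted rate.
test_db_hours = scan_hours(50_000)
```

Plug your own guess at the combined armoury size into `scan_days` to see why a single crawler at this rate is going to need some patience.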

The WotLK event horizon also has to be taken into account. It seems worth doing a full pre-WotLK scan, for its value as a baseline if nothing else. The interesting question is when to start the post-WotLK scan. Still thinking about that one…

One thing that is becoming clear is that the job of turning the data into useful information is going to be the rate-limiting step. These free blog sites are probably OK for your basic gossiping teenager, but as high-performance database reporting tools they leave something to be desired. Also I’m having trouble locating a simple database report generator that can crank out clean, valid HTML. And it’s not a matter of money; I have access to tools like Crystal Reports and Billg’s SQL Server reporting tools. But the lunacy that gets generated when you hit the ‘Save As…’ button has to be seen to be believed. Sigh.

So, a list of reporting priorities. The questions that interest me are these:

  1. What are the stats values (health, mana, armour, crit etc.) to be expected from an above-average character of each class?
  2. How many imba characters exist in each class and what did they do to get there? I’m expecting that this will lead to gear, gem and enchant lists.
  3. What are the most popular talent builds across the classes? What are the talent builds of the above-average characters?
  4. What are the characteristics of an x9 twink (i.e. 19, 29, 39 etc.) for each class?
  5. What are the differences between raiders, general PvE and PvP-oriented characters?
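
As a rough illustration of what a query for question 1 might look like, here is a minimal sketch using Python's built-in `sqlite3` module. Everything specific in it is an assumption: the table and column names (`characters`, `char_class`, `health`) are hypothetical and are not the real schema, the rows are toy values rather than armoury data, and "above average" is read as "above the class mean", which is only one possible definition.

```python
import sqlite3


def make_toy_db() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical `characters` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE characters (char_class TEXT, health INTEGER)")
    # Toy rows for illustration only; not real armoury data.
    rows = [
        ("mage", 8000), ("mage", 9000), ("mage", 10000),
        ("warrior", 10000), ("warrior", 14000),
    ]
    conn.executemany("INSERT INTO characters VALUES (?, ?)", rows)
    return conn


def above_average_health(conn: sqlite3.Connection) -> dict:
    """Per class: (mean health, mean health of characters above that mean)."""
    query = """
        SELECT c.char_class,
               AVG(c.health) AS class_mean,
               AVG(CASE WHEN c.health > m.mean_health
                        THEN c.health END) AS above_mean
        FROM characters AS c
        JOIN (SELECT char_class, AVG(health) AS mean_health
              FROM characters
              GROUP BY char_class) AS m
          ON m.char_class = c.char_class
        GROUP BY c.char_class
        ORDER BY c.char_class
    """
    return {cls: (mean, above) for cls, mean, above in conn.execute(query)}


stats = above_average_health(make_toy_db())
```

The same shape of query should carry over to mana, armour, crit and the rest by swapping the column, and a percentile cut-off could replace the mean if "above average" turns out to need a stricter definition.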

All that and more lies buried in the armoury. Let’s see how much of it we can dig out.


