blog status

February 12, 2009

If you’re not a geek, you may not know that the most popular web server is called “Apache”, or so the popular story goes, because the developers kept updating it with patches (a patchy server – geddit?).

Blizz seems to be moving towards an Apache MMO at the moment, which is p*ssing me off somewhat. My plan is to keep this site up-to-date by doing regular fresh scans. But the trick is to pick the right time to do the scan – there has to be enough time for players to experiment with the nerfs and buffs in a patch and form new opinions on talents, gear etc.

I do have a scan made after 3.0.8 hit, but I’m not happy with the quality of the data – it was done too soon and doesn’t fully capture the changes Blizz made to re-buff some of the talent trees that were massacred by a certain Lich King. I’m using it to update some reports, but only those unlikely to be greatly affected by the patch.

I’d be doing a fresh scan just about now to get a better view of the impact of 3.0.8, except that 3.0.9 is now upon us, with 3.1 rumoured to be not far behind it.

So I think I’ve got no choice but to hold back and do a fresh scan to capture the 3.0.9 changes before updating this site again.

In the meantime, don’t forget that there is another armoury data mining site: Armory Musings. He’s got later 3.0.8 data which is more likely to reflect the current state-of-play with talents and builds – which is where the nerf wars seem most violent at the moment.


2 Responses to “blog status”

  1. brogthar Says:

    First, I would like to say that I really like your site.
    I work as a database engineer, and I’m very interested in data mining. I intended to do something similar to what you do here. By happenstance I found your site. Now I don’t have to do that, unfortunately.

    I’d be interested to know how long your armory scan takes, and which criteria you use for the scan.


  2. zardoz Says:

    Well, if you are interested in datamining, there are still plenty of things that can be done with the armoury data beyond what I do here. Leave an email address, if you like, and we can talk about ideas.

    I scan both the US and EU armouries, using a threaded crawler, and get a throughput of about one character per second. I usually leave the scanner to run for about two weeks and so end up with something above a million characters. I use a random sampling algorithm – I don’t aim to scan every toon. You can read about my algorithm here. If the sample is random, we can draw general conclusions – the same principle as a political poll or market survey.
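
The numbers above can be checked with a rough sketch. This is not the crawler or the sampling algorithm described in the comment; it only redoes the throughput arithmetic and shows why a random sample of that size supports general conclusions, in the same way a poll does. The function names, the 95% z-value of 1.96, and the toy population are all assumptions made for illustration.

```python
import math
import random

SECONDS_PER_DAY = 24 * 60 * 60


def expected_sample_size(chars_per_second, days):
    """Characters collected at a steady throughput over a number of days."""
    return int(chars_per_second * days * SECONDS_PER_DAY)


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a random sample.

    Uses the normal approximation, the same one quoted for opinion polls.
    proportion=0.5 is the worst case (widest margin).
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


def estimate_share(population, sample_size, predicate, seed=0):
    """Estimate the share of a population matching predicate from a random sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(population, sample_size)
    return sum(1 for item in sample if predicate(item)) / sample_size


if __name__ == "__main__":
    # One character per second for two weeks, as described in the comment.
    n = expected_sample_size(chars_per_second=1, days=14)
    print("expected sample size:", n)
    print("worst-case margin of error:", margin_of_error(n))

    # Toy population (hypothetical): 30% of characters are tagged "mage".
    toons = ["mage"] * 30_000 + ["other"] * 70_000
    share = estimate_share(toons, 10_000, lambda c: c == "mage")
    print("estimated mage share:", share)
```

Two weeks at one character per second is 1,209,600 characters, which matches the "something above a million" figure. The caveat is the one the comment already makes: the margin only means something if the sample really is random.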
