Saturday, June 28, 2008 11:57:57 PM (GMT Standard Time, UTC+00:00)

I'll get the code stuff out the way first, by saying that I've been setting up the community sites for Scrobbles, and I've written most of the .NET SDK around the web services. That's cool.

Last night (Friday) saw a few of us heading out to Koko to see that band I liked a couple of weeks ago (Die Die Die), and I had quite a good time dancing and rolling over the floor with people in some crazy chaotic action. The people with me didn't enjoy it all that much sadly, as the crowd just wasn't their thing (and Angela is sober for a month, ouch!).

Ah well, we decided to salvage the evening by walking from Camden to Paddington at 1:30am. That was a lot of fun, we walked around the park, over the canal, found a mattress to bounce on, and then on reaching Edgware road, on what should have been a simple walk to Paddington, got distracted by some guy who shouted at us and we thought was going to attack us. Happily this did not happen, but because of the distraction we took the wrong road and ended up buying ice cream from a service station. We got home at 5am and decided that the walk was probably the best part of the entire night. It was much fun.

Tonight, oh man. Tonight.

I went to see My Bloody Valentine at Manchester Apollo. I am currently lying in a Travelodge bed and feeling slightly overwhelmed.

It was a pleasure to see MBV live, it was a pleasure to hear all that stuff that I listened to when I first started getting into music. It was so awesome hearing some decent shoegaze music live, although it made me wish that perhaps Neil Halstead would get the gang together and reform Slowdive (oh pleaaase, modern day Shoegaze revival!!).

I am so glad I bought earplugs last week. So so glad. At first the music was just music, it was loud, but no louder than anything else I've ever been to. I wore the earplugs anyway, because that is what I have started doing at all live music, as I've noticed it making the music even better to listen to. Anyway, the set ended with a gorgeous fifteen minute long feedback loop. It felt and sounded like being in a jet engine, I have never experienced anything like it. People were backing away from the stage rapidly with their hands over their ears and bouncers were seen running around handing out earplugs to those who didn't have the foresight to pick them up on the way in.

Oh it was absolutely amazing.

It was worth the trip up to Manchester, and my only lament is that I'm not going to be seeing them again anytime soon. If you ever get a chance, do it for the love of all that is holy.

 

Friday, June 27, 2008 1:21:08 AM (GMT Standard Time, UTC+00:00)

Whew.

Spent this evening firstly trying to get PHP to play nicely with my nice Windows 2003 Server, and then installing PHPBB on top of that. Once that was done, I wanted to tie together PHPBB with my current authentication system. And that is where the fun really started.

There isn't a lot out there on writing authentication modules for PHPBB, most people seem to write plug-ins for other web software to authenticate against PHPBB rather than the other way around. I decided therefore that the best way would be to take the LDAP plug-in provided with PHPBB and rip the relevant bits out, replacing them with a SOAP web service call to my existing authentication system.

That's right, I'm authenticating across a web service instead of going directly into the database. I couldn't be faffed playing about with the hashing method I use in the ASP.NET application for salting the passwords and storing them in the database, so I decided to use my existing .NET code to do the legwork for me.

It wasn't actually that hard in the end, although I came across a few problems that I should probably document in case I ever have to try this again.

  1. Creating a SoapClient around my ASP.NET WSDL endpoint - remember to add ?wsdl to the end of the url for the web service, so that the SoapClient actually gets the wsdl instead of the html placeholder...
  2. Calling into the Webservice method.
    • $authToken = $Client->Login($username, $password); did not work
    • $authToken = $Client->Login( array( 'username' => $username, 'password' => $password, ) ); did work. I don't know why this is the case, as the docs didn't demonstrate this way of calling the method.
    • My web service method returns a 'string', but in PHP this is represented as a standard stdClass object, meant to deal with potentially complex return types from the web service. The actual string can be found in $authToken->LoginResult - go figure...

  3. In PHP, the md5 method returns a lower-case hexadecimal string. In .NET, most examples tend to use the format string "X2", which creates an upper-case hexadecimal string. Wrapping up the password hash with strtoupper before passing it to .NET solved this.
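
The case mismatch in point 3 is easy to reproduce outside PHP and .NET. Here is a minimal sketch in Python, used purely for illustration (the password value is made up): `hexdigest()` gives lower-case hex like PHP's md5, while formatting each byte with an upper-case hex specifier gives the same shape of string as "X2" does.

```python
import hashlib

password = "hello"  # made-up value, for illustration only

digest = hashlib.md5(password.encode("utf-8"))

# Lower-case hex, the same form PHP's md5() returns.
lower_hex = digest.hexdigest()

# Upper-case hex, built byte by byte the way an "X2" format string would.
upper_hex = "".join(f"{byte:02X}" for byte in digest.digest())

# A plain string comparison fails until the case is normalised.
print(lower_hex == upper_hex)
print(lower_hex.upper() == upper_hex)
```

Normalising the case on either side works; the important part is that both systems agree on one form before comparing.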

The actual process of writing the authentication plug-in couldn't be simpler using the LDAP plug-in as a base. Simply take the username and password, and attempt to authenticate against the web service. If this fails, then try going through the database directly. If it succeeds and the user doesn't exist in PHPBB, then tell PHPBB to create the user, storing the password as a PHPBB hash and retrieving the user's details from the web service. If the user already exists, then just make sure the password is up to date and carry on.
This way, if the web service goes down due to Scrobbles being updated or whatever, my users can still log in and complain on the forums. Happy days.
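
The flow above can be sketched roughly as follows. This is Python rather than the actual PHP plug-in, and everything in it is an assumption made for illustration: `login`, `hash_password`, `local_users` and `remote_login` are hypothetical names, the SHA-256 hash is only a placeholder for whatever the forum really uses, and a service outage is treated the same as a failed remote login.

```python
import hashlib
from typing import Callable, Dict, Optional


def hash_password(password: str) -> str:
    # Placeholder hash for the sketch; not the forum's real hashing scheme.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def login(
    username: str,
    password: str,
    local_users: Dict[str, str],
    remote_login: Callable[[str, str], Optional[str]],
) -> bool:
    """Return True if the user may log in; local_users maps name -> password hash."""
    try:
        token = remote_login(username, password)
    except ConnectionError:
        token = None  # service unreachable, handled like a failed remote call

    if token is not None:
        # Remote success: create the local user, or refresh the stored hash.
        local_users[username] = hash_password(password)
        return True

    # Remote failure: fall back to the locally stored hash.
    return local_users.get(username) == hash_password(password)


def service_down(username: str, password: str) -> Optional[str]:
    raise ConnectionError("web service unavailable")


users: Dict[str, str] = {}
print(login("rob", "secret", users, lambda u, p: "token-123"))  # remote succeeds, user created
print(login("rob", "secret", users, service_down))              # service down, local fallback
print(login("rob", "wrong", users, service_down))               # fallback rejects a bad password
```

Because the local hash is refreshed on every successful remote login, the fallback only ever lags behind by one password change.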

Thursday, June 26, 2008 4:58:07 PM (GMT Standard Time, UTC+00:00)

This is the list as it stands for things I need to do before release.

  • Core Scrobbles
    • Friendly URLs
    • Referral system for 'earning' queries
    • Embedding Data From Third Party Sites
    • Compressed Raw Data Queries
    • Compressed Batch Data Submission
    • Registration System
    • Arbitrary Views (using Snippet System)
    • Automatic Data Submission (Scrobbles App)
    • Security/Validation of All Existing Forms
  • Third Party
    • Online submission of WoW Data
    • Heatmaps (location tracking)
  • Community
    • Wiki
      • SDK for PHP and .NET (wrappers around system)
      • Service Documentation
      • Snippet Documentation
      • Family/Key/Value Documentation
      • Sync with Scrobbles DB
    • Forums
      • Sync with Scrobbles db
      • Requests etc

Seems I mostly just have housekeeping tasks left, so I'm going to run through a few of them tonight.

I have a pile of ideas for lots of 'small' projects once I have this completed and I'm anxious to get started on them. I think longterm, I'll make more money from the small projects per hour put in.

Scrobbles may end up being successful and that will be great, but if it's not, it will make a great technical portfolio item, and that was the reason I started it in the first place.

The small projects are to make money, and by spreading my time over many projects with smaller scope and smaller time input required, I hope to make something of them. Can't wait for the rest of this year to have rolled by. I'm feeling really motivated and can't believe my head can actually generate this many ideas!!

Wednesday, June 25, 2008 1:01:40 AM (GMT Standard Time, UTC+00:00)

Well well well.

Radiohead totally deserve all the praise they get. What a performance.

I turned up early, before the gates opened, and headed right down the front before lying down in the grass and keeping my spot whilst listening to music and not drinking too much liquid so I could stay there all night. (Hey, if you're competing with 50,000 people in a crowd, may as well do it properly).

Bat for Lashes were... interesting. Absolutely appalling and uninspired lyrics, but a lovely ghostly sound, with some interesting moments in it. Her voice really doesn't quite make the grade on some of the songs she tries to sing, but I guess not everybody can be Bjork.
The performance was slightly marred by the power cutting off three quarters through the set. I shouted out that they should dance for us while they waited for the power to come back on and they obliged whilst playing some tribal drum beats. Cool stuff. I'll check their album out.

Radiohead... Thom is so entertaining, he's like a hyperactive monkey or something. They played some lovely tracks from all over their discography, (excluding Pablo Honey of course). I was pleased to hear tracks from Amnesiac and Kid A getting a decent airing.

Highlights for me were hearing Dollars and Cents, The Gloaming (oh Wow, what a track when played live), There There, Bangers and Mash and Idioteque. Everything else was fairly amazing too. I sometimes forget that Radiohead seem to have managed to make all of these classics and are therefore able to put together a two hour set out of nothing but amazing material, and still have lots of spares.

Special mention must be made of Thom coming on and doing a solo rendition of Cymbal Rush from his album The Eraser. It sent shivers down my spine and brought tears to my eyes a little.

I don't want to do big gigs like this very often, at the start of the set I was contemplating NEVER doing it again, I was contemplating NOT going again tomorrow (I have a ticket). People are idiots, pushing and shoving each other out of the way, being rude and generally quite selfish.
I like the small gigs that I go to, where if that happens, it happens on a much smaller scale and as a group people can do something about it. I like being able to hear the actual performers sing, instead of having a 50,000 strong choir. (Although some songs were made awesome with that effect).

However, Radiohead won out and by the end of the set I was just amazed at how brilliant everything sounded. The crowd had settled down and I was quite happy stood at the front and zoning out to the rather excellent music. Not bad considering I wasn't really that excited about it for some reason.

Tomorrow, I'll probably not head into the fray, and I'll probably not go there as early. I'll be quite happy lying in the field on the outskirts of the crowd, and just soaking up the summer evening clouds, and the atmosphere of it all. Beautiful.

PS: I recommend these, I was so glad to have them with me tonight and I'm never going to go to a loud gig down the front without them ever again. You can still hear everything perfectly - better even, as your ears aren't being pounded with an overload of information. No further hearing damage for me thanks.

-----------------------

'15 Step'
'Bodysnatchers'
'All I Need'
'The National Anthem'
'Pyramid Song'
'Nude'
'Weird Fishes/Arpeggi'
'The Gloaming'
'Dollars And Cents'
'Faust Arp'
'There There'
'Just'
'Climbing Up The Walls'
'Reckoner'
'Everything In Its Right Place'
'How To Disappear Completely'
'Jigsaw Falling Into Place'
'Videotape'
'Airbag'
'Bangers + Mash'
'Planet Telex'
'The Tourist'
'Cymbal Rush'
'You And Whose Army'
'Idioteque'

Apparently.

There have been some complaints that Thom didn't "interact with the audience very much". Fuck that shit, they only had a couple of hours and they were determined to knock out as much music as possible. Damn straight.

Tuesday, June 24, 2008 10:53:41 AM (GMT Standard Time, UTC+00:00)

Last week on a rather long train journey to Morecambe to see British Sea Power, I decided to write the next layer underneath the top level querying system: a cache of 'collections' of individual common queries, whose results could then also be queried in order to speed things up.

This has been planned since the original xml based querying system, but I had decided that writing it on top of that system would be like slapping a band aid on a broken neck. When something is that unworkable, trying to speed things up just by adding additional caching levels is a lost cause.

Happily, also on the train I wrote yet another version of the core querying system, which used a ridiculous level of nested inner joins to whittle down arbitrary data into a workable form. This is entirely counterintuitive, as you're told academically that joins are expensive, and nesting queries is expensive - and that by combining them you must be in for a world of pain. Imagine my surprise, on running the query through SQL Server 2005's rather excellent query visualiser, to find that the resulting path was not only incredibly simple, but also incredibly fast due to the rapidly diminishing size of the data.

Because of the analytics code I already had written, it was very easy for the initial inner statement to cut down the size of the data by a huge amount in one simple query. Quite a lot of the core queries tend to revolve around single keys and single values (for example, "get me all the keys/values where KEY=WoWActionName and VALUE=MINING"). Caching the results of such a query, and even of subsequent queries, means that part of the overall nested query can be replaced with ("get me all the keys/values where the query = <>").

Running some trials, with the twenty snippets I have written for testing purposes, resulted in just 10 actual unique cached queries. It is important to note that in the original system I was filtering by date/time in the inner-most query - resulting in a unique query for every single page, and in this version I moved the date/time filter to the outermost query. This meant the results of this query could be re-used across all pages generated for that user.

This cut down the time taken to generate a typical 'all-time' page from 50-60 seconds, to 3-5 seconds. A rather massive increase of performance. (The fifteen seconds listed below wasn't when doing it concurrently)

These cached queries could then be refreshed or culled periodically using the background windows service I have written for Scrobbles, and most importantly rather than these resource intensive queries happening on different threads and really killing the server from over-zealous usage of RAM and therefore over-zealous paging and hard disk thrashing, I can keep them on a single pipeline until I have a server that can deal with parallelising the whole shebang.

I foresee this not being enough when I have a huge amount of data in there for each user, the amount of data generated for an eight hour session of World of Warcraft is immense, and I only probably have sixty hours of data in there myself. I'll be able to add date and time to those cached queries later, and combine the daily results across a month, and the monthly results across a year and the yearly results across 'all time'. So I'm not worried about that.

I ran some tests on my laptop on the train, and was able to generate 2000 pages (all the pages possible for my single user over a year) in under ten minutes. This was using a threadpool to generate pages concurrently, as that is how it will happen once on the internet. At any one point during this, thirty pages were being generated simultaneously and taking seconds to complete. SQL Server's RAM usage didn't even get above 300MB so I'm fairly confident this will work for my first few users. (Going forward!).
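
For anyone curious what that kind of test looks like, here is a rough sketch in Python (the real thing is .NET, and `generate_page` here is just a stand-in for the actual queries and rendering). The pool size caps how many pages are in flight at once.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_page(page_id: int) -> str:
    # Stand-in for the real work (queries plus rendering).
    return f"page-{page_id}"


page_ids = range(2000)

# At most thirty pages are being generated at any one time.
with ThreadPoolExecutor(max_workers=30) as pool:
    pages = list(pool.map(generate_page, page_ids))

print(len(pages))  # 2000
```

`pool.map` returns results in the same order as the inputs, which makes it easy to check that every page was produced.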

I appreciate the above is probably quite hard to follow, so here is a picture of the whole process - showing how each process has been decoupled to allow parallelisation as required in the future.

In other news, life is really busy at the moment and Scrobbles is having a week or so of hiatus. This weekend saw me at the Natural History Museum watching British Sea Power (again, yes I know), culminating in me lying on the stage drunkenly trying to play the cornet which I had 'borrowed' from the band, tonight sees me at Victoria Park for Radiohead, as does tomorrow, Friday sees me at Koko seeing a band I saw a few weeks ago called "Die Die Die" (Awful name), but they were quite cool, and Saturday sees me up in Manchester seeing My Bloody Valentine, the legends that they are.

Yes, I have purchased and am using ear plugs...

Monday, June 16, 2008 1:45:17 AM (GMT Standard Time, UTC+00:00)

There we go, all that effort has paid off.

Time taken to generate a single 'one day' page of even the more complicated Scrobbles info is down to under a second, and down to about fifteen seconds for an 'all time' page view. (I cache these with quite a high longevity - think of the single default view that last.fm gives you and how often that updates..). On the old system with the current amount of data, this was going into the 'several minutes' for even a single day page. (Yeah, XQuery not so good...)

Getting the right balance of normalisation was key to this success, and I have written a lot more code than I would have originally liked to, most of it not being used in the final solution. Some of it quite experimental and rather cool though - and perhaps the key to future attempts in optimising Scrobbles. One of my solutions worked out 'groups' of keys which could cut the data for most requests down by over 90%, to generate permanent tables for crunching data into - and this may end up being a good way to go if my current solution doesn't last the distance. With the data for the entire year, this algorithm only generated ten tables - and I was able to index the entire year's data through this process in under five minutes so it was quite scalable.

That was a bit hard to integrate with the query system as it stood however, so I'm bypassing it and going straight to the core data store for now.

Should be grand though, the background service can happily be constantly ticking over the 'all time' pages on a low priority (Well of course I have a priority queue based system!), and generating the one-day views as they are requested - and perhaps some sort of balance over the monthly views and yearly views based on what month or year it currently is.
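
A priority queue for this sort of thing doesn't need to be complicated. Below is a minimal sketch in Python using the standard `heapq` module; the two priority levels and the page names are made up for illustration and aren't the actual Scrobbles implementation.

```python
import heapq
import itertools

ON_REQUEST = 0  # one-day views that somebody is waiting for
BACKGROUND = 1  # 'all time' pages that just tick over

_queue = []
_order = itertools.count()  # tie-breaker: equal priorities stay first-in, first-out


def enqueue(priority: int, page: str) -> None:
    heapq.heappush(_queue, (priority, next(_order), page))


def next_page() -> str:
    return heapq.heappop(_queue)[2]


enqueue(BACKGROUND, "user-a/all-time")
enqueue(ON_REQUEST, "user-a/2008-06-15")
enqueue(BACKGROUND, "user-b/all-time")

order = [next_page() for _ in range(3)]
print(order)  # ['user-a/2008-06-15', 'user-a/all-time', 'user-b/all-time']
```

The requested one-day view jumps ahead of the background work, and the background pages still come out in the order they were queued.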

At say, three pages per user, and 5 views to be updated constantly, that's fifteen views taking about 120 seconds in processing time altogether. I could still update every single user's page more often than last.fm does (for its users' music pages) and support 500 users on the rather underpowered server this system currently sits on. Not that I would of course, because there is little point in updating pages unless people look at them sometimes.

Lots more work to do yet on making things even faster - but with a firm database design to now stand on, I feel a bit more confident about pushing ahead with the real development of the system.

Saturday, June 14, 2008 6:43:12 PM (GMT Standard Time, UTC+00:00)

Just a little whoopsie. I knew I had a small bottleneck somewhere in the code where I 'crunch' and validate the pending data submitted by users. It was taking 100ms for each item submitted and that is just not acceptable at all when you're hoping to support a few hundred users at least on the current machine.

I'm currently re-designing the database so it can build itself over the course of page requests, because the page request speed is slow. But I decided to give the crunching process a quick profile because there was no way it should be taking so long per item!

Whoops, it turns out that every single item was being followed by a "DELETE FROM [X] WHERE [Id] = Y" statement. Instead of storing up my deletes until the end and doing one delete statement to delete 100 items, I was doing 100 delete statements and boy are they expensive!!

Problem solved and with the new indexing system in place, it's down to taking 6ms per item crunched. 100ms to 6ms - there is no real contest here is there? The moral of the story? If you think there is a problem in performance, don't just hope you'll find it later on - get the profiler out and see why it is being so slow - it's quite motivating, if only a little embarrassing.
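
The shape of the fix, sketched in Python against an in-memory SQLite database rather than SQL Server (the `pending` table and its columns are made up for illustration): collect the ids while crunching, then issue one delete for the whole batch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pending (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO pending (id, payload) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 101)],
)

processed_ids = []
for item_id, payload in conn.execute("SELECT id, payload FROM pending").fetchall():
    # ... crunch and validate the item here ...
    processed_ids.append(item_id)  # remember the id instead of deleting straight away

# One parameterised statement for the whole batch, not one DELETE per item.
placeholders = ",".join("?" for _ in processed_ids)
conn.execute(f"DELETE FROM pending WHERE id IN ({placeholders})", processed_ids)
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM pending").fetchone()[0]
print(remaining)  # 0
```

Databases cap the number of parameters per statement, so very large batches would need to be deleted in chunks.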

Just as a footnote, that means my six hour migration process outlined below now takes just under 20 minutes. 20 minutes to crunch a year's data for a single user? If only I could get page generation this fast... now there is something to push towards I guess.

With novice mistakes like this still being present in my database code, it's not beyond the stretch of my imagination to think that it might be possible.

Saturday, June 14, 2008 1:42:23 PM (GMT Standard Time, UTC+00:00)

My checklist has taken a hit this week, with a busy social calendar (organised by me a couple of months ago in the past apparently).

Not only that but I hit some technical issues after I migrated the server-side install of the Scrobbles system across to a new version, along with the database (a process that took over six hours, there being now over a year's data in there). A migration that was supposed to make the whole system a lot more scalable and manageable in the future. De-coupling page requests from page generation and making a whole load of the resource intensive operations parallelisable.

The new system also removed the data-loss incurred by using discrete blocks of time, and moved the system across to an entirely continuous method of time-based data storage. And most importantly was meant to make it a lot easier to form complicated queries by storing a lot of this data in the original XML format, for querying using the XQuery support built into SQL Server 2005. I thought that there was no way I could do anything better than this with my limited knowledge, and that SQL Server's magic black box would just index this data and keep things fast and nifty for me.

In my trials, this seemed valid, and page generation was no faster or slower than the previous iterations of the querying system. On migrating across a larger data set, SQL Server started eating ridiculous amounts of RAM and CPU, before finally giving up in a big heap. Back to the drawing board again, for probably the fourth time.

It should go without saying that I still think the XQuery support in SQL Server 2005 is fantastic, and I think that the solution was simply not compatible with my needs. It was educational to play with however, and I can think of a few projects which would benefit from having the masses of XML on the hard drive transferred into a database which can then do the hard work of actually querying it for information.

I downloaded a full copy of the working database for testing and set about trying to index this massive amount of data myself. Many conversations were had with different people with varying levels of expertise - my colleagues (Karsten and Pat) were helpful over coffee and a pad of paper, and we came up with a new design which should hopefully have been more efficient. I also had quite a few conversations with Paul Evans who let me know of the oh so many potential pitfalls before I even began work on the new prototype system. Sadly, these pitfalls seemed to pop up all too soon and my query graphs were soon looking just as complicated, if not more complicated than the original XML driven attempt.

I think I've finally come to a proper solution now, which involves creating tables on the fly to fit the needs of the system as it evolves. This is a scary solution for me, as it gives away some of the intelligence of the database design to an indirect process rather than directly from me. It also increases the complexity of the code by quite a bit - and I was hoping by just having the one-size fits all database solution to avoid that. 

Sadly in the world where speed and efficiency counts more than anything else, and as I was originally warned at the start of the week, this seems to be quite a standard compromise in the world of database design.

Sunday, June 08, 2008 3:03:06 PM (GMT Standard Time, UTC+00:00)

The last one was starting to get a bit busy, and I was starting to panic because I'm never going to release if I have to work my way through them all.

I've split it up into things I need to do before I release Scrobbles, and those that can wait until it's out the door. For one thing, I should stop assuming that my potential users are going to want things, and should really pass the lead onto them. Unless of course nobody wants to use Scrobbles, in which case I'll either start working on one of my many other ideas, or I'll push Scrobbles development in my own direction until I have what users will actually want off me.

I also realised that developing the snippet editor was going to take me a long time, not through technical difficulty, but because designing something that wasn't harder to use than simply writing some XML was actually quite challenging. It means fewer people will be able to create snippets if I don't do it, but I'm not looking at wanting a particularly large number of users right now anyway, just some - so I can start learning off them.

Focusing on the important list, and developing a good pitch so the World of Warcraft users understand what Scrobbles is all about should be my priorities. With the enormous scope that Scrobbles has, I run the risk of overcomplicating the description and confusing people, and if people don't know what software does, they won't use it.

Before Release
-----------------------------------------------

  • Core Scrobbles
    • Friendly URLs
    • Referral system for 'earning' queries
    • Embedding Data From Third Party Sites
    • Compressed Raw Data Queries
    • Compressed Batch Data Submission
    • Registration System
    • Sort the database out [fast fast fast]
    • Arbitrary Views (using Snippet System)
    • Ajax-Driven Page Requests
    • Windows Service for Data Generation
      • Page Generation Queue
      • Pending Data Crunching
    • Automatic Data Submission (Scrobbles App)
    • Security/Validation of All Existing Forms
  • Third Party
    • Online submission of WoW Data
    • Heatmaps (location tracking)
  • Community
    • Wiki
      • SDK for PHP and .NET (wrappers around system)
      • Service Documentation
      • Snippet Documentation
      • Family/Key/Value Documentation
    • Forums
      • Requests etc


Post Release
-----------------------------------------------

  • Core Scrobbles
    • Javascript Snippet Editor
    • Create an installation manager (DOH, re-invent wheel!?)
    • XML based 'module' installation/uninstallation 
    • Filterable install list 
    • Rollback on fail
    • Create click-once installer for the installation manager
    • Cache At Query Level (in DB)
      • Add cached query pre-generation to Windows Service
    • Crunch data on commonly used keys, primary, secondary, tertiary (speed up queries)
    • WoWWebStats Emulation
  • Logistics
    • Finance of new server
    • Add a 'status' to each account for payment info
    • Provide the means with which to easily pay (Paypal/Google Checkout/Direct CC/Phone??) 
    • Limit the number of queries each user is allowed and make this dependent on account status 
    • Third party limitations (payment system too?), to prevent DoS attacks
  • Third Party
    • World of Warcraft Blogging 
      • Develop automatic user creation based on Scrobbles data
      • Choose a decent template for advertising revenue
      • Architect a data-driven pipeline for generation of blog posts based on:
        • Location 
        • Activity 
        • Player Character 
        • Party 
        • ???
      • Develop background process to generate blog posts
      • Use the World of Warcraft Armory to populate character profiles 
      • Add capability for characters to automatically post comments on each other's blogs for purposes of hilarity
    • World of Warcraft Avatar/Signature Generation
    • Embeddable Widgets into blogs/etc
    • Generate Vista Gadgets from form
       

So, there we have it. I'm not going to copy and paste this entire list again - I'm going to focus on the things I have to do in order to release the product, and then I'll revisit the second list.

Saturday, June 07, 2008 8:49:17 AM (GMT Standard Time, UTC+00:00)

I did quite well this week to get as much done as I did (In my opinion).

My scrobbles page shows quite an improvement in the amount of code being written anyway.

Here is what I have left - I'm going to work on the Javascript based snippet editor this weekend, as that would be a massive win to achieve. After I've done that, I'll look at replicating those stats found at WoWWebStats and probably come up with some easily implementable features to bring the user experience more in line with what the users expect.

  • Javascript Based Snippet Editing/Creation (MAJOR TASK)
  • World of Warcraft Scripts + Research  (WWStats)
    • Replicate their stats using the snippet system (Modifying Lua as needed)
  • World of Warcraft Data Submittal
    • Create an online page (using the third party data API) to allow online data submission for World of Warcraft (IE: without using the client)
  • Write a background service for all data crunching
    • Asynchronous Page Generation
    • Pending Data Crunching
    • Cached Query Generation
  • Snippet format work
    • Add capability for third party websites to insert data into snippets (using public services)
  • Generic Server Work
    • Work out how to finance the purchase of a new server
  • Security + Validation
    • Go through all pages and check all user-input for limits/etc [Make sure automatic validation is turned off]
    • Validate postbacks for modifiable data and security concerns
  • Logistics
    • Add a 'status' to each account for payment info
    • Provide the means with which to easily pay (Paypal/Google Checkout/Direct CC/Phone??)
    • Limit the number of queries each user is allowed and make this dependent on account status
    • Third party limitations (payment system too?), to prevent DoS attacks
  • World of Warcraft Automated Blogging
    • Develop automatic user creation based on Scrobbles data
    • Choose a decent template for advertising revenue
    • Architect a data-driven pipeline for generation of blog posts based on
      • Location
      • Activity
      • Player Character
      • Party
      • ???
    • Develop background process to generate blog posts
    • Use the World of Warcraft Armory to populate character profiles
    • Add capability for characters to automatically post comments on each other's blogs for purposes of hilarity
  • Online Community
    • Populate Wiki with 'general' information
    • Populate Wiki with stat family documentation
      • Write a script to do this automatically from the Scrobbles database.
    • Populate Wiki for API documentation
    • Create forums for snippet requests/application request
  • Make client application for automatic data submission stable (One is already written, it just needs a lot of work!)
    • Automatic World of Warcraft upload (Make this more atomic)
    • Create an installation manager (DOH, re-invent wheel!?)
      • Elevated installer process
      • XML based 'module' installation/uninstallation
      • Filterable install list
      • Rollback on fail
    • Create click-once installer for the installation manager

Friday, June 06, 2008 6:18:07 PM (GMT Standard Time, UTC+00:00)

Just a small thought that just occurred to me.

I was just creating a quick form, where I pull a list of inputs from the database in to generate a form of those inputs and their current values. In order to do this, I write some code which creates a load of controls and adds them to the ASP.NET form dynamically during form creation. This is something that I do quite frequently, and it has become second nature to separate the creation of the controls from the setting of the values in those controls, and to ensure that whilst all the controls are created during a postback, that the values are not set. (As they'll be set by the rather clever ASP.NET postback mechanism).

This does however mean that on a postback (When the submit button - or even cancel button is clicked), I am pulling the values from the database and creating a pile of controls so that I can cycle through them and retrieve their new values before updating the database.

Over the past week I have been allocating more and more work to the client via Javascript, and performing a lot of actions asynchronously - where it makes sense to do so, and I had a sudden "woah" moment when I caught myself starting to write some javascript to submit these new values once the user clicked the submit button. I had to stop and remind myself what the purpose of adding asynchronous behaviour actually was. The purpose is to add a fluid and faster user experience when a full postback to the server just simply isn't necessary.

Rather than creating all those controls again, a simple check for whether it's a postback, followed by a loop through the posted inputs, would suffice. Once that is done, the page re-directs back to the view screen, so those controls never needed re-creating.
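A rough sketch of that idea, written in Python rather than VB.NET purely to keep it short and framework-free. Everything here is hypothetical: `handle_form_request`, the `store` dictionary standing in for the database, and the `/view` redirect target are illustrative names, not the actual Scrobbles code or any ASP.NET API. On a postback it reads the submitted values by name and redirects; only on the first load does it build the list of fields to render.

```python
from typing import Dict, List, Tuple


def handle_form_request(method: str,
                        posted: Dict[str, str],
                        input_names: List[str],
                        store: Dict[str, str]) -> Tuple[str, object]:
    """Handle one request for a dynamically generated form.

    method      -- "GET" for the first load, "POST" for a postback
    posted      -- the raw name/value pairs submitted by the browser
    input_names -- the input names defined in the database (assumed known)
    store       -- stand-in for the database table of current values
    """
    if method == "POST":
        # Postback: no controls are rebuilt. Only names we know about are
        # accepted, so unexpected posted fields are ignored.
        for name in input_names:
            if name in posted:
                store[name] = posted[name]
        # Send the user back to the read-only view (hypothetical URL).
        return ("redirect", "/view")

    # First load: this is the only time the fields need to be generated.
    fields = [(name, store.get(name, "")) for name in input_names]
    return ("render", fields)
```

The design choice is simply that the names of the inputs are enough to read a postback; the controls themselves are only needed when something is actually going to be drawn.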

Once again - the zealous overdesign of ASP.NET covers up something that in PHP would be stonkingly obvious (and indeed, is the de facto standard method of handling such problems). I almost didn't do it because I'm so used to working within this friendly framework of controls and forms.

Silly Rob... silly Rob.

Friday, June 06, 2008 2:19:35 PM (GMT Standard Time, UTC+00:00)

With work on Scrobbles swiftly moving ahead, I find myself thinking about the inevitable and yet seemingly unreachable release date.

"Users are going to want this, so I should add it now.."

How many times have I now said that to myself? "Users are going to want to customize their pages", "Users are going to want to submit custom data", "Users are going to want to embed content in their pages", "Users will need snippets to be configurable so they don't need to write a new one for each 'key' or 'value'"... Each time I do this, it's for a good reason - I don't want to fail as a service, and therefore I need to be the best service around.

As mentioned previously, my main competitor (in the World of Warcraft arena anyway) is probably WWS (WoW Web Stats) - who have a mature, but nowhere near as flexible system as the one I have written. The keyword there however, is "mature". I took a look earlier and the wealth of information available from it is astounding. I can of course do better, and I do aim to write snippets which emulate the statistics that it throws out.

However, I then need to think about the groups of people who will be involved in these events, and consider that they may want to combine their data and compare each other during raids. I possibly need to write a system that allows snippets to link to further in-depth data based on a keyword in that snippet. I also need to write a system that allows users to create 'views' of their data over a user-defined period of time, with inputs coming through the existing snippet data. It needs to be really easy to create these views, possibly from templates, so that data about a raid can be retrieved within a set period of time.

What about those casual users who are wanting statistics not about raids, but about their day to day activities? There is still a wealth of data that I am not collecting, and I'm going to have to create a character of each class and profession in order to find out about them. It is absolutely terrifying how much stuff I might "miss out" in the initial release of the software.

And there is the clincher: if I get it wrong, there is the chance that a future version of the WoW stuff might make previous data invalid. I can't be having that, so it has to be perfect, or at least forward compatible, to begin with - do I need a system for this??
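One common answer to that question (an assumption on my part, not something Scrobbles is confirmed to do) is to stamp every stored record with a format version and keep a small chain of upgrade functions, so old data is converted rather than invalidated. A minimal sketch in Python, where the record fields (`name`, `character`, `realm`, `kills`) and the version numbers are made up purely for illustration:

```python
from typing import Callable, Dict

CURRENT_VERSION = 3  # hypothetical latest record format


def _v1_to_v2(record: dict) -> dict:
    # Illustrative change: version 2 renamed "name" to "character".
    upgraded = dict(record)
    upgraded["character"] = upgraded.pop("name")
    upgraded["version"] = 2
    return upgraded


def _v2_to_v3(record: dict) -> dict:
    # Illustrative change: version 3 added a "realm" field with a default.
    upgraded = dict(record)
    upgraded.setdefault("realm", "unknown")
    upgraded["version"] = 3
    return upgraded


# Each entry upgrades a record from that version to the next one.
MIGRATIONS: Dict[int, Callable[[dict], dict]] = {1: _v1_to_v2, 2: _v2_to_v3}


def upgrade(record: dict) -> dict:
    """Return a copy of the record brought up to CURRENT_VERSION."""
    current = dict(record)
    while current["version"] < CURRENT_VERSION:
        current = MIGRATIONS[current["version"]](current)
    return current
```

With something like this in place, a change to the submitted data format means writing one more upgrade step instead of throwing away what has already been collected.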

What about those users for whom stats don't mean too much? I need to write that 'third party' website, WoWScrolls.com, so they can see the potential of throwing all their data at Scrobbles. (Public services are *awesome*). How do I achieve that? My colleagues at work have suggested that I use something like AIML to generate the blog posts and I can see their point, but it still leaves me with the daunting task of actually populating the database with "witty" phrases about each location, each task, each type of character, etc. - never mind creating the actual profiles from the data available at the WoW Armory.

My head is full of ideas, and getting that final feature list is not easy - because the moment I allow people to use Scrobbles they're going to start having even more ideas than I can deal with, and being the sole developer it's going to be very hard to keep up with the demand for features - never mind technical support, complaints and all the normal day to day problems that come with running a website.

There is also that niggling issue, that releasing a service like this feels a bit like throwing down your cards at the end of a poker round, there is always the risk that your opponents might have a full house - and then what do you do?

Where do I call it quits? I could do with Scrobbles being out before the summer holidays so I could just prioritise my list of ideas and just work on them as much as possible before then, throwing it out in whatever form it has at that time. (Limiting the total users so I have time to assess server load and start thinking about monetizing the operation so I can spend more time on it - World of Warcraft is not the be all and end all of this system after all!).

I start to understand why games and software in general can often take such a long time to get out the door, there is always that one little thing that you just know the software will not be complete without. At some point, the users need to start leading the development strategy, and if their ideas conflict with mine - what on earth do I do then?

Wednesday, June 04, 2008 8:13:58 PM (GMT Standard Time, UTC+00:00)

Epically tired after only sleeping 4 hours between Sunday and now, and a day of rollercoasters at Thorpe Park.

Started work on wowscrolls.com last night, in so far as I chose some technology to work with and started looking at LINQ for real (instead of merely acknowledging its existence).

It looks incredibly useful, and I think I'll be getting up early tomorrow before I head into Microsoft and get some learning done, so I have some code to write while presentations are being given to the students.

I think thoughts on LINQ are probably incoming..

Monday, June 02, 2008 10:03:41 PM (GMT Standard Time, UTC+00:00)

Yet another evening of work, and I'm closer to the end than I was at the beginning.

I really didn't feel like starting, but I forced myself into it, wrote a small list from the main to-do list and got to tick a few things off it.

Seems setting specific tasks is more useful than I'd have thought. I also figured out some stuff which will help me at work tomorrow.

Monday, June 02, 2008 2:39:15 PM (GMT Standard Time, UTC+00:00)

Over the year of my contract with the university, I have been teaching myself how to do web dev - and coming from a background of professional desktop software development, moving to this world was surprisingly difficult.

I don't mean development as in the ability to put together a few pages about myself, or the ability to design a pretty website. I mean development as in putting together a full featured web application with the same features as an equivalent desktop version (if one was to be written).

My chosen area of learning revolved around ASP.NET because I'm already very familiar with the .NET framework. This turned out originally to cause me problems, because when developing ASP.NET applications you're given a lot of things you simply do not need, and you're given a structure to work inside of, that may or may not fit your end goals.

I found going to PHP and doing work in that helped, as I was given direct control over the process of form postbacks, and made to do everything myself. This gave me an understanding of the technology underlying ASP.NET and therefore the ability to work within the framework and create synchronous websites.

Obviously, the future is in asynchronous requests - a world without postbacks, and I've spent the past few months getting to grips with Javascript, getting it to talk to the server, and architecting my web applications around a combination of synchronous and asynchronous behaviour.

When developing any code, I try to keep everything as organised as possible, to keep functionality in re-usable libraries, to keep presentation and business logic separate, and this is where my main headache has been - in developing web software that is as organised as my desktop software. I am trying to develop and utilise patterns across the entire web application so that once a few concepts have been described, anybody else could find what they were looking for if modifying/re-using any of this code.

Again, surprisingly difficult to do in a web application environment, as you are occasionally forced into mixing your logic and layout with this eerie combination of Javascript, XHTML and VB.NET (Pick a fight on my choice of language and I will hurt you).

ASP.NET advocates the use of re-usable web controls, which can be slotted into web forms, just like in the desktop world, and I often find myself putting those in a separate class library so I can easily have access to them in my Visual Studio toolbox. But then these controls end up referring to, or requiring, certain markup to be available on a page (such as common dialogs inside a div, or web services exposed via the WebMethod system on pages). They therefore end up needing to be heavily commented, "do not use unless these things are present". They should probably be entirely private to the website itself - and even then there is the scope for abuse when you have over 50 separate pages they could end up being used on. Can you say spaghetti code? No wonder PHP tends to be so all over the place - as you're not even forced into using any framework when writing it.

There are dozens of things like this, that the developer ends up just having to make a decision on. If you want to learn how to use individual components of code, learn how to use the framework, learn how to write the code, how to do little things, then there are books, there are websites to learn from. This has never been a problem.

If you want to learn how to put together, how to architect a solution that's elegant and forward-thinking, there is surprisingly little out there. It's left to the developer to work it out. (I'm talking a lower level than just N-Tier diagrams before anybody asks).

If I was working in a company that did web development, then I would no doubt be picking up on these things from my peers, who would have picked it up off their peers, who would have developed and learned from other people too. I am not however, and have ended up with my own style of doing things which may or may not be in keeping with other people's.

A reflection on where I am now? I think this kind of learning is all very well and good, but it is harmful to productivity if you're doing it for your job. More than once I have written this web client for the MeAggregator to a certain level of functionality before realising that I can't go any further without doing it all an entirely different way. If I was experienced in this field, I would have had it completed a long time ago.

I am lucky to work in a job where this is acceptable, and hope that the final product of this effort reflects the time I've taken to learn how to do things properly through trial and error.

Writing code is easy, writing huge amounts of code is easy - but writing large volumes of code that is understandable and well designed as an overall concept... it takes effort and knowledge. The former I'm willing to put in, to extreme levels - but the latter can only come with time, and that's something we all wish we had more of.

Sunday, June 01, 2008 8:43:43 PM (GMT Standard Time, UTC+00:00)

Last week I worked quite heavily through the list, and towards core site completion. This leaves me with the tasks of building the third party websites, developing more client software and the creation of more user-friendly systems.

These are the tasks I have left to do, with a few more added. I aim to complete the ones in italic by next Sunday. I am however out all day on Wednesday, and at Microsoft on Thursday/Friday (I think), so time will be tight.

  • Generic Server Work
    • Work out how to finance the purchase of a new server
  • Snippet format work
    • Add capability for third party websites to insert data into snippets (using public services)
  • Editing of pages (Ajax stylee, I already had a synchronous version done)
    • Hide/Show editing controls - based on user authentication
    • Renaming of Pages
    • Adding snippets to pages
    • Editing the inputs to those snippets
  • Security + Validation
    • Go through all pages and check all user-input for limits/etc [Make sure automatic validation is turned off]
    • Validate postbacks for modifiable data and security concerns
  • Write a background service for all data crunching
    • Asynchronous Page Generation
    • Pending Data Crunching
    • Cached Query Generation
  • Logistics
    • Add a 'status' to each account for payment info
    • Provide the means with which to easily pay (Paypal/Google Checkout/Direct CC/Phone??)
    • Limit the number of queries each user is allowed and make this dependent on account status
    • Third party limitations (payment system too?), to prevent DoS attacks.
  • World of Warcraft Scripts + Research
    • Check out WowStats and see what stats they present to their users
    • Replicate those stats using the snippet system (Modifying Lua as needed)
  • Javascript Based Snippet Editing/Creation (MAJOR TASK)
  • World of Warcraft Automated Blogging
    • Choose a technology to build on (Going to build it myself in ASP.NET, it will actually be easier that way)
    • Develop automatic user creation based on Scrobbles data
    • Choose a decent template for advertising revenue
    • Architect a data-driven pipeline for generation of blog posts based on
      • Location
      • Activity
      • Player Character
      • Party
      • ???
    • Develop background process to generate blog posts
    • Use the World of Warcraft Armory to populate character profiles
    • Add capability for characters to automatically post comments on each other's blogs for purposes of hilarity
  • Online Community
    • Populate Wiki with 'general' information
    • Populate Wiki with stat family documentation
      • Write a script to do this automatically from the Scrobbles database.
    • Populate Wiki for API documentation
    • Create forums for snippet requests/application requests
  • World of Warcraft Data Submittal
    • Create an online page (using the third party data API) to allow online data submission for World of Warcraft (IE: without using the client)
  • Make client application for automatic data submission stable (One is already written, it just needs a lot of work!)
    • Automatic World of Warcraft upload (Make this more atomic)
    • Create an installation manager (DOH, re-invent wheel!?)
      • Elevated installer process
      • XML based 'module' installation/uninstallation
      • Filterable install list
      • Rollback on fail
    • Create click-once installer for the installation manager