Monthly Archive for June, 2007

Big Cheap Storage and Music Recommendation

Valdemar Poulsen - inventor of the telegraphone - has got a lot to answer for. Music went with the rival phonogram, and then the optical disc, but the next 20 years will be the age of magnetic recording and storage.

Hitachi’s Deskstar 7K1000 is the first 1TB hard drive - on sale now for £200. That’s 20p per GB, about one quarter of what it was 2 years ago. By 2009 it’s going to be 2TB using current proven technology, and drives for consumer electronics will be up to 200GB. That’s only a year and a half away.

Toshiba has shipped over 5 million hard drives to automobile manufacturers. There are plenty of people talking about wireless access to music, so it’s ‘just there’ wherever you happen to be with your device, but I’d bet on people carrying around huge libraries for a while yet.

You can’t fill that kind of storage if you have to make a purchasing decision for each track, and nor can you really choose track by track what you hear. Massive personal storage is ideal for ‘smart caching’ and that needs neat ways to organise playlists.

I think algorithmic music recommendation is missing a trick, perhaps through assuming a lot more friction around choice than there will turn out to be. When you can say to a friend ‘here, have this playlist I put together last night’ and you get both the list and the music in a few seconds (I was recently checking out WiMedia) you have social interaction and recommendation in one hit and with almost no math!

Bus waiting times

Lazy bastard and travelcard-holder that I am, I regularly hover around the bus stop for a while waiting to see if a 55 or a 243 will arrive and take me some of the ~10min walk from tube station to Playlouder MSP (see, relevance to work!).

Having a maths degree which I struggle to put to use, I find myself wondering about the following problem: How long do you wait for a bus before you start walking? What is the optimal strategy for this?

To simplify things you might presume, along with numerous textbook probability questions, that the waiting times between buses follow an exponential distribution. In English: that buses are randomly scattered subject to an average number per time period. The answer is then quite clear-cut. There are only two valid strategies - a strategy in which you always wait indefinitely until a bus arrives, and a strategy where you always walk no matter what. If the average waiting time, plus the bus journey time, is greater than the walking time, then you always walk; otherwise you always wait.
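To see why nothing in between helps, here is a minimal sketch of the expected door-to-door time for "wait up to n minutes, then walk" under the exponential assumption. The function name and the numbers in the usage note are my own illustrative choices, not measurements.

```python
import math

def expected_trip_exponential(n, mean_wait, ride, walk):
    """Expected total minutes for 'wait up to n minutes, then walk'
    when the wait for the next bus is exponential with mean mean_wait."""
    q = math.exp(-n / mean_wait)  # chance that no bus has come by minute n
    # E[min(wait, n)] = mean_wait * (1 - q); ride with prob 1 - q, walk with prob q
    return mean_wait * (1 - q) + (1 - q) * ride + q * walk
```

Rearranged, this is (mean_wait + ride) + q * (walk - mean_wait - ride), and q only shrinks as n grows. So with a 5 minute mean wait, a 3 minute ride and a 10 minute walk, every extra minute of patience helps; with a 12 minute mean wait, every extra minute hurts. There is never a sweet spot in the middle.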

This is a bit counter-intuitive, though, and doesn’t satisfy my desire for a practical strategy. In practice buses don’t follow an exponential distribution: the waiting times are correlated, as buses are subject to bunching phenomena, service disruptions etc. So if you wait 3 hours and no bus arrives, you might well be able to infer extra information about subsequent waiting times based on more sophisticated distributional assumptions (perhaps a bus route suspension is a probable occurrence? perhaps you’re likely to have just missed a bunched grouping of buses?). And so with a real-life bus distribution, there is actually likely to be a valid strategy based on “Wait for n minutes; if no bus arrives then start walking”.

It turns out that there’s a lot of theory on this, and the M/G/1 queue is a better model to use than the simpler M/M/1 queue based on exponential waiting times. M/G/1 tells us that, if bus waiting times are correlated, you may actually expect to wait longer than the mean interval between buses! This is due to phenomena like bunching (‘why do buses always come in threes?’), and the extra expected time can be given in quite a general way, in terms of the mean and variance of the distribution of inter-arrival times.
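For the record, the mean-and-variance result I have in mind is the standard renewal-theory one: if you turn up at a random moment and the gaps between buses are independent with a given mean and variance, your expected wait is mean/2 + variance/(2 × mean). A tiny sketch, with a function name of my own choosing:

```python
def expected_wait(mean_gap, var_gap):
    """Mean wait for someone arriving at a random moment, given the mean and
    variance of the gaps between buses (assumes the gaps are independent)."""
    return mean_gap / 2 + var_gap / (2 * mean_gap)
```

Perfectly regular buses every 10 minutes (variance 0) give a 5 minute wait. Exponential gaps (variance 100) give 10 minutes, the full mean interval. Make the gaps more variable still, say variance 200, and you expect to wait 15 minutes, which is longer than the average gap between buses.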

However, the M/G/1 queuing model is a little too general to infer more detail, such as specific strategies on how long to wait before walking. To do this I’d need to assume a particular distribution for bus arrival times, and it seems at this point that most of the research turns to using empirical distributions (i.e. just going out and measuring loads of buses rather than trying to derive from a mathematical model), and simulations (simulate a bunch of buses on a computer and measure the arrival time distribution in different situations).

And so I lack a simple answer: given sensible real-world assumptions about bus waiting times, how do I calculate a value for my “Wait n minutes and then walk” strategy?! Queuing theorists please respond kthx.
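In the meantime, the empirical route can at least be sketched. Assuming I had a list of observed waits (how long someone turning up at the stop actually waited for the next bus), I could score every candidate n against it and take the best. This is a rough sketch rather than proper queuing theory, and the function names and any numbers fed to it are hypothetical.

```python
def expected_trip(n, waits, ride, walk):
    """Average total minutes under 'wait up to n minutes, then walk',
    taken over a list of observed waits."""
    total = 0.0
    for w in waits:
        if w <= n:
            total += w + ride   # the bus turned up in time
        else:
            total += n + walk   # gave up after n minutes and walked
    return total / len(waits)

def best_threshold(waits, ride, walk):
    """Pick the n with the lowest average trip time."""
    # Between two observed waits, a larger n only adds waiting time, so the
    # best n is 0, one of the observed waits, or infinity (never give up).
    candidates = [0.0] + sorted(set(waits)) + [float("inf")]
    return min(candidates, key=lambda n: expected_trip(n, waits, ride, walk))
```

With made-up waits of 1, 2, 2, 3, 30 and 45 minutes, a 3 minute ride and a 10 minute walk, best_threshold picks 3: hang around long enough to catch the buses that are nearly there, then give up before the long gaps bite.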

On with some MSP work ;)

Migrating from Trac to FogBugz

As we’re an ISP and need to do lots of customer-support stuff, we decided to migrate to FogBugz for both our customer support ticketing and our development bug and feature-tracking. It has some great features, although I do miss the wiki and the tighter source-control integration from Trac.

Anyhow, nobody seems to have a decent migration script from Trac to FogBugz, so I thought I’d post my script. It’s obviously pretty specific to both our Trac setup and our FogBugz setup, with hard-coded IDs and so on, but hopefully it will be a decent starting point for making your own script.

Before you run this, you need to migrate your Trac database to MySQL if it isn’t already. Chances are it’s currently in SQLite. This isn’t entirely trivial - I followed the steps here.

Caveat: please don’t ask me to support or modify this script for you. You do get what you pay for here.

Continue reading ‘Migrating from Trac to Fogbugz’

Woo: New Safari

Apple has a new public beta of Safari 3! Nice to see all that work on the WebKit engine reach a wider audience.

The reason I’m excited, though, is that the new Safari has some great web development tools built in which previously were limited to the (somewhat unstable) nightly WebKit builds. There’s a great DOM inspector, you can view the engine’s render tree, and most crucially, there’s support for a decent JavaScript debugger. Until now, debugging JavaScript in Safari has been hell.

How to get this working:

  • Install the public beta, and reboot (sigh - stop making me reboot, Apple!)
  • Download the latest nightly build of WebKit, and install Drosera from the dmg
  • Enter the following into your terminal of choice:

    defaults write com.apple.Safari WebKitScriptDebuggerEnabled -bool true
    defaults write com.apple.Safari IncludeDebugMenu 1
  • Fire up the new Safari, fire up Drosera and attach to Safari. Note the Debug menu and the Drosera goodness.

Note that the new Safari supports custom CSS for form elements, although it seems to switch to some slightly dubious non-Apple-looking custom widgets the moment you touch the CSS.

Content Management: Too Many Ideas!

We’ve rethought the way that content is going to work on the new Playlouder website. We wanted to give people simple but powerful tools to share views and discuss music, and we have a principle that all tools are available to all members - we have no arbitrary elites here.

Of course we are very aware of the context we exist in. So many sites now offer ways to participate that it has become difficult to recognise a great new idea in the clamour. We think that we have something strong to offer, with our special focus on music and our integration with a whole range of other services through the ISP.

In our model content can be created by anyone, given any tags the user likes, and attached to any ‘objects’ (such as artists, releases, tracks, events). Competition to be the best writer of content tagged ‘news’ or ‘review’ might turn out to be fierce, but equally you will be able to define your own set of tags with your friends and stay out of the mainstream.
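As a rough illustration only (this is not our actual schema, and every class and field name here is made up), the model boils down to something like:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class MusicObject:
    kind: str   # e.g. "artist", "release", "track", "event"
    name: str

@dataclass
class Content:
    author: str                                       # any member can be an author
    body: str
    tags: set = field(default_factory=set)            # free-form, chosen by the author
    attached_to: list = field(default_factory=list)   # MusicObjects this piece is about

def with_tag(items, tag):
    """Everything carrying a given tag, mainstream or private."""
    return [c for c in items if tag in c.tags]
```

A piece tagged ‘review’ and a piece carrying a tag that only you and your friends use are the same kind of thing; the only difference is who else bothers to look for that tag.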

Let’s see if it works.

IPv6

wah-hey! buzzword!

I’ve had this vague notion in the back of my mind for a while now, that with IPv6’s multicast support, distributing streaming music to a membership would be incredibly slick and easy... and as we all know, vague notions can sometimes turn into reality, and knock you on the back of the head when you are least expecting them.
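To make the notion slightly less vague, here is a minimal sketch of the receiving end: a listener joins an IPv6 multicast group and the network delivers one stream to everyone who has joined. It only covers joining the group, not the audio, and the group address and port in the usage note are placeholders I made up.

```python
import socket
import struct

def build_ipv6_mreq(group, ifindex=0):
    """Pack a struct ipv6_mreq: 16-byte group address + interface index
    (index 0 lets the OS choose the interface)."""
    return socket.inet_pton(socket.AF_INET6, group) + struct.pack("@I", ifindex)

def open_multicast_listener(group, port, ifindex=0):
    """Open a UDP socket that receives datagrams sent to the multicast group."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("::", port))
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP,
                    build_ipv6_mreq(group, ifindex))
    return sock
```

Something like open_multicast_listener("ff15::1234", 5004) followed by sock.recvfrom(2048) in a loop would then pick up whatever a sender pushes to that group, provided the routers in between actually forward multicast, which is the hard part.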

Also, I like what these people are doing with IPv6, even if it is tongue in cheek:

http://www.sixxs.net/misc/coolstuff/
The Great IPv6 Experiment

These guys have done it already:
http://www.blackcatnetworks.co.uk/

I’d be interested to see anyone else’s setups - don’t be shy.