Saturday 20 February 2010

All actions in your application should require 1 mouse click, AT MOST!

The fewer mouse clicks the better, right? If you're even vaguely aware of usability you're probably aware of this notion. For some it seems to be the holy grail of usability, the be-all and end-all. Show them a UI change and they're more than happy to help you evaluate it: using their superhuman knowledge of this single usability metric and their excellent counting ability, they can give you a quick and useless evaluation of your UI.

The problem with this? Follow the rule to its end, applying it to more and more of your UI, and the mouse clicks required for every action tend towards one. There's a nice example of an application apparently built with this constraint here: http://stackoverflow.com/questions/1486420/how-can-i-enhance-the-aesthetics-of-an-ugly-windows-form-packed-with-too-many-ne

In truth, when people bring this metric up they don't really mean they want everything to be one click away, just whatever it was they were recently doing which now requires one more mouse click than it did before. Inevitably, as you evolve a product and add functionality, you end up with too much functionality to fit in the space available. At some point you need to simplify your interface: simple to say, hard to do. Sometimes they are right about the number of clicks. The important thing is to look at the key tasks anyone using your software is trying to achieve. Those key tasks need to be as frictionless as possible; making everything else just as simple to do can lead to clutter in your interface. Hide it away and you may well be doing people a favour. As your product evolves, some things which you allowed to be simple early on may need to take a back seat to other functionality.

This SlideShare presentation is the best example I've seen of someone going beyond a glib 'Keep It Simple, Stupid' attitude, or beyond unhelpfully telling you that your application should do next to nothing and that any customers who want more can piss off.
http://www.slideshare.net/cxpartners/secrets-of-simplicity

(I'd also say it's a good example of how to do a really good presentation.)


Sometimes you'll need to hide things away from view to keep the core of your application simple to use; sometimes you'll need to add an extra mouse click between a user and some piece of functionality. I'm not going to try and summarise the presentation linked above. I couldn't do it justice, and it's really interesting to watch, so stop paying attention to me and head over there.

Wednesday 10 February 2010

David Heinemeier Hansson podcast on Entrepreneurial Thought Leaders

Really good podcast in the Entrepreneurial Thought Leaders series from Stanford here:
http://ecorner.stanford.edu/authorMaterialInfo.html?mid=2334

It's David Heinemeier Hansson of 37signals and Ruby on Rails fame, always good for an opinion and, I think, part of a company a lot of people would like to imitate one day, as in the best-kind-of-flattery type of imitation.

Top bits for me:

Don't try to compete with Microsoft etc. on programming effort, marketing spend and so on; they'll crush you. On the marketing front especially this carries some weight with me. I think some small companies try to imitate the big boys in this arena too much and don't make enough of what they have that the big boys don't.

Don't take VC funding unless you actually need to build a factory in India or something. Having sat in a meeting to discuss how we'd spend a million dollars if we got it, I know some companies are attracted to VC money without necessarily thinking through what they actually need it for. Also, the risk of wasting your time on something which isn't going to be profitable is increased; fail fast and fail often, as they say.

All his stuff about building a scalable company: not one which can scale to tonnes of employees, but one which scales the cash coming in without scaling up the employee numbers. A very nice idea to focus on; keeping this in mind is, I guess, how you keep a lean startup lean as it ages and evolves.

Saturday 6 February 2010

Exceptional? No.

Spotted this the other day whilst making a small change to the code at work. It was strange: something felt very wrong but I couldn't quite put it into words. I occasionally see this sort of stuff that just feels wrong without being able to say exactly what it is. As luck would have it a coworker wandered in at that moment, so we looked at it together, and he declared that if he'd written code like that he should be shot. Here it is. It's an accessor method on a class representing a row in a db table.

public boolean getColumnBlah() {

    try {
        return columnBlah.getValue();
    } catch (NullPointerException ex) {
        return false;
    }
}

The problem was the use of an NPE to catch an empty value. If the default was meant to be false, surely make it false by default and assume it won't be null in this method. Also, where did the NPE come from? Obviously some lazy dev assumed it was due to a null value on that column, when it was entirely possible for it to come from somewhere else inside the getValue() call. If it did, you'd be silently failing at something else, and detecting that failure would be far less likely.
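One way to get the same "empty means false" behaviour without the catch, as a minimal sketch. BlahColumn and BlahRow are hypothetical stand-ins, since the real column class isn't shown here, and the sketch assumes an empty column really is meant to read as false.

```java
// Hypothetical stand-in for whatever column type the real code uses.
class BlahColumn {
    private final Boolean value; // null models an empty (SQL NULL) column

    BlahColumn(Boolean value) {
        this.value = value;
    }

    Boolean getValue() {
        return value;
    }
}

// Hypothetical row class exposing the same accessor as the original.
class BlahRow {
    private final BlahColumn columnBlah;

    BlahRow(BlahColumn columnBlah) {
        this.columnBlah = columnBlah;
    }

    public boolean getColumnBlah() {
        // The expected case (no column object) is handled explicitly.
        if (columnBlah == null) {
            return false;
        }
        // An NPE thrown from inside getValue() now propagates instead of
        // being silently turned into "false".
        Boolean value = columnBlah.getValue();
        // An empty column reads as false, again without any exception.
        return value != null && value;
    }
}
```

The predictable case is now an ordinary branch, and anything genuinely unexpected inside getValue() still surfaces as an exception.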

Exceptions should indicate that something exceptional happened, not something predictable and expected. Special bonus points awarded this time, as Mr Coworker, who joined me in looking at this, turned out after some cvs detective work to be responsible. It was some time ago, admittedly, so we decided he wouldn't have to leave the company with his head hung low. He did also point out that paying the company back for the 2 years' employment since creating the sinful code, and rolling back all his commits for the last 2 years, would be pretty hard work as well.

Saturday 16 January 2010

Things to do when your startup gets popular... Capacity Planning

If you've not read it yet, get a copy of The Art of Capacity Planning by John Allspaw. Order it now, then come back and finish reading this. It's a fantastic book and was honestly something of a revelation to me when I read it. Most if not all of what I present below is covered far more eloquently and comprehensively in that book. I originally studied control systems at university, and my lasting impression of control was that it falls into two major segments. The most interesting is, for example, keeping the wings on a plane as it breaks the sound barrier whilst banking: complex, cool and very interesting. The rest of it is more like a toilet cistern: if it's empty fill it up, if it's nearly full stop filling it up. None of the capacity planning below is rocket science; if it were control systems it would definitely fall into the second category. It is important though, it's interesting (to me at least) and hopefully may be of use to someone out there.


image by Bookshelf Boyfriend


So... It probably goes something like this: you've got your software as a service, you're selling it, people are loving it, and you're building and releasing new functionality all the time. Things are going well, that is until they stop going well. You notice things are running a bit slow or, even worse, your customers notice. You log in to your servers to find out what? The CPU is maxed out, your database is living in the page file, or you're close to running out of disk space and fragmented to hell. Either way everything is grinding to a halt and, as a result, you are boned. Obviously you sort this out asap, your customers get over it and everyone can relax again. Except for the fact that it's going to happen again, unless you can work out when it's going to happen and make sure you're ready for it next time. This is what happened to me about a year ago, or rather it had been happening for some time, but it was then that I started working on making sure it wouldn't happen again. I certainly don't know everything there is to know about this stuff, but I know enough to have survived since then without repeating the mistakes of the past. Here are my top tips, things I think you should do now if you aren't already.


1) Start measuring server performance

Measure the performance of your servers: cpu, memory, disk reads and writes, and free disk space. I use Hyperic for this; it's pretty simple to set up and start using. I use a separate machine for the server component. You should at least check out their recommended specs, and be careful about installing it on a production server which is already stressed or key to the performance of your app. This will allow you to monitor the load on your servers over a long period of time and get an idea of where the next bottleneck might come from. You can also set alerts to notify you should something get out of hand, e.g. free disk space getting below 25%, cpu utilisation over 75% for more than 5 mins, whatever. There's a free edition, and aside from a few UI peculiarities I've not had any serious issues with it in 12 months of use.
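To make the two example alert rules concrete, here is a rough sketch of the checks themselves. This is not Hyperic's API or configuration. The AlertRules class and its method names are made up for illustration, and it assumes you have one cpu sample per minute.

```java
// Hypothetical helper, not part of Hyperic: just the logic behind the two example alerts.
class AlertRules {

    // True when free space has dropped below the given percentage of the disk.
    static boolean diskSpaceLow(long freeBytes, long totalBytes, double minFreePercent) {
        return 100.0 * freeBytes / totalBytes < minFreePercent;
    }

    // True when at least `minutes` consecutive one-minute samples exceed the threshold.
    // Assumes the array holds one cpu utilisation percentage per minute, oldest first.
    static boolean cpuSustainedHigh(double[] cpuPercentPerMinute, double thresholdPercent, int minutes) {
        int run = 0;
        for (double sample : cpuPercentPerMinute) {
            // Extend the current run of high samples, or reset it on a low one.
            run = sample > thresholdPercent ? run + 1 : 0;
            if (run >= minutes) {
                return true;
            }
        }
        return false;
    }
}
```

Requiring a sustained run rather than a single high sample is what stops a brief spike from paging you at 3am.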


2) Start measuring some application level metrics

Things like number of users on the system, number of widgets wangled etc. Flickr monitor pictures uploaded and downloaded, for example, which showed them that Sunday is when they get the most uploads and Monday is when people back at work start browsing said photos. This is very useful as it gives you the background to how people use your service, and how this then drives the measurements above. I'd been collecting this long before making any use of it, as user logins and various actions were recorded in an audit trail. You want the data somewhere it's easily queried, so a database would be ideal. You may well be able to get this data into Hyperic, although I've not tried. I currently use total user logins per day and then pick the peak day each week to represent that week. I'm interested in the peaks since it's those I need to plan for, and looking at data at a weekly resolution is fine for my needs.
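Picking the peak day for each week could look something like the sketch below. It assumes the daily login totals have already been pulled out of the audit trail into a map, and that weeks run Monday to Sunday. WeeklyPeaks and peakPerWeek are names invented for the example.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: reduces daily login totals to one peak figure per week.
class WeeklyPeaks {

    // Key of the result: the Monday starting each week. Value: the busiest day's logins.
    static Map<LocalDate, Integer> peakPerWeek(Map<LocalDate, Integer> loginsPerDay) {
        Map<LocalDate, Integer> peaks = new TreeMap<>(); // sorted by week start
        for (Map.Entry<LocalDate, Integer> day : loginsPerDay.entrySet()) {
            // Assumption: a week runs Monday to Sunday.
            LocalDate weekStart = day.getKey()
                    .with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
            // Keep the larger of the stored peak and this day's total.
            peaks.merge(weekStart, day.getValue(), Integer::max);
        }
        return peaks;
    }
}
```

The result is a small, sorted series of weekly peaks, which is about the right size to paste into a spreadsheet.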


3) Make some simple predictions

Now you've got some data, which may mean waiting for a while after point 2 above, it's time to work out when the shit is next going to hit the fan. I've used Excel for this so far as a) I know how to get it to do what I want, and b) it works well enough for my current needs. I've worked primarily with my user login numbers; these predict load on my servers closely enough for me to get upgrades in at the right time. This is likely because I've really only the one server doing everything. If you've a number of more specialised servers you'll probably want to work out which application metric drives load on each one. Start simple though, then add more detail if you need to. What you are looking for here is a trend over time. I have peaks every 3 months and the pattern from one year to the next is very similar, allowing last year's figures to predict this year's figures very closely by simply multiplying them up.
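The "multiply last year up" prediction is simple enough to sketch outside a spreadsheet. This is only an illustration of the idea, not the actual Excel workbook. LoginForecast and its methods are hypothetical names, and it assumes both years are given as weekly peak logins lined up week for week.

```java
// Hypothetical helper: scales last year's weekly peaks to predict this year's.
class LoginForecast {

    // Growth factor: this year's logins so far divided by the same weeks last year.
    // Assumes thisYearSoFar is no longer than lastYearWeeklyPeaks.
    static double growthFactor(double[] lastYearWeeklyPeaks, double[] thisYearSoFar) {
        double lastYearTotal = 0;
        double thisYearTotal = 0;
        for (int week = 0; week < thisYearSoFar.length; week++) {
            lastYearTotal += lastYearWeeklyPeaks[week];
            thisYearTotal += thisYearSoFar[week];
        }
        return thisYearTotal / lastYearTotal;
    }

    // Prediction for every week of this year: last year's shape, multiplied up.
    static double[] forecast(double[] lastYearWeeklyPeaks, double growthFactor) {
        double[] predicted = new double[lastYearWeeklyPeaks.length];
        for (int week = 0; week < lastYearWeeklyPeaks.length; week++) {
            predicted[week] = lastYearWeeklyPeaks[week] * growthFactor;
        }
        return predicted;
    }
}
```

Because the seasonal shape comes straight from last year's data, the quarterly peaks carry over automatically; a growth factor of 1.7, for instance, would mean logins are up 70% year on year.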


4) Work out when you're boned

It's as simple as this. You know what (cpu/memory/disk) is, or is likely to be, the bottleneck on one server. You know that one level of your application metric correlates roughly to some level of load on your server. You know roughly what increase in that metric gets you some measurable increase in load on those servers. You know where you think that metric will go for the next few months. So at what point will it push load on that server to a level you can't sustain? Make sure you get the relevant upgrade in before that happens, and be aware of how long it takes to actually get the upgrade in. Some upgrades are quicker than others and some require more or less investment of your time, e.g. replacing a cpu with a faster one vs moving to a machine with the capacity to add a second cpu.
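Put together, the reasoning above amounts to something like the following sketch. It makes one big assumption that you should check against your own measurements: that load rises roughly in proportion to the application metric (here, cpu percent per login). CapacityDeadline and its method names are invented for the example.

```java
// Hypothetical helper: finds when forecast load first exceeds what a server can sustain.
class CapacityDeadline {

    // Index of the first forecast week whose predicted cpu exceeds the limit, or -1 if none does.
    // Assumption: cpu utilisation is roughly proportional to peak logins.
    static int firstWeekOverLimit(double[] forecastWeeklyPeakLogins,
                                  double cpuPercentPerLogin,
                                  double cpuLimitPercent) {
        for (int week = 0; week < forecastWeeklyPeakLogins.length; week++) {
            double predictedCpu = forecastWeeklyPeakLogins[week] * cpuPercentPerLogin;
            if (predictedCpu > cpuLimitPercent) {
                return week;
            }
        }
        return -1;
    }

    // Latest week to start the upgrade so it is in place before the limit is hit.
    // Returns -1 when the forecast never crosses the limit, 0 when you are already late.
    static int latestStartWeek(int firstWeekOverLimit, int upgradeLeadTimeWeeks) {
        if (firstWeekOverLimit < 0) {
            return -1;
        }
        return Math.max(0, firstWeekOverLimit - upgradeLeadTimeWeeks);
    }
}
```

The lead time is the part that is easy to forget: an upgrade that takes weeks to arrange has to be started well before the forecast crosses the line.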


A concrete example of all the above for me was CPU, which is often the driver behind our upgrades. We up our logins about 70% every 12 months at the moment, with spikes every 3 months. Since cpus can't be increased in small steps I don't worry about directly predicting utilisation. Instead I ask myself whether I think we'll get through the next 3-monthly spike, based on current cpu utilisation and how high the spike in user logins is predicted to be. If I'm concerned I get the next upgrade, if not I wait. With cpu specifically I try to keep average utilisation under 75% for any significant amount of time. Significant isn't a very large amount of time when users are waiting for a response from your servers. This is based on the work of far smarter people than myself; there's a good write up here.

That's it for now. Buy John Allspaw's book if you're really interested, or if you think you may need to get a handle on this stuff.