A heuristic evaluation or expert review is the bread and butter of my design life. Yes, even more so than actually sitting and drawing stuff. Anyone well versed in the principles of design for a particular domain or platform simply looks at the product (preferably a functioning one actually installed and running) and applies industry knowledge of best practices and expected user behaviors (the "heuristics") to identify key problems and estimate how well the product does or will work with actual users. Pretty much every project I work on gets at least one of these. Sometimes they are quite formal, and other times it is just the underlying practice behind acceptance testing.
Which Heuristics?
There's been a bit of a secret crisis over these for the past 10 years or so. Many have promoted these with long checklists which are quite specific. Some of my first issues with these came with trying to apply web-centric tools to new technologies, or to the mobile web way back in 2003. It didn't work. Even simple specifications, like the amount of time that users are willing to wait for a page to load, will change over time, and with the type of service or expected audience. Expectations change, so strict heuristics are difficult to define.
But, that's fine, because I feel that's a totally misguided approach. Nielsen says to "judge its compliance with recognized usability principles," and seems to reinforce the focus on principles (not specifications) with his 10 usability heuristics, but then links to fully 2,397 usability guidelines(!). That's far too many and far too precise, and it leads to application problems. What about my B2B mobile, social site? Do I apply them all?
Guidelines like those I offer on the previous page are better to start with (for mobile; go back to Nielsen for general guidelines and other platforms). See other sections like General Touch Interaction Guidelines, and really lots of stuff in the Appendix, for some additional guidance and details, with caveats and principles to help you apply them to your design.
When encountering a specific widget, like an Infinite List, you can refer to the pattern principles to determine if the implementation was well done. This is a particular gripe of mine; infinite scrolling has gotten a bad name as a whole, but only because the patterns are not being applied correctly. Patterns are (usually) not evil, but bad implementations can ruin them. This is what heuristic evaluation is very good at discovering.
Plan & Setup
As a regular practitioner of this, I have been asked to share my heuristics, which I am interpreting as sharing the basic principles, checkpoints and methods that I use instead. Since I spent five hours last night reviewing a new Android app for a global audience, you are getting a lot of that as the for-instances. Interpret as needed for your domains.
Set constraints - The project will have limits of engagement. Even if answers below are different, and your users actually have lots of featurephone use of the Web, if the business has decided to only address how it works on iPhone 4 and above, there's no point in logging issues on Android or Blackberry much less featurephones. It's not ideal, but it's how the world works. Argue that in a different document and discussion.
Know your audience - You are an office worker. 25% of us work remotely, so it's irrelevant if your office is the park, your back yard or Starbucks. You are therefore not who most of your users are. Make no assumptions, and find out how they work. What do they expect, what do they use on their device right now, and how often?
Know what your audience uses - Within the limits above, find out what devices and browsers your users employ. Ideally, you have specific research and maybe even analytics from the first launch or similar products. Otherwise, use basic knowledge of the industry and regions. For me, last night:
- Galaxy S4, Galaxy Nexus - For modern OSs on a few different platforms. Carrier locked and unlocked devices, for example.
- Hero 200, Casio Commando, Galaxy S1 - Some say 80% of the installed base in China is on Android 2.3. Not on old phones, that's just what they put on new ones. Ideally I'd get some of these but they are in China so this will do to check old versions of the OS. Even Apple doesn't have perfect migration to the new OS. Check your audience. Also, one has a keyboard. Good to check on since in some markets keyboards are still big.
- Nexus 7, Polaroid PMID701 - Test on different sizes and form factors. The Polaroid is a cheap Chinese tablet, so even though it runs 4.0.3 it is a good example of the popular tablets in other markets. And it didn't work the same as the Nexus.
Know how your audience works - Just last week I kept going outside to check on the readability and legibility of an interface, both as I designed it and as it was built. Because this product is for mechanics who might be in repair shops, truck cabs or even outside. It has to work everywhere. Ideally, I'd account for dirty fingers also, and within the limits I have (I don't make phones) I did by making targets even bigger than usual for the basic functions. Editing the equipment name? That's a small target, because you do that when there's time to stop and think.
Do not forget browsers - (Only for Web) In some markets, the default browser is not used, or is no longer the default browser due to language or other regional needs. Know this, and check in the correct browser, or in several.
Think about the ecosystem - How do people discover this? For apps, check the store screenshots and descriptions, and pay attention to icons and app names. For the Web, do NOT assume everyone starts at the home page. Simulate entry points from Google searches, shared links, or whatever is the really likely way they will enter the site. Check on sent emails and SMS, and make sure the links to the camera or maps all work.
Don't get lost - Make a chart of what you are going to test, both process and platforms, and mark it off as you go. Some of these can take all day. Or two days. You will forget what you did or where you left off at lunchtime.
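The chart described above can be as simple as a grid of every device, view and orientation you plan to check, ticked off as you go. The following is a minimal sketch only; the device and view names are example values, and `build_chart`, `mark_done` and `remaining` are hypothetical helper names, not part of any tool mentioned here.

```python
from itertools import product

# Hypothetical example values; substitute your own devices and views.
devices = ["Galaxy S4", "Galaxy Nexus", "Nexus 7"]
views = ["Sign in", "List", "Detail"]
orientations = ["portrait", "landscape"]


def build_chart(devices, views, orientations):
    """Map every (device, view, orientation) combination to False (not yet checked)."""
    return {combo: False for combo in product(devices, views, orientations)}


def mark_done(chart, device, view, orientation):
    """Tick off one combination; raise KeyError if it was never in the plan."""
    key = (device, view, orientation)
    if key not in chart:
        raise KeyError(f"not in the plan: {key}")
    chart[key] = True


def remaining(chart):
    """Return the combinations still to be checked, in planned order."""
    return [combo for combo, done in chart.items() if not done]


chart = build_chart(devices, views, orientations)
mark_done(chart, "Galaxy S4", "Sign in", "portrait")
print(f"{len(remaining(chart))} of {len(chart)} checks left")
```

Printing `remaining(chart)` after lunch tells you exactly where you left off. A paper grid or a spreadsheet tab does the same job; the point is that the plan exists before you start.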
All those devices you have identified (and acquired) need to get charged and laid out in front of you. If you worry it's expensive, get used ones and no plan. WiFi works fine for all this. Check connectivity, and get the app installed if that's what you are testing. Go to screen settings and make sure they do not sleep. If worried about power consumption, get cables and plug into the wall. I have a favorite 11 port USB hub so I have less electrocution risk, but do arrange all this early, so you don't interrupt the process with overhead like charging a phone.
Prepare for screen capture. I strongly prefer Dropbox, as it can be set to automatically load images (including screenshots) to a folder which you can get to from your desktop computer. Easy. A lot easier than emailing, or trying to make iPhoto work.
A Process for Evaluation
Now you are sitting at your desk with an array of devices in front of you. Now what? Well, my method varies depending on what you are testing.
Web - I tend to lay out all the devices in front of me, then do each view on every device at the same time. It is much easier to compare the differences this way, because humans notice change and difference very well, and then to decide which ones really matter.
Apps - I tend to do one straight through. For the most part, apps just work and the variations per device/OS-level are very minor or so catastrophic you cannot miss them. So, not as important to compare devices. Pick one it should work on, finish that, then note the others separately as having bugs or not.
- For apps launched at the same time on multiple platforms (iOS vs. Android), you need to deal with them as almost separate products. Make the report two tabs in the same spreadsheet so you turn it in at once, but platform variations are strong enough you should complete one platform before starting the next.
Don't forget the other devices you need around to test:
Record it - This is a big enough topic I moved it to the very end. But you have to also set up a device to type your evaluation into. Probably a computer and keyboard, but it must be a different device from one you test on, or you will forget what you are doing, or change the results with constant task switching.
Refer to specifications - Yup, you need another monitor or maybe another device so you can see the original design spec. Obviously, not applicable if reviewing something from the competition, or an old project with no documentation.
Evaluate Views
Say it with me: There are no pages. We have to stop using this word. Believing there are pages leads to lots of missed opportunities and missed bugs. Think in terms of views and states. That means you evaluate every time the data changes, every time you open an accordion, press a multi-selector field, type into a text field, or open a dialogue.
Test interaction as well. Do the tabs work? Great. Do they look good during the transition? Do alternative methods (gesture) work? Both ways?
Do not get hung up on being pixel perfect. Close enough is close enough, with all the variations. If you didn't specify the margin or size in your design document, it probably isn't critical. Think before opening a bug for every single thing that is not like your design.
For each view, I check these items in more or less this order, depending on the product:
- Load in portrait mode.
- Check that all components are visible.
- Check the color and contrast for legibility and readability of all components.
- Is the size of all type suitable for the size of the device and the viewing distance?
- Are all touch targets the right size?
- Is there any interference between touch targets?
- Confirm touch and gestures do not interfere with each other. Check gestures in multiple areas, and try moving while on click areas to make sure gestures do not submit tap actions.
- Check press-and-hold actions operate correctly. Check that tap actions do not have unexpected behaviors when pressed and held.
- Check the virtual keyboard layout (or hardware keyboard constraints) on each input field to assure it is in a suitable mode. Numeric entry should pull up the right keyboard, not force the user to change modes.
- Check form entry, errors and submission in general.
- Make sure all other applications link properly. If a Share link loads the email client, make sure the correct data is passed in a valid format, for example.
- Are all targets reachable from conventional grasping positions? Are there any critical labels or functions obscured by touchable items?
- Make sure all designed haptics (vibratory feedback) operate as expected, and are perceptible in the user's expected context.
- Assure that all auditory feedback (sounds, readback, etc.) operates as designed, and is audible in the expected user environment.
- Check for visibility of all items. Is anything covering or overlapping other items?
- Make sure all images are properly-displayed. Check for crispness, proper aspect ratio, and that transparency works correctly.
- Check for consistent margins and gutters.
- Make sure everything is aligned properly, especially with immediately-adjacent elements.
- Look for overly-narrow spaces around icons, words and other components that make them hard to read or disguise divisions between hierarchical components.
- Check that gradients are properly aligned, oriented (aren't upside down) and are the right color. Programmatically-defined gradients can vary by OS version so work in one version, but look wrong in another.
- Read all text. Make sure there are no mis-spellings, and that nothing is cropped off or overlapping another component.
- Make sure it all works. Does clicking something cause the expected action to occur?
- Do all device (or browser) functions work? Check Back, Home, and menu buttons (or any others) work as expected. Be sure to follow OS guidelines regardless of what specifications say; in Android the Back and Menu buttons should almost always work, even if they are solved in other ways inside your application.
- Switch to landscape.
- Especially watch for the transition itself. Make sure items do not jump around, or that it does not take too long to re-render.
- Make sure items appear in the proper location, and stretch to fit or move into multiple columns, as expected. Look for unnecessary gaps.
- Re-confirm all items from the portrait list. Especially confirm that no items are cropped or covered.
- Enter information into form fields, and perform the transition. Assure the entered content is in place (not cleared), and the field is in focus still.
- Switch back to portrait, and observe the same transitions to make sure this works also.
- Confirm any remote connected devices required to make the application work will connect and exchange data.
- Force a switch to any other input devices; slide out keyboards for example.
- Use voice input, or other control features that are built into or accessed by the application.
Yes, the above list is fully inclusive, so not every product or interface will do all of these. If you think I missed something, it may just be that it's about your project alone. That's fine. This is a list of guidelines, but in many ways test plans are based on accepting the specification for your product. Test against that, certainly! Do be sure to measure when possible. Whether you use a screenshot (carefully, know your scales) or measure directly with a tool like a Touch Template, be sure to not trust your eyeballs. Confirm.
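Knowing your scales when measuring from a screenshot mostly means converting device pixels to physical size using the screen's pixel density. The following is a rough sketch under stated assumptions: `px_to_mm` and `target_ok` are hypothetical helper names, and the minimum size passed in is a placeholder you would take from whatever touch guidelines you are testing against, not a value recommended here.

```python
MM_PER_INCH = 25.4


def px_to_mm(pixels, ppi):
    """Convert a length in device pixels to millimeters for a screen of the given ppi."""
    if ppi <= 0:
        raise ValueError("ppi must be positive")
    return pixels / ppi * MM_PER_INCH


def target_ok(width_px, height_px, ppi, min_mm):
    """True if the smaller side of a touch target is at least min_mm on this screen."""
    smaller_side_mm = min(px_to_mm(width_px, ppi), px_to_mm(height_px, ppi))
    return smaller_side_mm >= min_mm


# A 96 x 96 px target measured in a screenshot from a 320 ppi screen.
side_mm = px_to_mm(96, 320)
print(f"{side_mm:.2f} mm per side")

# min_mm=7.0 is a placeholder threshold, not a recommendation.
print(target_ok(96, 96, ppi=320, min_mm=7.0))
```

The same pixel count is a very different physical size on a low-density tablet and a high-density phone, which is why a measurement from one device's screenshot cannot be reused for another.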
In an ideal world, with infinite time, after doing the evaluation on the actual hardware it is good to take screenshots and compare design to the actual built product side by side. I have discovered things doing this that were not clear on the screen. My favorite example was a 5 px black line around the entire application, that could not be seen when on the handset (it blended into the device bezel).
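Screenshots also let you put a number on the color and contrast check: sample a foreground and a background pixel and compute a ratio. The sketch below uses the WCAG 2.x relative-luminance formula, which is an assumption here, since no specific method is named in this article. A good ratio on paper does not replace checking legibility in the lighting your users actually work in.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to linear light, per the WCAG 2.x definition."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) color with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


# Colors sampled from a screenshot, as (R, G, B) tuples.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # 21.0
```

WCAG level AA asks for at least 4.5:1 for normal-size text; treat that as a floor for comparison, not as proof that the text is readable outdoors or at arm's length.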
Recording Your Findings
It is critical to take good, clear notes. You will find many things worth noting, so you cannot rely on memory. Really, you can't. As you find more issues your brain will exaggerate or forget previous issues. Write them down, as you find them.
My favorite way is a spreadsheet. My favorite by far is Google Spreadsheet, as it can be shared, so multiple practitioners can work on one document, or it can be shared for the team to discuss what to do about the findings. See this example. Yes, real ones are much longer, but it was easier to sanitize a real one by just keeping a few real and generic points so pretend it's very long.
- You may need to add items to a bug tracking database instead. I still like lists and spreadsheets for discussion and tracking. Very often, you may find the same issue on multiple pages, which clarifies the root cause so instea
The large checklist of hundreds of heuristics requires us to assign a value to each point. Meaning you say what is good as well as what is bad. I tend not to do this and only note the bad. Not just because I am mean and only like to talk about the bad things. Instead, for two reasons:
- Not much should be bad. When things get so bad that there are 100 items per page, I usually give up and write a different type of report talking about fundamental issues that need to be solved. You will hopefully find just a few items per view, and it won't be so overwhelming to just talk about things that should be improved.
- Implementation teams don't know what to do with praise. Oh, sure, say it's lovely otherwise (if true) but if you give a list of good and bad points, they may all be added to a bug database, and there will be attempts to solve the good points. Really. I have seen this.
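If you want a starting point for the findings spreadsheet, the sketch below writes problem-only findings as CSV text (which any spreadsheet can import) and groups them by issue, so the same problem showing up on several views stands out. The column names, the sample rows, and the helpers `to_csv` and `views_by_issue` are all hypothetical; they are not taken from the example spreadsheet referenced above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical columns; use whatever your team needs.
FIELDS = ["view", "device", "issue", "severity"]

# Made-up sample rows for illustration. Only problems are logged, no praise.
findings = [
    {"view": "Sign in", "device": "Galaxy S4", "issue": "Label cropped", "severity": "high"},
    {"view": "Settings", "device": "Galaxy S4", "issue": "Label cropped", "severity": "high"},
    {"view": "List", "device": "Nexus 7", "issue": "Uneven margins", "severity": "low"},
]


def to_csv(rows):
    """Return the findings as CSV text, with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def views_by_issue(rows):
    """Group view names by issue text, so repeated issues point at a shared root cause."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["issue"]].append(row["view"])
    return dict(grouped)


print(to_csv(findings))
for issue, views in views_by_issue(findings).items():
    print(f"{issue}: {len(views)} view(s): {', '.join(views)}")
```

Grouping only works if the same problem is described with the same words each time, so it helps to settle on short, consistent issue names as you go.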