I’ve written before on an approach called Content Value Analysis (CVA) and I’ve now produced the detail of how this might be done in practice. Chiara Fox of Adaptive Path in her presentation on Content Analysis came closest to this approach when she talked about Content Audits at Euro IA 2008.
However, there is a crucial difference between our approaches – the use of statistical techniques and documented heuristics. No, don’t reach for that mouse yet! What I’m suggesting is not difficult, will give great substance to your final analysis and will probably save you a lot of work.
When you try to tell an intranet manager or a content owner the hard truth (e.g. their content sucks), it can turn into a ‘yes it does – no it doesn’t’ sort of debate that usually goes nowhere. By introducing random sampling and using a statistically calculated sample size it will be possible to state with a high degree of confidence that –
– The results have a very high probability of being representative of all the pages in the site or sub-site being assessed
– All sample pages were arrived at randomly and the process was free of any bias
In order to achieve this you need three things –
Sample size – Using the correct sample size means that you will assess the minimum number of pages needed to be confident in your result, whilst also ensuring that you don’t waste time and effort assessing more pages than you need to. Assessing more pages than the sample size will add little to the accuracy of the result.
Sample size can be quickly calculated using a statistical sampling table. The one I have pointed to is used by the US military for assessing deliveries of parts in ‘lots’ or ‘batches’. I would suggest using an AQL of 0.25% (Acceptable Quality Level is normally the percentage of parts you would accept as being defective, but in our case I take it to mean that, at worst, there would be a 0.25% or 1 in 400 chance that we were wrong). In this case a web site containing 1,000 pages would require a sample size of 75 and a web site containing 15,000 pages would require a sample size of 135. As the ‘lot’ size (here, the size of the web site) is presented in ranges, the number of pages only needs to be approximated.
If there is little time, or there is a need to benchmark a lot of sites, a higher AQL such as 2.5% might be selected, which gives a smaller sample: the sample size for a web site of 15,000 pages would be 35. The downside of this is that, at worst, the probability that you might be wrong rises to 1 in 40. However, if all you are looking for is a ‘ball park’ estimate, this ‘CVA lite’ approach might be of value.
Random selection of pages – This took a bit of thought and only when I remembered using my old calculator to generate random numbers years ago did the penny drop. All you need is a set of random numbers (I used random.org). You can set the maximum number generated, so if you know that the most navigation links on any page is 8 then set the limit to 8. Ring the numbers into groups at random, with as many in each group as you like (make sure that there are some single numbers in there too). Then, starting from the home page, use each number in a group to select a link: if the first number is 7 select the seventh link in the navigation bar, if the next number is 2 select the second link on the page you land on, and so on until you reach the last number in the group. The page you arrive at is the page you will assess. If you find that a particular group of numbers does not work (for example, the page has fewer links than the number calls for), simply go to the next group and try again, although my approach would be to include contextual links as well if they are used. Once the page has been assessed, carry on going deeper into the site with the next group until you can go no further. Then return to the home page and start again. Carry on until the sample has been completed.
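For anyone who would rather script the selection than ring numbers by hand, here is a minimal Python sketch of the walk described above. It is only an illustration under my own assumptions: the site is modelled as a made-up dictionary of page names and their ordered links (TOY_SITE), the function names follow_group and select_sample are hypothetical, Python's own random module stands in for random.org, pages already in the sample are skipped, and a max_attempts limit stops the loop on very small sites.

```python
import random

def follow_group(site, start, group):
    """Follow one group of 1-based link positions, starting from `start`.

    Returns the page reached, or None if a number is larger than the
    number of links on the current page (the group 'does not work').
    """
    page = start
    for position in group:
        links = site.get(page, [])
        if position > len(links):
            return None
        page = links[position - 1]
    return page

def select_sample(site, home, sample_size, rng,
                  max_link=8, max_len=3, max_attempts=10_000):
    """Collect pages to assess by repeated random walks from the home page."""
    sample = []
    page = home
    for _ in range(max_attempts):
        if len(sample) >= sample_size:
            break
        if not site.get(page):
            page = home  # dead end: return to the home page and start again
        # One group: between 1 and max_len numbers, each from 1 to max_link.
        group = [rng.randint(1, max_link)
                 for _ in range(rng.randint(1, max_len))]
        target = follow_group(site, page, group)
        if target is None:
            continue  # this group does not work here; try the next one
        if target not in sample:  # assumption: skip pages already sampled
            sample.append(target)
        page = target  # carry on going deeper from the assessed page
    return sample

# Hypothetical toy site: each page maps to its links in navigation order.
TOY_SITE = {
    "home": ["about", "news", "hr"],
    "about": ["team", "history"],
    "news": ["news-1", "news-2"],
    "hr": ["policies"],
    "policies": ["leave", "expenses"],
}

if __name__ == "__main__":
    print(select_sample(TOY_SITE, "home", 4, random.Random(), max_link=3))
```

On a real site the dictionary would have to come from a crawl or be replaced by clicking through by hand; the sketch only shows how the groups of numbers drive the choice of page.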
Heuristics – So OK, you now know how many pages you need to look at and you can also select them randomly, but once you arrive at a page, what then? You need to document the heuristics that you are going to use and ensure that the intranet manager or content owner also gets a copy before the assessment so that they can’t argue afterwards. My suggestions are as follows –
No content value = 0
Good content value = 1
Obsolete content = 0
Content outside review date = 0
Irrelevant content = 0
Incomplete content = 0 to 0.5 depending on severity
Content not reflecting the link title = 0 to 0.5 depending on severity
Context of page not clear (when assessing, it will be like arriving from a search page – Is it clear where you are? Does the content make sense?) = 0 to 0.5 depending on severity
Over usage of jargon and acronyms = 0.5
Bad grammar and spelling = 0.5
Once you have assessed a page, any single score of zero gives the page an overall score of 0, and any two scores of 0.5 also give the page an overall score of 0.
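The scoring rule can be sketched in a few lines of Python. The function name page_score is hypothetical, and two points are my own assumptions rather than part of the rule as stated: a page with exactly one partial score keeps that score as its overall score, and any partial score below 1 (not just exactly 0.5) counts towards the ‘two strikes’ rule.

```python
def page_score(heuristic_scores):
    """Combine the scores recorded for one page into an overall page score.

    `heuristic_scores` holds one number per heuristic that was scored,
    each between 0 and 1 (1 meaning no problem was found).
    """
    # Any single zero means the page has no content value.
    if any(score == 0 for score in heuristic_scores):
        return 0.0
    # Two or more partial scores also reduce the page to zero.
    partial = [score for score in heuristic_scores if score < 1]
    if len(partial) >= 2:
        return 0.0
    # Assumption: a single partial score becomes the page score;
    # a page with no problems recorded scores the full 1.
    return float(partial[0]) if partial else 1.0
```

So a page marked down only for jargon would score 0.5, while a page marked down for both jargon and bad grammar would score 0.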
You will also need to record each page visited and the details of the assessment (I’m putting together a form that I intend to use myself; if you want a copy, please post a comment). Then you need to analyse the results and look for patterns. This part is vital. Do it well and it will provide incontrovertible evidence of the content value. You can go down the route of calculating the standard deviation, which will give you a predicted range of results, but in most cases an average figure presented as a percentage will be more than sufficient. Percentages are great – if you can say that on average 48.9% of a site’s content sucks then you are more likely to be believed.
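As a rough illustration of the analysis step, the average and the standard deviation can both be worked out with Python's standard statistics module. The function name summarise and the keys of the dictionary it returns are my own invention, and the scores passed in are assumed to be the overall page scores (0, 0.5 or 1) from the assessment.

```python
from statistics import mean, stdev

def summarise(page_scores):
    """Turn a list of overall page scores (0 to 1) into headline figures."""
    average = mean(page_scores)
    # The sample standard deviation needs at least two pages.
    spread = stdev(page_scores) if len(page_scores) > 1 else 0.0
    return {
        "pages_assessed": len(page_scores),
        "content_value_pct": round(average * 100, 1),
        "lacking_value_pct": round((1 - average) * 100, 1),
        "std_dev": round(spread, 3),
    }

if __name__ == "__main__":
    # Made-up scores for four pages, purely to show the output shape.
    print(summarise([1, 0, 1, 0]))
```

The "lacking_value_pct" figure is the one that matches a statement like ‘X% of the site’s content sucks’.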
Ensure that the results are presented in easy-to-read graphics so everyone can understand the analysis. Strangely, most managers I’ve worked with don’t question what I’m telling them if I present the information slickly as a graph or bar chart.
This may seem a lot of work but it is much easier than carrying out a content inventory, and for a couple of days’ work you will get an extremely accurate snapshot of the content value of a site. A snapshot that will be very hard to argue against.
[Figure: CVA Flowchart]
You can also download the Excel CVA form from the Information Architecture Institute’s Tools site
(Thanks to Kate Andrews, ‘Here’s Kate’, for the Flickr photo, used as a metaphor for how most intranets might look in real life!)