Tree of Data / Sea of Data / Web of Data

Posted on: May 18th, 2005 5:14 PM GMT

By: Greg Reimer (Code Monkey Extraordinaire)

Topic: tech, web, information architecture, information modelling, xtm, topic maps

Helping users find relevant information is a big problem when putting together a large, multifaceted web site. Information Architects (people who more or less design "navbars") deal with this issue regularly, so I thought I'd lay out some of my thoughts on it, since I've been involved with both web design and application information modeling at various points in the past. Users find relevant content through three navigational idioms: Tree of Data, Sea of Data and Web of Data (TOD, SOD and WOD). (Say them rather than spell them to increase the fun level of reading this.)

In the TOD, content is arranged in the familiar tree hierarchy, similar to folders and directories in a filesystem. You have branch nodes and leaf nodes defining section titles and pages. The huge advantage of the TOD is this: the way the site is organized and navigated is visible and easy to comprehend, which feels familiar and safe to users. The disadvantage is that it's rigid, limiting any association between content resources to the parent/child/sibling relationships already defined in the tree. Also, no solid convention exists regarding whether a branch node should double as a page of its own. In other words, should the title over a list of sub-page links itself be a link? If so, should it link to a section summary page, or to the first in the list of sub-pages? Finally, navigating the TOD can force a user to make lots of clicks as they try to find a certain resource, especially if they initially "go down the wrong hole".
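To make the "lots of clicks" point concrete, here's a minimal sketch of a TOD as a plain tree. The page titles and the `Node` and `path_to` names are made up for illustration; the number of clicks is just the depth of the page you're after.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One entry in the site tree: a section (branch) or a page (leaf)."""
    title: str
    children: list[Node] = field(default_factory=list)


def path_to(root: Node, title: str) -> Optional[list[str]]:
    """Return the titles from the root down to `title`, or None if absent."""
    if root.title == title:
        return [root.title]
    for child in root.children:
        sub = path_to(child, title)
        if sub is not None:
            return [root.title] + sub
    return None


# Hypothetical site layout, not taken from any real site.
site = Node("Home", [
    Node("Products", [Node("X2000 Workstation", [Node("Specs")])]),
    Node("Support", [Node("Downloads")]),
])

path = path_to(site, "Specs")
clicks = len(path) - 1  # one click per level below the home page
print(" > ".join(path), f"({clicks} clicks)")
```

Note that this only counts the happy path. A user who first wanders into "Support" pays for the trip back up as well.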

The second idiom is the SOD, where users find content by keyword search. This involves a nice, short path to the target resource: the user types a bit, presses a button and clicks a link. Even if the number of pages is in the millions or billions, search works well. The disadvantage is that the structure of a site is unknowable as seen through the SOD paradigm, and the user may feel like they're swimming in a sea of disorganization, searching only as a last-ditch effort to find what they need. The warm fuzziness of the TOD is non-existent.
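The keyword/URL association behind the SOD can be sketched as a tiny inverted index. This is only an illustration under simple assumptions (whitespace tokenizing, every query word must match); the URLs, page text and function names are hypothetical, and a real search engine does far more.

```python
from collections import defaultdict

# Hypothetical pages: URL -> text content.
pages = {
    "/products/x2000": "X2000 Workstation product specs",
    "/products/x2000/intro": "Meet the X2000 Workstation",
    "/support/downloads": "Driver downloads",
}


def build_index(pages):
    """Map each lowercase keyword to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(pages)
print(sorted(search(index, "x2000 specs")))
```

Notice that nothing in the index says how the pages relate to each other, which is exactly why the site's structure is invisible from inside the SOD.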

The third one is the WOD. The WOD maybe isn't as clearly defined as the other idioms, but it's the oldest on the web, since hypertext and the World Wide Web basically originated from it. A common manifestation of the WOD idiom is a TOD site that is richly crosslinked. A "what's related" section containing links to various relevant resources is another example. The WOD admits that the TOD is too rigid and that the SOD is too unknowable, so it creates a little bit of knowable structure around wherever the user currently is, whilst staying free of the rigidity of the TOD.
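A "what's related" section boils down to resource/resource links that ignore the tree. Here's a minimal sketch, assuming the links are symmetric (if A is related to B, then B is related to A), which is my simplification and not a rule of the WOD. The URLs and function names are hypothetical.

```python
from collections import defaultdict

# Each URL maps to the set of URLs it is cross-linked with.
related = defaultdict(set)


def relate(a, b):
    """Record a cross-link in both directions (symmetry is an assumption)."""
    related[a].add(b)
    related[b].add(a)


def whats_related(url):
    """Return the cross-links for one page, sorted for stable display."""
    return sorted(related.get(url, set()))


# Hypothetical cross-links that cut across the site tree.
relate("/products/x2000", "/support/downloads")
relate("/products/x2000", "/products/x2000/specs")

print(whats_related("/products/x2000"))
```

The structure is local: each page only knows its own neighbors, so the user gets something knowable right where they are without a global hierarchy being imposed.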

All three idioms are useful and necessary on a big website. The quality factor is in the relevance of the associations that form the links between the resources: parent/child for the TOD, keyword/URL for the SOD, and resource/resource for the WOD. For example, are the top-level links in the navbar the most useful to the target audience and the goals of the website, and are the sub-links relevant in their parent categories? Are the search results relevant to what the user typed? Are the "what's related" links helpful to the user, and are they well-maintained as the site grows and changes? What about context? When you throw things like p13n, l10n and i18n (personalization, localization and internationalization) into the mix, the issue gets more complicated. It isn't just the relevance of an association anymore, but the contextual relevance of an association. Why would an engineer care about marketing fluff when they're looking for product specs, for example? If you're building a dynamic website, what kind of monstrously complicated information model can encompass all of these things?

Incidentally, XML Topic Maps (XTM) are a good candidate, and they aren't actually all that complicated either. If a dynamic website were built on a topic map engine, and the topic map were well-built and maintained, it could auto-generate the TOD, SOD and WOD idioms with ease, plus do it in an internationalized, personalized, role-oriented, task-oriented or entitlement-oriented way, or any combination of the above and more. This is because topic maps have typed associations between topics, and typed contextual relevance between topics. So you can say the X2000 Workstation intro page is considered a "child" of the product page in the TOD hierarchy, but only relevant to it in a marketing sort of way. If an engineer is on the product page, they'll be spared the pain of possibly glimpsing the X2000 Workstation intro page's pretty colors and reading its jazzed-up verbiage.
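Here's a rough sketch of the idea of typed associations with contextual relevance. To be clear, this is not XTM syntax or any topic map engine's API; the `Association` class, the `children` function, the role names, and every page other than the X2000 intro page and the product page are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """A typed link between two topics, relevant only in certain contexts."""
    kind: str          # the association type, e.g. "parent-child"
    source: str
    target: str
    scopes: frozenset  # contexts where it applies; empty means everywhere


associations = [
    Association("parent-child", "Product page", "X2000 Workstation intro",
                frozenset({"marketing"})),
    Association("parent-child", "Product page", "X2000 Workstation specs",
                frozenset({"engineering"})),
    Association("parent-child", "Product page", "Where to buy",
                frozenset()),
]


def children(topic, role):
    """List the child topics of `topic` that are relevant to `role`."""
    return [
        a.target
        for a in associations
        if a.kind == "parent-child"
        and a.source == topic
        and (not a.scopes or role in a.scopes)
    ]


print(children("Product page", "engineering"))
print(children("Product page", "marketing"))
```

The same list of associations yields a different navbar per role, which is the point: the tree isn't stored anywhere, it's generated from the typed, scoped associations for whoever is looking.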

In any case, websites (especially the big ones) usually end up with some topic-map-like model, whether implicit or explicit, and however inconsistent or manual it may be. The quality of a website depends, IMHO, on how well its creators are able to grasp and formalize this model, and translate it into well-placed TOD, SOD and WOD idioms.


(Originally posted 08 Oct 2004 at my work blog)
