Doing a Content Inventory (Or, A Mind-Numbingly Detailed Odyssey Through Your Web Site)

by Jeffrey Veen 18 Jun 2002 · 5 minute read

I’ve spent the last year working with clients on a variety of information architecture and design problems. One of the most strikingly consistent issues has been how many of these companies still haven’t developed content management systems. I’ve spoken with enterprises in the Fortune 100 that find themselves sitting on top of six years’ worth of Web content trapped in static HTML files. They know they need to get this stuff into a database and redesign their site into a template-driven system. But their first question is inevitably, “So, uh, where do we start?”

If you’re in a similar situation, your first step is to take stock of what you’ve got. This process, known as a content inventory, is a relatively straightforward matter of clicking through your Web site and recording what you find. We’ve developed a simple Excel spreadsheet to help you structure your findings, along with some tips on how to get through it.

Start at your home page and identify the major sections of your site. For example, we’ve divided our own site into these sections: team, services, workshops, publications, and contact. If I were doing an inventory of this site, I’d start with one of those sections, click in, and see what’s linked from it. For each page that I visit, I’d record the information specified in the columns of the spreadsheet. I’d follow every link and navigate as far as I could through the site, making sure to gather data about every page I can reach.

Here’s a description of the things I look for:

[Image: description of the spreadsheet columns]

After you’ve filled in a couple hundred lines of the spreadsheet, you’ll inevitably start to wonder if there is something — anything! — that can speed this process up. Surely technology can come to the rescue. Sorry. The best we’ve been able to do is enlist the help of a programmer to write us a script that will crawl a Web site and spit out the URLs it finds. And that merely ensures that we don’t miss any pages. Even with this head start, we always go through the pages by hand. A content inventory is a decidedly human task. In fact, we find that the process can often be as valuable as the final spreadsheet. If you invest the time in scouring your Web site and deconstructing every page (or at least a good selection of pages), you will end up as the uncontested expert in how it all goes together. And that’s invaluable knowledge to possess when redesigning your site.
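For the curious, here is roughly what a URL-listing script like the one mentioned above might look like. This is a minimal sketch, not the script our programmer wrote: the names crawl, fetch_html, and LinkParser are made up for illustration, it only follows ordinary links within a single host, and it assumes pages are served as plain HTML.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page; return its text, or None if it fails or is not HTML."""
    try:
        with urlopen(url, timeout=10) as response:
            if "html" not in response.headers.get("Content-Type", ""):
                return None
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return None


def crawl(start_url, fetch=fetch_html, max_pages=1000):
    """Breadth-first crawl of one host; returns the sorted list of URLs found.

    `fetch` is any callable that maps a URL to HTML text (or None), so the
    crawl logic can be tried out without touching the network.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        fetched += 1
        html = fetch(url)
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments so each page is listed once.
            link, _fragment = urldefrag(urljoin(url, href))
            # Stay on the same host; this also skips mailto: and javascript: links.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen)
```

Calling crawl with your home page address and pasting the resulting list into the first column of the spreadsheet gives you a checklist to work from. It tells you nothing about what is on each page, so the clicking and reading still falls to you.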