This post reflects the business environment I work in, namely the B2B publishing sector. However, I feel that any site would benefit from this approach if it uses a site search technology, a CMS and a specialised group of individuals.
Taxonomies on business-to-business websites are industry based, built around the communities that interact with their content. The problem many sites find is that the content evolves but the taxonomy cannot adjust to changes in what the content creators produce, or in which useful, popular content advertisers wish to sponsor.
Taxonomies should not be a snapshot in time but should be a living reflection of the markets they represent.
Taxonomy creep
This drift between the content and the taxonomy, or ‘taxonomy creep’, can only be combated by frequent reviews by the people who have the most contact with the site content. Usually this would be the website manager working with the site content creators and other specialists.
Taxonomy creep inevitably affects all sites, so there is a need to monitor and adjust the taxonomies without harming the user experience or the workflow of the content producers.
Here I set out a process that businesses can employ to ensure their taxonomies are accurate, that they reflect the industry, user groups and business objectives of the site, and that they make use of the web technologies and people available.
This process recognizes the evolving nature of what we produce and ensures that the users will find the content, enhancing their experience on sites.
Benefits of a search technology
Various types of software can automatically search a site’s content and map it to nodes in an existing industry taxonomy. This taxonomy has been built primarily for the global search engine crawlers and so takes a generic view of industry topics.
This forms a good basis for industry taxonomies within business-to-business websites. From there we can use research findings, personas, keyword research, the product team’s industry knowledge and any future content proposition plans to inform the site specific taxonomy.
This bespoke taxonomy, with a targeted categorization rule base, allows the content to be categorized in a manner that is accurate (using a categorization tool) and automatic.
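To make the idea of a categorization rule base concrete, here is a minimal sketch in Python. It is not how any particular search product works: the category names, the keyword patterns and the categorise function are all assumptions made up for illustration, and a real classification engine would use far richer rules.

```python
import re

# Hypothetical rule base: each site specific taxonomy category maps to a
# list of keyword patterns. Names and patterns are illustrative only.
RULES = {
    "Packaging": [r"\bpackaging\b", r"\bcartons?\b"],
    "Logistics": [r"\blogistics\b", r"\bfreight\b", r"\bhauliers?\b"],
}


def categorise(text, rules=RULES, min_hits=1):
    """Return the categories whose patterns match the text at least min_hits times."""
    matched = []
    for category, patterns in rules.items():
        # Count every case-insensitive match of every pattern for this category.
        hits = sum(len(re.findall(p, text, flags=re.IGNORECASE)) for p in patterns)
        if hits >= min_hits:
            matched.append(category)
    return matched


print(categorise("New freight rules hit hauliers"))
```

Raising min_hits is one simple way to trade coverage for accuracy: an article then needs several matching terms before it is filed under a category.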
Many search technologies are designed to operate globally, categorizing and indexing millions of pages against robust taxonomies so that users can find information from around the web. Using this capability within a site enables the search engine to aggregate content around topic areas, which can then be used to give users extra, valuable information.
With a site specific taxonomy (SST) and a search engine taxonomy (SET) combined, extra features can be added to the site.
- Landing pages can be automatically created on agreed terms that have been highlighted by site owners.
- Pages can be tagged using contextual links to point to these pages automatically, using words or phrases pre-determined by the SEO team and the site managers.
- Articles can be listed by author and shown with related content. Displaying tags alongside an article, each pointing to a landing page or to the search results page for that topic, increases relevance.
- The SET is also enhanced as it can take the new categories back to the global indexing engine.
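The contextual linking described in the list above could be sketched roughly as follows. This is an illustration under assumptions, not a description of a real CMS feature: the LANDING_PAGES map, its URLs and the add_contextual_links function are hypothetical, and the sketch links only the first occurrence of each agreed phrase.

```python
import re

# Hypothetical map of agreed phrases to landing page URLs, as might be
# supplied by the SEO team and site managers. Keys must be lower case.
LANDING_PAGES = {
    "supply chain": "/topics/supply-chain",
    "packaging": "/topics/packaging",
}


def add_contextual_links(text, landing_pages=LANDING_PAGES):
    """Link the first occurrence of each agreed phrase to its landing page."""
    if not landing_pages:
        return text
    # Longest phrases first, so a longer phrase wins over a shorter one
    # that starts at the same position.
    phrases = sorted(landing_pages, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    linked = set()

    def replace(match):
        key = match.group(0).lower()
        if key in linked:
            # Already linked once in this article: leave later mentions alone.
            return match.group(0)
        linked.add(key)
        return f'<a href="{landing_pages[key]}">{match.group(0)}</a>'

    return pattern.sub(replace, text)


print(add_contextual_links("Packaging and the supply chain. More packaging."))
```

Doing the substitution in a single pass avoids accidentally matching a phrase inside a link that has just been inserted.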
With this increased level of user interaction, the number of pages viewed should rise.
However, this assumption depends on two key qualities: relevance and timeliness.
Different situations will require a different weighting but relevance will always be the key to a serendipitous user experience.
The relevance, or ‘aboutness’, of a page is driven by the ability of the website CMS to categorize content automatically and accurately. In some situations extra coding will be required to formulate rules (within the classification engine) that make this possible.
The review process explained
Step 1 – Look at existing taxonomy and identify gaps.
Gaps will be highlighted by documents created by specialists such as an IA, an SEO expert or a site manager (taxonomies, controlled vocabularies, content proposition plans and navigation schemes).
In publishing websites, journalists may alert editors that articles are not being classified correctly, which would prompt a mapping of the SET to the SST.
Mapping the SET ensures that we gain a perspective from the industry. The content producers then align their view of the subject matter with it, backed up by the research documents mentioned above.
Finally, by using research on the users (personas, web metrics and keyword research) we can ensure that the suggested new SST sits well. This ensures the content, users and business context are addressed.
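As a rough illustration of Step 1, the mapping between the two taxonomies can be held as a simple lookup and checked for gaps in both directions. Everything here is assumed for the example: the SET node paths, the SST category names and the find_gaps function are hypothetical.

```python
# Hypothetical mapping from search engine taxonomy (SET) nodes to
# site specific taxonomy (SST) categories. All names are illustrative.
SET_TO_SST = {
    "Business/Transport/Freight": "Logistics",
    "Business/Transport/Warehousing": "Logistics",
    "Business/Manufacturing/Packaging": "Packaging",
}


def find_gaps(set_nodes_in_use, sst_categories, mapping=SET_TO_SST):
    """Return (SET nodes with no SST home, SST categories receiving no content)."""
    # SET nodes the content is being classified to that the SST does not cover.
    unmapped = sorted(n for n in set_nodes_in_use if n not in mapping)
    # SST categories that none of the nodes in use map to.
    used = {mapping[n] for n in set_nodes_in_use if n in mapping}
    unused = sorted(set(sst_categories) - used)
    return unmapped, unused


nodes = ["Business/Transport/Freight", "Business/Energy/Solar"]
print(find_gaps(nodes, ["Logistics", "Packaging"]))
```

The first list suggests candidate new categories; the second suggests categories to review for consolidation or renaming.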
Step 2 – Creation and consolidation of categories and rule development
After the gap analysis of the SST it may be the case that new categories need to be created and rules written for the SET categorisation tool. Bespoke channels may also need to be addressed within the presentation of the site through the UI but these areas will also be a part of the SST. This ensures all useful content will be retrievable through search and browsing.
At this stage the corpus is tested against the SET categorisation tool using existing rules. The results of the test will show whether it is necessary to create new categories within the SET and, in turn, the SST.
It may also be evident that certain categories need consolidating as there is not enough content to occupy these areas. This needs to be done in a considered manner, with a view to future-proofing where possible, as changing a navigation item in the UI can confuse users.
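One way to read the test results from this step is to count how many documents land in each category. The sketch below assumes the categorisation tool's output can be exported as a mapping of article ID to assigned categories; that format, the min_docs threshold and the review_categories function are assumptions for illustration.

```python
from collections import Counter


def review_categories(assignments, categories, min_docs=5):
    """Flag thinly populated categories and documents that matched nothing.

    assignments maps an article ID to the list of categories assigned to it
    (a hypothetical export format from the categorisation tool).
    """
    counts = Counter(cat for cats in assignments.values() for cat in cats)
    # Categories with too little content are candidates for consolidation.
    thin = sorted(c for c in categories if counts[c] < min_docs)
    # Documents with no category hint at missing categories or rules.
    uncategorised = sorted(doc for doc, cats in assignments.items() if not cats)
    return thin, uncategorised


results = {"a1": ["Logistics"], "a2": ["Logistics", "Packaging"], "a3": []}
print(review_categories(results, ["Logistics", "Packaging"], min_docs=2))
```

The threshold is a judgment call for the team; a category that is thin today may still be worth keeping if the content proposition plans show it growing.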
Once this sequence of work is complete the new rules can be developed (if necessary).
Step 3 – Testing stage
The corpus is tested against the SET categorisation tool using the new rules. This will result in the corpus being reclassified to the new topics in the site taxonomy.
The results of the categorisation will need to be checked by taking a random selection of documents and confirming that they are categorised to a standard that the team is happy with.
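The random check could be sketched like this. The sample size, the seed and the idea of recording each reviewer verdict as True or False are assumptions for the example rather than part of any tool.

```python
import random


def sample_for_review(article_ids, sample_size=50, seed=None):
    """Pick a random selection of article IDs for manual checking."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    ids = list(article_ids)
    return rng.sample(ids, min(sample_size, len(ids)))


def accuracy(review_results):
    """Share of sampled articles the reviewers judged correctly categorised.

    review_results maps an article ID to True (correct) or False (incorrect).
    """
    if not review_results:
        return 0.0
    return sum(review_results.values()) / len(review_results)


sample = sample_for_review(range(1000), sample_size=10, seed=1)
print(len(sample), accuracy({"a1": True, "a2": False}))
```

The team can then agree a pass mark in advance, which gives the quality assurance step a clear stopping point.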
Step 4 – Quality Assurance
If the results are judged to be inadequate, the team or a third-party resource (depending on budget and complexity) refines the new rules and then uses the SET categorisation tool to reclassify the documents, repeating until the results are deemed satisfactory.
Step 5 – Quarterly Review
Once implemented, the taxonomy will need to be revisited in three months, when the sequence begins again.
Who should be involved with the process?
A review every quarter should be a key task for every website manager or product owner. They will need to set up a procedure for reporting when the categorisation fails, so that failures are recorded and analysed as part of the review process. This may be as simple as a spreadsheet listing the article ID and the details of each failure. Though this may initially be demanding on their time, it will improve the classification engine and the workflow within the team.
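If the failure report is kept as a CSV file rather than a hand-edited spreadsheet, it can be summarised automatically at review time. The column names and both functions below are hypothetical; the sketch uses an in-memory buffer so that it runs on its own, but an open file would work in the same way.

```python
import csv
import io
from collections import Counter

# Hypothetical report columns: the article ID plus the details of the failure.
FIELDS = ["article_id", "assigned_category", "expected_category", "notes"]


def log_failure(handle, article_id, assigned, expected, notes=""):
    """Append one categorisation failure to the report."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    if handle.tell() == 0:
        writer.writeheader()  # first entry in an empty report
    writer.writerow({
        "article_id": article_id,
        "assigned_category": assigned,
        "expected_category": expected,
        "notes": notes,
    })


def summarise(handle):
    """Count failures by the category the article should have been given."""
    handle.seek(0)
    counts = Counter(row["expected_category"] for row in csv.DictReader(handle))
    handle.seek(0, io.SEEK_END)  # leave the handle ready for further appends
    return counts


report = io.StringIO()
log_failure(report, "123", "Packaging", "Logistics", "freight story misfiled")
log_failure(report, "124", "", "Logistics", "no category assigned")
print(summarise(report))
```

Categories that appear often in the summary are the first places to look when refining rules at the quarterly review.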
The search engine technology team need to be involved at all stages of the process, to enable testing of the rules base with the content. They also need to interact with the site web developers to ensure a test environment is present and they can report results.
SEO specialists would also be able to provide the website editor with a list of top keywords from which landing pages could be built. This list would also help inform the direction of the taxonomy and the naming of navigation items.
The user experience team would provide personas and the information architect would provide any documents relating to information organisation or site structure. The web analytics team will also provide usage stats of the site and how the users interact with it. This will help the web editor make decisions around consolidation, creation or naming of the taxonomy and navigation items. These specialists should be notified of any changes to the user interface which may occur through re-ordering of site categories.
Finally, and most importantly, content creators need to be in constant dialogue with the website managers to ensure the taxonomy supports the content that they are writing.
The long term view
The benefits of a smoother workflow and a better user experience in the long term outweigh any potential for labour intensive activity in the short term. There will be some initial work to do to get site taxonomies up to an adequate standard that the website manager, the content creators and the specialist teams can be happy with.
By implementing a taxonomy creation and maintenance strategy, one can be confident that automatically categorized content will be findable by the user and will be accurately referenced within the site architecture.
Employing a taxonomist would always be the best option if it is at all possible. But ensuring that those who work on the website know the importance of a good taxonomy to the user experience should be enough for the taxonomy to be managed effectively and reviewed frequently.




Very interesting post, though I don’t understand how landing pages would be automatically generated. What would be their content?
The content is created in the traditional way – using a CMS.
However, because the site taxonomy and the classification engine of the search technology work together, the search technology crawls the site and brings together content around a search term or taxonomy node.
Hope that answers your question…
I found your blog by chance, but I have to say that it’s a great blog with very useful information and very interesting subjects. Greetings and good luck.
I’m not going anywhere; I will always be checking for updates. I’m very interested in CMS and all its related subjects.