The problem was this: for users (internal staff and external consumers) to find what they needed and to discover content serendipitously, the metadata needed to be rigorously structured and to adhere to recognised standards.
Several kinds of metadata were important to the sharing and re-use of articles. This post highlights those areas and illustrates how we incorporated them into the metadata schema.
A lack of metadata in a file adversely affects not only search engine retrieval but also the working efficiency of collaborative teams. Data is lost, assets go unpublished and the effort spent creating them is wasted. Implementing metadata effectively in the documents we create and publish brings a competitive advantage in search as well as greater efficiency in our working practice.
Without metadata management, intellectual property rights are eroded and liability increases. Files such as images, PDFs, video and audio all need to be tagged to give the user or employee a way of finding valuable content.
A set of metadata standards should govern the implementation of a consistent and uniform metadata architecture. Consistency in metadata is important for sharing information across an organisation and for making optimal use of the document management tools that rely on it.
For metadata to be effective, it must be incorporated into the workflow from creation to publication. This emphasises the importance of any content producer making a concerted effort to synchronise their information management.
Several organisations are significant contributors to metadata standards. After analysis, the standards of the following organisations formed the basis for the metadata schema.
- Dublin Core Metadata Initiative (DCMI)
- International Press Telecommunications Council (IPTC)
- Publishing Requirements for Industry Standard Metadata (PRISM)
These standards are evident in the following formats, which are used to store metadata:
- Adobe’s Extensible Metadata Platform (XMP)
- Exchangeable Image File Format (EXIF)
- Resource Description Framework (RDF)
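To make the relationship between a standard and a format concrete, here is a minimal sketch of Dublin Core properties serialised as RDF/XML using Python's standard library. The function name `dublin_core_rdf`, the example URL and the property values are all hypothetical and chosen only for illustration. The two namespace URIs are the published RDF and Dublin Core element namespaces. This is not the tooling we used, just one way the idea could look.

```python
import xml.etree.ElementTree as ET

# Published namespace URIs for RDF and the Dublin Core element set.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to use the conventional prefixes when serialising.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def dublin_core_rdf(about, properties):
    """Build an RDF/XML string describing one resource.

    `about` is the identifier of the resource (hypothetical here) and
    `properties` maps Dublin Core element names, such as "title" or
    "creator", to plain string values.
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for name, value in properties.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Illustrative values only.
    print(dublin_core_rdf(
        "https://example.org/articles/1",
        {"title": "An example headline", "creator": "A. Writer"},
    ))
```

XMP packets embed RDF of broadly this shape inside a file, which is why the same Dublin Core properties can travel with an image or a PDF.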
To classify file format types we turned to the Internet Assigned Numbers Authority (IANA), which gave us a definitive list. Its registry of Multipurpose Internet Mail Extensions (MIME) media types covers the formats we used and those we may use in the future. For all other taxonomic labels (countries, regions etc.) we looked to the ISO classifications.
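As a rough illustration of classifying files by MIME type, the sketch below uses Python's standard `mimetypes` module, which maps file extensions to media types. The helper names `media_type` and `top_level_type` and the fallback to `application/octet-stream` are assumptions made for this example, not part of our schema.

```python
import mimetypes

# Generic fallback used when the extension is not recognised (an assumption
# for this sketch; a real workflow might flag the file for review instead).
UNKNOWN = "application/octet-stream"


def media_type(filename):
    """Guess the media type of a file from its extension."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or UNKNOWN


def top_level_type(filename):
    """Return only the top-level type, e.g. 'image' for 'image/jpeg'."""
    return media_type(filename).split("/", 1)[0]


if __name__ == "__main__":
    for name in ("report.pdf", "photo.jpg"):
        print(name, media_type(name), top_level_type(name))
```

Guessing from the extension is only a starting point, since a renamed file will be misclassified, but it shows how a controlled list of types can be applied automatically at the point of creation.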
The schema draws on four kinds of metadata:
- Descriptive metadata – describes the contents of a file
- Administrative metadata – data that cannot be retrieved or inferred from the content and that pertains to the management of the content within a system
- Rights metadata – asserts who owns the content and who may distribute it, and usually pertains to the usage of the document
- Technical metadata – data about the physical properties of the content
Each of the four areas serves a specific purpose. Descriptive metadata helps an item to be found, either by search or through a user interface element on the page. This applies online and also offline, in an application such as Adobe Bridge.
Drafting a schema
These are revealed through tools such as the Adobe suite of applications and in the content of the websites, where the ability to find our information is paramount to the quality of the user experience. Classifying the different aspects of our content is becoming increasingly important as new technologies based on XML come to fruition.
The list below outlines what we thought most useful to tag from the moment content was created to the moment it was published.
|Headline||Information Type (set)|
|Entity Type (set)||Genre|
|Date Modified||Date Published|
|Contact Information (set)||Job ID|
|Instructions||Description of writer|
|Creator||Creator Job Title|
|Copyright Notice||License Contact|
|Model Release||Property Release|
|Other Third Party Rights||Usage Rights|
|Still Image||Moving Image|
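One way to put a field list like this to work is to check each record for missing values before publication. The sketch below is hypothetical: the `SCHEMA` mapping covers only a handful of the fields above, and the grouping of each field under descriptive, administrative or rights metadata is my own assumption for illustration rather than the grouping used in the actual schema.

```python
from enum import Enum


class MetadataKind(Enum):
    DESCRIPTIVE = "descriptive"
    ADMINISTRATIVE = "administrative"
    RIGHTS = "rights"
    TECHNICAL = "technical"


# Illustrative subset of the field list. The kind assigned to each field
# is an assumption for this sketch, and no technical fields are included.
SCHEMA = {
    "Headline": MetadataKind.DESCRIPTIVE,
    "Genre": MetadataKind.DESCRIPTIVE,
    "Date Modified": MetadataKind.ADMINISTRATIVE,
    "Job ID": MetadataKind.ADMINISTRATIVE,
    "Copyright Notice": MetadataKind.RIGHTS,
    "Usage Rights": MetadataKind.RIGHTS,
}


def missing_fields(record):
    """Return schema fields that are absent or blank, grouped by kind."""
    missing = {}
    for field, kind in SCHEMA.items():
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.setdefault(kind, []).append(field)
    return missing


if __name__ == "__main__":
    draft = {"Headline": "An example headline", "Genre": "News"}
    for kind, fields in missing_fields(draft).items():
        print(kind.value, "missing:", ", ".join(fields))
```

A check of this kind can run at each stage of the workflow, so that gaps are caught while the person who knows the answer is still working on the content.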
The image below shows exactly how these fields are applied in practice.
Risks of not managing your metadata
Unless a unified metadata strategy is initiated, there are risks that:
- Data communication between enterprise applications will be flawed, negating any efficiency gained through hardware upgrades. Workflow will be inefficient and content will reside in silos.
- The investment in designing taxonomies will be wasted unless it is implemented at a technical level, which requires taxonomy management and data architecture. Failing to follow up a metadata implementation by employing a data architect, or someone responsible for metadata management, is a critical weakness in the enterprise’s information management.
- Little or no use can be made of existing content, which is often a valuable commercial resource. When a file has substandard metadata the content is immediately lost to us, yet we need to profit from its creation and value. Without a way of implementing a schema, content is effectively lost as soon as it is created.
With so many content creators working daily, the management of this information is our most important challenge. The content needs to be found easily, both internally amongst colleagues and externally amongst users.
Certain properties in the schema pave the way for future search technologies. For instance, entity types (such as brand, product, location, event) are complex in their variety. However, the entity type label allows that complexity of context to be stated and clarified.
Perhaps it's for this reason that I see metadata as the foundation of semantic search. Only with a rich metadata schema that incorporates several different facets will we start to enjoy highly advanced searches over content that has inherent relationships. The challenge for interaction and interface designers is to design an intuitive interface that allows searching in a unique, non-text-field way that is more exploratory than is possible at present.