Hysteresis, History and empty metadata fields

March 6, 2006, 12:24 PM —  ITworld.com — 

When I was a teenager, some well-meaning teachers decided to put us boys through training in the fine art of metalwork. We had classes twice a week for a couple of years. It is all a blur to me now. All I remember is the heat, the sweat, the smell of oiled metal filings and the noise. I cannot find words to express how much I hated those classes.




Actually, there is one memory that rises above the blur and stays with me to this day. A single, gorgeous-sounding word: hysteresis [1]. That I should remember this obscure word rather than anything else from my metalwork classes is telling. You will not be surprised to hear that I do not have a workshop out back filled with hacksaws and drill bits.




Hysteresis occurs whenever the effect of some cause lags behind that cause, so that the state of a system depends on its history and not just on what is acting on it now. The term is most often associated with processes in the physical world, such as the rise and fall of magnetic fields, but it is also applied to things like the movement of interest rates and the growth of insect populations. It is relevant, too, in studying the strength of soldered joints, which is where I came across it in my metalwork classes.




Recently it occurred to me that hysteresis applies equally to the more abstract notions of 'information' and 'knowledge'. The thought was prompted by a project I am involved in, where a team of authors working with a content management system creates content and then tags it as to its purpose and contents.




One of the perennial problems with content management systems is that they are generally designed with an existing corpus of information in mind. For this existing corpus, the users and owners tend to have a pretty good mental model of what the content is about, how it should be organized and so on. That mental model is used to drive the design process for the new content management system. The result, almost invariably, is (a) some concept of a "document" and (b) some concept of the metadata to be associated with each document. The engineers then take the metadata information and craft a beautiful document metadata screen which users are invited to fill in when they create new content. One year later, most of the metadata fields for new content are found to be either blank or wrong, filled with convenient dummy values to get the content management system to stop beeping. Much scratching of heads and nursing of sore wallets ensues.




And now for my theory. There is a hysteresis-based relationship between content and non-trivial metadata about the content. By non-trivial I mean metadata that tells you what a piece of content is about, how it relates to other content and so on. Trivial metadata are things like the author and the date created. Look at history: everywhere you look, you will find classification systems that tidy everything into categories for us. The Pre-Raphaelites, the Stone Age, the Romantic poets, the continental philosophers. What do they have in common? The classification systems we use today to speak about these things came into existence afterwards. To take a flippant example, Pre-Raphaelite artists did not have that term written on their business cards.




The 'aboutness' of the content we create and use in our endeavors is only obvious after the fact. This, I think, is the fundamental reason why so many metadata-based content management systems have trouble getting good metadata out of content creators. The 'aboutness' of the existing corpus used to design the content management system was obvious because the classification was worked out after the content itself had been created. For new content, however, the 'aboutness' has yet to be cooked, so to speak.




My advice, if you find yourself in this situation, is to take a completely different tack. Writers write and categorizers categorize. There is an unavoidable delay between the two activities. The writers and the categorizers can be the same people but the activities are very different and cannot be done at the same time. Build this hysteresis into your workflows rather than fight against it. The alternative is blank or dummy metadata fields.
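One way to picture such a workflow is the minimal Python sketch below. It is only an illustration under my own assumptions, not a description of any particular content management system: the `Document` fields, the function names and the 30-day `COOLING_OFF` period are all hypothetical. The authoring step asks only for trivial metadata, and the non-trivial 'aboutness' fields are filled in by a separate, later step.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional

# Hypothetical policy: how long content "cooks" before anyone is asked to classify it.
COOLING_OFF = timedelta(days=30)


@dataclass
class Document:
    """A piece of content with trivial and non-trivial metadata kept apart."""
    title: str
    body: str
    author: str                                        # trivial: known at creation
    created: date                                      # trivial: known at creation
    subjects: list[str] = field(default_factory=list)  # non-trivial: filled in later
    categorized_on: Optional[date] = None              # set by the categorizing step


def create(title: str, body: str, author: str, today: date) -> Document:
    """Authoring step: asks only for what the writer already knows."""
    return Document(title=title, body=body, author=author, created=today)


def due_for_categorizing(docs: list[Document], today: date) -> list[Document]:
    """Return uncategorized documents whose cooling-off period has passed."""
    return [
        d for d in docs
        if d.categorized_on is None and today - d.created >= COOLING_OFF
    ]


def categorize(doc: Document, subjects: list[str], today: date) -> None:
    """Categorizing step: a separate activity, done after the delay."""
    if not subjects:
        raise ValueError("categorizing requires at least one subject")
    doc.subjects = list(subjects)
    doc.categorized_on = today
```

In this sketch a document created on one date simply does not show up in `due_for_categorizing` until the cooling-off period has elapsed, and `categorize` refuses an empty subject list, so there is no screen on which a dummy value could be entered at creation time. Whether the delay should be a fixed period, as assumed here, or triggered by something else is a design choice the sketch does not settle.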





[1] http://www.lassp.cornell.edu/sethna/hysteresis/WhatIsHysteresis.html

 
