- From: Dan Brickley <danbri@danbri.org>
- Date: Mon, 25 Aug 2008 11:22:18 +0200
- To: Kristof Zelechovski <giecrilj@stegny.2a.pl>
- Cc: 'Julian Reschke' <julian.reschke@gmx.de>, 'Ben Adida' <ben@adida.net>, 'Ian Hickson' <ian@hixie.ch>, "'Bonner, Matt'" <matt.bonner@hp.com>, "'Tab Atkins Jr.'" <jackalmage@gmail.com>, 'Henri Sivonen' <hsivonen@iki.fi>, www-archive@w3.org
Kristof Zelechovski wrote:

> It is not metadata vs data, it is metadata vs content. Data in HTML
> documents go into the SCRIPT element and they are usually expected to be
> private to the page.
> Chris

There's a significant body of work and thought around microformats (see below) that argues against keeping a separate, hidden pot of [meta]data. And in RDF land, we've found time and again that the distinction between so-called "metadata" and "data" serves largely to confuse.

Re: content vs [meta]data and microformats, see e.g.
http://tantek.com/log/2005/06.html#d03t2359 via
http://microformats.org/wiki/principles

[[
One of the principles of microformats is to be presentable and parsable. This means we prefer visible data to invisible metadata. This is one of the lessons we learned from the meta keywords debacle.

In the early days of HTML, authors used to place keywords for their pages in an invisible <meta> tag, and search engines used this information because the specifications said to do so. However, before long, in the realm of the Wild Wild Web, these meta keywords fell out of sync with the content on pages, were polluted, spammed, and otherwise abused until there was so much noise that any semblance of signal was lost.

Along came a new search engine that ignored meta keywords, used visible hyperlinks instead, and instantly provided better results than all other existing search engines.

Lesson learned: hyperlinks, being visible by default, proved more reliable and persistently accurate for many reasons. Authors readily saw mistakes themselves and corrected them (because presentation matters). Readers informed authors of errors the authors missed, which were again corrected. This feedback led to an implied social pressure to be more accurate with hyperlinks, thus encouraging authors to get it right the first time more often. When authors/sites abused visible hyperlinks, it was obvious to readers, who then took their precious attention somewhere else.

Visible data like hyperlinks, with the positive feedback loop of user/market forces, encouraged accuracy and accountability. This was a stark contrast with the invisible metadata of meta keywords, which, lacking such a positive feedback loop, deteriorated into useless noise through the combination of gaming incentives and natural entropy.
]]

In the RDF scene, many agree with the core claim here: data that is not used, rots. We RDFish people perhaps tend to take a broader notion of "use", and allow that the data might live primarily in, say, a database or app, with its expression in HTML markup being a downstream copy. So, for example, FOAF files that are generated automatically from a "social network" site are vastly more likely to be up to date than FOAF files that are hand-edited or were created by one-shot tools like foaf-a-matic. The core data might live in the social network site's database rather than in HTML, but the principle is the same: data that is un-used and un-seen by humans is unlikely to be kept accurate. Data embedded in real-life activity is much healthier.

The microformat view tends towards putting data in human-readable blocks of markup as a way of keeping it visible and alive. The RDF community tends more towards making sure it can be consumed by multiple tools, so that it is "seen" and consumed widely. Both generally agree that the head section of an HTML document isn't usually the healthiest place to store and manage [meta]data.

cheers,

Dan

--
http://danbri.org/
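P.S. A small sketch of the contrast being discussed, for anyone new to the thread. The hCard class names (vcard, fn, url, org) are from the microformats spec; the page content itself is just an invented example:

```html
<!-- Invisible metadata: keywords hidden in the head. Nothing on the
     rendered page keeps these honest once the content changes. -->
<meta name="keywords" content="semantic web, RDF, microformats">

<!-- Visible data: the same sort of facts published as an hCard,
     rendered to readers, so mistakes get noticed and corrected. -->
<p class="vcard">
  <a class="url fn" href="http://danbri.org/">Dan Brickley</a>,
  co-author of the <span class="org">FOAF</span> vocabulary.
</p>
```

A parser extracts the name, URL, and organisation from the second block, while human readers see (and can challenge) exactly the same text.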
Received on Monday, 25 August 2008 09:23:06 UTC