Re: Encoding tags in DRS from Phil Archer on 2008-03-03 (public-powderwg@w3.org from March 2008)

From: Phil Archer <parcher@icra.org>
Date: Mon, 03 Mar 2008 10:01:04 +0000
To: Public POWDER <public-powderwg@w3.org>
Message-ID: <47CBCC60.6030400@icra.org>

Scheppe, Kai-Dietrich wrote:
> Hi Phil,
> 
> I think in general it is a good idea to provide more means of describing
> resources, thereby increasing the flexibility to do so.
> 
> However, as much as a tag is a free form entity, aren't the descriptors
> the same?
> In other words, what do we gain?
> Does it help to basically list an "example" or does that artificially
> limit the interpretation of the descriptors?


The key difference between a tag and a descriptor is that the latter is 
associated with a namespace and therefore has a precise meaning, defined 
by whoever created the vocabulary. A tag is a free-form bit of text that 
only has meaning to the tag author and other people who happen to 
agree on it. The example I usually use is at 
http://www.fosi.org/archive/everyonehasanopinion/#control

This shows a picture of a sculpture of Einstein and offers several tags 
that might be used - and there would be many more that could apply. More 
precise descriptions would require vocabularies from particular domains 
such as: public art, sculpture, physics, Australian Tourism and so on.

Actually this goes to the very heart of what distinguishes the Semantic 
Web from everything else. What is the name of Madonna's son? For pop 
music fans the (unusually well-informed) answer would be 'Rocco.' A 
Catholic might answer 'Jesus.' It all depends what (or in this case, 
who) you mean by 'Madonna.' http://christianity.org/#madonna and 
http://popmusic.org/#madonna - being URIs - uniquely identify the 
resource in question.

But... people really like tagging and we were asked to support tags 
directly when the charter was up for review a year ago. Add in Kjetil's 
use case (http://www.w3.org/TR/2007/NOTE-powder-use-cases-20071031/#mlk) 
and we have a need to support free text tagging. POWDER can explicitly 
allow tags to be applied to multiple resources (cf. the assumption that 
if I tag www.example.org I sort of mean everything on it, including its 
sub-domains).

But can we be clever about this? Actually, yes... because we can support 
associating a free text term with a semantic description. Kjetil started 
to build this into my.opera 
(http://lists.w3.org/Archives/Public/public-powderwg/2007Jun/0025.html)

The mapsTo attribute in POWDER (wdr:mapsTo in POWDER-S) links any old 
tag(s) to a precise meaning. So to give a slightly tongue in cheek 
example, my tag "should work on your mobile unless it's _really_ old" 
might map to mobileOK Pro - which will be defined with exquisite precision!
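
A rough sketch of how that might be written, borrowing the Tagset/tag 
form discussed as option 3 below. The mapsTo URI is purely hypothetical, 
since no URI for mobileOK Pro is given here:

```xml
<!-- Sketch only: the mapsTo target is a made-up placeholder URI -->
<Tagset mapsTo="http://example.org/vocab#mobileOKPro">
   <tag>should work on your mobile unless it's _really_ old</tag>
</Tagset>
```

The free text stays exactly as the tag author wrote it; only the mapsTo 
target carries the precise meaning.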


> 
> Not being really familiar with tags beyond their existence I looked it
> up on Wikipedia.
> As for white spaces being used, I found this on Wikipedia
> http://en.wikipedia.org/wiki/Tag_%28metadata%29

Useful info, thank you. If the group agrees with the general approach, 
then a design decision we need to take is what syntax to use. In my 
original e-mail I suggested that we use white space as a delimiter but 
as you have confirmed, this isn't always the best way.

Alternatives to only allowing single tags with no white space would be:

1. Use a comma-separated list. This is perhaps OK but is inconsistent 
with our other list-based property values which all use white space as a 
delimiter.

2. Use white space but allow quoting so that Eiffel Tower is two tags 
but "Eiffel Tower" is allowed in the list as a single tag. Possible, but 
the encoding of quotes in XML requires a lot of care. We could get into 
a bit of a mess there, and I think we'd do well to avoid it if we can.

3. Allow multiple instances of a tag element within a Tagset, i.e.

<Tagset mapsTo="http://example.org/#semantic">
   <tag>Paris</tag>
   <tag>Eiffel Tower</tag>
</Tagset>

The first tag is a single word, the second includes a white space 
without the need for any complex character escaping. On the downside, 
this is obviously more verbose.
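
For comparison, options 1 and 2 might look something like the fragments 
below. These are sketches only: they reuse the Tags element from my 
original e-mail, and the "tags" attribute in the last fragment is 
hypothetical, included just to show where the quote escaping would bite.

```xml
<!-- Option 1 (sketch): comma-separated list -->
<Tags mapsTo="http://example.org/#semantic">Paris, Eiffel Tower</Tags>

<!-- Option 2 (sketch): white space delimiter with quoting -->
<Tags mapsTo="http://example.org/#semantic">Paris "Eiffel Tower"</Tags>

<!-- Option 2 with the list in a hypothetical "tags" attribute:
     the inner quotes now have to be escaped -->
<Tags mapsTo="http://example.org/#semantic"
      tags="Paris &quot;Eiffel Tower&quot;"/>
```

In both cases a processor has to split the text itself after XML parsing, 
whereas with option 3 the XML parser already delivers one tag per element.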

[snip]
> Looking at the seemingly "unstable" nature of tags and the fact that
> rel="tag" is, at least on microformats.org, a draft from 2005, I am not
> sure if this is a good idea.

rel="tag" is in the HTML 5 list of rel types 
(http://www.w3.org/html/wg/html5/#linkTypes)

> 
> Incidentally, we use the word "tag" a lot in our documents where we
> actually mean "element".  Do we care?

It is an error and needs to be sorted out!

Phil.

> 
>> -----Original Message-----
>> From: Phil Archer [mailto:parcher@icra.org] 
>> Sent: Friday, February 29, 2008 4:52 PM
>> To: Public POWDER
>> Subject: Encoding tags in DRS
>>
>>
>> One of our use cases [1] calls for DRs to support free text 
>> tags and for it to be possible to associate such tags with a 
>> semantic definition. For example, I might want to associate 
>> my tag 'red' with something more prescriptive like 'ff0000'. 
>> The use case (which comes from my.opera) talks about the 
>> Dahut and cryptozoology.
>>
>> I've been thinking about how we might do this in POWDER ('cos 
>> the present text has all the problems we're just overcoming 
>> with our two-part system).
>>
>> OK, how's this:
>>
>> <POWDER xmlns="http://www.w3.org/2007/05/powder#"
>>          xmlns:ex="http://example.org/vocab#">
>>
>>    <attribution>
>>      <maker>http://authority.example.org/foaf.rdf#me</maker>
>>      <issued>2007-12-14</issued>
>>    </attribution>
>>
>>    <DR>
>>      <URISet>
>>        <includeHosts>example.org</includeHosts>
>>      </URISet>
>>
>>      <Descriptors>
>>        <ex:color>red</ex:color>
>>        <ex:shape>square</ex:shape>
>>        <displayText>Everything on example.org is red and 
>> square</displayText>
>>        <displayIcon>http://example.org/icon.png</displayIcon>
>>      </Descriptors>
>>
>>      <Tags mapsTo="http://ui.example.com#panicButton">red 
>> square</Tags>
>>
>>    </DR>
>>
>> </POWDER>
>>
>> Here our DR contains the URISet and _two_ descriptive units. 
>> My proposal is that a DR must contain at least one of these. 
>> The Descriptor set is the familiar one, but there's now a 
>> Tags element that contains a white space separated list of 
>> tags. The mapsTo attribute (optional) might then link to the 
>> Descriptors block in another POWDER doc or any other bit of 
>> RDF that had a semantically-defined definition. So the 
>> semantics here are "when I say red and square I mean the same 
>> as ui.example.com mean by their term 'panicButton'".
>>
>> I guess we could make the tags an element within the 
>> Descriptors but this doesn't feel right. A tag is a free-form 
>> thing, Descriptions are much more tightly constrained and I 
>> fear we might imply that ex:color='red' and ex:shape='square' 
>> both meant the same as panicButton
>> - they don't.
>>
>> I did toy with saying that a DR can only have Tags or 
>> Descriptors but this is pointless since you can create 2 DRs 
>> with the same URISet in the same POWDER doc to achieve what's 
>> written above.
>>
>> Incidentally, I believe I'm right in saying that tagging 
>> systems in general do not allow white space within tags, i.e. 
>> Eiffel Tower is 2 tags. If this is wrong and we should 
>> support tags with white space we can do that too but we'd end up with
>>
>> <TagSet mapsTo="http://ui.example.com#panicButton">
>>    <tag>red</tag>
>>    <tag>square</tag>
>> </TagSet>
>>
>> Any comments?
>>
>>
>> Phil.
>>
>> [1]
>>
>> --
>> Phil Archer
>> Chief Technical Officer,
>> Family Online Safety Institute
>> w. http://www.fosi.org/people/philarcher/
>>
> 
> 

-- 
Phil Archer
Chief Technical Officer,
Family Online Safety Institute
t. +44 (0)1473 434770
Skype: philarcher
w. http://www.fosi.org/people/philarcher/
Received on Monday, 3 March 2008 10:11:28 UTC