
RE: Compound concepts in a thesaurus structure

From: Houghton,Andrew <houghtoa@oclc.org>
Date: Wed, 12 May 2004 16:46:34 -0400
Message-ID: <B56ABE145BEB0C40A265238FCAA420DF026F53BB@oa2-server.oa.oclc.org>
To: public-esw-thes@w3.org

> From: Aida Slavic [mailto:aida@acorweb.net] 
> Sent: Wednesday, May 12, 2004 3:51 PM
> Subject: RE: Compound concepts in a thesaurus structure
> 
> All depends on the objective behind SKOS.  Generally speaking 
> it is better to make sure that dumbing down is a decision and 
> choice left to  those who sell/apply/implement systems and 
> not for those creating standards. I would also make sure that 

True, SKOS is agnostic in this respect.  You can use RDF's
subclassing to get finer-grained resolution, or you can add your
own metadata elements to the concept record.

> Fancy you should mention Dublin Core as I wanted to use it to 
> illustrate the opposite. Producing metadata is far too 
> expensive to be created only to find out that it does not 
> work in IR - which was a nasty surprise for many 

I'm not sure what the problems are with IR.  Text is text, and
metadata elements help with context, but if you are looking to
extract deeper meaning from metadata elements, then you need to
fully understand the XML grammars that you're dealing with.  We
should probably take this topic offline.

> attracted to DC promise for cheap/simple/easy - deciding to 
> ignore the golden GIGO rule (garbage in, garbage out). Which is 
> why we have been flooded with DC application profiles, 
> vocabularies, rules, guidelines - each project spending money 
> and time to write its own metadata standard within DC 
> standard to fill the dumbed down gap, inventing qualifiers, 
> syntax and refining semantic for generously empty DC 
> elements.

DC is building an empire...  Everyone wants it their way and
they fail to realize that they don't need to reinvent the
wheel...

> As for SKOS I am not sure I understand  why 94(73) would be 
> easier to parse when downloading data than <tag>94<tag2>(73) 
> both forms being particular to a specific system and both 
> can be interpreted only within this identified system itself anyway. 
> For instance UDC number 94(73) is encoded in one of my databases as
> c94f(73) .... why would SKOS be concerned with a form of my 
> preferred term????

I don't think I implied that 94(73) was easier to parse than
<tag>94<tag2>(73).  If I did, that was a mistake.  I don't see
anything in SKOS that says that you are restricted in this
manner.  If you need to add additional metadata elements to
your concept records, go right ahead.  If you want to add
deeper level meaning to your preferred or alternate terms,
then go ahead.
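
A minimal sketch of adding a local metadata element to a concept
record, assuming Python's standard xml.etree.ElementTree.  The
"ex" namespace URI and the internalNotation element are
hypothetical names made up for illustration; they are not part of
SKOS.  The value c94f(73) is the internal encoding quoted above.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/my-system#"  # hypothetical local namespace

# Ask the serializer to use readable prefixes for both namespaces.
ET.register_namespace("skos", SKOS)
ET.register_namespace("ex", EX)

concept = ET.Element(f"{{{SKOS}}}Concept")

# The preferred label stays plain text.
pref = ET.SubElement(concept, f"{{{SKOS}}}prefLabel")
pref.text = "94(73)"

# System-specific detail goes in its own (hypothetical) element
# beside the label rather than inside it.
internal = ET.SubElement(concept, f"{{{EX}}}internalNotation")
internal.text = "c94f(73)"

xml_text = ET.tostring(concept, encoding="unicode")
print(xml_text)
```

A consumer that does not know the extra element can ignore it and
still read the preferred label as plain text.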

For some internal projects we have planned, we will do 
exactly that.  For other projects, where we distribute SKOS 
records, we will remove internal system-specific tagging.
That's our choice.  SKOS doesn't say we have to do that, 
but the tagging may not be meaningful outside the internal
system it was used in, or there may be business decisions where
the additional tagging is deemed inappropriate for distribution.
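
Removing internal tagging before distribution could look roughly
like the sketch below.  It assumes Python's standard
xml.etree.ElementTree and assumes the internal tags are
well-formed child elements of skos:prefLabel; the element names
tag and tag2 and the helper strip_internal_tags are hypothetical.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"

def strip_internal_tags(pref_label):
    """Replace embedded child elements with their plain text, in place."""
    # Collect all text inside the label, in document order.
    flat = "".join(pref_label.itertext()).strip()
    # Drop every child element, then store the flattened text.
    for child in list(pref_label):
        pref_label.remove(child)
    pref_label.text = flat
    return pref_label

# Hypothetical internal form of the label, with system-specific tags.
internal = ET.fromstring(
    f'<skos:prefLabel xmlns:skos="{SKOS}">'
    "<tag>94</tag><tag2>(73)</tag2>"
    "</skos:prefLabel>"
)

public = strip_internal_tags(internal)
print(public.text)  # 94(73)
```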

If you decide to add additional tagging to your preferred
terms, you shouldn't expect that I will make any sense of it.
Suppose you define your preferred term as:

<skos:prefLabel>94(73)</skos:prefLabel>
<skos:prefLabel><![CDATA[<tag>94<tag2>(73)]]></skos:prefLabel>

or

<skos:prefLabel rdf:parseType="Resource">
  <tag>94<tag2>(73)
</skos:prefLabel>

In the first case, nobody will care about the CDATA; it's just
text to someone processing the skos:prefLabel.  The latter case
may cause me to do strange things when I process your data.
Let's say I want to display it.  I will probably use the XPath
//skos:prefLabel/text(), so I get 94(73) just as if the tags
were not there [latter case].  Seems reasonable.  But what if
I decided to put something like MARC-XML in there?

<skos:prefLabel rdf:parseType="Resource">
  <marc:subfield code="a">Cats</marc:subfield>
  <marc:subfield code="z">France</marc:subfield>
</skos:prefLabel>

Now someone uses that XPath, //skos:prefLabel/text(), and gets
"CatsFrance", which is not the intended form, "Cats--France".
If the embedded tagging is only used in my system, no harm, no
foul, because I will have the knowledge to produce the correct
preferred label when requested.  However, if I pass it off to you
and you don't know MARC-XML, this could be problematic.  That
was my point.  SKOS couldn't care less; its guidelines might say
don't do that, but as far as I can see it doesn't care at the
moment.  The problem is interoperability when harvesting SKOS 
schemes and trying to do something with them in online metadata
creation systems.
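
The difference between the two readings can be sketched as
follows, assuming Python's standard xml.etree.ElementTree.  The
helper names naive_label and marc_aware_label are hypothetical.
Note that a strict XPath text() step selects only the direct child
text nodes of skos:prefLabel (just whitespace here), so the sketch
uses itertext(), which gathers all descendant text, to stand in
for a consumer that simply takes the text content of the label.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
MARC = "http://www.loc.gov/MARC21/slim"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RECORD = f"""
<skos:Concept xmlns:skos="{SKOS}" xmlns:marc="{MARC}" xmlns:rdf="{RDF}">
  <skos:prefLabel rdf:parseType="Resource">
    <marc:subfield code="a">Cats</marc:subfield>
    <marc:subfield code="z">France</marc:subfield>
  </skos:prefLabel>
</skos:Concept>
""".strip()

def naive_label(pref_label):
    """Concatenate all descendant text, ignoring the embedded markup."""
    return "".join(t.strip() for t in pref_label.itertext())

def marc_aware_label(pref_label):
    """Join MARC subfield values with the "--" subdivision separator."""
    subfields = pref_label.findall(f"{{{MARC}}}subfield")
    return "--".join(sf.text.strip() for sf in subfields)

concept = ET.fromstring(RECORD)
label = concept.find(f"{{{SKOS}}}prefLabel")

print(naive_label(label))       # CatsFrance
print(marc_aware_label(label))  # Cats--France
```

Only a consumer that already knows the embedded grammar can
produce the second form, which is the interoperability concern
described above.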


Andy.

Andrew Houghton, OCLC Online Computer Library Center, Inc.
http://www.oclc.org/about/
http://www.oclc.org/research/staff/houghton.htm
Received on Wednesday, 12 May 2004 16:48:24 GMT
