Re: Labels separate from localnames (Was: Best Practice for Renaming OWL Vocabulary Elements) from Kingsley Idehen on 2011-04-22 (public-lod@w3.org from April 2011)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Fri, 22 Apr 2011 11:08:14 -0400
To: public-lod@w3.org
Message-ID: <4DB199DE.7090605@openlinksw.com>
On 4/22/11 7:36 AM, Martin Hepp wrote:
> See replies inline ;-)
>> Sorry to say this, but I think you are making a mistake. To say that the rdfs:label has to look like a variable name because it is for Web developers sounds to me like you are saying that the javadoc of a method should look like a piece of code because it is addressed to programmers. I refuse to believe that Web developers understand better pseudo code than natural language.
> I will finally give in to use English spacing and capitalization for rdfs:labels in GoodRelations, e.g. use
>
>     "Business entity"@en for gr:BusinessEntity etc.
>
> But I will keep the cardinality recommendation in the rdfs:label of properties, e.g.
>
>      serial number (0..*) for gr:serialNumber

Why not move that to rdfs:comment? We have to encourage the "Web 
Developer" to remember that the Web is actually part of a broader 
computing technology continuum. Programming languages have offered 
"comments" features from the get go. They've had de-reference and 
address-of operators (exposed at various levels) from the get go, etc.
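
A minimal sketch of that move, assuming the cardinality hint always appears as a trailing "(min..max)" suffix as in Martin's example. The helper name and the wording of the generated comment are hypothetical, not part of GoodRelations:

```python
import re

# Assumption: cardinality hints look like a trailing "(min..max)" suffix,
# e.g. "serial number (0..*)". Other parentheticals are left untouched.
CARDINALITY = re.compile(
    r"^(?P<label>.*?)\s*\((?P<card>\d+\.\.(?:\d+|\*))\)\s*$"
)

def split_label(raw_label):
    """Return (plain_label, comment_or_None) for a raw rdfs:label value."""
    match = CARDINALITY.match(raw_label)
    if match is None:
        # No cardinality suffix: keep the label as it is.
        return raw_label.strip(), None
    # Hypothetical comment wording; this text would go into rdfs:comment.
    comment = "Recommended cardinality: " + match.group("card")
    return match.group("label"), comment

label, comment = split_label("serial number (0..*)")
print(label)    # serial number
print(comment)  # Recommended cardinality: 0..*
```

A label such as "By bank transfer in advance (payment method)" does not match the pattern, so the class-type context stays in the label.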

We can't keep on pandering to a programmer profile that is increasingly 
producing solutions for mass use without encouraging them to learn. To 
me, this is about "deceptively simple" vs "simply simple" doctrines. 
Sadly, "simply simple" is taking over the Web which (ironically) is a 
great contemporary demonstration of the "deceptively simple" doctrine.

> and the class type information in ontological individuals, as in
>
>      By bank transfer in advance (payment method) for gr:ByBankTransferInAdvance

Hmm. This is why links to pages that aid ontology exploration [1][2][3] 
and reading are very helpful.
> The latter should definitely not irritate human consumers, for it provides context; the former is in my judgment the best way of indicating cardinality recommendations in OWL, since the OWL cardinality constructs don't cover what is needed, yet I have to be able to tell modelers the intended cardinality. It is not nonsensical, as you state, as many users of GR have confirmed.
>
>> Moreover, Web clients most of the time display raw data (in a nice way) extracted from databases. For instance, a Wikipedia article displays a nice readable title, which is exactly the raw data that is found in a column of a database. Of course, you can decide that you won't use rdfs:label for human readable text and reserve another property for that (eg, dc:title), but you cannot decide how others will use your data and they may have a preference for the rdfs:label. As a matter of fact, rdfs:label is commonly used for showing people a nice readable piece of text in natural language.
> I was stressing that SW apps that aim at real people will have to use sophisticated methods for choosing the proper label for data elements anyway; using the raw rdfs:label will not work for non geeks in most of the cases. Most ordinary people cannot process data, just information.

We have inference rules to handle this, of course, but as a baseline 
rdfs:comment is ample re. detailed descriptions from the ontology 
author's perspective :-)

>> Now, let's imagine I have a "product browser" which aggregates information about products found on the Web, leveraging the GoodRelations vocabulary and possibly other vocabularies. It may display the products in a table and have a column for "product type", which displays the class of the product. There are chances that the client will display the rdfs:label of the class as the "product type", which in the case of GoodRelations would look sibylline to a casual reader, with camel-toed text and nonsensical information about arity.

Don't know; best we look at this via actual pages covering different 
presentation styles [4][5][6].
> Nobody except for very specialized analysts will ever want to use a product browser that presents raw RDF data.

True, so the key is multiple views of the same data for different 
beholder profiles. The ontology can't solve that problem; let the client 
tools do that.
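
A rough sketch of what "let the client tools do that" could look like, assuming a client that already holds the term URI and its rdfs:label. The profile names ("developer", "analyst", "shopper") and the record layout are hypothetical; only the label string comes from this thread:

```python
# Hypothetical in-memory record for one vocabulary term.
TERM = {
    "uri": "http://purl.org/goodrelations/v1#ByBankTransferInAdvance",
    "label": "By bank transfer in advance (payment method)",
}

def local_name(uri):
    """Return the part of the URI after the last '#' or '/'."""
    return uri.rsplit("#", 1)[-1].rsplit("/", 1)[-1]

def display_name(term, profile):
    """Pick a display string for a given beholder profile."""
    if profile == "developer":
        # Developers see the identifier they will type in code or queries.
        return local_name(term["uri"])
    if profile == "analyst":
        # Analysts see the vocabulary author's label unchanged.
        return term["label"]
    # Any other profile: drop a trailing parenthetical, keep the plain phrase.
    return term["label"].split(" (", 1)[0]

for profile in ("developer", "analyst", "shopper"):
    print(profile, "->", display_name(TERM, profile))
```

The vocabulary ships one label; each view derives its own presentation from it.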

>> Moreover, with such practice, how can you provide labels in multiple languages? Paymentmethod is not even an English word!
> The choice of labels for information consumers cannot be solved by the creator of the vocabulary, because that depends on the context (e.g. audience) in which the results will be displayed.

Yep! As stated above.

> This is independent from the question of translations. A good ontology makes good (context-independent, lasting, cross-cultural) choices regarding the categories of things. The linguistic representation of these categories in specific context is a completely different story.

Yes.

[SNIP]

>> Google Rich Snippets don't show the labels because it is specifically tuned for GoodRelations. But a generic tool which aggregates information from various sources using various vocabularies has to make a generic assumption on what to display. rdfs:label is what is often chosen by generic tools to be shown to people.
> I doubt the interaction with RDF data on a Web scale will be a simple modification of the browser paradigm of HTML content. Pivot-style approaches are IMO pointing in the right direction, but again, you will need a hard-coded or pretty intelligent additional layer in between the human and the data, and selecting the proper name for a piece of data will be among the challenges. A simple regex on the labels from the vocabulary will be the least obstacle of all.
>
> I don't think that we as LOD / SW researchers do already know how to implement the larger vision, but it will for sure require a lot more sweat, more creativity, and more cross-discipline effort than many seem to assume.

Links (showing effects of label choices using real data etc.) :

1. 
http://idehen.net/describe/?url=http%3A%2F%2Fpurl.org%2Fgoodrelations%2Fv1 
-- my personal data space instance, which is relatively sparse re. 
GoodRelations (GR) instance data

2. 
http://uriburner.com/describe/?url=http%3A%2F%2Fpurl.org%2Fgoodrelations%2Fv1&p=12002&lp=12002&op=12000&prev=&gp=12002 
-- URIBurner, which has a lot of GR instance data and is hosted on a slightly 
more powerful setup than my personal data space

3. 
http://lod.openlinksw.com/describe/?url=http%3A%2F%2Fpurl.org%2Fgoodrelations%2Fv1&p=14&lp=13&op=12&prev=&gp=14 
-- LOD cloud cache instance, which has fewer instance data items 
(temporarily, due to LOC WIP) than URIBurner but a much more powerful setup

4. http://idehen.net/c/5JYBS -- iSPARQL query results page (for DESCRIBE 
?s FROM <http://purl.org/goodrelations/v1> WHERE {?s ?p ?o}); use the 
drop-down for alternative views, e.g. table/grid, raw triples, etc.

5. http://idehen.net/c/5JYBY -- PivotViewer page

6. http://idehen.net/about/html/http/purl.org/goodrelations/v1 -- 
Sponger-based description page (click on the "Referenced By" tab).


> Best
>
> Martin


-- 

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Friday, 22 April 2011 15:08:38 UTC