Re: Labels separate from localnames (Was: Best Practice for Renaming OWL Vocabulary Elements) from Antoine Zimmermann on 2011-04-22 (public-lod@w3.org from April 2011)

From: Antoine Zimmermann <antoine.zimmermann@insa-lyon.fr>
Date: Fri, 22 Apr 2011 12:33:07 +0200
To: public-lod@w3.org
Message-ID: <4DB15963.2090408@insa-lyon.fr>
See several comments inline.

On 22/04/2011 09:44, Martin Hepp wrote:
> Hi Tim, all:
>
> First: Thanks for your great feedback.
>
> As for labels vs. identifiers: What I want to do is change the
> identifier of a few conceptual elements. The reason why I also
> changed the labels in my example is that in GoodRelations, labels are
> historically geared towards the publisher of data and not the
> consumer and thus as close as possible to the identifier. Thus, we
> currently use
>
> - the original camel words for class labels (e.g. "BusinessEntity")
> - the original camel words plus an indicator for cardinality
>   information for properties (e.g. "eligibleRegions (0..*)")
> - the original camel words plus a hint to the most relevant class for
>   an individual (e.g. "MasterCard (PaymentMethod)", even though
>   gr:MasterCard is also an instance of gr:PaymentMethodCreditCard, but
>   it is more important to know that it is a payment method)
>
> I agree with your suggestions to change the syntax in the labels from
> the camel words to regular spaces in between the words, but I want to
> keep the cardinality information for properties and the class
> membership information for individuals.
>
> Again, I think the perspective of someone coding / publishing data is
> more important than the consumption side, in my opinion, because in
> non-trivial data structures (e.g. relationships of higher arity), one
> cannot derive a meaningful user interface directly from the raw
> labels anyway. I assume that 90 % of the people consuming the
> GoodRelations specification, in both OWL and HTML, will be Web
> developers trying to encode data, not clients used for displaying the
> raw data.

Sorry to say this, but I think you are making a mistake. To say that the 
rdfs:label has to look like a variable name because it is for Web 
developers sounds to me like you are saying that the javadoc of a method 
should look like a piece of code because it is addressed to programmers. 
I refuse to believe that Web developers understand pseudo-code better
than natural language.

Moreover, Web clients most of the time display raw data (in a nice way) 
extracted from databases. For instance, a Wikipedia article displays a 
nice readable title, which is exactly the raw data that is found in a 
column of a database. Of course, you can decide that you won't use 
rdfs:label for human readable text and reserve another property for that 
(eg, dc:title), but you cannot decide how others will use your data and 
they may have a preference for the rdfs:label. As a matter of fact, 
rdfs:label is commonly used for showing people a nice readable piece of 
text in natural language.

Now, let's imagine I have a "product browser" which aggregates 
information about products found on the Web, leveraging the 
GoodRelations vocabulary and possibly other vocabularies. It may display 
the products in a table and have a column for "product type", which 
displays the class of the product. Chances are that the client
will display the rdfs:label of the class as the "product type", which in
the case of GoodRelations would look sibylline to a casual reader, with
camel-cased text and nonsensical information about arity.

Moreover, with such practice, how can you provide labels in multiple 
languages? Paymentmethod is not even an English word!

Having said that, I don't agree with Tim when he says that the URI of 
the class should never change and that we don't care how it looks. The 
class name is important, it has to be easy to remember and meaningful. 
Like in a computer programme, the names of the variables do matter.
Plus, I find it ok to add a new name while keeping the one which is 
already used and assert an equivalence between the two.
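
To illustrate what "keeping the old name and asserting an equivalence" buys a consumer, here is a minimal sketch. It assumes plain Python tuples standing in for triples (no RDF library), it only handles the rdf:type consequences of owl:equivalentClass, and the ex:oldShop / ex:newShop resources are made up for the example. It is not a full OWL reasoner.

```python
# Sketch: the old class name stays, a new one is added, and the two are
# declared equivalent. Tuples stand in for RDF triples (an assumption).

RDF_TYPE = "rdf:type"
OWL_EQUIVALENT_CLASS = "owl:equivalentClass"


def expand_equivalent_classes(triples):
    """Return the triples plus the rdf:type statements implied by
    owl:equivalentClass, treated as symmetric and transitive."""
    triples = set(triples)

    # Build an undirected graph linking equivalent classes.
    neighbours = {}
    for s, p, o in triples:
        if p == OWL_EQUIVALENT_CLASS:
            neighbours.setdefault(s, set()).add(o)
            neighbours.setdefault(o, set()).add(s)

    def equivalents(cls):
        # Depth-first walk over the equivalence graph, including cls itself.
        seen, stack = {cls}, [cls]
        while stack:
            for nxt in neighbours.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    inferred = set(triples)
    for s, p, o in triples:
        if p == RDF_TYPE:
            for cls in equivalents(o):
                inferred.add((s, RDF_TYPE, cls))
    return inferred


# Hypothetical data: one resource uses the old name, one the new name.
data = {
    ("gr:Location", OWL_EQUIVALENT_CLASS,
     "gr:LocationOfSalesOrServiceProvisioning"),
    ("ex:oldShop", RDF_TYPE, "gr:LocationOfSalesOrServiceProvisioning"),
    ("ex:newShop", RDF_TYPE, "gr:Location"),
}
closure = expand_equivalent_classes(data)
```

After expansion, both resources carry both class names, so a query written against either name finds both. Note that the same inference applies to anything typed with the short name, whatever its publisher meant by it.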

> See inline comments for the remaining points.
>
> On Apr 21, 2011, at 11:15 PM, Kingsley Idehen wrote:
>
>> On 4/21/11 3:19 PM, Tim Berners-Lee wrote:
>>> Martin,
>>>
>>> Confused. Do you mean you want to change the localname (the bit
>>> after the namespace in the URI) or the label?
> Actually both, because they are the same in the case of
> GoodRelations, but the important thing is changing the localname so
> that coding in RDFa gets simpler and quicker.
>>>
>>> In your examples below, you have the same string for the
>>> localname and label. This looks like a bug.
> It was intended, but I see your point and share Kingsley's support
> for more readable labels.
>
>>> Let me explain things from the point of view of the tabulator,
>>> for example, which actually uses the rdfs:label for users.
>>>
>>>
>>> 0) if you want to change the label, don't change the URI. Keep
>>> the URIs the same, unless the meaning has changed, in which case
>>> make a new URI and keep the old one marked obsolete.
> The meaning has not changed, but I have to make coding in RDFa easier
> for common class names without breaking existing code or data. For
> example, GoodRelations uses, for historic reasons, a long name for
> stores, etc., which tries to capture the ontological essence:
>
> gr:LocationOfSalesOrServiceProvisioning
>
> It could not have been gr:Store, because this is basically any point
> of interest from which a product or service is available, including
> bus stops, gas stations, movie theaters, etc.
>
> But since this class is so frequently used, I want to change it to
> simply gr:Location while retaining as much backward compatibility
> as possible; that is the background of the pattern I suggested.

Ouch! I'm afraid amateur Linked Data producers who are searching for 
terms in a SemWeb search engine will find gr:Location very appropriate 
for *any* location. As a consequence, it will be inferred that all 
locations recorded in geonames are selling something! The Semantic Web 
will break and bring in its downfall the World Wide Web and the 
Internet, then the end of the world...

>>>
>>> The actual string used in the URI has no meaning, it is just an
>>> identifier.
> Yes, that's clear.
>>> If it seems weird but people use it - make sure it is well
>>> documented but don't change it if there is data using it out
>>> there. Do make sure the labels track the meaning and usage.
>
> Well, in my case that would mean I cannot change
> a) gr:LocationOfSalesOrServiceProvisioning to gr:Location,
> b) gr:ProductOrServicesSomeInstancesPlaceholder to gr:SomeItems, and
> c) gr:ActualProductOrServiceInstance to gr:Individual

Those names are horribly long, but they have the merit of being hardly
ambiguous, as opposed to gr:Individual. In FOAF, the names are very
short, which certainly helps getting the vocabulary adopted but leads to
a considerable amount of misuse (foaf:img, foaf:mbox, ...). Moreover,
these long names are easier to discover in keyword-based search engines 
because there is more contextual information to properly index and 
relate the words in the name.

> but I want to and will do that because the loss in backwards
> compatibility (if any) is minimal as compared to the increased ease
> of creating data, in particular in RDFa. Reducing the effort for lay
> Web developers to use SW tech is key.
>
>
>>>
>>>
>>> 1) Please use for the RDFS label what you would expect to see in
>>> a form which a user is filling in. Ideally give it in several
>>> languages.
>> +1
>>
> As said, I am considering changing the formatting from camel words to
> non-camel style but keeping the cardinality and class membership info
> for developers. The issue of several languages is, in theory, a nice
> feature, but extremely difficult to implement in six-sigma quality
> due to the differences in connotations and semantic granularity of
> natural languages. Having second-class translations would do more
> harm than good, in my opinion. The only reliable translations I could
> provide easily would be German, but that would really not increase
> adoption significantly - most German Web developers speak English.

You do not need to make the translations yourself. Find fluent 
translators or expert linguists.

>>> So  gr:valueAddedTaxIncluded rdfs:label  "includes VAT"@en, "TVA
>>> inclus"@fr .
>
> I am willing to change this to "value added tax included" but want to
> avoid major differences between the identifiers and the labels,
> because someone using authoring tools should be kept familiar with
> the identifiers.
>>>
>>> This will be often shorter than a long URI localname
>>>
>>> The label has to be a reasonable prompt for the user, and not a
>>> complete explanation of what the field means.  It should be
>>> usable also as a column  heading in a table, for example.   So
>>> while the URI is "gr:availableDeliveryMethods" the best label
>>> might just be "delivery methods".   The longer string makes it
>>> clear for developers but the short label is quite explicit enough
>>> for a column header or field name in a report generated for a
>>> user.
> Well, from a perspective of consuming SW data in a tabular form, you
> are definitely right; however, I am not yet convinced that this is
> the most important use-case to address by rdfs:label in a Web
> ontology.
>
>>>
>>>
>>>
>>> 1.1) DO NOT do
>>>
>>> gr:appliesToDeliveryMethod rdfs:label "appliesToDeliveryMethod".
>>>
>>> This is worse than nothing. If you leave no rdfs:label at all for
>>> gr:appliesToDeliveryMethod, then tabulator will synthesize a
>>> label "applies to delivery method" from the camel case. Actually
>>> on a form it will typically use the string "Applies to delivery
>>> method" as in English often form field names are capitalized.
>>>
>>> If you give an rdfs:label which actually is just the same as the
>>> URI localname, you are saying that that is the string which you
>>> recommend be used for the form.  This will force tabulator to
>>> display the camel-case, which is not user-friendly.
> This is, IMO, a Tabulator-specific issue and only addresses the
> *consumption* of the data in a tool that directly uses the vocabulary
> labels for generating the user interaction. I am so far unconvinced
> that this is a dominant use-case for interacting with GoodRelations
> data. For example, a user seeing GoodRelations data via Google Rich
> Snippets or Yahoo SearchMonkey will never see the vocabulary labels,
> only the person configuring the generation of data.

Google Rich Snippets don't show the labels because they are specifically
tuned for GoodRelations. But a generic tool which aggregates information
from various sources using various vocabularies has to make a generic 
assumption on what to display. rdfs:label is what is often chosen by 
generic tools to be shown to people.
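
As a rough sketch of what such a generic tool might do (this is an assumption about typical behaviour, not the actual code of Tabulator or any other client): prefer an rdfs:label in the user's language, fall back to any label, and only synthesize one from the local name when no label exists at all. The function names and the (text, language) pair representation are invented for the example.

```python
import re


def label_from_localname(uri):
    """Synthesize a readable label from a camel-case local name."""
    # Take whatever follows the last '#', '/' or ':' as the local name.
    localname = re.split(r"[#/:]", uri)[-1]
    # Insert a space at each lower-case/digit to upper-case boundary.
    words = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", localname)
    return words.lower()


def display_label(uri, labels, preferred_lang="en"):
    """Pick the text to show for a term.

    labels is a list of (text, language_tag) pairs found for rdfs:label.
    """
    for text, lang in labels:
        if lang == preferred_lang:
            return text
    if labels:
        # A label exists, so the client trusts it as the publisher's choice.
        return labels[0][0]
    return label_from_localname(uri)


# No label at all: the client can synthesize something readable.
synthesized = display_label("gr:appliesToDeliveryMethod", [])

# A label identical to the local name: the client shows the camel case.
forced = display_label(
    "gr:appliesToDeliveryMethod", [("appliesToDeliveryMethod", "en")]
)

# Labels in several languages: the user's language wins.
french = display_label(
    "gr:valueAddedTaxIncluded",
    [("includes VAT", "en"), ("TVA inclus", "fr")],
    preferred_lang="fr",
)
```

With this policy, a label that merely repeats the local name is shown as is, which is why it is worse than providing no label.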

>>>
>>
>> +1
>>
>>> 2) Never put explanations for ontology engineers in the label. In
>>> the comment, OK, not the label.
>>>
>>>
>>> NOT
>>>   rdfs:label "NewPropertyName1 (Note: See old URI foo:LongPropertyName1 used previously)" .
>>> but more like
>>>   rdfs:label "new form label1";
>>>   rdfs:comment " (Note: See old URI foo:LongPropertyName1 used previously)" .
>>>
>>> You don't want that stuff showing up for users, in reports,
>>> forms, etc.
>
> As for the reference to the new element etc., I agree with you; that
> was just to keep the proposal short and simple. We will usually have
> a note "(DEPRECATED)" as part of the label, but no further
> explanation.
>
> It is pretty simple for an end-user tool to filter out the three
> patterns of developer information from GoodRelations labels by simple
> regex.
>
> Again, the majority of the people dealing with rdfs:labels from
> vocabularies will be developers, not end-users, because you will need
> an intermediate layer between the data and the user anyway.

Again, this only works if you fine tune your application towards 
GoodRelations. One of the strengths of RDF is that it is easy to
aggregate data relying on many vocabularies that the consuming 
application doesn't even know about. Having general guidelines for 
rdfs:label such that it can be applied to any vocabulary is very good, 
IMO. By ignoring these guidelines, you make GoodRelations a kind of 
proprietary format which only works fine with GoodRelations-specific 
applications.
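
For concreteness, the "simple regex" filtering could look roughly like the sketch below. The three patterns are guesses based only on the label examples quoted in this thread ("eligibleRegions (0..*)", "MasterCard (PaymentMethod)", "(DEPRECATED)"); the real labels may differ, and the function name is invented.

```python
import re

# Assumed shapes of the three developer-oriented suffixes (hypothetical).
DEPRECATED = re.compile(r"\s*\(DEPRECATED\)\s*$")
CARDINALITY = re.compile(r"\s*\(\d+\.\.(?:\d+|\*)\)\s*$")
CLASS_HINT = re.compile(r"\s*\([A-Z][A-Za-z]*\)\s*$")


def strip_developer_hints(label):
    """Remove trailing developer hints from a GoodRelations-style label."""
    for pattern in (DEPRECATED, CARDINALITY, CLASS_HINT):
        label = pattern.sub("", label)
    return label


cleaned = [
    strip_developer_hints("eligibleRegions (0..*)"),
    strip_developer_hints("MasterCard (PaymentMethod)"),
    strip_developer_hints("BusinessEntity (DEPRECATED)"),
]

# A made-up label from some other vocabulary is damaged by the same filter.
damaged = strip_developer_hints("Paris (Texas)")
```

The filter works on the GoodRelations examples, but applied blindly to labels from other vocabularies it removes meaningful text, so a client can only use it safely if it knows which labels come from GoodRelations.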


Regards,
AZ.

>>
>> +1
>>
>> Martin:
>>
>> In addition, if you are evolving the ontology (which I believe is
>> the case) and seek to keep backward compatibility i.e., keeping
>> classes and properties functional across ontology releases, just
>> use owl:equivalentClass and owl:equivalentProperty accordingly.
>> Naturally, if this evolution includes new levels of abstraction then
>> rdfs:subClassOf and rdfs:subPropertyOf should be put to use etc.
> Yes, that was in my original proposal.
>>
>> We can't really negate reasoning, especially when showcases emerge
>> that help general appreciation of OWL which (IMHO) continues to get
>> an unjustified bad rap.
> Yes, I agree. I also think that since the little bit of reasoning
> needed in here can be easily implemented in SPARQL CONSTRUCT rules or
> SPIN, it is much better to take conceptually superior OWL axioms than
> "quick and dirty sameAs", which will backfire in the long run.
>
> Again, thanks for your detailed feedback!
>
> Martin
>

[...]

-- 
Antoine Zimmermann
Researcher at:
Laboratoire d'InfoRmatique en Image et Systèmes d'information
Database Group
7 Avenue Jean Capelle
69621 Villeurbanne Cedex
France
Tel: +33(0)4 72 43 61 74 - Fax: +33(0)4 72 43 87 13
Lecturer at:
Institut National des Sciences Appliquées de Lyon
20 Avenue Albert Einstein
69621 Villeurbanne Cedex
France
antoine.zimmermann@insa-lyon.fr
http://zimmer.aprilfoolsreview.com/
Received on Friday, 22 April 2011 10:33:38 UTC