Re: Namespace persistence etc from Phil Archer on 2016-08-24 (public-sdw-wg@w3.org from August 2016)

From: Phil Archer <phila@w3.org>
Date: Wed, 24 Aug 2016 12:17:20 +0100
To: Dan Brickley <danbri@google.com>
Cc: SDW WG Public List <public-sdw-wg@w3.org>, Scott Simmons <ssimmons@opengeospatial.org>
Message-ID: <98910a75-eaf0-d79b-bbab-d29a49d1122f@w3.org>
Hey Dan, pls see inline below.

On 24/08/2016 10:14, Dan Brickley wrote:
> (excuse the belatedness of this reply, I thought I had responded but
> don't see it in the thread)
>
> On 13 July 2016 at 06:12, Phil Archer <phila@w3.org> wrote:
>> @Scott - please chime in with any variance to this from an OGC perspective.
>>
>> Dear all,
>>
>> I must begin by apologising for not being on the SSN call today/last night.
>> I could make up some convoluted reason but the truth is that I forgot.
>>
>> I know one of the topics discussed was the issue around vocabulary term
>> persistence so I should set out a few things about that.
>>
>> The principle is, I think, straightforward: any change made to a vocabulary
>> shouldn't break existing implementations. Since we don't know who has an
>> implementation, we can't write to everyone and ask "if we change this will
>> your thing break?" Therefore we have to be cautious.
>
> We discussed this a bit further f2f last time. If you want to be this
> strict you will literally only be allowing yourself meaningless
> changes to a term's definition. For example, if you change the case,
> spelling, indentation, punctuation, phrasing order or other minor
> aspects of the rdfs:comment of a type or property, you're not
> affecting 1.) for a type, the things that are in it 2.) for a
> property, the pairs of things that it relates. As soon as you start
> tweaking the text to clarify meaning, you affect 1.) or 2.), and these
> can always potentially create breakage. The notion that some changes
> are broadening and some are restricting does not affect whether those
> changes might break things; all that is needed for potential breakage
> is any change from previous conditions. Software and applications can
> be very fragile, and embody all kinds of assumptions.
>
> Consider the example of Course markup, and a CourseInstance type with
> a courseMode property. Imagine version one of the definition gave
> "face-to-face" as a (text or URL-based) value option for that
> property. A later revision might want to clarify whether Skype
> sessions (or VR or whatever) counted as face-to-face. Prior to that
> clarification applications could have assumed it did, or that it
> didn't; there's always the risk of breakage even with modest
> improvements. This is not a radical change in meaning, but can make
> the difference between something working as intended and not. It is
> also not a theoretical example but comes from Google's review of the
> draft Courses schema,
> https://www.w3.org/community/schema-course-extend/wiki/Mode_of_study_or_delivery

I guess it's a question of balance, then. It is only search engines and,
I think, even amongst those, only Google, that have access to this kind
of view of the real world. So you're able to look at how terms are
actually used and make an assessment. The rest of the world works
without such access and so I tend to err on the side of
caution/conservatism. If there is clear evidence, wherever it comes
from, that a term's definition should be amended to match the ground
truth then, OK, it seems right to do so. But that evidence needs to be
available, I think; otherwise, a new term should probably be minted.

Then we get into how long something has to be published before
it's locked. If I publish a new term today and think better of it
tomorrow, am I required to keep it as it is in case someone somewhere
used my original? After 24 hours, no. After a week, almost certainly
not. A month? Six? A year? There's no right answer to that.

>
>> That's what leads to W3C saying that vocabulary terms may not be deleted or
>> their semantics changed radically. But it only applies at the namespace
>> level. If you have a new namespace, you can do what you like since nothing
>> will break. *However* it's going to be really confusing if some terms in the
>> old and new namespaces are the same but with radically different semantics.
>> So my interpretation is:
>>
>> Same namespace:
>> ===============
>> No deletions.
>> No changing or tightening of semantics (i.e. don't add a new domain or range
>
> FWIW the approach we took at schema.org was to use weaker domain-like
> and range-like properties that give us more wiggle-room
> (domainIncludes, rangeIncludes). It is a kind of promise that things
> might continue evolving.

Yes and if you'd done that when you were editing RDF Schema it might 
have been a good idea, but, well, the RDF WG wrote it as it is.
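
To make the difference concrete, here is a minimal Python sketch. It uses
plain tuples rather than any RDF library, and the ex: terms and the
infer_types helper are made up for illustration. Under RDFS semantics,
rdfs:domain licenses a type inference for every subject that uses the
property; schema.org's domainIncludes, as I understand it, carries no
such entailment, which is where the wiggle-room comes from.

```python
# Triples are plain (subject, predicate, object) tuples.
# All ex: names are hypothetical and exist only for this illustration.
RDFS_DOMAIN = "rdfs:domain"
DOMAIN_INCLUDES = "schema:domainIncludes"


def infer_types(schema, data):
    """Apply the RDFS domain rule: if P rdfs:domain C and S P O, then S rdf:type C."""
    domains = {}
    for s, p, o in schema:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    inferred = set()
    for s, p, o in data:
        for cls in domains.get(p, ()):
            inferred.add((s, "rdf:type", cls))
    return inferred


data = [("ex:thing1", "ex:observes", "ex:temperature")]

# Same property, declared two ways.
strict = [("ex:observes", RDFS_DOMAIN, "ex:Sensor")]
loose = [("ex:observes", DOMAIN_INCLUDES, "ex:Sensor")]

strict_result = infer_types(strict, data)  # ex:thing1 is inferred to be an ex:Sensor
loose_result = infer_types(loose, data)    # nothing is inferred

print(strict_result)
print(loose_result)
```

So adding an rdfs:domain later changes what a reasoner concludes about
data that already exists, while adding a domainIncludes value does not.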

>
> You mention tightening and (later) weakening. What about clarifying?
> Realizing that definitions were not as tight as originally hoped is a
> hugely important class of schema edit.
>
>> - make a sub class|property and put the new restrictions on that)
>> Deprecation is OK.
>> Loosening semantics is OK (so you *can* remove a domain or range restriction
>
> This can also cause breakage, if downstream clients expect the data to
> already embody those restrictions. There are also restrictions that
> are not embodied in domain/range but are carried in the textual
> definitions.

OK, so I'm tending towards conservative.

>
>> since it is extremely unlikely that doing so will break anyone's existing
>> implementation).
>> Adding new terms is fine.
>> Clarifying existing definitions is OK.
>> Adding new translations of labels is expressly encouraged.
> +1
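
For what it's worth, the same-namespace rules above could be sketched as
a mechanical check between two versions of a vocabulary. This is only a
rough illustration in Python: the dict layout, the ex: terms and the
classify_changes helper are all assumptions of mine, and it only sees
formal domain/range and deprecation flags. It cannot see the restrictions
carried in the textual definitions, which is exactly Dan's point.

```python
def classify_changes(old, new):
    """Compare two versions of a vocabulary published in the same namespace.

    Each version maps a term name to
    {"domain": set, "range": set, "deprecated": bool}.
    Returns a sorted list of (term, change, verdict) tuples.
    """
    findings = []
    for term, before in old.items():
        after = new.get(term)
        if after is None:
            findings.append((term, "deleted", "not allowed"))
            continue
        for key in ("domain", "range"):
            if after[key] - before[key]:
                findings.append((term, key + " added", "not allowed: tightening"))
            if before[key] - after[key]:
                findings.append((term, key + " removed", "allowed: loosening, but check clients"))
        if after["deprecated"] and not before["deprecated"]:
            findings.append((term, "deprecated", "allowed"))
    for term in new.keys() - old.keys():
        findings.append((term, "added", "allowed"))
    return sorted(findings)


# Hypothetical before/after versions of a tiny vocabulary.
old = {
    "ex:observes": {"domain": {"ex:Sensor"}, "range": set(), "deprecated": False},
    "ex:legacyId": {"domain": set(), "range": set(), "deprecated": False},
}
new = {
    "ex:observes": {"domain": set(), "range": {"ex:Property"}, "deprecated": False},
    "ex:madeBy": {"domain": set(), "range": set(), "deprecated": False},
}

findings = classify_changes(old, new)
for finding in findings:
    print(finding)
```

In this made-up case the deletion of ex:legacyId and the new range on
ex:observes would be flagged, while the new term and the dropped domain
would pass, the latter with a warning.
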
>>
>> Different namespace
>> ===================
>> We can be a little more relaxed here. Recall that documents on w3.org are
>> persistent so the original documentation will always be there (at the
>> original URI or redirected from it).
>>
>> No need to replicate the whole of the old vocabulary, so no need to include
>> deprecated terms - they are deprecated by not being included in the new
>> namespace.
>>
>> Assuming the vocabulary has the same name then terms that appear in both old
>> and new should broadly be the same although semantics can change a little.
>> It's a matter of judgement.
>>
>> The case I keep in mind is Dublin Core/DC Terms. dc:creator took either text
>> or a URI as a value - which was confusing. dcterms:creator should take a
>> URI.
>
> Minor nitpick, DC doesn't say quite that. See
> http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=terms#terms-creator
> It says that the value of a dcterms:creator property will be a
> dcterms:Agent. For which see
> http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=terms#Agent
> "A resource that acts or has the power to act.",
> "Examples of Agent include person, organization, and software agent.".
>
> So (in JSON-LD) you could have something like,
>
>  {
>    "...": "......",
>    "dcterms:creator": {
>      "@type": "dcterms:Agent",
>      "foo": "bar", ....
>    }
>  }
>
> There are those in the Linked Data community who take the view that
> every time you mention an entity you should give a URI for it, but
> that viewpoint is not currently baked into DC Terms. All that DC Terms
> says is that a creator is something that can act, which is pretty
> broad. But it does as you point out discourage us from using names of
> those things as values for the property.

Understood. But please bear in mind that not everyone has several 
hollowed out mountains full of servers to interpret fuzziness.
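
A minimal sketch of what a small consumer might have to do with that
latitude, assuming JSON-LD-like input already parsed into Python dicts
with no context processing at all. The describe_creator helper and the
example documents are hypothetical, not part of any library.

```python
def describe_creator(doc):
    """Report which shape the dcterms:creator value takes in a parsed document."""
    value = doc.get("dcterms:creator")
    if value is None:
        return "missing"
    if isinstance(value, str):
        # A bare name: the usage DC Terms discourages for this property.
        return "plain string"
    if isinstance(value, dict):
        # A described agent, with or without an identifying URI.
        return "agent with URI" if "@id" in value else "agent without URI"
    return "unsupported"


# Hypothetical inputs covering the shapes discussed above.
by_name = {"dcterms:creator": "Jane Example"}
by_node = {"dcterms:creator": {"@type": "dcterms:Agent", "foo": "bar"}}
by_uri = {"dcterms:creator": {"@id": "http://example.org/people/jane"}}

for doc in (by_name, by_node, by_uri, {}):
    print(describe_creator(doc))
```

Every branch there is something a simple client has to anticipate, which
is the cost I have in mind.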

Phil.


>
> cheers,
>
> Dan
>
>
>> Hope this helps clarify things.
>>
>> Phil.
>>
>> --
>>
>>
>> Phil Archer
>> W3C Data Activity Lead
>> http://www.w3.org/2013/data/
>>
>> http://philarcher.org
>> +44 (0)7887 767755
>> @philarcher1
>>
>

-- 


Phil Archer
W3C Data Activity Lead
http://www.w3.org/2013/data/

http://philarcher.org
+44 (0)7887 767755
@philarcher1
Received on Wednesday, 24 August 2016 11:17:33 UTC