Re: defining the semantics of lists from thomas lörtsch on 2020-06-15 (semantic-web@w3.org from June 2020)

From: thomas lörtsch <tl@rat.io>
Date: Mon, 15 Jun 2020 12:28:35 +0200
To: Jiří Procházka <ojirio@gmail.com>
Cc: semantic-web@w3.org
Message-Id: <7517314E-36D9-4FE3-AF6D-29F240CAA4B4@rat.io>
> On 13. Jun 2020, at 18:36, Jiří Procházka <ojirio@gmail.com> wrote:
> 
> Just adding a vocabulary for describing container size, without any special semantic implications, is of course easy. There is the question if this should be in the "rdf" namespace, which would make it a harder task. People like myself would like RDF to be a minimalist unopinionated foundation to build upon. It would seem like cluttering. I'd prefer if collections etc. had their own vocabulary(ies), perhaps opinionated, with semantics even.
> 
> The attitude towards change is understandable when you take into account that the whole semantic web project is interesting to people from different backgrounds, for different purposes. Hence the focus on a minimalist foundation and the opposition to evolving RDF into a kind of Frankenstein-swiss-army-knife (which you could say it already is, if you are being strict enough).

I wouldn’t be that strict. As a language aimed at end users, whom we want to upgrade the data they publish on the web with some semantics, it is if anything a little too terse. The misuse of owl:sameAs is an example of a desire to connect and relate that hasn’t been met with an appropriate vocabulary term. The problem wasn’t that the semantics of owl:sameAs were not crisp enough but that there wasn’t anything better in reach for the over-eager young semanticist (yes, I’m talking about myself too).
For a good design it’s not enough to be correct, it also has to be reasonably complete and rounded and meet the needs of users. That’s why in the case of the "limit" attribute I do indeed think that it should be part of core RDF. The other candidate would be something like owl:sameAs, but without formal semantics (and with more same-ness mojo than rdfs:seeAlso). That’s a bit like paving the cow paths, I guess.
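
As a minimal sketch of the difference between describing a size and enforcing it: assume a hypothetical property ex:last whose value is the last membership property of a container (modelled on the 'last' attribute in the Series example quoted further down). The ex: namespace and all function names are invented for illustration; none of this is part of RDF. The description entails nothing by itself, but an application may check locally whether the data fits it.

```python
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/vocab#"  # hypothetical namespace, not part of RDF

# Matches the container membership properties rdf:_1, rdf:_2, ...
MEMBERSHIP = re.compile(re.escape(RDF) + r"_([1-9][0-9]*)$")


def membership_indexes(triples, container):
    """Return the sorted indexes n of all rdf:_n statements about container."""
    found = []
    for s, p, _o in triples:
        m = MEMBERSHIP.match(p)
        if s == container and m:
            found.append(int(m.group(1)))
    return sorted(found)


def declared_last(triples, container):
    """Return the index named by the hypothetical ex:last property, or None."""
    for s, p, o in triples:
        if s == container and p == EX + "last":
            m = MEMBERSHIP.match(o)
            if m:
                return int(m.group(1))
    return None


def fits_description(triples, container):
    """Application-side check: no member lies beyond the declared last one.

    A container without ex:last is simply undescribed and passes.
    """
    last = declared_last(triples, container)
    if last is None:
        return True
    return all(i <= last for i in membership_indexes(triples, container))
```

A graph that states ex:last rdf:_2 and later gains an rdf:_3 member is still perfectly legal RDF; only an application that chooses to call fits_description would reject it.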

> Backwards compatibility is important for the reason that breaking change makes upgrading the same (or close to) as replacing it with something different. 
> I agree on many points regarding RDF issues (named graphs etc.) but for me these are secondary to my main issue: I don't see much movement towards a modular machine-first system, built on a common minimalist foundation, with an ecosystem of easily integrated components to support the desired use cases. What we have is a sprawl of mostly academic projects which are often very difficult to integrate.

One might partition the vocabulary into
- core (most of RDF) (which I guess is the common minimalist foundation that you refer to), 
- the informal, user oriented parts (lists, also better support for n-ary relations)
- (maybe also templates for some ubiquitous data types like addresses, events etc)
- basic reasoning (domain, range, subclassing, subproperties, typing), 
- more reasoning (RDF reasoning about itself, some less intimidating parts of OWL), 
- and the big gap: sound meta modelling
- and the other big gap: closing and constraining the world locally
That’s probably all still doable in an RDF 2. Some owl:sameAs statements to ensure backward compatibility ;-) Applications could then be refactored accordingly. Is that what you have in mind? (Probably not, I’m guessing a bit here.)

> We are lacking some common foundations and established ways how to build on them. For example we don't have established answers to these questions:
> 
> When exactly is it right to build semantic extensions?
> When to build a vocabulary/ontology?
> How should the semantics extensions be built in order to be as interoperable as possible?

We don’t really have much discussion about the concept of semantic extensions at all. The OWA is an indispensable base, but in practice it is also an unrealistically idealistic scenario. Semantic extensions should bridge the gap between the OWA and closed world applications, but the notion has never really been filled in. OWL DL as a semantic extension is even more embarrassing than the already hard to swallow OWA. Apart from that, application developers do their thing and close the world around them implicitly.
So this definitely needs to be tackled. The question is not so much "what is the best way" but rather "what could be a shared way": how can we bridge and negotiate between different realms of openness etc.
Probably Named Graph semantics in RDF need to be fixed first. Then we can use that as a base to express semantic extensions, reasoning regimes etc. Then we can make our applications deal with them appropriately.
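
One way to picture this layering, purely as a sketch: a dataset whose default graph attributes a named graph with a reasoning regime, and an application that answers queries per graph according to that regime. The ex:semantics property, the ex:OpenWorld and ex:ClosedWorld values, the dataset layout and the function names are all assumptions made for illustration, not existing RDF vocabulary or agreed Named Graph semantics.

```python
EX = "http://example.org/vocab#"  # hypothetical namespace
OPEN = EX + "OpenWorld"           # hypothetical regime identifiers
CLOSED = EX + "ClosedWorld"


def regime_of(dataset, graph_name):
    """Look up the regime declared for a named graph; default is open world."""
    for s, p, o in dataset["default"]:
        if s == graph_name and p == EX + "semantics":
            return o
    return OPEN


def holds(dataset, graph_name, triple):
    """Return True, False, or None (unknown) for a triple in one named graph.

    Under the open world regime an absent triple is unknown; only a graph
    explicitly declared closed lets absence count as falsity.
    """
    if triple in dataset["named"][graph_name]:
        return True
    if regime_of(dataset, graph_name) == CLOSED:
        return False
    return None
```

The closed world stays local to the graph that declares it: asking the same question of any other graph in the dataset still yields "unknown" rather than "false".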

Cheers,
Thomas

> Best, 
> Jiri
> 
> 
>> On June 11, 2020 12:02:07 AM GMT+02:00, "thomas lörtsch" <tl@rat.io> wrote:
>> 
>> 
>>> On 10. Jun 2020, at 22:30, Jiří Procházka <ojirio@gmail.com> wrote:
>>> 
>>> Right, as has been pointed out by many, changing or extending RDF
>>> semantics would be extremely difficult task, but it isn't necessary.
>>> 
>> There’s two things:
>> 
>> 1) Adding a property to the Container vocabulary as _describing_ the size of a container wouldn’t change the semantics of RDF. We discussed at length in this thread what can and can’t be achieved when changing the fundamentals of RDF is not an option - and I sure think it isn’t.
>> 
>> 2) The attitude towards change in this community is difficult, to say the least. When Dan Brickley posted his very interesting history lessons a few months ago I realized that reification had been considered for deprecation around 2000 already. OMG. On the one hand we are trying to build something of unprecedented size and reach, on the other hand we can’t make the slightest backwards-incompatible change from day 1 on? This attitude is insane, and it only leads to the standard getting stale and eventually being replaced by something new altogether - but at a much higher cost to everybody involved than some breaking changes would incur.
>> I was at the RDF Next Steps Workshop in Stanford in 2010 (it was open to everybody and I couldn’t resist) and I remember well the vivid argument that we would risk the uptake and adoption that was just about to start if we changed anything more than the most pressing technicalities. Last year at the Berlin Workshop I heard that now it’s too late for profound change as the installed base is already too big, that ship has sailed. So I must have missed the historic millisecond, around 2016 and a half presumably, when the stars had aligned, once and for all? What a joke, what a farce. Occasionally dark forecasts are made that all Semantic Web companies will go bankrupt if we introduce more than the most pressing changes. In the meantime a new industry of now about the same size as ours grew up - Property Graphs - on the sole grounds that they have better usability and meta modelling. So who will go bankrupt soon and why exactly? And a handful of companies support RDF* already - probably because they don’t want to go bankrupt, not because change is so impossibly hard and shouldn’t even be attempted.
>> 
>>> Easier would be to make your own RDF collection/container vocabulary
>> 
>> So, I’d say change is good and is indeed pretty necessary. If nobody cares and everybody just dabbles in their own vocabulary instead of trying to fix the mothership RDF, there won’t be much reason to fix it soon enough. That would be easier for sure, but not very sustainable.
>> The situation with lists is a classic SNAFU. But lists are the easiest part. Identification is broken and no one even cares anymore. "Just use AI to disambiguate" I hear. Reification is a mess (and RDF* in its current state might make it better, or worse) and Named Graphs have no semantics, so there is no sound meta modelling in RDF. OTOH everybody who doesn’t work on well contained single topic applications will acknowledge that the Semantic Web needs an agreed upon facility to model context and reification. That requires change, and it requires a vision about how everything can fit together. And it is about time.
>> 
>>> and
>>> a validation language for it. The language would have its own semantics,
>>> possibly backed by some formal logic. The language syntax could be RDF.
>>> In many ways Shapes Constraint Language (SHACL)[1] could be an
>>> inspiration (also has RDF as syntax), could be used by your language, or
>>> your language could be a SHACL extension. Overall, what you are going
>>> for sounds like a generalization of an application of SHACL.
>> 
>> A property to describe the intended size of a Container would belong to RDF. Constraints to enforce such a limit locally in applications would belong to SHACL, ShEx etc. But I’m currently trying to figure out if lists as proper datatypes aren’t a better way forward.
>> 
>>>> But how could we enforce such constraint descriptions not just in
>>>> applications but within the OWA realms of RDF?
>>>> 
>>> Do you mean make entire RDF graphs invalid just because it doesn't fit
>>> some constrains for a particular purpose? Why would that be desirable?
>>> Even if a graph says something which is invalid for some purpose, it
>>> could be valid for another purpose - a simple example being RDF
>>> visualization.
>> 
>> Closing the world for some application shouldn’t affect the Open World around it. A dataset might have to undergo certain procedures and scrutiny before I feed it into my application. Some applications certainly demand such restrictiveness. That doesn’t say anything about the uses and usefulness of that dataset outside of such a specific application.
>> 
>> As a side note, if you decide to go down the rabbit hole, you might want
>> to consider handling collections restricted to members of a particular
>> class/datatype or their subclasses.
>> 
>> A Series container with an explicit extension hook might be a nice idea: immutability was just one example. But it could also be overkill.
>> 
>> 
>> Best,
>> Thomas
>> 
>> 
>>> Cheers,
>>> Jiri
>>> 
>>> [1] https://www.w3.org/TR/shacl/
>>> 
>>> On 6/8/20 12:19 PM, thomas lörtsch wrote:
>>>> On 4. Jun 2020, at 01:40, Jiří Procházka <ojirio@gmail.com> wrote:
>>>> 
>>>> This has been an interesting thread to follow, but from the start I've
>>>> felt a clearly stated use case is missing. This would clear up many
>>>> things, possibly pointing to a solution which doesn't require changes to
>>>> RDF(S) semantics at all.
>>>> 
>>>> I remember someone (Pat?) rightly saying that RDF was not a data
>>>> structure language, but a KR (or something like that).
>>>> So simply wanting the sort of things that programming languages have
>>>> as data structures is not necessarily a useful thing to spend time on.
>>>> 
>>>> I much prefer predicates to be specific to the range and domain that
>>>> they are working over.
>>>> 
>>>> Agreed, personally I think the rdf:List, RDF containers vocabularies
>>>> should be used mainly as base classes/properties to be subclassed in
>>>> domain specific schema or perhaps not at all.
>>>> 
>>>> That said, one might be describing things like APIs in RDF. In such
>>>> cases one might describe some examples of input, having to be careful to
>>>> not define them with an unintended structure (for example a branching or
>>>> looping list). If this was the use case, RDF authoring tools could
>>>> feature warnings for irregular structures.
>>>> 
>>>> An interesting use case is describing APIs accepting RDF data as input.
>>>> Ideally it should be using a domain specific schema, but I can see it
>>>> often having constructs similar to rdf:List or RDF containers. The APIs
>>>> descriptions should also define how unexpected inputs are treated.
>>>> This could be standardized, separately to RDF specifications.
>>>> 
>>>> These are the use cases which I thought of, I would like to know of
>>>> others. I don't think a change of RDF semantics would be needed for these.
>>> 
>>> Right, no need to change the RDF semantics. It's two things however:
>>> A what RDF can describe
>>> B how descriptions can be enforced as constraints
>>> 
>>> A
>>> Lists in RDF can be described as
>>> - closed (rdf:List)
>>> - ordered (rdf:List, rdf:Seq, rdf:Bag, rdf:Alt)
>>> - without duplicates (rdf:Alt)
>>> - with preferred member (rdf:Alt)
>>> This is a bit messy and not all combinations are available. Adding a 'last' attribute to Containers might ameliorate the situation. The rdf:Series class that I proposed mainly as an example to facilitate discussion could be extended to address all those properties and maybe more:
>>> 
>>>            // A new subclass of Container that very explicitly describes
>>>            // size, preferred alternative, ordering and duplicates.
>>>            // Ordering and duplicates default to TRUE.
>>>            // If no attributes are set Series corresponds to Seq.
>>>            // This container type has the combined semantic expressivity
>>>            // of Seq, Bag and Alt plus the capability to describe its size.
>>>            // The immutable property is a hint that such a newly defined
>>>            // construct could indeed be used to describe a few more things.
>>> 
>>>  Series     subClassOf  Container
>>>  last       domain      Series
>>>             range       ContainerMembershipProperty
>>>  pref       domain      Series
>>>             range       ContainerMembershipProperty
>>>  order      domain      Series
>>>             range       Boolean
>>>  dupes      domain      Series
>>>             range       Boolean
>>>  immutable  domain      Series     // why not reach for the stars...
>>>             range       Boolean
>>> 
>>> B
>>> Translating those descriptions to constraints can be tricky, as Peter’s questions proved, but it certainly can be done in one way or another. APIs described in RDF, OWL DL formalizations, rules, ShEx constraints are all viable options. But how could we enforce such constraint descriptions not just in applications but within the OWA realms of RDF? That requires closing the world around the description. Named Graphs come to mind. So basically:
>>> - define a way to guarantee sound naming semantics for Named Graphs,
>>> - define a way to attribute graphs with semantic extensions,
>>> - define such extensions (like e.g. an app friendly local CWA/NAF/UNA walled garden)
>>> - teach RDF consuming applications to process data according to the rules of such extensions locally, within those graphs
>>> and be done with it?
>>> 
>>> Thomas

[Lots of previous messages snipped.]
Received on Monday, 15 June 2020 10:28:55 UTC