Re: Does 'Feature' = 'Real World Thing'? from Jeremy Tandy on 2015-10-22 (public-sdw-wg@w3.org from October 2015)

From: Jeremy Tandy <jeremy.tandy@gmail.com>
Date: Thu, 22 Oct 2015 09:29:54 +0000
To: "Svensson, Lars" <L.Svensson@dnb.de>, Joshua Lieberman <jlieberman@tumblingwalls.com>
Cc: "public-sdw-wg@w3.org" <public-sdw-wg@w3.org>
Message-ID: <CADtUq_02HAL6XM-bVnDPDNX5eDoLwj-z1FgVXQaJvRuFcK_GqQ@mail.gmail.com>
All- many thanks again for contributions to this topic! I think that we
have a resolution- or at least sufficient resolution for us to move forward.

@Clemens ... you said:

> there is no concept that would support using the same identifier for both
features [...]. A feature instance is of exactly one feature type, it
cannot be of the feature type "light house" from application schema A and
"vertical obstruction" of application schema B.

I think that this relates to the frame-based _information_ modelling
approach used for creating Application Schema. And yes- this is different
to the approach taken with RDF.

@Simon ... based on going back to the spec, you conclude:

> the term ‘Feature’ clearly refers to the abstraction, information, data.
[... So we must] Use ‘Thing’ for the real-world (including fictional)
thing, and ‘feature’ for an information object that describes it, according
to some viewpoint.

> regarding Jeremy’s ‘what is the subject’ question, a case could be made
for using a URI for the real-world thing as the subject in RDF statements
in all cases, regardless of the model or ‘feature-type’ in use, while the
set of statements relating to a specific viewpoint (feature-type) comprises
a graph. The URI for a graph identifies the ‘feature’, while the URI for a
thing in the world can be subject of statements from all viewpoints.

+1 from me. This is the conclusion that I had also reached ...

   1. We identify the real-world (including fictional) Thing.
   2. We identify a collection of statements (e.g. a graph) that describe
   the Thing according to a given perspective; the Feature. The Thing is the
   subject of the statements.

This works for me.
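
As a rough illustration of points (1) and (2), here is a minimal Python sketch that uses plain tuples rather than a real RDF library. Every URI and property name in it (the example.org identifiers, "a:lightColour", "b:height" and so on) is a hypothetical placeholder invented for the example, not something agreed in this thread. Each Feature is modelled as a named graph, and the Thing is the subject of the statements in both graphs.

```python
# All URIs and property names are hypothetical placeholders.
THING = "http://example.org/id/lighthouse/42"                 # the real-world Thing
FEATURE_A = "http://example.org/doc/schema-a/lighthouse/42"   # graph for viewpoint A
FEATURE_B = "http://example.org/doc/schema-b/obstruction/42"  # graph for viewpoint B

# Each Feature is a named graph: graph URI -> set of (subject, predicate, object).
features = {
    FEATURE_A: {
        (THING, "rdf:type", "a:LightHouse"),
        (THING, "a:lightColour", "white"),
    },
    FEATURE_B: {
        (THING, "rdf:type", "b:VerticalObstruction"),
        (THING, "b:height", "37ft"),
    },
}


def statements_about(thing, graphs):
    """Collect statements about `thing` from every viewpoint, keeping the source graph."""
    return sorted(
        (s, p, o, graph_uri)
        for graph_uri, triples in graphs.items()
        for (s, p, o) in triples
        if s == thing
    )


def subjects_of(graph_uri, graphs):
    """Return the set of subjects used inside one Feature (graph)."""
    return {s for (s, _, _) in graphs[graph_uri]}
```

Calling statements_about(THING, features) gathers the statements from both viewpoints while still recording which Feature each one came from, which is the separation that the two identifiers are meant to preserve.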

Regarding point (2) it's worth noting that (at least from my pov) it is
best practice for Application Schema to be solely concerned with the
conceptual model (the 'universe of discourse'): the attributes of the Thing
that are deemed important in the application domain. That said, I often see
Application Schema with Feature Types that conflate the 'Thing' and
'Feature' (aka information resource) subjects. The properties defined by
such a Feature Type are then a mixture of those that talk about the 'Thing'
(e.g. height; an instance might assert height = 37ft ... this is clearly
about the 'Thing') and those that are metadata about the 'Feature'
(information object) itself (e.g. creation date, last update time, license,
owner, maintainer etc.). This makes it very difficult to merge data from
instances of those Feature Types because the subject isn't clear.
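
To make the conflation concrete, the sketch below takes a mixed property set and splits it into statements about the Thing and statements about the Feature. It is only an illustration under stated assumptions: the RECORD_PROPERTIES set, the property names, the example.org URIs and the sample values ("CC-BY", the date) are all hypothetical. A real Application Schema would need to document for itself which properties are metadata.

```python
# Hypothetical list of properties that describe the Feature (the record),
# not the Thing. A real schema would have to document this itself.
RECORD_PROPERTIES = {"creationDate", "lastUpdate", "license", "owner", "maintainer"}


def split_feature(thing_uri, feature_uri, properties):
    """Split a mixed property dict into Thing statements and Feature statements."""
    about_thing = []
    about_feature = []
    for name, value in sorted(properties.items()):
        if name in RECORD_PROPERTIES:
            # Metadata: the subject is the information object.
            about_feature.append((feature_uri, name, value))
        else:
            # Domain attribute: the subject is the real-world Thing.
            about_thing.append((thing_uri, name, value))
    return about_thing, about_feature


# Hypothetical instance mixing both kinds of property.
mixed = {"height": "37ft", "license": "CC-BY", "lastUpdate": "2015-10-22"}
about_thing, about_feature = split_feature(
    "http://example.org/id/lighthouse/42",
    "http://example.org/doc/lighthouse/42",
    mixed,
)
```

Once the subjects are separated like this, the Thing statements can be merged with data from other sources without dragging one publisher's licence or update time onto the Thing itself.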

[OK- if that passed you by, don't worry]

@Josh ... you said:

> We can happen to recognize two things that have close to the same spatial
extent (and temporal extent). That doesn’t make them the same “thing”.
[...] We learn something by interpreting the collocation, just as layering
features in a map brings insight.

Indeed. Collocation is a useful indicator but, by itself, is often
insufficient to determine 'sameness'.

If we apply Simon's proposal that the 'Thing' is the subject of the
statements in the Feature, two (or more) Features may use the same 'Thing'
as their subject. This is an explicit assertion of sameness and is achieved
either by both Features using the same identifier for their subject, or by
using the 'sameAs' assertion to say that the two identifiers actually refer
to the same Thing. These are very strong assertions, not to be taken
lightly.
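
The effect of a 'sameAs' assertion can be sketched as follows. This is a simplified stand-in, not how any particular RDF store or reasoner implements owl:sameAs: it uses a small union-find to group identifiers and then pools the statements for each group. The URIs and property names are hypothetical.

```python
def canonical_ids(same_as_pairs):
    """Map each identifier to one representative of its sameAs group (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller URI as the representative.
            low, high = sorted((root_a, root_b))
            parent[high] = low
    return {x: find(x) for x in list(parent)}


def merge_by_subject(triples, same_as_pairs):
    """Pool (predicate, object) pairs under one subject per sameAs group."""
    canon = canonical_ids(same_as_pairs)
    merged = {}
    for s, p, o in triples:
        merged.setdefault(canon.get(s, s), set()).add((p, o))
    return merged


# Hypothetical identifiers minted by two different publishers.
A = "http://example.org/a/lighthouse/42"
B = "http://example.org/b/obstruction/7"
triples = [(A, "a:lightColour", "white"), (B, "b:height", "37ft")]

merged = merge_by_subject(triples, [(A, B)])   # with the sameAs assertion
separate = merge_by_subject(triples, [])       # without it
```

With the assertion in place every statement about B becomes a statement about A, which is why the assertion is so strong: if the two identifiers do not really refer to the same Thing, every merged statement is now wrong.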

Regarding the reconciliation of data that is apparently talking about the
same thing, Ed previously said "here be dragons" [1].

@Lars ...

> Does that mean that if I want to express this in RDF, I need three URIs?
One for the real-world-thing, one for the feature and one for the feature
representation?

Hopefully you're a little less confused. In my mind we have just two URIs:

   - URI identifying 'Thing'
   - URI identifying 'description of Thing' / 'Feature' / 'graph'

Why two URIs? Why can't we just have one? It's clear that we have two
resources: the 'Thing' and 'description of Thing'. The Web Architecture [2]
(that we agreed forms a foundational aspect of our best practice) states:

Constraint: URIs Identify a Single Resource

Assign distinct URIs to distinct resources.


Furthermore, the need to treat 'Thing' and 'description of Thing' as
disjoint resources is the subject of the W3C 'URLs in Data' Primer [3].
'Feature' is synonymous with the term 'Record' [4] defined therein.

Section 4 'Documenting Properties' [5] notes that:

"A data format that mixes properties about [...] records and properties
about the things those [...] records describe is not necessarily ambiguous:
all that's required for developers to understand what the properties
actually apply to is for the meaning of the property to be documented."

This is exactly the situation we have with many existing (GML) Application
Schema. 'URLs in Data' proposes how to declare whether a property describes
the record or the thing. Section 5.3 'Publishing Data' [6] says that:

"Publishers can help enable more accurate merging of data from different
sites if they support URLs for each entity
<http://www.w3.org/TR/urls-in-data/#dfn-entity> they or other sites may
wish to describe, separate from the [...] records
<http://www.w3.org/TR/urls-in-data/#dfn-record> that they publish."

So it's best to have two URIs; one for each of Thing and Feature.

Jeremy

[1]: http://lists.w3.org/Archives/Public/public-sdw-wg/2015Sep/0059.html
[2]: http://www.w3.org/TR/webarch/#id-resources
[3]: http://www.w3.org/TR/urls-in-data/
[4]: http://www.w3.org/TR/urls-in-data/#dfn-record
[5]: http://www.w3.org/TR/urls-in-data/#documenting-properties
[6]: http://www.w3.org/TR/urls-in-data/#publishing-data

On Thu, 22 Oct 2015 at 07:37 Svensson, Lars <L.Svensson@dnb.de> wrote:

> On Wednesday, October 21, 2015 10:38 PM, Joshua Lieberman wrote:
> [...]
>
> [Clemens]
> > > But first of all, the feature is an information object describing a
> > > real-world thing.
> >
> > That's consistent with the definition of Spatial Object in INSPIRE.
> > Restated:
> > • Feature != Real-World Thing
> > • Feature = Information Resource that _describes_ Real-World Thing
> > @Josh, @Simon: can you confirm this meets your expectations?
>
> [Josh]
> > Almost. There are two feature statements needed to get from the world to
> > spatial data:
> >
> > 1. Feature = discernment of a type of Real-World Thing (as distinct from
> > Not Thing)
> > 2. Feature Data = representation of a Feature (as an information
> > resource)
>
> Does that mean that if I want to express this in RDF, I need three URIs?
> One for the real-world-thing, one for the feature and one for the feature
> representation?
>
> Best,
>
> Lars (who starts to feel confused...)
>
>
Received on Thursday, 22 October 2015 09:30:37 UTC