AW: Thing Description for existing data sources

Hi all,
hi Victor

Some of us had a number of discussions on these topics as well in Bremen:


1.      Modeling of data stream
The API is designed in such a way that streams are modeled as collections of single events (this had been debated on this mailing-list), each having an identifier (a URI). One can either access the events individually (the measurements) or access the stream as a whole (the collection). Obviously, the TD should contain an Event to describe a measurement. But at the same time, the collection that would normally act as the Event also returns data and can be seen as a Property. I guess this measurement pattern is pretty common. It is even captured in the SSN ontology (where a measurement is called an observation). How to deal with that?

I am not sure about your usage of "a measurement", "stream as a whole", and "data".
In my view, the Event interaction pattern in TD should:


-        Provide the metadata for the stream of measurements and how to understand individual measurements

-        Describe how to subscribe, so that new events are pushed to the client. A sub-resource may be used as a handle to cancel the subscription (or update its configuration, such as a minimum notification interval or a notification threshold). An open issue is how to describe the interactions with this handle (maybe through the TD interaction patterns...).

-        If they are discrete events that may not be lost: A read on the Event resource can return a list of all individual measurements ("single events") and provide a link to each of them "to access the events individually" (each list entry is basically a Property; again the TD interaction descriptions may help here). Pagination or your next issue (query parameters) might be useful. Is this list what you call "access the stream as a whole"?

-        Optionally list all or specific subscriptions depending on the authorization when reading the Event. This could be part of the listing you get through a read on the Event resource.

What do you mean by "can be seen as a Property"? A potentially very large structure that holds all "single events"? Or a Property that holds the latest event?
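
To make the points above a bit more concrete, here is a rough Python sketch of how I picture such an Event resource. It is only an illustration under my own assumptions, not a proposal for the TD: the class name EventResource, the "sub/..." and "events/..." identifiers, and the offset/limit paging are all made up for this example. It shows the subscription handle, the paginated read, and the two possible readings of "Property" (the full list versus the latest event).

```python
from itertools import count
from typing import Callable, Dict, List, Optional


class EventResource:
    """Hypothetical in-memory stand-in for a TD Event resource."""

    def __init__(self, metadata: dict) -> None:
        self.metadata = metadata          # how to understand one measurement
        self._events: List[dict] = []     # discrete events that must not be lost
        self._subscribers: Dict[str, Callable[[dict], None]] = {}
        self._ids = count()

    def subscribe(self, callback: Callable[[dict], None]) -> str:
        """Register a callback and return a handle (a sub-resource path)."""
        handle = f"sub/{next(self._ids)}"
        self._subscribers[handle] = callback
        return handle

    def cancel(self, handle: str) -> None:
        """Cancel a subscription through its handle."""
        self._subscribers.pop(handle, None)

    def emit(self, value) -> None:
        """Store a new event and push it to every current subscriber."""
        event = {"id": f"events/{len(self._events)}", "value": value}
        self._events.append(event)
        for callback in list(self._subscribers.values()):
            callback(event)

    def read(self, offset: int = 0, limit: int = 10) -> dict:
        """Reading the Event: metadata, one page of events, subscriptions."""
        return {
            "metadata": self.metadata,
            "events": self._events[offset:offset + limit],
            "subscriptions": sorted(self._subscribers),
        }

    def latest(self) -> Optional[dict]:
        """The other reading of 'Property': only the most recent event."""
        return self._events[-1] if self._events else None


stream = EventResource({"unit": "celsius"})
received: List[dict] = []
handle = stream.subscribe(received.append)
stream.emit(21.5)      # pushed to the subscriber
stream.cancel(handle)
stream.emit(22.0)      # stored, but no longer pushed
```

After these calls, `received` holds only the first event, `stream.read()` lists both, and `stream.latest()` returns the second one.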


2.      Query parameters
A proposal has just come up for dynamic query parameters in the group: https://github.com/w3c/wot/tree/master/proposals/resource-parameters. OData specifies some standard parameters like  "filter" or "orderby". It would be great to have these in the TD model. See http://docs.oasis-open.org/odata/odata/v4.0/errata02/os/complete/part1-protocol/odata-v4.0-errata02-os-part1-protocol-complete.html#_Toc406398291.

This issue also relates to the Explicit Protocol Bindings. There we need some description on how to construct a FETCH message for CoAP. A similar problem might arise if we need to construct an HTTP request for a legacy Web API that uses queries.
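
For the HTTP case, the request construction could look roughly like the following Python sketch. It assumes the OData system query options are spelled "$filter" and "$orderby" on the wire; the function name build_query_url, the example.org URL, and the "value" and "timestamp" property names are hypothetical and only serve the illustration.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_query_url(base_url: str,
                    filter_expr: Optional[str] = None,
                    orderby: Optional[str] = None) -> str:
    """Append OData-style query options to a resource URL."""
    params = {}
    if filter_expr:
        params["$filter"] = filter_expr
    if orderby:
        params["$orderby"] = orderby
    if not params:
        return base_url
    # quote_via=quote encodes spaces as %20; safe="$" keeps the "$" prefix.
    return base_url + "?" + urlencode(params, quote_via=quote, safe="$")


# Hypothetical stream URL and property names.
url = build_query_url(
    "http://example.org/streams/temperature",
    filter_expr="value gt 20",
    orderby="timestamp desc",
)
print(url)
```

A TD-level description of such parameters would have to tell the client which options exist and how they map into the URL (or into a FETCH payload for CoAP); the sketch hard-codes that knowledge.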


3.      Value type definition (I)
OData defines its own schema language (CSDL). Although I could translate the schemas the platform uses into JSON Schema, I doubt this can be a long-term solution. Moreover, it would require all data providers to re-engineer their data models. I would advocate instead that data type definitions should be outside of a Thing Description. Here, all schemas are hosted by the platform: WoT devices could actually do the same. Value types would then just need to contain the local URI of the type definition and its schema language. This way, no cloud connectivity is required for a client to retrieve the type definition. A JSON serialization of the type definition is also not mandatory anymore. One could use Relax-NG, XML Schema or even CSDL. This is the simplest solution I found to re-use existing value types declared by the platform.

I also noticed that the TD is growing drastically in complexity. I think it will become very hard to solve everything at a single place. Thus, a plug-in mechanism similar to "security" or "encodings" might be nice for the type definitions. This could be a link to a different kind of schema description, e.g., in CSDL, or a vocabulary term coming from a context file.

A problem I noticed when trying to describe OCF resources is that JSON Schema, like most other schemas, lacks the support of semantic annotations directly in the definition of the data structure. Hence, we would have a structural definition in JSON Schema and then a similar structure to annotate the semantics at the right place.

I think we need to investigate a schema language that allows the definition of data structure and semantic annotations in one place. Conceptually, this would be a machine-understandable definition of a representation format (the things identified by Internet Media Types). This would be something very powerful for the Web of Things. Imagine SenML in its textual form (https://tools.ietf.org/html/draft-ietf-core-senml-00). It specifies the structure as well as the information model, that is, the semantics behind the addressable data elements in SenML. Now imagine it wasn't just text, but a proper RDF model that can be imported into the TD.
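
As a very rough illustration of "structure and semantics in one place", consider the Python sketch below. It uses a JSON-Schema-like structure in which each element may carry an "@type" annotation next to its structural definition. The "@type" key inside the schema and the "ex:" vocabulary terms are hypothetical; JSON Schema itself does not define them. The helper collects the annotation for every addressable data element.

```python
from typing import Dict


def collect_annotations(schema: dict, path: str = "") -> Dict[str, str]:
    """Map each annotated element (by path) to its semantic type.

    Only nested "properties" are followed in this sketch; arrays and
    other constructs are left out.
    """
    found: Dict[str, str] = {}
    if "@type" in schema:
        found[path or "/"] = schema["@type"]
    for name, sub_schema in schema.get("properties", {}).items():
        found.update(collect_annotations(sub_schema, f"{path}/{name}"))
    return found


# Structure and (hypothetical) semantic annotations side by side.
measurement_schema = {
    "type": "object",
    "@type": "ex:Measurement",
    "properties": {
        "value": {"type": "number", "@type": "ex:Temperature"},
        "unit": {"type": "string", "@type": "ex:Unit"},
        "note": {"type": "string"},  # structural only, no annotation
    },
}

annotations = collect_annotations(measurement_schema)
```

With a single document like this, there is no need for a second, parallel structure that repeats the layout just to attach the semantics.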


4.      Value type definition (II)
This solution still needs to be refined. For instance, CSDL is formalized in XML Schema, which means one could theoretically parse any CSDL document with the sole knowledge of XML Schema. Which schema language is it better to declare? CSDL (more efficient but too specialized) or XML Schema (more generic but resource-consuming)? Should the decision be left to the implementer of WoT software?



The schema only defines the structure. Somehow we need to attach the semantic annotations directly in the schema. I think schema.org goes in this direction, but has the other issues you identified.


Best wishes
Matthias

Received on Friday, 10 June 2016 17:34:46 UTC