The TREE hypermedia specification

Draft Community Group Report,

More details about this document
This version:
https://w3id.org/tree/specification
Feedback:
public-treecg@w3.org with subject line “[TREE] … message topic …” (archives)
Issue Tracking:
GitHub
Editor:
Pieter Colpaert

Abstract

TREE hypermedia describes the pagination of a collection of entity descriptions. It allows clients to follow their nose through a search tree based information structure.

Status of this document

1. Overview

An overview of the TREE specification with the TREE collection, a reference to the first focus node of its members, and the relations to other nodes from the current node.

The TREE specification introduces these core concepts:

A simple collection can be created as illustrated in the following example:

ex:Collection1 a tree:Collection;
            rdfs:label "A Collection of subjects"@en;
            tree:member ex:Subject1, ex:Subject2 .

ex:Subject1 a ex:Subject ;
            rdfs:label "Subject 1" ;
            ex:value 1 .

ex:Subject2 a ex:Subject ;
            rdfs:label "Subject 2" ;
            ex:value 2 .

From the moment this collection of members grows too large for one page, a pagination needs to be created in which an initial set of members can be found through the first tree:Node, and more members can be found by interpreting the TREE hypermedia controls. This is illustrated in the next example:

> HTTP GET https://example.org/Node1

ex:Collection1 a tree:Collection;
            tree:view ex:Node1 ;
            tree:member ex:Subject1, ex:Subject2 .

ex:Node1 a tree:Node ;
        tree:relation ex:R1,ex:R2 .

ex:R1 a tree:GreaterThanOrEqualToRelation ;
    tree:node ex:Node3 ; # This is the URL of another page
    tree:value 3;
    tree:path ex:value .

ex:R1 a tree:LessThanRelation ; # This is useful for a client that is looking for a value 10 or greater
    tree:node ex:Node3 ;
    tree:value 10;
    tree:remainingItems 7 ;
    tree:path ex:value .

ex:R2 a tree:GreaterThanOrEqualToRelation ;
    tree:node ex:Node4 ;
    tree:value 10;
    tree:remainingItems 10 ;
    tree:path ex:value .

ex:Subject1 a ex:Subject ;
            rdfs:label "Subject 1" ;
            ex:value 1 .

ex:Subject2 a ex:Subject ;
            rdfs:label "Subject 2" ;
            ex:value 2 .

2. Definitions

A tree:Collection is a set of tree:Members. The set of members MAY be empty.

A tree:Member is a set of (at least one) quad(s) defined by the member extraction algorithm (see further).

A tree:Node is a dereferenceable resource containing tree:Relations and a subset of () members of the collection. In a tree:Node, both the set of tree:Relations as the subset of members MAY be empty. The same member MAY be contained in multiple nodes.

A tree:Relation is a function denoting a conditional link to another tree:Node.

A tree:Node is part of a search tree, and apart from the root node, it has exactly one other tree:Node of the search tree linking into it through one or more relations.

A tree:search form is an IRI template, that when filled out with the right parameters becomes a tree:Node IRI, or when dereferenced will redirect to a tree:Node from which all members in the collection that adhere to the described comparator can be found.

A search tree is a set of interlinked tree:Nodes. It will adhere to a certain growth or tree balancing strategy. In one tree, completeness MUST be guaranteed, unless indicated otherwise using a retention policy (as is possible in LDES).

3. Initialization

A client SHOULD be initiated using a URL. The client MUST dereference the URL, which will result in a set of [rdf-concepts] triples or quads. When the URL after all redirects, is used in a triple ?c tree:view <> ., a client MUST assume the URL after redirects is an identifier of the intended root node of the collection in ?c.

Note: Dereferencing in this specification also means parsing the RDF triples or quads from the HTTP response. TREE does not limit the content-types that can be used to represent RDF triples. Client developers should do a best-effort for their community.

If there is no such triple, then the client MUST check whether the URL before redirects (E) has been used in a pattern <E> tree:view ?N. where there’s exactly one ?N, then the algorithm MUST return ?N as the rootnode and E as the collection.

The client then MUST dereference the identified rootnode (if it did not do that already) and merge those quads with the already found quads. It now MUST look for a potential search forms that MAY be linked, either i) on top of the rootnode, or ii) on top of the entity linked through tree:viewDescription, using tree:search.

The resulting collection IRI C, the rootnode IRI R, list of already visited pages H (and optionally their time to live), current quads Q and search forms S.

In case it is not done using an unambiguous URL, clients MAY implement the report on Discovery and Context Information (work in progress). This report also explains how clients MAY implement support for extracting context information such as provenance, contact points, etc.

Note: Having an identifier for the collection has become mandatory: without it you can otherwise not define completeness.

4. The member extraction algorithm

Thanks to the member extraction algorithm, a data publisher can define their members in different ways:
  1. As in the examples above: all quads with the object of the tree:member quads as a subject (and recursively the quads of their blank nodes) are by default included (see also [CBD]), except when they would explicitely not be included in case 3, when the shape would be closed.

  2. Out of band / in band:

    • when no quads of a member have been found, the member will be dereferenced. This allows to publish the member on a separate page.

    • part of the member can be maintained elsewhere when a shape is defined (see 3)

  3. By defining a more complex shape with tree:shape, also nested entities can be included in the member

  4. By putting the triples in a named graph of the object of tree:member, all these triples will be matched.

Depending on the goals of the client, it MAY implement the member extraction algorithm. It MUST implement the algorithm in case the goal is to find the complete and finite set of quads that are part of a member as intended by the data publisher. The method used withun TREE is combination of Concise Bounded Descriptions, named graphs and the topology of a shape (deducted from the tree:shape). The full algorithm is specificied in the shape topologies report.

5. Traversing the search tree

After dereferencing a tree:Node, a client MUST extract all (zero or more) tree:Relation descriptions from the page. This can be done by searching for <> tree:relation ?R triples.

A client MUST follow the object of the relation’s ?R tree:node ?object triple, unless the client is able to prune the branch reachable from that node (see further).

A client MAY also extract the tree:Relation’s tree:remainingItems if it exists. If it does, it will be an integer indicating the remaining items to be found after dereferencing the node.

When traversing, a client SHOULD detect faulty search trees by keeping a list of already visited pages.

When dereferencing the object of a tree:node triple, the client MUST follow redirects.

Note: Allowing redirects allows servers to rebalance their search trees over time.

A client MAY assume completeness of members intended by the search tree when it derefenced all node links.

6. Pruning branches

In search trees, not commonly the tree:Relation will be used as the type, but more commonly it will be one of its subclasses. For partial string matching, tree:PrefixRelation, tree:SubstringRelation, and tree:SuffixRelation exist. For comparing various datatypes, tree:GreaterThanRelation, tree:GreaterThanOrEqualToRelation, tree:LessThanRelation, tree:LessThanOrEqualToRelation, tree:EqualToRelation, and tree:NotEqualToRelation exist. Finally, for geospatial trees, tree:GeospatiallyContainsRelation exists.

A client decides, based on their own tasks, what relations are important to implement. Each relation is a comparator function that helps deciding whether or not the subtree reachable from the tree:node link can be pruned. All relation comparator functions to the same tree:node need to be evaluated using a logical AND. As arguments, it this function takes:

  1. The left-hand: what the members on the linked node will contain w.r.t. the literals reachable from the tree:path

  2. The operator: decided by the type of the relation and the datatype or node type of the tree:value triple’s object.

  3. The right-hand: the tree:value triple’s object.

If the client comes across a relation subclass it did not code against, it MUST return true.

Subclasses of tree:Relation commonly will use the tree:path to indicate the path from the member to the object on which the tree:Relation applies. For the different ways to express or handle a tree:path, we refer to 2.3.1 in the shacl specification. All possible combinations of e.g., shacl:alternativePath, shacl:inversePath or shacl:inLanguage in the SHACL spec can be used. When shacl:alternativePath is used, the order in the list will define the importance of the order when evaluating the tree:Relation. The result of the evaluation of the tree:path, is the value that must be compared to the tree:value. When multiple results from the path are found, they need to be interpreted in the function using a logical OR.

The target object of a tree:path SHOULD be materialized in the current Node document, but when it is not, the object MAY be considered implicit on the condition both tree:path and tree:member are defined. In contrast to sh:path, a tree:path MAY refer to an implicit property and may not be materialized in the current response. This may break SPARQL processors that did not yet come across the object before in their query plan. However, the tree may still be useful for query processors that, for example, prioritize queries according to the user’s location, and first download nodes that are nearby the user. Therefore, the materialized location of the object is not needed. While not recommended, possible heuristics could try to infer the data, could try to fetch it through another tree:Collection, or retrieve it using URI dereferencing. Wildcards in the paths (i.e. sh:zeroOrMorePath) however do not trigger any further look-ups.

6.1. Comparing strings

String values have three specific type of relations: the tree:PrefixRelation, the tree:SubstringRelation and the tree:SuffixRelation.

Note: We experimented with server-chosen locales such that ça suffit can also be found when following a tree:PrefixRelation with a tree:value "c" (which at this moment is not supported). That would require an understanding of locales, and browser/JavaScript support for locales is too low to be useful at this point.

Also the comparator relations such as tree:GreaterThanRelation can be used. The strings MUST then be compared according to case sensitive unicode ordering.

When a tree:path is defined, mind that you also may have to check the language of the element using the property shacl:inLanguage More languages MAY be set. When no language is set, all strings are compared.

Note: If you want to have one resource containing both e and é as a prefix, you will have to create multiple relations to the same tree:Node.

6.2. Comparing named nodes

When using comparator relations such as tree:GreaterThanRelation, named nodes must be compared as defined in the ORDER BY section of the SPARQL specification.

6.3. Comparing geospatial features

The tree:GeospatiallyContainsRelation is the relation than can be used to express all further members will be contained within a geospatial region defined by the WKT String in the tree:value.

When using tree:GeospatiallyContainsRelation, the tree:path MUST refer to a literal containing a WKT string, such as geosparql:asWKT.

6.4. Comparing time literals

When using relations such as tree:LessThanRelation or tree:GreaterThanRelation, the time literals need to be compared according to these 3 possible data types: xsd:date, xsd:dateTime or xsd:dateTimeStamp.

7. Search forms

Searching through a TREE will allow you to immediately jump to the right tree:Node. TREE relies on the Hydra search specification for its search forms. It does however extend Hydra with specific search properties (hydra:IriTemplate) for different types of search forms, and searches starting from a tree:ViewDescription, to which the search form is linked with tree:search. The behaviour of the search form fully depends on the specific property, for which TREE introduces a couple of specific properties:

7.1. Geospatial XYZ tiles search form

Three properties allow to specify a geospatial XYZ tiles template (also known as slippy maps).

  1. tree:longitudeTile describes the X value

  2. tree:latitudeTile descrbes the Y value

  3. tree:zoom describes the zoom level

All properties expect positive integers.

<https://tiles.openplanner.team/#LatestCollection> a tree:Collection ;
    dcterms:title "A prototype tree:Collection for Linked OpenStreetMap’s roads"@en .
<https://tiles.openplanner.team/planet/> a tree:Node ;
    
    tree:search [
        a hydra:IriTemplate ;
        hydra:template "https://tiles.openplanner.team/planet/20201103-095900/{z}/{x}/{y}" ;
        hydra:variableRepresentation hydra:BasicRepresentation ;
        hydra:mapping [
            a hydra:IriTemplateMapping ;
            hydra:variable "x";
            hydra:property tree:longitudeTile;
            hydra:required true
        ],[
            a hydra:IriTemplateMapping ;
            hydra:variable "y";
            hydra:property tree:latitudeTile;
            hydra:required true
        ],[
            a hydra:IriTemplateMapping ;
            hydra:variable "z";
            hydra:property tree:zoom;
            hydra:required true
        ]
    ] .   

This search form describes a specific search form that uses a quad tree. The zoom level describes the depth, the longitudeTile and latitudeTile describe the x and y index of the pagination. (e.g., on zoom level 0, there’s 1 tile, on zoom level 1, there are 4 tiles, etc.).

7.2. Searching through a list of objects ordered by time

Same as the previous example but with the predicate tree:timeQuery expecting an xsd:dateTime. This time however, when the page itself does not exist, a redirect is doing to happen to the page containing the timestamp. A tree:path can indicate the time predicate which is intended.

<https://example.org/#Collection> a tree:Collection ;
    dcterms:title "An example collection with a time search view"@en ;
    tree:view <https://example.org/Node1> .

<https://example.org/Node1> a tree:Node ;
    tree:search [
        a hydra:IriTemplate ;
        hydra:template "https://example.org/{generatedAt}" ;
        hydra:variableRepresentation hydra:BasicRepresentation ;
        hydra:mapping [
            a hydra:IriTemplateMapping ;
            hydra:variable "generatedAt";
            tree:path prov:generatedAtTime;
            hydra:property tree:timeQuery;
            hydra:required true
        ]
    ] .

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[CBD]
Patrick Stickler, Nokia. CBD - Concise Bounded Description. 3 June 2005. W3C Member Submission. URL: https://www.w3.org/Submission/CBD/
[RDF-CONCEPTS]
Graham Klyne; Jeremy Carroll. Resource Description Framework (RDF): Concepts and Abstract Syntax. URL: https://w3c.github.io/rdf-concepts/spec/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[SHACL]
Holger Knublauch; Dimitris Kontokostas. Shapes Constraint Language (SHACL). URL: https://w3c.github.io/data-shapes/shacl/