Re: semsheets

Hans-Jürgen,

You don't say whether your interests are academic or practical. I would 
guess academic, because there is no shortage of tools and standards for 
converting non-RDF data to RDF, as indicated by the responses to your 
original post. I don't know what your criteria for "declarative" mapping 
are, but for practical purposes I would settle for functional (or at 
least non-procedural) mappings, such as XSLT or SPARQL. If you want to 
go more declarative (and I have), you can resort to a generic rule 
language such as RIF[1]. But, to be honest, translating one surface 
syntax to another is not the really interesting problem--at least not to 
me after doing it for 3 decades.

But...you are interested in translating the "semantics" hidden beneath 
the surface syntax. In reality, this is asking to spin straw into gold, 
and yes, if anyone could do that it would be groundbreaking. But the 
best way toward this was pointed out 25 years ago by the DSSSL ISO spec[2], 
"Document Style Semantics and Specification Language". Few on this list 
will remember the /annus mirabilis/ 1997, when HyTime 2nd edition 
(ISO/IEC 10744:1997)[3] was published as the capstone of a 
family of text processing specifications that included DSSSL (ISO/IEC 
10179:1996) and SGML (ISO 8879:1986). HyTime and DSSSL died aborning; 
SGML was quickly pushed aside by XML. Simplify, simplify, simplify! was 
the battle cry. RDF in 1997 was just being formalized as a simple 
metadata format for HTML pages. Much of what was good in those ISO specs 
was reincarnated in the XML family of W3C recommendations. Much of what 
was better was left behind, or imperfectly reinvented in fragments.

One of the better features was the "grove" concept and formalization, 
used by both DSSSL and HyTime. DSSSL recognized that what should be 
styled (or transformed) in a document is not the surface syntax, but a 
collection of properties extracted from the surface syntax. These 
collections of properties were called "groves" (anecdotally, Graph 
Representation Of Values). The specification of groves turned out to be 
terribly complex, couched in ISO spec-ese, and burdened with antique 
SGML formalisms, which no doubt contributed to its nearly universal 
neglect. But perhaps it was just ahead of its time and constrained by 
the need to conform to SGML. I believe any "breakthrough" in unlocking 
intellectual assets from arbitrary bytestreams for transformation or 
styling will use something that looks a lot like groves--maybe even 
groves expressed in RDF.
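
To make the idea a little more concrete, here is a minimal sketch of what
"groves expressed in RDF" might look like. It is not the ISO grove model:
the property names (gi, parent, att-*, text) and the urn:example:grove:
namespace are assumptions chosen for illustration, and the functions
build_grove and to_ntriples are hypothetical helpers. The sketch only shows
the general shape: parse the surface syntax once, keep a collection of
nodes with named properties, and serialize those properties as triples.

```python
import xml.etree.ElementTree as ET

BASE = "urn:example:grove:"  # hypothetical namespace, for illustration only


def build_grove(xml_text):
    """Return (node_id, property, value) tuples extracted from an XML string.

    Simplified on purpose: tail text and namespaced names are ignored.
    """
    root = ET.fromstring(xml_text)
    triples = []
    counter = 0

    def visit(elem, parent_id):
        nonlocal counter
        counter += 1
        node_id = f"n{counter}"
        # Each node carries named properties rather than raw markup.
        triples.append((node_id, "gi", elem.tag))  # element type name
        if parent_id is not None:
            triples.append((node_id, "parent", parent_id))
        for name, value in sorted(elem.attrib.items()):
            triples.append((node_id, f"att-{name}", value))
        text = (elem.text or "").strip()
        if text:
            triples.append((node_id, "text", text))
        for child in elem:
            visit(child, node_id)

    visit(root, None)
    return triples


def to_ntriples(triples):
    """Serialize grove tuples as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        if p == "parent":
            # Node-to-node links become IRIs.
            obj = f"<{BASE}{o}>"
        else:
            # Everything else becomes a plain string literal.
            escaped = (
                o.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r")
            )
            obj = f'"{escaped}"'
        lines.append(f"<{BASE}{s}> <{BASE}{p}> {obj} .")
    return lines


if __name__ == "__main__":
    doc = '<memo id="m1"><to>Hans</to><body>Hello</body></memo>'
    for line in to_ntriples(build_grove(doc)):
        print(line)
```

The point of the design is that any later styling or transformation step
would query the property collection, not the angle brackets. A different
parser (for CSV, JSON, or some arbitrary bytestream) could in principle
feed the same kind of property collection.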

Regards,

--Paul

[1] http://www.w3.org/TR/rif-primer/

[2] http://www.jclark.com/dsssl/

[3] https://hytime.org

On 2/23/22 18:00, Hans-Jürgen Rennau wrote:
> My cordial thanks for this wealth of responses which I had not dared 
> to hope for! It will take me time to look at all these projects and 
> products, which ideally would find their places in a single and 
> coherent picture. Sometimes I shall ask a question concerning a 
> particular approach or statement.
>
> A focus of mine will be on the question which approaches may be 
> classified as "declarative mapping languages", borrowing the term from 
> [1] (slide 33). I am sure that tools *not* qualifying as such may in 
> particular scenarios be a superior choice, but on the large scale I 
> think it is declarative mapping languages where the highest potential, 
> perhaps even groundbreaking success is to be expected.
>
> For example, "resource specific XSLT scripts" (mentioned by Christian) 
> may be very efficient (as pointed out by Martynas), but they are not 
> declarative. And I suppose, the same applies to TARQL, but I may be 
> mistaken and will try to check. Beneath the variability of external 
> characteristics there may also be basic differences of perspective, as 
> hinted at by a quote from Enrico [2]: " Proposals focus on either 
> engineering content transformations or accessing non-RDF resources 
> with SPARQL. ... we explore an alternative solution and contribute a 
> general-purpose meta-model for converting non-RDF resources into RDF: 
> Facade-X."
>
> With kind regards,
> Hans-Jürgen
>
>
> [1] Maria-Esther Vidal, Tutorial on "Challenges for Efficiently 
> Creating and Maintaining Knowledge Graphs".
> https://service.tib.eu/ldmservice/dataset/sdmkgc
> [2] https://doi.org/10.3233/ssw210035
>
> Am Mi., 23. Feb. 2022 um 08:10 Uhr schrieb Hans-Jürgen Rennau 
> <hjrennau@gmail.com>:
>
>     Hello,
>
>     I am interested in the transformation of non-RDF data into RDF
>     data and I am puzzled, nay, haunted by a simple analogy. We have
>     stylesheets for defining visual representation of data in a
>     convenient, standardized way. Could we not have "semsheets" for
>     defining semantic representation of data in a convenient,
>     standardized way?
>
>     I admit the oversimplification: CSS stylesheets are designed to
>     work with HTML, a scope sufficient for practical purposes. Whereas
>     "non-RDF data" is by definition a broad spectrum of media types,
>     so the uniformity of a single "semsheet language" may not be
>     attainable. But how about approaching the goal, based on an
>     appropriate partitioning of data sources? For example:
>
>     (1) Relational data
>     (2) Tree-structured data
>     (3) Other
>
>     Tree-structured data comprises most structured data except for
>     graph data - JSON, XML, HTML, CSV, .... And concerning "other",
>     what comes to my mind is (i) unstructured text and (ii) non-RDF
>     graph data.
>
>     So keeping this partitioning in mind, how about standards,
>     frameworks, tools enabling customized mapping of data to RDF?
>
>     What I am aware of is very little:
>
>     (1) relational data: R2RML [1], ?
>     (2) tree-structured data: RML [2], ?
>     (3) other: ?
>
>     Note that I did not mention RDFa, as it is about embedding, rather
>     than writing mapping documents, nor GRDDL, as it is about finding
>     a mapping document, not its content.
>
>     I am convinced that there are quite a few other standards,
>     frameworks and tools which should be listed above, replacing the "?".
>
>     Can you help me to find them? Any links, thoughts, comments highly
>     appreciated. (And should you think the partitioning is faulty,
>     please share your criticism. The same applies to the very quest
>     for common, standardized mapping languages.)
>
>     Thank you! With kind regards,
>     Hans-Jürgen Rennau
>
>     [1] https://www.w3.org/TR/r2rml/
>     [2] https://rml.io/specs/rml/
>

Received on Thursday, 24 February 2022 04:30:31 UTC