Re: Semantic Web archaeology

On Fri, 28 Jun 2019 at 09:01, Antoine Zimmermann <antoine.zimmermann@emse.fr>
wrote:

> Recently on Stack Overflow, there was a question asking "Why rdf:Seq and
> not rdfs:Seq?" [1]. I tried to answer the best I could, by digging in
> the old RDF mailing lists, but I am still puzzled about how some terms
> ended up in the rdf: namespace rather than rdfs: (and vice versa). Can
> someone involved in the early days of RDF enlighten us about this?
>
> Nowadays, the duplication of namespaces for RDF terms seems rather
> silly, confusing, and counter productive. Maybe it made sense, back in
> the days...
>

Sure, let me fill in some details, possibly too many. This all dates from
1997. In 1997, several things happened.

* W3C, via its Metadata Activity (https://www.w3.org/Metadata/) led by Ralph
Swick, was getting reorganized. The PICS system for content labelling was
to be generalized. PICS already (from 1995/6) included a label syntax (
https://www.w3.org/PICS/labels-951121.html), signed labels (
https://www.w3.org/TR/2009/REC-DSig-label-20091124/), a rule language (
https://www.w3.org/TR/2009/REC-PICSRules-20091124/) and a "label bureau"
protocol. However, it did not provide a way for multiple independent
labelling schemes to be used together, nor did it support rich structured
descriptions or datatyped (strings, numbers etc.) rather than categorical
rating values.
* June 1997 - Netscape submitted Meta Content Framework (MCF) in XML,
 https://www.w3.org/Submission/1997/8/ - both a spec (
https://www.w3.org/TR/NOTE-MCF-XML/) and a tutorial (
https://www.w3.org/TR/NOTE-MCF-XML/MCF-tutorial.html).  While MCF is
clearly the technical ancestor to a lot of what you see in RDF, it didn't
have the same partitioning; the MCF spec had model, syntax, schema and some
starter vocabulary.
* The Dublin Core metadata initiative, meanwhile, was trying to express
in-page Web metadata using HTML 3.2's simple "meta" tags, and encountering
difficulties with nested or repeated fields; e.g. see
http://www.dublincore.org/specifications/dublin-core/info-factoring/2001-03-19/
 (1996) http://www.dlib.org/dlib/june97/metadata/06weibel.html . This 1996
AHDS report gives a feel for the metadata formats of that era,
https://web.archive.org/web/19980118053003/http://ahds.ac.uk/public/metadata/disc_32.html
.
For example, if you had multiple authors with multiple sets of contact
details each, you ended up with metadata fields named things like
"DC.creator.email.2".
See also http://www.ariadne.ac.uk/issue/7/mcf/ for Dublin Core community
exploration of using MCF.
* Lots of projects were crawling the Web and extracting metadata, e.g. see
the earlier 1996 workshop,
https://www.w3.org/Search/9605-Indexing-Workshop/ - you'd see mentions
of formats like SOIF from the Harvest indexing project, or MARC and the
Z39.50 protocol from libraries.
* XML itself was taking shape (as a cut-down format derived from SGML),
including a draft namespaces mechanism.

It was also the height of the browser wars.

I'm sure I've missed a bunch of stuff, but the picture is basically that
the Web had just broken through into everyday life and things were crazy
and moving fast. So when W3C (itself a very new organization) planned the
RDF work there was a sense that it needed to be partitioned and layered,
and to get something useful out soon without being stuck in complexity.

Two RDF Working Groups were initially created in 1997; first (May 1997) the
Model and Syntax WG https://www.w3.org/RDF/Group/Syntax/ (pages are W3C
Members only, still), and a little later that year, the RDF Schema WG
https://www.w3.org/RDF/Group/Schema/

Their charters are not public but I think it is reasonable to excerpt here.

From https://www.w3.org/RDF/Group/SyntaxCharter

"""Purpose and Scope: The Resource Description Framework Model and Syntax
Working Group (RDF-syntax-wg) will define an interchange format for
encoding and exchange of structured resource description data (metadata)
for Web resources. This framework will include all the capabilities of
PICS-1.1 and in addition will support more general models of resource
description, including non-numeric and structured attribute values. The RDF
Model and Syntax WG will work closely with an RDF Schema WG.

Requirements
The goal of RDF is to provide a single mechanism for representing metadata
across many applications. The semantics and structure of many varieties of
metadata will be specified by independent communities. The RDF must provide
an infrastructure that is sufficiently general and flexible to support
these disparate applications. Example applications include sitemaps,
content ratings, stream channel definitions, search engine data collection
(web crawling), digital library collections, and distributed authoring.

Dependencies
The RDF is an evolutionary step from PICS-1.1. The importance of the
content rating application is recognized explicitly by a requirement that
the RDF support existing PICS-1.1 data types and functional specifications.
It must be possible to automatically translate PICS-1.1 labels to RDF.

The RDF Model and Syntax Working Group is responsible for defining an
architecture and interchange format for resource descriptions. The RDF
Schema design is the responsibility of a separate but closely coordinated
working group.

There has been an agreement that the RDF work will build on top of XML, and
that the XML namespace work will supply some of the modularity requirements
for RDF.

The Model and Syntax Working Group will also incorporate into the Resource
Description Framework any requirements for the purposes of digitally
signing resource descriptions defined by the W3C Digital Signature Working
Group."""



The RDF Schema WG's charter followed along a few months later:

"""Purpose and Scope
The Resource Description Framework Schema Working Group (RDF-schema-wg)
will define a model for schemas to specify the semantics of information
encoded in the Resource Description Framework and a language for the
encoding and exchange of those schemas.

Requirements
This schema model must be consistent with the data model produced by the
RDF Model and Syntax Working Group. While it is recognized that not all
aspects of metadata semantics can be described in a machine understandable
form the goal of the RDF Schema working group is to build on well
understood methods from the fields of database schema representation and AI
knowledge representation to enable this as far as possible.

The RDF Schema language must be syntactically compatible with the language
chosen by the RDF Model and Syntax Working Group and must support all the
functions in a PICS-1.1 rating service description. It must be possible to
automatically translate PICS-1.1 rating service descriptions to RDF schemas.

The goal of RDF is to provide a single mechanism for representing metadata
across many applications. The semantics and structure of many varieties of
metadata will be specified by independent communities. Much of these
semantics can be specified in a declarative machine understandable form.
Having such specifications available will greatly improve interoperability.
The goal of the RDF Schema mechanism is to enable this.

Dependencies
The RDF Schema Working Group will coordinate with the RDF Model and Syntax
Working Group to insure that all the features of the data model defined by
the Model and Syntax Working Group are represented in the RDF Schema
specification.  The Metadata Coordination group will help with this
coordination."""

---

In the two Working Groups, the way this played out was that the RDF M+S WG
introduced certain notions informally and "in passing", without
elaboration, and the RDFS group fleshed out some of those details.
Meanwhile, the RDFS WG took care not to introduce new syntax, and decided
to express its schema language within RDF using whatever syntax the other
Working Group created. The first public draft of RDF was the M+S spec from
October 1997: https://www.w3.org/TR/WD-rdf-syntax-971002/

It already talked in MCF-like terms: "In this data model both the
resources being described and the values describing them are nodes in a
directed labeled graph (and values may also be resources). The arcs
connecting pairs of nodes correspond to the names of the property types."
... and a basic notion of types followed along soon enough too. The
requirement to represent ordered structures within this otherwise unordered
graph data model was identified immediately, and tied in to the practical
requirements that came from PICS and from the various other metadata
efforts of that era that I've sketched.
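
To make the quoted data model concrete, here is a minimal Python sketch of
a directed labeled graph held as a set of triples. It is an illustration
only, not anything from the 1997 drafts: the class names, the values()
helper, the example.org URIs and the choice of Dublin Core property URIs
are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Set, Union


@dataclass(frozen=True)
class Resource:
    """A node (or arc label) identified by a URI."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """A plain value node, e.g. a string."""
    value: str


Node = Union[Resource, Literal]


@dataclass(frozen=True)
class Triple:
    subject: Resource
    predicate: Resource  # the arc label ("property type" in the 1997 wording)
    obj: Node            # a value, which may itself be a resource


# Hypothetical example data.
DC = "http://purl.org/dc/elements/1.1/"
doc = Resource("http://example.org/doc")
alice = Resource("http://example.org/people/alice")

graph: Set[Triple] = {
    Triple(doc, Resource(DC + "title"), Literal("An example page")),
    Triple(doc, Resource(DC + "creator"), alice),
    Triple(alice, Resource("http://example.org/terms/email"),
           Literal("alice@example.org")),
}


def values(g: Set[Triple], subject: Resource, predicate_uri: str) -> Set[Node]:
    """Follow every arc with the given label out of `subject`."""
    return {t.obj for t in g
            if t.subject == subject and t.predicate.uri == predicate_uri}
```

Because the graph is a plain set, nothing in it records which statement
came first, which is why ordering needed separate treatment.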

The RDF M+S WG put things in their namespace like rdf:type, while trying to
say as little as possible about the nature of namespaces, schemas and
related things. There was also a rough suggestion, sometimes articulated
explicitly, that multiple different schema systems could be built on top of
the base RDF. Or that instance data could stand alone, schema-less, and
still be useful. When DARPA's DAML came along a few years later, the idea
of alternatives to RDFS sharing the same base took more explicit shape with
DAML+OIL and OWL, but there were also some rather tense discussions with
XML groups in the 1997-9 period. XML was grounded in SGML heritage and
there was a broad expectation that the DTD part of SGML (and XML) would
eventually be replaced/modernized, and that when this happened, there would be
an opportunity for the relationship between RDF and XML to be made more
explicit. XML people were not super happy at the prospect of their future
schema language being expressed somehow in RDF (although Tim Bray made a
draft exploring this, see https://www.w3.org/TR/NOTE-dcd). And the RDF/S
WGs didn't see a way for DTDs to really make sense for use with RDF.
Nevertheless there were clear points of overlap, e.g. datatypes. It was
clear that 1998-era RDFS wouldn't be the last word on the topic.  You can
see some of this baggage in earlier drafts of RDFS e.g.
https://www.w3.org/TR/1999/PR-rdf-schema-19990303/#intro as well as in the
later https://www.w3.org/1999/04/WebData and
https://www.w3.org/TR/schema-arch/ documents.

The result of all this was pressure for RDFS to be a pretty minimalistic
language, one that could be replaced or elaborated upon later. This also
meant that there was a concern not to have too much RDFS leak into the
(hopefully uncontroversial) core of RDF, which needed to be finalized and
usable ASAP. That is roughly how we end up with things like the "type" and
"property" terminology being introduced in the RDF M+S spec (and rdf:
namespace), while structures making these ideas more explicit (Class,
Property etc.) were elaborated upon in "rdfs:" a little later. As it
happened, the M+S specification did make it to W3C REC status (in '99)
whereas RDFS got stuck in limbo for a good while afterwards, and only came
back to formal life once W3C re-chartered the Metadata Activity as the
follow-on "Semantic Web" Activity in 2001.
https://lists.w3.org/Archives/Public/sw99/ has some bits and pieces from
that transition period.

Nobody really liked rdf:Seq but it showed that order could be represented
in the graph structure, and the promise of "you could add a utility
vocabulary to express order differently" probably helped make it bearable.
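
As a rough sketch of how rdf:Seq carries order inside an unordered graph,
the Python below encodes a list using the numbered membership properties
rdf:_1, rdf:_2, ... and recovers the order by sorting on those numbers.
The helper names seq_triples and read_seq, the "_:authors" node label and
the sample values are hypothetical; only the rdf: namespace URI and the
rdf:Seq / rdf:_n terms come from RDF itself.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def seq_triples(seq_node, items):
    """Encode an ordered list as an rdf:Seq.

    Produces one rdf:type triple plus one rdf:_n triple per item,
    returned as an (unordered) set of (subject, predicate, object) tuples.
    """
    triples = {(seq_node, RDF + "type", RDF + "Seq")}
    for index, item in enumerate(items, start=1):
        triples.add((seq_node, f"{RDF}_{index}", item))
    return triples


def read_seq(triples, seq_node):
    """Recover the item order from the rdf:_n membership properties."""
    prefix = RDF + "_"
    members = []
    for subject, predicate, obj in triples:
        if subject != seq_node or not predicate.startswith(prefix):
            continue
        suffix = predicate[len(prefix):]
        if suffix.isdigit():
            members.append((int(suffix), obj))
    # Sort numerically so that rdf:_10 comes after rdf:_9.
    return [obj for _, obj in sorted(members, key=lambda m: m[0])]


# Hypothetical usage: an ordered author list.
authors = ["Alice", "Bob", "Carol"]
graph = seq_triples("_:authors", authors)
recovered = read_seq(graph, "_:authors")
```

The order lives entirely in the property names, so a consumer that does
not know about rdf:_n sees only three unrelated arcs.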

</1990s>,

Dan

Received on Tuesday, 2 July 2019 14:28:18 UTC