- From: Reto Bachmann-Gmür <reto@gmuer.ch>
- Date: Wed, 16 Jan 2008 16:28:27 +0100
- To: semantic-web@w3.org
- Message-ID: <478E229B.9090500@gmuer.ch>
While we can still read two-hundred-year-old texts quite easily, we see
our machines struggling to deal with triples of the same age. Looking at
the triples, we see many names starting with "http". What are these
names, and why do they require so much temporal and cultural disambiguation?
Before the Semantic Web started and the triples began to flow, a lot of
information in the technologically developed regions was looked up using
the HTTP protocol in combination with a hierarchical naming system
called DNS. The http-names were originally addresses that could be
resolved within that hierarchy; the idea was that for every name one
could contact a system which would return an authoritative definition of
that name. Originally this system was relatively stable: individuals and
organizations could rent a sub-section of the namespace. The root of the
namespace was originally controlled by organizations of the United
States of America. A European network ("Open Root Server Network")
replicated the American-controlled network but was designed to become
independent should the political situation require it. As the Open Root
Server Network was never detached for a prolonged period, the system
worked as a unique hierarchical naming system for around thirty years.

In 2012 a coalition of governments and civil organizations campaigned
for a "free" and "safe" naming system. This campaign eventually led to
the "Free Open Network" (FON), which offered names free of charge and
guaranteed "safe" names through a court system that revoked names found
to be "misleading or dangerous to the public". Acceptance of the new
system varied by region; in several countries its usage became mandated
by law. On the American continent and in parts of Europe the old system
continued to be dominant. Disambiguation became especially hard after
the FON authorities redefined some terms of popular vocabularies in
2015; many parties using names assigned by FON kept the old definitions,
arguing that some terms had outgrown the web and had acquired a
common-sense meaning.
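For readers who never used the old system directly: resolving an
http-name amounted to an ordinary HTTP request routed through the DNS
hierarchy, asking the system behind the name for a machine-readable
definition. A minimal sketch in Python of what such a lookup looked like
(the vocabulary name below is invented for illustration, and nothing
guarantees the responsible host still answers):

    from urllib.request import Request, urlopen

    # Hypothetical http-name whose definition we want to look up.
    name = "http://example.org/vocab#Person"

    # Ask the system behind the name for a machine-readable definition.
    req = Request(name, headers={"Accept": "application/rdf+xml"})
    with urlopen(req) as resp:
        definition = resp.read()  # the "authoritative definition", if any
    print(definition[:200])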
An additional issue is the sheer number of names. By that time there
were sometimes several thousand names for the same thing. It can seem
paradoxical that the most unimportant terms had the highest number of
synonyms. The reason for this is that in the early days people argued
that everything should have a name. While the triple-spaces were defined
to allow anonymous contextual entities, many preferred to name just
about everything, so that an authoritative definition could be looked
up. For terms not important enough for a social consensus on a
well-known set of names to arise, many processors simply made up names
in their own http-namespaces. The same was the case when the information
was not sufficient for identification: for example, every time you
walked through an area monitored by video cameras, the monitoring system
would assign you an http-name. The inflation of names was so great that
for many names we cannot even find definitions in libraries. The ODC
defines less than a millionth of the http-names used at the time. With
these names, and treating the other http-names as contextual (i.e.
ignoring the name), we can reasonably interpret many old triples.
However, for many http-names we will ultimately never know whether they
were just labels associated with contextual nodes or whether they in
fact had an intersubjective meaning at the time.
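This is essentially how our interpreters proceed today: names the ODC
still defines are kept, all other http-names are demoted to anonymous
contextual nodes, and the original string survives only as a label. A
minimal sketch in Python using rdflib (the sample data, the
defined-names set, and the camera namespace are all invented):

    from rdflib import Graph, URIRef, BNode, Literal
    from rdflib.namespace import RDFS

    # Hypothetical set of http-names for which the ODC has a definition.
    defined = {URIRef("http://example.org/vocab#seenAt")}

    old = Graph()
    # A name minted by a monitoring system, as described above.
    sighting = URIRef("http://cam42.example.net/sighting/7718")
    old.add((sighting, URIRef("http://example.org/vocab#seenAt"),
             Literal("2015-06-01T12:00:00")))

    cleaned = Graph()
    anon = {}  # undefined http-name -> fresh anonymous node

    def contextual(term):
        """Demote an undefined http-name to an anonymous contextual node."""
        if isinstance(term, URIRef) and term not in defined:
            node = anon.setdefault(term, BNode())
            # Keep the old name as a mere label, not as an identifier.
            cleaned.add((node, RDFS.label, Literal(str(term))))
            return node
        return term

    # Predicates keep their names; only subjects and objects are demoted.
    for s, p, o in old:
        cleaned.add((contextual(s), p, contextual(o)))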