
Re: n3/n-triples syntax question

From: Tim Berners-Lee <timbl@w3.org>
Date: Fri, 30 Nov 2001 16:17:52 -0500
Message-ID: <01ab01c179e4$79dc9c50$e9001d12@CREST>
To: "Sandro Hawke" <sandro@w3.org>
Cc: <www-rdf-interest@w3.org>

----- Original Message ----- 
From: "Sandro Hawke" <sandro@w3.org>
To: <timbl@w3.org>
Cc: <www-rdf-interest@w3.org>
Sent: Friday, November 30, 2001 2:22 PM
Subject: n3/n-triples syntax question

> I was telling Bijan that his n3 parser should output N-Triples, when
> we came across the problem that anonymous node names (_:qname) are a
> pain to generate uniquely.  When you need one (eg for a [ ]
> construct), you can't just make one ("_:g57"), because the user might
> use the same identifier ("_:g57") later in the document.

That is assuming that the document is parsed in a pipe,
and you don't want to rename the user's.
Suppose you are pipelining -- treating it as an infinite
stream with limited local storage as you process.
You could generate _:_g1234 ids as you go,
and keep a dictionary.  When you find the user has
reused one, you rename *that*.   In theory, as it is
an anonymous node the user has really no right to
ask for the same identifier back (any more than
the same prefix used in the output file). And in
practice, any user or program which has fed you
things looking like _:_g1234 will be very unlikely to
be processing them in any way other than automatically
-- there isn't going to be a whole lot of diagnostic human
meaning in them.
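
A minimal sketch of that pipelined renaming, in Python. The class
and method names (BlankNodeNamer, fresh, user) are hypothetical and
are not taken from cwm or any other parser; the only assumption is
that the parser can tell a label it needs for a [ ] construct from a
label the user actually wrote.

```python
class BlankNodeNamer:
    """Hands out output labels for blank nodes while streaming.

    Hypothetical helper: generated labels look like _:_g1, _:_g2, ...
    and a user label is renamed only if it collides with a label
    that has already been emitted.
    """

    def __init__(self, prefix="_:_g"):
        self.prefix = prefix
        self.counter = 0
        self.used = set()    # every label already written to the output
        self.user_map = {}   # user-written label -> label used in output

    def _next_generated(self):
        # Skip over any candidate that is already taken.
        while True:
            self.counter += 1
            label = f"{self.prefix}{self.counter}"
            if label not in self.used:
                self.used.add(label)
                return label

    def fresh(self):
        """Label for a node the user left anonymous, e.g. a [ ] construct."""
        return self._next_generated()

    def user(self, label):
        """Output label for a label that appeared in the input."""
        if label in self.user_map:
            return self.user_map[label]
        if label in self.used:
            out = self._next_generated()   # collision: rename the user's
        else:
            out = label
            self.used.add(out)
        self.user_map[label] = out
        return out
```

The dictionary still grows with the number of distinct labels seen,
so this only bounds storage if blank nodes are reasonably sparse in
the stream.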

If you are not using a pipe, then why use a number at all?
Use the address of the object created. This is one
way I have been thinking of taking the cwm code - the
anonymous nodes have no allocated URI until they are output,
and at that point they are regenerated.  I have code in
cwm actually to regenerate them on output just to make the file
look cleaner.
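
As a rough illustration of the non-pipelined case, here is a Python
sketch in which anonymous nodes are plain objects with no label, and
labels are only allocated when the triples are written out. This is
not the cwm code; BlankNode and serialize are made-up names, and
other terms are assumed to be already-formatted N-Triples strings.

```python
class BlankNode:
    """An anonymous node: no stored label, only object identity."""
    pass


def serialize(triples):
    """Return N-Triples-style lines, labelling blank nodes on output."""
    labels = {}  # BlankNode object -> label, allocated on first use

    def term(t):
        if isinstance(t, BlankNode):
            if t not in labels:
                labels[t] = f"_:g{len(labels) + 1}"
            return labels[t]
        return t  # assumed to be a ready-made URI or literal string

    return [f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples]
```

Because every blank node in the input, user-labelled or not, becomes
an object at parse time, there is nothing left to collide with when
the labels are regenerated.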

> Our best solution is to say you generate illegal names ("_:57") during
> parsing, then at the end of the document, you rename those over to the
> first _:gXX that's not already taken.   Painful, but correct.

Well, you don't need to have a string at all, as I say - it can
be the object you use in memory.

> The more obvious approach of "reserving" names like _:_gXXX would
> violate the principle of N-Triples being a sub-language of n3, at
> least in spirit.   Maybe you can finesse the definition of "reserve",
> and say that such names "may conflict with names generated internally
> if you go beyond N-Triples to other n3 features."  Pretty ugly.

There are a lot of practical systems which reserve obscure things.
It isn't clean but it works in practice.  I do use a convention that
I never start an identifier that I, as a person, make up with a "_".
Python uses __whatever__ for system names, and so on.  It depends
on the kludge level of the code, I suppose.

> Better solutions?
>       -- sandro
Received on Friday, 30 November 2001 16:17:58 UTC
