- From: Sandro Hawke <sandro@w3.org>
- Date: Wed, 03 Apr 2002 10:14:56 -0500
- To: timbl@w3.org
- Cc: www-rdf-interest@w3.org
My biggest practical hassle in using n3 is changing between period and semicolon when I add or re-order properties.

It occurs to me that there is no need for period to be used this way: one could use semicolon to mark the end of a tuple, and the parser would fill in any missing fields in the tuple, on the left, with the data from the same field of the previous tuple. The counterargument is probably that redundancy in a language helps catch user errors, but when I hit this error, it's always because I've used the wrong punctuation, not because I've used the wrong number of terms in a tuple.

Actually, given the format people seem to use for n3, I'd lean towards the tuple terminator being either semicolon or newline. (This is much like the shell, where semicolon or newline ends a space-separated tuple which ends up in argv[]. The idea of repeating fields on the left is quite different, though.)

I've used the word "tuple" instead of "triple" because this algorithm generalizes to any size of tuple. It means you could have an n3 file with only predicate/object pairs (perhaps the default subject is "<>", or perhaps it depends on context), or even just objects. I think that's pretty cool -- a file with one term per line would be a collection of values for some property of some subject.

Allowing semicolon where period and comma are now used would not break old data files; it would just give meaning to files which currently have invalid syntax. Allowing newline to work like semicolon would break some files, but not many, and most of those would be caught in a transitional language where comma and period retained their current meaning.

In general, I'd like n3 versions to be identified. My current favorite approach, which comes out of a much larger analysis [1], is to say that if the file anywhere contains the text

    -*- formal-language-URI: "something"; -*-

the "something" MUST be the URI-Reference of a language which the system can properly understand. If the pattern occurs more than once, the first one which can be used MUST be used. I think this mechanism allows nearly-arbitrary languages to be correctly understood without external metadata (Content-Type, filename snooping). (The form of the magic string is from Emacs file variables [2].)

    -- sandro

[1] http://www.w3.org/2001/06/blindfold/langIdent under "Content Sniffing"
[2] http://www.delorie.com/gnu/docs/emacs/emacs_439.html
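
To make the fill-on-the-left idea concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: terms are split on whitespace (so quoted strings containing spaces are not handled), both semicolon and newline end a tuple, and the names `fill_left` and `parse_tuples` and the `default` parameter are made up for this example rather than taken from any n3 parser.

```python
def fill_left(rows, width=3, default=()):
    """Pad each short tuple on the left with fields from the previous tuple.

    `default` (hypothetical) supplies leading fields before any full tuple
    has been seen, e.g. ("<>",) for a default subject.
    """
    previous = list(default)
    result = []
    for row in rows:
        row = list(row)
        missing = width - len(row)
        if missing < 0:
            raise ValueError(f"too many terms in tuple: {row}")
        if missing > len(previous):
            raise ValueError(f"nothing to fill in for tuple: {row}")
        # Copy the same leading fields from the previous tuple.
        row = previous[:missing] + row
        previous = row
        result.append(tuple(row))
    return result


def parse_tuples(text, width=3, default=()):
    """Split text into tuples on semicolon or newline, then fill on the left."""
    # Assumption: whitespace-separated terms, no quoting or comments.
    chunks = text.replace("\n", ";").split(";")
    rows = [chunk.split() for chunk in chunks if chunk.strip()]
    return fill_left(rows, width=width, default=default)
```

With this sketch, `parse_tuples(":a :p :x;\n:q :y;\n:z\n")` treats the second line as a new predicate/object pair for `:a` and the third as another object for `:a :q`, and `parse_tuples(":p :x", default=("<>",))` shows the default-subject case.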
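
The language-identification rule could be sketched along the same lines. This is a rough illustration, not a specification: the function name `sniff_language` and the `understood` set are hypothetical, the tolerance for extra whitespace inside the marker is my assumption (the text above gives only the literal form), and "can be used" is read here as "the URI is in the set of languages the system understands".

```python
import re

# Matches: -*- formal-language-URI: "something"; -*-
MARKER = re.compile(r'-\*-\s*formal-language-URI:\s*"([^"]*)"\s*;\s*-\*-')


def sniff_language(text, understood):
    """Return the first declared language URI that the system understands.

    `understood` is a (hypothetical) collection of language URIs the caller
    can process. Returns None if no declared language is usable.
    """
    for match in MARKER.finditer(text):
        uri = match.group(1)
        if uri in understood:
            return uri
    return None
```

A caller would scan the document once, skip any declarations it cannot handle, and parse with the first one it can.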
Received on Wednesday, 3 April 2002 10:17:10 UTC