Re: Language tags as values - document consequences (Concepts and MT)

On May 11, 2013, at 6:20 AM, Andy Seaborne wrote:

> 
>>>> I like that, but please can we have a DECISION on this quickly,
>>>> as it would need a major edit to Semantics. (It would make it a
>>>> lot simpler. It is really a host of minor edits which I can do in
>>>> one day.)
>>>> 
>>>> Also, we still need to decide whether or not
>>>> "Ilovelanguagefivehundred"@500^^rdf:langString  is just silly, or
>>>> an ill-formed literal, or a syntax error. It's going to be a
>>>> problem for someone, for sure, but if it's ill-formed then it is a
>>>> problem for me.
>>> 
>>> IMO
>>> 
>>> "Ilovelanguagefivehundred"@500
>>> 
>>> should be treated like "thenumber14"^^xsd:integer - fails as a
>>> D-entailment because the (generalised) lexical form
>>> ("Ilovelanguagefivehundred", "500") has a language tag not matching
>>> the rules of BCP47.
>> 
>> I know it doesn't match the rules of BCP47. So what does this mean for
>> RDF? Do we say it is a parse error, so this triple simply does not
>> exist (cannot possibly occur) in the abstract syntax? Or do we say,
>> it is legal syntax, but (like "abc"^^xsd:integer ) it is an ill-typed
>> literal? The semantics needs to know.
>> 
>> If it is syntactically legal but ill-typed, then I need to do
>> extensive editing of the semantics document, because that will mean
>> that a graph can be RDF-inconsistent. Right now that is impossible,
>> and I have been relying on that impossibility to keep things simple.
> 
> Pat,
> 
> I don't understand that point - if an RDF processor is not required to recognize rdf:langString and xs:string

But they are. I agree, if we had no built-in datatypes in RDF then it would be cleaner. On the other hand, I recognize that cleanness of the semantics isn't the primary driver behind RDF design.

> , with them being handled by D-entailment like any other datatype, they don't generate basic RDF inconsistency do they?  Only if the particular D-entailments are supported?

Yes, *if* we do that, then my problems go away. But currently that is not a WG decision, I believe.

> 
> rdf:langString is in the abstract syntax, so we do need to say something somewhere even if it is not an absolute requirement to support the datatype.  That can go in concepts.
> 
> I would be happy either with a requirement that the langtag MUST match by BCP47 (or the weaker RFC3066) and so @500 is simply not RDF (like <foo>@en). 
> Also, I'd be happy to say that at the abstract syntax level a language tag is a string.  It's ill-typed by section 8 like any other datatype with a non-mapping to the value space.
> 
> Neither of these give RDF-inconsistency do they?

The second does if rdf:langString is built into RDF entailment (which, to repeat, it currently is.) 
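
To make the two options concrete, here is a minimal sketch, not a proposal for either document. It assumes a well-formedness pattern modelled on Turtle's LANGTAG token (letters, then hyphen-separated alphanumeric subtags), which is looser than the full BCP47 grammar. The names well_formed, parse_strict and parse_lenient are hypothetical.

```python
import re

# Assumption: pattern modelled on the Turtle LANGTAG token, without the
# leading "@". It is looser than the full BCP47 grammar.
LANGTAG = re.compile(r"[A-Za-z]+(-[A-Za-z0-9]+)*")


def well_formed(tag: str) -> bool:
    """True if the whole tag matches the simplified pattern."""
    return LANGTAG.fullmatch(tag) is not None


class NotRDF(ValueError):
    """Option 1: the literal never makes it into the abstract syntax."""


def parse_strict(lexical: str, tag: str) -> tuple:
    # Option 1: a bad tag is a syntax error, so no literal (and no triple)
    # exists and the semantics never has to consider it.
    if not well_formed(tag):
        raise NotRDF("language tag is not well-formed: " + repr(tag))
    return (lexical, tag)


def parse_lenient(lexical: str, tag: str) -> dict:
    # Option 2: at the abstract-syntax level the tag is just a string.
    # The literal exists, but it may have no value, i.e. it is ill-typed.
    return {"lexical": lexical, "tag": tag, "ill_typed": not well_formed(tag)}


if __name__ == "__main__":
    for tag in ("en", "500", "XX-500"):
        print(tag, well_formed(tag))
```

Under the first option "Ilovelanguagefivehundred"@500 raises an error and there is nothing for the semantics to interpret. Under the second it is a literal in the graph that is flagged ill-typed, which is where inconsistency can arise if rdf:langString is recognized by RDF entailment. Note that the simplified pattern accepts XX-500, as the Turtle token rules do.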

> 
> 	Andy
> 
> PS xs:string literals can also be ill-typed (e.g. "\u0000").

Really? I guess I was under the impression that this was a syntax error. Sigh. I wish we could get this clear. Just using MUST (NOT) language or just saying "literals are" something does not make it clear enough. 

> Hence the suggestion to make it not required to be handled in RDF, only at the D-entailment level.  Also avoids the minor issue that xs:string (XML 1.0) and xs:string (XML 1.1) are different.

In semantics, there are no minor issues :-)
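
As an illustration of why the two versions differ, here is a sketch that assumes the lexical space of xs:string is given by the Char production of the relevant XML version. The function names are hypothetical; the code point ranges are the Char productions of XML 1.0 and XML 1.1.

```python
def is_xml10_char(cp: int) -> bool:
    """XML 1.0 Char: tab, LF, CR, then U+0020 upward minus surrogates and U+FFFE/U+FFFF."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml11_char(cp: int) -> bool:
    """XML 1.1 Char: everything from U+0001 upward minus surrogates and U+FFFE/U+FFFF."""
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def in_lexical_space(text: str, is_char) -> bool:
    """True if every code point of text is allowed by the given Char test."""
    return all(is_char(ord(ch)) for ch in text)


if __name__ == "__main__":
    for sample in ("abc", "\u0000", "\u0001"):
        print(
            repr(sample),
            in_lexical_space(sample, is_xml10_char),
            in_lexical_space(sample, is_xml11_char),
        )
```

On this reading "\u0000" is outside the lexical space under both versions, while a control character such as "\u0001" is excluded by XML 1.0 and admitted by XML 1.1.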

Pat

> 
>> 
>> Pat
>> 
>> 
>>> 
>>> This is independent of registration of language tags - we could add
>>> that it must be any current or previously registered language tag
>>> but the grammatical rules of BCP47 are enough and look to be the
>>> best future proof approach.
>>> 
>>> FWIW: The syntax rules of Turtle forbid it at the character-parsing
>>> level - not so for RDF/XML - but they do pass @XX-500.
>>> 
>>> 
>>> Andy
>>> 
>>>> 
>>>> Pat
>>>> 
>>>> PS. For the record, built-in (required) datatypes are a royal
>>>> PITA for the Semantics editor. rdf:XMLLiteral was a PITA in 2004
>>>> and xsd:string and rdf:langString are a PITA now. In Semantics
>>>> they are like heavy sacks that you have to keep strapped to your
>>>> belt all the time because regulations say you must, but all they
>>>> do is get in the way and trip you up when you are in a hurry. But
>>>> that's just from the editor's point of view.
>>>> 
>>>> 
>>>> 
>>>>> 
>>>>> This licenses current systems.
>>>>> 
>>>>> Recognizing xs:string is about bad characters in the lexical
>>>>> form. This isn't what all systems do for, say, control
>>>>> characters.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> ------------------------------------------------------------
>>>> IHMC                                     (850)434 8903 or (650)494 3973
>>>> 40 South Alcaniz St.           (850)202 4416   office
>>>> Pensacola                            (850)202 4440   fax
>>>> FL 32502                              (850)291 0667   mobile
>>>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
> 
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

Received on Saturday, 11 May 2013 17:37:02 UTC