Re: [DTB] summary of editorial issues (completes ACTION-552) from Jos de Bruijn on 2008-08-27 (public-rif-wg@w3.org from August 2008)

From: Jos de Bruijn <debruijn@inf.unibz.it>
Date: Wed, 27 Aug 2008 16:59:48 +0200
To: Axel Polleres <axel.polleres@deri.org>
CC: "Public-Rif-Wg (E-mail)" <public-rif-wg@w3.org>
Message-ID: <48B56BE4.7020201@inf.unibz.it>
Axel Polleres wrote:
> Jos de Bruijn wrote:
>> <snip/>
>>
>>> 3) In the course of the rdf:text discussions, we discussed that a
>>> function/predicate implementing language-range matching (subtag
>>> matching according to RFC4647) is needed. (This is not yet
>>> reflected by an editor's note in the current draft.) I propose
>>> to add:
>>>
>>> pred:matches-langtag( ?arg1 , ?arg2 )
>>>
>>>  intended domains:
>>>    - arg1 rdf:text
>>>    - arg2 valid language range according to
>>>        http://www.rfc-editor.org/rfc/rfc4647.txt
>>
>> Why would this be necessary/useful?  We already have a function for
>> extracting language tags.
>>
>> pred:matches-langtag( ?arg1 , ?arg2 )
>> is the same as
>> func:lang(?arg1)=?arg2
> 
> Jos, if you mean that the same functionality could be achieved with
> 
>   pred:matches
> 
> (http://www.w3.org/2005/rules/wiki/DTB#pred:matches_.28adapted_from_fn:matches.29)
> 
> 
> then the answer is: yes and no

Wouldn't the answer be: yes, but language patterns generally have a more
convenient syntax?

> 
> lang-pattern-matching in
>  http://www.rfc-editor.org/rfc/rfc4647.txt
> is different from regular expression  matching in pred:matches.
> 
> For example, the extended language range "en-*-US" matches "en-US"
> (English, United States). Matching is also case-insensitive, which is
> quite different from matching a regexp (although I don't say it can't
> all be expressed in a regexp, this regexp might become fairly nasty).

In rif:text, language tags are lower case.  What about rdf:text?

> Since lang-pattern wildcards still seem to be used often in
> connection with language tags, I suggest having a separate
> function for that.

I wonder how often they are actually used and whether we want to require
every BLD implementer to implement this specific kind of pattern matching.
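
To give a sense of the implementation effort, here is a minimal Python
sketch of the "extended filtering" algorithm from RFC 4647, Section 3.3.2.
The function name matches_lang_range is hypothetical, and the function
takes a bare language tag rather than an rdf:text value, so it only
approximates the proposed pred:matches-langtag.

```python
def matches_lang_range(tag: str, lang_range: str) -> bool:
    """Extended filtering (RFC 4647, Section 3.3.2), case-insensitive."""
    tag_parts = tag.lower().split("-")
    range_parts = lang_range.lower().split("-")

    # Step 1: the first subtags must match, unless the range starts with "*".
    if range_parts[0] != "*" and range_parts[0] != tag_parts[0]:
        return False

    t = 1  # index of the next tag subtag to examine
    for r in range_parts[1:]:
        if r == "*":
            continue  # a wildcard inside the range is simply skipped
        while True:
            if t >= len(tag_parts):
                return False  # ran out of tag subtags
            if tag_parts[t] == r:
                t += 1
                break  # matched, go on with the next range subtag
            if len(tag_parts[t]) == 1:
                return False  # singleton subtags (e.g. "x") may not be skipped
            t += 1  # skip this tag subtag and retry
    return True


print(matches_lang_range("en-US", "en-*-US"))     # wildcard range
print(matches_lang_range("EN-latn-us", "en-US"))  # case-insensitive, skips "latn"
print(matches_lang_range("de-x-DE", "de-*-DE"))   # blocked by the singleton "x"
```

The loop is short, but it is a matching procedure of its own and not a
special case of regular-expression matching.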

> 
>> <snip/>
>>
>>> 5) Editor's Note: It was noted in discussions of the working group that,
>>> in addition to guard predicates, a built-in function or predicate
>>> analogous to SPARQL's datatype function is needed. This however has some
>>> technical implications, see
>>> http://lists.w3.org/Archives/Public/public-rif-wg/2008Jul/0096.html
>>>
>>> PROPOSED: We could, analogous to pred:iri-to-string, define a
>>> predicate
>>>
>>>  pred:matches-datatype( ?arg1 ?arg2)
>>>
>>> such that the predicate is true iff ?arg1 is in the value space of
>>> the datatype denoted by ?arg2 . An open question is whether we should
>>> use the rif:iri or the string representing the datatypeIRI for the
>>> second argument, i.e. what is the intended domain for ?arg2 ??
>>
>> I don't really see how this could be defined in a meaningful way.  In
>> any case, we already have the guard predicates, so I don't see the use.
> 
> The use case is simple: I want to emulate the datatype function from
> SPARQL in RIF... I want to know whether a literal is an integer or a
> decimal. It is quite obvious, that we can't define a function which does

Unlike in SPARQL, in BLD you do not have literals, but you have
values.  So, for example, every integer is a decimal.

People who expect SPARQL-like behavior might find this odd.
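
A small Python sketch of what a check on values (rather than literals)
implies. The function in_value_space is hypothetical, and Python's Decimal
is only a stand-in for the xs:decimal value space; this is not a proposed
definition of pred:matches-datatype.

```python
from decimal import Decimal


def in_value_space(value: Decimal, datatype: str) -> bool:
    """Test membership of a value in a datatype's value space (illustrative)."""
    if datatype == "xs:decimal":
        return value.is_finite()
    if datatype == "xs:integer":
        # An integer is a decimal without a fractional part.
        return value.is_finite() and value == value.to_integral_value()
    raise ValueError("datatype not covered by this sketch: " + datatype)


one = Decimal("1.0")  # same value as Decimal("1"), whatever the lexical form
print(in_value_space(one, "xs:integer"))
print(in_value_space(one, "xs:decimal"))
print(in_value_space(Decimal("1.5"), "xs:integer"))
```

The value written as 1.0 passes both checks, whereas SPARQL's datatype
function would report only the datatype of the literal.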

> this, the predicate is an alternative suggestion, I will not fight for
> it if no one else sees the need to at least cover the expressivity of
> the built-ins in SPARQL... although I find this awkward at least.
> 
>  <snip/>
>>
>>> 7) Editor's Note: In the following, we adapt several cast functions from
>>> [XPath-Functions]. Due to the subtle differences in e.g. error handling
>>> between RIF and [XPath-Functions], these definitions might still need
>>> refinement in future versions of this draft.
>>>
>>> Indeed, I need to check back on Jos' exact concerns here; he thought
>>> that referring to the [XPath-Functions] conversions is not precise
>>> enough here, see also 8)
>>
>> Basically, the interpretations of the functions are not completely
>> defined.
> 
> yes.
> 
>>> 8) Editor's Note: We might split this subsection into separate
>>> subsections per casting function in future versions of this document,
>>> following the convention of having one separate subsection per
>>> function/predicate in the rest of the document. However, it seemed
>>> convenient here to group the cast functions which purely rely on XML
>>> Schema datatype casting into one common subsection.
>>>
>>> I can separate them, if the majority of the working group thinks this is
>>> necessary.
>>
>> I'd say: either follow the principle of having one subsection per
>> predicate/function (I personally don't see the use of that) or don't
>> follow this principle.
>> In the former case, you need to split up the mentioned subsection.  In
>> the latter case, many subsections in the document can be merged.
> 
> yes.
> 
>>> 9) Editor's Note: The cast from rif:text to xs:string is still under
>>> discussion, i.e. whether the lang tag should be included when casting to
>>> xs:string or not.
>>>
>>> PROPOSED. replace rif:text by rdf:text, otherwise leave as is.
>>
>> I don't remember whether we discussed this in the working group.
> 
> yes, it needs to be discussed/approved. casts from rdf:text to xs:string
> are not covered by standard conversions in XPath/XQuery, but the
> suggested treatment covers it analogously to:
> 
>    http://www.w3.org/TR/rdf-sparql-query/#func-str

This is not a cast function; it is a string extraction function,
analogous to the language tag extraction function.
I think it is more intuitive to use such an extraction function, rather
than a cast function, for extracting strings from rdf:text values.
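
A minimal Python sketch of such a pair of extraction functions. It assumes
a lexical form of the shape text@lang, split at the last "@", and
lower-cased language tags; the names split_text and string_from_text are
hypothetical, and lang_from_text only mirrors the func:lang-from-text
proposal discussed under 15).

```python
from typing import Tuple


def split_text(lexical: str) -> Tuple[str, str]:
    """Split an assumed 'text@lang' lexical form at the last '@'."""
    text, sep, lang = lexical.rpartition("@")
    if not sep:
        raise ValueError("expected a lexical form of the shape text@lang")
    return text, lang.lower()  # assumption: language tags are lower case


def string_from_text(lexical: str) -> str:
    """Extract the string part; the language tag is dropped."""
    return split_text(lexical)[0]


def lang_from_text(lexical: str) -> str:
    """Extract the language tag part."""
    return split_text(lexical)[1]


print(string_from_text("Hello@en"))
print(lang_from_text("Hello@en-US"))
```

With two symmetric extraction functions, the question of whether the
language tag is included in the string does not arise.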

> 
>> <snip/>
>>
>>> 12) Editor's Note: The working group is currently discussing, whether in
>>> addition to adopting the fn:compare function from [XPath-Functions], own
>>> predicates pred:string-equal, pred:string-less-than,
>>> pred:string-greater-than, pred:string-not-equal,
>>> pred:string-less-than-or-equal, pred:string-greater-than-or-equal not
>>> defined in [XPath-Functions] shall be introduced, following the
>>> convention of having such predicates for other datatypes.
>>>
>>> PROPOSED: introduce additional comparison predicates.
>>
>> Why would we want to have these comparison predicates and what does it
>> mean for one string to be less than another?
> 
> Suggested by Gary, the idea is to have uniformity, i.e. predicates
> less-than, greater-than, equal, less-than-or-equal,
> greater-than-or-equal, for all (or at least most) datatypes,

In that case you would need this kind of comparison also for things like
XML literal and rdf:text.

> where this
> can be defined in a feasible manner.

How can this be defined for strings?  This was my question.
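
One possible reading, sketched in Python: derive the predicates from an
fn:compare-style three-way comparison under Unicode codepoint ordering.
This is an assumption for illustration only, not something the working
group has decided, and the Python names are hypothetical.

```python
def compare(a: str, b: str) -> int:
    """Three-way comparison: -1, 0 or 1. Python compares str by code point."""
    return (a > b) - (a < b)


def string_less_than(a: str, b: str) -> bool:
    return compare(a, b) < 0


def string_greater_than_or_equal(a: str, b: str) -> bool:
    return compare(a, b) >= 0


print(compare("abc", "abd"))
print(string_less_than("Z", "a"))  # "Z" (U+005A) sorts before "a" (U+0061)
```

Under this reading the ordering is purely by code point, so upper-case
letters sort before lower-case ones; any other ordering would require
choosing a collation.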

> 
> If there is disagreement here for the sake of redundancy,
> then we also have to revisit the less-than-or-equal and
> greater-than-or-equal predicates which were approved by the group, since
> they are likewise superfluous.

I don't care about the redundancy here.

> 
>>> 13) Editor's Note: No less-than-or-equal or greater-than-or-equal
>>> predicates are defined in this draft for durations, since there are no
>>> separate op:dayTimeDuration-equal nor
>>> op:yearMonthDuration-equal predicates in [XPath-Functions], but only a
>>> common predicate op:duration-equal. Future versions of this working
>>> draft may resolve this by introducing new equality predicates
>>> pred:dayTimeDuration-equal and pred:yearMonthDuration-equal with
>>> restricted intended domains.
>>>
>>> PROPOSED: introduce a single predicate duration-equal that only
>>> evaluates to true if the arguments are both of the same duration subtype
>>> and equal.
>>
>> Agreed.
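
The proposal under 13) can be sketched in Python as follows. The two
classes are hypothetical stand-ins for the duration subtypes, with a
simplified representation (months, seconds); only the shape of the
predicate is the point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YearMonthDuration:
    months: int


@dataclass(frozen=True)
class DayTimeDuration:
    seconds: float


def duration_equal(a, b) -> bool:
    """True only if both arguments have the same duration subtype and are equal."""
    return type(a) is type(b) and a == b


print(duration_equal(YearMonthDuration(12), YearMonthDuration(12)))
print(duration_equal(YearMonthDuration(0), DayTimeDuration(0.0)))
```

Arguments of different subtypes never compare equal in this sketch, not
even when both are zero-length.
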
>>
>>> 14) Editor's Note: Predicates for rdf:XMLLiteral such as at least
>>> comparison predicates (equals, not-equals) are still under discussion in
>>> the working group.
>>>
>>> PROPOSED: introduce equals and not-equals for XMLLiteral which matches
>>> modulo white-spaces in non-text content.
>>
>> Two XML literals are equal if their values (as defined in [1]) are the
>> same and not-equal if their values are not the same. I cannot imagine
>> any other meaningful definition for equality of XML literals.
>>
>> [1] http://www.w3.org/TR/rdf-concepts/#section-XMLLiteral
> 
> ok, that doesn't include white-space normalization or the like...

If you want to have whitespace normalization, you should either use a
different data type or introduce a function for this kind of
normalization. Using XMLLiteral-equals for checking anything but
equality of XMLLiteral values is misleading.

> for that actually "=" suffices, doesn't it?

For the equals, yes.  Not-equals would be a different thing.

> 
> If the group is fine with
> 
> pred:XMLLiteral-not-equals("<a/>"^^rdf:XMLLiteral
>                            "<a />"^^rdf:XMLLiteral)
> 
> then fair enough. As far as I understood, XML prescribes some
> normalization of end-of-lines
> 
>  http://www.w3.org/TR/2000/REC-xml-20001006#sec-line-ends
> 
> and for white spaces in attribute values
> 
>  http://www.w3.org/TR/2000/REC-xml-20001006#AVNormalize
> 
> Do we need to bother about this?
> 
> 
> 
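
For the example above, a Python sketch that compares two literals by their
canonical form. It assumes Python 3.8 or later and uses
xml.etree.ElementTree.canonicalize, which implements C14N 2.0 rather than
the exclusive canonicalization referenced by the RDF definition, and which
accepts only single-rooted content. Treat it as an approximation; the
function name xml_literal_equal is hypothetical.

```python
from xml.etree.ElementTree import canonicalize


def xml_literal_equal(a: str, b: str) -> bool:
    """Compare two single-rooted XML fragments by their canonical form."""
    return canonicalize(a) == canonicalize(b)


print(xml_literal_equal("<a/>", "<a />"))
print(xml_literal_equal("<a/>", "<b/>"))
```

Under this approximation "<a/>" and "<a />" have the same canonical form,
so the whitespace inside the tag does not make them unequal.
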
>>> 15) Editor's Note: The current name of this function is still under
>>> discussion in the working group. Alternative proposals include e.g.
>>> func:lang-from-text, which follows the XPath/XQuery naming convention
>>> for extraction functions from datatypes rather than the SPARQL naming
>>> convention.
>>>
>>> PROPOSED: change to func:lang-from-text and only add a remark that this
>>> is related to SPARQL's lang-function.
>>
>> Agreed.
>>
>>> 16) Editor's Note: We have not yet included comparison predicates
>>> (equal, less-than, greater-than, or compare ...) for rif:text. Future
>>> versions of this document might introduce these.
>>>
>>> PROPOSED: only add equal and not-equal for rdf:text; for more
>>> sophisticated comparisons, conversions to strings and the more
>>> fine-grained comparisons on strings can be used.
>>
>> Agreed.
>>
>>
>>
>> Best, Jos
>>
> 
> 

-- 
Jos de Bruijn            debruijn@inf.unibz.it
+390471016224         http://www.debruijn.net/
----------------------------------------------
No one who cannot rejoice in the discovery of
his own mistakes deserves to be called a
scholar.
  - Donald Foster
Received on Wednesday, 27 August 2008 14:59:37 UTC