Re: question about ShEx from Iovka Boneva on 2016-01-04 (public-data-shapes-wg@w3.org from January 2016)

From: Iovka Boneva <iovka.boneva@univ-lille1.fr>
Date: Mon, 4 Jan 2016 17:13:35 +0100
To: public-data-shapes-wg@w3.org
Message-ID: <568A9A2F.7030506@univ-lille1.fr>
Indeed, I've answered too quickly: representing the numbers in unary is 
not enough to capture the size increase due to the unfolding of 
cardinality expressions. The two things I called equivalent are actually 
not equivalent.

Back to your question: for showing that validation is in NP, we need to 
consider that a cardinality expression is a syntactic shortcut for the 
corresponding unfolded expression (a syntactic shortcut that can be 
exponentially smaller than the actual unfolded expression). We define 
the size of the expression based on the unfolded one, and validation for 
ShEx w/o arbitrary cardinalities (only *, +, ?) is in NP. I think that 
you agree with this last statement.

The proof for NP-completeness is indeed made for the case w/o arbitrary 
cardinalities (except on triple constraints, so no folding of arbitrary 
cardinalities).

Once again, we are talking here about a border line case, which would 
probably never occur in practice. I am personally not concerned about 
the high complexity of validation for this particular case.

Iovka


Le 04/01/2016 14:31, Peter F. Patel-Schneider a écrit :
> I am not convinced, particularly because of your explanation.
>
> The unfolding will result in an exponential size increase if numbers are
> represented in unary.
>
> Consider
>
> { ... { p @ a [n,n] } ... } [n,n]
>
> where the embedding depth is m.
>
> The expansion of this would need to either represent the number n^m or have
> n^m pieces.  Both of these would need size exponential in the size of the
> expression above.
>
> The ability to embed cardinalities inside cardinalities provides a way of
> efficiently representing large numbers even if numbers in expressions have to
> be written in unary.
>
> peter
>
>
> On 01/04/2016 04:16 AM, Iovka Boneva wrote:
>> Le 18/12/2015 18:00, Peter F. Patel-Schneider a écrit :
>>> With this construct I'm having some difficulty determining whether ShEx
>>> validation is in NP.  It seems to me that the outer numbers multiply the
>>> effort required to make the inner partitions, leading to an exponential number
>>> of choices.   Maybe there is some way to do this bottom up, but it certainly
>>> doesn't seem obvious to me.
>>>
>>> Is there an NP algorithm for ShEx validation with this feature?
>> Indeed, very good question. And a technical answer.
>> There is an NP algorithm if we consider that the min and max bounds
>> (1,2,3,4,6,7 in your example) are coded in unary (and not in binary) for
>> determining the size of the schema.
>>  From a complexity point of view, it is equivalent to consider that
>> p@a [1,2]
>> is a syntactic shortcut for
>> p@a , p@a?
>> i.e. equivalent to "unfolding" the multiplicities.
>>
>> Does it answer your question ?
>>
>> As Eric explained, such cases are not supposed to occur very often.
>>
>> Iovka
>>
>>> peter
>>>
>>>
>>> On 12/18/2015 01:12 AM, Eric Prud'hommeaux wrote:
>>>> * Peter F. Patel-Schneider <pfpschneider@gmail.com> [2015-12-17 22:20-0800]
>>>>> It appears to me that ShEx allows embeddings of cardinalities, as in
>>>>>
>>>>> twofoureight {
>>>>>     { { p @ a [2,2] } [2,2] } [2,2]
>>>>>    }
>>>>>
>>>>> or
>>>>>
>>>>> lots {
>>>>>     { p @a [2,4], p @b [1,3] } [6,7]
>>>>> }
>>>>>
>>>>> is this correct?
>>>> In principle, yes, but apart from optional groups, it's rare that one
>>>> wants to make sure that there are as many a's as there are b's. Optional
>>>> is quite reasonable, e.g.
>>>>
>>>>     <X> {
>>>>       :submitter @<UserShape>, :submissionDate xsd:dateTime,
>>>>       (:resolvedBy @<HackerShape>, :resolutionDate xsd:dateTime)?
>>>>     }
>>>>
>>>> The other place this frequently comes up is in repetitions of choices:
>>>>
>>>>     <Y> { (:observation .|:analysis .|:diagnosis .)+ }
>>>>
>>>> which can be compiled to
>>>>
>>>>     <Y> {
>>>>       (:observation .|:analysis .|:diagnosis .),
>>>>       :observation .*,
>>>>       :analysis .*,
>>>>       :diagnosis .*
>>>>     }
>>>>
>>>> with either a partion or a greedy semantics.
>>>>
>>>>
>>>>> peter
>>>>>
>>>>>
>>>>>
>>


-- 
Iovka Boneva
Associate professor (MdC) Université de Lille
http://www.cristal.univ-lille.fr/~boneva/
+33 6 95 75 70 25
Received on Monday, 4 January 2016 16:14:06 UTC