- From: Jos de Bruijn <debruijn@inf.unibz.it>
- Date: Mon, 04 May 2009 16:23:28 +0200
- To: Sandro Hawke <sandro@w3.org>
- CC: public-rif-wg@w3.org
- Message-ID: <49FEFA60.30209@inf.unibz.it>
I assumed you were talking only about list operators, so I mean indeed 1b.
> When you say users "have to define the functions themselves", you mean
> using rules to re-implement member, index-of, etc?
Yes.
Jos
Sandro Hawke wrote:
>> I am in favor of option 1: the list operators simply work on the values
>> in the lists, rather than performing all kinds of conversions. If users
>> want something more, they have to define the functions themselves.
>
> Sorry, I guess I need more detail within option 1. Which do you mean:
>
> 1a. Remove all 'collation' parameters from DTB, including on
> the string compare builtin.
>
> 1b. Remove 'collation' parameters from the list builtins, but
> leave them in the non-list builtins.
>
> 1c. Keep 'collation' in the list builtins (so strings are compared
> using the compare builtin) but otherwise compare values using
> RIF equality
>
> (I'd guess you mean 1b.)
>
> When you say users "have to define the functions themselves", you mean
> using rules to re-implement member, index-of, etc?
>
> -- Sandro
>
>> Jos
>>
>> Sandro Hawke wrote:
>>> In writing the spec for the List builtins, I've come across a difficult
>>> design choice concerning how literals are compared. (Some of this might
>>> be considered already decided, but it seems to me there's a fair amount
>>> of new information here, relevant stuff I didn't know at the F2F.)
>>>
>>> Background:
>>>
>>> - In RIF, you have two ways you can compare most literals:
>>>
>>> (1) You can use rif:equals, which is true iff the elements in
>>> the value space for the two literals are the same.
>>> Literals with types with disjoint value spaces will never
>>> compare as equal.
>>>
>>> true: "01"^^xs:int = "1"^^xs:int
>>> false: "1"^^xs:int = "1"^^xs:float
>>> false: "1"^^xs:double = "1"^^xs:float
>>> false: "2002-04-02T12:00:00-01:00"^^xs:dateTime
>>> = "2002-04-02T17:00:00+04:00"^^xs:dateTime
>>> false: "Strasse" = "Straße"
>>>
>>> (2) You can use a builtin comparator like numeric-equal,
>>> dateTime-equal, date-equal, time-equal, duration-equal,
>>> XMLLiteral-equal, compare, and text-compare. These
>>> builtins allow more values to be considered equal, for
>>> example:
>>>
>>> true: "1"^^xs:int = "1"^^xs:float
>>> true: "1"^^xs:double = "1"^^xs:float
>>> true: op:dateTime-equal(
>>> "2002-04-02T12:00:00-01:00"^^xs:dateTime,
>>> "2002-04-02T17:00:00+04:00"^^xs:dateTime)
>>>
>>> In addition, for the comparison of strings and text, an
>>> optional 'collation' parameter is available:
>>>
>>> false: 0 == compare("Strasse", "Straße")
>>> true: 0 == compare("Strasse", "Straße", "deutsch")
>>>
>>> - As I understand it, the 'collation' is an extensibility
>>> point. XPath-Functions uses the examples 'deutsch',
>>> "http://www.example.com/collations/French1", and
>>> "http://www.example.com/collations/French2", but only defines
>>> (and requires) one collation, the default:
>>> "http://www.w3.org/2005/xpath-functions/collation/codepoint" which
>>> does Unicode normalization (sort of). See
>>> http://www.w3.org/TR/xpath-functions/#collations and
>>> http://www.unicode.org/unicode/reports/tr10/
>>>
>>> - Some list builtins need to compare values.
>>>
>>> * member; this is not in XPath-Functions
>>>
>>> * index-of and distinct-values; these take a collation
>>> parameter in XPath-Functions. The collation is defined to
>>> apply whenever the values are strings.
>>>
>>> * union, intersect, except; these conceptually compare
>>> elements, but in XPath-Functions they only operate on
>>> lists of nodes, so string comparison doesn't come into
>>> play, and no collation is passed; I added them as
>>> parameters in drafting the text for DTB.
>>>
>>> Also, some non-list functions in DTB use collations: compare,
>>> substring-before, substring-after, contains, starts-with,
>>> ends-with, and text-compare.
>>>
>>> - Although formally a "collation" is a total preorder (a "compare"
>>> function, returning -1, 0, 1 for each pair of values, like a
>>> total order but with equalities), our primary use for it is
>>> merely to determine equal/not-equal, not to sort things into
>>> their order.
>>>
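To make the two comparison styles from the Background concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Literal` class, the short datatype names, and the helpers `to_value`, `value_space_equal`, and `numeric_equal` are all hypothetical, and treating every pair of distinct datatypes as disjoint is a simplification, not the DTB definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    lexical: str   # lexical form, e.g. "01"
    datatype: str  # hypothetical short name, e.g. "xs:int"


NUMERIC_TYPES = {"xs:int", "xs:float", "xs:double"}


def to_value(lit):
    """Toy lexical-to-value mapping covering three numeric types only."""
    if lit.datatype == "xs:int":
        return int(lit.lexical)
    if lit.datatype in ("xs:float", "xs:double"):
        return float(lit.lexical)
    return lit.lexical


def value_space_equal(a, b):
    # Style (1): equal only if both denote the same element of the same
    # value space. Assumption: distinct datatypes are treated as disjoint.
    return a.datatype == b.datatype and to_value(a) == to_value(b)


def numeric_equal(a, b):
    # Style (2): a type-specific comparator that compares numeric values
    # across datatypes.
    if not {a.datatype, b.datatype} <= NUMERIC_TYPES:
        raise TypeError("numeric_equal expects numeric literals")
    return to_value(a) == to_value(b)
```

With these definitions, `"01"` and `"1"` as `xs:int` are equal under both styles, while `"1"` as `xs:int` and `"1"` as `xs:float` are equal only under `numeric_equal`.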
>>> The Question:
>>> =20
>>> Which type of literal comparison should the list builtins use? If,
>>> like XPath-Functions (and DTB, right now), we let users choose how
>>> strings are being compared, and there's an obvious choice to make
>>> in comparing numbers and dates, isn't it odd to not give them the
>>> same flexibility there?
>>>
>>> Specifically:
>>>
>>> Question 1: What ways, if any, do we provide for users
>>> to specify how literals are compared?
>>>
>>> Question 2: What happens if/when users do not specify how literals
>>> are compared? Specifically, do we default to rif:equal
>>> or the builtin comparators?
>>>
>>> Note also that if the rule author doesn't specify a collation, then
>>> the rule system *user* might. That is, the ruleset might not pay
>>> attention to language, but the user might say they want the French
>>> collation etc.
>>>
>>> Options for Question 1:
>>>
>>> 1. No rule-author control. Get rid of collations in the API.
>>> (This might be considered unacceptable by i18n folks; I don't
>>> know.)
>>>
>>> 2. As written in DTB now: users can offer a URI indicating how
>>> strings are to be compared; only one such URI is defined and
>>> required, so this feature can only be used within environments
>>> that implement some extension here. No way to control which
>>> kind of comparison is done for other literals.
>>>
>>> 3. Extend the notion of collations to cover all our literals.
>>> Instead of passing 'French', you could pass a collation that
>>> indicated which kind of numerical comparison should be used.
>>> (The problem is that if you do that, then what happens to the
>>> user's local use-French-collation setting?)
>>>
>>> 4. Keep 'collations' for strings, and add a similar but different
>>> comparison parameter. For example, member would be:
>>>
>>> member(item, list)
>>> member(item, list, comparator)
>>> member(item, list, comparator, collation)
>>>
>>> Here, I'm imagining comparator to be a term which for now could
>>> only be certain pre-defined rif:iris (like collations) -- one
>>> for each of the two types of RIF equality. But in dialects
>>> with higher-order functions, they could be functions defined in
>>> the ruleset.
>>>
>>> Technical notes:
>>>
>>> - we have to put the comparator argument before the
>>> collation argument, because we have no way of omitting
>>> any argument but the last one(s), and sometimes you need
>>> to supply a comparator but no collation (eg, when you
>>> want to let the user supply the collation); you never
>>> need to supply the collation and not the comparator,
>>> since we'll define a fixed value for the default
>>> comparator. (The difference is that while end-users
>>> might control collations, they're not going to be
>>> controlling comparators.)
>>>
>>> - when we let users define their own comparators, the
>>> obvious thing to ask them to define is an "equal"
>>> predicate with two parameters. This approach seriously
>>> impacts the complexity class of the builtins; for
>>> example, member has to be done as a linear search,
>>> instead of using a binary search or a hash table (in the
>>> common cases where the list is known to have some
>>> structure/ordering.) Instead, users should either
>>> define a "compare" function (returning -1, 0, or 1) so
>>> sorting can be done, or a "fold" function (returning a
>>> string which is the same for all "equal" values) so
>>> hashing can be done.
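The complexity point in the second technical note can be sketched in Python as follows. This is an illustration only: `member_by_equal`, `member_by_compare`, `member_by_fold`, and `german_fold` are hypothetical names, and the fold that rewrites "ß" as "ss" is merely a stand-in for whatever a 'deutsch' collation would really do.

```python
def german_fold(s):
    # Hypothetical fold: maps strings that should compare equal to one key.
    return s.replace("ß", "ss")


def german_equal(a, b):
    return german_fold(a) == german_fold(b)


def german_compare(a, b):
    # Returns -1, 0, or 1, consistent with german_equal.
    fa, fb = german_fold(a), german_fold(b)
    return (fa > fb) - (fa < fb)


def member_by_equal(item, items, equal):
    # Only an "equal" predicate is known: every element must be tested.
    return any(equal(item, x) for x in items)


def member_by_compare(item, sorted_items, compare):
    # With a "compare" function and a list already sorted under it,
    # binary search needs only O(log n) comparisons.
    lo, hi = 0, len(sorted_items)
    while lo < hi:
        mid = (lo + hi) // 2
        c = compare(sorted_items[mid], item)
        if c == 0:
            return True
        if c < 0:
            lo = mid + 1
        else:
            hi = mid
    return False


def member_by_fold(item, folded_index, fold):
    # With a "fold" function, a hash set of folded keys can be built once
    # and each lookup is expected O(1).
    return fold(item) in folded_index


words = ["Apfel", "Strasse", "Zug"]  # sorted under german_compare
index = {german_fold(w) for w in words}
```

All three variants agree that "Straße" is a member of `words`; they differ only in how much work each lookup costs once the list is sorted or the index is built.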
>>>
>>> This lack of higher-order function syntax is a pain, but I think
>>> we can live with it here by defining two comparator IRIs, maybe
>>> func:literal-compare and func:value-compare, which in our
>>> existing dialects can only be used as a comparator argument. If
>>> you could use them as functions, literal-compare would be like
>>> the ordered version of rif:equal for literals -- I'd suggest the
>>> ordering between disjoint value spaces just be alphabetical order
>>> of the datatype IRI. Similarly, value-compare is just the big
>>> expression using every guard and then the builtin comparators for
>>> that type.
>>>
>>> It is pretty goofy to define these two functions and say you can
>>> only use them as a parameter -- you can't really call them -- but
>>> that's still the best option I see right now.
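One possible reading of the suggested literal-compare ordering, sketched in Python. The representation of a literal as a `(datatype_iri, value)` pair and the function name `literal_compare` are assumptions for illustration; nothing here is taken from DTB.

```python
from functools import cmp_to_key


def literal_compare(a, b):
    """Order literals given as (datatype_iri, value) pairs.

    Assumption: literals of different datatypes live in disjoint value
    spaces and are ordered alphabetically by datatype IRI; literals of
    the same datatype are ordered by value.
    """
    (dt_a, val_a), (dt_b, val_b) = a, b
    if dt_a != dt_b:
        return -1 if dt_a < dt_b else 1
    return (val_a > val_b) - (val_a < val_b)


literals = [("xs:int", 2), ("xs:float", 1.5), ("xs:int", 1)]
ordered = sorted(literals, key=cmp_to_key(literal_compare))
```

Because the result is a total order that returns 0 only for identical datatype and value, it could back either a sort or a membership test without ever calling the type-specific builtin comparators.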
>>>
>>> Thoughts?
>>>
>>> - Sandro
>>>
>> --
>> +43 1 58801 18470 debruijn@inf.unibz.it
>>
>> Jos de Bruijn, http://www.debruijn.net/
>> ----------------------------------------------
>> Many would be cowards if they had courage
>> enough.
>> - Thomas Fuller
>>
--
+43 1 58801 18470 debruijn@inf.unibz.it
Jos de Bruijn, http://www.debruijn.net/
----------------------------------------------
Many would be cowards if they had courage
enough.
- Thomas Fuller
Received on Monday, 4 May 2009 14:24:20 UTC