- From: Jos de Bruijn <debruijn@inf.unibz.it>
- Date: Mon, 04 May 2009 16:23:28 +0200
- To: Sandro Hawke <sandro@w3.org>
- CC: public-rif-wg@w3.org
- Message-ID: <49FEFA60.30209@inf.unibz.it>
I assumed you were talking only about list operators, so I do indeed mean 1b.

> When you say users "have to define the functions themselves", you mean
> using rules to re-implement member, index-of, etc?

Yes.

Jos

Sandro Hawke wrote:
>> I am in favor of option 1: the list operators simply work on the values
>> in the lists, rather than performing all kinds of conversions.  If users
>> want something more, they have to define the functions themselves.
>
> Sorry, I guess I need more detail within option 1.  Which do you mean:
>
>    1a. Remove all 'collation' parameters from DTB, including on
>        the string compare builtin.
>
>    1b. Remove 'collation' parameters from the list builtins, but
>        leave them in the non-list builtins.
>
>    1c. Keep 'collation' in the list builtins (so strings are compared
>        using the compare builtin) but otherwise compare values using
>        RIF equality.
>
> (I'd guess you mean 1b.)
>
> When you say users "have to define the functions themselves", you mean
> using rules to re-implement member, index-of, etc?
>
>     -- Sandro
>
>> Jos
>>
>> Sandro Hawke wrote:
>>> In writing the spec for the List builtins, I've come across a difficult
>>> design choice concerning how literals are compared.  (Some of this might
>>> be considered already decided, but it seems to me there's a fair amount
>>> of new information here, relevant stuff I didn't know at the F2F.)
>>>
>>> Background:
>>>
>>>   - In RIF, you have two ways you can compare most literals:
>>>
>>>       (1) You can use rif:equals, which is true iff the elements in
>>>           the value space for the two literals are the same.
>>>           Literals with types with disjoint value spaces will never
>>>           compare as equal.
>>>
>>>             true:  "01"^^xs:int = "1"^^xs:int
>>>             false: "1"^^xs:int = "1"^^xs:float
>>>             false: "1"^^xs:double = "1"^^xs:float
>>>             false: "2002-04-02T12:00:00-01:00"^^xs:dateTime
>>>                      = "2002-04-02T17:00:00+04:00"^^xs:dateTime
>>>             false: "Strasse" = "Straße"
>>>
>>>       (2) You can use a builtin comparator like numeric-equal,
>>>           dateTime-equal, date-equal, time-equal, duration-equal,
>>>           XMLLiteral-equal, compare, and text-compare.  These
>>>           builtins allow more values to be considered equal, for
>>>           example:
>>>
>>>             true: "1"^^xs:int = "1"^^xs:float
>>>             true: "1"^^xs:double = "1"^^xs:float
>>>             true: op:dateTime-equal(
>>>                     "2002-04-02T12:00:00-01:00"^^xs:dateTime,
>>>                     "2002-04-02T17:00:00+04:00"^^xs:dateTime)
>>>
>>>           In addition, for the comparison of strings and text, an
>>>           optional 'collation' parameter is available:
>>>
>>>             false: 0 == compare("Strasse", "Straße")
>>>             true:  0 == compare("Strasse", "Straße", "deutsch")
>>>
>>>   - As I understand it, the 'collation' is an extensibility
>>>     point.  XPath-Functions uses the examples 'deutsch',
>>>     "http://www.example.com/collations/French1", and
>>>     "http://www.example.com/collations/French2", but only defines
>>>     (and requires) one collation, the default:
>>>     "http://www.w3.org/2005/xpath-functions/collation/codepoint" which
>>>     does unicode normalization (sort of).  See
>>>     http://www.w3.org/TR/xpath-functions/#collations and
>>>     http://www.unicode.org/unicode/reports/tr10/
>>>
>>>   - Some list builtins need to compare values.
>>>
>>>       * member; this is not in XPath-Functions
>>>
>>>       * index-of and distinct-values; these take a collation
>>>         parameter in XPath-Functions.  The collation is defined to
>>>         apply whenever the values are strings.
>>>
>>>       * union, intersect, except; these conceptually compare
>>>         elements, but in XPath-Functions they only operate on
>>>         lists of nodes, so string comparison doesn't come into
>>>         play, and no collation is passed; I added them as
>>>         parameters in drafting the text for DTB.
>>>
>>>     Also, some non-list functions in DTB use collations: compare,
>>>     substring-before, substring-after, contains, starts-with,
>>>     ends-with, and text-compare.
>>>
>>>   - Although formally a "collation" is a total preorder (a "compare"
>>>     function, returning -1, 0, 1 for each pair of values, like a
>>>     total order but with equalities), our primary use for it is
>>>     merely to determine equal/not-equal, not to sort things into
>>>     their order.
>>>
>>> The Question:
>>>
>>>   Which type of literal comparison should the list builtins use?  If,
>>>   like XPath-Functions (and DTB, right now), we let users choose how
>>>   strings are being compared, and there's an obvious choice to make
>>>   in comparing numbers and dates, isn't it odd to not give them the
>>>   same flexibility there?
>>>
>>>   Specifically:
>>>
>>>     Question 1: What ways, if any, do we provide for users
>>>                 to specify how literals are compared?
>>>
>>>     Question 2: If/when users do not specify how literals are
>>>                 compared, what is the default?  Specifically, do we
>>>                 default to rif:equal or the builtin comparators?
>>>
>>>   Note also that if the rule author doesn't specify a collation, then
>>>   the rule system *user* might.  That is, the ruleset might not pay
>>>   attention to language, but the user might say they want the French
>>>   collation etc.
>>>
>>> Options for Question 1:
>>>
>>>   1. No rule-author control.  Get rid of collations in the API.
>>>      (This might be considered unacceptable by i18n folks; I don't
>>>      know.)
>>>
>>>   2. As written in DTB now: users can offer a URI indicating how
>>>      strings are to be compared; only one such URI is defined and
>>>      required, so this feature can only be used within environments
>>>      that implement some extension here.  No way to control which
>>>      kind of comparison is done for other literals.
>>>
>>>   3. Extend the notion of collations to cover all our literals.
>>>      Instead of passing 'French', you could pass a collation that
>>>      indicated which kind of numerical comparison should be used.
>>>      (The problem is that if you do that, then what happens to the
>>>      user's local use-French-collation setting?)
>>>
>>>   4. Keep 'collations' for strings, and add a similar but different
>>>      comparison parameter.  For example, member would be:
>>>
>>>         member(item, list)
>>>         member(item, list, comparator)
>>>         member(item, list, comparator, collation)
>>>
>>>      Here, I'm imagining comparator to be a term which for now could
>>>      only be certain pre-defined rif:iris (like collations) -- one
>>>      for each of the two types of RIF equality.  But in dialects
>>>      with higher-order functions, they could be functions defined in
>>>      the ruleset.
>>>
>>>      Technical notes:
>>>
>>>         - we have to put the comparator argument before the
>>>           collation argument, because we have no way of omitting
>>>           any argument but the last one(s), and sometimes you need
>>>           to supply a comparator but no collation (eg, when you
>>>           want to let the user supply the collation); you never
>>>           need to supply the collation and not the comparator,
>>>           since we'll define a fixed value for the default
>>>           comparator.  (The difference is that while end-users
>>>           might control collations, they're not going to be
>>>           controlling comparators.)
>>>
>>>         - when we let users define their own comparators, the
>>>           obvious thing to ask them to define is an "equal"
>>>           predicate with two parameters.  This approach seriously
>>>           impacts the complexity class of the builtins; for
>>>           example, member has to be done as a linear search,
>>>           instead of using a binary search or a hash table (in the
>>>           common cases where the list is known to have some
>>>           structure/ordering.)  Instead, users should either
>>>           define a "compare" function (returning -1, 0, or 1) so
>>>           sorting can be done, or a "fold" function (returning a
>>>           string which is the same for all "equal" values) so
>>>           hashing can be done.
>>>
>>>      This lack of higher-order function syntax is a pain, but I think
>>>      we can live with it here by defining two comparator IRIs, maybe
>>>      func:literal-compare and func:value-compare, which in our
>>>      existing dialects can only be used as a comparator argument.  If
>>>      you could use them as functions, literal-compare would be like
>>>      the ordered version of rif:equal for literals -- I'd suggest the
>>>      ordering between disjoint value spaces just be alphabetical order
>>>      of the datatype IRI.  Similarly, value-compare is just the big
>>>      expression using every guard and then the builtin comparators for
>>>      that type.
>>>
>>>      It is pretty goofy to define these two functions and say you can
>>>      only use them as a parameter -- you can't really call them -- but
>>>      that's still the best option I see right now.
>>>
>>> Thoughts?
>>>
>>>     - Sandro
>>>
>> -- 
>> +43 1 58801 18470        debruijn@inf.unibz.it
>>
>> Jos de Bruijn,        http://www.debruijn.net/
>> ----------------------------------------------
>> Many would be cowards if they had courage
>> enough.
>>   - Thomas Fuller

-- 
+43 1 58801 18470        debruijn@inf.unibz.it

Jos de Bruijn,        http://www.debruijn.net/
----------------------------------------------
Many would be cowards if they had courage
enough.
  - Thomas Fuller
Received on Monday, 4 May 2009 14:24:20 UTC
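
Illustrative sketch (not part of the original message): the thread contrasts rif:equal-style comparison, where literals of different datatypes never match, with builtin-comparator-style comparison, and suggests that a "fold" function would let member use hashing instead of a linear search. The Python below is a rough model of that idea under stated assumptions. The Literal class and the names value_of, literal_fold, value_fold and member are hypothetical and do not come from DTB or XPath-Functions; only a few datatypes are modelled; and str.casefold is used purely as a stand-in for a 'deutsch'-like collation (it also ignores case, which a real collation might not).

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable, Optional

NUMERIC = {"xs:int", "xs:float", "xs:double"}


@dataclass(frozen=True)
class Literal:
    lexical: str   # lexical form, e.g. "01"
    datatype: str  # datatype name, e.g. "xs:int"


def value_of(lit: Literal):
    """Map a lexical form to a Python value (only a few datatypes are modelled)."""
    if lit.datatype == "xs:int":
        return int(lit.lexical)
    if lit.datatype in ("xs:float", "xs:double"):
        return float(lit.lexical)
    return lit.lexical


def literal_fold(lit: Literal) -> Hashable:
    """rif:equal-style key: literals of different datatypes never collide."""
    return (lit.datatype, value_of(lit))


def value_fold(lit: Literal,
               collation: Optional[Callable[[str], str]] = None) -> Hashable:
    """Builtin-comparator-style key: numeric types share one key space,
    and strings pass through an optional (hypothetical) collation fold."""
    if lit.datatype in NUMERIC:
        return ("numeric", value_of(lit))
    if lit.datatype == "xs:string" and collation is not None:
        return ("xs:string", collation(lit.lexical))
    return literal_fold(lit)


def member(item: Literal, items: Iterable[Literal],
           fold: Callable[[Literal], Hashable] = literal_fold) -> bool:
    """Hash-based membership: fold every element once, then do one lookup."""
    index = {fold(x) for x in items}
    return fold(item) in index


one_int, one_float = Literal("1", "xs:int"), Literal("1", "xs:float")
strasse, strasse_sz = Literal("Strasse", "xs:string"), Literal("Straße", "xs:string")

print(member(one_int, [one_float]))               # False: literal comparison
print(member(one_int, [one_float], value_fold))   # True: value comparison
print(member(strasse, [strasse_sz], value_fold))  # False: no collation given
print(member(strasse, [strasse_sz],
             lambda lit: value_fold(lit, collation=str.casefold)))  # True
```

In this sketch the index is rebuilt on every call, so a single lookup is still linear; the benefit of a fold over a two-argument "equal" predicate only shows up when the folded index is built once and reused across many lookups.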