Re: Review of the the DM pre CR version (Re: Final round of Direct Mapping spec changes; please review to prepare for CR)

On Jan 26, 2012, at 03:51 PM, Eric Prud'hommeaux wrote:

> * Ted Thibodeau Jr <tthibodeau@openlinksw.com> [2012-01-24 14:39-0500]
>> Hi, Juan, all --
>> 
>> On Jan 24, 2012, at 12:38 PM, Juan Sequeda wrote:
>>> On Tue, Jan 24, 2012 at 12:01 PM, Ivan Herman <ivan@w3.org> wrote:
>>>> Thanks Ted. This makes it pretty clear and changes should be done. Juan?
>>> 
>>> Done!
>> 
>> 
>> I should have read more carefully.  The *table* identifiers 
>> in the INSERT statements *also* need wrapping, because they 
>> were so wrapped in the CREATE TABLE statements.
>> 
>> The example I quoted should now read...
>> 
>> 
>>>>  CREATE TABLE "Addresses" (
>>>>     "ID" INT, PRIMARY KEY("ID"),
>>>>     "city" CHAR(10),
>>>>     "state" CHAR(2)
>>>>  )
>>>> 
>>>>  CREATE TABLE "People" (
>>>>     "ID" INT, PRIMARY KEY("ID"),
>>>>     "fname" CHAR(10),
>>>>     "addr" INT,
>>>>     FOREIGN KEY("addr") REFERENCES "Addresses"("ID")
>>>>  )
>>>> 
>>>>  INSERT INTO "Addresses" ("ID", "city", "state") 
>>>>     VALUES (18, 'Cambridge', 'MA')
>>>>  INSERT INTO "People" ("ID", "fname", "addr")
>>>>     VALUES (7, 'Bob', 18)
>>>>  INSERT INTO "People" ("ID", "fname", "addr")
>>>>     VALUES (8, 'Sue', NULL)
> 
> SQL's case-insensitivity requires that if we want to be sure to create
> a table called Addresses, we must 'CREATE TABLE "Addresses"...'.
> Case-insensitivity on the inserts would then make all of
> 
>   INSERT INTO  Addresses  ( ID ,  city ,  state ) 
>      VALUES (18, 'Cambridge', 'MA')
> 
>   INSERT INTO  aDdReSsEs  ( iD ,  cItY ,  sTaTe ) 
>      VALUES (18, 'Cambridge', 'MA')
> 
>   INSERT INTO "Addresses" ("ID", "city", "state") 
>      VALUES (18, 'Cambridge', 'MA')
> 
> have identical effect (so long as there were no other identifiers
> which differed only by case).

Close but not quite.

If you delimit the identifier, it becomes case sensitive.

If you don't delimit the identifier, it is case insensitive, 
but the DBMS treats it as the implementation's default case -- 
which may be either upper or lower but is never mixed.  (SQL
standard may specify UPPER, but implementations vary.)

For implementation agnosticism, identifiers should either
*always* or *never* be delimited.

One fairly common practice (which I try to follow in my samples
and also in practice) is to always CAPITALIZE SQL keywords 
(INSERT, SELECT, FROM, etc.), lowercase identifiers (address, 
id, city, state), and keep them both case insensitive.  I do
not personally delimit in most situations, but I also avoid
special characters....

(For my purposes, I find things generally work best when ID, 
iD, Id, and id all mean the same thing -- and when I don't 
have multiple columns or tables which differ only by casing...)


> I'm assuming that the regards the aesthetic burden of extra quotes as
> lower than the cognitive burden of different quoting in the CREATEs
> and the INSERTs so I've followed this quote-heavy advice in R1.18.

Basically, the choice is between delimiting all identifiers, 
or delimiting none.


> I would expect, however, that many readers of the document will have
> MySQL or Postgresql and have not even seen quotes around identifiers
> before. In fact, MySQL doesn't even parse ""s around identifiers (even
> in ANSI compatability mode, per my tests).

Silly questions for you -- Which version of MySQL did you 
test? And were you using MyISAM or InnoDB?  Details matter,
as mentioned in the comments on the first article you cite,
and as noted in the Wikibooks table ...


> Quoting consistency is low:
> <http://www.alberton.info/dbms_identifiers_and_case_sensitivity.html>.
> Unfortunately, that article's advice:
> [[
>  The safest choice if you care about portability and peace of mind is
>  not to quote the identifiers when you create your tables/fields and
>  when you run your queries.
> ]]
> doesn't work for us as the unquoted direct graph would look like
> (according to this test -- I haven't confirmed beyond playing with
> MySQL's ANSI mode and seeing no change in behavior):
> 
> Oracle: <PEOPLE/ID-7> <PEOPLE#FNAME> "Bob" .
> PostgreSQL: <people/id-7> <people#fname> "Bob" .
> MySQL/Windows: <people/ID-7> <people#fname> "Bob" .
> MySQL/Unix: <People/ID-7> <People#fname> "Bob" .
> SQLite: <People/ID-7> <People#fname> "Bob" .
> SQL Server: <People/ID-7> <People#fname> "Bob" .
> 
> 
> Here's a table of the quoting characters, which also vary between
> implementations:
> <http://en.wikibooks.org/wiki/SQL_Dialects_Reference/Data_structure_definition/Delimited_identifiers>

We decided on the SQL:2008 standard, which you left out of
your short list above.  

Our identifier delimiter is the double-quote, `"`.  We don't 
have to know or care about any specific implementation's 
delimiter -- though we could include a warning about the 
variance, if you're concerned about readers with limited
experience running into major difficulty here.

Ted


> If this persuades folks that we should go quote-light, or perhaps use
> Japanese 「」s instead, let me know.
> 
> 
>> Similar must be done throughout.
> 
> I saw only those three inserts. Anybody see something else that needs quoting?
> Maybe some scare-quotes, A "Direct" Mapping of Relational "Data" to RDF?
> 
> 
>> Be seeing you,
>> 
>> Ted
>> 
>> 
>> --
>> A: Yes.                      http://www.guckes.net/faq/attribution.html
>> | Q: Are you sure?
>> | | A: Because it reverses the logical flow of conversation.
>> | | | Q: Why is top posting frowned upon?
>> 
>> Ted Thibodeau, Jr.           //               voice +1-781-273-0900 x32
>> Evangelism & Support         //        mailto:tthibodeau@openlinksw.com
>>                             //              http://twitter.com/TallTed
>> OpenLink Software, Inc.      //              http://www.openlinksw.com/
>>         10 Burlington Mall Road, Suite 265, Burlington MA 01803
>>     Weblog   -- http://www.openlinksw.com/blogs/
>>     LinkedIn -- http://www.linkedin.com/company/openlink-software/
>>     Twitter  -- http://twitter.com/OpenLink
>>     Google+  -- http://plus.google.com/100570109519069333827/
>>     Facebook -- http://www.facebook.com/OpenLinkSoftware
>> Universal Data Access, Integration, and Management Technology Providers
>> 
>> 
>> 
>> 
>> 
> 
> 
> 
> -- 
> -ericP
> 

--
A: Yes.                      http://www.guckes.net/faq/attribution.html
| Q: Are you sure?
| | A: Because it reverses the logical flow of conversation.
| | | Q: Why is top posting frowned upon?

Ted Thibodeau, Jr.           //               voice +1-781-273-0900 x32
Evangelism & Support         //        mailto:tthibodeau@openlinksw.com
                             //              http://twitter.com/TallTed
OpenLink Software, Inc.      //              http://www.openlinksw.com/
         10 Burlington Mall Road, Suite 265, Burlington MA 01803
     Weblog   -- http://www.openlinksw.com/blogs/
     LinkedIn -- http://www.linkedin.com/company/openlink-software/
     Twitter  -- http://twitter.com/OpenLink
     Google+  -- http://plus.google.com/100570109519069333827/
     Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers

Received on Tuesday, 31 January 2012 16:57:55 UTC