W3C home > Mailing lists > Public > public-lod@w3.org > December 2009

Re: Linked Data with web2py

From: Massimo Di Pierro <MDiPierro@cs.depaul.edu>
Date: Thu, 24 Dec 2009 12:40:16 -0600
CC: "public-lod@w3.org" <public-lod@w3.org>, Chris Baron <topher.baron@gmail.com>
Message-ID: <40072E3C-7C69-4DAB-BA2D-48EDD86C8B0A@cs.depaul.edu>
To: Kingsley Idehen <kidehen@openlinksw.com>
Good. I think we agree.

Massimo

On Dec 24, 2009, at 12:24 PM, Kingsley Idehen wrote:

> Massimo Di Pierro wrote:
>> You say "ODBC is about making the backend DBMS engines
>> interchangeable". Does it mean there is a way to write an arbitrary
>> SQL query once and it is guaranteed to work when I change the  
>> database
>> from Oracle to PostgreSQL and vice versa (that is what web2py  
>> provides)?
> You are providing a Virtualization layer atop the Data Access Drivers.
>> If so, would you provide an example of how you would do pagination
>> which is highly dialect dependent? I guess I am missing this.
> Virtuoso would do that for you. Microsoft JET engine would do that for
> you. In either case you link the ODBC DSN into the Virtual DBMS.
>
>>
>> What is the license of Virtuoso?
> Dual i.e. Open Source and Commercial. The Virtual DBMS isn't Open  
> Source.
>>
>> We do not provide nor created an ODBC driver. We use pyODBC for those
>> back-ends that require ODBC. For those RDBS that do not, I guess, you
>> can still use pyODBC but currently it would require some minor
>> modification to our source code.
> As long as you don't have any DBMS specific code in your ODBC calls to
> pyODBC then you're set re. your obligations :-)  Then matters go  
> down to
> the pyODBC layer where they should work like Virtuoso, Access and  
> other
> ODBC apps re. use of Metadata calls to deduce underlying DBMS
> capabilities via the ODBC Drivers for those DBMS engines.
>>
>> Bare in mind that the web2py user, whether he/she wishes to develop a
>> generic web app or expose existing data using Linked Data, the user
>> never writes any SQL. Not even our RDF plugin contains any SQL code.
> The abstraction and its goals are crystal clear to me.
>>
>> Our approach is similar to SQLAlchemy, SQLObject, Django ORM and Ruby
>> Active Record. It is an approach typical of interpreted languages
>> which provide introspection capabilities. It has some similarities
>> with Java Hibernate.
> We built a fixed ODBC Layer for Ruby Active Record [1]. An OpenLink
> member of staff is now the PHP-ODBC maintainer etc..
>>
>> Anyway. The DAL is one component of web2py. Other components include
>> web server, libraries to handle cookies, create session, cache web
>> based and database queries, generate web forms from database models,
>> validate all input fields, report errors on validation,
>> internationalization, web based IDE, web based testing and debugging,
>> automatic generation of a web based interface to any database, role
>> based access control system, a template language to render data in
>> HTML/XML and other protocols, ability to serialize data in XMl, CSV,
>> JSON, RSS, RTF and generate services in XMLRPC, JSONRPC for Google  
>> Web
>> Toolkit, and AMFRPC for Flash/Flex.
>>
>> We tried hard to build this thing so that (at least in the default
>> configuration) it does not have any third party dependence, requires
>> no installation (just unzip and click) and no configuration, to be
>> super easy to setup. Requiring ODBC or any other third party product
>> would not simplify our work. It would increase complexity, setup
>> costs, and reduce portability.
> Again, not questioning your vision and goals re. your tech.
>> I would agree that ODBC should be an option for all database that
>> provide the driver.
> Yep! That's what I am preoccupied with and very protective of  
> etc.. :-)
>> We are in the process of re-factoring our source for the purpose of
>> making it easier to plugin more non-relational databases. This will  
>> be
>> completed in one month or so. Once done it will be straightforward to
>> use the ODBC drivers for those RDBMS database back-ends that  
>> currently
>> do not use it, if you wish to do so.
> Great!
>
> Links:
>
> 1. http://odbc-rails.rubyforge.org/
>
> Kingsley
>>
>> You can find more DAL example here:
>> http://www.web2py.com/examples/default/dal although the best  
>> reference
>> is the web2py book
>> http://www.amazon.com/s/ref=nb_ss?url=search-alias%3Daps&field-keywords=9780470592359
>>
>>
>> Massimo
>>
>> On Dec 24, 2009, at 11:19 AM, Kingsley Idehen wrote:
>>
>>> Massimo Di Pierro wrote:
>>>> I think we are talking apples and oranges.
>>>>
>>>> According to Microsoft, who proposed ODBC,
>>>>
>>>>    http://msdn.microsoft.com/en-us/library/aa198030%28SQL. 
>>>> 80%29.aspx
>>>>
>>>> "The driver passes SQL statements to SQL Server and returns the
>>>> results of the statements to the application."
>>> No, before an ODBC app talks to an ODBC Driver, it should use ODBC
>>> metadata calls to understand the capabilities of the underlying  
>>> driver.
>>> That's how ODBC applications are supposed to be written.
>>>
>>> ODBC is not a simple CLI (as per what you get from DBMS vendors) for
>>> executing SQL statements, far from it.
>>>
>>>>
>>>> The purpose of the web2py DAL is that of generating the SQL  
>>>> statements
>>>> (in the case of RDBS) and it does not care whether the connection  
>>>> is
>>>> established via ODBC or not.
>>> Its an abstraction that sits atop data access APIs.
>>>
>>> Putting my question differently, what happens when I use ODBC  
>>> against a
>>> DBMS that isn't DB2 or MS SQL Server re. this interface?  
>>> Basically, can
>>> I substitute either via the ODBC hook ?
>>>>
>>>> We do not use ODBC on Oracle, for example, but we use cx_Oracle
>>>> instead because, according to this is a quote from the Oracle web  
>>>> page:
>>>>
>>>>
>>>> http://www.oracle.com/technology/pub/articles/rubio-python-turbogears.html
>>>>
>>>>
>>>>
>>>> "cx_Oracle serves as the core and lowest-level API for bridging a
>>>> Python environment to an Oracle database. cx_Oracle is to Python  
>>>> what
>>>> an Oracle JDBC driver is to Java, enabling an application to  
>>>> execute
>>>> raw SQL queries and manipulate database cursors with a fine level  
>>>> of
>>>> detail."
>>> Yes, what do you expect Oracle to say? Something like: we have an
>>> interface that lacks the deductive capabilities of the ODBC  
>>> interface
>>> that ultimately makes DBMS swapping nice and easy?
>>>>
>>>> We could use pyODBC on Oracle as we do with with DB2 but the
>>>> installation process would be more cumbersome.
>>> Depends on which ODBC drivers you use. In fact, you can install our
>>> Drives using an installer on any platform we support (and thats a  
>>> vast
>>> array of platforms).
>>>
>>> Post installation we even give you basic connection exercisers or a
>>> platform port of the ODBC Driver Manager that verifies DSNs post
>>> creation.
>>>>
>>>> ODBC does a lot but in the end it just gives a connection and a  
>>>> cursor
>>>> object. It does not give us complete database abstraction which is
>>>> what the web2py DAL does.
>>> Of course it gives you complete DBMS abstraction. What you really  
>>> mean
>>> is this: the interface you are using doesn't really exploit ODBC  
>>> to the
>>> max. I believe this ODBC layer comes from someone else, right?
>>>>
>>>> One of the most important issue from our prospective is that the
>>>> web2py DAL works with Non Relational Databases too. Currently only
>>>> Google Big Table on the Google App Engine but we are working on
>>>> MongoDB and CouchDB support.  ODBC does not work with non- 
>>>> relational
>>>> databases.
>>> Virtuoso provides an ODBC interface that actually lets you work with
>>> XML, RDF, Web Services and other data representations and model
>>> combinations.  Jena, Redland, and Sesame abstractions all work with
>>> Virtuoso via its JDBC or ODBC drivers. Again, ODBC capabilities  
>>> are very
>>> dependent on who's ODBC implementation (app or driver) you are  
>>> working
>>> with.
>>>
>>> In our world you can join a relation table and Web Addressable  
>>> Resource
>>> via SPARQL pattern in Procedure Views (Table Valued Functions in  
>>> Oracle
>>> parlance) executed inside the Virtuoso SQL (i.e., SPASQL [1]).
>>> Naturally, you can also to the opposite i.e., surface SQL  
>>> functionality
>>> in SPARQL[2] etc..
>>>>
>>>> Running on the Google App Engine cloud has some limitations. We can
>>>> only upload pure Python or pure Java programs, no C/C++ modules or
>>>> pre-compiled modules. We can only use the Google API or something  
>>>> that
>>>> they call GQL. There is no SQL on GAE because it is not a  
>>>> relational
>>>> database. Hence, ODBC is not option. Building our system on ODBC
>>>> simply would not cut it. Yet the Google App Engine is the single  
>>>> best
>>>> thing that happened to the world of web development in some time
>>>> (practically free, unlimited scalability, zero maintenance, no  
>>>> messing
>>>> up with server or database administration). We use ODBC when we can
>>>> but we are interested in pushing our technology in places where  
>>>> ODBC
>>>> is not an option.
>>>>
>>>> Least but not last I want to make the case that everybody is  
>>>> doing web
>>>> apps with database connections these days. This is now off-the- 
>>>> shelf
>>>> technology and people should not pay for it. Most ODBC drivers are
>>>> commercial products or require a commercial license for enterprise
>>>> use. Our Database Abstraction Layer is released under the GPL2
>>>> license, only requires third party GPL or BSD drivers, and has been
>>>> tested by more than 1500 registered users and 50 developers.
>>>
>>> I am not questioning your vision. And when it comes to data  
>>> access, I
>>> think REST based Open Data Access (Linked Data) is something I  
>>> advocate
>>> strongly,  as a superior solution to ODBC, JDBC, ADO.NET, OLE-DB, or
>>> XMLA when dealing disparately shaped data sources.
>>>
>>> My fundamental point to you is this: ODBC is a DBMS independent  
>>> API, so
>>> you shouldn't speak in DBMS specific terms re. ODBC. You have Native
>>> CLIs from DBMS vendors that occupy the Database Specific Data  
>>> Access API
>>> box :-)
>>>
>>> ODBC is about making the backend DBMS engines interchangeable, one  
>>> size
>>> doesn't fit all re. ODBC Drivers  or ODBC compliant applications. I
>>> should be able to simply slot in our Virtuoso ODBC Driver, or our  
>>> ODBC
>>> Drivers for all the other DBMS engines we support without issue  
>>> via your
>>> ODBC hook.  Same should apply to any other ODBC Driver vendor.
>>>
>>> To conclude; when I have a moment I'll have your ODBC layer tested
>>> against our ODBC Drivers, if there are issues, we'll make a simply
>>> report highlight ODBC APIs issues etc. (for you or whoever is  
>>> behind the
>>> ODBC layer. Worst case we can fix it ourselves as we already done  
>>> many
>>> other interfaces).
>>>
>>>
>>> Links:
>>>
>>> 1. http://docs.openlinksw.com/virtuoso/ 
>>> rdfsparql.html#rdfsparqlinline
>>> 2. http://docs.openlinksw.com/virtuoso/rdfsparql.html#rdfsqlfromsparql
>>>
>>> Kingsley
>>>>
>>>>
>>>> Massimo
>>>>
>>>>
>>>> On Dec 24, 2009, at 9:30 AM, Kingsley Idehen wrote:
>>>>
>>>>> Massimo Di Pierro wrote:
>>>>>> Hi Kingsley,
>>>>>>
>>>>>> Internally we use the Python ODBC driver for the connection to  
>>>>>> DB2
>>>>>> and
>>>>>> to MSSQL.
>>>>> But how have you come to conclude that ODBC only works with DB2  
>>>>> and
>>>>> MSSQL? What about ODBC Drivers that have been developed to work
>>>>> consistently across:
>>>>>
>>>>> Virtuoso, Oracle, SQL Server, DB2, Informix, Ingres, Sybase,  
>>>>> Progress,
>>>>> MySQL, Firebird, other ODBC Drivers, JDBC Drivers etc? Thus, you  
>>>>> could
>>>>> have the option ODBC as opposed to: ODBC for DB2 and MS SQL Sever.
>>>>>
>>>>> What about the fact that you have ODBC Drivers for Virtuoso that
>>>>> enable
>>>>> to make one ODBC connection to all the DBMS engines above, and  
>>>>> treat
>>>>> their tables as though they were part of one DBMS i.e., Virtual  
>>>>> DBMS
>>>>> based Relational Data Virtualization (or Federation).
>>>>>
>>>>>> We use more specific drivers for the other databases, as  
>>>>>> recommended
>>>>>> by the official documentation.
>>>>> What official documentation? URL please.
>>>>>> Anyway, from the Python programming point of view they all  
>>>>>> expose the
>>>>>> same Python API.
>>>>>>
>>>>>> The fact is ODBC only unifies the data access API and that is a  
>>>>>> small
>>>>>> part of we needed since different database still have different
>>>>>> dialects.
>>>>>> Our DAL completely abstracts the database and writes the SQL in  
>>>>>> the
>>>>>> specific dialect of specific backend.
>>>>> Are you aware of the fact that via ODBC metadata calls you can
>>>>> deductively discern the capabilities and SQL dialiects supported  
>>>>> by an
>>>>> DBMS. This is all about the ODBC Drivers you are working with,  
>>>>> not the
>>>>> ODBC spec itself (which is vast and very sophisticated).
>>>>>>
>>>>>> For example a limit query in SQLite is done with "SELECT ...
>>>>>> FIRST N"
>>>>>> The same query in MSSQL is done with "SELECT .... TOP N", in
>>>>>> Oracle it
>>>>>> requires two nested selects "SELECT ... FROM (SELECT w_tmp.*,  
>>>>>> ROWNUM
>>>>>> w_row FROM (SELECT ...) w_tmp WHERE ROWNUM<=N)...;".  In the  
>>>>>> case of
>>>>>> the Google App Engine there is not even SQL so our DAL translates
>>>>>> directly into Google API calls.
>>>>> Again, you should focus on the functionality you want, and then  
>>>>> see if
>>>>> the underling ODBC Driver can handle what you want, if it can't  
>>>>> you
>>>>> can
>>>>> drop back to your work arounds.
>>>>>
>>>>>>
>>>>>> The same query in the web2py DAL reads like, for example:
>>>>>>
>>>>>>  db=DAL('postgresql://username:password@localhost',  
>>>>>> pool_size=100)
>>>>>>  db.define_table('person',Field('name'))
>>>>>>  db.person.insert(name='Ken')
>>>>>>  rows = db(db.person.name=='Ken').select(limitby = (0,10))
>>>>>>
>>>>>> The first line choose the most appropriate connection driver  
>>>>>> (which
>>>>>> may be ODBC). The second line tried to define a table "person".  
>>>>>> If it
>>>>>> does not exist, it is CREATEd. If it exists and has a different
>>>>>> stricture, it is ALTERed. The third line inserts a second. The  
>>>>>> forth
>>>>>> line is performs the limit query.
>>>>>>
>>>>>> As you can see we do not use raw SQL anywhere in our system,  
>>>>>> although
>>>>>> our system may use SQL to talk to the database. Using raw SQL  
>>>>>> also
>>>>>> presents the disadvantage that, if one is not very careful in
>>>>>> escaping
>>>>>> variables, one may introduce SQL Injections vulnerabilities. We  
>>>>>> do
>>>>>> not
>>>>>> have to worry about that with the DAL.
>>>>> I do understand the abstraction.
>>>>> I worked with a product called DAL [1] for Mac OS Classic in the  
>>>>> early
>>>>> 1990's  (and others pre ODBC, which arrived in 1992 as Windows  
>>>>> port of
>>>>> what was then the SAG CLI) :-)
>>>>>>
>>>>>> The RDF tagging is done at the DAL level:
>>>>>>
>>>>>>  db.person.name.rdf = { .... }
>>>>>>
>>>>>> Anyway, it is possible there is some feature of ODBC that we have
>>>>>> overlooked and we may be able to take  advantage of.
>>>>> Yes.
>>>>>
>>>>> You have ODBC itself. Then you have Virtuoso VDBMS (think scalable
>>>>> high
>>>>> performance variant of Microsoft Access JET Engine which major
>>>>> benefits).
>>>>>
>>>>>
>>>>> Kingsley
>>>>>>
>>>>>> Massimo
>>>>>>
>>>>>> On Dec 24, 2009, at 6:57 AM, Kingsley Idehen wrote:
>>>>>>
>>>>>>> Massimo Di Pierro wrote:
>>>>>>>> Hello everybody,
>>>>>>>>
>>>>>>>> I am a new member of this list and first of all I wish  
>>>>>>>> everybody
>>>>>>>> Happy
>>>>>>>> Holidays.
>>>>>>>>
>>>>>>>> I also take the occasion to introduce the RDF plugin for  
>>>>>>>> web2py.
>>>>>>>>
>>>>>>>> http://web2py.com/semantic
>>>>>>>>
>>>>>>>> web2py is an Open Source web framework for rapid application
>>>>>>>> development written in Python and programmable in Python.  
>>>>>>>> web2py
>>>>>>>> runs
>>>>>>>> almost everywhere and includes everything you need for web
>>>>>>>> development
>>>>>>>> in one package including a ssl-enabled web server, a
>>>>>>>> transaction-safe
>>>>>>>> relational database, a web-based IDE, a web-based database
>>>>>>>> administration tool, and a Database Abstraction Layer that
>>>>>>>> writes SQL
>>>>>>>> for you and works transparently on SQLite, MySQL, Postgresql,
>>>>>>>> Oracle,
>>>>>>>> MSSQL, FireBird, DB2, Informix, Ingres, and on Google App  
>>>>>>>> Engine.
>>>>>>>>
>>>>>>>> Web2py requires no installation: just download, unzip and  
>>>>>>>> click to
>>>>>>>> start it.
>>>>>>>> You can see an online demo here:
>>>>>>>>
>>>>>>>> http://web2py.com/demo_admin
>>>>>>>>
>>>>>>>> The RDF plugin allows to label (tag) database fields and  
>>>>>>>> relations
>>>>>>>> with ontologies and web2py automatically exposes the data in  
>>>>>>>> the
>>>>>>>> database as Linked Data via a RESTful web service. Works with  
>>>>>>>> all
>>>>>>>> database back-ends listed above.
>>>>>>>>
>>>>>>>> Any comment and feedback will be appreciated.
>>>>>>> Any reason why you don't use ODBC for your data access? Your
>>>>>>> references
>>>>>>> above imply you implemented data access APIs on a DBMS by DBMS
>>>>>>> basis.
>>>>>>> ODBC is not only superior to all Native DBMS APIs, it is generic
>>>>>>> thereby
>>>>>>> shrinking you development and maintenance costs.
>>>>>>>
>>>>>>> Kingsley
>>>>>>>>
>>>>>>>> Massimo
>>>>>>>>
>>>>>>>> -------------------------------------------------------
>>>>>>>> Massimo Di Pierro
>>>>>>>> Associate Professor
>>>>>>>> School of Computing and Digital Media
>>>>>>>> DePaul University
>>>>>>>> 243 S Wabash Ave
>>>>>>>> Chicago IL 60604
>>>>>>>> +1-312-375-6536 (phone)
>>>>>>>> +1-312-375-6116 (fax)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Kingsley Idehen          Weblog:
>>>>>>> http://www.openlinksw.com/blog/~kidehen
>>>>>>> President & CEO
>>>>>>> OpenLink Software     Web: http://www.openlinksw.com
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Kingsley Idehen          Weblog:
>>>>> http://www.openlinksw.com/blog/~kidehen
>>>>> President & CEO
>>>>> OpenLink Software     Web: http://www.openlinksw.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>>
>>> Regards,
>>>
>>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/ 
>>> ~kidehen
>>> President & CEO
>>> OpenLink Software     Web: http://www.openlinksw.com
>>>
>>>
>>>
>>>
>>
>>
>
>
> --
>
>
> Regards,
>
> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
> President & CEO
> OpenLink Software     Web: http://www.openlinksw.com
>
>
>
>
Received on Thursday, 24 December 2009 18:40:46 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Thursday, 24 December 2009 18:40:47 GMT