Re: Linked Data with web2py

I will get back to you on the topic below. You clearly have an  
expertise that we can leverage on.

Happy Holdays to you too and everybody on this list.

Massimo


On Dec 24, 2009, at 12:51 PM, Kingsley Idehen wrote:

> Massimo Di Pierro wrote:
>> Good. I think we agree.
> Yes!
>
> Happy Holidays to You, and of course everyone else on the mailing  
> list!
>
> Kingsley
>>
>> Massimo
>>
>> On Dec 24, 2009, at 12:24 PM, Kingsley Idehen wrote:
>>
>>> Massimo Di Pierro wrote:
>>>> You say "ODBC is about making the backend DBMS engines
>>>> interchangeable". Does it mean there is a way to write an arbitrary
>>>> SQL query once and it is guaranteed to work when I change the  
>>>> database
>>>> from Oracle to PostgreSQL and vice versa (that is what web2py
>>>> provides)?
>>> You are providing a Virtualization layer atop the Data Access  
>>> Drivers.
>>>> If so, would you provide an example of how you would do pagination
>>>> which is highly dialect dependent? I guess I am missing this.
>>> Virtuoso would do that for you. Microsoft JET engine would do that  
>>> for
>>> you. In either case you link the ODBC DSN into the Virtual DBMS.
>>>
>>>>
>>>> What is the license of Virtuoso?
>>> Dual i.e. Open Source and Commercial. The Virtual DBMS isn't Open
>>> Source.
>>>>
>>>> We do not provide nor created an ODBC driver. We use pyODBC for  
>>>> those
>>>> back-ends that require ODBC. For those RDBS that do not, I guess,  
>>>> you
>>>> can still use pyODBC but currently it would require some minor
>>>> modification to our source code.
>>> As long as you don't have any DBMS specific code in your ODBC  
>>> calls to
>>> pyODBC then you're set re. your obligations :-)  Then matters go  
>>> down to
>>> the pyODBC layer where they should work like Virtuoso, Access and  
>>> other
>>> ODBC apps re. use of Metadata calls to deduce underlying DBMS
>>> capabilities via the ODBC Drivers for those DBMS engines.
>>>>
>>>> Bare in mind that the web2py user, whether he/she wishes to  
>>>> develop a
>>>> generic web app or expose existing data using Linked Data, the user
>>>> never writes any SQL. Not even our RDF plugin contains any SQL  
>>>> code.
>>> The abstraction and its goals are crystal clear to me.
>>>>
>>>> Our approach is similar to SQLAlchemy, SQLObject, Django ORM and  
>>>> Ruby
>>>> Active Record. It is an approach typical of interpreted languages
>>>> which provide introspection capabilities. It has some similarities
>>>> with Java Hibernate.
>>> We built a fixed ODBC Layer for Ruby Active Record [1]. An OpenLink
>>> member of staff is now the PHP-ODBC maintainer etc..
>>>>
>>>> Anyway. The DAL is one component of web2py. Other components  
>>>> include
>>>> web server, libraries to handle cookies, create session, cache web
>>>> based and database queries, generate web forms from database  
>>>> models,
>>>> validate all input fields, report errors on validation,
>>>> internationalization, web based IDE, web based testing and  
>>>> debugging,
>>>> automatic generation of a web based interface to any database, role
>>>> based access control system, a template language to render data in
>>>> HTML/XML and other protocols, ability to serialize data in XMl,  
>>>> CSV,
>>>> JSON, RSS, RTF and generate services in XMLRPC, JSONRPC for  
>>>> Google Web
>>>> Toolkit, and AMFRPC for Flash/Flex.
>>>>
>>>> We tried hard to build this thing so that (at least in the default
>>>> configuration) it does not have any third party dependence,  
>>>> requires
>>>> no installation (just unzip and click) and no configuration, to be
>>>> super easy to setup. Requiring ODBC or any other third party  
>>>> product
>>>> would not simplify our work. It would increase complexity, setup
>>>> costs, and reduce portability.
>>> Again, not questioning your vision and goals re. your tech.
>>>> I would agree that ODBC should be an option for all database that
>>>> provide the driver.
>>> Yep! That's what I am preoccupied with and very protective of  
>>> etc.. :-)
>>>> We are in the process of re-factoring our source for the purpose of
>>>> making it easier to plugin more non-relational databases. This  
>>>> will be
>>>> completed in one month or so. Once done it will be  
>>>> straightforward to
>>>> use the ODBC drivers for those RDBMS database back-ends that  
>>>> currently
>>>> do not use it, if you wish to do so.
>>> Great!
>>>
>>> Links:
>>>
>>> 1. http://odbc-rails.rubyforge.org/
>>>
>>> Kingsley
>>>>
>>>> You can find more DAL example here:
>>>> http://www.web2py.com/examples/default/dal although the best  
>>>> reference
>>>> is the web2py book
>>>> http://www.amazon.com/s/ref=nb_ss?url=search-alias%3Daps&field-keywords=9780470592359
>>>>
>>>>
>>>>
>>>> Massimo
>>>>
>>>> On Dec 24, 2009, at 11:19 AM, Kingsley Idehen wrote:
>>>>
>>>>> Massimo Di Pierro wrote:
>>>>>> I think we are talking apples and oranges.
>>>>>>
>>>>>> According to Microsoft, who proposed ODBC,
>>>>>>
>>>>>>   http://msdn.microsoft.com/en-us/library/aa198030%28SQL.80%29.aspx
>>>>>>
>>>>>> "The driver passes SQL statements to SQL Server and returns the
>>>>>> results of the statements to the application."
>>>>> No, before an ODBC app talks to an ODBC Driver, it should use ODBC
>>>>> metadata calls to understand the capabilities of the underlying
>>>>> driver.
>>>>> That's how ODBC applications are supposed to be written.
>>>>>
>>>>> ODBC is not a simple CLI (as per what you get from DBMS vendors)  
>>>>> for
>>>>> executing SQL statements, far from it.
>>>>>
>>>>>>
>>>>>> The purpose of the web2py DAL is that of generating the SQL
>>>>>> statements
>>>>>> (in the case of RDBS) and it does not care whether the  
>>>>>> connection is
>>>>>> established via ODBC or not.
>>>>> Its an abstraction that sits atop data access APIs.
>>>>>
>>>>> Putting my question differently, what happens when I use ODBC
>>>>> against a
>>>>> DBMS that isn't DB2 or MS SQL Server re. this interface?  
>>>>> Basically,
>>>>> can
>>>>> I substitute either via the ODBC hook ?
>>>>>>
>>>>>> We do not use ODBC on Oracle, for example, but we use cx_Oracle
>>>>>> instead because, according to this is a quote from the Oracle web
>>>>>> page:
>>>>>>
>>>>>>
>>>>>> http://www.oracle.com/technology/pub/articles/rubio-python-turbogears.html
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> "cx_Oracle serves as the core and lowest-level API for bridging a
>>>>>> Python environment to an Oracle database. cx_Oracle is to  
>>>>>> Python what
>>>>>> an Oracle JDBC driver is to Java, enabling an application to  
>>>>>> execute
>>>>>> raw SQL queries and manipulate database cursors with a fine  
>>>>>> level of
>>>>>> detail."
>>>>> Yes, what do you expect Oracle to say? Something like: we have an
>>>>> interface that lacks the deductive capabilities of the ODBC  
>>>>> interface
>>>>> that ultimately makes DBMS swapping nice and easy?
>>>>>>
>>>>>> We could use pyODBC on Oracle as we do with with DB2 but the
>>>>>> installation process would be more cumbersome.
>>>>> Depends on which ODBC drivers you use. In fact, you can install  
>>>>> our
>>>>> Drives using an installer on any platform we support (and thats  
>>>>> a vast
>>>>> array of platforms).
>>>>>
>>>>> Post installation we even give you basic connection exercisers  
>>>>> or a
>>>>> platform port of the ODBC Driver Manager that verifies DSNs post
>>>>> creation.
>>>>>>
>>>>>> ODBC does a lot but in the end it just gives a connection and a
>>>>>> cursor
>>>>>> object. It does not give us complete database abstraction which  
>>>>>> is
>>>>>> what the web2py DAL does.
>>>>> Of course it gives you complete DBMS abstraction. What you  
>>>>> really mean
>>>>> is this: the interface you are using doesn't really exploit ODBC  
>>>>> to
>>>>> the
>>>>> max. I believe this ODBC layer comes from someone else, right?
>>>>>>
>>>>>> One of the most important issue from our prospective is that the
>>>>>> web2py DAL works with Non Relational Databases too. Currently  
>>>>>> only
>>>>>> Google Big Table on the Google App Engine but we are working on
>>>>>> MongoDB and CouchDB support.  ODBC does not work with non- 
>>>>>> relational
>>>>>> databases.
>>>>> Virtuoso provides an ODBC interface that actually lets you work  
>>>>> with
>>>>> XML, RDF, Web Services and other data representations and model
>>>>> combinations.  Jena, Redland, and Sesame abstractions all work  
>>>>> with
>>>>> Virtuoso via its JDBC or ODBC drivers. Again, ODBC capabilities  
>>>>> are
>>>>> very
>>>>> dependent on who's ODBC implementation (app or driver) you are  
>>>>> working
>>>>> with.
>>>>>
>>>>> In our world you can join a relation table and Web Addressable
>>>>> Resource
>>>>> via SPARQL pattern in Procedure Views (Table Valued Functions in
>>>>> Oracle
>>>>> parlance) executed inside the Virtuoso SQL (i.e., SPASQL [1]).
>>>>> Naturally, you can also to the opposite i.e., surface SQL
>>>>> functionality
>>>>> in SPARQL[2] etc..
>>>>>>
>>>>>> Running on the Google App Engine cloud has some limitations. We  
>>>>>> can
>>>>>> only upload pure Python or pure Java programs, no C/C++ modules  
>>>>>> or
>>>>>> pre-compiled modules. We can only use the Google API or something
>>>>>> that
>>>>>> they call GQL. There is no SQL on GAE because it is not a  
>>>>>> relational
>>>>>> database. Hence, ODBC is not option. Building our system on ODBC
>>>>>> simply would not cut it. Yet the Google App Engine is the  
>>>>>> single best
>>>>>> thing that happened to the world of web development in some time
>>>>>> (practically free, unlimited scalability, zero maintenance, no
>>>>>> messing
>>>>>> up with server or database administration). We use ODBC when we  
>>>>>> can
>>>>>> but we are interested in pushing our technology in places where  
>>>>>> ODBC
>>>>>> is not an option.
>>>>>>
>>>>>> Least but not last I want to make the case that everybody is  
>>>>>> doing
>>>>>> web
>>>>>> apps with database connections these days. This is now off-the- 
>>>>>> shelf
>>>>>> technology and people should not pay for it. Most ODBC drivers  
>>>>>> are
>>>>>> commercial products or require a commercial license for  
>>>>>> enterprise
>>>>>> use. Our Database Abstraction Layer is released under the GPL2
>>>>>> license, only requires third party GPL or BSD drivers, and has  
>>>>>> been
>>>>>> tested by more than 1500 registered users and 50 developers.
>>>>>
>>>>> I am not questioning your vision. And when it comes to data  
>>>>> access, I
>>>>> think REST based Open Data Access (Linked Data) is something I
>>>>> advocate
>>>>> strongly,  as a superior solution to ODBC, JDBC, ADO.NET, OLE- 
>>>>> DB, or
>>>>> XMLA when dealing disparately shaped data sources.
>>>>>
>>>>> My fundamental point to you is this: ODBC is a DBMS independent
>>>>> API, so
>>>>> you shouldn't speak in DBMS specific terms re. ODBC. You have  
>>>>> Native
>>>>> CLIs from DBMS vendors that occupy the Database Specific Data
>>>>> Access API
>>>>> box :-)
>>>>>
>>>>> ODBC is about making the backend DBMS engines interchangeable, one
>>>>> size
>>>>> doesn't fit all re. ODBC Drivers  or ODBC compliant  
>>>>> applications. I
>>>>> should be able to simply slot in our Virtuoso ODBC Driver, or  
>>>>> our ODBC
>>>>> Drivers for all the other DBMS engines we support without issue  
>>>>> via
>>>>> your
>>>>> ODBC hook.  Same should apply to any other ODBC Driver vendor.
>>>>>
>>>>> To conclude; when I have a moment I'll have your ODBC layer tested
>>>>> against our ODBC Drivers, if there are issues, we'll make a simply
>>>>> report highlight ODBC APIs issues etc. (for you or whoever is
>>>>> behind the
>>>>> ODBC layer. Worst case we can fix it ourselves as we already  
>>>>> done many
>>>>> other interfaces).
>>>>>
>>>>>
>>>>> Links:
>>>>>
>>>>> 1. http://docs.openlinksw.com/virtuoso/rdfsparql.html#rdfsparqlinline
>>>>> 2. http://docs.openlinksw.com/virtuoso/rdfsparql.html#rdfsqlfromsparql
>>>>>
>>>>> Kingsley
>>>>>>
>>>>>>
>>>>>> Massimo
>>>>>>
>>>>>>
>>>>>> On Dec 24, 2009, at 9:30 AM, Kingsley Idehen wrote:
>>>>>>
>>>>>>> Massimo Di Pierro wrote:
>>>>>>>> Hi Kingsley,
>>>>>>>>
>>>>>>>> Internally we use the Python ODBC driver for the connection  
>>>>>>>> to DB2
>>>>>>>> and
>>>>>>>> to MSSQL.
>>>>>>> But how have you come to conclude that ODBC only works with  
>>>>>>> DB2 and
>>>>>>> MSSQL? What about ODBC Drivers that have been developed to work
>>>>>>> consistently across:
>>>>>>>
>>>>>>> Virtuoso, Oracle, SQL Server, DB2, Informix, Ingres, Sybase,
>>>>>>> Progress,
>>>>>>> MySQL, Firebird, other ODBC Drivers, JDBC Drivers etc? Thus, you
>>>>>>> could
>>>>>>> have the option ODBC as opposed to: ODBC for DB2 and MS SQL  
>>>>>>> Sever.
>>>>>>>
>>>>>>> What about the fact that you have ODBC Drivers for Virtuoso that
>>>>>>> enable
>>>>>>> to make one ODBC connection to all the DBMS engines above, and  
>>>>>>> treat
>>>>>>> their tables as though they were part of one DBMS i.e.,  
>>>>>>> Virtual DBMS
>>>>>>> based Relational Data Virtualization (or Federation).
>>>>>>>
>>>>>>>> We use more specific drivers for the other databases, as
>>>>>>>> recommended
>>>>>>>> by the official documentation.
>>>>>>> What official documentation? URL please.
>>>>>>>> Anyway, from the Python programming point of view they all
>>>>>>>> expose the
>>>>>>>> same Python API.
>>>>>>>>
>>>>>>>> The fact is ODBC only unifies the data access API and that is a
>>>>>>>> small
>>>>>>>> part of we needed since different database still have different
>>>>>>>> dialects.
>>>>>>>> Our DAL completely abstracts the database and writes the SQL  
>>>>>>>> in the
>>>>>>>> specific dialect of specific backend.
>>>>>>> Are you aware of the fact that via ODBC metadata calls you can
>>>>>>> deductively discern the capabilities and SQL dialiects supported
>>>>>>> by an
>>>>>>> DBMS. This is all about the ODBC Drivers you are working with,
>>>>>>> not the
>>>>>>> ODBC spec itself (which is vast and very sophisticated).
>>>>>>>>
>>>>>>>> For example a limit query in SQLite is done with "SELECT ...
>>>>>>>> FIRST N"
>>>>>>>> The same query in MSSQL is done with "SELECT .... TOP N", in
>>>>>>>> Oracle it
>>>>>>>> requires two nested selects "SELECT ... FROM (SELECT w_tmp.*,
>>>>>>>> ROWNUM
>>>>>>>> w_row FROM (SELECT ...) w_tmp WHERE ROWNUM<=N)...;".  In the
>>>>>>>> case of
>>>>>>>> the Google App Engine there is not even SQL so our DAL  
>>>>>>>> translates
>>>>>>>> directly into Google API calls.
>>>>>>> Again, you should focus on the functionality you want, and then
>>>>>>> see if
>>>>>>> the underling ODBC Driver can handle what you want, if it  
>>>>>>> can't you
>>>>>>> can
>>>>>>> drop back to your work arounds.
>>>>>>>
>>>>>>>>
>>>>>>>> The same query in the web2py DAL reads like, for example:
>>>>>>>>
>>>>>>>> db=DAL('postgresql://username:password@localhost',  
>>>>>>>> pool_size=100)
>>>>>>>> db.define_table('person',Field('name'))
>>>>>>>> db.person.insert(name='Ken')
>>>>>>>> rows = db(db.person.name=='Ken').select(limitby = (0,10))
>>>>>>>>
>>>>>>>> The first line choose the most appropriate connection driver  
>>>>>>>> (which
>>>>>>>> may be ODBC). The second line tried to define a table "person".
>>>>>>>> If it
>>>>>>>> does not exist, it is CREATEd. If it exists and has a different
>>>>>>>> stricture, it is ALTERed. The third line inserts a second. The
>>>>>>>> forth
>>>>>>>> line is performs the limit query.
>>>>>>>>
>>>>>>>> As you can see we do not use raw SQL anywhere in our system,
>>>>>>>> although
>>>>>>>> our system may use SQL to talk to the database. Using raw SQL  
>>>>>>>> also
>>>>>>>> presents the disadvantage that, if one is not very careful in
>>>>>>>> escaping
>>>>>>>> variables, one may introduce SQL Injections vulnerabilities.  
>>>>>>>> We do
>>>>>>>> not
>>>>>>>> have to worry about that with the DAL.
>>>>>>> I do understand the abstraction.
>>>>>>> I worked with a product called DAL [1] for Mac OS Classic in the
>>>>>>> early
>>>>>>> 1990's  (and others pre ODBC, which arrived in 1992 as Windows
>>>>>>> port of
>>>>>>> what was then the SAG CLI) :-)
>>>>>>>>
>>>>>>>> The RDF tagging is done at the DAL level:
>>>>>>>>
>>>>>>>> db.person.name.rdf = { .... }
>>>>>>>>
>>>>>>>> Anyway, it is possible there is some feature of ODBC that we  
>>>>>>>> have
>>>>>>>> overlooked and we may be able to take  advantage of.
>>>>>>> Yes.
>>>>>>>
>>>>>>> You have ODBC itself. Then you have Virtuoso VDBMS (think  
>>>>>>> scalable
>>>>>>> high
>>>>>>> performance variant of Microsoft Access JET Engine which major
>>>>>>> benefits).
>>>>>>>
>>>>>>>
>>>>>>> Kingsley
>>>>>>>>
>>>>>>>> Massimo
>>>>>>>>
>>>>>>>> On Dec 24, 2009, at 6:57 AM, Kingsley Idehen wrote:
>>>>>>>>
>>>>>>>>> Massimo Di Pierro wrote:
>>>>>>>>>> Hello everybody,
>>>>>>>>>>
>>>>>>>>>> I am a new member of this list and first of all I wish  
>>>>>>>>>> everybody
>>>>>>>>>> Happy
>>>>>>>>>> Holidays.
>>>>>>>>>>
>>>>>>>>>> I also take the occasion to introduce the RDF plugin for  
>>>>>>>>>> web2py.
>>>>>>>>>>
>>>>>>>>>> http://web2py.com/semantic
>>>>>>>>>>
>>>>>>>>>> web2py is an Open Source web framework for rapid application
>>>>>>>>>> development written in Python and programmable in Python.  
>>>>>>>>>> web2py
>>>>>>>>>> runs
>>>>>>>>>> almost everywhere and includes everything you need for web
>>>>>>>>>> development
>>>>>>>>>> in one package including a ssl-enabled web server, a
>>>>>>>>>> transaction-safe
>>>>>>>>>> relational database, a web-based IDE, a web-based database
>>>>>>>>>> administration tool, and a Database Abstraction Layer that
>>>>>>>>>> writes SQL
>>>>>>>>>> for you and works transparently on SQLite, MySQL, Postgresql,
>>>>>>>>>> Oracle,
>>>>>>>>>> MSSQL, FireBird, DB2, Informix, Ingres, and on Google App  
>>>>>>>>>> Engine.
>>>>>>>>>>
>>>>>>>>>> Web2py requires no installation: just download, unzip and
>>>>>>>>>> click to
>>>>>>>>>> start it.
>>>>>>>>>> You can see an online demo here:
>>>>>>>>>>
>>>>>>>>>> http://web2py.com/demo_admin
>>>>>>>>>>
>>>>>>>>>> The RDF plugin allows to label (tag) database fields and
>>>>>>>>>> relations
>>>>>>>>>> with ontologies and web2py automatically exposes the data  
>>>>>>>>>> in the
>>>>>>>>>> database as Linked Data via a RESTful web service. Works  
>>>>>>>>>> with all
>>>>>>>>>> database back-ends listed above.
>>>>>>>>>>
>>>>>>>>>> Any comment and feedback will be appreciated.
>>>>>>>>> Any reason why you don't use ODBC for your data access? Your
>>>>>>>>> references
>>>>>>>>> above imply you implemented data access APIs on a DBMS by DBMS
>>>>>>>>> basis.
>>>>>>>>> ODBC is not only superior to all Native DBMS APIs, it is  
>>>>>>>>> generic
>>>>>>>>> thereby
>>>>>>>>> shrinking you development and maintenance costs.
>>>>>>>>>
>>>>>>>>> Kingsley
>>>>>>>>>>
>>>>>>>>>> Massimo
>>>>>>>>>>
>>>>>>>>>> -------------------------------------------------------
>>>>>>>>>> Massimo Di Pierro
>>>>>>>>>> Associate Professor
>>>>>>>>>> School of Computing and Digital Media
>>>>>>>>>> DePaul University
>>>>>>>>>> 243 S Wabash Ave
>>>>>>>>>> Chicago IL 60604
>>>>>>>>>> +1-312-375-6536 (phone)
>>>>>>>>>> +1-312-375-6116 (fax)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Kingsley Idehen          Weblog:
>>>>>>>>> http://www.openlinksw.com/blog/~kidehen
>>>>>>>>> President & CEO
>>>>>>>>> OpenLink Software     Web: http://www.openlinksw.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Kingsley Idehen          Weblog:
>>>>>>> http://www.openlinksw.com/blog/~kidehen
>>>>>>> President & CEO
>>>>>>> OpenLink Software     Web: http://www.openlinksw.com
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/ 
>>>>> ~kidehen
>>>>> President & CEO
>>>>> OpenLink Software     Web: http://www.openlinksw.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>>
>>> Regards,
>>>
>>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/ 
>>> ~kidehen
>>> President & CEO
>>> OpenLink Software     Web: http://www.openlinksw.com
>>>
>>>
>>>
>>>
>>
>>
>
>
> --
>
>
> Regards,
>
> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
> President & CEO
> OpenLink Software     Web: http://www.openlinksw.com
>
>
>
>

Received on Thursday, 24 December 2009 19:36:02 UTC