
Re: Are current RDF tools ready for this use case?

From: Ioachim Drugus <sw@semanticsoft.net>
Date: Wed, 27 Jun 2007 13:35:04 -0700
Message-ID: <4682C9F8.3090906@semanticsoft.net>
To: Mark Kennedy <mark.kennedy@gmail.com>
CC: semantic-web@w3.org

Hi Mark,

"Semantic Server" is a tool which seems to comply with your description. 
I can arrange a copy for you.
We would like "Semantic Server" to be a content server for Semantic Web. 
Here is a short description of its features which are generic, but cover 
the particular use case you described.
Semantic Server has a web-based UI through which you can create and 
manage different "semantic repositories" on the web, including existing 
ones, which are built according JCR, Java Content Repository, 
specification JSR-000170..  In a repository, a resource with a semantic 
web content type is stored "analytically" so that you can search inside 
the resource by using SPARQL. You can also keep all other types of 
content as a blob.  It also has a meta-data management system which 
allows you to keep information *about* resources inside of a repository 
- in Dublin Core or using any other vocabulary, say, of OWL. The system 
has concurrent versioning.
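
For example, searching the Dublin Core metadata of a repository could look 
like the following SPARQL query (the property prefixes are standard Dublin 
Core; the author name and variable names are just illustrative):

```
PREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT ?resource ?title
WHERE {
  ?resource dc:creator "Mark Kennedy" ;
            dc:title   ?title .
}
```

A query like this would find every resource in the repository with the given 
creator, regardless of which system originally produced it.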

As a separate tool we have an application (Chameleon with Mobile Client) 
which can also work with a cell phone - through a web service or, if you 
don't have an internet connection, through SMS. Soon we will probably 
make Semantic Server a content server not only for the web but also for 
mobile telephony. All of this is done in Java, so it is cross-platform. 
Why is Semantic Server a *server*? Because it also allows you to 
*publish* resources from repositories so that they can be accessed via 
HTTP (and later via SMS, and probably other protocols).
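
To sketch what publishing means in practice, a client could simply 
dereference a published resource over HTTP (the host, path and content 
type here are only an example, not actual Semantic Server URLs):

```
GET /repositories/myrepo/resources/article42 HTTP/1.1
Host: example.org
Accept: application/rdf+xml
```

The server would then answer with the RDF content of the resource.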

There is also a semantic studio called S-manager, which can work with 
Semantic Server so that you can create semantic content in three modes:
1. Source Mode - N3, N-Triples and RDF/XML, in several presentation flavours
2. Graphic Mode - where the ontology is represented graphically, and 
where you can manage the graphic styles
3. Design Mode - where you write simple propositions in natural 
language but separate the subject, predicate and object of each 
proposition so that the tool "understands" you. Of course, you first 
have to indicate the namespace of the ontology you are creating. When 
you move the mouse over a word, you can see the complete URI of the 
resource (well, except when it is a blank node). This Design Mode is our 
first idea for an interface to the Semantic Web for those who don't 
know the standards.
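
To illustrate Design Mode: once you separate the subject, predicate and 
object of a proposition like "Mark authored Article 42", it corresponds 
to a single triple, which in N3 would read (the namespace and names are 
hypothetical):

```
@prefix ex: <http://example.org/ontology#> .

ex:Mark ex:authored ex:Article42 .
```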

The tool has an 'in memory' system and synchronizes all presentations 
across modes behind the scenes. So you can start building an ontology 
in one mode, then switch to another mode and continue there - including 
developing the whole ontology in Graphic Mode. You can save the ontology 
to the filesystem, to a database or to a semantic repository.
The tool also has a vocabularies panel where you can import any 
vocabulary in addition to RDF, RDFS and OWL, which come with the tool.

We are building our own SPARQL processor to incorporate into Semantic 
Server, but you can use any other SPARQL tool for now. Our idea on 
security and trust is that the SPARQL processor should be part of the 
repository, that only it should be allowed to search the repository, 
and that it should serve content per request after analyzing the 
requestor's relationship with the repository. Of course, that 
relationship can also be a public one.

Joe
Ioachim Drugus, Ph.D
Architect
Semantic Soft, Inc.


Mark Kennedy wrote:
> Hello, all:
>
> I'm hoping to get some feedback for the appropriateness of using RDF 
> as a solution to a challenge that I'm facing. Here's the scenario:
>
> I have an environment that includes several disparate sets of content, 
> all maintained and stored by separate systems. For example, we have a 
> CMS for our editorial content, a third party blogging tool, a message 
> board application, a third party video management system, and perhaps 
> a third party wiki application. Each of those systems has their own 
> schema and storage for metadata about the content they manage. In the 
> future, new systems and content types will be added to our environment 
> as needed.
>
> Our vision is to build a common metadata store that each separate 
> system would feed into. This common store would enable us to add new 
> metadata to content and rectify the metadata from each system into a 
> common schema, e.g. allow us to map the author information from each 
> separate system onto a common set of authors, map separate 
> categorization schemes to a common taxonomy, etc.
>
> Our goal is to be able to query the common metadata store to do things 
> like find all of the content created by a single author regardless of 
> the system, or find all content related to a particular topic, or some 
> similar combination of query criteria.
>
> Based on our requirements, RDF seems like an ideal solution. What I'm 
> unsure about, however, is if there are any RDF tools/frameworks/stores 
> that are robust enough to handle a high level of concurrent querying 
> that would result from a high traffic, publicly available web site.
>
> I'm just starting the process of researching tools and triple stores 
> now, but I guess I'm looking for a gut check on the readiness or 
> appropriateness of RDF to serve the needs I describe. Are RDF and the 
> current tools that enable/support it ready for prime-time 
> consideration? If so, which ones make the most sense to research first?
>
> In my mind, the ideal system would support:
>  * The ability to store large numbers of triples, scalable to hundreds 
> of millions.
>  * Would be clusterable for redundancy.
>  * Could be accessed via HTTP for easy integration into a variety of 
> platforms.
>  * Would be highly performant in regards to querying.
>
> Any feedback would be appreciated. And if you think this query might 
> make more sense in another forum, please let me know.
>
> Thanks!
>
> -- 
> Mark Kennedy
Received on Thursday, 28 June 2007 12:16:01 UTC
