SOA organised with RDF - update


About 4 years ago I wrote about organising SOA using RDF. I thought that
if anyone remembers that, they might be interested in an update. 

I still work at the Danish tax and customs administration, and our SOA
development programme has now reached a mature stage, with 1000 web
service definitions, 3000 endpoints, numerous operations environments
and a myriad of information to keep together. We also have to
coordinate a number of solution suppliers and operators, as we do not
in fact build our own solutions. This would probably never have
succeeded without resorting to Semantic Web technology.

Four years ago, we used an eXist XML database as the core. This
required the RDF to be serialised as RDF/XML, and was not efficient
for querying. I
received many suggestions to move to a dedicated RDF database, but it
has taken a long time to do this, as the day-to-day work keeps
distracting us. 

Nevertheless, we eventually moved our metadata to a core using
OpenRDF-Sesame, and we have developed a suite of rdfizers/introspectors
based on a combination of ARQ and Linux shell scripts, to feed the
metadatabase with fresh updates.  
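Our rdfizers are built on ARQ and shell scripts, but the idea is simple
enough to sketch in a few lines of standard-library Python. Everything
here is invented for illustration - the WSDL snippet, the namespace and
the predicate names are not our actual ontology:

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces -- in reality each source type has its own ontology.
META = "http://example.org/meta#"
WSDL_NS = "{http://schemas.xmlsoap.org/wsdl/}"

wsdl = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
                       name="TaxAccountService">
  <portType name="TaxAccountPort">
    <operation name="getBalance"/>
    <operation name="updateBalance"/>
  </portType>
</definitions>"""

def rdfize(wsdl_text):
    """Introspect a WSDL document and emit (subject, predicate, object) triples."""
    root = ET.fromstring(wsdl_text)
    service = META + root.get("name")
    triples = [(service, META + "type", META + "WebService")]
    for port in root.iter(WSDL_NS + "portType"):
        for op in port.iter(WSDL_NS + "operation"):
            triples.append((service, META + "hasOperation", META + op.get("name")))
    return triples

for s, p, o in rdfize(wsdl):
    print("<%s> <%s> <%s> ." % (s, p, o))  # N-Triples-style lines, ready to load
```

In the real pipeline the output of each introspector is loaded into the
Sesame repository, replacing the previous snapshot for that source.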

The metadatabase is rapidly growing, but manages to tie together
information from UML, BPMN, WSDL and XSD code generation,
documentation, source code, service bus introspection, test suites and
you name it into an interconnected graph/database.

With this metadatabase it is fairly easy to answer questions like

- "Which web services from system X are failing?",
- "If I change UML Use Case UCY, which service endpoints may
potentially be affected?",
- "If I change the datatype of XSD element Z, what is the total impact
on our web services, documentation, source code and operations?", or
- "Where can I find the exact specification and documentation for
endpoint P?"
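In spirit, all of these impact questions are reachability queries over
the graph. Our real queries are SPARQL run against Sesame, but the
mechanics can be shown in a standard-library Python toy - every
resource and predicate name below is made up:

```python
# Toy metadata graph as a set of (subject, predicate, object) triples.
# A change to an XSD type ripples backwards along any edge pointing at it.
TRIPLES = {
    ("svc:CustomsDeclaration", "m:usesType",      "xsd:TraderIdType"),
    ("svc:TaxAccount",         "m:usesType",      "xsd:AmountType"),
    ("doc:CustomsSpec",        "m:documents",     "svc:CustomsDeclaration"),
    ("src:DeclClient",         "m:generatedFrom", "svc:CustomsDeclaration"),
}

def impacted_by(resource):
    """Everything that transitively refers to `resource`, via any predicate."""
    hit, frontier = set(), {resource}
    while frontier:
        frontier = {s for s, _, o in TRIPLES if o in frontier} - hit
        hit |= frontier
    return hit

print(sorted(impacted_by("xsd:TraderIdType")))
```

In SPARQL the same traversal would be one query; the point is that the
answer falls out of the graph structure, not out of any one source.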

or any other question of that sort. We have a metadata governance group
that is able to develop its own custom SPARQL queries, so that
requests of this kind from our business departments can be met.

Metadata updating is fully automated, as manual updating is doomed to
fail! Metadata are produced as a spill-over from the specification work
that has to be done anyway. This means that the day-to-day work
automatically enriches the metadatabase.

The move to Sesame enabled fast queries, multiple RDF backends, web
distribution, and also provides a workbench for the governance group to
develop new queries.

I have noticed that there has always been a tremendous focus on
developing standard ontologies for this and that. I also noticed that
there was an initiative to develop a standard ontology for SOA. But in
my experience, the strength of our concept - in contrast to virtually
all metadata management products I have seen - is that we do NOT have a
fixed ontology. Each type of metainformation has its own ontology. And
this ontology is not even fixed. We make changes to the ontologies
whenever we like. If we add a new source of metainformation, we
typically also add a dedicated ontology for that type of source. 
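The reason this works is that triples from independently evolving
ontologies still merge into one graph, as long as they share resource
URIs. A made-up illustration in the same toy style as above (the "uml:"
and "wsdl:" vocabularies are hypothetical stand-ins for two per-source
ontologies):

```python
# Triples produced by two separate introspectors, each with its own
# vocabulary -- no shared schema was agreed up front.
uml_triples  = {("svc:TaxAccount", "uml:realisesUseCase", "uc:ManageTaxAccount")}
wsdl_triples = {("svc:TaxAccount", "wsdl:hasEndpoint",    "ep:TaxAccountV2")}

graph = uml_triples | wsdl_triples  # merging RDF graphs is just set union

# Cross-source question: which endpoints realise which use cases?
joined = {(o2, o1)
          for s1, p1, o1 in graph if p1 == "uml:realisesUseCase"
          for s2, p2, o2 in graph if p2 == "wsdl:hasEndpoint" and s2 == s1}
print(joined)
```

Adding a new source of metainformation just means new predicates in the
same graph; nothing existing has to be migrated.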

This way we can keep the metadatabase dynamic and alive all the time,
and adapt to changing needs. I find it very hard to subscribe to fixed
ontologies, because they invariably turn out to be a hindrance.
Of course this means that some SPARQL queries may have to be altered
too, when we decide to change something in an ontology, but this is a
very small price to pay for the ultimate flexibility.


Frank Carvalho
Central Customs and Tax


Received on Thursday, 29 September 2011 10:29:40 UTC