Re: (Round 2) Proposed Extensions to OWL from Thomas B. Passin on 2003-07-05 (www-rdf-interest@w3.org from July 2003)

From: Thomas B. Passin <tpassin@comcast.net>
Date: Sat, 5 Jul 2003 11:36:44 -0400
To: <www-rdf-interest@w3.org>
Message-ID: <000801c3430b$3db06ee0$6401a8c0@tbp1>
[Roger L. Costello]
>
> The discussions have been excellent.  I believe that we are making good
> progress.
>

We are, but I think it is time for a change in how to proceed - I have
several suggestions.

It struck me this morning that this process is very much like the Pair
Programming of Extreme Programming.  One person writes code (==>RDF) and the
others look on, making suggestions and pointing out errors.  Then another
person takes up the code-writing role.

But some of the other key practices from Extreme Programming are missing.  I
think that they should be in play here.

1) Write unit tests before writing code.
2) Test constantly by applying the unit tests.
3) Refactor diligently.
4) Be as simple as possible.

The unit tests are based on the known requirements for a module plus what is
learned during development.  Tests are added but not normally retired.

When all unit tests pass and no one can think of any other applicable tests,
you are done (until integration and test, etc., which I take it we are not tackling here).
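
As a rough illustration of the "add tests, never retire them, done when all pass" rule, here is a minimal Python sketch.  The registry and the function names (add_test, run_all, done) are hypothetical helpers made up for this example, not part of any existing tool.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: test id -> (natural-language statement, check function).
TESTS: Dict[str, Tuple[str, Callable[[], bool]]] = {}


def add_test(test_id: str, statement: str, check: Callable[[], bool]) -> None:
    """Register a test. Tests are only added, never replaced or retired."""
    if test_id in TESTS:
        raise ValueError(f"{test_id} already exists; add a new test instead")
    TESTS[test_id] = (statement, check)


def run_all() -> Dict[str, bool]:
    """Apply every registered test and report pass/fail per test id."""
    return {test_id: bool(check()) for test_id, (_, check) in TESTS.items()}


def done() -> bool:
    """'Done' means only that at least one test exists and all currently pass."""
    results = run_all()
    return bool(results) and all(results.values())
```

Registering a test whose check still returns False makes done() return False again, which is the point: new knowledge reopens the work.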

The input requirements in this case would be based on a conceptual model of
some kind, together with the kind of statements we think we will want to
make.  As for the tests, we know that we want to be able to make statements
and draw inferences.  So the tests must be in the form of statements.  They
cannot be in RDF because we do not yet know what the RDF will be.  Therefore
they need to be in some other language.  In practice, I am sure we want to
use natural language.

We already know that we are talking about _kinds_ of statements,  but we
will have to test them with specific instance statements.  We also know that
there are a potentially infinite number of statements we would eventually
like to be able to make.  We cannot test them all, so we will have to be
smart about crafting our tests.  Of course we will learn and add tests as we
go.

Below I take a first cut at this plan.  But before that, here comes my other
suggestion.  I think that this bit of development should get moved off the
list to a (publicly accessible) Wiki or Blog site.  If a blog, it should
accept comments to each story, and display them in line.  As a Wiki
example, you could look at Sam Ruby's Wiki for the Echo project.  It looks
like it uses the MoinMoin Wiki engine, which is in Python and is very easy
to set up [ http://moin.sourceforge.net/ ].  There are many others.

Whether Wiki or blog, it would be useful to have an RSS feed for changes.

Now for the start of a conceptual model and a few unit tests.  I hope that
the names I use are sufficiently clear, but if they are not, we need to
improve them or to clarify their intent.  I begin property names with
lowercase, object names with upper case, as Roger has been doing.

M1) A TangibleObject may have zero or more PhysicalProperties.
    Note - Probably needs a different name for TangibleObject.

M2) A physicalProperty is a kind of Property.

M3) The type of the value of a physicalProperty is a PhysicalQuantity.

M4) A linearPhysicalProperty is a kind of physicalProperty.

M5) A length is a kind of linearPhysicalProperty.

M6) A PhysicalQuantity may be characterized by zero or more measurement
properties.

M7) A LinearPhysicalQuantity is a kind of PhysicalQuantity.

M8) The type of the value of a length is a LinearPhysicalQuantity.

M8a) A LinearPhysicalProperty may have one or more length properties.

M9) The type of the value of a measurement is a Measurement.

M10) A LinearMeasurement is a kind of Measurement.

M10a) The type of the value of a length property is a LinearMeasurement.

M11)  A LinearMeasurement is characterized by one or more equivalent
(numerical value, units specification) pairs.
    Note - This refers to the values in different units, not to the results
of different measurements taken at different times or by different methods.

M12) A (numerical value, units specification) pair is a kind of
MeasurementValue.
    Note - I do not try to model a "units specification" here.  That looks
like a substantial job in itself!

M13) A MeasurementValue may be associated with metadata that include -
    a) Accuracy
    b) Precision
    c) Data set
    d) Calculations
    e) Algorithms used for the calculation.
    f) Relevant publications
    g) Source
    h) Reported precision (i.e., number of decimal places in the stated
        value, which may be different from the precision of the measurement)

M14) There is at least one method for establishing the equivalence between
each (numerical value, unit specification) pair of interest.
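
To make the "is a kind of" statements above concrete, here is a minimal Python sketch.  It assumes that "kind of" is transitive, as rdfs:subClassOf and rdfs:subPropertyOf are, and it stores only the five kind-of statements (M2, M4, M5, M7, M10).  The function name is_kind_of is made up for this example.

```python
# Each pair reads "child is a kind of parent".
KIND_OF = {
    ("physicalProperty", "Property"),                 # M2
    ("linearPhysicalProperty", "physicalProperty"),   # M4
    ("length", "linearPhysicalProperty"),             # M5
    ("LinearPhysicalQuantity", "PhysicalQuantity"),   # M7
    ("LinearMeasurement", "Measurement"),             # M10
}


def is_kind_of(child: str, ancestor: str) -> bool:
    """Follow 'kind of' links upward from child; a name counts as a kind of itself."""
    seen = set()
    frontier = {child}
    while frontier:
        current = frontier.pop()
        if current == ancestor:
            return True
        seen.add(current)
        # Queue every direct parent that has not been visited yet.
        frontier |= {p for (c, p) in KIND_OF if c == current and p not in seen}
    return False
```

With these statements, is_kind_of("length", "Property") holds by following M5, M4 and M2 in turn, while is_kind_of("length", "Measurement") does not.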

I will stop here and give an example of a few unit tests for bits of this
model.

- M1) A TangibleObject may have zero or more PhysicalProperties.
    T-M1-1.  There is a type of Thing called TangibleObject.
    T-M1-2.  There is a type of Property called PhysicalProperty.
    T-M1-3.  The domain of PhysicalProperty is TangibleObject.
    T-M1-4.  A TangibleObject may have zero or more PhysicalProperties.

To perform these four unit tests, write an RDF or OWL statement for each of
them.  This will be easy - so far so good.
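
As a sketch of what "performing" these four tests might look like, the Python below holds candidate statements as plain (subject, predicate, object) tuples and checks each test against them.  The "ex:" names are placeholders, the choice of owl:ObjectProperty is an assumption, and the property name is written in lowercase to follow the naming convention stated earlier.  T-M1-4 is treated as passing when no restriction on the property is stated, since "zero or more" is what OWL allows by default.

```python
# Candidate statements.  The "ex:" names are placeholders for illustration,
# not an agreed vocabulary.
GRAPH = {
    ("ex:TangibleObject", "rdf:type", "owl:Class"),               # T-M1-1
    ("ex:physicalProperty", "rdf:type", "owl:ObjectProperty"),    # T-M1-2
    ("ex:physicalProperty", "rdfs:domain", "ex:TangibleObject"),  # T-M1-3
}


def t_m1_1(graph):
    return ("ex:TangibleObject", "rdf:type", "owl:Class") in graph


def t_m1_2(graph):
    return ("ex:physicalProperty", "rdf:type", "owl:ObjectProperty") in graph


def t_m1_3(graph):
    return ("ex:physicalProperty", "rdfs:domain", "ex:TangibleObject") in graph


def t_m1_4(graph):
    # "Zero or more" needs no statement of its own: the test passes as long
    # as nothing in the graph restricts ex:physicalProperty.
    return not any(p == "owl:onProperty" and o == "ex:physicalProperty"
                   for (_, p, o) in graph)


RESULTS = {
    "T-M1-1": t_m1_1(GRAPH),
    "T-M1-2": t_m1_2(GRAPH),
    "T-M1-3": t_m1_3(GRAPH),
    "T-M1-4": t_m1_4(GRAPH),
}
```

Removing any one of the three triples makes exactly the matching test fail, which gives a quick sanity check on the tests themselves.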

- M12) A (numerical value, units specification) pair is a kind of
MeasurementValue.
    T-M12-1.  There is a kind of thing called MeasurementValue.
    T-M12-2.  A MeasurementValue must have exactly one numerical value.
    T-M12-3.  A MeasurementValue must have exactly one units specification.
Note - this is not quite right, because a general MeasurementValue could be
more general than a simple (value, units) pair.  We need a name for the
pair - then the test can be fixed.
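
The "exactly one" tests can be sketched as a simple count over triples.  This is only an illustration: the property names ex:numericalValue and ex:unitsSpecification are assumptions, and a closed-world count over the triples that happen to be present is simpler than what an OWL cardinality restriction actually means.

```python
def count_values(graph, subject, predicate):
    """Count the distinct objects stated for (subject, predicate)."""
    return len({o for (s, p, o) in graph if s == subject and p == predicate})


def passes_t_m12(graph, subject):
    """Check one instance against T-M12-1 to T-M12-3 (closed-world sketch)."""
    # T-M12-1, instance side: the subject is typed as a MeasurementValue.
    is_measurement_value = (subject, "rdf:type", "ex:MeasurementValue") in graph
    # T-M12-2: exactly one numerical value.
    one_number = count_values(graph, subject, "ex:numericalValue") == 1
    # T-M12-3: exactly one units specification.
    one_units = count_values(graph, subject, "ex:unitsSpecification") == 1
    return is_measurement_value and one_number and one_units


GOOD = {
    ("_:v1", "rdf:type", "ex:MeasurementValue"),
    ("_:v1", "ex:numericalValue", "6300"),
    ("_:v1", "ex:unitsSpecification", "kilometers"),
}

# Same value with a second units specification added: should fail T-M12-3.
BAD = GOOD | {("_:v1", "ex:unitsSpecification", "miles")}
```

Here passes_t_m12(GOOD, "_:v1") holds and passes_t_m12(BAD, "_:v1") does not.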

We want other unit tests as well.  Each of Roger's challenges can be seen as
a unit test.  We might even have a test that requires a specific inference
to be found.

- T-Roger-1.  The Yangtze River has a length of 3609 miles.
The unit test consists of
    a) Translating the sentence into the language of the model.
    b) Translating that into RDF/OWL statements.
Probably, step (a) should really be included in the statement of the unit
test, leaving only step (b) to be actually tested.
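
Here is one possible reading of steps (a) and (b) for T-Roger-1, as a Python sketch.  It assumes the chain TangibleObject, length, LinearPhysicalQuantity, measurement, LinearMeasurement, MeasurementValue.  The property names ex:measurement, ex:measurementValue, ex:numericalValue and ex:unitsSpecification are made up for the example, and the model statements could be read differently.

```python
def length_statement(thing, number, units):
    """Return triples saying that `thing` has a length of `number` `units`."""
    local = thing.split(":")[-1]
    # Blank-node labels for the intermediate quantity, measurement and value.
    quantity = f"_:{local}_quantity"
    measurement = f"_:{local}_measurement"
    value = f"_:{local}_value"
    return [
        (thing, "rdf:type", "ex:TangibleObject"),
        (thing, "ex:length", quantity),
        (quantity, "rdf:type", "ex:LinearPhysicalQuantity"),
        (quantity, "ex:measurement", measurement),
        (measurement, "rdf:type", "ex:LinearMeasurement"),
        (measurement, "ex:measurementValue", value),
        (value, "rdf:type", "ex:MeasurementValue"),
        (value, "ex:numericalValue", str(number)),
        (value, "ex:unitsSpecification", units),
    ]


# Step (a): "The Yangtze is a TangibleObject whose length has a
# LinearMeasurement with the value (3609, miles)."
# Step (b): the same thing as triples.
T_ROGER_1 = length_statement("ex:Yangtze", 3609, "miles")
```

Writing step (a) out in the model's own terms first makes it easier to see which part of a failure is a modeling problem and which is an RDF problem.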

- T-Roger-2.  The Yangtze River has a length of 6300 kilometers.
- T-Roger-3.  The statements in T-Roger-1 and T-Roger-2 are equivalent.
    Note - This would seem to be impossible to pass, since we have not yet
modeled what this kind of "equivalence" means.  But maybe we can
finesse it by using a symmetric "equivalentPhysicalValues" property, which
we could define elsewhere.  The notion of "transformation" would get
refactored into this new property.

If this seems like a good idea, we should add it to the conceptual model.
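
A minimal Python sketch of the two halves of this idea follows: a symmetric "equivalentPhysicalValues" relation that is simply asserted, and a separate numeric check that stands in for the "defined elsewhere" part.  The conversion table, the tolerance and the function names are assumptions for illustration only.  The factor 1.609344 is the international mile in kilometers.

```python
import math

# Kilometers per unit.  Only the two units used in the tests are listed.
KM_PER_UNIT = {"kilometers": 1.0, "miles": 1.609344}


def symmetric_closure(pairs):
    """If (a, b) is asserted equivalent, also derive (b, a)."""
    return set(pairs) | {(b, a) for (a, b) in pairs}


def to_km(value, units):
    """Convert a (numerical value, units) pair to kilometers."""
    return value * KM_PER_UNIT[units]


def numerically_equivalent(a, b, rel_tol=0.005):
    """Compare two (value, units) pairs after converting both to kilometers."""
    return math.isclose(to_km(*a), to_km(*b), rel_tol=rel_tol)


# Asserting the equivalence once is enough; symmetry supplies the reverse.
ASSERTED = {(("ex:Yangtze", 3609, "miles"), ("ex:Yangtze", 6300, "kilometers"))}
EQUIVALENT = symmetric_closure(ASSERTED)
```

Note that the asserted property and the numeric check are independent.  With this conversion factor 6300 kilometers is about 3915 miles, so the figures quoted in T-Roger-1 and T-Roger-2 would pass the numeric check only with a loose tolerance, while the asserted symmetric property holds regardless.  That difference is a good argument for deciding which of the two T-Roger-3 is really meant to test.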

It looks to me like M14 and the units specification are the only tricky
parts of this.

Guys, this has got to be a good approach, because my ideas changed,
clarified, and improved as I wrote this post!

Just to sum up, I am suggesting this -

    Create a conceptual model and a set of typical
    or edge case statements, then write unit tests for
    each bit of the model and for each of the statements,
    then try to write RDF/OWL for each test.

Roger, are you going to start up a Wiki for this?

Cheers,

Tom P
Received on Saturday, 5 July 2003 11:32:56 UTC