Unstructured vs. Structured (was: HL7 and patient records in RDF/OWL?)

Having trained as a computational linguist, one thing I remember vividly is the
debate among linguists over semantics vs. syntax. One piece of wisdom I gained
from that experience is the saying "One man's semantics is another man's
syntax." (I'll need to dig deeper to find its origin.)

Having worked on building practical tools for data extraction and integration,
I've learned the importance of NOT getting too bogged down in labeling what's
"structured" and what's not. Here I quote another saying: "One Man's Ceiling
Is Another Man's Floor."


The point I'm trying to make is this: the concept of "structuredness" is
relative and context-sensitive. For example, natural language texts are highly
structured; it's just that we still have a long way to go to fully discover and
understand their structures and use them to find meaning mechanically.
As another example, HTML pages are structured so that web browsers can display
them properly. Conversely, XML and RDF data can just as well be "unstructured"
if you put a blob of text, say an abstract, between a pair of tags.
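
To make the last point concrete, here is a minimal sketch in Python. Both
records are made up for illustration, and the element and attribute names
(record, abstract, patient, age) are my own assumptions, not taken from HL7 or
any real schema. Both documents are well-formed XML, but only the second one
exposes the age to a program without any text mining.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: well-formed XML, but the content is one blob of text.
blob_xml = """<record>
  <abstract>A 67-year-old male presenting with tremor.</abstract>
</record>"""

# The same made-up facts, with the structure spelled out in the markup.
fielded_xml = """<record>
  <patient age="67" sex="male"/>
  <symptom>tremor</symptom>
</record>"""


def patient_age(xml_text):
    """Return the age as an int if the markup exposes it, else None."""
    root = ET.fromstring(xml_text)
    patient = root.find("patient")
    if patient is None:
        return None
    return int(patient.get("age"))


print(patient_age(blob_xml))     # the age is buried in the text blob
print(patient_age(fielded_xml))  # the age is directly addressable
```

To the parser, the first record is "structured" only down to the abstract
element; everything inside it is opaque text.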

I would almost suggest the term "non-RDF", rather than "unstructured", be used
in the context of transforming some data into RDF format.

---
Yong Gao, Ph.D.
MassGeneral Institute for Neurodegenerative Disease (MIND)

Received on Friday, 10 February 2006 19:56:47 UTC