
RE: Unstructured vs. Structured (was: HL7 and patient records in RDF/OWL?)

From: Cutler, Roger (RogerCutler) <RogerCutler@chevron.com>
Date: Tue, 14 Feb 2006 23:42:04 -0600
Message-ID: <0C237C50B244FD44BE47B8DCE23A3052011C62F5@HOU150NTXC2MC.hou150.chevrontexaco.net>
To: "Christopher Cavnor" <ccavnor@systemsbiology.org>, public-semweb-lifesci@w3.org

That's too deep for me.  I'll be satisfied, at least in an immediate
sense, with a demonstration of how to generate RDF from an Excel
spreadsheet.  I think I'll just start saying "Excel spreadsheet" and
forget about the term that we use internally to categorize the kinds of
problems we have.  Spreadsheets are pretty much the 80-20 of that
problem, so why not call a spade a spade?  I'm really not very good at
generalizing and categorizing.
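
As a rough illustration of the kind of demonstration meant here, the
following Python sketch turns spreadsheet rows into RDF N-Triples.  It
assumes the sheet has been exported from Excel as CSV with a header row,
that one column identifies each row, and that http://example.org/ is a
placeholder namespace; the function names and sample data are
hypothetical.

```python
import csv
import io
from urllib.parse import quote

# Placeholder namespace (an assumption, not a real vocabulary).
BASE = "http://example.org/"


def escape_literal(value):
    """Escape characters that N-Triples does not allow raw in a literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def rows_to_ntriples(csv_text, key_column, base=BASE):
    """Emit one triple per non-empty cell: row -> column -> cell value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # The key column names the subject of every triple for this row.
        subject = f"<{base}row/{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            # Skip the key itself, surplus cells, and missing or empty cells.
            if column is None or column == key_column or not value:
                continue
            predicate = f"<{base}column/{quote(column, safe='')}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return "\n".join(lines)


# Hypothetical sample data standing in for an exported worksheet.
sample = "id,name,value\nr1,alpha,10\nr2,,20\n"
print(rows_to_ntriples(sample, "id"))
```

Each row becomes a subject, each column header a predicate, and each
non-empty cell a plain literal.  A real conversion would probably map
columns onto an agreed vocabulary rather than minting one from the
headers.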

-----Original Message-----
From: public-semweb-lifesci-request@w3.org
[mailto:public-semweb-lifesci-request@w3.org] On Behalf Of Christopher Cavnor
Sent: Tuesday, February 14, 2006 3:54 PM
To: public-semweb-lifesci@w3.org
Subject: Re: Unstructured vs. Structured (was: HL7 and patient records
in RDF/OWL?)

I'd argue that most information resources are indeed semi-structured.
The human brain is only able to meta-categorize resources based on their
structured aspects (markup and structural metadata), their informational
content (their aboutness), and context (environmental metadata).

"Structured" data is only structured once we have a common understanding
of its meaning. In this regard, data is never "raw" (except for randomly
generated data) - as even structured database tables have metadata to
add meaning. So the term "semi-structured" is always adequate as far as
I am concerned. You'd have to prove that there is any other type of data
to me ;)

Christopher Cavnor

On 2/14/06 10:54 AM, "Cutler, Roger (RogerCutler)"

> OK, then is there a preferred term for what we call "semi-structured
> data"?  That is, information that is structured but where the structure
> is not easily determined and perhaps has not been formalized at all, but
> for which a formalized structure could be defined?  For example, data
> in a spreadsheet?  We really care about this kind of thing, but I don't
> want to confuse the issue by using terms that most people understand
> differently.
> Incidentally, from my personal experience the usage of the term
> semi-structured, that is, binary blobs in structured databases, is not
> very common.  Frankly, this is the first I have heard the term used in
> that sense, but maybe I just don't run in the right circles.
> -----Original Message-----
> From: public-semweb-lifesci-request@w3.org
> [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of Jim Hendler
> Sent: Monday, February 13, 2006 3:43 PM
> To: Pat Hayes; Gao, Yong
> Cc: public-semweb-lifesci@w3.org
> Subject: Re: Unstructured vs. Structured (was: HL7 and patient records
> in RDF/OWL?)
> At 14:46 -0600 2/13/06, Pat Hayes wrote:
>>> The point I'm trying to make is this: The concept of
>>> is relative and context-sensitive.
>> Hear, hear. Well said.
>> Pat Hayes
> FWIW, Structured, unstructured and semi-structured, although
> concepts in common language and (esp) philosophy, have well-defined
> precise meanings in database jargon -- most database books have
> definitions that are consistent with:
>   unstructured - NL text
>   semi-structured - unstructured fields within a structured DB context
>   structured - relational model (or similar) (those papers with
> technical definitions tend to get ugly and recourse to relational
> calculus, so these overly simplified definitions should suffice for
> that said, in the spirit of this particular thread, I think we should be
> careful and, if we mean to use it in a DB context, make it clear in any
> document that uses the term (i.e. "structured database" v.
> "structured data" which are very different in some contexts)
>     -JH
Received on Wednesday, 15 February 2006 05:42:32 UTC
