- From: Will Pugh <will.pugh@socrata.com>
- Date: Wed, 18 Jul 2012 00:28:44 -0700
- To: Joshua Shinavier <josh@fortytwo.net>
- Cc: public-vocabs@w3.org
- Message-ID: <CAEhPSgjWjFnvLj7icA4-WCKXNJ_isjMeuEo24JYDomr9tfDMvQ@mail.gmail.com>
On Mon, Jul 16, 2012 at 9:35 AM, Joshua Shinavier <josh@fortytwo.net> wrote:
> Hi Will,
>
> Thanks for your suggestions. I would have replied sooner, but I
> missed your email the first time around.
>
>
>
> On Sat, Jul 14, 2012 at 7:53 PM, Will Pugh <will.pugh@socrata.com> wrote:
> [...]
> > My understanding is that the main goal of schema.org is to create
> > schemas useful to search engines, rather than the broader goals of
> > projects like Linked Data that want to create a "Global Data Space".
> > Is this a correct assessment?
>
>
> I believe so, but there are others on this list who could give a more
> authoritative and complete answer.
>
>
>
> > With that assumption, I've got a few scenarios I wanted to ask about,
> > with the idea that these scenarios may describe relationships
> > interesting to search engines.
> >
> > 1) Is there a way to describe "derived datasets"?
>
>
> No, although I think this is a good idea. I imagine this would be
> useful from a licensing perspective (however, schema.org does not deal
> with licensing) as well as for making the related / super-dataset
> discoverable. However, I don't think it's very specific to datasets;
> IMHO, it would make more sense at the CreativeWork level. If such a
> term were in DCAT, perhaps it would make sense to include an
> equivalent term in the extension and then propose that it be moved up
> to CreativeWork.
>
Interesting point. A derived-work property on CreativeWork does make sense.
The reason I was leaning towards something in Dataset, though, is that
"DerivedWork" might have a different meaning for a general CreativeWork than
for a Dataset specifically. A general "DerivedWork" could imply changes on
top of the original. For example, if I have a photo of the president and
photoshop myself in, the result could be a "DerivedWork" from the original
photo.

However, the concept I was trying to express is one where the data itself
is not changed. Instead, filters, sorts, or aggregations are layered on top
of the dataset to "tell a story" from the data, or to make some point from
the data. Perhaps a different name would be better, like OriginalDataset or
ParentDataset?
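
To make the distinction concrete, here is a rough Python sketch of what I
mean by a view layered on top of unchanged data. The rows and values are
made up for illustration, and "parentDataset" is only a hypothetical
property name, not an existing schema.org term.

```python
import json

# Made-up rows standing in for a published dataset (illustrative values only).
source_rows = [
    {"name": "A", "office": "Counsel", "salary": 100000},
    {"name": "B", "office": "Counsel", "salary": 60000},
    {"name": "C", "office": "Press", "salary": 80000},
]


def derive_view(rows, min_salary):
    """Filter, aggregate, and sort without modifying the source rows."""
    totals = {}
    for row in rows:
        if row["salary"] >= min_salary:  # filter
            totals[row["office"]] = totals.get(row["office"], 0) + row["salary"]  # aggregate
    # Sort offices by total salary, largest first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


view = derive_view(source_rows, min_salary=70000)

# Description of the view. "parentDataset" is a hypothetical property name
# used only to show how a view could point back at its unchanged source.
view_description = {
    "@type": "Dataset",
    "name": "Salary totals by office (filtered view)",
    "parentDataset": "2011-Report-to-Congress-on-White-House-Staff",
}

print(json.dumps(view_description, indent=2))
```

The point is that the source rows are left untouched; only the
presentation changes, and the description points back at the original.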
>
>
>
> > 2) Would it make sense to describe an API on top of a dataset instead
> > of simply a dataset?
>
>
> This is a very important question. It would be reasonable to allow a
> Dataset distribution to be either a data download, a web service, or a
> feed, as in DCAT, *if* there were a straightforward mapping to
> schema.org types and properties. However, schema.org does not have an
> equivalent of DCAT's Distribution class (which is a superclass of
> Download, WebService, and Feed), and I don't even see a proposal for
> feed or web service types. That means that in order to allow the
> distribution property to point to any of the three types of resources,
> either schema.org would need to allow multiple types in the range of a
> property, or we would have to add four new types to schema.org just
> for distributions. Alternatively, separate properties could be added
> for feeds and web services. In any case, two additional types would
> need to be added to schema.org. Since those types are relatively
> fundamental, I suspect they would need to be the subject of other,
> individual proposals.
>
>
>
> > 3) Would it make sense to have a type which refers to a view or a
> > dataset? For example, if I have a page that contains a graph that
> > shows the number of people with different salaries at the White House,
> > would it make sense to be able to express to a search engine that the
> > graph is using the "2011-Report-to-Congress-on-White-House-Staff"
> > dataset?
>
This could be a more general derived-from concept in CreativeWork,
although the reason I thought it might make sense to specifically call out
a view or a presentation for a dataset would be to help search engines
cluster results better. For example, when I google "German Shepherd", one
thing I get is a list of images of German Shepherds. If we made it easy for
search engines to recognize charts or visualizations for a specific
dataset, then you could imagine they could do the same thing with results.
If I google "White House Salaries", I would expect a search engine to not
only find "2011-Report-to-Congress-on-White-House-Staff", but also show
images from a number of the visualizations created from it.
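
As a rough illustration of the clustering I have in mind, here is a small
Python sketch. The page records, the example.org URLs, and the
"derivedFrom" key are all hypothetical; this is not how any search engine
actually works, just the shape of the idea.

```python
from collections import defaultdict

# Hypothetical crawl results; the URLs and the "derivedFrom" key are
# illustrative assumptions, not real markup or real pages.
pages = [
    {"url": "http://example.org/salary-chart", "kind": "chart",
     "derivedFrom": "2011-Report-to-Congress-on-White-House-Staff"},
    {"url": "http://example.org/salary-map", "kind": "map",
     "derivedFrom": "2011-Report-to-Congress-on-White-House-Staff"},
    {"url": "http://example.org/unrelated", "kind": "article"},
]


def cluster_by_dataset(records):
    """Group page URLs by the dataset each page says it was derived from."""
    clusters = defaultdict(list)
    for record in records:
        source = record.get("derivedFrom")
        if source is not None:  # pages without the annotation are skipped
            clusters[source].append(record["url"])
    return dict(clusters)


clusters = cluster_by_dataset(pages)
for dataset, urls in clusters.items():
    print(dataset, urls)
```

With that kind of annotation, the visualizations could be shown next to
the dataset result, much like image results are today.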
Thanks,
--Will
>
>
> This looks like another use for the derivedFrom property you suggested
> above.
>
>
> Best regards,
>
> Joshua
>
>
> >
> >
> >
> > Thanks,
> > --Will
>
Received on Wednesday, 18 July 2012 07:29:14 UTC