Re: Updated Schema Architypes Straw Man Proposal

Hi Jane,

Comments inline.

On 17 May 2017 at 14:52, Jane Stevenson <Jane.Stevenson@jisc.ac.uk> wrote:

> Hi Richard,
>
> Great, thanks.
>
> Just looking at the #Archive Collection for the time being:
>
> 1. Overall this tallies with my attempts - so I’m pleased that I seem to
> be going in the right direction.
>

Good to hear ;-)


>
> 2. can you just clarify for me the syntax re. the creator
>
> schema:creator [ a schema:Person ;
>             schema:name "Ronnie Barker" ;
>             schema:sameAs <http://viaf.org/viaf/2676198> ] ;
>
> You would do this every time you introduce Types?  So you might do it if
> you had, for example,  schema:publisher [a schema: Organization
>

This depends very much on what information you have about the creator.  For
example, if you had a separate description of that Person available, with
its own identifying URI, you would just use that.

e.g.  schema:creator <https://archiveshub.jisc.ac.uk/data/person/1234>

Or you could use an external authoritative source such as
<http://viaf.org/viaf/2676198> or <http://www.wikidata.org/entity/Q963893>.

Note the syntax referenced in the examples here is Turtle, which I used for
clarity when displaying the whole of the example model.  It would be worth
checking out the JSON-LD examples
<https://www.w3.org/community/architypes/wiki/Initial_model_proposal#Page_for_a_sub-collection_in_the_archive>
of what would be inserted into the individual html pages for search engines
to crawl.
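
As a minimal sketch of the two options in JSON-LD terms, the Python below
builds the same collection description twice: once with the creator described
inline, and once with the creator referenced by URI.  The collection URI and
identifier are taken from this thread; the person URI is the illustrative one
above, and the helper name collection() is hypothetical.  ArchiveCollection is
the type proposed in the wiki, not an established schema.org type.

```python
import json

# Option 1: the creator described inline, mirroring the Turtle example.
creator_inline = {
    "@type": "Person",
    "name": "Ronnie Barker",
    "sameAs": "http://viaf.org/viaf/2676198",
}

# Option 2: the creator referenced by a URI that is described elsewhere.
# This is the illustrative URI from the message, not a known real record.
creator_by_reference = {"@id": "https://archiveshub.jisc.ac.uk/data/person/1234"}


def collection(creator):
    """Build a minimal JSON-LD description of the collection (hypothetical helper)."""
    return {
        "@context": "http://schema.org",
        "@type": "ArchiveCollection",  # type proposed in the wiki
        "@id": "https://archiveshub.jisc.ac.uk/data/gb71-thm/407",
        "identifier": "GB71 THM/407",
        "creator": creator,
    }


print(json.dumps(collection(creator_inline), indent=2))
print(json.dumps(collection(creator_by_reference), indent=2))
```

Either form should be acceptable; the inline form is useful when the page is
the only place the person is described.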


>
> 3. just a small error really, but schema: identifier should be GB71
> THM/407  and not 407/8. You’ve put temporal coverage as 1954-2005 but its
> actually 1929-2005 for the whole collection.
>

In my example, I made the Audio Recordings
<https://archiveshub.jisc.ac.uk/search/archives/e3b10224-3672-33e6-99d3-9ad8e1fa8598?component=c7174483-794b-35d7-8546-3a7417e78879>
part of the collection a sub-collection of the whole collection, in which
case the *identifier* and *temporalCoverage* values are as per the
relevant HTML page.

However, I did get the temporalCoverage wrong for the main collection, which
I have now corrected.


> 4. hasPart
>
> This particular collection has something like 500 parts (series, sub
> series, items). To my mind it is not generally going to be practical to use
> ‘has part’ in this way. I don’t think that matters, as I guess the
> principle is that you can indicate parts of a whole if you wish to. If I
> did want to do this, would it be:
>
> schema:hasPart "https://archiveshub.jisc.ac.uk/data/gb71-thm/407/thm/407/1"
> ;
> schema:hasPart "https://archiveshub.jisc.ac.uk/data/gb71-thm/407/thm/407/2"
> ;
> schema:hasPart "https://archiveshub.jisc.ac.uk/data/gb71-thm/407/thm/407/3"
> ;
> schema:hasPart "https://archiveshub.jisc.ac.uk/data/gb71-thm/407/thm/407/4"
> ;
>
> I’m not yet sure of the benefits of listing 500 parts in this way.
>

The benefits would be for the search engines to gain a detailed
understanding of the relationship between the collection and the items it
contains.

However, I agree that in practice it would probably not be practical to
list all 500 in the JSON-LD insert on the collection page.  In that case,
using *isPartOf* in the description of each *ArchiveItem* would be
sufficient to assert the relationship to a search engine:

"isPartOf": "https://archiveshub.jisc.ac.uk/data/gb71-thm/407/thm/407/8"
(JSON-LD syntax)
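
To illustrate, here is a rough sketch of generating the per-item JSON-LD so
that each item points back at its parent, rather than the parent listing
every part.  The item URIs are invented for illustration, item_description()
is a hypothetical helper, and ArchiveItem is the type proposed in the wiki.
The sketch writes the value as an explicit {"@id": ...} reference instead of
the plain string shown above; that is a choice made here to make clear the
value is a URI and not text.

```python
import json

# The sub-collection URI used in the message above.
PARENT_URI = "https://archiveshub.jisc.ac.uk/data/gb71-thm/407/thm/407/8"


def item_description(item_uri, parent_uri=PARENT_URI):
    """Describe one item and point it back at the collection that holds it."""
    return {
        "@context": "http://schema.org",
        "@type": "ArchiveItem",  # type proposed in the wiki
        "@id": item_uri,
        "isPartOf": {"@id": parent_uri},
    }


# Hypothetical item URIs, invented for illustration only.
item_uris = [f"{PARENT_URI}/{n}" for n in range(1, 4)]
descriptions = [item_description(uri) for uri in item_uris]

print(json.dumps(descriptions[0], indent=2))
```

Each item page would carry only its own small description, so the size of
the collection page insert does not grow with the number of parts.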



> 5. extent
>
> I definitely don’t want to include all the descriptive information within
> the schema.org representation, but I would tend to include the size of
> the collection as core information. At present I don’t think there is a
> property that we could use for this?
>

Extent was an issue that caused much discussion in the bibliographic
extension work, as there was much variation in what extent was used for and
what it meant.  It was therefore never recommended as a property, and the
use of *description* was recommended instead.

Potentially, with ArchiveCollection we could propose a property to describe
the size of a collection.  Property names that come to mind include
*collectionSize*, *itemQuantity* and *collectionExtent*, with expected types
of *Text* and *Integer*.  Would usage across archives be consistent enough
to support a general property such as this?


>
> 6. archiveHeld
>
> Could this description include:
>
> schema:archiveHeld "V&A Theatre and Performance Collections”
>

Whoops! - I missed out that property in the examples - now corrected:

"archiveHeld": "https://archiveshub.jisc.ac.uk/data/gb71-thm/407",



> 7. Language
>
> From what I gather, to be compliant we would have to use ISO639-1 codes?
> i.e. inLanguage: “EN” and not “eng”? All of our descriptions use ISO 639-2
> so its a shame if we can't use them!
>

 That’s the trouble with standards - there are so many to choose from! ;-)

Schema.org inLanguage <http://schema.org/inLanguage> encourages the use of
BCP47 <http://tools.ietf.org/html/bcp47>, which, as you indicate, is
ISO 639-1 based.  It is the generic standard used across many domains and
across the web and HTML.  However, there are many domains that, like yours,
do not yet use that standard.  This is an area where data consumers (search
engines) almost certainly apply Postel's Law
<https://en.wikipedia.org/wiki/Robustness_principle> and would probably
recognise your language codes.
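
If you did want to emit the two-letter form without changing your source
descriptions, a conversion at output time is one option.  The sketch below
assumes a small hand-written lookup table covering only a few languages; a
real table would be built from the full ISO 639 code lists.  The names
ISO639_2_TO_1 and to_bcp47() are hypothetical.

```python
# Illustrative subset only: ISO 639-2 three-letter codes (both the
# bibliographic and terminologic variants) mapped to ISO 639-1 codes.
ISO639_2_TO_1 = {
    "eng": "en",
    "fre": "fr",
    "fra": "fr",
    "ger": "de",
    "deu": "de",
    "wel": "cy",
    "cym": "cy",
    "lat": "la",
}


def to_bcp47(code):
    """Return the two-letter code where the table has one.

    Codes not in the table are passed through unchanged (lower-cased),
    which suits languages that have no two-letter code at all.
    """
    normalised = code.strip().lower()
    return ISO639_2_TO_1.get(normalised, normalised)


for source_code in ["eng", "fre", "sco"]:
    print(source_code, "->", to_bcp47(source_code))
```

The pass-through behaviour is deliberate: a language with no two-letter code
keeps its three-letter code, which is also what BCP47 expects.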


> 8. Aboutness
>
> Finally, one of the things I assumed with schema.org is that it would be
> useful to include what the archive is about. So I thought about using e.g:
>
> schema:about “Comedy”
> schema:about “Television comedy"
>
> I was thinking in terms of discoverability. What do you think about adding
> subjects/people/places in this way?
>

Most definitely!

I have added a couple of *about* references in the examples.  I used text
values, but equally they could have been URIs for the concepts, person, etc.
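
As a small sketch of the two styles side by side, the Python below mixes
plain text subjects with a person referenced by URI in one *about* list.  The
text values are the ones from your message and the Wikidata URI is the one
mentioned earlier for the creator; using it as a subject here is only an
assumption for illustration, as is the helper name with_about().

```python
import json

# Subjects as plain text values, as suggested in the quoted message.
about_as_text = ["Comedy", "Television comedy"]

# A subject given as a node with a URI, so consumers can identify it exactly.
about_as_node = {
    "@type": "Person",
    "@id": "http://www.wikidata.org/entity/Q963893",
    "name": "Ronnie Barker",
}


def with_about(subjects):
    """Build a minimal collection description carrying the given subjects."""
    return {
        "@context": "http://schema.org",
        "@type": "ArchiveCollection",  # type proposed in the wiki
        "@id": "https://archiveshub.jisc.ac.uk/data/gb71-thm/407",
        "about": list(subjects),
    }


description = with_about(about_as_text + [about_as_node])
print(json.dumps(description, indent=2))
```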

~Richard


>
> cheers
> Jane
>
>
>
>
> > On 16 May 2017, at 13:26, Richard Wallis <richard.wallis@dataliberate.com> wrote:
> >
> > Hi all,
> >
> > Following discussions on the mailing list and taking into account
> > general evolution of the schema.org vocabulary over recent months, I have
> > produced an updated version of the straw man initial proposal in the Wiki.
> >
> > ~Richard.
> > Richard Wallis
> > Founder, Data Liberate
> > http://dataliberate.com
> > Linkedin: http://www.linkedin.com/in/richardwallis
> > Twitter: @rjw
>

Received on Wednesday, 17 May 2017 21:34:42 UTC