Re: Tweaks to the Archives proposal [via Schema Architypes Community Group] from Richard Wallis on 2017-07-12 (public-architypes@w3.org from July 2017)

From: Richard Wallis <richard.wallis@dataliberate.com>
Date: Wed, 12 Jul 2017 13:08:41 +0100
To: Owen Stephens <owen@ostephens.com>
Cc: public-architypes <public-architypes@w3.org>, Jane Stevenson <Jane.Stevenson@jisc.ac.uk>
Message-ID: <CAD47Kz6+P_900j11eGAr3VVr1mw-yG8bLiNs0hwLG-f7qLo_bg@mail.gmail.com>
Hi Owen,

Thanks for your input - never too late until the proposal is submitted and
adopted :-)

I feel that by focusing on Jane’s particular use case we are looking at an
edge case, albeit a substantial one in Jane’s situation, and I want to be
careful we don’t skew our proposal away from potentially broader generic
cases.

In describing archive collections and the things within them our objective
is to make them more discoverable widely on the web.  I believe that in
general non-archivists understand the concept of an Archive holding
organisation which is responsible for collection(s) that contain things or
items.  Reflecting that simplistic understanding was one of the starting
points for the proposed model.

As to your particular points :

*The naming of a Type as ArchiveItem or ArchiveObject*. Checking some
online definitions I see that object and item are listed as synonyms of
each other; however, I still feel that the description of item (*an
individual article or unit, especially one that is part of a list,
collection, or set.*) is closer to what we are trying to express than that
of object (*a material thing that can be seen and touched*).

*Effectively merging the properties of ArchiveCollection and ArchiveItem.*
Using other types to define not only whether it is a collection or not, but
also what type of item it is, may, I believe, be pushing the multi-type
generic capabilities of Schema.org a little too far to be understandable to
implementing archivists.  It has many similarities to my original proposal,
where *ArchiveCollection* was made a subtype of both *Collection* and
*ArchiveItem*, an approach which, although logically correct, caused much
discussion and confusion early on.
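
As a rough illustration only of the two shapes being compared, here is a
small Python sketch that builds JSON-LD-style dictionaries.  The function
names are made up for this sketch, ArchiveObject is just Owen's working
name, and none of the Archive* types are confirmed Schema.org terms at
this point:

```python
def current_proposal(name, is_collection, base_type="CreativeWork"):
    """Markup under the proposal as it stands (types as named in this thread).

    A collection gets the dedicated ArchiveCollection type; anything else is
    typed as some existing type plus ArchiveItem.
    """
    if is_collection:
        return {"@type": "ArchiveCollection", "name": name}
    return {"@type": [base_type, "ArchiveItem"], "name": name}


def merged_alternative(name, base_type):
    """Markup under the merged approach: one archival type alongside any other.

    "ArchiveObject" is a placeholder name, not an agreed term.
    """
    return {"@type": [base_type, "ArchiveObject"], "name": name}


if __name__ == "__main__":
    print(current_proposal("Family papers", is_collection=True))
    print(current_proposal("A single letter", is_collection=False))
    print(merged_alternative("Family papers", "Collection"))
```

In the merged form the reader has to inspect the second type in the list to
learn whether the thing is a collection, which is the part I suspect would
be hard for implementers to follow.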

So going back to the proposal as it currently stands, it works well when
you know what you are describing - an item or collection of items held by
an organisation.

Where it is difficult is when you don’t know what you are describing.  What
do you default to?  The two options are a collection (which would be wrong
if it is, for example, an individual document) or an item (which would be
wrong if, for example, it was a folder containing several yet-to-be-described
items).  Whichever of these is chosen, it will be wrong some of the time.

My thoughts are that it should be up to the describing organisation to
decide, based on probabilities within their collection(s), which of these
to choose.  Not ideal, but I believe preferable to creating a fuzzy type
that would work for either case but lose useful specificity when what is
being described is known to be an individual item or a collection of items.
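
A minimal sketch of that decision, in Python, might look like the
following.  It assumes the rule Jane describes further down the thread (a
level value of "item" or "piece" marks a single item) and lets each
organisation set its own fallback; the function names and the bare-bones
output are illustrative only, and the Archive* types are the ones proposed
here rather than established Schema.org terms:

```python
import json

# Level values treated as evidence of a single item ("item" or "piece", as
# suggested in this thread). Anything else falls back to the default.
ITEM_LEVELS = {"item", "piece"}


def choose_archive_type(level, default_type="ArchiveCollection"):
    """Return the proposed type name for one unit of description.

    `level` is the optional level value of the description; `default_type`
    is the describing organisation's choice for the unknown case.
    """
    if level and level.strip().lower() in ITEM_LEVELS:
        return "ArchiveItem"
    return default_type


def describe(name, level=None, default_type="ArchiveCollection"):
    """Build a minimal JSON-LD-style dictionary for a unit of description."""
    return {
        "@context": "http://schema.org",
        "@type": choose_archive_type(level, default_type),
        "name": name,
    }


if __name__ == "__main__":
    # Known item: the level value says so.
    print(json.dumps(describe("Account book", level="Item"), indent=2))
    # Unknown: no level value, so the organisation's default applies.
    print(json.dumps(describe("One folder"), indent=2))
```

An archive whose holdings are mostly single documents could pass
default_type="ArchiveItem" instead; either way the fallback will be wrong
some of the time, as noted above.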

~Richard.






Richard Wallis
Founder, Data Liberate
http://dataliberate.com
Linkedin: http://www.linkedin.com/in/richardwallis
Twitter: @rjw

On 12 July 2017 at 11:56, Owen Stephens <owen@ostephens.com> wrote:

> Hi all,
>
> I’ve not had the time to contribute to this discussion so much, and it’s
> good to see some practical progress, but this latest point has brought me
> back to some slight unhappiness with the structure of the current proposal
> and the use of ArchiveItem. Apologies if this is either too late, or I’ve
> missed how the model has developed over the last few months. I’m looking at
> https://www.w3.org/community/architypes/wiki/Initial_model_proposal
>
> As I understand it, the current proposal has ArchiveCollection as a
> subtype of Collection (which is a CreativeWork), while ArchiveItem is an
> intangible, and intended to be applied alongside other types (such as
> CreativeWork or Thing) to enable the addition of ArchiveItem properties to
> existing sdc types.
>
> The case that Jane has highlighted here is that it is unknown whether what
> we are looking at is a Collection or a specific Item.
>
> In this case, giving something that may be a collection or may be an item
> the type ArchiveCollection, seems wrong - it suggests a level of
> specificity we don’t know.
> Also it seems to me that giving it a type ArchiveItem doesn’t imply it is
> actually a specific item - because ArchiveItem can be applied alongside
> other types (presumably including ArchiveCollection).
>
> So it would make more sense to me in this case to state that the thing is
> a Thing or CreativeWork, with an additional type of ArchiveItem - this
> doesn’t imply it is either a single item or a collection, it would leave
> this open to question - which seems to me to reflect the reality of the
> situation.
>
> Trying to draw this up into the modelling of archives in scd, the question
> it brings me to - is what is the advantage of splitting archival properties
> between ArchiveCollection and ArchiveItem? Why not bundle all the
> properties (there aren’t that many) into a single type based on intangible
> (taking the current ArchiveItem approach) - I’ll call it ‘ArchiveObject’
> for now. When you know you have a Collection you apply type of Collection
> and ArchiveObject, and when you have a CreativeWork you apply type of
> CreativeWork and ArchiveObject etc.
>
> At the moment applying ArchiveCollection when you aren’t sure whether it
> is actually a Collection seems wrong to me. If there is any ambiguity then
> I think you can apply ArchiveItem (you know it is in an Archive) but you
> can’t assert Collection.
>
> Owen
>
> Owen Stephens
> Owen Stephens Consulting
> Web: http://www.ostephens.com
> Email: owen@ostephens.com
> Telephone: 0121 288 6936
>
> > On 12 Jul 2017, at 11:29, Jane Stevenson <Jane.Stevenson@jisc.ac.uk>
> wrote:
> >
> > Hi Richard,
> >
> > Yes, we are an awkward case! But at least we then bring benefits to over
> 300 repositories when we implement schema.org.
> >
> >> As to your A/B decision, I can only suggest from a non archivist point
> of view, but if something has already been identified in some way as an item
> or piece, it would be worth reflecting that in the description shared with
> the web (using the ArchiveItem type), then defaulting, in your case, to
> ArchiveCollection where this is not known.
> >>
> > Perfect - I was going to go with that, as I’m thinking be accurate where
> you can be accurate.
> >
> >> A minor syntax point:  The convention within Schema.org is for the
> names of Types to begin with an uppercase letter (Archive,
> ArchiveCollection, ArchiveItem)  and properties with a lowercase
> (itemLocation, holdingArchive, accessConditions, etc.).   I know we are
> only in discussion mode, but looking back on this documentation it can be
> confusing for some if we don’t follow these conventions here as well as in
> the type definitions etc.
> >
> > Thanks. I may have been a bit inconsistent with this….but we’ll ensure
> we implement it correctly.
> >
> > OK….we’ll crack on then.
> >
> > Thanks to all - the discussion has been really useful.
> >
> > cheers,
> > Jane
> >
> >
> > Jane Stevenson
> > Archives Hub Service Manager
> > jane.stevenson@jisc.ac.uk
> > (Work days: Monday to Thursday)
> >
> > Tel: 0161 413 7555
> > Web: archiveshub.jisc.ac.uk
> > Skype:  janestevenson
> > Twitter: @archiveshub, @janestevenson
> >
> >
> >
> >> On 12 Jul 2017, at 10:20, Richard Wallis <richard.wallis@dataliberate.com> wrote:
> >>
> >> Thanks Jane for your insight into the issues surrounding this within
> Archives Hub.  As effectively an aggregator of archives this provides a
> test of the model at one end of the spectrum of use cases we are looking to
> satisfy.
> >>
> >> As you say, from the information you are provided with you may not know
> if something being described is a collection or a single item.  Also it is
> unlikely that you would know if a single item is located with the rest of
> the collection or not.
> >>
> >> Those responsible for other individual archives may well be very clear
> on these things for their collections.  Hopefully we are in a position to
> satisfy the broad spectrum of use cases with this proposal.
> >>
> >> As to your A/B decision, I can only suggest from a non archivist point
> of view, but if something has already been identified in some way as an item
> or piece, it would be worth reflecting that in the description shared with
> the web (using the ArchiveItem type), then defaulting, in your case, to
> ArchiveCollection where this is not known.
> >>
> >> If there are no further discussion points from the group, I intend in
> the next couple of weeks to forward this proposal to the Schema.org group
> for consideration.
> >>
> >> A minor syntax point:  The convention within Schema.org is for the
> names of Types to begin with an uppercase letter (Archive,
> ArchiveCollection, ArchiveItem)  and properties with a lowercase
> (itemLocation, holdingArchive, accessConditions, etc.).   I know we are
> only in discussion mode, but looking back on this documentation it can be
> confusing for some if we don’t follow these conventions here as well as in
> the type definitions etc.
> >>
> >> ~Richard.
> >>
> >>
> >>
> >>
> >>
> >> Richard Wallis
> >> Founder, Data Liberate
> >> http://dataliberate.com
> >> Linkedin: http://www.linkedin.com/in/richardwallis
> >> Twitter: @rjw
> >>
> >> On 12 July 2017 at 09:36, Jane Stevenson <Jane.Stevenson@jisc.ac.uk>
> wrote:
> >> Hi Richard,
> >>
> >>> It would work to describe a collection of one or more things. However,
> if you have a known physical item (book, article, photograph, etc) or file
> (video, audio, image, web page, etc.) why would you not describe it as such?
> >>
> >> This is the nub of the matter….it is because we won’t always know. We
> can definitely decide that if the level is described as “item” we apply the
> archiveItem type. But (1) levels are not always given values - although on
> the Hub we do ask for this, but in general, within EAD, values are not
> mandatory (2) You can have a level that is a sub-series, or a folder or a
> file that is effectively one physical item, but the level value does not
> identify this. Archivists will describe ‘one folder’ but it may have one
> item in it.  Is something described as ‘one folder’ an item? Should ‘one
> box’ always be treated as a collection of items, although it may only have
> one item in it , e.g. an account book is a sub-series in one box.
> >>
> >> It is maybe possible for an individual repository to sort out single
> item descriptions  from ‘more than one item’ descriptions, but it’s not
> possible for us to do that in an automated way across all our data. People
> aren’t consistent enough with cataloguing for that, and to be fair, the
> standards have never emphasised the importance of distinguishing one
> physical item in this way.
> >>
> >>> This comes back to describing information about an individual item.
> Potentially the ArchiveCollection the item is part of could be held by an
> organisation (Archive), yet an individual item could be located, on
> extended loan for example, at a different location.
> >>
> >>
> >> OK. I get the logic. It is just quite rare for that to happen, unlike
> museums. And if it was temporarily elsewhere, we wouldn’t know. Something
> on loan would not be flagged as such in the description. But that’s OK - we
> would always just use the repository as the holding institution, so
> itemLocation, if we use it, would always have the same value as
> holdingArchive. If an item was on loan it simply wouldn’t show up in our
> schema.org data.  I don’t think that matters. As you say, it’s optional
> anyway.
> >>
> >> I think we’re ready to go now. I just have to decide on either
> >>
> >> A. Always use archiveCollection, including for items, because we can’t
> distinguish all items anyway
> >> B. use archiveItem where we have a level value of “item” or “piece”,
> which will give us a majority of items (my estimate is that we would get
> something like 70% of single entities this way), but it will be the case
> that a fair number of items won’t be described as items because they don’t
> have that level value, even if they are single physical entities, so they
> will be single physical items but described as type archiveCollection.
> >>
> >> cheers,
> >> Jane.
> >>
> >>
> >> Jane Stevenson
> >> Archives Hub Service Manager
> >> jane.stevenson@jisc.ac.uk
> >> (Work days: Monday to Thursday)
> >>
> >> Tel: 0161 413 7555
> >> Web: archiveshub.jisc.ac.uk
> >> Skype:  janestevenson
> >> Twitter: @archiveshub, @janestevenson
> >>
> >>
> >>
> >>> On 11 Jul 2017, at 17:26, Richard Wallis <richard.wallis@dataliberate.com> wrote:
> >>>
> >>> Hi Jane,
> >>>
> >>> Sorry for being slow in responding.
> >>>
> >>> Answers inline.
> >>>
> >>> ~Richard.
> >>>
> >>>
> >>> On 3 July 2017 at 07:48, Jane Stevenson <Jane.Stevenson@jisc.ac.uk>
> wrote:
> >>> Hi Richard and everyone,
> >>>
> >>> If I decided to only use #archiveCollection for all of the units of
> description, would that work?  We don’t necessarily know if units described
> are single items or more than one item anyway, and it seems to me we can
> effectively describe each unit with the properties now provided, which is
> the main thing. So my question is, why would I need to use #archiveItem?
> >>>
> >>> It would work to describe a collection of one or more things. However,
> if you have a known physical item (book, article, photograph, etc) or file
> (video, audio, image, web page, etc.) why would you not describe it as such?
> >>>
> >>>
> >>> Just one more question…. we have properties archiveHeld and
> holdingArchive, and we also have itemLocation. How is itemLocation
> different from holdingArchive? In the example, for Ronnie Barker,
> itemLocation is given as the V&A Theatre & Performance Archive (URL). But
> surely the property of holdingArchive would do just as well.
> >>>
> >>> This comes back to describing information about an individual item.
> Potentially the ArchiveCollection the item is part of could be held by an
> organisation (Archive), yet an individual item could be located, on
> extended loan for example, at a different location.
> >>>
> >>> All properties within Schema.org are optional, so you probably would
> only provide an itemLocation when an item is located separate from the
> holdingArchive of the ArchiveCollection of which it is part.
> >>>
> >>> ~Richard.
> >>>
> >>>
> >>> cheers
> >>> Jane
> >>>
> >>> Jane Stevenson
> >>> Archives Hub Service Manager
> >>> jane.stevenson@jisc.ac.uk
> >>>
> >>
> >>
> >
>
>
Received on Wednesday, 12 July 2017 12:09:17 UTC