Re: Tweaks to the Archives proposal [via Schema Architypes Community Group]

From: Richard Wallis <richard.wallis@dataliberate.com>
Date: Wed, 12 Jul 2017 17:42:30 +0100
Message-ID: <CAD47Kz7ds85cRd=Gvo08E9bVTo-0mj=s7RoTntQ23_-9pYHoMg@mail.gmail.com>
To: "Roke, Elizabeth Russey" <erussey@emory.edu>
Cc: Adrian Stevenson <Adrian.Stevenson@jisc.ac.uk>, "owen@ostephens.com" <owen@ostephens.com>, public-architypes <public-architypes@w3.org>, Jane Stevenson <Jane.Stevenson@jisc.ac.uk>
Thanks Elizabeth.

With regard to extent, there was some discussion on how that would be
provided in some consistent way.  One option floated was to provide a
collectionSize property which would take a Number or a Text as a value.
However I seem to remember there was some inconclusive discussion around
what values are normally found for extent and how they would fit with such
a property.

I would be happy to propose such a property for *ArchiveCollection*, or
even to add it to its super-type *Collection*, if we can agree on its
proposed use/contents.
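
As a rough illustration of the option above, here is a minimal Python sketch
that builds a JSON-LD description with a collectionSize value given either as
a number or as text. Only the property name and the ArchiveCollection type
come from this discussion; the helper describe_collection, the collection
name, and the example values are hypothetical, and none of this is an agreed
part of the proposal.

```python
import json
from typing import Union


def describe_collection(name: str, collection_size: Union[int, float, str]) -> dict:
    """Build a JSON-LD dict for an ArchiveCollection (hypothetical helper).

    collection_size may be a Number (e.g. 2000) or Text (e.g. "1 box"),
    mirroring the option floated in the discussion.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(collection_size, bool) or not isinstance(collection_size, (int, float, str)):
        raise TypeError("collectionSize must be a number or text")
    return {
        "@context": "https://schema.org",
        "@type": "ArchiveCollection",
        "name": name,
        "collectionSize": collection_size,
    }


# The same collection described with a numeric and a textual extent.
numeric = describe_collection("Example papers", 2000)
textual = describe_collection("Example papers", "1 box")
print(json.dumps(numeric, indent=2))
print(json.dumps(textual, indent=2))
```

Allowing both value kinds keeps free-text extents usable, at the cost of
consumers having to handle two shapes for the same property.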

As to the current discussion…

We seem to be striving for a Type to represent something that is at the
“lowest level of description”, which may or may not be an individual item,
but is part of an ArchiveCollection.   To me this could probably be a
super-type for the current *ArchiveItem*.

Names that come to mind for such a Type are *ArchiveComponent*,
*ArchiveObject*, *ArchiveElement*.   The properties for such a type could
include *accessConditions* and *isPartOf*, with the more item-specific
properties (*hasPart*, *itemCondition*, *itemLocation*) appearing only on
the ArchiveItem subtype.
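
To make the split of properties concrete, here is a small Python sketch of
the super-type/subtype arrangement described above. It uses ArchiveObject as
the working name for the super-type (one of the three candidates) and plain
strings for all values; the example records are invented, and this is only a
way of picturing the idea, not a definition of the types.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ArchiveObject:
    """Working name for the super-type: anything described as part of a collection."""
    name: str
    accessConditions: Optional[str] = None
    isPartOf: Optional[str] = None  # name of the parent collection or component


@dataclass
class ArchiveItem(ArchiveObject):
    """Subtype for a specific identifiable thing; adds the item-specific properties."""
    hasPart: List[str] = field(default_factory=list)
    itemCondition: Optional[str] = None
    itemLocation: Optional[str] = None


# Invented examples: a folder of unknown contents, and a known single letter.
folder = ArchiveObject(name="Folder 3", isPartOf="Series B")
letter = ArchiveItem(name="Letter, 1921", isPartOf="Folder 3", itemCondition="Good")

# An ArchiveItem is also an ArchiveObject, so it carries the shared properties.
print(isinstance(letter, ArchiveObject))
```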

The decision tree in such a case for defining the type of something in an
archive would be:

   1. Is it a collection of things?
      - Yes - use ArchiveCollection
      - No - go to next step

   2. Is it a specific identifiable thing (book, file, video, etc.)?
      - Yes - use ArchiveItem plus the specific schema.org type for the
      item
      - No - use ArchiveObject
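
The two steps above can be sketched as a small Python function. The function
name choose_archive_types and its parameters are hypothetical, and
ArchiveObject is again used as the working name for the proposed super-type;
how a describing organisation actually answers the two questions is outside
this sketch.

```python
from typing import List, Optional


def choose_archive_types(
    is_collection: bool,
    is_identifiable_item: bool,
    item_type: Optional[str] = None,
) -> List[str]:
    """Return the type(s) to apply, following the two-step decision tree."""
    # Step 1: a collection of things.
    if is_collection:
        return ["ArchiveCollection"]
    # Step 2: a specific identifiable thing (book, file, video, etc.).
    if is_identifiable_item:
        types = ["ArchiveItem"]
        if item_type:
            # Add the specific schema.org type for the item, when known.
            types.append(item_type)
        return types
    # Neither: lowest level of description, nature unknown.
    return ["ArchiveObject"]


print(choose_archive_types(True, False))
print(choose_archive_types(False, True, "Book"))
print(choose_archive_types(False, False))
```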


If that works we would have to be careful how we describe such a type as I
would imagine many outside the archives community would not understand the
concept of “lowest level of description”.

~Richard.



Richard Wallis
Founder, Data Liberate
http://dataliberate.com
Linkedin: http://www.linkedin.com/in/richardwallis
Twitter: @rjw

On 12 July 2017 at 15:47, Roke, Elizabeth Russey <erussey@emory.edu> wrote:

> Richard—
> I thought we were going to add an extent property for ArchiveCollection to
> the proposal.  My group feels quite strongly that this will be necessary to
> appropriately represent the overall size of a collection to users.  It
> makes a big difference if a collection is 1 box or 2000 when people are
> deciding how relevant a collection will be.
>
> As to the other discussion going on, I agree that this is far from an edge
> case.  Because of the way archival description works, we can only
> consistently say that something is at the lowest level of description, not
> that it is an item or not, especially if we’re serializing data from
> archival management systems such as ArchivesSpace.  I wonder if we might
> have some sort of type like ArchiveComponent to express everything beneath
> an ArchiveCollection.  People should be free to use ArchiveItem or the
> specific type for individual objects, but in my opinion we need something
> that expresses the described object is part of a collection.  The way this
> is structured now I think is confusing.  For example, if I have a folder
> that is part of a series that is part of a collection, I would describe all
> of these as collections?  It’s applying a bibliographic understanding (that
> the core level of description is an individual item) when we should be
> using archival approaches.
>
> Elizabeth
>
> ___________________________
> Elizabeth Russey Roke
> Digital Archivist
> Stuart A. Rose Manuscript, Archives, & Rare Book Library
> 404.727.2345 | erussey@emory.edu
>
>
>
>
> "The Stuart A. Rose Manuscript, Archives, & Rare Book Library collects and
> connects stories of human experience, promotes access and learning, and
> offers opportunities for dialogue for all wise hearts who seek knowledge.”
>
> Read the Rose Library blog: https://scholarblogs.emory.edu/marbl/
>
> Like the Rose Library on Facebook:  https://www.facebook.com/emorymarbl
>
> Follow the Rose Library on Twitter: https://twitter.com/EmoryRoseMARBL
>
> On 7/12/17, 9:37 AM, "Adrian Stevenson" <Adrian.Stevenson@jisc.ac.uk>
> wrote:
>
>     I’m sure Jane will chip in again when she gets a chance, but I just
> wanted to back up the point that what’s described for the Hub relates to
> the archival data as created typically at institutions and is far from
> being an edge case. I suspect it may be considered standard practice so the
> proposal will need to support this adequately.
>
>     Adrian
>     _____________________________
>     Adrian Stevenson
>     Senior Technical Coordinator
>     Jisc Manchester
>     6th Floor, Churchgate House
>     56 Oxford Street
>     Manchester
>     M1 6EU
>
>     Email: adrian.stevenson@jisc.ac.uk
>     Tel: +44 (0) 161 413 7561
>     http://www.twitter.com/adrianstevenson
>     http://uk.linkedin.com/in/adrianstevenson/
>
>     > On 12 Jul 2017, at 14:22, Richard Wallis <
> richard.wallis@dataliberate.com> wrote:
>     >
>     > Yes Owen, let’s see what that proposal might look like and compare
> the two.
>     >
>     > ~Richard.
>     >
>     > Richard Wallis
>     > Founder, Data Liberate
>     > http://dataliberate.com
>     > Linkedin: http://www.linkedin.com/in/richardwallis
>     > Twitter: @rjw
>     >
>     > On 12 July 2017 at 14:09, Owen Stephens <owen@ostephens.com> wrote:
>     > On 12 Jul 2017, at 13:08, Richard Wallis <
> richard.wallis@dataliberate.com> wrote:
>     >>
>     >> Hi Owen,
>     >>
>     >> Thanks for your input - never too late until the proposal is
> submitted and adopted :-)
>     >>
>     >> I feel that by focusing on Jane’s particular use case we are
> looking at an edge case, in Jane’s situation a nevertheless substantial
> one, but I want to be careful we don’t skew our proposal away from
> potentially broader generic cases.
>     >
>     > I don’t agree that this is an edge case - or at least I don’t think
> there is any evidence that it is. If this is the data that Archives Hub is
> getting, then I think it is reasonable to assume that this is the data that
> is available for many archives - and in that case many archives will be
> faced with the same issue of not knowing, on any automated basis, whether a
> particular description is a collection or an individual item
>     >
>     >> In describing archive collections and the things within them our
> objective is to make them more discoverable widely on the web.  I believe
> that in general non-archivists understand the concept of an Archive holding
> organisation which is responsible for collection(s) that contain things or
> items.  Reflecting that simplistic understanding was one of the starting
> points for the proposed model.
>     > I agree with this - I don’t think that anything I suggested went
> against this (or at least it wasn’t my intention)
>     >
>     >>
>     >> As to your particular points :
>     >>
>     >> The naming of a Type as ArchiveItem or ArchiveObject. Checking some
> online definitions I see that object and item are both synonyms of each
> other, however I still feel that the description of item (an individual
> article or unit, especially one that is part of a list, collection, or
> set.) is closer to what we are trying to express than that of object (a
> material thing that can be seen and touched)
>     > Fair enough - I was trying to find a term that didn’t suggest either
> an item or a collection - I obviously failed!
>     >
>     >>
>     >> Effectively merging the properties of ArchiveCollection and
> ArchiveItem. Using other types to define not only if it is a collection or
> not, but also what type of item it is, I believe may be pushing the
> multi-type generic capabilities of Schema.org a little too far to be
> understandable to implementing archivists.  It has many similarities to my
> original proposal where ArchiveCollection was made a subtype of both
> Collection and ArchiveItem, an approach that, although logically correct,
> caused much discussion and confusion early on.
>     > I can see the potential to create confusion here, but I think this
> already exists in the current proposal which mixes two approaches to adding
> archive properties to a Thing. I think my proposal is simpler in that it
> adopts a single way of doing this. I’m not entirely happy with this (I was
> initially against the use of Intangible type) but I’d argue it is simpler
> as it reduces the number of new types and groups all the relevant
> properties in a single type.
>     >
>     >>
>     >> So going back to the proposal as it currently stands, it works well
> when you know what you are describing - an item or collection of items held
> by organisation.
>     >>
>     >> Where it is difficult is when you don’t know what you are
> describing.  What do you default to?  The two options being a collection
> (which would be wrong if it is for example an individual document) or; an
> item (which would be wrong if for example it was a folder containing
> several as yet to be described items).  Whichever of these are chosen it
> will be wrong some of the time.
>     >>
>     >> My thoughts are that it should be up to the describing organisation
> to decide, based on probabilities within their collection(s), as to which
> of these to choose.  Not ideal, but I believe preferable when compared
> with creating a fuzzy type that would work for either case but lose useful
> specificity when what is being described is known to be an individual item
> or a collection of items.
>     >
>     > While I don’t disagree it’s up to the describing organisation to
> decide (of course), it’s about the decision they are having to make.
>     > I’m proposing that the question of whether it is a Collection or not
> should be separate to whether the thing is in an Archive or not. At the
> moment this seems problematic as you have to decide up front whether you want
> to use ArchiveCollection or ArchiveItem.
>     >
>     > The intent of my proposal was to separate out the question of ‘what’
> it is, from the fact it is in an archive and therefore has a set of archive
> specific properties related to it.
>     >
>     > I’m inclined to write up a proposal using this approach on the wiki
> so I can be a bit more explicit and we can see how the approaches compare.
> Does this seem like a good way forward?
>     > Does anyone else have a comment or view on this?
>     >
>     > Owen
>     >
>     >
>     >>
>     >> ~Richard.
>     >>
>     >>
>     >>
>     >>
>     >>
>     >>
>     >> Richard Wallis
>     >> Founder, Data Liberate
>     >> http://dataliberate.com
>     >> Linkedin: http://www.linkedin.com/in/richardwallis
>     >> Twitter: @rjw
>     >>
>     >> On 12 July 2017 at 11:56, Owen Stephens <owen@ostephens.com> wrote:
>     >> Hi all,
>     >>
>     >> I’ve not had the time to contribute to this discussion so much, and
> it’s good to see some practical progress, but this latest point has brought
> me back to some slight unhappiness with the structure of the current
> proposal and the use of ArchiveItem. Apologies if this is either too late,
> or I’ve missed how the model has developed over the last few months. I’m
> looking at
> https://www.w3.org/community/architypes/wiki/Initial_model_proposal
>     >>
>     >> As I understand it, the current proposal has ArchiveCollection as a
> subtype of Collection (which is a CreativeWork), while ArchiveItem is an
> intangible, and intended to be applied alongside other types (such as
> CreativeWork or Thing) to enable the addition of ArchiveItem properties to
> existing schema.org types.
>     >>
>     >> The case that Jane has highlighted here is that it is unknown
> whether what we are looking at is a Collection or a specific Item.
>     >>
>     >> In this case, giving something that may be a collection or may be an
> item the type ArchiveCollection, seems wrong - it suggests a level of
> specificity we don’t know.
>     >> Also it seems to me that giving it a type ArchiveItem doesn’t imply
> it is actually a specific item - because ArchiveItem can be applied
> alongside other types (presumably including ArchiveCollection).
>     >>
>     >> So it would make more sense to me in this case to state that the
> thing is a Thing or CreativeWork, with an additional type of ArchiveItem -
> this doesn’t imply it is either a single item or a collection, it would
> leave this open to question - which seems to me to reflect the reality of
> the situation.
>     >>
>     >> Trying to draw this up into the modelling of archives in schema.org, the
> question it brings me to - is what is the advantage of splitting archival
> properties between ArchiveCollection and ArchiveItem? Why not bundle all
> the properties (there aren’t that many) into a single type based on
> intangible (taking the current ArchiveItem approach) - I’ll call it
> ‘ArchiveObject’ for now. When you know you have a Collection you apply type
> of Collection and ArchiveObject, and when you have a CreativeWork you apply
> type of CreativeWork and ArchiveObject etc.
>     >>
>     >> At the moment applying ArchiveCollection when you aren’t sure
> whether it is actually a Collection seems wrong to me. If there is any
> ambiguity then I think you can apply ArchiveItem (you know it is in an
> Archive) but you can’t assert Collection.
>     >>
>     >> Owen
>     >>
>     >> Owen Stephens
>     >> Owen Stephens Consulting
>     >> Web: http://www.ostephens.com
>     >> Email: owen@ostephens.com
>     >> Telephone: 0121 288 6936
>     >>
>     >> > On 12 Jul 2017, at 11:29, Jane Stevenson <
> Jane.Stevenson@jisc.ac.uk> wrote:
>     >> >
>     >> > Hi Richard,
>     >> >
>     >> > Yes, we are an awkward case! But at least we then bring benefits
> to over 300 repositories when we implement schema.org.
>     >> >
>     >> >> As to your A/B decision, I can only suggest from a non archivist
> point of view, but if something has already been identified in some way as
> an item or piece, it would be worth reflecting that in the description
> shared with the web (using the ArchiveItem type), then defaulting, in your
> case, to ArchiveCollection where this is not known.
>     >> >>
>     >> > Perfect - I was going to go with that, as I’m thinking be
> accurate where you can be accurate.
>     >> >
>     >> >> A minor syntax point:  The convention within Schema.org is for
> the names of Types to begin with an uppercase letter (Archive,
> ArchiveCollection, ArchiveItem)  and properties with a lowercase
> (itemLocation, holdingArchive, accessConditions, etc.).   I know we are
> only in discussion mode, but looking back on this documentation it can be
> confusing for some if we don’t follow these conventions here as well as in
> the type definitions etc.
>     >> >
>     >> > Thanks. I may have been a bit inconsistent with this….but we’ll
> ensure we implement it correctly.
>     >> >
>     >> > OK….we’ll crack on then.
>     >> >
>     >> > Thanks to all - the discussion has been really useful.
>     >> >
>     >> > cheers,
>     >> > Jane
>     >> >
>     >> >
>     >> > Jane Stevenson
>     >> > Archives Hub Service Manager
>     >> > jane.stevenson@jisc.ac.uk
>     >> > (Work days: Monday to Thursday)
>     >> >
>     >> > Tel: 0161 413 7555
>     >> > Web: archiveshub.jisc.ac.uk
>     >> > Skype:  janestevenson
>     >> > Twitter: @archiveshub, @janestevenson
>     >> >
>     >> >
>     >> >
>     >> >> On 12 Jul 2017, at 10:20, Richard Wallis <
> richard.wallis@dataliberate.com> wrote:
>     >> >>
>     >> >> Thanks Jane for your insight into the issues surrounding this
> within Archives Hub.  As effectively an aggregator of archives this
> provides a test of the model at one end of the spectrum of use cases we are
> looking to satisfy.
>     >> >>
>     >> >> As you say, from the information you are provided with you may
> not know if something being described is a collection or a single item.
> Also it is unlikely that you would know if a single item is located with
> the rest of the collection or not.
>     >> >>
>     >> >> Those responsible for other individual archives may well be very
> clear on these things for their collections.  Hopefully we are in a
> position to satisfy the broad spectrum of use cases with this proposal.
>     >> >>
>     >> >> As to your A/B decision, I can only suggest from a non archivist
> point of view, but if something has already been identified in some way as
> an item or piece, it would be worth reflecting that in the description
> shared with the web (using the ArchiveItem type), then defaulting, in your
> case, to ArchiveCollection where this is not known.
>     >> >>
>     >> >> If there are no further discussion points from the group, I
> intend in the next couple of weeks to forward this proposal to the
> Schema.org group for consideration.
>     >> >>
>     >> >> A minor syntax point:  The convention within Schema.org is for
> the names of Types to begin with an uppercase letter (Archive,
> ArchiveCollection, ArchiveItem)  and properties with a lowercase
> (itemLocation, holdingArchive, accessConditions, etc.).   I know we are
> only in discussion mode, but looking back on this documentation it can be
> confusing for some if we don’t follow these conventions here as well as in
> the type definitions etc.
>     >> >>
>     >> >> ~Richard.
>     >> >>
>     >> >>
>     >> >>
>     >> >>
>     >> >>
>     >> >> Richard Wallis
>     >> >> Founder, Data Liberate
>     >> >> http://dataliberate.com
>     >> >> Linkedin: http://www.linkedin.com/in/richardwallis
>     >> >> Twitter: @rjw
>     >> >>
>     >> >> On 12 July 2017 at 09:36, Jane Stevenson <
> Jane.Stevenson@jisc.ac.uk> wrote:
>     >> >> Hi Richard,
>     >> >>
>     >> >>> It would work to describe a collection of one or more things.
> However, if you have a known physical item (book, article, photograph, etc)
> or file (video, audio, image, web page, etc.) why would you not describe it
> as such?
>     >> >>
>     >> >> This is the nub of the matter….it is because we won’t always
> know. We can definitely decide that if the level is described as “item” we
> apply the archiveItem type. But (1) levels are not always given values -
> although on the Hub we do ask for this, but in general, within EAD, values
> are not mandatory (2) You can have a level that is a sub-series, or a
> folder or a file that is effectively one physical item, but the level value
> does not identify this. Archivists will describe ‘one folder’ but it may
> have one item in it.  Is something described as ‘one folder’ an item?
> Should ‘one box’ always be treated as a collection of items, although it
> may only have one item in it , e.g. an account book is a sub-series in one
> box.
>     >> >>
>     >> >> It is maybe possible for an individual repository to sort out
> single item descriptions  from ‘more than one item’ descriptions, but it’s
> not possible for us to do that in an automated way across all our data.
> People aren’t consistent enough with cataloguing for that, and to be fair,
> the standards have never emphasised the importance of distinguishing one
> physical item in this way.
>     >> >>
>     >> >>> This comes back to describing information about an individual
> item.  Potentially the ArchiveCollection the item is part of could be held
> by an organisation (Archive), yet an individual item could be located, on
> extended loan for example, at a different location.
>     >> >>
>     >> >>
>     >> >> OK. I get the logic. It is just quite rare for that to happen,
> unlike museums. And if it was temporarily elsewhere, we wouldn’t know.
> Something on loan would not be flagged as such in the description. But
> that’s OK - we would always just use the repository as the holding
> institution, so itemLocation, if we use it, would always have the same
> value as holdingArchive. If an item was on loan it simply wouldn’t show up
> in our schema.org data.  I don’t think that matters. As you say, it’s
> optional anyway.
>     >> >>
>     >> >> I think we’re ready to go now. I just have to decide on either
>     >> >>
>     >> >> A. Always use archiveCollection, including for items, because we
> can’t distinguish all items anyway
>     >> >> B. use archiveItem where we have a level value of “item” or
> “piece”, which will give us a majority of items (my estimate is that we
> would get something like 70% of single entities this way), but it will be
> the case that a fair number of items won’t be described as items because
> they don’t have that level value, even if they are single physical
> entities, so they will be single physical items but described as type
> archiveCollection.
>     >> >>
>     >> >> cheers,
>     >> >> Jane.
>     >> >>
>     >> >>
>     >> >> Jane Stevenson
>     >> >> Archives Hub Service Manager
>     >> >> jane.stevenson@jisc.ac.uk
>     >> >> (Work days: Monday to Thursday)
>     >> >>
>     >> >> Tel: 0161 413 7555
>     >> >> Web: archiveshub.jisc.ac.uk
>     >> >> Skype:  janestevenson
>     >> >> Twitter: @archiveshub, @janestevenson
>     >> >>
>     >> >>
>     >> >>
>     >> >>> On 11 Jul 2017, at 17:26, Richard Wallis <
> richard.wallis@dataliberate.com> wrote:
>     >> >>>
>     >> >>> Hi Jane,
>     >> >>>
>     >> >>> Sorry for being slow in responding.
>     >> >>>
>     >> >>> Answers inline.
>     >> >>>
>     >> >>> ~Richard.
>     >> >>>
>     >> >>>
>     >> >>> On 3 July 2017 at 07:48, Jane Stevenson <
> Jane.Stevenson@jisc.ac.uk> wrote:
>     >> >>> Hi Richard and everyone,
>     >> >>>
>     >> >>> If I decided to only use #archiveCollection for all of the
> units of description, would that work?  We don’t necessarily know if units
> described are single items or more than one item anyway, and it seems to me
> we can effectively describe each unit with the properties now provided,
> which is the main thing. So my question is, why would I need to use
> #archiveItem?
>     >> >>>
>     >> >>> It would work to describe a collection of one or more things.
> However, if you have a known physical item (book, article, photograph, etc)
> or file (video, audio, image, web page, etc.) why would you not describe it
> as such?
>     >> >>>
>     >> >>>
>     >> >>> Just one more question…. we have properties archiveHeld and
> holdingArchive, and we also have itemLocation. How is itemLocation
> different from holdingArchive? In the example, for Ronnie Barker,
> itemLocation is given as the V&A Theatre & Performance Archive (URL). But
> surely the property of holdingArchive would do just as well.
>     >> >>>
>     >> >>> This comes back to describing information about an individual
> item.  Potentially the ArchiveCollection the item is part of could be held
> by an organisation (Archive), yet an individual item could be located, on
> extended loan for example, at a different location.
>     >> >>>
>     >> >>> All properties within Schema.org are optional, so you probably
> would only provide an itemLocation when an item is located separate from
> the holdingArchive of the ArchiveCollection of which it is part.
>     >> >>>
>     >> >>> ~Richard.
>     >> >>>
>     >> >>>
>     >> >>> cheers
>     >> >>> Jane
>     >> >>>
>     >> >>> Jane Stevenson
>     >> >>> Archives Hub Service Manager
>     >> >>> jane.stevenson@jisc.ac.uk
>     >> >>>
>     >> >>
>     >> >>
>     >> >
>     >>
>     >>
>     >
>     >
>
>
>
>
Received on Wednesday, 12 July 2017 16:43:02 UTC
