Re: Ideas for the Use Case of Public-sector Meeting Transcripts

Silvia,
All,

Thank you. Yes, that does help.

I will plug away at it some more. I also hope to spur interest in enhancing (public-sector) meetings' minutes and transcripts in the standards community.

I see how WebVTT-based timed thumbnails could represent slides for both audiences and multimodal LLMs. One could also explore hyperlinks to individual slides (e.g., "files/slideshow.pptx#3"). In theory, wherever hyperlinks to individual slides are available, both the hyperlinks and the slides' (thumbnail) images could be provided as alternatives.
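
As a rough illustration of the idea above, the following Python sketch emits one WebVTT metadata cue per slide, with each payload offering a fragment hyperlink and a thumbnail path as alternatives. The payload keys ("href", "thumbnail"), the thumbnail naming pattern, and the helper names are assumptions made for this example only; they come from neither the thread nor any specification.

```python
import json


def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hours are omitted below one hour)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    prefix = f"{hours:02d}:" if hours else ""
    return f"{prefix}{minutes:02d}:{secs:02d}.{ms:03d}"


def slide_cues(deck_href: str, change_times: list[float], end_time: float) -> str:
    """Build a WebVTT metadata track with one cue per slide.

    Each slide is shown from its change time until the next change time
    (or until end_time for the last slide).
    """
    blocks = ["WEBVTT"]
    boundaries = change_times + [end_time]
    for index, (start, end) in enumerate(zip(boundaries, boundaries[1:]), start=1):
        payload = {
            # Hypothetical keys: a link to the slide and an image alternative.
            "href": f"{deck_href}#{index}",
            "thumbnail": f"thumbnails/slide-{index}.jpg",
        }
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{timing}\n{json.dumps(payload)}")
    # Blocks in a WebVTT file are separated by blank lines.
    return "\n\n".join(blocks) + "\n"


track = slide_cues("files/slideshow.pptx", [0.0, 80.0, 84.0], 120.0)
print(track)
```

A consumer that understands the hyperlink could open the slide directly, while one that does not could fall back to the image.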

In addition to in-person meetings, there are virtual meetings to consider (e.g., WebRTC) and how best to represent their (real-time) captions, minutes, and transcripts. At some future point, it might be useful to cite WebRTC scenarios when making a case to the community and browser vendors.


Best regards,
Adam

________________________________
From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Sent: Tuesday, July 9, 2024 3:51 AM
To: Adam Sobieski <adamsobieski@hotmail.com>
Cc: public-tt@w3.org <public-tt@w3.org>
Subject: Re: Ideas for the Use Case of Public-sector Meeting Transcripts

Hi Adam,

The specification of such a data type is completely up to you.
What you have shown seems to work.

The question you have to ask yourself is: where do you want this to be used?
If it's for your own special use case - or maybe within a small group of people - then you can just write up a spec, some javascript libraries to deal with it, and start using it.
If your intention is for the Web browsers to support it, then there's a need to discuss the size of the user group that will create and consume such content, and you will need to convince the browser vendors that this is a use case that is worthwhile spending development time on.

I would suggest taking the first path and writing some JS libraries to parse your content.
As an example of such an approach, see https://developer.bitmovin.com/playback/docs/webvtt-based-thumbnails for timed thumbnails.
And another one using GeoJSON: https://sites.google.com/a/webmproject.org/wiki/webm-metadata/temporal-metadata/webvtt-metadata
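
To make the "first path" concrete, here is a minimal parser sketch for a WebVTT metadata track whose cue payloads are JSON objects. It is written in Python for brevity, although the suggestion above concerns JavaScript libraries, and it is not a conforming WebVTT parser: it handles only the header, NOTE blocks, optional cue identifiers, and timing lines. The names MetadataCue, to_seconds, and parse_metadata_track are hypothetical.

```python
import json
import re
from dataclasses import dataclass

# Matches "mm:ss.ttt --> mm:ss.ttt", with an optional hours component.
TIMING = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


@dataclass
class MetadataCue:
    start: float  # seconds
    end: float    # seconds
    data: dict    # decoded JSON payload


def to_seconds(timestamp: str) -> float:
    """Convert 'mm:ss.ttt' or 'hh:mm:ss.ttt' to seconds."""
    parts = timestamp.split(":")
    hours = int(parts[0]) if len(parts) == 3 else 0
    return hours * 3600 + int(parts[-2]) * 60 + float(parts[-1])


def parse_metadata_track(text: str) -> list[MetadataCue]:
    cues = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n{2,}", text.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        # Skip the file header and comment blocks.
        if lines[0].startswith(("WEBVTT", "NOTE")):
            continue
        # An optional cue identifier may precede the timing line.
        if "-->" not in lines[0] and len(lines) > 1:
            lines = lines[1:]
        match = TIMING.match(lines[0])
        if not match:
            continue  # Not a cue block (e.g., STYLE or REGION).
        payload = json.loads("\n".join(lines[1:]))
        cues.append(MetadataCue(to_seconds(match[1]), to_seconds(match[2]), payload))
    return cues


sample = """WEBVTT

NOTE A single speech cue, used only to exercise the parser.

00:08.000 --> 00:10.000
{
  "@type": "speech",
  "agent": {"@type": "person", "fullName": "Bob Jones"},
  "content": "Let's get started."
}
"""

cues = parse_metadata_track(sample)
print(cues[0].start, cues[0].end, cues[0].data["agent"]["fullName"])
```

Because each payload is plain JSON, the same logic could be ported to JavaScript with JSON.parse; the parsing structure, not the language, is the point of the sketch.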


Hope that helps.

Cheers,
Silvia.


On Mon, Jul 8, 2024 at 3:43 PM Adam Sobieski <adamsobieski@hotmail.com> wrote:
Silvia,
All,

Please find a WebVTT metadata sketch in the postscript, a mapping of the aforementioned TTML ideas. Thank you for any feedback.


Best regards,
Adam

P.S.:

00:00.000 --> 00:05.000
{
  "@type": "speech",
  "agent" : {
    "@type": "person",
    "fullName" : "Alice Smith",
    "position" : ["Senator", "Co-chair", "Attendee"]
  },
  "content": "Without objection, the presenter's slides are entered into the minutes."
}

NOTE that attachments and hyperlinks could be separated. In this way, hyperlinks to resources beyond those attached to meetings' minutes and transcripts could be shared with audiences.

00:05.000 --> 00:05.000
{
  "@type": "attachment",
  "agent" : {
    "@type": "person",
    "fullName" : "Charles Brown",
    "position" : ["Secretary", "Attendee"]
  },
  "data": [{
    "@type": "link",
    "mimeType" : "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    "href" : "files/panelist-presentation-1.pptx",
    "metadata" : {
      "author" : {
         "@type": "person",
         "fullName" : "David Jackson"
      },
      "title" : "A Slideshow Presentation"
    }
  },
  {
    "@type": "link",
    "mimeType" : "application/pdf",
    "href" : "files/panelist-presentation-1.pdf",
    "metadata" : {
      "author" : {
         "@type": "person",
         "fullName" : "David Jackson"
      },
      "title" : "A Slideshow Presentation"
    }
  }]
}

00:05.000 --> 00:35.000
{
  "@type": "hyperlink",
  "agent" : {
    "@type": "person",
    "fullName" : "Charles Brown",
    "position" : ["Secretary", "Attendee"]
  },
  "data": [{
    "@type": "link",
    "mimeType" : "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    "href" : "files/panelist-presentation-1.pptx",
    "metadata" : {
      "author" : {
         "@type": "person",
         "fullName" : "David Jackson"
      },
      "title" : "A Slideshow Presentation"
    }
  },
  {
    "@type": "link",
    "mimeType" : "application/pdf",
    "href" : "files/panelist-presentation-1.pdf",
    "metadata" : {
      "author" : {
         "@type": "person",
         "fullName" : "David Jackson"
      },
      "title" : "A Slideshow Presentation"
    }
  }]
}

00:08.000 --> 00:10.000
{
  "@type": "speech",
  "agent" : {
    "@type": "person",
    "fullName" : "Bob Jones",
    "position" : ["Senator", "Co-chair", "Attendee"]
  },
  "content": "Let's get started."
}

NOTE that presentation-related events, e.g., presenters changing slides, could be useful for scenarios such as synchronizing audiences' views of slides alongside live-streams and pre-recorded videos of meetings.

NOTE that AI Q&A and dialogue about meetings are important scenarios for the schema. Would presentation-related data, e.g., presenters changing slides, complicate AI systems' processing of minutes and transcripts or, instead, could these data enable or enhance multimodal Q&A and dialogue involving the text and visual contents of the slides presented at meetings?

01:20.000 --> 01:23.000
{
  "@type": "speech",
  "agent" : {
    "@type": "person",
    "fullName" : "David Jackson",
    "position" : ["Guest", "Panelist", "Presenter", "Attendee"]
  },
  "content": "Let's take a look at the next slide."
}

01:24.000 --> 01:24.000
{
  "@type": "cue",
  "agent" : {
    "@type": "software",
    "fullName" : "A Slideshow Presentation Software"
  },
  "data" : {
    "@type" : "presentation-update",
    ...
  }
}

NOTE that this sketch is a work in progress. Thank you for any feedback.
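
For the AI Q&A scenario raised in the notes above, one possible preprocessing step is to flatten the cues into a speaker-attributed plain-text transcript before handing it to a model. The Python sketch below assumes that each cue has already been decoded into a dict with "start" (seconds) and "data" (the JSON payload); that record shape and the function name render_transcript are hypothetical. Event types other than "speech", "attachment", and "hyperlink" are simply skipped.

```python
def render_transcript(cues: list[dict]) -> str:
    """Flatten cue records into one line per event, ordered by start time."""
    lines = []
    for cue in sorted(cues, key=lambda c: c["start"]):
        data = cue["data"]
        agent = data.get("agent", {})
        name = agent.get("fullName", "Unknown")
        roles = ", ".join(agent.get("position", []))
        speaker = f"{name} ({roles})" if roles else name
        # Render the start time as mm:ss.ttt.
        stamp = f"{int(cue['start'] // 60):02d}:{cue['start'] % 60:06.3f}"
        if data.get("@type") == "speech":
            lines.append(f"[{stamp}] {speaker}: {data['content']}")
        elif data.get("@type") in ("attachment", "hyperlink"):
            hrefs = ", ".join(item["href"] for item in data.get("data", []))
            lines.append(f"[{stamp}] {speaker} shared: {hrefs}")
        # Other event types, e.g. "cue", are ignored in this sketch.
    return "\n".join(lines)


cues = [
    {
        "start": 8.0,
        "data": {
            "@type": "speech",
            "agent": {
                "@type": "person",
                "fullName": "Bob Jones",
                "position": ["Senator", "Co-chair", "Attendee"],
            },
            "content": "Let's get started.",
        },
    },
    {
        "start": 5.0,
        "data": {
            "@type": "hyperlink",
            "agent": {
                "@type": "person",
                "fullName": "Charles Brown",
                "position": ["Secretary", "Attendee"],
            },
            "data": [{"@type": "link", "href": "files/panelist-presentation-1.pptx"}],
        },
    },
]

transcript = render_transcript(cues)
print(transcript)
```

Whether presentation-update events should also be rendered into such a transcript is exactly the open question posed in the notes; the sketch leaves them out only to stay short.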

________________________________
From: Adam Sobieski <adamsobieski@hotmail.com>
Sent: Sunday, July 7, 2024 4:22 PM
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Cc: public-tt@w3.org <public-tt@w3.org>
Subject: Re: Ideas for the Use Case of Public-sector Meeting Transcripts

Silvia,

Hello and thank you for that hyperlink about the WebVTT metadata type. It has an expressiveness resembling that of JSON with some important caveats about blank lines.
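
One way to respect the blank-line caveat when writing cues is sketched below in Python. It relies on json.dumps escaping newlines inside strings, so the serialized payload never contains an empty line, and it additionally rewrites any "-->" inside string values, on the assumption (as I read the WebVTT rules) that a cue payload may not contain that substring. The function name to_cue_payload is hypothetical.

```python
import json


def to_cue_payload(data: dict) -> str:
    """Serialize a dict as JSON that is safe to embed as a WebVTT cue payload."""
    # Newlines inside string values are emitted as the two characters "\n",
    # so pretty-printed output never contains two consecutive line breaks.
    text = json.dumps(data, indent=2, ensure_ascii=False)
    # "-->" can only occur inside a JSON string; escaping ">" as \u003e keeps
    # the value intact after decoding while removing the forbidden substring.
    return text.replace("-->", "--\\u003e")


payload = to_cue_payload(
    {"@type": "speech", "content": "See A --> B.\n\nNext paragraph."}
)
print(payload)
```

Decoding the payload with a standard JSON parser restores the original string, including the arrow and the paragraph break.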

Next steps for the use case of meetings' minutes and transcripts appear to involve developing an extensible, general-purpose schema, including one for the WebVTT metadata type.

Also, the WebVTT metadata type would, in theory, be more readily compatible with LLMs than extended TTML, enabling some interesting scenarios such as Q&A and dialogue about (public sector) meetings utilizing their minutes and transcripts [1][2].


Best regards,
Adam

[1] Golany, Lotem, Filippo Galgani, Maya Mamo, Nimrod Parasol, Omer Vandsburger, Nadav Bar, and Ido Dagan. "Efficient data generation for source-grounded information-seeking dialogs: A use case for meeting transcripts." (2024). https://arxiv.org/abs/2405.01121

[2] https://github.com/google-research-datasets/MISeD


________________________________
From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Sent: Saturday, July 6, 2024 1:56 AM
To: Adam Sobieski <adamsobieski@hotmail.com>
Cc: public-tt@w3.org <public-tt@w3.org>
Subject: Re: Ideas for the Use Case of Public-sector Meeting Transcripts

Hi Adam,

You might consider using WebVTT for that purpose - the "metadata" type already allows you to formulate your custom timed markup:
https://www.w3.org/TR/webvtt1/#introduction-metadata


Kind Regards,
Silvia.


On Sat, Jul 6, 2024 at 1:18 AM Adam Sobieski <adamsobieski@hotmail.com> wrote:
Timed Text Working Group,

Hello. I am pleased to share, for purposes of discussion, some ideas for extending TTML for use cases including public-sector meetings' minutes and transcripts.

As shown in the following markup example, seven main ideas are broached:


  1.  Files could be attached to meetings' minutes and transcripts, e.g., presenters' slideshow slides.
  2.  These files could be described with metadata.
  3.  Agents could have one or more roles or positions described in their metadata.
  4.  Minutes and transcripts could have generator agents and/or software tools.
     *   Beyond "person", "character", "group", "organization", and "other", might software tools be a type of agent?
  5.  Inline time-based hyperlinks could be placed in minutes to signal when files were attached to meetings' minutes and transcripts.
  6.  These time-based hyperlinks could be attributed to agents or software tools.
  7.  These time-based hyperlinks could be displayed for end-users consuming accompanying videos of meetings for downloading attached files.

Here is a markup sketch. The new parts, showcasing the above ideas, are phrased using an XML extension and are emphasized in bold.


<tt xml:lang="en" xmlns="http://www.w3.org/ns/ttml"
    xmlns:ttm="http://www.w3.org/ns/ttml#metadata"
    xmlns:ext="..."
    xml:base="...">

  <head>
    <metadata xmlns:ttm="http://www.w3.org/ns/ttml#metadata">
      <ttm:title>...</ttm:title>
      <ttm:desc>...</ttm:desc>
      <ext:generator ttm:agent="brown" />
    </metadata>

    <ext:attachment xml:id="budget-2024-1"
                    ext:mime="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
                    ext:src="attachments/budget-2024.xlsx" />
    <ext:attachment xml:id="budget-2024-2"
                    ext:mime="application/xml"
                    ext:src="attachments/budget-2024.xbrl" />
    <ext:attachment xml:id="panelist-presentation-1"
                    ext:mime="application/vnd.openxmlformats-officedocument.presentationml.presentation"
                    ext:src="attachments/panelist-presentation-1.pptx">
      <metadata>
        <ttm:title>Slideshow Presentation</ttm:title>
        <ext:generator ttm:agent="jackson" />
      </metadata>
    </ext:attachment>

    <ttm:agent xml:id="smith" type="person">
      <ttm:name type="family">Smith</ttm:name>
      <ttm:name type="given">Alice</ttm:name>
      <ttm:name type="full">Alice Smith</ttm:name>
      <ext:position>Senator</ext:position>
      <ext:position>Co-chair</ext:position>
    </ttm:agent>
    <ttm:agent xml:id="jones" type="person">
      <ttm:name type="family">Jones</ttm:name>
      <ttm:name type="given">Bob</ttm:name>
      <ttm:name type="full">Bob Jones</ttm:name>
      <ext:position>Senator</ext:position>
      <ext:position>Co-chair</ext:position>
    </ttm:agent>
    <ttm:agent xml:id="brown" type="person">
      <ttm:name type="family">Brown</ttm:name>
      <ttm:name type="given">Charles</ttm:name>
      <ttm:name type="full">Charles Brown</ttm:name>
      <ext:position>Secretary</ext:position>
    </ttm:agent>
    <ttm:agent xml:id="jackson" type="person">
      <ttm:name type="family">Jackson</ttm:name>
      <ttm:name type="given">David</ttm:name>
      <ttm:name type="full">David Jackson</ttm:name>
      <ext:position>Guest</ext:position>
      <ext:position>Panelist</ext:position>
    </ttm:agent>
  </head>
  <body>
    <div>
      ...
      <p begin="00:22.000" end="00:27.000" ttm:agent="smith">
        Without objection, the annual budget is entered into the minutes.
      </p>
      <ext:a begin="00:27.000" duration="00:10.000" ext:xref="budget-2024-1" ttm:agent="brown" />
      <ext:a begin="00:27.000" duration="00:10.000" ext:xref="budget-2024-2" ttm:agent="brown" />
      ...
      <p begin="01:23.000" end="01:28.000" ttm:agent="smith">
        Without objection, the panelist's slides are entered into the minutes.
      </p>
      <ext:a begin="01:28.000" duration="00:10.000" ext:xref="panelist-presentation-1" ttm:agent="brown" />
      ...
    </div>
  </body>
</tt>
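
To show how a consumer might read the proposed extension elements, here is a small Python sketch using the standard xml.etree.ElementTree module. The extension namespace is elided in the markup above, so the URI "http://example.org/ns/ttml-ext" below is a placeholder invented for this example, as are the function names and the shape of the returned dictionaries. The sample document is a shortened excerpt of the markup sketch.

```python
import xml.etree.ElementTree as ET

NS = {
    "tt": "http://www.w3.org/ns/ttml",
    "ttm": "http://www.w3.org/ns/ttml#metadata",
    "ext": "http://example.org/ns/ttml-ext",  # hypothetical placeholder URI
}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def read_agents(root: ET.Element) -> dict[str, dict]:
    """Map each agent's xml:id to its type, full name, and positions."""
    agents = {}
    for agent in root.iterfind(".//ttm:agent", NS):
        full = agent.find("ttm:name[@type='full']", NS)
        agents[agent.get(XML_ID)] = {
            "type": agent.get("type"),
            "fullName": full.text if full is not None else None,
            "position": [p.text for p in agent.findall("ext:position", NS)],
        }
    return agents


def read_attachments(root: ET.Element) -> dict[str, dict]:
    """Map each attachment's xml:id to its MIME type and source path."""
    ext = NS["ext"]
    return {
        att.get(XML_ID): {
            "mimeType": att.get(f"{{{ext}}}mime"),
            "href": att.get(f"{{{ext}}}src"),
        }
        for att in root.iterfind(".//ext:attachment", NS)
    }


SAMPLE = """<tt xml:lang="en" xmlns="http://www.w3.org/ns/ttml"
    xmlns:ttm="http://www.w3.org/ns/ttml#metadata"
    xmlns:ext="http://example.org/ns/ttml-ext">
  <head>
    <ext:attachment xml:id="budget-2024-1"
                    ext:mime="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
                    ext:src="attachments/budget-2024.xlsx" />
    <ttm:agent xml:id="smith" type="person">
      <ttm:name type="full">Alice Smith</ttm:name>
      <ext:position>Senator</ext:position>
      <ext:position>Co-chair</ext:position>
    </ttm:agent>
  </head>
</tt>"""

root = ET.fromstring(SAMPLE)
agents = read_agents(root)
attachments = read_attachments(root)
print(agents)
print(attachments)
```

The resulting dictionaries have roughly the same shape as the "agent" and "link" objects in the WebVTT metadata sketch earlier in the thread, which suggests the two representations could share one schema.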

Any thoughts on these ideas and the markup sketch? Any other ideas towards utilizing and/or extending timed text, e.g., TTML, for the use case of representing (public-sector) meetings' minutes and transcripts? Thank you.


Best regards,
Adam Sobieski

P.S.: It appears that I should have emailed this mailing list instead of having opened a GitHub issue. Apologies for the multiple copies of this content in this mailing list.

Received on Wednesday, 10 July 2024 15:21:12 UTC