- From: Laurent Le Meur <laurent@edrlab.org>
- Date: Tue, 3 Oct 2023 10:14:12 +0200
- To: "public-tdmrep@w3.org" <public-tdmrep@w3.org>
- Message-Id: <7F182A35-6ABE-45C9-981D-442A0FC045A6@edrlab.org>
The article below was printed in Le Monde on Sept 29th. It focuses on Generative AI and the EU AI Act.

https://www.sne.fr/actu/tribune-construisons-des-aujourdhui-une-intelligence-artificielle-de-rang-mondial-respectueuse-de-la-propriete-litteraire-et-artistique/
(Op-ed: "Let us build, starting today, a world-class Artificial Intelligence that respects literary and artistic property", Syndicat national de l'édition)

In summary, it calls for the EU to go further than simply requiring AI companies to publish summaries of the copyrighted data used for training, which is the current trend. The request is for total transparency, through a detailed list of all works used by Generative AI systems for training, together with their sources. This request is shared by many practitioners, in the EU but also in the US.

My personal view: providing URLs would not be sufficient, because many works appear at multiple URLs that are not managed by rights owners, and many URLs are transient. Such repositories of training sources should therefore index, for each training source, an ISCC code <https://iscc.foundation/iscc/>, a date of import, a source URL (if any), and optionally a few other metadata fields (such as a title). They should also be searchable by ISCC (or title). This would make it easy to check that an opt-out has been respected, even if a work has been syndicated through multiple locations or websites.

What is your opinion on this?

Best regards
Laurent
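To make the proposal above more concrete, here is a minimal sketch in Python of what one record in such a repository, and the lookup by ISCC or title, might look like. Everything in it is an assumption made for illustration: the names TrainingSource and TrainingSourceIndex, the in-memory dictionary, and the exact-string comparison of ISCC codes are hypothetical, and the ISCC values used with it would be whatever codes the repository operator has computed.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TrainingSource:
    """One imported item, with the fields suggested in the message."""
    iscc: str                         # ISCC code of the content (opaque string here)
    imported_on: date                 # date of import into the training corpus
    source_url: Optional[str] = None  # may be missing or transient
    title: Optional[str] = None       # optional descriptive metadata


class TrainingSourceIndex:
    """Hypothetical in-memory index, searchable by ISCC or by title."""

    def __init__(self) -> None:
        self._by_iscc: dict[str, list[TrainingSource]] = {}

    def add(self, record: TrainingSource) -> None:
        # The same work can be imported several times from different URLs.
        self._by_iscc.setdefault(record.iscc, []).append(record)

    def find_by_iscc(self, iscc: str) -> list[TrainingSource]:
        # Assumption: exact string match on the code.
        return list(self._by_iscc.get(iscc, []))

    def find_by_title(self, text: str) -> list[TrainingSource]:
        # Case-insensitive substring match on the optional title.
        needle = text.casefold()
        return [
            record
            for records in self._by_iscc.values()
            for record in records
            if record.title and needle in record.title.casefold()
        ]

    def imported_on_or_after(self, iscc: str, opt_out: date) -> list[TrainingSource]:
        # Imports dated on or after the opt-out date are candidates for review.
        return [r for r in self.find_by_iscc(iscc) if r.imported_on >= opt_out]
```

A rights owner who knows the ISCC of a work could call imported_on_or_after with the date the opt-out was declared; any record returned points to an import worth investigating, regardless of which URL the copy came from.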
Received on Tuesday, 3 October 2023 08:14:31 UTC