
RE: example of clustering and topic modelling on best practices

From: <Peter.Winstanley@gov.scot>
Date: Tue, 5 Sep 2017 08:37:23 +0000
To: <thomas.dhaenens@kb.vlaanderen.be>, <public-dxwg-wg@w3.org>
Message-ID: <BEA9D5BE2C1C76448E2955B1FD8769E10185C5FE6F@s0393g.scotland.gov.uk>
Hi Thomas
I think the benefit of an objective, agnostic process such as cluster analysis or topic modelling is that it helps us and our readers both spot relations and identify potential gaps.
Peter

From: D'Haenens Thomas [mailto:thomas.dhaenens@kb.vlaanderen.be]
Sent: 04 September 2017 16:28
To: public-dxwg-wg@w3.org
Subject: Re: example of clustering and topic modelling on best practices

Hi everyone,

With all respect to the efforts to come up with clever ways of clustering requirements, I wonder whether we're overdoing it. Textual interpretation, clustering and approximate techniques, fuzzy searches and the like are fine when we're talking about huge amounts of text.
What are we talking about here? 50-60 requirements, some composite, others more detailed, mapped from 50 use cases.

Instead of trying to fit all of that into a single document with the latest AI techniques, it might be more efficient simply to keep track of (i) a list of use cases, (ii) a list of requirements (which could have an internal parent-child relationship) and (iii) a mapping table, and to add that as an addendum of some sort to the document.
IMHO, a spreadsheet is suitable here.
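
A minimal sketch of how those three tables might fit together, in case it helps the discussion. All identifiers and titles below (UC1, R1, and so on) are hypothetical placeholders, not the working group's actual use cases or requirements, and the helper function names are likewise made up for illustration.

```python
# (i) Use cases. Hypothetical IDs and titles, for illustration only.
use_cases = {
    "UC1": "Dataset versioning",
    "UC2": "Profile negotiation",
}

# (ii) Requirements, each optionally naming a parent requirement.
requirements = {
    "R1": {"title": "Version information", "parent": None},
    "R1.1": {"title": "Version identifier", "parent": "R1"},
    "R2": {"title": "Profile definition", "parent": None},
    "R3": {"title": "Requirement with no use case", "parent": None},
}

# (iii) Mapping table: one (use case, requirement) pair per row.
mapping = [
    ("UC1", "R1"),
    ("UC1", "R1.1"),
    ("UC2", "R2"),
]


def requirements_for(use_case_id):
    """Return the requirement IDs mapped from one use case."""
    return sorted(r for u, r in mapping if u == use_case_id)


def unmapped_requirements():
    """Return requirement IDs that no use case maps to (potential gaps)."""
    mapped = {r for _, r in mapping}
    return sorted(r for r in requirements if r not in mapped)


def children_of(requirement_id):
    """Return the direct child requirements of one requirement."""
    return sorted(
        r for r, info in requirements.items()
        if info["parent"] == requirement_id
    )
```

The same three structures map directly onto three spreadsheet tabs; the lookups above correspond to simple filters on the mapping tab.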

From there we could move on to the real discussions (instead of ending up with four minutes to discuss the requirements about versioning).
Or am I missing some things here? Please enlighten me if so (always happy to learn a thing or two).

Thomas

On 4 Sep 2017, at 17:07, "Peter.Winstanley@gov.scot" <Peter.Winstanley@gov.scot> wrote:
Following on from the discussion about ways of slicing and dicing the requirements and use cases:

https://www.w3.org/2013/share-psi/wiki/Best_Practices/TextClustering



I was just wondering whether hierarchical clustering might be helpful, in addition to KWIC indexing or some other permuted index.
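
For readers unfamiliar with the term, a KWIC (keyword in context) index lists every significant word of each title alphabetically, with the surrounding words kept on either side. The following is a rough sketch only, covering the KWIC part and not the clustering; the function names, the stopword list and the sample titles are assumptions made for illustration.

```python
# Small, assumed stopword list; a real index would use a fuller one.
STOPWORDS = frozenset({"a", "an", "the", "of", "for", "and", "to", "in"})


def kwic(titles, stopwords=STOPWORDS):
    """Build KWIC entries as (keyword, left context, word, right context)."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = word.lower().strip(".,;:()")
            if not key or key in stopwords:
                continue  # skip punctuation-only tokens and stopwords
            left = " ".join(words[:i])
            right = " ".join(words[i + 1:])
            entries.append((key, left, word, right))
    # Sort alphabetically by keyword, then by the surrounding context.
    entries.sort(key=lambda e: (e[0], e[1], e[3]))
    return entries


def format_kwic(entries, width=30):
    """Render entries with the keyword aligned in a central column."""
    return [
        f"{left[-width:]:>{width}}  {word}  {right[:width]}"
        for _, left, word, right in entries
    ]


if __name__ == "__main__":
    # Hypothetical requirement titles.
    sample = ["Version identifier", "Dataset version"]
    for line in format_kwic(kwic(sample)):
        print(line)
```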


Peter





Received on Tuesday, 5 September 2017 08:37:54 UTC
