- From: <w3t-archive+esw-wiki@w3.org>
- Date: Mon, 24 Oct 2005 20:09:02 -0000
- To: w3t-archive+esw-wiki@w3.org
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "ESW Wiki" for change notification.

The following page has been changed by GoutamSaha:
http://esw.w3.org/topic/its0908LinguisticMarkup

------------------------------------------------------------------------------
  =='''Understanding Content Domain Level Markups:-'''==

- To identify the content domain of a paragraph of text, we normally take it to be the most frequently occurring word (e.g. a noun) in that paragraph. For example, if the word "football" has the highest word frequency of all the words in a paragraph, then the content domain is "football".
+ To identify the content domain of a paragraph of text, we normally take it to be the '''most frequently occurring word''' (e.g. a noun) in that paragraph. For example, if the word "football" has the highest word frequency of all the words in a paragraph, then the content domain is "football".
+ 
+ Likewise, the word with the maximum '''word density''' is often the content domain. Word density is the ratio of the number of times a word appears in a document to the size of the document (its total word count). It is a measure of how important a word is to the overall content of the document. A higher word density results in a higher relevance ranking.

  == Challenges ==
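
As an illustration of the word-frequency and word-density definitions in the changed text above, here is a minimal Python sketch. It is not part of the wiki page. The function names, the simple lowercase tokenizer, and the optional `stopwords` parameter are all assumptions made for this example. In particular, the page only says the domain word is "e.g. a noun"; a stop-word list is one hypothetical way to keep words such as "the" from winning.

```python
import re
from collections import Counter


def word_counts(text):
    """Count word tokens. Lowercasing and the [a-z]+ tokenizer are
    simplifying assumptions, not something the wiki page specifies."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def word_density(text):
    """Map each word to (occurrences of the word) / (total word count)."""
    counts = word_counts(text)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


def content_domain(text, stopwords=frozenset()):
    """Return the most frequent word outside `stopwords`, or None if no
    candidate remains. The `stopwords` filter is a hypothetical addition."""
    counts = word_counts(text)
    candidates = {w: c for w, c in counts.items() if w not in stopwords}
    if not candidates:
        return None
    # Dividing every count by the same total does not change the ordering,
    # so the most frequent word is also the word with the highest density.
    return max(candidates, key=candidates.get)
```

For the sample text "Football is fun. Football fans love football matches.", the word "football" accounts for 3 of the 8 tokens, so its density is 0.375 and `content_domain` returns "football". Within a single document, the frequency and density criteria always select the same word. Density only adds information when comparing documents of different lengths.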
Received on Tuesday, 25 October 2005 06:48:37 UTC