[ESW Wiki] Update of "its0908LinguisticMarkup" by GoutamSaha

The following page has been changed by GoutamSaha:
http://esw.w3.org/topic/its0908LinguisticMarkup


------------------------------------------------------------------------------
   
  == Understanding Content Domain Level Markup ==
  
- In order to find out the content domain for a paragraph of text, we normally find that content domain is nothing but the most frequently occurred word (e.g. a noun) in that paragraph. For example, in a paragraph, if we see that the word-frequency of a word say, "football" is the maximum among other words' frequencies, then the content domain is "football" only.  
+ To determine the content domain of a paragraph of text, a common heuristic is to take the '''most frequently occurring word''' (e.g. a noun) in that paragraph. For example, if the word "football" has the highest word frequency among all the words in a paragraph, then the content domain is taken to be "football".
+ Similarly, the word with the maximum '''word density''' is often the content domain. Word density is the ratio of the number of times a word appears in a document to the size of the document (its total word count). It is a measure of how important a word is to the overall content of the document. A higher word density results in a higher relevance ranking.
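The word-density heuristic described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names `word_densities` and `content_domain`, the simple regular-expression tokenizer, and the optional `stopwords` filter are all hypothetical and are not part of the wiki page.

```python
import re
from collections import Counter


def word_densities(text):
    """Map each word to (occurrences of the word) / (total word count)."""
    # Assumption: a word is a run of ASCII letters or apostrophes, case-insensitive.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {}
    counts = Counter(words)
    return {word: count / total for word, count in counts.items()}


def content_domain(text, stopwords=frozenset()):
    """Return the word with the highest density, or None for empty input.

    The stopwords filter is an assumption added for illustration; without it,
    function words such as "the" would usually win in English text.
    """
    densities = {
        word: density
        for word, density in word_densities(text).items()
        if word not in stopwords
    }
    if not densities:
        return None
    # On a tie, max() returns the word that was seen first in the text.
    return max(densities, key=densities.get)
```

For instance, calling `content_domain("Football is fun. Football fans love football.")` picks "football", which accounts for 3 of the 7 words in that text. Because the total word count is the same for every word in one paragraph, ranking by density and ranking by raw frequency select the same word; density only becomes distinct when comparing documents of different sizes.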
  
  == Challenges ==
  

Received on Tuesday, 25 October 2005 06:48:37 UTC