Description: Web page text extraction code. It extracts the main body text of topic-type pages (news, blogs, etc.) in linear time.
It uses a method based on a line-block distribution function and, to stay general, contains no rules written for specific websites.
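The line-block idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the contents of TextExtract.java: the class name LineBlockSketch, the method names, and the blockWidth and threshold parameters are all hypothetical. The sketch strips markup, counts the non-whitespace characters on each remaining line, sums those counts over a sliding window of consecutive lines (the "line block"), and keeps the region around the densest block. The block sums are computed in one pass over the lines.

```java
public class LineBlockSketch {

    /** Removes comments, scripts, styles and tags; line breaks in the rest are kept. */
    static String stripMarkup(String html) {
        return html
            .replaceAll("(?is)<!--.*?-->", "")
            .replaceAll("(?is)<script.*?>.*?</script>", "")
            .replaceAll("(?is)<style.*?>.*?</style>", "")
            .replaceAll("<[^>]*>", "");
    }

    /**
     * Hypothetical sketch of line-block extraction.
     * blockWidth: number of consecutive lines per block (assumed parameter).
     * threshold:  minimum block length that still counts as body text (assumed parameter).
     */
    static String extract(String html, int blockWidth, int threshold) {
        String[] lines = stripMarkup(html).split("\n", -1);
        int[] len = new int[lines.length];
        for (int i = 0; i < lines.length; i++) {
            lines[i] = lines[i].trim();
            // Length of the line with all whitespace removed.
            len[i] = lines[i].replaceAll("\\s+", "").length();
        }

        int n = lines.length - blockWidth + 1;
        if (n <= 0) {
            // Too few lines to form a single block: return everything.
            return String.join("\n", lines).trim();
        }

        // block[i] = total characters in lines i .. i + blockWidth - 1,
        // maintained as a sliding sum so the pass is linear in the line count.
        int[] block = new int[n];
        int sum = 0;
        for (int i = 0; i < blockWidth; i++) {
            sum += len[i];
        }
        block[0] = sum;
        for (int i = 1; i < n; i++) {
            sum += len[i + blockWidth - 1] - len[i - 1];
            block[i] = sum;
        }

        // Find the densest block (first one on ties).
        int peak = 0;
        for (int i = 1; i < n; i++) {
            if (block[i] > block[peak]) {
                peak = i;
            }
        }

        // Grow the region in both directions while blocks stay above the threshold.
        int start = peak;
        int end = peak;
        while (start > 0 && block[start - 1] > threshold) {
            start--;
        }
        while (end < n - 1 && block[end + 1] > threshold) {
            end++;
        }

        // Collect the non-empty lines covered by blocks start .. end.
        StringBuilder body = new StringBuilder();
        for (int i = start; i < end + blockWidth; i++) {
            if (!lines[i].isEmpty()) {
                body.append(lines[i]).append('\n');
            }
        }
        return body.toString().trim();
    }
}
```

A call such as LineBlockSketch.extract(html, 3, 10) would treat three consecutive lines as one block and stop expanding once a block holds 10 characters or fewer. Both values are illustrative and would need tuning on real pages. The sketch keeps only the single densest region, so body text split by large gaps would be cut short.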
File list:
TextExtract.java