Description: Extracts the main content from a web page, for example the news text from a news page, and also checks whether the page contains such content. Written in Java. Includes the source code (Eclipse project), an example program, and a paper on the extraction method.
File list (check whether it contains the files you need):
webextracting\test.bat
.............\param.txt
.............\out.txt
.............\说明.txt
.............\test\1.htm
.............\....\2.htm
.............\....\3.htm
.............\....\4.htm
.............\....\5.htm
.............\....\6.htm
.............\....\1.files\sohu_logo2.gif
.............\....\.......\sports_gmlogo2.gif
.............\....\.......\sports_gmlogo3.gif
.............\....\.......\Img247386504.gif
.............\....\.......\Img248240217.jpg
.............\....\.......\sogou.gif
.............\....\.......\ccc.gif
.............\....\.......\earphone.gif
.............\....\.......\2007-05-07_1.jpg
.............\....\.......\2007-05-07_2.jpg
.............\....\.......\2007-05-07_3.jpg
.............\....\.......\sohu_blog.gif
.............\....\.......\Img250157558.jpg
.............\....\.......\Img250158618.jpg
.............\....\.......\sogou1.gif
.............\....\.......\icon4.gif
.............\....\.......\pic_001_1.jpg
.............\....\.......\pic_004.gif
.............\....\.......\75240.jpg
.............\....\.......\64809.gif
.............\....\.......\68934.gif
.............\....\.......\button1.gif
.............\....\.......\button2.gif
.............\....\.......\Img250159426.jpg
.............\....\.......\Img250130491.jpg
.............\....\.......\Img250156228.jpg
.............\....\.......\club.gif
.............\....\.......\Img250152205.jpg
.............\....\.......\chinaren.gif
.............\....\.......\Img250158680.jpg
.............\....\.......\yj.gif
.............\....\.......\jufu.html
.............\....\.......\fnewstaob1.html
.............\....\.......\fnewstaob2.html
.............\....\.......\CAIZ2L87.htm
.............\....\.......\jingcai.html
.............\....\.......\fenlei.html
.............\....\.......\fenlei2.html
.............\....\.......\fenlei3.html
.............\....\.......\sohuflash_1.js
.............\....\.......\comment.js
.............\....\.......\function.js
.............\....\.......\pp18030_3.js
.............\....\.......\pn18030_3_2.js
.............\....\.......\commentCount.js
.............\....\.......\SogouUnionCPC.js
.............\....\.......\mms.js
.............\....\.......\pv.js
.............\....\.......\adm123.js
.............\....\.......\jingcai.files\bg.gif
.............\....\.......\fenlei.files\bg.gif
.............\....\.......\......2.files\bg.gif
.............\....\.......\......3.files\bg.gif
.............\....\2.files\sohu_logo2.gif
.............\....\.......\sports_gmlogo2.gif
.............\....\.......\sports_gmlogo3.gif
.............\....\.......\Img247386504.gif
.............\....\.......\Img250173628.jpg
.............\....\.......\sogou.gif
.............\....\.......\ccc.gif
.............\....\.......\earphone.gif
.............\....\.......\2007-05-07_1.jpg
.............\....\.......\2007-05-07_2.jpg
.............\....\.......\2007-05-07_3.jpg
.............\....\.......\sohu_blog.gif
.............\....\.......\Img250157558.jpg
.............\....\.......\Img250158618.jpg
.............\....\.......\sogou1.gif
.............\....\.......\icon4.gif
.............\....\.......\pic_001_1.jpg
.............\....\.......\pic_004.gif
.............\....\.......\75240.jpg
.............\....\.......\64809.gif
.............\....\.......\68934.gif
.............\....\.......\button1.gif
.............\....\.......\button2.gif
.............\....\.......\Img250180651.jpg
.............\....\.......\Img250175481.jpg
.............\....\.......\Img250177534.jpg
.............\....\.......\club.gif
.............\....\.......\Img250165708.jpg
.............\....\.......\chinaren.gif
.............\....\.......\Img250181369.jpg
.............\....\.......\yj.gif
.............\....\.......\marketpip.html
.............\....\.......\jufu.html
.............\....\.......\fnewstaob1.html
.............\....\.......\fnewstaob2.html
.............\....\.......\jingcai.html
.............\....\.......\fenlei.html