Description: A "spider" is a very useful kind of Internet program. Search engines use spiders to collect Web pages into their databases; businesses use spiders to monitor and track changes on competitors' sites; individual users use spiders to download Web pages for offline use; and developers use spiders to scan their own sites for broken links. Different users put spiders to different uses. So how does a spider program actually work?
This article describes how to use the C# language to build a spider program that downloads the entire content of a site to a specified directory; the running program's interface is shown in Figure 1. You can easily use the core classes provided with this article to build your own spider.
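As a rough illustration of the idea, the sketch below shows one way a site-downloading spider can be structured: keep a queue of URLs, download each page, save it under the target directory, and enqueue any same-host links found in it. This is not the code from the archive. The class name MiniSpider and the methods ExtractLinks, ToLocalPath and CrawlAsync are hypothetical, the regular-expression link matcher stands in for a real HTML parser, and only text pages are handled (images and query strings are ignored).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

// Hypothetical illustration; NOT the Spider/DocumentWorker classes from the archive.
public static class MiniSpider
{
    // Naive href matcher (assumption: good enough for a sketch, not for real HTML).
    private static readonly Regex HrefPattern =
        new Regex("href\\s*=\\s*[\"']([^\"'#]+)", RegexOptions.IgnoreCase);

    // Resolve each href against the page URL; keep only http(s) links on the same host.
    public static List<Uri> ExtractLinks(string html, Uri pageUri)
    {
        var links = new List<Uri>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            if (Uri.TryCreate(pageUri, m.Groups[1].Value, out Uri link)
                && (link.Scheme == Uri.UriSchemeHttp || link.Scheme == Uri.UriSchemeHttps)
                && string.Equals(link.Host, pageUri.Host, StringComparison.OrdinalIgnoreCase))
            {
                links.Add(link);
            }
        }
        return links;
    }

    // Map a URL to a file path under rootDir; directory-style URLs become index.html.
    public static string ToLocalPath(Uri uri, string rootDir)
    {
        string relative = uri.AbsolutePath.TrimStart('/');
        if (relative.Length == 0 || relative.EndsWith("/"))
            relative += "index.html";
        return Path.Combine(rootDir, relative.Replace('/', Path.DirectorySeparatorChar));
    }

    // Breadth-first crawl from start, saving at most maxPages pages under rootDir.
    public static async Task CrawlAsync(Uri start, string rootDir, int maxPages)
    {
        var seen = new HashSet<string> { start.AbsoluteUri };
        var queue = new Queue<Uri>();
        queue.Enqueue(start);
        using (var client = new HttpClient())
        {
            int downloaded = 0;
            while (queue.Count > 0 && downloaded < maxPages)
            {
                Uri current = queue.Dequeue();
                string html;
                try { html = await client.GetStringAsync(current); }
                catch (HttpRequestException) { continue; } // skip pages that fail to load

                try
                {
                    string path = Path.GetFullPath(ToLocalPath(current, rootDir));
                    Directory.CreateDirectory(Path.GetDirectoryName(path));
                    File.WriteAllText(path, html);
                }
                catch (IOException) { continue; } // e.g. a file/directory name clash
                downloaded++;

                // Enqueue links that have not been seen before.
                foreach (Uri link in ExtractLinks(html, current))
                    if (seen.Add(link.AbsoluteUri))
                        queue.Enqueue(link);
            }
        }
    }
}
```

A call such as MiniSpider.CrawlAsync(new Uri("http://example.com/"), "site", 50) would, under these assumptions, save up to 50 pages beneath a "site" directory. The set of already-seen URLs is what keeps the spider from downloading the same page twice when pages link to each other.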
File list (check whether it includes the files you need):
CSharpSpider
............\App.ico
............\AssemblyInfo.cs
............\Attribute.cs
............\AttributeList.cs
............\Backup
............\Backup1
............\.......\App.ico
............\.......\AssemblyInfo.cs
............\.......\Attribute.cs
............\.......\AttributeList.cs
............\.......\DocumentWorker.cs
............\.......\Done.cs
............\.......\Parse.cs
............\.......\ParseHTML.cs
............\.......\Spider.cs
............\.......\Spider.csproj
............\.......\Spider.csproj.user
............\.......\Spider.sln
............\.......\Spider.suo
............\.......\SpiderForm.cs
............\.......\SpiderForm.resx
............\.......\TestSpider.cs
............\.......\说明.htm
............\......\App.ico
............\......\AssemblyInfo.cs
............\......\Attribute.cs
............\......\AttributeList.cs
............\......\DocumentWorker.cs
............\......\Done.cs
............\......\Parse.cs
............\......\ParseHTML.cs
............\......\Spider.cs
............\......\Spider.csproj
............\......\Spider.csproj.user
............\......\Spider.sln
............\......\Spider.suo
............\......\SpiderForm.cs
............\......\SpiderForm.resx
............\......\TestSpider.cs
............\......\说明.htm
............\bin
............\...\Debug
............\...\.....\Spider.vshost.exe
............\...\.....\Spider.vshost.exe.manifest
............\DocumentWorker.cs
............\Done.cs
............\obj
............\...\Debug
............\...\.....\DesignTimeResolveAssemblyReferencesInput.cache
............\...\.....\TempPE
............\Parse.cs
............\ParseHTML.cs
............\Spider.cs
............\Spider.csproj
............\Spider.csproj.user
............\Spider.exe
............\Spider.sln
............\Spider.suo
............\SpiderForm.cs
............\SpiderForm.resx
............\temp
............\TestSpider.cs
............\UpgradeLog.XML
............\UpgradeLog2.XML
............\_UpgradeReport_Files
............\....................\UpgradeReport.css
............\....................\UpgradeReport.xslt
............\....................\UpgradeReport_Minus.gif
............\....................\UpgradeReport_Plus.gif
............\说明.htm