Description: A simple, efficient multithreaded web crawler with pipeline-based processing, written in C#. It includes HTML, text, PDF, and IFilter document processors, plus language detection (Google). Pipeline steps that extract, use, or alter information are easy to add.
File list (check whether you need any of these files):
BuildProcessTemplates
.....................\DefaultTemplate.xaml
.....................\UpgradeTemplate.xaml
Net 3.5
.......\Build.bat
.......\NCrawler.Console
.......\................\NCrawler.Console.csproj
.......\NCrawler.Db4oServices
.......\.....................\NCrawler.Db4oServices.csproj
.......\.....................\NCrawler.Db4oServices.csproj.vspscc
.......\NCrawler.DbServices
.......\...................\NCrawler.DbServices.csproj
.......\...................\NCrawler.DbServices.csproj.vspscc
.......\...................\NCrawlerEntities.Designer.cs
.......\...................\NCrawlerEntities.edmx
.......\...................\Properties
.......\NCrawler.Demo
.......\.............\App.config
.......\.............\NCrawler.Demo.csproj
.......\.............\NCrawler.Demo.csproj.vspscc
.......\.............\Properties
.......\NCrawler.EsentServices
.......\......................\NCrawler.EsentServices.csproj
.......\......................\NCrawler.EsentServices.csproj.vspscc
.......\NCrawler.FileStorageServices
.......\............................\NCrawler.FileStorageServices.csproj
.......\............................\NCrawler.FileStorageServices.csproj.vspscc
.......\NCrawler.HtmlProcessor
.......\......................\Extensions
.......\......................\NCrawler.HtmlProcessor.csproj
.......\......................\NCrawler.HtmlProcessor.csproj.vspscc
.......\......................\Properties
.......\......................\..........\Resources.Designer.cs
.......\......................\..........\Resources.resx
.......\NCrawler.IFilterProcessor
.......\.........................\NCrawler.IFilterProcessor.csproj
.......\.........................\NCrawler.IFilterProcessor.csproj.vspscc
.......\.........................\Properties
.......\NCrawler.IsolatedStorageServices.csproj
.......\.......................................\Properties
.......\NCrawler.IsolatedStorageServices
.......\................................\NCrawler.IsolatedStorageServices.csproj
.......\................................\NCrawler.IsolatedStorageServices.csproj.vspscc
.......\NCrawler.iTextSharpPdfProcessor
.......\...............................\NCrawler.iTextSharpPdfProcessor.csproj
.......\...............................\NCrawler.iTextSharpPdfProcessor.csproj.vspscc
.......\...............................\Properties
.......\NCrawler.LanguageDetection.Google
.......\.................................\NCrawler.LanguageDetection.Google.csproj
.......\.................................\NCrawler.LanguageDetection.Google.csproj.vspscc
.......\.................................\Properties
.......\NCrawler.MP3Processor
.......\.....................\NCrawler.MP3Processor.csproj
.......\.....................\NCrawler.MP3Processor.csproj.vspscc
.......\NCrawler.proj
.......\NCrawler.SitemapProcessor
.......\.........................\NCrawler.SitemapProcessor.csproj
.......\.........................\NCrawler.SitemapProcessor.csproj.vspscc
.......\NCrawler.sln
.......\NCrawler.vssscc
.......\NCrawler
.......\........\Events
.......\........\Extensions
.......\........\..........\IsolatedStorageFileExtensions.cs
.......\........\Interfaces
.......\........\NCrawler.csproj
.......\........\NCrawler.csproj.vspscc
.......\........\Properties
.......\........\Services
.......\........\Utils
.......\........\.....\Lazy.cs
.......\Repository
.......\..........\Autofac.2.4.5.724
.......\..........\.................\Autofac.Configuration.dll
.......\..........\.................\Autofac.dll
.......\..........\Db4o
.......\..........\....\Db4objects.Db4o.dll
.......\..........\EPocalipse.IFilter
.......\..........\..................\EPocalipse.IFilter.dll
.......\..........\..................\EPocalipse.IFilter.pdb
.......\..........\HtmlAgilityPack
.......\..........\...............\HtmlAgilityPack.dll
.......\..........\...............\HtmlAgilityPack.pdb
.......\..........\HundredMilesSoftware
.......\..........\....................\UltraID3Lib.dll
.......\..........\ILMerge
.......\..........\.......\ILMerge.exe
.......\..........\iTextSharp
.......\........