Search - DataScraper
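Description: DataScraper is one of the tools in MetaSeeker, a toolkit for extracting information from web pages. It can extract data from any website, and customizing extraction rules for a site requires no programming: you work in a GUI and the rules are generated automatically. It is suited to:
1. Vertical search (also called professional search) services
2. Information aggregation and portal services
3. Mashup services
4. Enterprise network information aggregation
5. Business intelligence gathering
6. Forum or blog migration
7. Intelligent information agents
8. Personal information retrieval
9. Information mining
Several editions are available for shared download. To download the complete toolkit, visit: http://www.gooseeker.com
For a Witkey project search built with this tool, see: http://www.metaseeker.cn/projectsearch/home.htm
MetaSeeker is a toolkit for defining data schemas and extracting structured data from the Web. It provides convenient methods to define schemas of Web pages, to generate wrappers without coding, and to extract data effectively. Its distinguishing feature is that all of the above is done in a distributed environment and in a collaborative manner.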
MetaSeeker is a valuable toolkit for individuals or enterprises that plan to provide the following services:
1. vertical search engines (also called professional search engines)
2. information aggregation portals
3. Mashup services
4. intelligent agents
5. personalized information retrieval systems
6. information mining facilities
To download the whole toolkit, visit: http://www.gooseeker.com
Platform: |
Size: 148480 |
Author: Fuller Hua |
Hits:
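The schema-then-wrapper idea in the entry above (define which fields a page carries, then extract them without writing page-specific code) can be illustrated with a small Python sketch. This is not MetaSeeker's rule format or API, which the listing does not describe. The `SCHEMA` mapping, the class names `item-title` and `item-price`, and the `extract` function are all hypothetical, chosen only to show a declarative schema driving a generic extractor.

```python
from html.parser import HTMLParser

# Hypothetical schema: output field name -> CSS class of the element holding it.
SCHEMA = {"title": "item-title", "price": "item-price"}

# Elements that never have a closing tag; ignored so nesting depth stays correct.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class SchemaExtractor(HTMLParser):
    """Collects the text of elements whose class matches a schema entry."""

    def __init__(self, schema):
        super().__init__()
        self.class_to_field = {cls: field for field, cls in schema.items()}
        self.current_field = None  # field currently being captured, if any
        self.depth = 0             # nesting depth inside the captured element
        self.record = {}

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        if self.current_field is not None:
            self.depth += 1  # nested element inside the captured one
            return
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in self.class_to_field:
                self.current_field = self.class_to_field[cls]
                self.depth = 1
                self.record.setdefault(self.current_field, "")
                break

    def handle_endtag(self, tag):
        if tag in VOID or self.current_field is None:
            return
        self.depth -= 1
        if self.depth == 0:
            self.current_field = None  # captured element closed

    def handle_data(self, data):
        if self.current_field is not None:
            self.record[self.current_field] += data


def extract(html, schema=SCHEMA):
    """Return a dict of schema fields found in the given HTML string."""
    parser = SchemaExtractor(schema)
    parser.feed(html)
    parser.close()
    return {field: text.strip() for field, text in parser.record.items()}
```

Pointing the same extractor at a differently structured site would mean changing only the schema mapping, which is the property a GUI-based rule generator would presumably automate.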
Description: Main application areas:
• Vertical search: also known as professional search. High-speed, high-volume, precise crawling is the strength of DataScraper, a focused web crawler. It runs unattended, self-scheduled periodic batch collection 24 hours a day, 7 days a week, and its resume-from-breakpoint support and software watchdog (Watch Dog) let you rest easy.
• Mobile Internet: mobile search, mobile mashups, mobile social networking, and mobile e-commerce all depend on structured data content. DataScraper collects content efficiently in real time and outputs the results as XML files rich in semantic metadata, which enables automated data integration and processing and overcomes the obstacles of small-screen display and high-precision information retrieval. The mobile Internet is not a subset of the Web but all of it, and MetaSeeker builds the bridge.
• Enterprise competitive intelligence gathering / data mining: commonly known as business intelligence. Noise filtering and structured conversion ensure the accuracy and timeliness of the data. A unique wide-area distributed architecture gives DataScraper unmatched reach for intelligence gathering: AJAX/JavaScript dynamic pages, server-generated dynamic pages, static pages, and all kinds of authentication mechanisms are handled alike. It is far ahead of other products in microblog data collection and public opinion monitoring.
Platform: |
Size: 4218880 |
Author: 陈东 |
Hits:
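The resume-from-breakpoint behaviour mentioned in the entry above can be sketched as a batch loop that records each finished URL in a checkpoint file. This is a minimal illustration under stated assumptions, not DataScraper's actual mechanism: the `run_batch` function, the caller-supplied `fetch` callable, and the JSON checkpoint layout are hypothetical.

```python
import json
from pathlib import Path


def run_batch(urls, fetch, checkpoint_path):
    """Process urls in order, skipping those already recorded in the checkpoint.

    `fetch` is any callable taking a URL (hypothetical; supplied by the caller).
    """
    path = Path(checkpoint_path)
    # Load the set of URLs completed by earlier runs, if a checkpoint exists.
    done = set(json.loads(path.read_text())) if path.exists() else set()
    for url in urls:
        if url in done:
            continue  # finished in an earlier run
        fetch(url)  # may raise; the checkpoint still reflects prior progress
        done.add(url)
        path.write_text(json.dumps(sorted(done)))  # persist after every success
    return done
```

If a run crashes partway through, calling `run_batch` again with the same checkpoint path continues from the first unfinished URL. A watchdog process would, under this sketch, only need to restart the same call.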
Description: MetaSeeker Toolkit V3 is web scraping / data extraction / information extraction software developed in-house by the GooSeeker team. Proven in practice through several Internet waves, including vertical search and SNS, it has reached version V3 and comes in an enterprise edition and an online edition. Users who do not want to pay the expensive enterprise edition fee can download and use the online edition for free. MetaSeeker Toolkit V3 includes the following tools:
1. MetaStudio, a web data structure definition tool that defines website data extraction rules through a graphical interface, with no programming
2. DataScraper, a data extraction tool that can crawl web content continuously and in bulk. It is not an ordinary web crawler, but one whose adaptability …
Platform: |
Size: 326656 |
Author: highyun |
Hits: