Description: Program instructions:
1. Open \Sina_spider1\Sina_spider1\
2. Edit spiders.py with Notepad++ or a Python 2.7 editor
3. Enter the Sina Weibo account and password (purchased from Taobao) after the following code:
class Spider(CrawlSpider):
    name = "sinaSpider"
    host = "http://weibo.cn"
    start_urls = [
4. Run \Sina_spider1\Begin.py
5. Crawling starts; the final data is saved to text2.txt
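Step 5 says the crawled data ends up in text2.txt. How the project's own pipelines.py does this is not shown here, so the following is only a minimal sketch of one way a Scrapy-style item pipeline could append items to that file. The class name TextFilePipeline, the one-JSON-object-per-line layout, and the use of Python 3 (the original targets Python 2.7) are all assumptions made for illustration.

```python
import json


class TextFilePipeline(object):
    """Hypothetical pipeline: appends each scraped item to a text file."""

    def __init__(self, path="text2.txt"):
        # The file name comes from step 5; everything else is assumed.
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Scrapy calls this once when the crawl starts.
        self.file = open(self.path, "a", encoding="utf-8")

    def process_item(self, item, spider):
        # Write one JSON object per line; ensure_ascii=False keeps
        # Chinese text readable instead of \uXXXX escapes.
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        return item  # hand the item on to any later pipeline

    def close_spider(self, spider):
        # Scrapy calls this once when the crawl ends.
        if self.file is not None:
            self.file.close()
            self.file = None
```

If a class like this lived in Sina_spider1\pipelines.py, it would be enabled through Scrapy's ITEM_PIPELINES setting in settings.py, for example with the entry "Sina_spider1.pipelines.TextFilePipeline": 300.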
File list (check whether you need any of these files):
程序\Begin.py
程序\scrapy.cfg
程序\Sina_spider1\cookies.py
程序\Sina_spider1\items.py
程序\Sina_spider1\middleware.py
程序\Sina_spider1\pipelines.py
程序\Sina_spider1\settings.py
程序\Sina_spider1\spiders\.idea\.name
程序\Sina_spider1\spiders\.idea\encodings.xml
程序\Sina_spider1\spiders\.idea\misc.xml
程序\Sina_spider1\spiders\.idea\modules.xml
程序\Sina_spider1\spiders\.idea\spiders.iml
程序\Sina_spider1\spiders\.idea\workspace.xml
程序\Sina_spider1\spiders\spiders.py
程序\Sina_spider1\spiders\__init__.py
程序\Sina_spider1\user_agents.py
程序\Sina_spider1\yumdama.py
程序\Sina_spider1\__init__.py
程序\Sina_spider1\spiders\.idea
程序\Sina_spider1\spiders
程序\Sina_spider1
程序