# Installation

```shell
pip install scrapy
```

# Quick start

## Create a project

```shell
scrapy startproject douban
```

## Create a spider

```shell
scrapy genspider douban_top250 book.douban.com
```

## Run the spider

```shell
scrapy crawl douban_top250
```

## Run multiple spiders

In the project root, the same directory as `scrapy.cfg`, create a new `.py` file:

```python
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

process = CrawlerProcess(get_project_settings())

# The argument is the spider's name; call crawl() once per spider
process.crawl('douban_***')
process.start()
```

## Export data

```shell
scrapy crawl douban_top250 -o output.json
```

## Use the scrapy shell

```shell
scrapy shell "http://example.com"
```
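
To check what the export step actually wrote, the JSON file can be read back with the standard library. The following is a minimal sketch, assuming the file is a JSON feed produced by `-o output.json` (a single array with one object per item). The helper names `load_items` and `summarize` are hypothetical, and the item fields depend entirely on what your spider yields.

```python
import json
from pathlib import Path


def load_items(path):
    """Read a JSON feed such as the one written by `scrapy crawl ... -o output.json`."""
    # Assumption: the feed is a single JSON array with one object per scraped item.
    with Path(path).open(encoding="utf-8") as f:
        items = json.load(f)
    if not isinstance(items, list):
        raise ValueError(f"expected a JSON array of items, got {type(items).__name__}")
    return items


def summarize(items):
    """Return the number of items and the sorted list of field names seen."""
    fields = sorted({key for item in items for key in item})
    return len(items), fields
```

For example, `count, fields = summarize(load_items("output.json"))` gives a quick view of how many items were scraped and which fields they carry, which is handy for spotting a selector that silently returned nothing.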