# Installation

```shell
pip install scrapy
```

# Quick start

## Create a project

```shell
scrapy startproject douban
```

## Create a spider

```shell
scrapy genspider douban_top250 book.douban.com
```

## Run the spider

```shell
scrapy crawl douban_top250
```

## Run multiple spiders

In the project root (the directory that contains `scrapy.cfg`), create a new `.py` file:

```python
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

process = CrawlerProcess(get_project_settings())
# The argument is the spider's name; call crawl() once per spider
process.crawl('douban_***')
# Starts every scheduled spider and blocks until all of them finish
process.start()
```

## Export data

```shell
scrapy crawl douban_top250 -o output.json
```

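With a `.json` file name, the export is a single JSON array containing one object per scraped item, so it can be read back with the standard library. The sketch below assumes the default file name used above; `load_items` is a hypothetical helper, and the available keys depend on what your spider yields.

```python
import json
from pathlib import Path


def load_items(path="output.json"):
    """Return the exported items as a list of dicts."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)
```

For example, `len(load_items())` gives the number of scraped items. Note that `-o` appends to an existing file, which leaves a `.json` export unparseable after a second run; delete the file between runs, or use `-O` to overwrite if your Scrapy version supports it.
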
## Use the Scrapy shell

```shell
scrapy shell "http://example.com"
```
