scrapy/scrapy

GitHub: scrapy/scrapy

A high-performance Python framework for web crawling and data collection, designed for extracting structured data from websites in bulk.

Stars: 60502 | Forks: 11331

|logo|

.. |logo| image:: https://raw.githubusercontent.com/scrapy/scrapy/master/docs/_static/logo.svg
   :target: https://scrapy.org
   :alt: Scrapy
   :width: 480px

|version| |python_version| |ubuntu| |macos| |windows| |coverage| |conda| |deepwiki|

.. |version| image:: https://img.shields.io/pypi/v/Scrapy.svg
   :target: https://pypi.org/pypi/Scrapy
   :alt: PyPI Version

.. |python_version| image:: https://img.shields.io/pypi/pyversions/Scrapy.svg
   :target: https://pypi.org/pypi/Scrapy
   :alt: Supported Python Versions

.. |ubuntu| image:: https://github.com/scrapy/scrapy/workflows/Ubuntu/badge.svg
   :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AUbuntu
   :alt: Ubuntu

.. |macos| image:: https://github.com/scrapy/scrapy/workflows/macOS/badge.svg
   :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AmacOS
   :alt: macOS

.. |windows| image:: https://github.com/scrapy/scrapy/workflows/Windows/badge.svg
   :target: https://github.com/scrapy/scrapy/actions?query=workflow%3AWindows
   :alt: Windows

.. |coverage| image:: https://img.shields.io/codecov/c/github/scrapy/scrapy/master.svg
   :target: https://codecov.io/github/scrapy/scrapy?branch=master
   :alt: Coverage report

.. |conda| image:: https://anaconda.org/conda-forge/scrapy/badges/version.svg
   :target: https://anaconda.org/conda-forge/scrapy
   :alt: Conda Version

.. |deepwiki| image:: https://deepwiki.com/badge.svg
   :target: https://deepwiki.com/scrapy/scrapy
   :alt: Ask DeepWiki

Scrapy_ is a web scraping framework for extracting structured data from websites.
It is cross-platform and requires Python 3.10+. It is maintained by Zyte_
(formerly Scrapinghub) and `many other contributors`_.

.. _many other contributors: https://github.com/scrapy/scrapy/graphs/contributors
.. _Scrapy: https://scrapy.org/
.. _Zyte: https://www.zyte.com/

Install with:

.. code:: bash

    pip install scrapy

Then follow the documentation_ to learn how to use it.

.. _documentation: https://docs.scrapy.org/en/latest/

If you wish to contribute, see Contributing_.

.. _Contributing: https://docs.scrapy.org/en/master/contributing.html
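Tags: HTTP tools, Python, Scrapy, Web Scraping, Internet, command and control, big data, development framework, open source, async I/O, data extraction, data leakage, data collection, text parsing, backdoor-free, crawler framework, directory scanning, network debugging, web scraping tools, automation, reverse-engineering tools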