# webspider **Repository Path**: zhexiao/webspider ## Basic Information - **Project Name**: webspider - **Description**: Quick search web link content by keyword - **Primary Language**: Python - **License**: Apache-2.0 - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2018-04-18 - **Last Updated**: 2020-12-19 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # webspider ## Required ``` > sudo apt-get install python3 python3-pip > sudo pip3 install Scrapy ``` ## Example ``` > cd webspider > scrapy crawl links -L WARNING \ -a url_tpl="http://example.com/forum-{page}.html" \ -a keyword=wuhan \ -a start_page=1 \ -a end_page=5 ``` ## Parameters - **url_tpl**: request url and replace the pagniation number as *{page}*. For eaxmple: - page 1: http://example.com/forum-1.html - page 2: http://example.com/forum-2.html - page 3: http://example.com/forum-3.html - url_tpl: http://example.com/forum-{page}.html - **keyword**: the keyword in the link content - **start_page**: request *url_tpl* link start page number - **end_page**: request *url_tpl* link end page number