Site from list 1 #239 - PR
Unable to crawl; getting various response errors, e.g.:
2019-06-11 15:31:15 [scrapy.core.engine] INFO: Spider opened
2019-06-11 15:31:15 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2019-06-11 15:31:15 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2019-06-11 15:32:15 [scrapy.extensions.logstats] INFO: Crawled 371 pages (at 371 pages/min), scraped 0 items (at 0 items/min)
2019-06-11 15:33:15 [scrapy.extensions.logstats] INFO: Crawled 724 pages (at 353 pages/min), scraped 0 items (at 0 items/min)
2019-06-11 15:34:15 [scrapy.extensions.logstats] INFO: Crawled 1032 pages (at 308 pages/min), scraped 0 items (at 0 items/min)
2019-06-11 15:35:01 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.pressherald.com/2018/04/07/police-say-westbrook-armed-robbery-likely-linked-to-others/bquimby@pressherald.com>: HTTP status code is not handled or not allowed
2019-06-11 15:35:15 [scrapy.extensions.logstats] INFO: Crawled 1341 pages (at 309 pages/min), scraped 0 items (at 0 items/min)
2019-06-11 15:36:15 [scrapy.extensions.logstats] INFO: Crawled 1647 pages (at 306 pages/min), scraped 0 items (at 0 items/min)
2019-06-11 15:36:58 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.pressherald.com/2018/11/03/world-war-i-sacrifices-of-mainers-go-digital/digitalmaine.com>: HTTP status code is not handled or not allowed
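Both 404 URLs in the log end in something that looks like an email address or a bare domain appended to an article URL. One possible explanation (an assumption, not confirmed anywhere in this issue) is that the articles contain hrefs without a scheme, such as `bquimby@pressherald.com` instead of `mailto:bquimby@pressherald.com`, which then get resolved as relative links. The sketch below uses only the standard library to reproduce that resolution. The helper `looks_like_bare_address` and its short list of domain suffixes are hypothetical, for illustration only, and not part of Scrapy or of this project.

```python
from urllib.parse import urljoin, urlparse

# Article URLs taken from the 404 lines in the log, with the trailing
# segment removed (assumption: these are the pages the links came from).
PAGE_1 = "https://www.pressherald.com/2018/04/07/police-say-westbrook-armed-robbery-likely-linked-to-others/"
PAGE_2 = "https://www.pressherald.com/2018/11/03/world-war-i-sacrifices-of-mainers-go-digital/"


def looks_like_bare_address(href: str) -> bool:
    """Hypothetical heuristic: True if a scheme-less relative href is
    probably an email address or a bare domain rather than a real path."""
    parsed = urlparse(href)
    # Absolute URLs, mailto: links, and rooted/fragment/query/dot-relative
    # hrefs are left alone.
    if parsed.scheme or parsed.netloc or href.startswith(("/", "#", "?", ".")):
        return False
    first_segment = href.split("/", 1)[0].lower()
    # The suffix list is illustrative only, not exhaustive.
    return "@" in first_segment or first_segment.endswith((".com", ".org", ".net"))


# A scheme-less href is treated as a relative path and appended to the page URL.
print(urljoin(PAGE_1, "bquimby@pressherald.com"))
print(urljoin(PAGE_2, "digitalmaine.com"))
print(looks_like_bare_address("bquimby@pressherald.com"))
```

If this is what is happening, a check along these lines could be applied to extracted hrefs before requests are scheduled. Separately, the zero scraped items over 1,600+ crawled pages suggests the item extraction itself is not matching, which these 404s alone would not explain.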