How to make scrapy-splash work with frontera? #333
Comments
OK, I've noticed that the log on my computer now shows what looks like a timeout error.
Hi @DiscipleOfOne, the right approach is to follow this guide, https://github.com/scrapy-plugins/scrapy-splash#configuration , and use Scrapy.Request with
I have added the following lines to my spider.
From my debugging, these lines are only executed when the seed is run. Frontera creates the requests and responses outside of the spider, specifically in frontera/contrib/requests/converters.py, where I found these lines, which are mentioned in #300.
I've edited those two functions in virtualenv/lib to see if I could change the behavior, simply switching to SplashRequest and SplashResponse, but the problem is that the parse method in the spider is no longer called. Am I doing this wrong? I want my crawler to be efficient, so I'd like to avoid downloading pages more than once if I can.
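For reference, here is a minimal sketch of one way to request Splash rendering without SplashRequest. It assumes the `meta['splash']` form documented in the scrapy-splash README (a `splash` key holding `args` and `endpoint`) and that the scrapy-splash middlewares from the configuration guide are enabled. The helper name `splash_meta`, its defaults, and the `depth` key in the usage note are hypothetical, and whether the meta survives Frontera's request conversion has not been verified here.

```python
from typing import Any, Dict, Optional


def splash_meta(
    wait: float = 0.5,
    endpoint: str = "render.html",
    base_meta: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return a copy of base_meta with a 'splash' entry added.

    The 'splash' key with 'args' and 'endpoint' follows the scrapy-splash
    README; this helper itself is hypothetical and not part of any library.
    """
    # Copy so the caller's dict is never mutated.
    meta = dict(base_meta or {})
    meta["splash"] = {
        "args": {"wait": wait},  # arguments forwarded to the Splash HTTP API
        "endpoint": endpoint,    # Splash endpoint used to render the page
    }
    return meta


# Usage inside a spider callback (requires scrapy and the scrapy-splash
# middlewares to be configured in settings.py):
#
#     yield scrapy.Request(url, callback=self.parse, meta=splash_meta())
```

Because the Splash options live in plain request meta rather than in a Request subclass, any code that rebuilds requests from stored meta has a chance of carrying them along, e.g. `splash_meta(wait=1.0, base_meta={"depth": 2})` keeps the existing `depth` key next to the new `splash` entry.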