feat: implement a way to stop crawler from the user function #2777

Open
barjin opened this issue Dec 18, 2024 · 0 comments
Labels
t-tooling Issues with this label are in the ownership of the tooling team.


barjin commented Dec 18, 2024

This is a parity-tracking issue for this PR in Crawlee for Python: apify/crawlee-python#651

Currently, the only way to stop a crawler instance is to call the BasicCrawler.teardown() method, which is both undocumented (it carries the @ignore TypeDoc tag) and poorly named.

The crawler.stop() implementation in Crawlee for Python makes the AutoscaledPool stop taking new tasks while letting the ones currently in progress finish gracefully. This differs from the AutoscaledPool.abort method (called by crawler.teardown()), which, according to its docstring, abandons the running tasks on the spot ("all running tasks will be left in their current state").
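To make the difference concrete, here is a minimal sketch of the two behaviors. It assumes a hypothetical, tick-based `MiniPool` class written only for this illustration; it is not Crawlee's AutoscaledPool and does not use any Crawlee API. Every name in it (`MiniPool`, `simulate`, `ticksLeft`) is made up for the example.

```typescript
// Hypothetical, simplified model of a task pool. NOT Crawlee's AutoscaledPool.
type Task = { id: string; ticksLeft: number };

class MiniPool {
    private queue: Task[];
    private running: Task[] = [];
    private accepting = true;
    readonly finished: string[] = [];
    readonly abandoned: string[] = [];

    constructor(tasks: Task[], private readonly concurrency: number) {
        // Copy the tasks so the caller's objects are not mutated.
        this.queue = tasks.map((t) => ({ ...t }));
    }

    /** Advance one tick: start queued tasks (if still accepting), then progress running ones. */
    tick(): void {
        while (this.accepting && this.running.length < this.concurrency && this.queue.length > 0) {
            this.running.push(this.queue.shift()!);
        }
        for (const task of this.running) task.ticksLeft -= 1;
        this.finished.push(...this.running.filter((t) => t.ticksLeft <= 0).map((t) => t.id));
        this.running = this.running.filter((t) => t.ticksLeft > 0);
    }

    /** Graceful stop: take no new tasks, but let the running ones finish. */
    stop(): void {
        this.accepting = false;
    }

    /** Hard abort: take no new tasks and leave the running ones in their current state. */
    abort(): void {
        this.accepting = false;
        this.abandoned.push(...this.running.map((t) => t.id));
        this.running = [];
    }

    /** True when nothing is running and nothing more will be started. */
    get idle(): boolean {
        return this.running.length === 0 && (!this.accepting || this.queue.length === 0);
    }
}

function simulate(mode: 'stop' | 'abort'): { finished: string[]; abandoned: string[] } {
    const pool = new MiniPool(
        [
            { id: 'a', ticksLeft: 2 },
            { id: 'b', ticksLeft: 3 },
            { id: 'c', ticksLeft: 1 },
        ],
        2,
    );
    pool.tick(); // 'a' and 'b' start running; 'c' is still queued
    if (mode === 'stop') pool.stop();
    else pool.abort();
    while (!pool.idle) pool.tick();
    return { finished: pool.finished, abandoned: pool.abandoned };
}
```

In this model, `simulate('stop')` lets 'a' and 'b' run to completion and never starts 'c', whereas `simulate('abort')` finishes nothing and reports 'a' and 'b' as abandoned.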

More context / discussion at https://apify.slack.com/archives/CD0SF6KD4/p1734526549266519

github-actions bot added the t-tooling label Dec 18, 2024