Releases: apify/crawlee
v3.10.3
3.10.3 (2024-06-07)
Bug Fixes
- adaptive-crawler: log only once for the committed request handler execution (#2524) (533bd3f)
- increase timeout for retiring inactive browsers (#2523) (195f176)
- respect implicit router when no requestHandler is provided in AdaptiveCrawler (#2518) (31083aa)
- revert the scaling steps back to 5% (5bf32f8)
Features
v3.10.2
v3.10.1
3.10.1 (2024-05-23)
Bug Fixes
- adjust URL_NO_COMMAS_REGEX regexp to allow single character hostnames (#2492) (ec802e8), closes #2487
- investigate and temp fix for possible 0-concurrency bug in RQv2 (#2494) (4ebe820)
- provide URLs to the error snapshot (#2482) (7f64145), closes /github.com/apify/apify-sdk-js/blob/master/packages/apify/src/key_value_store.ts#L25
v3.10.0
3.10.0 (2024-05-16)
Bug Fixes
- EnqueueStrategy.All erroring with links using unsupported protocols (#2389) (8db3908)
- conversion between tough cookies and browser pool cookies (#2443) (74f73ab)
- fire local SystemInfo events every second (#2454) (1fa9a66)
- use createSessionFunction when loading Session from persisted state (#2444) (3c56b4c)
- do not drop statistics on migration/resurrection/resume (#2462) (8ce7dd4)
- double tier decrement in tiered proxy (#2468) (3a8204b)
- Fixed double extension for screenshots (#2419) (e8b39c4), closes #1980
- malformed sitemap url when sitemap index child contains querystring (#2430) (e4cd41c)
- return true when robots.isAllowed returns undefined (#2439) (6f541f8), closes #2437
- sitemap content-type check breaks on content-type parameters (#2442) (db7d372)
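The sitemap content-type fix above concerns headers that carry parameters, such as a charset suffix. A minimal sketch of parameter-tolerant matching follows. The helper names and the set of accepted media types are assumptions made for illustration; this is not Crawlee's actual implementation.

```typescript
// Hypothetical helper: drop parameters (e.g. "; charset=utf-8") and normalize case.
function mediaTypeOf(contentType: string): string {
    const [type = ''] = contentType.split(';');
    return type.trim().toLowerCase();
}

// Assumed set of XML media types, for illustration only.
const XML_TYPES = new Set(['application/xml', 'text/xml']);

// Hypothetical check: true when the header names an XML media type,
// regardless of any parameters that follow it.
function looksLikeXmlSitemap(contentType: string | undefined): boolean {
    if (contentType === undefined) return false;
    return XML_TYPES.has(mediaTypeOf(contentType));
}
```

Comparing only the media type, rather than the whole header string, keeps the check stable when a server appends parameters.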
Features
- add FileDownload "crawler" (#2435) (d73756b)
- implement ErrorSnapshotter for error context capture (#2332) (e861dfd), closes #2280
- make RequestQueue v2 the default queue, see more on Apify blog (#2390) (41ae8ab), closes #2388
Performance Improvements
v3.9.2
v3.9.1
v3.9.0
3.9.0 (2024-04-10)
Bug Fixes
- include actual key in error message of KVS' setValue (#2411) (9089bf1)
- notify autoscaled pool about newly added requests (#2400) (a90177d)
- puppeteer: allow passing networkidle to waitUntil in gotoExtended (#2399) (5d0030d), closes #2398
- sitemaps support application/xml (#2408) (cbcf47a)
Features
v3.8.2
3.8.2 (2024-03-21)
Bug Fixes
- core: solve possible dead locks in RequestQueueV2 (#2376) (ffba095)
- correctly report gzip decompression errors (#2368) (84a2f17)
- puppeteer: improve detection of older versions (98d4e86), closes #2370
- use 0 (number) instead of false as default for sessionRotationCount (#2372) (667a3e7)
Features
v3.8.1
v3.8.0
3.8.0 (2024-02-21)
Bug Fixes
- createRequests works correctly with exclude (and nothing else) (#2321) (048db09)
- puppeteer: add 'process' to the browser bound methods (#2329) (2750ba6)
- puppeteer: replace page.waitForTimeout() with sleep() (52d7219), closes #2335
- puppeteer: support puppeteer@v22 (#2337) (3cc360a)
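A timer-based delay of the kind the waitForTimeout() fix switches to can be sketched as below. This is a standalone illustration: the local sleep function and pauseBeforeNextStep are defined here for the example and are not copied from Crawlee's source.

```typescript
// Promise-based delay: resolves after roughly `ms` milliseconds.
function sleep(ms: number): Promise<void> {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical usage inside an async handler, at a point where
// page.waitForTimeout(500) might previously have been called.
async function pauseBeforeNextStep(): Promise<void> {
    await sleep(500);
}
```

Because the delay does not depend on a page object, the same helper works regardless of which browser library version is installed.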
Features
- KeyValueStore.recordExists() (#2339) (8507a65)
- accessing crawler state, key-value store and named datasets via crawling context (#2283) (58dd5fc)
- adaptive playwright crawler (#2316) (8e4218a)
- add Sitemap.tryCommonNames to check well known sitemap locations (#2311) (85589f1), closes #2307
- core: add userAgent parameter to RobotsFile.isAllowed() + RobotsFile.from() helper (#2338) (343c159)
- Support plain-text sitemap files (sitemap.txt) (#2315) (0bee7da)
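A plain-text sitemap, as in the sitemap.txt item above, lists one URL per line. The sketch below shows one way such a file could be parsed. The function name and the decision to skip non-http(s) and unparseable lines are assumptions for illustration, not a description of how Crawlee handles these files.

```typescript
// Hypothetical parser for a plain-text sitemap: one URL per line.
// Assumption: blank lines and anything that is not an absolute http(s) URL are skipped.
function parseTextSitemap(body: string): string[] {
    const urls: string[] = [];
    for (const line of body.split(/\r?\n/)) {
        const candidate = line.trim();
        if (candidate === '') continue;
        try {
            const url = new URL(candidate);
            if (url.protocol === 'http:' || url.protocol === 'https:') {
                urls.push(url.href);
            }
        } catch {
            // Not a parseable absolute URL; ignore the line.
        }
    }
    return urls;
}
```

Validating each line with the URL constructor keeps stray text out of the result without needing a hand-written pattern.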