Releases: apify/crawlee

v3.10.3

07 Jun 11:53

3.10.3 (2024-06-07)

Bug Fixes

  • adaptive-crawler: log only once for the committed request handler execution (#2524) (533bd3f)
  • increase timeout for retiring inactive browsers (#2523) (195f176)
  • respect implicit router when no requestHandler is provided in AdaptiveCrawler (#2518) (31083aa)
  • revert the scaling steps back to 5% (5bf32f8)
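The 5% scaling step mentioned above can be pictured with a short sketch. This is an illustration only, not Crawlee's actual autoscaling code: the function names, the `STEP_RATIO` constant, and the rounding rule (round the step up so that concurrency always moves by at least 1) are assumptions made for the example.

```typescript
// Hypothetical sketch of scaling concurrency in 5% steps.
// STEP_RATIO, scaleUp and scaleDown are illustrative names, not Crawlee APIs.
const STEP_RATIO = 0.05;

// Raise the desired concurrency by 5% (at least 1), capped at maxConcurrency.
function scaleUp(desired: number, maxConcurrency: number): number {
    const step = Math.ceil(desired * STEP_RATIO);
    return Math.min(maxConcurrency, desired + step);
}

// Lower the desired concurrency by 5% (at least 1), floored at minConcurrency.
function scaleDown(desired: number, minConcurrency: number): number {
    const step = Math.ceil(desired * STEP_RATIO);
    return Math.max(minConcurrency, desired - step);
}
```

With a small step ratio the pool approaches its limits gradually, which is presumably why the change went back to 5%; the changelog entry itself does not state the motivation.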

Features

  • add waitForSelector context helper + parseWithCheerio in adaptive crawler (#2522) (6f88e73)
  • log desired concurrency in the default status message (9f0b796)

v3.10.2

03 Jun 09:07

3.10.2 (2024-06-03)

Bug Fixes

Features

v3.10.1

23 May 10:45

3.10.1 (2024-05-23)

Bug Fixes

v3.10.0

16 May 13:41

3.10.0 (2024-05-16)

Bug Fixes

  • EnqueueStrategy.All erroring with links using unsupported protocols (#2389) (8db3908)
  • conversion between tough cookies and browser pool cookies (#2443) (74f73ab)
  • fire local SystemInfo events every second (#2454) (1fa9a66)
  • use createSessionFunction when loading Session from persisted state (#2444) (3c56b4c)
  • do not drop statistics on migration/resurrection/resume (#2462) (8ce7dd4)
  • double tier decrement in tiered proxy (#2468) (3a8204b)
  • fix double extension for screenshots (#2419) (e8b39c4), closes #1980
  • malformed sitemap url when sitemap index child contains querystring (#2430) (e4cd41c)
  • return true when robots.isAllowed returns undefined (#2439) (6f541f8), closes #2437
  • sitemap content-type check breaks on content-type parameters (#2442) (db7d372)
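The sitemap content-type fix in the list above concerns headers that carry parameters, such as `application/xml; charset=UTF-8`. A minimal sketch of a parameter-tolerant check follows. It is not Crawlee's implementation: `mediaType`, `looksLikeXmlSitemap`, and the set of accepted types are hypothetical and chosen only to show the idea.

```typescript
// Hypothetical helpers, not part of Crawlee's API.
// The accepted media types below are an assumption for illustration.
const XML_TYPES = new Set(['application/xml', 'text/xml']);

// Strip parameters: "application/xml; charset=UTF-8" becomes "application/xml".
function mediaType(contentType: string | undefined): string {
    const [type = ''] = (contentType ?? '').split(';');
    return type.trim().toLowerCase();
}

// Compare only the bare media type, so parameters cannot break the check.
function looksLikeXmlSitemap(contentType: string | undefined): boolean {
    return XML_TYPES.has(mediaType(contentType));
}
```

Comparing the whole header value against a fixed string fails as soon as a server appends `charset` or another parameter, which is the kind of breakage the entry describes.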

Features

Performance Improvements

  • improve scaling based on memory (#2459) (2d5d443)
  • optimize RequestList memory footprint (#2466) (12210bd)
  • optimize adding large numbers of requests via crawler.addRequests() (#2456) (6da86a8)

v3.9.2

17 Apr 13:14

3.9.2 (2024-04-17)

Bug Fixes

Features

v3.9.1

11 Apr 09:02

3.9.1 (2024-04-11)

Features

v3.9.0

10 Apr 11:54

3.9.0 (2024-04-10)

Bug Fixes

  • include actual key in error message of KVS' setValue (#2411) (9089bf1)
  • notify autoscaled pool about newly added requests (#2400) (a90177d)
  • puppeteer: allow passing networkidle to waitUntil in gotoExtended (#2399) (5d0030d), closes #2398
  • sitemaps support application/xml (#2408) (cbcf47a)

Features

v3.8.2

21 Mar 16:23

3.8.2 (2024-03-21)

Bug Fixes

  • core: solve possible deadlocks in RequestQueueV2 (#2376) (ffba095)
  • correctly report gzip decompression errors (#2368) (84a2f17)
  • puppeteer: improve detection of older versions (98d4e86), closes #2370
  • use 0 (number) instead of false as default for sessionRotationCount (#2372) (667a3e7)

Features

  • implement global storage access checking and use it to prevent unwanted side effects in adaptive crawler (#2371) (fb3b7da), closes #2364

v3.8.1

22 Feb 14:06
Compare
Choose a tag to compare

3.8.1 (2024-02-22)

Bug Fixes

  • fix crawling context type in router.addHandler() (#2355) (d73c202)

v3.8.0

21 Feb 15:55

3.8.0 (2024-02-21)

Bug Fixes

  • createRequests works correctly with exclude (and nothing else) (#2321) (048db09)
  • puppeteer: add 'process' to the browser bound methods (#2329) (2750ba6)
  • puppeteer: replace page.waitForTimeout() with sleep() (52d7219), closes #2335
  • puppeteer: support puppeteer@v22 (#2337) (3cc360a)

Features

  • KeyValueStore.recordExists() (#2339) (8507a65)
  • accessing crawler state, key-value store and named datasets via crawling context (#2283) (58dd5fc)
  • adaptive playwright crawler (#2316) (8e4218a)
  • add Sitemap.tryCommonNames to check well known sitemap locations (#2311) (85589f1), closes #2307
  • core: add userAgent parameter to RobotsFile.isAllowed() + RobotsFile.from() helper (#2338) (343c159)
  • support plain-text sitemap files (sitemap.txt) (#2315) (0bee7da)
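A plain-text sitemap, as mentioned in the list above, is a file with one URL per line. The sketch below shows how such a file could be read. It is an assumption-laden illustration, not Crawlee's parser: `parsePlainTextSitemap` is a hypothetical name, and the rule of keeping only `http`/`https` lines is a choice made for the example.

```typescript
// Hypothetical parser for a sitemap.txt body; Crawlee's own handling may differ.
function parsePlainTextSitemap(body: string): string[] {
    return body
        .split(/\r?\n/)                                  // handle both LF and CRLF files
        .map((line) => line.trim())                      // drop surrounding whitespace
        .filter((line) => /^https?:\/\//i.test(line));   // keep only http(s) URLs
}

// Example usage with a small in-memory file.
const urls = parsePlainTextSitemap(
    'https://example.com/a\n\n  https://example.com/b  \r\nnot a url\n',
);
```

Here `urls` holds the two `example.com` entries; the blank line and the non-URL line are skipped.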