diff --git a/README.md b/README.md
index 0cb3b2a..8de24a5 100644
--- a/README.md
+++ b/README.md
@@ -92,5 +92,6 @@ Version 2 is significantly different from the initial release. The highlights of
 * ability to load class files(for custom DataAdatpers) from external jars/locations
 * ability to join (or re-join) a job in progress
 * ability to recover from coordinator failures
+* pluggable mechanism to track "seen pages" to handle massive crawls
 * pluggable mechanism to "fetch Pages" in SpiderThread (thus allowing non-web/filesystem extraction jobs)
 * piggyback heartbeats on other messages and only send HB if needed