Skip to content

Latest commit

 

History

History
10 lines (6 loc) · 486 Bytes

README.md

File metadata and controls

10 lines (6 loc) · 486 Bytes

storm-crawler

Low-latency, large scale crawler based on Storm (and maybe HBase). Merely a field for experimentation for now but hopefully we'll have some Spouts and Bolts that will be useful for building crawlers.

Install Maven and call : mvn clean assembly:assembly to generate the full jar

storm jar storm-crawler-0.1-SNAPSHOT-jar-with-dependencies.jar com.digitalpebble.storm.crawler.CrawlTopology crawl

Mailing list : http://groups.google.com/group/digitalpebble