Commit
1 parent 31d1df7, commit 7bef6f7
Showing 1 changed file with 1 addition and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
-# Sprouter: Dynamic Graph Processing over Data Streams at Scale | ||
+# Sprouter: Dynamic Graph Processing over Data Streams at Scale - Apache Spark | ||
|
||
Graph data is becoming dominant in many applications, such as social networks, targeted advertising, and web indexing. As a result, advances in machine learning and data mining techniques depend heavily on the ability to process this data structure efficiently and reliably. Despite the importance of processing dynamic graphs in real time, maintaining such graphs and processing them over data streams remains a challenge. We propose Sprouter, an end-to-end framework that stores very large graph data, allows real-time updates, and supports efficient complex analytics in addition to simple OLTP queries. We demonstrate that our framework ingests and processes streaming data efficiently using a scalable multi-cluster distributed architecture, applies incremental graph updates, and stores the dynamic graph for fast query performance. Experiments showed that the system can update graphs with up to 100 million edges in under 50 seconds on a moderately sized underlying cluster. This performance is essential for the framework to serve its purpose. |