Welcome! This project is a big data solution that collects new and old tweets for specific hashtags and displays the data in Kibana. It requires:
- Maven, to build the source files
- Docker and docker-compose, to start the containers
Use this command in the shell to compile the Java sources:
mvn clean compile assembly:single
Then copy the generated file into both the flink folder and the tweets folder. Rename it to ToElastic.jar inside the flink folder and to tweets.jar inside the tweets folder.
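The copy-and-rename step can be sketched as a small shell function. This is only a sketch: `stage_jar` is a hypothetical helper that is not part of the project, and the path of the generated jar is passed in as an argument because its exact name depends on the pom.xml.

```shell
# stage_jar is a hypothetical helper, not part of the project.
# $1 is the path of the jar produced by the Maven build; its exact
# name depends on the pom.xml, so it is passed in explicitly.
# Run it from the project root, where the flink and tweets folders live.
stage_jar() {
    jar="$1"
    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi
    # Same jar, two names: one copy per image build folder.
    cp "$jar" flink/ToElastic.jar &&
    cp "$jar" tweets/tweets.jar
}
```

Call it with the path of the jar that Maven generated. It refuses to copy anything if that file does not exist, which catches a failed or skipped build early.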
You need to build at least two container images. Inside the flink folder, run the command below:
docker build -t flink_custom .
and for the second image, from the tweets folder:
docker build -t tweets .
In the docker-compose.yaml file you can add the hashtags to listen for in the environment variable "hashtag" of the twitter service.
hashtags: #JuevesDeArquitectura, #Arquitectura
You can also configure the mode. There are three modes:
- all: get old and new tweets
- stream: get only new tweets (streaming)
- old: get all old tweets
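Since only three mode values are meaningful, a quick check before starting the containers can catch a typo. The function below is a minimal sketch: `check_mode` is a hypothetical helper, not part of the project, and it takes the mode value as an argument rather than assuming how the project reads it.

```shell
# check_mode is a hypothetical helper, not part of the project.
# It accepts only the three mode names listed above.
check_mode() {
    case "$1" in
        all|stream|old)
            return 0 ;;
        *)
            echo "unknown mode: $1 (expected all, stream or old)" >&2
            return 1 ;;
    esac
}
```

For example, `check_mode stream` succeeds silently, while `check_mode streaming` prints an error and returns a non-zero status.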
The remaining containers are downloaded from the official repository.
Note: It is important to give full permissions on the elastic folder, because the Elasticsearch data is saved there.
chmod 777 elastic
Before starting the big bang, it is important to add your Twitter credentials to the docker-compose.yaml file. Then run:
./up.sh
This starts the project and all of its containers. The project uses:
- Flink (1 jobmanager, 3 taskmanagers)
- Kafka (3 partitions, 2 replicas)
- Elasticsearch (2 instances)
- OpenNLP
- Twitter4J
- GetOldTweets-java
- Kibana
Inspiration: sergiokhayyat
Get old tweets: Jefferson Henrique