layout	title
page	Monitoring Events with Twarc

Monitoring Events Using twarc Filter and Search

This narrative guide shows how to start a twarc search and a twarc filter for an event and combine the results once the event is over. We are going to run this on a recent news event involving the Governor of Florida, but any topic will work.

Before You Start

Before starting this guide, make sure you have twarc installed and set up.

Filter and Search

Next, run twarc filter, which collects tweets from the Twitter stream that match the filter criteria, and twarc search, which collects tweets from the past seven days that match the search criteria. There are a couple of ways to do this, but the preferred one is to run each command in its own command line window.

twarc filter desantis > desantis_filter.jsonl

twarc search desantis > desantis_search.jsonl

The search command finishes on its own once it has retrieved the matching tweets, while the filter keeps running until it is stopped manually. Once the search has finished and the filter has been stopped at the end of the event, we can work on combining the two JSONL files.
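If two windows are inconvenient, both commands could also be launched from a single Python script. The following is only a minimal sketch, assuming twarc is on your PATH and already configured; `build_commands`, `start_collection`, and `stop_collection` are hypothetical helper names, not part of twarc.

```python
import subprocess


def build_commands(term):
    """Return (command, output file) pairs mirroring the two commands above."""
    return [
        (["twarc", "filter", term], f"{term}_filter.jsonl"),
        (["twarc", "search", term], f"{term}_search.jsonl"),
    ]


def start_collection(term):
    """Launch both twarc commands, redirecting each one's output to its file."""
    running = []
    for command, path in build_commands(term):
        out = open(path, "w")
        running.append((subprocess.Popen(command, stdout=out), out))
    return running


def stop_collection(running):
    """Terminate any process that is still running and close the output files."""
    for process, out in running:
        if process.poll() is None:
            process.terminate()
        process.wait()
        out.close()
```

Calling `start_collection("desantis")` would start both collections. Because `stop_collection` terminates anything still running, it should only be called after the search has finished and the event is over.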

Dehydrate

We will start by dehydrating the two collected datasets, which reduces each one to a plain list of tweet IDs.

twarc dehydrate desantis_filter.jsonl > desantis_filter.txt

twarc dehydrate desantis_search.jsonl > desantis_search.txt
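To make clear what this step produces, here is a rough Python equivalent. It is a sketch rather than twarc's own code, and it assumes each line of the JSONL file is one tweet object carrying an `id_str` field; the function name `dehydrate` is chosen here for illustration.

```python
import json


def dehydrate(jsonl_path, ids_path):
    """Write the ID of every tweet in a JSONL file to a text file, one per line."""
    count = 0
    with open(jsonl_path) as src, open(ids_path, "w") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            tweet = json.loads(line)
            dst.write(tweet["id_str"] + "\n")
            count += 1
    return count  # number of IDs written
```

The resulting text files are much smaller than the JSONL files, which makes them easy to merge in the next step.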

Combine

Now that the datasets have been dehydrated, we can use the Python program combine.py to combine them.

python utils/combine.py

Then answer the prompts as follows:

Enter the name of your filter txt: desantis_filter.txt
Enter the name of your search txt: desantis_search.txt
Enter the name of your output txt: desantis_fs.txt
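The source of combine.py is not reproduced in this guide, so the following is only a guess at the kind of merge involved: it assumes the two ID files are simply written one after the other into the output file. The function name `combine_ids` is hypothetical.

```python
def combine_ids(filter_path, search_path, output_path):
    """Copy the IDs from both input files into one output file, one ID per line."""
    total = 0
    with open(output_path, "w") as dst:
        for path in (filter_path, search_path):
            with open(path) as src:
                for line in src:
                    tweet_id = line.strip()
                    if tweet_id:  # ignore blank lines
                        dst.write(tweet_id + "\n")
                        total += 1
    return total  # number of IDs written, duplicates included
```

Under this assumption, an ID collected by both the filter and the search appears twice in the output, which is why a deduplication step is still needed later.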

Rehydrate

Now that we have our merged list of tweet IDs, we can rehydrate it.

twarc hydrate desantis_fs.txt > desantis_fs.jsonl

Deduplicate

Then we can run deduplicate.py to remove the tweets that were collected by both the filter and the search.

python utils/deduplicate.py desantis_fs.jsonl > desantis.jsonl
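The idea behind deduplication can be sketched in a few lines of Python. This is not the source of deduplicate.py; it is a minimal illustration that assumes every line is a tweet object with an `id_str` field and keeps the first occurrence of each ID.

```python
import json


def deduplicate(lines):
    """Yield each tweet line once, keeping the first occurrence of every ID."""
    seen = set()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        tweet_id = json.loads(line)["id_str"]
        if tweet_id not in seen:
            seen.add(tweet_id)
            yield line
```

To apply it to a file, open desantis_fs.jsonl for reading and pass the file object to `deduplicate`, then write the yielded lines to desantis.jsonl.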

The full command line session is shown in these screenshots:

[Image: DESANTIS1]

[Image: DESANTIS2]

Analysis

Now that we have a merged dataset without duplicate IDs, we can perform analysis using the Python utilities provided with twarc. See the twarc page for more information and links to the repository.
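As a quick first look at the data before reaching for those utilities, you could count which accounts tweeted most often. The sketch below is not one of twarc's bundled utilities; it assumes each tweet object has a `user` object with a `screen_name` field, and the function name `top_users` is made up for this example.

```python
import json
from collections import Counter


def top_users(lines, n=10):
    """Return the n accounts with the most tweets as (screen_name, count) pairs."""
    counts = Counter()
    for line in lines:
        if line.strip():
            counts[json.loads(line)["user"]["screen_name"]] += 1
    return counts.most_common(n)
```

Passing an open desantis.jsonl file to `top_users` would return the most active accounts in the merged dataset.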

You can download the DeSantis files from the twitter repo.
