
Resources for ACL'24 paper on 'A Community-Centric Perspective for Characterizing and Detecting Anti-Asian Violence-Provoking Speech'


claws-lab/violence-provoking-speech


(ACL 2024) A Community-Centric Perspective for Characterizing and Detecting Anti-Asian Violence-Provoking Speech

Paper: https://arxiv.org/abs/2407.15227
Webpage: https://claws-lab.github.io/violence-provoking-speech/
GitHub: https://github.com/claws-lab/violence-provoking-speech

Authors: Gaurav Verma (1), Rynaa Grover (1), Jiawei Zhou (1), Binny Mathew (2), Jordan Kraemer (2), Munmun De Choudhury (1), Srijan Kumar (1) | Affiliations: (1) Georgia Institute of Technology, (2) Anti-Defamation League

Data

Unlabeled Data

Of the 418,999 unlabeled Twitter posts collected for our study in February 2023, spanning January 1, 2020 to February 1, 2023, 121,684 (about 29%) could still be accessed on August 23, 2023. We make the Twitter IDs of these posts available in the file data/all_tweet_ids.txt. We expect the fraction of accessible posts to decrease over time. Possible reasons include deletion of the post by the user or by Twitter, deletion of the user's account (self-deletion, or deletion/blocking by moderators), and the user switching the account to private. We strongly recommend using the provided keywords to collect new data.
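As a starting point for working with the ID file, here is a minimal sketch for loading the IDs and grouping them into batches. It assumes data/all_tweet_ids.txt holds one numeric ID per line, which is not confirmed above. The helper names `load_tweet_ids` and `batched` and the batch size of 100 are illustrative, not part of this repository.

```python
from pathlib import Path


def load_tweet_ids(path):
    """Read tweet IDs from a text file, ASSUMING one numeric ID per line."""
    ids = []
    seen = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        tweet_id = line.strip()
        # Skip blank lines, non-numeric lines, and duplicates.
        if not tweet_id.isdigit() or tweet_id in seen:
            continue
        seen.add(tweet_id)
        ids.append(tweet_id)  # keep IDs as strings to avoid precision issues downstream
    return ids


def batched(ids, size=100):
    """Yield consecutive batches of IDs (the batch size is an assumption)."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]
```

For example, `for batch in batched(load_tweet_ids("data/all_tweet_ids.txt")): ...` would pass each batch to whatever lookup client you use. Since many posts may no longer be accessible, expect fewer results than IDs.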

Annotated Data

We make available the IDs of the Twitter posts that were identified as violence-provoking through community crowdsourcing. Of the 246 posts identified as violence-provoking, 94 (about 38%) could still be accessed on August 23, 2023. The Twitter IDs of these posts are available in the file data/violence_provoking_tweet_ids.txt.
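The retention figures for both datasets can be recomputed from the reported counts. The sketch below uses only numbers stated in this README; the function name `retention_rate` is illustrative.

```python
def retention_rate(accessible, collected):
    """Return the percentage of collected posts that were still accessible."""
    if collected <= 0:
        raise ValueError("collected must be positive")
    return 100.0 * accessible / collected


# Counts as reported in this README (accessibility checked on August 23, 2023).
unlabeled = retention_rate(121_684, 418_999)
annotated = retention_rate(94, 246)

print(f"unlabeled: {unlabeled:.1f}%")  # unlabeled: 29.0%
print(f"annotated: {annotated:.1f}%")  # annotated: 38.2%
```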

Keywords

We provide the keywords used to collect the data in the file data/keywords.py. These keywords also include the subset from the keyword expansion strategy we adopted; please refer to Tables 6 and 7 in the paper for more details. We recommend using these keywords to collect new data.
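If you collect new data with these keywords, one option is to join them into OR queries that stay under a length limit. The sketch below is an illustration under stated assumptions, not the collection code used in the paper. The variable names inside data/keywords.py are not documented here, so it takes a plain list of strings. The 512-character limit is an assumed value; check the limit of the search API you use.

```python
def quote_term(term):
    """Wrap multi-word phrases in double quotes so they match as a phrase."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def build_queries(keywords, max_len=512):
    """Group keywords into OR-joined query strings of at most max_len characters.

    max_len=512 is an ASSUMED limit, not one stated in this README.
    A single term longer than max_len still gets its own (over-long) query.
    """
    queries = []
    current = []
    for term in (quote_term(k) for k in keywords if k.strip()):
        candidate = " OR ".join(current + [term])
        if current and len(candidate) > max_len:
            # Adding this term would exceed the limit: close the current query.
            queries.append(" OR ".join(current))
            current = [term]
        else:
            current.append(term)
    if current:
        queries.append(" OR ".join(current))
    return queries


# Hypothetical placeholder terms; substitute the lists from data/keywords.py.
placeholder_keywords = ["alpha", "beta gamma", "delta"]
print(build_queries(placeholder_keywords))
```

Splitting into several shorter queries keeps each request valid, at the cost of possible duplicate posts across queries, so deduplicate results by post ID.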

Bibtex

If you find these resources helpful, please consider citing our paper:

@inproceedings{verma2024community,
  title={A Community-Centric Perspective for Characterizing and Detecting Anti-Asian Violence-Provoking Speech},
  author={Verma, Gaurav and Grover, Rynaa and Zhou, Jiawei and Mathew, Binny and Kraemer, Jordan and De Choudhury, Munmun and Kumar, Srijan},
  booktitle={62nd Annual Meeting of the Association for Computational Linguistics (ACL)},
  year={2024}
}
