(ACL 2024) A Community-Centric Perspective for Characterizing and Detecting Anti-Asian Violence-Provoking Speech

Paper: https://arxiv.org/abs/2407.15227
Webpage: https://claws-lab.github.io/violence-provoking-speech/
GitHub: https://github.com/claws-lab/violence-provoking-speech

Authors: Gaurav Verma¹, Rynaa Grover¹, Jiawei Zhou¹, Binny Mathew², Jordan Kraemer², Munmun De Choudhury¹, Srijan Kumar¹ | Affiliations: ¹Georgia Institute of Technology, ²Anti-Defamation League

Data

Unlabeled Data

Of the 418,999 unlabeled Twitter posts collected for our study in February 2023, spanning January 1, 2020 to February 1, 2023, 121,684 (about 30%) could still be accessed on August 23, 2023. We make the Twitter IDs of these posts available in the file data/all_tweet_ids.txt. We expect the fraction of accessible posts to keep decreasing over time. Possible reasons include deletion of the post by the user or by Twitter, deletion of the user's account (self-deletion, or deletion/blocking by moderators), and the user switching the account to private. We strongly recommend using the provided keywords to collect new data.
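As a minimal sketch of how the ID file might be consumed, the snippet below reads tweet IDs, drops blanks and duplicates, and splits them into batches for rehydration. It assumes the file holds one ID per line, which this README does not state; the function names and the batch size of 100 are illustrative choices, so adjust the batch size to whatever your API access allows.

```python
from typing import Iterable, Iterator, List


def parse_tweet_ids(lines: Iterable[str]) -> List[str]:
    """Return unique, non-empty tweet IDs in their original order.

    Assumes one ID per line (an assumption about the file layout).
    IDs are kept as strings so that large values are never altered.
    """
    seen = set()
    ids = []
    for line in lines:
        tweet_id = line.strip()
        if tweet_id and tweet_id not in seen:
            seen.add(tweet_id)
            ids.append(tweet_id)
    return ids


def read_tweet_ids(path: str) -> List[str]:
    """Read tweet IDs from a text file such as data/all_tweet_ids.txt."""
    with open(path, encoding="utf-8") as handle:
        return parse_tweet_ids(handle)


def batched(ids: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` IDs (size is an assumption)."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def accessible_fraction(n_accessible: int, n_total: int) -> float:
    """Fraction of posts that could still be retrieved."""
    return n_accessible / n_total
```

With the counts reported above, accessible_fraction(121684, 418999) evaluates to roughly 0.29. Re-running the same calculation after rehydrating the IDs yourself shows how much further availability has dropped.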

Annotated Data

We release the IDs of the Twitter posts that were identified as violence-provoking through community crowdsourcing. Of the 246 posts identified as violence-provoking, 94 (about 40%) could still be accessed on August 23, 2023. The Twitter IDs of these posts are available in the file data/violence_provoking_tweet_ids.txt.

Keywords

We provide the keywords used to collect the data in the file data/keywords.py. These keywords also include the subset from the keyword expansion strategy we adopted; please refer to Tables 6 and 7 in the paper for more details. We recommend using these keywords to collect new data.
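One possible way to turn a keyword list into search queries is sketched below. It is not the collection script used in the paper: the function names, the OR-joined query syntax, and the 512-character default limit are assumptions. The example strings in the usage note are placeholders rather than entries from data/keywords.py, whose internal structure is not described here.

```python
from typing import Iterable, List


def quote_term(keyword: str) -> str:
    """Wrap multi-word keywords in double quotes so they match as phrases."""
    keyword = keyword.strip()
    return f'"{keyword}"' if " " in keyword else keyword


def build_queries(keywords: Iterable[str], max_len: int = 512) -> List[str]:
    """Group keywords into OR-joined queries no longer than max_len characters.

    max_len is an assumed limit; set it to the limit of the search
    interface you actually use. A single keyword longer than max_len
    is still emitted as its own query.
    """
    queries = []
    current = []
    for keyword in keywords:
        term = quote_term(keyword)
        if not term:
            continue  # skip empty entries
        candidate = " OR ".join(current + [term])
        # +2 accounts for the surrounding parentheses
        if current and len(candidate) + 2 > max_len:
            queries.append("(" + " OR ".join(current) + ")")
            current = [term]
        else:
            current.append(term)
    if current:
        queries.append("(" + " OR ".join(current) + ")")
    return queries
```

For example, build_queries(["placeholder", "two words"]) returns a single query, (placeholder OR "two words"). Longer lists are split into several queries that can be issued one after another.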

Bibtex

If you find these resources helpful, please consider citing our paper:

@inproceedings{verma2024community,
  title={A Community-Centric Perspective for Characterizing and Detecting Anti-Asian Violence-Provoking Speech},
  author={Verma, Gaurav and Grover, Rynaa and Zhou, Jiawei and Mathew, Binny and Kraemer, Jordan and De Choudhury, Munmun and Kumar, Srijan},
  booktitle={62nd Annual Meeting of the Association for Computational Linguistics (ACL)},
  year={2024}
}
