README.md will become completely generated. Changes will now be made to README.Rmd only. The document must be rendered for changes to be reflected in README.md.
Showing 3 changed files with 128 additions and 40 deletions.
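Because README.md is now generated, it has to be re-rendered after every change to README.Rmd. A minimal sketch of that step follows. It assumes the rmarkdown package is installed and that the working directory is the repository root. `render_readme()` is a hypothetical helper introduced here for illustration and is not part of sweary.

```r
# Hypothetical helper (not part of sweary): re-render README.md from README.Rmd.
render_readme <- function(path = "README.Rmd") {
  if (!file.exists(path)) {
    message("Nothing to render: ", path, " not found.")
    return(invisible(FALSE))
  }
  # The format is named explicitly here; the YAML header of README.Rmd
  # already declares `output: github_document`.
  rmarkdown::render(path, output_format = "github_document", quiet = TRUE)
  invisible(TRUE)
}
```

Calling `render_readme()` from the repository root should rewrite README.md; the regenerated file is then committed together with README.Rmd.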
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,3 +3,5 @@ | |
^\.travis\.yml$ | ||
sticker/ | ||
data-raw/ | ||
^README\.Rmd$ | ||
^README-.*\.png$ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
--- | ||
output: github_document | ||
--- | ||
|
||
<!-- README.md is generated from README.Rmd. Please edit this file. --> | ||
|
||
```{r count_table_prep, echo = FALSE, message = FALSE} | ||
library(dplyr) | ||
library(glue) | ||
# If adding a new language, add a new row to the following | ||
# data frame. Make sure that the codes are alphabetically | ||
# ordered and include their language equivalent. | ||
langs <- tibble( | ||
lang_code = c("cs", "en", "pl"), | ||
lang = c("Czech", "English", "Polish") | ||
) | ||
# Counts of swear words for each language are computed | ||
# based on our swear_words data frame. | ||
counts <- sweary::swear_words %>% | ||
count(language) | ||
# This joined data frame includes language names, | ||
# counts and labels that are used to create a row | ||
# in a markdown table. | ||
lang_counts <- inner_join( | ||
langs, | ||
counts, | ||
by = c("lang_code" = "language") | ||
) %>% | ||
mutate( | ||
label_row = glue("| {lang} | {lang_code} | {n} |") | ||
) | ||
``` | ||
|
||
[](https://gitter.im/swearyr) | ||
[](https://travis-ci.org/pdrhlik/sweary) | ||
|
||
# sweary <img src="sticker/sweary-sticker.png" align="right" width="150" /> | ||
|
||
Sweary is an R package that contains a database of swear words from different languages, cherry picked by native speakers. | ||
|
||
## Installation | ||
|
||
The development version of this package can be installed using [devtools](https://github.com/r-lib/devtools): | ||
|
||
``` | ||
devtools::install_github("pdrhlik/sweary") | ||
``` | ||
|
||
## Current swear word lists | ||
|
||
| Language | Language code | Number of swear words | | ||
| ------------- | ------------- | --------------------- | | ||
`r glue_collapse(lang_counts$label_row, sep = "\n")` | ||
| **Total** | **`r nrow(lang_counts)` langs** | **`r sum(lang_counts$n)`** | | ||
|
||
## How to use it | ||
|
||
The package contains a data frame called `swear_words`. You can filter or modify it as you wish now. There will be convenient functions to extract only the languages that are of your interest. | ||
|
||
## Add (modify) a language | ||
|
||
If you are not comfortable with `git` and pull requests, you can just follow steps **1-3**. After you create the file, send it to me via [email](mailto:patrik.drhlik@gmail.com) with a subject **New sweary language: {LANG_CODE}**. We will acknowledge you in the README after we approve of the changes. | ||
|
||
1. **Choose a new language.**\ | ||
Find its two letter [ISO 639-1 code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes). | ||
2. **Create a language file.**\ | ||
Place the file in `data-raw/swear-word-lists/{LANG_CODE}`.\ | ||
Example for English: `data-raw/swear-word-lists/en`. | ||
3. **Fill in the file with swear-words.** Following rules must apply: | ||
+ **One** swear-word per line. | ||
+ All words must be **lowercase**. | ||
+ The list must only contain **unique** words. | ||
+ The list must be **sorted** alphabetically. | ||
4. **Make sure all the tests pass.**\ | ||
You can do that using a development function called `build_sweary()`. It becomes available when you `git clone` the repository and call `devtools::load_all()`. Or pressing `Ctrl+Shift+L` in RStudio. Learn more about calling this function using `?build_sweary`. | ||
5. **Update README.Rmd**\ | ||
Update the `langs` data frame in README.Rmd by adding a new row to it. More precise instructions are in the raw file itself. | ||
6. **Create a pull request.** | ||
|
||
## Origin | ||
|
||
The idea first appeared after the [South Park text analysis lightning talk](https://github.com/pdrhlik/southparktalk-whyr2018) at the [Why R? 2018 conference](http://whyr2018.pl/) in Wrocław. All the contributors will be acknowledged as the work progresses. |
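The rules in step 3 of "Add (modify) a language" can be checked before a list is submitted. The sketch below is only an illustration: `check_word_list()` is a hypothetical function, not the check that `build_sweary()` performs, and the entries are harmless placeholders. It assumes the file has already been read with one entry per element (for example with `readLines()`), and that "sorted" means the ordering of the current R locale, which may differ from the ordering the project's tests expect.

```r
# Hypothetical checker for the step 3 rules; not part of sweary.
check_word_list <- function(words) {
  c(
    lowercase = all(words == tolower(words)),  # no uppercase characters
    unique    = anyDuplicated(words) == 0,     # no repeated entries
    sorted    = !is.unsorted(words)            # locale-dependent ordering
  )
}

# Placeholder entries stand in for a real list.
good <- check_word_list(c("alpha", "beta", "gamma"))
bad  <- check_word_list(c("beta", "Alpha", "beta"))
```

Each element of the result names one rule, so a failing list shows which rule was broken.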
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,55 +1,56 @@ | ||
[](https://gitter.im/swearyr) | ||
[](https://travis-ci.org/pdrhlik/sweary) | ||
|
||
# sweary <img src="sticker/sweary-sticker.png" align="right" width="150" /> | ||
<!-- README.md is generated from README.Rmd. Please edit this file. --> | ||
[](https://gitter.im/swearyr) [](https://travis-ci.org/pdrhlik/sweary) | ||
|
||
sweary <img src="sticker/sweary-sticker.png" align="right" width="150" /> | ||
========================================================================= | ||
|
||
Sweary is an R package that contains a database of swear words from different languages, cherry picked by native speakers. | ||
|
||
## Installation | ||
Installation | ||
------------ | ||
|
||
The development version of this package can be installed using [devtools](https://github.com/r-lib/devtools): | ||
|
||
``` | ||
devtools::install_github("pdrhlik/sweary") | ||
``` | ||
devtools::install_github("pdrhlik/sweary") | ||
|
||
## Current swear word lists | ||
Current swear word lists | ||
------------------------ | ||
|
||
| Language | Language code | Number of swear words | | ||
| ------------- | ------------- | --------------------- | | ||
| Czech | cs | 57 | | ||
| English | en | 39 | | ||
| Polish | pl | 41 | | ||
| **Total** | **3 langs** | **137** | | ||
| Language | Language code | Number of swear words | | ||
|-----------|---------------|-----------------------| | ||
| Czech | cs | 57 | | ||
| English | en | 39 | | ||
| Polish | pl | 41 | | ||
| **Total** | **3 langs** | **137** | | ||
|
||
## How to use it | ||
How to use it | ||
------------- | ||
|
||
The package contains a data frame called `swear_words`. You can filter or modify it as you wish now. There will be convenient functions to extract only the languages that are of your interest. | ||
|
||
## Contributions | ||
|
||
You are welcome to create a pull request either with modifications to current lists or with a completely new language. | ||
|
||
## Add (modify) a language | ||
|
||
If you are not comfortable with `git` and pull requests, you can just follow steps **1–3**. After you create the file, send it to me via [email](mailto:patrik.drhlik@gmail.com) with a subject **New sweary language: {LANG_CODE}**. We will acknowledge you in the README after we approve of the changes. | ||
|
||
1. **Choose a new language.** | ||
Find its two letter [ISO 639-1 code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes). | ||
2. **Create a language file.** | ||
Place the file in `data-raw/swear-word-lists/{LANG_CODE}`. | ||
Example for English: `data-raw/swear-word-lists/en`. | ||
3. **Fill in the file with swear-words.** Following rules must apply: | ||
+ **One** swear-word per line. | ||
+ All words must be **lowercase**. | ||
+ The list must only contain **unique** words. | ||
+ The list must be **sorted** alphabetically. | ||
4. **Make sure all the tests pass.** | ||
You can do that using a development function called `build_sweary()`. It becomes available when you `git clone` the repository and call `devtools::load_all()`. Or pressing `Ctrl+Shift+L` in RStudio. Learn more about calling this function using `?build_sweary`. | ||
5. **Update README.** | ||
Add a line to the table with swear-word counts in README. Don't forget to update total counts. (We will try to automate this step in near future.) | ||
6. **Create a pull request.** | ||
|
||
## Origin | ||
Add (modify) a language | ||
----------------------- | ||
|
||
If you are not comfortable with `git` and pull requests, you can just follow steps **1-3**. After you create the file, send it to me via [email](mailto:patrik.drhlik@gmail.com) with a subject **New sweary language: {LANG\_CODE}**. We will acknowledge you in the README after we approve of the changes. | ||
|
||
1. **Choose a new language.** | ||
Find its two letter [ISO 639-1 code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes). | ||
2. **Create a language file.** | ||
Place the file in `data-raw/swear-word-lists/{LANG_CODE}`. | ||
Example for English: `data-raw/swear-word-lists/en`. | ||
3. **Fill in the file with swear-words.** Following rules must apply: | ||
- **One** swear-word per line. | ||
- All words must be **lowercase**. | ||
- The list must only contain **unique** words. | ||
- The list must be **sorted** alphabetically. | ||
4. **Make sure all the tests pass.** | ||
You can do that using a development function called `build_sweary()`. It becomes available when you `git clone` the repository and call `devtools::load_all()`. Or pressing `Ctrl+Shift+L` in RStudio. Learn more about calling this function using `?build_sweary`. | ||
5. **Update README.Rmd** | ||
Update the `langs` data frame in README.Rmd by adding a new row to it. More precise instructions are in the raw file itself. | ||
6. **Create a pull request.** | ||
|
||
Origin | ||
------ | ||
|
||
The idea first appeared after the [South Park text analysis lightning talk](https://github.com/pdrhlik/southparktalk-whyr2018) at the [Why R? 2018 conference](http://whyr2018.pl/) in Wrocław. All the contributors will be acknowledged as the work progresses. |
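The "How to use it" section says that `swear_words` can be filtered as needed. A minimal base R sketch of what that might look like is given below. It uses a stand-in data frame so that it runs without the package installed. The `language` column is taken from README.Rmd, while the `word` column name and the placeholder entries are assumptions.

```r
# Stand-in for sweary::swear_words. Only the `language` column is
# confirmed by README.Rmd; `word` and its values are placeholders.
swear_words <- data.frame(
  word = c("alpha", "beta", "gamma", "delta"),
  language = c("cs", "en", "en", "pl"),
  stringsAsFactors = FALSE
)

# Keep only the rows for one language code.
english <- swear_words[swear_words$language == "en", ]

# Count entries per language, similar to count(language) in README.Rmd.
per_language <- table(swear_words$language)
```

With the package installed, the stand-in would be replaced by `sweary::swear_words`.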