From b8ca3276e9396d2aa478d712546df63b2818eb7d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Patrik=20Drhl=C3=ADk?=
Date: Fri, 21 Sep 2018 13:39:54 +0200
Subject: [PATCH] close #19 - add README.Rmd

README.md will become completely generated. Changes will now be made to README.Rmd only. The document must be rendered for changes to be reflected in README.md.
---
 .Rbuildignore |  2 ++
 README.Rmd    | 85 +++++++++++++++++++++++++++++++++++++++++++++++++++
 README.md     | 81 ++++++++++++++++++++++++------------------------
 3 files changed, 128 insertions(+), 40 deletions(-)
 create mode 100644 README.Rmd

diff --git a/.Rbuildignore b/.Rbuildignore
index 3ee806f..e7d1102 100644
--- a/.Rbuildignore
+++ b/.Rbuildignore
@@ -3,3 +3,5 @@
 ^\.travis\.yml$
 sticker/
 data-raw/
+^README\.Rmd$
+^README-.*\.png$
diff --git a/README.Rmd b/README.Rmd
new file mode 100644
index 0000000..d02b5d6
--- /dev/null
+++ b/README.Rmd
@@ -0,0 +1,85 @@
+---
+output: github_document
+---
+
+
+
+```{r count_table_prep, echo = FALSE, message = FALSE}
+library(dplyr)
+library(glue)
+
+# If adding a new language, add a new row to the following
+# data frame. Make sure that the codes are alphabetically
+# ordered and include their language equivalent.
+langs <- data_frame(
+  lang_code = c("cs", "en", "pl"),
+  lang = c("Czech", "English", "Polish")
+)
+
+# Counts of swear words for each language are computed
+# based on our swear_words data frame.
+counts <- sweary::swear_words %>%
+  count(language)
+
+# This joined data frame includes language names,
+# counts and labels that are used to create a row
+# in a markdown table.
+lang_counts <- inner_join(
+  langs,
+  counts,
+  by = c("lang_code" = "language")
+) %>%
+  mutate(
+    label_row = glue("| {lang} | {lang_code} | {n} |")
+  )
+```
+
+[![Join the chat at https://gitter.im/pdrhlik/sweary](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/swearyr)
+[![Build Status](https://travis-ci.org/pdrhlik/sweary.svg?branch=master)](https://travis-ci.org/pdrhlik/sweary)
+
+# sweary
+
+Sweary is an R package that contains a database of swear words from different languages, cherry picked by native speakers.
+
+## Installation
+
+The development version of this package can be installed using [devtools](https://github.com/r-lib/devtools):
+
+```
+devtools::install_github("pdrhlik/sweary")
+```
+
+## Current swear word lists
+
+| Language      | Language code | Number of swear words |
+| ------------- | ------------- | --------------------- |
+`r glue_collapse(lang_counts$label_row, sep = "\n")`
+| **Total**     | **`r nrow(lang_counts)` langs** | **`r sum(lang_counts$n)`** |
+
+## How to use it
+
+The package contains a data frame called `swear_words`. You can filter or modify it as you wish now. There will be convenient functions to extract only the languages that are of your interest.
+
+## Add (modify) a language
+
+If you are not comfortable with `git` and pull requests, you can just follow steps **1-3**. After you create the file, send it to me via [email](mailto:patrik.drhlik@gmail.com) with a subject **New sweary language: {LANG_CODE}**. We will acknowledge you in the README after we approve of the changes.
+
+1. **Choose a new language.**\
+   Find its two letter [ISO 639-1 code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).
+2. **Create a language file.**\
+   Place the file in `data-raw/swear-word-lists/{LANG_CODE}`.\
+   Example for English: `data-raw/swear-word-lists/en`.
+3. **Fill in the file with swear-words.** Following rules must apply:
+    + **One** swear-word per line.
+    + All words must be **lowercase**.
+    + The list must only contain **unique** words.
+    + The list must be **sorted** alphabetically.
+4. **Make sure all the tests pass.**\
+   You can do that using a development function called `build_sweary()`. It becomes available when you `git clone` the repository and call `devtools::load_all()`. Or pressing `Ctrl+Shift+L` in RStudio. Learn more about calling this function using `?build_sweary`.
+5. **Update README.Rmd**\
+   Update the `langs` data frame in README.Rmd by adding a new row to it. More precise instructions are in the raw file itself.
+6. **Create a pull request.**
+
+## Origin
+
+The idea first appeared after the [South Park text analysis lightning talk](https://github.com/pdrhlik/southparktalk-whyr2018) at the [Why R? 2018 conference](http://whyr2018.pl/) in Wrocław. All the contributors will be acknowledged as the work progresses.
diff --git a/README.md b/README.md
index 23dc980..0e0ae7d 100644
--- a/README.md
+++ b/README.md
@@ -1,55 +1,56 @@
-[![Join the chat at https://gitter.im/pdrhlik/sweary](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/swearyr)
-[![Build Status](https://travis-ci.org/pdrhlik/sweary.svg?branch=master)](https://travis-ci.org/pdrhlik/sweary)
-# sweary
+
+[![Join the chat at https://gitter.im/pdrhlik/sweary](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/swearyr) [![Build Status](https://travis-ci.org/pdrhlik/sweary.svg?branch=master)](https://travis-ci.org/pdrhlik/sweary)
+
+sweary
+=========================================================================
 
 Sweary is an R package that contains a database of swear words from different languages, cherry picked by native speakers.
 
-## Installation
+Installation
+------------
 
 The development version of this package can be installed using [devtools](https://github.com/r-lib/devtools):
 
-```
-devtools::install_github("pdrhlik/sweary")
-```
+    devtools::install_github("pdrhlik/sweary")
 
-## Current swear word lists
+Current swear word lists
+------------------------
 
-| Language      | Language code | Number of swear words |
-| ------------- | ------------- | --------------------- |
-| Czech         | cs            | 57                    |
-| English       | en            | 39                    |
-| Polish        | pl            | 41                    |
-| **Total**     | **3 langs**   | **137**               |
+| Language  | Language code | Number of swear words |
+|-----------|---------------|-----------------------|
+| Czech     | cs            | 57                    |
+| English   | en            | 39                    |
+| Polish    | pl            | 41                    |
+| **Total** | **3 langs**   | **137**               |
 
-## How to use it
+How to use it
+-------------
 
 The package contains a data frame called `swear_words`. You can filter or modify it as you wish now. There will be convenient functions to extract only the languages that are of your interest.
 
-## Contributions
-
-You are welcome to create a pull request either with modifications to current lists or with a completely new language.
-
-## Add (modify) a language
-
-If you are not comfortable with `git` and pull requests, you can just follow steps **1–3**. After you create the file, send it to me via [email](mailto:patrik.drhlik@gmail.com) with a subject **New sweary language: {LANG_CODE}**. We will acknowledge you in the README after we approve of the changes.
-
-1. **Choose a new language.**
-   Find its two letter [ISO 639-1 code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).
-2. **Create a language file.**
-   Place the file in `data-raw/swear-word-lists/{LANG_CODE}`.
-   Example for English: `data-raw/swear-word-lists/en`.
-3. **Fill in the file with swear-words.** Following rules must apply:
-    + **One** swear-word per line.
-    + All words must be **lowercase**.
-    + The list must only contain **unique** words.
-    + The list must be **sorted** alphabetically.
-4. **Make sure all the tests pass.**
-   You can do that using a development function called `build_sweary()`. It becomes available when you `git clone` the repository and call `devtools::load_all()`. Or pressing `Ctrl+Shift+L` in RStudio. Learn more about calling this function using `?build_sweary`.
-5. **Update README.**
-   Add a line to the table with swear-word counts in README. Don't forget to update total counts. (We will try to automate this step in near future.)
-6. **Create a pull request.**
-
-## Origin
+Add (modify) a language
+-----------------------
+
+If you are not comfortable with `git` and pull requests, you can just follow steps **1-3**. After you create the file, send it to me via [email](mailto:patrik.drhlik@gmail.com) with a subject **New sweary language: {LANG\_CODE}**. We will acknowledge you in the README after we approve of the changes.
+
+1. **Choose a new language.**
+   Find its two letter [ISO 639-1 code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).
+2. **Create a language file.**
+   Place the file in `data-raw/swear-word-lists/{LANG_CODE}`.
+   Example for English: `data-raw/swear-word-lists/en`.
+3. **Fill in the file with swear-words.** Following rules must apply:
+    - **One** swear-word per line.
+    - All words must be **lowercase**.
+    - The list must only contain **unique** words.
+    - The list must be **sorted** alphabetically.
+4. **Make sure all the tests pass.**
+   You can do that using a development function called `build_sweary()`. It becomes available when you `git clone` the repository and call `devtools::load_all()`. Or pressing `Ctrl+Shift+L` in RStudio. Learn more about calling this function using `?build_sweary`.
+5. **Update README.Rmd**
+   Update the `langs` data frame in README.Rmd by adding a new row to it. More precise instructions are in the raw file itself.
+6. **Create a pull request.**
+
+Origin
+------
 
 The idea first appeared after the [South Park text analysis lightning talk](https://github.com/pdrhlik/southparktalk-whyr2018) at the [Why R? 2018 conference](http://whyr2018.pl/) in Wrocław. All the contributors will be acknowledged as the work progresses.