Skip to content

Commit

Permalink
close #19 - add README.Rmd
Browse files Browse the repository at this point in the history
README.md will become completely generated. Changes will now be made to
README.Rmd only. The document must be rendered for changes to be reflected
in README.md.
  • Loading branch information
pdrhlik committed Sep 21, 2018
1 parent 9ca33ae commit b8ca327
Show file tree
Hide file tree
Showing 3 changed files with 128 additions and 40 deletions.
2 changes: 2 additions & 0 deletions .Rbuildignore
Original file line number Diff line number Diff line change
Expand Up @@ -3,3 +3,5 @@
^\.travis\.yml$
sticker/
data-raw/
^README\.Rmd$
^README-.*\.png$
85 changes: 85 additions & 0 deletions README.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit this file. -->

```{r count_table_prep, echo = FALSE, message = FALSE}
library(dplyr)
library(glue)
# If adding a new language, add a new row to the following
# data frame. Make sure that the codes are alphabetically
# ordered and include their language equivalent.
langs <- data_frame(
lang_code = c("cs", "en", "pl"),
lang = c("Czech", "English", "Polish")
)
# Counts of swear words for each language are computed
# based on our swear_words data frame.
counts <- sweary::swear_words %>%
count(language)
# This joined data frame includes language names,
# counts and labels that are used to create a row
# in a markdown table.
lang_counts <- inner_join(
langs,
counts,
by = c("lang_code" = "language")
) %>%
mutate(
label_row = glue("| {lang} | {lang_code} | {n} |")
)
```

[![Join the chat at https://gitter.im/pdrhlik/sweary](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/swearyr)
[![Build Status](https://travis-ci.org/pdrhlik/sweary.svg?branch=master)](https://travis-ci.org/pdrhlik/sweary)

# sweary <img src="sticker/sweary-sticker.png" align="right" width="150" />

Sweary is an R package that contains a database of swear words from different languages, cherry picked by native speakers.

## Installation

The development version of this package can be installed using [devtools](https://github.com/r-lib/devtools):

```
devtools::install_github("pdrhlik/sweary")
```

## Current swear word lists

| Language | Language code | Number of swear words |
| ------------- | ------------- | --------------------- |
`r glue_collapse(lang_counts$label_row, sep = "\n")`
| **Total** | **`r nrow(lang_counts)` langs** | **`r sum(lang_counts$n)`** |

## How to use it

The package contains a data frame called `swear_words`. You can filter or modify it as you wish now. There will be convenient functions to extract only the languages that are of your interest.

## Add (modify) a language

If you are not comfortable with `git` and pull requests, you can just follow steps **1-3**. After you create the file, send it to me via [email](mailto:patrik.drhlik@gmail.com) with a subject **New sweary language: {LANG_CODE}**. We will acknowledge you in the README after we approve of the changes.

1. **Choose a new language.**\
Find its two letter [ISO 639-1 code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).
2. **Create a language file.**\
Place the file in `data-raw/swear-word-lists/{LANG_CODE}`.\
Example for English: `data-raw/swear-word-lists/en`.
3. **Fill in the file with swear-words.** Following rules must apply:
+ **One** swear-word per line.
+ All words must be **lowercase**.
+ The list must only contain **unique** words.
+ The list must be **sorted** alphabetically.
4. **Make sure all the tests pass.**\
You can do that using a development function called `build_sweary()`. It becomes available when you `git clone` the repository and call `devtools::load_all()`. Or pressing `Ctrl+Shift+L` in RStudio. Learn more about calling this function using `?build_sweary`.
5. **Update README.Rmd**\
Update the `langs` data frame in README.Rmd by adding a new row to it. More precise instructions are in the raw file itself.
6. **Create a pull request.**

## Origin

The idea first appeared after the [South Park text analysis lightning talk](https://github.com/pdrhlik/southparktalk-whyr2018) at the [Why R? 2018 conference](http://whyr2018.pl/) in Wrocław. All the contributors will be acknowledged as the work progresses.
81 changes: 41 additions & 40 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,55 +1,56 @@
[![Join the chat at https://gitter.im/pdrhlik/sweary](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/swearyr)
[![Build Status](https://travis-ci.org/pdrhlik/sweary.svg?branch=master)](https://travis-ci.org/pdrhlik/sweary)

# sweary <img src="sticker/sweary-sticker.png" align="right" width="150" />
<!-- README.md is generated from README.Rmd. Please edit this file. -->
[![Join the chat at https://gitter.im/pdrhlik/sweary](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/swearyr) [![Build Status](https://travis-ci.org/pdrhlik/sweary.svg?branch=master)](https://travis-ci.org/pdrhlik/sweary)

sweary <img src="sticker/sweary-sticker.png" align="right" width="150" />
=========================================================================

Sweary is an R package that contains a database of swear words from different languages, cherry picked by native speakers.

## Installation
Installation
------------

The development version of this package can be installed using [devtools](https://github.com/r-lib/devtools):

```
devtools::install_github("pdrhlik/sweary")
```
devtools::install_github("pdrhlik/sweary")

## Current swear word lists
Current swear word lists
------------------------

| Language | Language code | Number of swear words |
| ------------- | ------------- | --------------------- |
| Czech | cs | 57 |
| English | en | 39 |
| Polish | pl | 41 |
| **Total** | **3 langs** | **137** |
| Language | Language code | Number of swear words |
|-----------|---------------|-----------------------|
| Czech | cs | 57 |
| English | en | 39 |
| Polish | pl | 41 |
| **Total** | **3 langs** | **137** |

## How to use it
How to use it
-------------

The package contains a data frame called `swear_words`. You can filter or modify it as you wish now. There will be convenient functions to extract only the languages that are of your interest.

## Contributions

You are welcome to create a pull request either with modifications to current lists or with a completely new language.

## Add (modify) a language

If you are not comfortable with `git` and pull requests, you can just follow steps **1–3**. After you create the file, send it to me via [email](mailto:patrik.drhlik@gmail.com) with a subject **New sweary language: {LANG_CODE}**. We will acknowledge you in the README after we approve of the changes.

1. **Choose a new language.**
Find its two letter [ISO 639-1 code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).
2. **Create a language file.**
Place the file in `data-raw/swear-word-lists/{LANG_CODE}`.
Example for English: `data-raw/swear-word-lists/en`.
3. **Fill in the file with swear-words.** Following rules must apply:
+ **One** swear-word per line.
+ All words must be **lowercase**.
+ The list must only contain **unique** words.
+ The list must be **sorted** alphabetically.
4. **Make sure all the tests pass.**
You can do that using a development function called `build_sweary()`. It becomes available when you `git clone` the repository and call `devtools::load_all()`. Or pressing `Ctrl+Shift+L` in RStudio. Learn more about calling this function using `?build_sweary`.
5. **Update README.**
Add a line to the table with swear-word counts in README. Don't forget to update total counts. (We will try to automate this step in near future.)
6. **Create a pull request.**

## Origin
Add (modify) a language
-----------------------

If you are not comfortable with `git` and pull requests, you can just follow steps **1-3**. After you create the file, send it to me via [email](mailto:patrik.drhlik@gmail.com) with a subject **New sweary language: {LANG\_CODE}**. We will acknowledge you in the README after we approve of the changes.

1. **Choose a new language.**
Find its two letter [ISO 639-1 code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).
2. **Create a language file.**
Place the file in `data-raw/swear-word-lists/{LANG_CODE}`.
Example for English: `data-raw/swear-word-lists/en`.
3. **Fill in the file with swear-words.** Following rules must apply:
- **One** swear-word per line.
- All words must be **lowercase**.
- The list must only contain **unique** words.
- The list must be **sorted** alphabetically.
4. **Make sure all the tests pass.**
You can do that using a development function called `build_sweary()`. It becomes available when you `git clone` the repository and call `devtools::load_all()`. Or pressing `Ctrl+Shift+L` in RStudio. Learn more about calling this function using `?build_sweary`.
5. **Update README.Rmd**
Update the `langs` data frame in README.Rmd by adding a new row to it. More precise instructions are in the raw file itself.
6. **Create a pull request.**

Origin
------

The idea first appeared after the [South Park text analysis lightning talk](https://github.com/pdrhlik/southparktalk-whyr2018) at the [Why R? 2018 conference](http://whyr2018.pl/) in Wrocław. All the contributors will be acknowledged as the work progresses.

0 comments on commit b8ca327

Please sign in to comment.