Close #21 - rename language files
When adding a new language, it is no longer necessary to edit README.Rmd.
We don't need to edit the language data frame any more because the
language name is present in the file name.

Example:
en_English
fr-CA_French (Canada)

The only issue that came up is that knitting with Ctrl+Shift+K in RStudio
(the default Knit) fails because README.Rmd calls the unexported function
load_langs(). Knitting works through build_sweary(), so it's okay.
load_langs() cannot be exported because it reads raw language files that
are not present in an installed package.
pdrhlik committed Nov 8, 2018
1 parent 28f8407 commit 103c95b
Showing 20 changed files with 190 additions and 53 deletions.
3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -21,5 +21,6 @@ Suggests:
purrr,
rmarkdown,
stringr,
usethis
usethis,
readr
Imports: glue
85 changes: 84 additions & 1 deletion R/build_tools.R
@@ -97,7 +97,7 @@ format.sweary_build_results <- function(x) {
status$errors > 0 ~ "You need to fix some ERRORS!",
status$warnings > 0 ~ "You should fix those WARNINGS!",
status$notes > 0 ~ "Handle those NOTES and you're good to go!",
TRUE ~ paste0("Great job! Random swearword for you: ", rsw$word, " [", rsw$language, "] :-)")
TRUE ~ paste0("Great job! Random swear word for you: ", rsw$word, " [", rsw$language, "] :-)")
)

glue::glue("
@@ -204,3 +204,86 @@ print_devtools_check_summary <- function(x) {
warnings: {x$warnings}
notes: {x$notes}")
}

#' Splits lang file name in language code and name
#'
#' @param lang_file Language file name, either absolute
#' or relative.
#'
#' @return Character vector of length 2. First
#' element is language code, second element
#' is language name.
split_lang_file <- function(lang_file) {
file_name <- stringr::str_split(lang_file, "/", simplify = TRUE) %>%
dplyr::last(.)
file_split <- stringr::str_split(file_name, "_", simplify = TRUE)

return(file_split)
}

#' Returns language code from file name
#'
#' @param lang_file Language file name, either absolute
#' or relative.
#'
#' @return Language code.
file_lang_code <- function(lang_file) {
file_split <- split_lang_file(lang_file)

return(file_split[1])
}

#' Returns language name from file name
#'
#' @param lang_file Language file name, either absolute
#' or relative.
#'
#' @return Language name.
file_lang_name <- function(lang_file) {
file_split <- split_lang_file(lang_file)

return(file_split[2])
}

#' Loads a single language data frame from file
#'
#' @param lang_file Language file name with full path.
#' @return Data frame of swear words in one language.
load_lang_from_file <- function(lang_file) {
suppressMessages(
words <- readr::read_csv(lang_file, col_names = c("word"))
)
words$language <- file_lang_code(lang_file)

return(words)
}

#' Create a summary df with languages and their counts
#'
#' @return Data frame with language codes, language names,
#' word counts and a formatted markdown table row.
load_langs <- function() {
lang_files <- list.files("data-raw/swear-word-lists/", full.names = TRUE)

langs <- purrr::map_df(lang_files, function(lang_file) {
file_split <- split_lang_file(lang_file)
dplyr::data_frame(
lang_code = file_split[1],
lang = file_split[2]
)
})

counts <- sweary::swear_words %>%
dplyr::count(.data$language)

lang_counts <- dplyr::inner_join(
langs,
counts,
by = c("lang_code" = "language")
) %>%
dplyr::mutate(
label_row = glue::glue("| {lang} | {lang_code} | {n} |")
)

return(lang_counts)
}
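
`split_lang_file()` above splits on every underscore, so a language name that itself contained an underscore would yield more than two pieces. The following is a minimal base-R sketch, assuming one wanted to split at the first underscore only. The name `split_on_first_underscore` is hypothetical and not part of this commit.

```r
# Hypothetical helper, not part of the commit: splits a language file
# path into code and name at the FIRST underscore only.
split_on_first_underscore <- function(lang_file) {
  # basename() drops the directory part for relative and absolute paths.
  file_name <- basename(lang_file)
  # Position of the first "_" in the file name, or -1 if there is none.
  pos <- as.integer(regexpr("_", file_name, fixed = TRUE))
  if (pos < 0) {
    stop("File name must look like {LANG_CODE}_{LANG_NAME}: ", file_name)
  }
  c(
    lang_code = substr(file_name, 1, pos - 1),
    lang = substr(file_name, pos + 1, nchar(file_name))
  )
}

split_on_first_underscore("data-raw/swear-word-lists/fr-CA_French (Canada)")
```

Returning a named vector lets callers write `result[["lang_code"]]` instead of relying on positions, which may be easier to read than `file_split[1]`.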
4 changes: 2 additions & 2 deletions R/sweary.R
@@ -17,5 +17,5 @@
#' @format A data frame with 96 rows and 2 variables.
"swear_words"

## Deletes R CMD check NOTES for '.' and '%>%'.
utils::globalVariables(c(".", "%>%"))
## Deletes R CMD check NOTES for '.', '%>%' and '.data'.
utils::globalVariables(c(".", "%>%", ".data"))
36 changes: 6 additions & 30 deletions README.Rmd
@@ -8,30 +8,7 @@ output: github_document
library(dplyr)
library(glue)
# If adding a new language, add a new row to the following
# data frame. Make sure that the codes are alphabetically
# ordered and include their language equivalent.
langs <- data_frame(
lang_code = c("cs", "de", "en", "fr-CA", "gr", "mk", "pl", "ro", "sk"),
lang = c("Czech", "German", "English", "French (Canada)", "Greek", "Macedonian", "Polish", "Romanian", "Slovak")
)
# Counts of swear words for each language are computed
# based on our swear_words data frame.
counts <- sweary::swear_words %>%
count(language)
# This joined data frame includes language names,
# counts and labels that are used to create a row
# in a markdown table.
lang_counts <- inner_join(
langs,
counts,
by = c("lang_code" = "language")
) %>%
mutate(
label_row = glue("| {lang} | {lang_code} | {n} |")
)
lang_counts <- load_langs()
```

[![Join the chat at https://gitter.im/pdrhlik/sweary](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/swearyr)
@@ -80,20 +57,19 @@ If you are not comfortable with `git` and pull requests, you can just follow ste
Find its two letter [ISO 639-1 code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).\
If the language you are creating is a certain dialect (e.g. Canadian French), find its [IETF language tag](https://en.wikipedia.org/wiki/IETF_language_tag) in this [language code table](http://www.lingoes.net/en/translator/langcode.htm).
2. **Create a language file.**\
Place the file in `data-raw/swear-word-lists/{LANG_CODE}`.\
Place the file in `data-raw/swear-word-lists/{LANG_CODE}_{LANG_NAME}`.\
Examples:\
+ English: `data-raw/swear-word-lists/en`
+ Canadian French: `data-raw/swear-word-lists/fr-CA`
+ English: `data-raw/swear-word-lists/en_English`
+ Canadian French: `data-raw/swear-word-lists/fr-CA_French (Canada)`\
Note that spaces and parentheses in file names are allowed.
3. **Fill in the file with swear words.** Following rules must apply:
+ **One** swear word per line with no trailing whitespace.
+ All words must be **lowercase**.
+ The list must only contain **unique** words.
+ The list must be **sorted** alphabetically.
4. **Make sure all the tests pass.**\
You can do that using a development function called `build_sweary()`. It becomes available when you `git clone` the repository and call `devtools::load_all()`. Or pressing `Ctrl+Shift+L` in RStudio. Learn more about calling this function using `?build_sweary`.
5. **Update README.Rmd**.\
Update the `langs` data frame in README.Rmd by adding a new row to it. More precise instructions are in the raw file itself.
6. **Create a pull request.**
5. **Create a pull request.**
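
The rules in step 3 can be checked locally before running the full build. This is a rough base-R sketch, not the checks that `build_sweary()` actually runs. The function name `check_word_list` is hypothetical, and the sort check depends on the collation order of the current locale.

```r
# Hypothetical checker, not part of sweary: returns the names of the
# step 3 rules that a character vector of lines violates.
check_word_list <- function(words) {
  problems <- c(
    trailing_whitespace = any(grepl("\\s+$", words)),
    not_lowercase = any(words != tolower(words)),
    duplicates = anyDuplicated(words) > 0,
    # is.unsorted() compares strings using the current locale's collation.
    not_sorted = is.unsorted(words)
  )
  names(problems)[problems]
}

# A clean list yields an empty character vector.
check_word_list(c("alpha", "beta", "gamma"))

# A list with an uppercase word, trailing whitespace and wrong order.
check_word_list(c("beta", "Alpha", "beta "))
```

To check a real file, the lines could be read with `readLines()` and passed to the function.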

## Origin

14 changes: 7 additions & 7 deletions README.md
@@ -86,10 +86,13 @@ approve of the changes.
[language code
table](http://www.lingoes.net/en/translator/langcode.htm).
2. **Create a language file.**
Place the file in `data-raw/swear-word-lists/{LANG_CODE}`.
Place the file in
`data-raw/swear-word-lists/{LANG_CODE}_{LANG_NAME}`.
Examples:
- English: `data-raw/swear-word-lists/en`
- Canadian French: `data-raw/swear-word-lists/fr-CA`
- English: `data-raw/swear-word-lists/en_English`
- Canadian French: `data-raw/swear-word-lists/fr-CA_French
(Canada)`
Note that spaces and parentheses in file names are allowed.
3. **Fill in the file with swear words.** Following rules must apply:
- **One** swear word per line with no trailing whitespace.
- All words must be **lowercase**.
@@ -101,10 +104,7 @@ approve of the changes.
repository and call `devtools::load_all()`. Or pressing
`Ctrl+Shift+L` in RStudio. Learn more about calling this function
using `?build_sweary`.
5. **Update README.Rmd**.
Update the `langs` data frame in README.Rmd by adding a new row to
it. More precise instructions are in the raw file itself.
6. **Create a pull request.**
5. **Create a pull request.**

## Origin

9 files renamed without changes.
13 changes: 1 addition & 12 deletions data-raw/swear-words.R
@@ -3,18 +3,7 @@ library(readr)
library(stringr)
library(dplyr)

load_lang <- function(lang_file) {
suppressMessages(
words <- readr::read_csv(lang_file, col_names = c("word"))
)
lang <- stringr::str_extract(lang_file, "[\\w-]+$")

words$language <- lang

return(words)
}

lang_files <- list.files("data-raw/swear-word-lists/", full.names = TRUE)
swear_words <- purrr::map_df(lang_files, load_lang)
swear_words <- purrr::map_df(lang_files, load_lang_from_file)

usethis::use_data(swear_words, overwrite = TRUE)
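
The loading flow after this change can be tried without the repository. The sketch below uses base R in place of `readr` and `purrr`, a temporary directory in place of `data-raw/swear-word-lists/`, and harmless placeholder words. `load_one` is a hypothetical stand-in for `load_lang_from_file()`, not the function from the commit.

```r
# Base-R sketch under assumptions: a temporary directory stands in for
# data-raw/swear-word-lists/, and the words are placeholders.
list_dir <- file.path(tempdir(), "word-lists-demo")
dir.create(list_dir, showWarnings = FALSE)
writeLines(c("alpha", "beta"), file.path(list_dir, "en_English"))
writeLines(c("gamma"), file.path(list_dir, "fr-CA_French (Canada)"))

# Hypothetical stand-in for load_lang_from_file(): one word per line,
# language code taken from the file name part before the first "_".
load_one <- function(lang_file) {
  data.frame(
    word = readLines(lang_file),
    language = sub("_.*$", "", basename(lang_file)),
    stringsAsFactors = FALSE
  )
}

# Load every file and stack the per-language data frames.
lang_files <- list.files(list_dir, full.names = TRUE)
demo_words <- do.call(rbind, lapply(lang_files, load_one))
demo_words
```

Only the language code ends up in the data frame. The language name after the underscore is needed just for the README table, which is why it can live in the file name alone.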
18 changes: 18 additions & 0 deletions man/file_lang_code.Rd

18 changes: 18 additions & 0 deletions man/file_lang_name.Rd

17 changes: 17 additions & 0 deletions man/load_lang_from_file.Rd

15 changes: 15 additions & 0 deletions man/load_langs.Rd

20 changes: 20 additions & 0 deletions man/split_lang_file.Rd
