Use roxygen2 parser for roxy comments + lightparser for figure and tbl captions #29

Open
wants to merge 83 commits into
base: main
8db3686
Awesome progress on roxy parsing!
olivroy May 17, 2024
cdd9c70
Speed things up!
olivroy May 17, 2024
d2578be
Tiny modifications to ensure comparison between the other outline files.
olivroy May 17, 2024
a887d76
Other fixes
olivroy May 17, 2024
0e4f9fc
Remove cli info
olivroy May 17, 2024
1ee6d49
Add support for outline-roxy
olivroy May 17, 2024
88bb444
Fix color in message
olivroy May 17, 2024
2f66540
Add tests to see if it works correctly.
olivroy May 17, 2024
401997c
Use `line` instead of line_id
olivroy May 17, 2024
0e74f37
Commit remaining breakage
olivroy May 17, 2024
314fa9b
Merge
olivroy May 18, 2024
de4fc3e
Merge branch
olivroy May 24, 2024
145809c
rename `o_is_object_title()` to `o_is_tab_plot_title()`
olivroy May 24, 2024
3a875df
Exclude `@keywords` and `@noRd` (will only need to exclude undocument…
olivroy May 24, 2024
f8fe284
Create `define_criteria_roxy()` to define criteria independently for …
olivroy May 24, 2024
92335a4
Update snapshots
olivroy May 24, 2024
38a71a0
Handle empty case better
olivroy May 24, 2024
97da597
Fix logic to include object titles in outline
olivroy May 24, 2024
ebc4355
Commit changes to README
olivroy May 24, 2024
d3bf724
Refine some criteria to exclude some contents or files.
olivroy May 24, 2024
557209a
Avoid index roxygen comments in tests
olivroy May 24, 2024
29375cf
Merged origin/main into roxy-parse
olivroy May 24, 2024
fa9d935
Improve table detection. Improve package version detection in news.
olivroy May 24, 2024
1db1418
Add markup for linking local issues
olivroy May 24, 2024
d172b30
Update R/outline.R
olivroy May 24, 2024
d3506b4
Last update
olivroy May 25, 2024
646fe4b
Add parent error for debugging.
olivroy May 25, 2024
e481ac3
I identified the issue. Will try to fix.
olivroy May 25, 2024
323a121
Fix `pos` and `objects` to make sure they have a common length.
olivroy May 30, 2024
910a202
Add `active_rs_doc_nav()` to navigate to Files Pane at location.
olivroy May 30, 2024
8cec6ce
Rename to `_outline` for consistency.
olivroy May 30, 2024
406a0c4
Better topic name detection
olivroy May 30, 2024
c0e9063
Improve regex to allow for title to be wrapped in function + family t…
olivroy May 30, 2024
8badd76
Improve `proj_file()` if exact match in `proj`.
olivroy May 30, 2024
83f0c4a
Make sure pos and objects have the same length.
olivroy May 30, 2024
594b627
fix regex for plot title.
olivroy May 30, 2024
35ba23d
Avoid uninteresting roxy headings.
olivroy May 30, 2024
15be047
Don't parse roxy comments in `proj_file()` + add `options("reuseme.ro…
olivroy May 30, 2024
a742214
Avoid recognizing test_that("a", expect_true(TRUE))
olivroy May 30, 2024
eeadbaa
Temporarily change directory when parsing roxy comments as it may help?
olivroy May 30, 2024
b5f1896
Fix mistake
olivroy May 30, 2024
decdd06
Use lightparser for caption parsing!
olivroy May 30, 2024
d5b6ad3
Some fixups for revdeps and plot titles to remove some false positive…
olivroy May 31, 2024
6af2548
Don't error if you couldn't find gh URL.
olivroy May 31, 2024
27cd5cb
Clean workaround
olivroy May 31, 2024
a497218
Add to NEWS + minor adjustments to make outline and usethis to keep w…
olivroy May 31, 2024
fc50e76
More robust `escape_markup()` (add `\\.` as an acceptable start of va…
olivroy May 31, 2024
49e443b
Add some workarounds to make cli parsing and escaping work a bit bett…
olivroy May 31, 2024
57547f1
mark todos as complete...
olivroy May 31, 2024
420fd50
Rename `print_todo` -> `exclude_todos`...
olivroy May 31, 2024
93b1805
Rm redundant heading
olivroy May 31, 2024
938193b
Avoid empty todos + html sourceCode (tidyselect integration)
olivroy May 31, 2024
11880b1
Lint + use base R replacements
olivroy May 31, 2024
51bde5c
Recognize the last way possible to specify chunk options.
olivroy May 31, 2024
5751577
Rename `is_chunk_cap` to `is_object_caption`
olivroy May 31, 2024
871dd05
Add the factored `exclude_example_files()`
olivroy May 31, 2024
21fcdc3
Improve knitr notebook support
olivroy May 31, 2024
33a4d33
Sort out uninteresting headings.
olivroy May 31, 2024
d5782e4
Experimental support for displaying topic.
olivroy May 31, 2024
7dac4e5
Fix problem of incorrect dir.
olivroy May 31, 2024
9d07e39
Use mocking for mocking RStudio + change dir
olivroy May 31, 2024
72c3d45
Fix snap
olivroy May 31, 2024
9e00edc
Tweaks based on integration testing,
olivroy May 31, 2024
e45b749
More integration adjustments and addition of examples.
olivroy May 31, 2024
1429c02
Address some comments.
olivroy May 31, 2024
cdd0cc9
Make sure is_doc_title doesn't interleave with , prefer is_object_tit…
olivroy Jun 2, 2024
434b92f
Add files from violetcereza as testing.
olivroy Jun 3, 2024
3cb9225
Test adding indent
olivroy Jun 3, 2024
d11f9d5
Merge main
olivroy Jun 3, 2024
d926d32
Fix conflict [ci skip]
olivroy Jun 3, 2024
1d84af2
Merge
olivroy Jun 3, 2024
8f9978c
merge [ci skip]
olivroy Jun 3, 2024
392591b
Merge main [ci skip]
olivroy Jun 3, 2024
f87c935
Merge
olivroy Jun 3, 2024
4d87700
Merge
olivroy Jun 7, 2024
56a4062
[ci skip] adjust news
olivroy Jun 7, 2024
0fb7264
merge
olivroy Jun 9, 2024
cce2805
notebook are already supported on main.
olivroy Jun 9, 2024
27bdec2
[ci skip]
olivroy Jun 9, 2024
f7105c2
Merge
olivroy Jun 13, 2024
5f665cb
fix merge [ci skip]
olivroy Jun 13, 2024
322e5d0
Merge
olivroy Jun 26, 2024
caa185c
Merge branch 'main' into roxy-parse
olivroy Aug 8, 2024
2 changes: 1 addition & 1 deletion .github/CONTRIBUTING.md
@@ -47,7 +47,7 @@ See our guide on [how to create a great issue](https://code-review.tidyverse.org
* `define_outline_criteria()` if an item shows as outline, but seems like a false positive,


olivroy marked this conversation as resolved.
* `keep_outline_element()`: if an element is **missing** from outline.
* `keep_outline_element()`: if an element is **missing** from outline, you can add the keyword "REQUIRED ELEMENT" to get an object for debugging.

* `define_important_element()` if an element is important [^1]

3 changes: 3 additions & 0 deletions DESCRIPTION
@@ -35,9 +35,12 @@ Suggests:
curl,
gert,
gt,
lightparser,
magick,
pillar,
roxygen2,
testthat (>= 3.2.1),
tidyr,
withr
Config/testthat/edition: 3
Encoding: UTF-8
6 changes: 6 additions & 0 deletions NEWS.md
@@ -36,6 +36,12 @@ that will passed on to `proj_list()`

* `proj_list()` / `proj_switch()` no longer opens a nested project if looking for `"pkgdown"`, `"testthat"`, etc.

* `proj_outline()` was improved to work with roxygen2 and lightparser to parse file contents more consistently. This means a slowdown, but the increased accuracy is worth it! Parsing a single file should still be pretty fast!

* `proj_outline()` gains `exclude_tests` to exclude tests from outline

* `proj_outline()` now detects legacy `fig.cap` in the chunk header. See `knitr::convert_chunk_headers()` for the newer approach.

* `active_rs_doc_nav()` is a new function to navigate to files pane location.

`active_rs_doc_copy()` now accepts copying md and qmd files too and no longer allows renaming Rprofile.
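
The caption detection mentioned in the NEWS entries above can be illustrated with a minimal sketch. It assumes two simplified patterns (a hashpipe `fig-cap`/`tbl-cap` option and a legacy `fig.cap` in the chunk header); `has_caption()` and both patterns are hypothetical and narrower than the regex used in the PR.

```r
# Hypothetical helper: flags lines that carry a figure or table caption.
has_caption <- function(x) {
  # Newer style: `#| fig-cap: ...` or `#| tbl-cap: ...`
  hashpipe <- "^#\\|\\s*(fig|tbl)-(cap|title)\\s*:"
  # Legacy style: fig.cap set inside the chunk header itself
  legacy <- "^```\\{r[^}]*\\bfig\\.cap\\s*="
  grepl(hashpipe, x, perl = TRUE) | grepl(legacy, x, perl = TRUE)
}

chunk_lines <- c(
  "#| fig-cap: A plot",
  "```{r, fig.cap = 'A plot'}",
  "```{r setup}",
  "# fig.cap in prose"
)
has_caption(chunk_lines)
```

Only the first two lines are flagged; a plain chunk header or a comment that merely mentions `fig.cap` is not.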
171 changes: 147 additions & 24 deletions R/outline-criteria.R
@@ -15,21 +15,26 @@
#' * is test title
#' * is a todo item
#' * is_roxygen_line
#' * is_tab_title
#' * is_tab_plot_title
#'
#' @noRd
o_is_roxygen_comment <- function(x, file_ext = NULL) {

o_is_roxygen_comment <- function(x, file_ext = NULL, is_notebook = FALSE) {
if (!is.null(file_ext)) {
is_r_file <- tolower(file_ext) == "r"
is_r_file <- tolower(file_ext) == "r" & !is_notebook
} else {
is_r_file <- TRUE
is_r_file <- !is_notebook
}

if (!any(is_r_file)) {
return(FALSE)
}

ifelse(rep(is_r_file, length.out = length(x)), stringr::str_starts(x, "#'\\s"), FALSE)
ifelse(
rep(is_r_file, length.out = length(x)),
grepl("^#'\\s|^#'$", x), # detect roxygen comments in R files
FALSE # not a roxy comment in Rmd files, fusen is an exception?
)
Comment on lines +22 to +37

Improve readability in o_is_roxygen_comment.

-  if (!is.null(file_ext)) {
-    is_r_file <- tolower(file_ext) == "r" & !is_notebook
-  } else {
-    is_r_file <- !is_notebook
-  }
+  is_r_file <- (!is.null(file_ext) && tolower(file_ext) == "r" || is.null(file_ext)) && !is_notebook

Committable suggestion was skipped due to low confidence.

}
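
A possible line-wise variant, as a minimal sketch: the suggestion above relies on `&&` and `||`, which expect length-one conditions, while `file_ext` and `is_notebook` are per-line vectors in `define_outline_criteria()`. The name `is_roxygen_line()` is hypothetical, and unlike the function in the diff this version always returns one value per line instead of a single `FALSE` when no R file is present.

```r
# Hypothetical helper; mirrors the intent of o_is_roxygen_comment() in this diff.
is_roxygen_line <- function(x, file_ext = NULL, is_notebook = FALSE) {
  # Treat every line as coming from an R file when no extension is supplied.
  is_r_file <- if (is.null(file_ext)) TRUE else tolower(file_ext) == "r"
  # Recycle both flags to the length of `x` so the result is always line-wise.
  is_r_file <- rep(is_r_file, length.out = length(x))
  is_notebook <- rep(is_notebook, length.out = length(x))
  # A roxygen line is "#'" followed by whitespace, or a bare "#'".
  is_r_file & !is_notebook & grepl("^#'\\s|^#'$", x)
}

is_roxygen_line(
  c("#' Title", "#'", "# plain comment", "#' in Rmd"),
  file_ext = c("R", "R", "R", "Rmd")
)
```

The first two lines are roxygen comments in an R file; the plain comment and the Rmd line are not.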

o_is_notebook <- function(x, file, file_ext, line) {
@@ -106,14 +111,23 @@
!stringr::str_detect(x, "expect_error|header\\(\\)|```\\{|guide_")
}

o_is_section_title <- function(x, is_roxygen_comment = FALSE, is_todo_fixme = FALSE) {
is_section_title <- !is_roxygen_comment & !is_todo_fixme & stringr::str_detect(x, "^\\s{0,4}\\#+\\s+(?!\\#)") & !is_roxygen_comment # remove commented add roxygen
o_is_section_title <- function(x, is_roxygen_comment = FALSE, is_todo_fixme = FALSE, roxy_section = FALSE) {
is_section_title <- roxy_section |
(!is_roxygen_comment & !is_todo_fixme & stringr::str_detect(x, "^\\s{0,4}\\#+\\s+(?!\\#)") & !is_roxygen_comment) # remove commented add roxygen
if (!any(is_section_title)) {
return(is_section_title)
}
if (length(is_roxygen_comment) == 1) {
rep(is_roxygen_comment, length.out = length(is_section_title))
}
if (length(roxy_section) == 1) {
rep(roxy_section, length.out = length(is_section_title))
}
if (any(roxy_section)) {
x[roxy_section] <- sub("@section", "", x[roxy_section], fixed = TRUE)
x[roxy_section] <- sub(":$", "", x[roxy_section], fixed = FALSE)

}
uninteresting_headings <- paste(
"(Tidy\\s?T(uesday|emplate)|Readme|Wrangle|Devel)$|error=TRUE",
"url\\{|Error before installation|unreleased|Function ID$|Function Introduced",
@@ -150,14 +164,19 @@

# Add variable to outline data frame --------------------

define_outline_criteria <- function(.data, print_todo) {
define_outline_criteria <- function(.data, exclude_todos) {
dir_common <- get_dir_common_outline(.data$file)
x <- .data
x$file_ext <- s_file_ext(x$file)
x$is_md <- x$file_ext %in% c("qmd", "md", "Rmd", "Rmarkdown")
x$is_news <- x$is_md & grepl("NEWS.md", x$file, fixed = TRUE)
x$is_md <- x$is_md & !x$is_news # treating news and other md files differently.
x$is_test_file <- grepl("tests/testthat/test", x$file, fixed = TRUE)
x$is_notebook <- o_is_notebook(x = x$content, x$file, x$file_ext, x$line)
x$is_roxygen_comment <- o_is_roxygen_comment(x$content, x$file_ext, x$is_notebook)
x$content[x$is_notebook] <- sub("^#'\\s?", "", x$content[x$is_notebook])
x$is_md <- (x$is_md | x$is_roxygen_comment | x$is_notebook) & !x$is_news # treating news and other md files differently.
x$is_snap_file <- grepl("_snaps", x$file, fixed = TRUE)

x$is_roxygen_comment <- o_is_roxygen_comment(x$content, x$file_ext)
if (any(x$is_roxygen_comment)) {
# detect knitr notebooks
@@ -176,37 +195,75 @@
} else {
x$is_notebook <- FALSE
}

should_parse_roxy_comments <-
!isFALSE(getOption("reuseme.roxy_parse", default = TRUE)) && # will not parse if option is set to FALSE
any(x$is_roxygen_comment)
if (should_parse_roxy_comments) {
# doing this created problems in tests?
if (interactive() && !is.null(dir_common) && is_rstudio()) {
# The idea is that roxygen2 may be better at getting objects if directory is changed.
# but don't bother doing this outside RStudio for now...
withr::local_dir(dir_common)
if (!fs::file_exists(x$file[1])) {
cli::cli_abort("Wrong dir done. file = {.file {x$file[1]}}. dir = {.path {dir_common}}", .internal = TRUE)

Codecov (codecov/patch) warning, R/outline-criteria.R#L207-L209: added lines #L207 - L209 were not covered by tests.
}
}
rlang::check_installed(c("roxygen2", "tidyr"), "to create roxygen2 comments outline.")
files_with_roxy_comments <- unique(x[x$is_roxygen_comment, "file", drop = TRUE])
files_with_roxy_comments <- rlang::set_names(files_with_roxy_comments, files_with_roxy_comments)
# roxygen2 messages
# TRICK purrr::safely creates an error object, while purrr::possibly is better.
# Suppress roxygen2 message, suppress callr output, suppress asciicast warnings.
invisible(
utils::capture.output(
parsed_files <- purrr::map(
files_with_roxy_comments,
purrr::possibly(\(x) roxygen2::parse_file(x, env = NULL))))
) |>
suppressMessages() |>
suppressWarnings()
# if roxygen2 cannot parse a file, let's just forget about it.
unparsed_files <- files_with_roxy_comments[vapply(parsed_files, is.null, logical(1))]
# browser()
if (length(unparsed_files) > 0) {
cli::cli_inform("Could not parse roxygen comments in {.file {unparsed_files}}")

Codecov (codecov/patch) warning, R/outline-criteria.R#L230: added line #L230 was not covered by tests.
}
parsed_files <- purrr::compact(parsed_files)
processed_roxy <- join_roxy_fun(parsed_files)
outline_roxy <- define_outline_criteria_roxy(processed_roxy)
} else {
outline_roxy <- NULL
}

x <- dplyr::mutate(
x,
x |> dplyr::filter(!is_roxygen_comment),
# Problematic when looking inside functions
# maybe force no leading space.
# TODO strip is_cli_info in Package? only valid for EDA
# TODO strip is_cli_info in Package? only valid for EDA (currently not showcased..)
is_cli_info = o_is_cli_info(content, is_snap_file, file),
# TODO long enough to be meaningful?
# doc title cannot be after line 50 of a document.
is_doc_title = stringr::str_detect(content, "(?<![-(#\\s?)_[:alpha:]'\"])title\\:.{4,100}") &
!stringr::str_detect(content, "No Description|Ttitle|Subtitle|[Tt]est$|\\\\n") & line < 50 &
!stringr::str_detect(dplyr::lag(content, default = "nothing to detect"), "```yaml"),
is_chunk_cap = stringr::str_detect(content, "\\#\\|.*(cap|title):"),
# deal with chunk cap
# FIXME try to detect all the chunk caption, but would have to figure out the end of it maybe {.pkg lightparser}.
is_chunk_cap_next = is_chunk_cap & stringr::str_detect(content, "\\s*[\\>\\|]$"),
is_chunk_cap = dplyr::case_when(
is_chunk_cap & is_chunk_cap_next ~ FALSE,
dplyr::lag(is_chunk_cap_next, default = FALSE) ~ TRUE,
.default = is_chunk_cap
),
is_chunk_cap_next = is_chunk_cap,
is_obj_caption = stringr::str_detect(content, "\\#\\|\\s{1,2}[:alpha:]{0,5}[\\-\\.]?(cap|title)[:(\\s*=)]|```\\{r.*cap\\s?\\="),
is_test_name = is_test_file & o_is_test_name(content) & !o_is_generic_test(content),
is_todo_fixme = print_todo & o_is_todo_fixme(content, is_roxygen_comment) & !is_snap_file,
is_todo_fixme = !exclude_todos & o_is_todo_fixme(content) & !o_is_roxygen_comment(content, file_ext, is_notebook) & !is_snap_file,
is_section_title = o_is_section_title(content, is_roxygen_comment, is_todo_fixme),
pkg_version = extract_pkg_version(content, is_news, is_section_title),
is_section_title_source = is_section_title &
stringr::str_detect(content, "[-\\=]{3,}|^\\#'") &
stringr::str_detect(content, "[:alpha:]"),
n_leading_hash = nchar(stringr::str_extract(content, "\\#+")),
n_leading_hash = nchar(stringr::str_extract(content, "\\#+(?!\\|)")), # don't count hashpipe
n_leading_hash = dplyr::coalesce(n_leading_hash, 0),
# Make sure everything is second level in revdep/.
n_leading_hash = n_leading_hash + grepl("revdep/", file, fixed = TRUE),
is_second_level_heading_or_more = (is_section_title_source | is_section_title) & n_leading_hash > 1,
# roxygen2 title block
is_object_title = FALSE,
tag = NA_character_,
topic = NA_character_,
is_cross_ref = stringr::str_detect(content, "docs_(links|add.+)?\\(.") & !stringr::str_detect(content, "@param|\\{\\."),
is_function_def = grepl("<- function(", content, fixed = TRUE) & !stringr::str_starts(content, "\\s*#"),
is_tab_or_plot_title = o_is_tab_plot_title(content) & !is_section_title & !is_function_def,
@@ -217,7 +274,73 @@
line == 1 | !nzchar(dplyr::lead(content, default = "")) & !nzchar(dplyr::lag(content)),
.by = "file"
)
# browser()
res <- dplyr::bind_rows(x, outline_roxy)
res <- dplyr::filter(
res,
content != "NULL"
)
res <- dplyr::arrange(res, .data$file, .data$line)
#res$is_object_title[res$is_doc_title] <- FALSE
res
}
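
The parse-then-drop-failures step in `define_outline_criteria()` can be sketched without roxygen2 or purrr. This is an illustration under stated assumptions: `parse_or_null()` is a hypothetical stand-in for `purrr::possibly()`, and base `parse()` stands in for `roxygen2::parse_file()`. The point it shows is that failures have to be detected per element, since `is.null()` on the list itself is a single `FALSE`.

```r
# Stand-in for purrr::possibly(): return NULL instead of raising an error.
parse_or_null <- function(code) {
  tryCatch(parse(text = code), error = function(e) NULL)
}

# Hypothetical file names mapped to their (toy) contents.
sources <- c(good.R = "x <- 1", bad.R = "x <- (", also_good.R = "f <- function() 2")
parsed <- lapply(sources, parse_or_null)

# `parsed` is a list, so is.null(parsed) is FALSE even when elements failed;
# checking each element is what flags the files that could not be parsed.
failed <- vapply(parsed, is.null, logical(1))
unparsed_files <- names(parsed)[failed]

# Keep only the successfully parsed files (what purrr::compact() would do).
parsed <- parsed[!failed]
```

Here `unparsed_files` holds `"bad.R"`, and `parsed` keeps the two files that parsed cleanly.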


define_outline_criteria_roxy <- function(x) {
olivroy marked this conversation as resolved.
# TODO merge with define_outline_criteria

Consider merging define_outline_criteria_roxy with define_outline_criteria.

The TODO comment suggests merging define_outline_criteria_roxy with define_outline_criteria to reduce redundancy. If you need assistance with this, I can help refactor the code or open a GitHub issue to track this task.

if (rlang::is_atomic(x)) {
# in tests, not interactively, got something bizarre
cli::cli_warn("x is {.obj_type_friendly {x}}.")
if (length(x) == 0) {
return(NULL)

Codecov (codecov/patch) warning, R/outline-criteria.R#L293-L295: added lines #L293 - L295 were not covered by tests.
}
}
x$is_md <- x$tag %in% c("subsection", "details", "description", "section")
# short topics are likely placeholders.
x$is_object_title <- x$tag == "title" & nchar(x$content) > 4
x$line <- as.integer(x$line)
x$file_ext <- "R"
# x$content <- paste0("#' ", x$content) # maybe not?
x$is_news <- FALSE
x$is_roxygen_comment <- TRUE
x$is_test_file <- FALSE
x$is_snap_file <- FALSE
x$before_and_after_empty <- TRUE
x$is_section_title <-
(x$tag %in% c("section", "subsection") & o_is_section_title(x$content, roxy_section = TRUE)) |
(x$tag %in% c("details", "description") & stringr::str_detect(x$content, "#\\s"))
x$is_section_title_source <- x$is_section_title
x$is_obj_caption <- FALSE
x$is_test_name <- FALSE
x$pkg_version <- NA_character_
# a family or concept can be seen as a plot subtitle?
x$is_tab_or_plot_title <- x$tag %in% c("family", "concept")
x$is_cli_info <- FALSE
x$is_cross_ref <- FALSE
x$is_function_def <- FALSE
x$is_todo_fixme <- FALSE
x$is_notebook <- FALSE
x$is_doc_title <- FALSE
#x$is_doc_title <- x$line == 1 & x$tag == "title"
x$n_leading_hash <- nchar(stringr::str_extract(x$content, "\\#+"))
x$n_leading_hash <- dplyr::case_when(
x$n_leading_hash > 0 ~ x$n_leading_hash,
# give second importance to doc sections..
x$tag == "section" & x$is_section_title_source ~ 2,
x$tag == "subsection" & x$is_section_title_source ~ 3,
.default = 0
)
x$content <- dplyr::case_when(
!x$is_section_title ~ x$content,
# : removed from section tag in join_roxy_fun()
# code section may not be that interesting..
x$tag == "section" ~ paste0("## ", x$content),
x$tag == "subsection" ~ paste0("### ", x$content),
.default = x$content
)
x$is_second_level_heading_or_more <- ((x$is_section_title_source | x$is_section_title) & x$n_leading_hash > 1)
# x$has_inline_markup <- FALSE # let's not mess with inline markup
x
}
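
How roxygen tags end up as markdown-style headings in the outline can be shown with a minimal sketch. `roxy_heading()` is a hypothetical helper, not part of the PR; it strips the trailing `:` itself, whereas the diff notes that this happens in `join_roxy_fun()`, which is not shown here.

```r
# Hypothetical helper; tag names and heading prefixes follow the diff above.
roxy_heading <- function(tag, content) {
  # Drop the trailing ":" that `@section Title:` carries.
  content <- sub(":$", "", content)
  # Sections become second-level headings, subsections third-level.
  prefix <- ifelse(
    tag == "section", "## ",
    ifelse(tag == "subsection", "### ", "")
  )
  paste0(prefix, content)
}

roxy_heading(
  c("section", "subsection", "title"),
  c("Details on parsing:", "Edge cases", "Outline a project")
)
```

Content from other tags, such as `title`, passes through unchanged.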

# it is {.file R/outline.R} ------
# it is {.file R/outline.R} or {.file R/outline-roxy.R} ------