[load] Minor tweak to loadvals() and update documentation (#880)
* [load] skip empty input on write

* [load] hide `calc_chain` argument and extend the documentation for `wb_load()`

* update NEWS
JanMarvin authored Dec 31, 2023
1 parent ca06d05 commit 85503cf
Showing 7 changed files with 120 additions and 91 deletions.
6 changes: 6 additions & 0 deletions NEWS.md
@@ -1,5 +1,11 @@
# openxlsx2 (development version)

## Documentation improvement

* Further tweaks to documentation and vignettes to make them more consistent.
* `wb_add_pivot_table()` / `wb_add_slicer()`
* `wb_load()`: `calc_chain` is no longer visible and the previous text that might have been misleading in regards of its use, has been replaced by a more detailed description of what are the consequences of keeping the calculation chain

## New features

* Allow further modifications of comments. The background can now be filled with a color or an image. [870](https://github.com/JanMarvin/openxlsx2/pull/870)
5 changes: 1 addition & 4 deletions R/class-workbook.R
@@ -1924,23 +1924,20 @@ wbWorkbook <- R6::R6Class(
#' @description load workbook
#' @param file file
#' @param data_only data_only
#' @param calc_chain calc_chain
#' @return The `wbWorkbook` object invisibly
load = function(
file,
sheet,
data_only = FALSE,
calc_chain = FALSE,
...
) {
# Is this required?
if (missing(file)) file <- substitute()
if (missing(file)) file <- substitute()
if (missing(sheet)) sheet <- substitute()
self <- wb_load(
file = file,
sheet = sheet,
data_only = data_only,
calc_chain = calc_chain,
... = ...
)
invisible(self)
86 changes: 51 additions & 35 deletions R/wb_load.R
@@ -1,60 +1,76 @@
#' Load an existing .xlsx file
#' Load an existing .xlsx, .xlsm or .xlsb file
#'
#' `wb_load()` returns a [wbWorkbook] object conserving styles and
#' formatting of the original input file.
#' `wb_load()` returns a [wbWorkbook] object conserving the content of the
#' original input file, including data, styles, media. This workbook can be
#' modified, read from, and be written back into a xlsx file.
#'
#' A warning is displayed if an xml namespace for main is found in the xlsx file.
#' Certain xlsx files created by third-party applications contain a namespace
#' (usually `x`). This namespace is not required for the file to work in spreadsheet
#' software and is not expected by `openxlsx2`. Therefore it is removed when the
#' file is loaded into a workbook. Removal is generally expected to be safe,
#' but the feature is still experimental.
#' @details
#' If a specific `sheet` is selected, the workbook will still contain sheets
#' for all worksheets. The argument `sheet` and `data_only` are used internally
#' by [wb_to_df()] to read from a file with minimal changes. They are not
#' specifically designed to create rudimentary but otherwise fully functional
#' workbooks. It is possible to import with
#' `wb_load(data_only = TRUE, sheet = NULL)`. In this way, only a workbook
#' framework is loaded without worksheets or data. This can be useful if only
#' some workbook properties are of interest.
#'
#' Initial support for binary openxml files (`xlsb`) has been added to the package.
#' We parse the binary file format into pseudo-openxml files that we can import.
#' Therefore, after importing, it is possible to interact with the file as if it
#' had been provided as xlsx in the first place. This is of course slower than
#' reading directly from the binary file. Our implementation is also still missing
#' some features: some array formulas are still broken, conditional formatting and
#' data validation are not implemented, nor are pivot tables and slicers.
#' There are some internal arguments that can be passed to wb_load, which are
#' used for development. The `debug` argument allows debugging of `xlsb` files
#' in particular. With `calc_chain` it is possible to maintain the calculation
#' chain. The calculation chain is used by spreadsheet software to determine
#' the order in which formulas are evaluated. Removing the calculation chain
#' has no known effect. The calculation chain is created the next time the
#' worksheet is loaded into the spreadsheet. Keeping the calculation chain
#' could only shorten the loading time in said software. Unfortunately, if a
#' cell is added to the worksheet, the calculation chain may block the
#' worksheet as the formulas will not be evaluated again until each individual
#' cell with a formula is selected in the spreadsheet software and the Enter
#' key is pressed manually. It is therefore strongly recommended not to
#' activate this function.
#'
#' It is possible to import with `wb_load(data_only = TRUE, sheet = NULL)`. This
#' way only a workbook skeleton is loaded. This can be useful if only some
#' workbook properties are of interest.
#' In rare cases, a warning is issued when loading an xlsx file that an xml
#' namespace has been removed from xml files. This refers to the internal
#' structure of the loaded xlsx file. Certain xlsx files created by third-party
#' applications contain a namespace (usually x). This namespace is not required
#' for the file to work in spreadsheet software and is not expected by
#' `openxlsx2`. It is therefore removed when the file is loaded into a
#' workbook. Removal is generally considered safe, but the feature is still not
#' commonly observed, hence the warning.
#'
#' Initial support for binary openxml files (`xlsb`) has been added to the
#' package. We parse the binary file format into pseudo-openxml files that we
#' can import. Therefore, once imported, it is possible to interact with the
#' file as if it had been provided in xlsx file format in the first place. This
#' parsing into pseudo xml files is of course slower than reading directly from
#' the binary file. Our implementation is also still missing some functions:
#' some array formulas are not yet correct, conditional formatting and data
#' validation are not implemented, nor are pivot tables and slicers.
#'
#' @param file A path to an existing .xlsx, .xlsm or .xlsb file
#' @param sheet optional sheet parameter. if this is applied, only the selected
#' sheet will be loaded. This can be a numeric, a string or `NULL`.
#' @param data_only mode to import if only a data frame should be returned. This
#' strips the `wbWorkbook` to a bare minimum.
#' @param calc_chain optionally you can keep the calculation chain intact. This
#' is used by spreadsheet software to identify the order in which formulas are
#' evaluated. Removing the calculation chain is considered harmless. The calc
#' chain will be created upon the next time the worksheet is loaded in
#' spreadsheet software. Keeping it, might only speed loading time in said
#' software.
#' @param ... additional arguments
#' @return A Workbook object.
#' @export
#' @examples
#' ## load existing workbook from package folder
#' wb <- wb_load(file = system.file("extdata", "openxlsx2_example.xlsx", package = "openxlsx2"))
#' wb$get_sheet_names() # list worksheets
#' wb ## view object
#' ## Add a worksheet
#' wb$add_worksheet("A new worksheet")
#' ## load existing workbook
#' fl <- system.file("extdata", "openxlsx2_example.xlsx", package = "openxlsx2")
#' wb <- wb_load(file = fl)
#' @export
wb_load <- function(
file,
sheet,
data_only = FALSE,
calc_chain = FALSE,
...
) {

debug <- list(...)$debug
xlsx_file <- list(...)$xlsx_file
calc_chain <- list(...)$calc_chain
debug <- list(...)$debug
xlsx_file <- list(...)$xlsx_file
standardize_case_names(...)

if (is.null(calc_chain)) calc_chain <- FALSE
if (is.null(debug)) debug <- FALSE

if (!is.null(xlsx_file)) {
1 change: 0 additions & 1 deletion inst/WORDLIST
@@ -47,7 +47,6 @@ autocompletion
bandedCols
bandedRows
bool
calc
calcChain
calculatedColumn
camelCase
4 changes: 1 addition & 3 deletions man/wbWorkbook.Rd

Some generated files are not rendered by default.

79 changes: 46 additions & 33 deletions man/wb_load.Rd

Some generated files are not rendered by default.

30 changes: 15 additions & 15 deletions src/openxlsx2_types.h
@@ -99,21 +99,21 @@ inline SEXP wrap(const std::vector<xml_col> &x) {

// struct to vector
for (size_t i = 0; i < n; ++i) {
r[i] = Rcpp::String(x[i].r);
row_r[i] = Rcpp::String(x[i].row_r);
c_r[i] = Rcpp::String(x[i].c_r);
c_s[i] = Rcpp::String(x[i].c_s);
c_t[i] = Rcpp::String(x[i].c_t);
c_cm[i] = Rcpp::String(x[i].c_cm);
c_ph[i] = Rcpp::String(x[i].c_ph);
c_vm[i] = Rcpp::String(x[i].c_vm);
v[i] = Rcpp::String(x[i].v);
f[i] = Rcpp::String(x[i].f);
f_t[i] = Rcpp::String(x[i].f_t);
f_ref[i] = Rcpp::String(x[i].f_ref);
f_ca[i] = Rcpp::String(x[i].f_ca);
f_si[i] = Rcpp::String(x[i].f_si);
is[i] = Rcpp::String(x[i].is);
if (!x[i].r.empty()) r[i] = Rcpp::String(x[i].r);
if (!x[i].row_r.empty()) row_r[i] = Rcpp::String(x[i].row_r);
if (!x[i].c_r.empty()) c_r[i] = Rcpp::String(x[i].c_r);
if (!x[i].c_s.empty()) c_s[i] = Rcpp::String(x[i].c_s);
if (!x[i].c_t.empty()) c_t[i] = Rcpp::String(x[i].c_t);
if (!x[i].c_cm.empty()) c_cm[i] = Rcpp::String(x[i].c_cm);
if (!x[i].c_ph.empty()) c_ph[i] = Rcpp::String(x[i].c_ph);
if (!x[i].c_vm.empty()) c_vm[i] = Rcpp::String(x[i].c_vm);
if (!x[i].v.empty()) v[i] = Rcpp::String(x[i].v);
if (!x[i].f.empty()) f[i] = Rcpp::String(x[i].f);
if (!x[i].f_t.empty()) f_t[i] = Rcpp::String(x[i].f_t);
if (!x[i].f_ref.empty()) f_ref[i] = Rcpp::String(x[i].f_ref);
if (!x[i].f_ca.empty()) f_ca[i] = Rcpp::String(x[i].f_ca);
if (!x[i].f_si.empty()) f_si[i] = Rcpp::String(x[i].f_si);
if (!x[i].is.empty()) is[i] = Rcpp::String(x[i].is);
}

// Assign and return a dataframe
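
The "skip empty input on write" change in the loop above can be illustrated with a minimal, standalone sketch. It uses only the C++ standard library instead of Rcpp, and the names `cell`, `columns` and `to_columns` are hypothetical and not part of `openxlsx2`. It also assumes the output vectors start out filled with empty strings, so that skipping an empty field leaves the same value in place; the diff does not show how the real vectors are initialized.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the package's xml_col struct; only two fields shown.
struct cell {
  std::string v;  // cell value
  std::string f;  // cell formula
};

// Result columns plus a counter of how many element writes were performed.
struct columns {
  std::vector<std::string> v;
  std::vector<std::string> f;
  std::size_t writes = 0;
};

// Copy struct fields into column vectors, skipping empty fields.
// Assumption: the output vectors are pre-filled with empty strings,
// so a skipped field and a written empty field give the same result.
columns to_columns(const std::vector<cell> &x) {
  const std::size_t n = x.size();
  columns out;
  out.v.assign(n, std::string());
  out.f.assign(n, std::string());
  assert(out.v.size() == n && out.f.size() == n);

  // struct to vector, writing only non-empty fields
  for (std::size_t i = 0; i < n; ++i) {
    if (!x[i].v.empty()) { out.v[i] = x[i].v; ++out.writes; }
    if (!x[i].f.empty()) { out.f[i] = x[i].f; ++out.writes; }
  }
  return out;
}
```

In this sketch the `writes` counter exists only to make the effect visible: for sparse input, where most fields of most cells are empty, far fewer assignments are performed than the unconditional version would make, while the resulting columns are identical.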