- `AutoHierarchies()` has been updated to recognize common from-to names, and the `sign` variable is now optional.
  - See the new parameter `autoNames` for details on common from-to names.
  - Also note the new parameter `autoLevel`, with a default value (`TRUE`) that ensures the function behaves as it always has.
  - NAs in the 'to' variable are now allowed to support common hierarchies, and rows where 'to' == 'from' are also allowed. Such rows are removed before processing the hierarchy, with a warning when relevant (Codes removed due to 'to' == 'from' or 'to' == NA).
  - Output from functions like `get_klass()` in the klassR package or `hier_create()` in the sdcHierarchies package can now be used directly as input.
  - Example of usage:

        a <- get_klass(classification = "24")
        b <- hier_create(root = "Total", nodes = LETTERS[1:5])
        AutoHierarchies(list(tree = a, letter = b))
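The row removal described above can be pictured with a minimal base R sketch. This is only an illustration of the rule, not the code used inside `AutoHierarchies()`; the data frame `h` and its column names `from` and `to` are hypothetical.

```r
# Hypothetical from-to hierarchy with one self-reference and one NA in 'to'
h <- data.frame(
  from = c("A", "B", "Total", "C"),
  to   = c("Total", "Total", "Total", NA),
  stringsAsFactors = FALSE
)

# Flag rows where 'to' is NA or equal to 'from'
drop <- is.na(h$to) | h$to == h$from

# Warn only when something is actually removed
if (any(drop)) {
  warning("Codes removed due to 'to' == 'from' or 'to' == NA: ",
          paste(h$from[drop], collapse = ", "))
}

# Rows that remain for further processing
h_clean <- h[!drop, ]
```

Here the rows for `Total` (self-reference) and `C` (missing 'to') are flagged, and the two remaining rows map `A` and `B` to `Total`.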
- New hierarchy functionality with hierarchies coded as variables (minimal datasets):
  - New function `hierarchies_as_vars()`:
    - Hierarchies coded as variables.
  - New function `vars_to_hierarchies()`:
    - Transform hierarchies coded as variables to "to-from" format.
    - A kind of reverse operation of `hierarchies_as_vars()`.
  - New function `map_hierarchies_to_data()`:
    - Add variables to dataset based on hierarchies.
    - Uses `hierarchies_as_vars()` to transform hierarchies, followed by mapping to the dataset.
- New function `max_contribution()` with wrapper `n_contributors()`.
  - Find major contributions to aggregates and count contributors.
  - Improved versions of `MaxContribution()` and `Ncontributors()` developed in the GaussSuppression package.
- New function `table_all_integers()`.
  - Table all integers from 1 to n.
- New function `total_collapse()`.
  - Collapse variables to single representation.
- New function `substitute_formula_vars()`.
  - Part of the utility functions listed under `?formula_utils`.
  - An improved version of `formula_include_hierarchies()`, which has been renamed for clarity and corrected to produce the intended output.
- Allow "empty terms" in `FormulaSums()` when `viaSparseMatrix = TRUE`.
  - "Empty terms" refer to cases where no columns exist in the model matrix due to `NAomit`.
  - The old method (`viaSparseMatrix = FALSE`) already handled this correctly.
- Minor improvement to `Extend0()`.
  - Now allows 0 input rows when `hierarchical = FALSE`.
- Minor improvement to `FormulaSelection()` and its identical wrapper `formula_selection()`.
  - Now supports 0-length selections.
- The function `FormulaSelection()` and thereby the identical wrapper `formula_selection()` have been generalized.
  - New parameter named `logical`: When `TRUE`, the logical selection vector is returned.
  - `FormulaSelection()` is now a generic function, allowing methods for other input objects to be added.
- The `GaussSuppression()` function and related functionality have now been documented in a "Privacy in Statistical Databases 2024" paper.
  - The package description and function documentation have been updated with this reference (Langsrud, 2024).
- Now the `data.table` package is listed under Suggests and can be utilized in two functions. See below.
- New function, `aggregate_by_pkg()`.
  - This function aggregates data by specified grouping variables, using either base R or `data.table`.
  - Note the parameter `include_na`: A logical value indicating whether `NA` values in the grouping variables should be included in the aggregation. Default is `FALSE`.
  - Will be used in packages depending on SSBtools.
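To make the role of `NA` values in grouping variables concrete, here is a minimal base R sketch. It does not call `aggregate_by_pkg()` and is not its implementation; the data frame `d` and the placeholder label `"(missing)"` are assumptions made for illustration only.

```r
# Hypothetical data with missing values in the grouping variable
d <- data.frame(
  g = c("a", NA, "a", "b", NA),
  y = 1:5,
  stringsAsFactors = FALSE
)

# Base aggregate() drops rows whose grouping value is NA,
# which corresponds to excluding NA groups
res_drop <- aggregate(d["y"], by = d["g"], FUN = sum)

# One simple way to keep those rows: give NA an explicit label first
d2 <- d
d2$g[is.na(d2$g)] <- "(missing)"
res_keep <- aggregate(d2["y"], by = d2["g"], FUN = sum)
```

In `res_drop` the two rows with a missing group are left out of the sums, while `res_keep` has an extra group that collects them.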
- `NAomit` is a new parameter to `RowGroups()` and `Formula2ModelMatrix()`/`FormulaSums()`.
  - This is about NAs in the grouping variables.
  - The parameter can be used as input to `ModelMatrix()`.
- `pkg` is a new parameter to `RowGroups()`.
  - Must be either `"base"` (default) or `"data.table"` (for improved speed).
- Improved speed of `Formula2ModelMatrix()`/`FormulaSums()`.
  - Thus, improved speed of `ModelMatrix()`.
  - Now, the model matrix is constructed by a single call to `Matrix::sparseMatrix()` instead of building the transposed matrix with `rbind()` based on numerous `Matrix::fac2sparse()` calls.
  - Further speed improvement can be achieved by setting the new parameter, `rowGroupsPackage`, to `data.table`.
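The difference between the two construction strategies can be sketched with the Matrix package alone. This is a toy illustration of the general idea, assuming two hypothetical grouping variables `a` and `b`; it is not the code used in `Formula2ModelMatrix()`.

```r
library(Matrix)

# Hypothetical toy data with two grouping variables
d <- data.frame(a = c("x", "y", "x", "y"), b = c("p", "p", "q", "q"))
fa <- factor(d$a)
fb <- factor(d$b)

# Stacking approach: one fac2sparse() call per variable gives a
# (levels x observations) matrix; rbind() stacks them, t() transposes
m_old <- t(rbind(fac2sparse(fa), fac2sparse(fb)))

# Single-call approach: collect all (row, column) positions first,
# then build the whole matrix with one sparseMatrix() call
i <- rep(seq_len(nrow(d)), 2)
j <- c(as.integer(fa), nlevels(fa) + as.integer(fb))
m_new <- sparseMatrix(
  i = i, j = j, x = rep(1, length(i)),
  dims = c(nrow(d), nlevels(fa) + nlevels(fb)),
  dimnames = list(NULL, c(levels(fa), levels(fb)))
)

# Both approaches give the same dummy matrix
same <- all(as.matrix(m_old) == as.matrix(m_new))
```

With many terms, the stacking approach repeats the matrix-building overhead once per term, whereas the single-call approach only concatenates index vectors before one construction step.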
- An efficiency bug in `ModelMatrix()` is fixed.
  - With `viaOrdinary = TRUE`, `model.matrix()` or `sparse.model.matrix()` was called twice.
- `combine_formulas()` is improved.
  - Fixed a long-string problem that occurred with long formulas.
- Some technical changes in documentation to comply with standards.
- The `ModelMatrix()` function and related functionality for hierarchical computations have now been documented in a paper in The R Journal.
  - The package description has been updated with this reference (Langsrud, 2023).
- Now, `remove_empty` is an explicit parameter to `model_aggregate()`.
  - Previously, this had to be done via the `mm_args` parameter. Old code works as before.
- Some tools for formula manipulation are included.
  - See `?formula_utils`.
- Minor change in `Extend0()` to allow even more advanced possibilities by the `varGroups` attribute.
- Fix for a rare problem in `GaussSuppression()`.
  - Could happen with parallel eliminations combined with integer overflow. The problem then showed up as the warning message: longer object length is not a multiple of shorter object length
- Minor change to the singleton method `"anySum"` in `GaussSuppression()` to align with best theory.
  - In practice, this rarely makes a difference.
  - The previous behavior can be ensured by setting `singletonMethod` to either `"anySumOld"` or `"anySumNOTprimaryOld"`.
- Fixed a zero-weight issue in `quantile_weighted()`.
  - Now, `quantile_weighted(x=c(0,2,0), weights = c(1,1,0))` correctly outputs the 50% value as 1.
- A function for checking function inputs has been included and can be used as either `CheckInput()` or `check_input()`.
  - The function was originally created in 2016 and has been included in internal packages at Statistics Norway (SSB). Due to its widespread use, it was beneficial to include it in this CRAN package.
- Last version before any news