non-ASCII test robustness (#6375)
* non-ASCII test robustness

* cite GH issue

* Add a note about the skipped test to be reported.

* Mention issue tracker
MichaelChirico authored Aug 19, 2024
1 parent 6cee825 commit 36e9f2c
Showing 2 changed files with 16 additions and 11 deletions.
2 changes: 1 addition & 1 deletion NEWS.md
@@ -90,7 +90,7 @@

6. In `DT[,j,by]`, `by` retains its attributes (e.g. class) when `j` is GForce optimized, [#5567](https://github.com/Rdatatable/data.table/issues/5567). Thanks to @danwwilson for the report, and @ben-schwen for the PR.

-7. `dt[,,by=año]` (i.e., using a column name containing a non-ASCII character in `by` as a plain symbol) no longer errors with "object 'año' not found", #4708. Thanks @pfv07 for the report, and @MichaelChirico for the fix.
+7. `dt[,,by=año]` (i.e., using a column name containing a non-ASCII character in `by` as a plain symbol) no longer errors with "object 'año' not found", #4708. Thanks @pfv07 for the report, and @MichaelChirico for the fix. Also thanks to @aitap for suggesting an improvement to the corresponding test, [#6339](https://github.com/Rdatatable/data.table/issues/6339).

8. Fixed some memory management issues in the C routines backing `melt()`, `froll()`, and GForce `mean()`, as identified by `rchk`. Thanks Tomas Kalibera and the CRAN team for setting up the `rchk` system, and @MichaelChirico for the fix.

25 changes: 15 additions & 10 deletions inst/tests/tests.Rraw
@@ -18724,16 +18724,21 @@ if (test_bit64) local({
})

# non-ASCII plain symbol in by, #4708
-DT = data.table(a = rep(1:3, 2))
-# NB: recall we can't use non-ASCII symbols here. the text is a-<n-tilde>-o (year in Spanish)
-setnames(DT, "a", "a\U00F1o")
-test(2266, eval(parse(text="DT[ , .N, a\U00F1o]$N[1L]")), 2L)
-# sub-key can also be retained in plain query, part of #4498
-DT = data.table(id = rep(1:10, 2L), grp = rep(1:2, each=10L), V = 1:20/13, key=c('id', 'grp'))
-test(2266.1, key(DT[ , .(id)]), 'id')
-test(2266.2, key(DT[ , .(grp)]), NULL)
-## renaming also caught
-test(2266.3, key(DT[ , .(newid = id, newgrp = grp)]), c('newid', 'newgrp'))
+# NB: recall we can't use non-ASCII symbols in the test script. The text is a-<n-tilde>-o (year in Spanish)
+native_ano = iconv("a\U00F1o", "UTF-8", "")
+if (!is.na(native_ano)) { # #6339: symbol must be represented in native encoding
+  DT = data.table(a = rep(1:3, 2))
+  setnames(DT, "a", native_ano)
+  test(2266, eval(parse(text=sprintf("DT[ , .N, %s]$N[1L]", native_ano))), 2L)
+  # sub-key can also be retained in plain query, part of #4498
+  DT = data.table(id = rep(1:10, 2L), grp = rep(1:2, each=10L), V = 1:20/13, key=c('id', 'grp'))
+  test(2266.1, key(DT[ , .(id)]), 'id')
+  test(2266.2, key(DT[ , .(grp)]), NULL)
+  ## renaming also caught
+  test(2266.3, key(DT[ , .(newid = id, newgrp = grp)]), c('newid', 'newgrp'))
+} else {
+  cat("Tests 2266* skipped. n-tilde cannot be represented in the native encoding on this system. Please report to the data.table issue tracker.\n")
+}

# all.equal failed to dispatch to methods of columns, #4543
DT1 = data.table(t = .POSIXct(1590973200, tz='UTC'))
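
The guard introduced in this hunk can be tried outside the test suite. The following is a minimal base-R sketch, not the package's test code: it uses a `data.frame` and `table()` in place of `data.table` grouping so that it runs without the package, and the names `utf8_name`, `native_name`, `ok`, `df` and `counts` are illustrative. The round-trip comparison is an extra assumption of this sketch (some `iconv` implementations may substitute a placeholder character instead of returning `NA`); the commit itself only checks `is.na()`.

```r
# "año" written with a \U escape so that this script stays pure ASCII.
utf8_name = "a\U00F1o"

# to = "" asks iconv() for the session's native encoding; with the default
# sub = NA the result is NA when the string cannot be converted.
native_name = iconv(utf8_name, from = "UTF-8", to = "")

# Extra check (assumption of this sketch, not in the commit): converting
# back to UTF-8 must reproduce the original string.
ok = !is.na(native_name) && identical(enc2utf8(native_name), utf8_name)

if (ok) {
  df = data.frame(a = rep(1:3, 2))
  names(df) = native_name
  # The column name and the parsed symbol both come from the same natively
  # encoded string, so the symbol lookup can find the column.
  counts = eval(parse(text = sprintf("with(df, table(%s))", native_name)))
  print(as.vector(counts))
} else {
  cat("Skipped: n-tilde cannot be represented in the native encoding.\n")
}
```

In a UTF-8 or Latin-1 locale the first branch is expected to run; in a plain C locale the conversion typically fails and the skip message is printed instead, which mirrors the `else` branch added to `tests.Rraw`.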
