Updated error message in rbindlist to display the class attributes of mismatching columns #4822

Draft
wants to merge 8 commits into
base: master
Changes from 5 commits
4 changes: 4 additions & 0 deletions .gitignore
@@ -14,6 +14,10 @@ src/Makevars
.emacs.desktop
.emacs.desktop.lock

# Sublime Text IDE files
*.sublime-project
*.sublime-workspace

# RStudio IDE files
.Rproj.user
data.table.Rproj
2 changes: 2 additions & 0 deletions inst/tests/tests.Rraw
@@ -6995,6 +6995,8 @@ DT1 = data.table(date=as.POSIXct("2014-06-22", format="%Y-%m-%d", tz="GMT"))
DT2 = data.table(date=as.Date("2014-06-23"))
test(1494.1, rbind(DT1, DT2), error="Class attribute on column")
test(1494.2, rbind(DT2, DT1), error="Class attribute on column")
test(1494.3, rbind(DT1, DT2), error="Class attribute on column 1 (Date) of item 2 does not match with column 1 (POSIXct, POSIXt) of item 1")
test(1494.4, rbind(DT2, DT1), error="Class attribute on column 1 (POSIXct, POSIXt) of item 2 does not match with column 1 (Date) of item 1")

# test 1495 has been added to melt's test section (fix for #1055)

20 changes: 10 additions & 10 deletions po/data.table.pot
@@ -4066,34 +4066,34 @@ msgid ""
"'message'|'warning'|'error'|'none'. See news item 5 in v1.12.2."
msgstr ""

-#: rbindlist.c:298
+#: rbindlist.c:300
#, c-format
msgid ""
"Column %d of item %d has type 'factor' but has no levels; i.e. malformed."
msgstr ""

-#: rbindlist.c:316
+#: rbindlist.c:320
#, c-format
msgid ""
-"Class attribute on column %d of item %d does not match with column %d of "
-"item %d."
+"Class attribute on column %d (%s) of item %d does not match with column %d "
+"(%s) of item %d."
msgstr ""

-#: rbindlist.c:326
+#: rbindlist.c:332
#, c-format
msgid ""
"Internal error: column %d of result is determined to be integer64 but "
"maxType=='%s' != REALSXP"
msgstr ""

-#: rbindlist.c:362
+#: rbindlist.c:368
#, c-format
msgid ""
"Failed to allocate working memory for %d ordered factor levels of result "
"column %d"
msgstr ""

-#: rbindlist.c:383
+#: rbindlist.c:389
#, c-format
msgid ""
"Column %d of item %d is an ordered factor but level %d ['%s'] is missing "
@@ -4102,22 +4102,22 @@ msgid ""
"factor will be created for this column."
msgstr ""

-#: rbindlist.c:388
+#: rbindlist.c:394
#, c-format
msgid ""
"Column %d of item %d is an ordered factor with '%s'<'%s' in its levels. But "
"'%s'<'%s' in the ordered levels from column %d of item %d. A regular factor "
"will be created for this column due to this ambiguity."
msgstr ""

-#: rbindlist.c:433
+#: rbindlist.c:439
#, c-format
msgid ""
"Failed to allocate working memory for %d factor levels of result column %d "
"when reading item %d of item %d"
msgstr ""

-#: rbindlist.c:523
+#: rbindlist.c:529
#, c-format
msgid "Column %d of item %d: %s"
msgstr ""
23 changes: 12 additions & 11 deletions po/zh_CN.po
@@ -4397,37 +4397,38 @@ msgstr ""
"options()$datatable.rbindlist.check=='%s' 不"
"是'message'|'warning'|'error'|'none'。参见 v1.12.2 更新信息中的第 5 项。"

-#: rbindlist.c:298
+#: rbindlist.c:300
#, c-format
msgid ""
"Column %d of item %d has type 'factor' but has no levels; i.e. malformed."
msgstr ""
"第%2$d 项的第 %1$d 列为因子('factor')类型却没有因子水平(levels),格式错"
"误。"

-#: rbindlist.c:316
+#: rbindlist.c:320
#, c-format
msgid ""
-"Class attribute on column %d of item %d does not match with column %d of "
-"item %d."
-msgstr "第 %2$d 项的第 %1$d 列的类属性与第 %4$d 项的第 %3$d列的不匹配。"
+"Class attribute on column %d (%s) of item %d does not match with column %d "
+"(%s) of item %d."
+msgstr "第 %3$d 项的第 %1$d (%2$s) 列的类属性与第 %6$d 项的第 %4$d (%5$s) "
+"列的不匹配。"

-#: rbindlist.c:326
+#: rbindlist.c:332
#, c-format
msgid ""
"Internal error: column %d of result is determined to be integer64 but "
"maxType=='%s' != REALSXP"
msgstr ""
"内部错误:结果中的第 %d 列应为 integer64 类型,但maxType=='%s' != REALSXP"

-#: rbindlist.c:362
+#: rbindlist.c:368
#, c-format
msgid ""
"Failed to allocate working memory for %d ordered factor levels of result "
"column %d"
msgstr "未能为结果中第 %d 列的 %d 个有序因子水平分配工作内存"

-#: rbindlist.c:383
+#: rbindlist.c:389
#, c-format
msgid ""
"Column %d of item %d is an ordered factor but level %d ['%s'] is missing "
@@ -4439,7 +4440,7 @@ msgstr ""
"(level)['%4$s']在第 %6$d 项第 %5$d 列的有序因子水平中缺失。每组有序因子水平"
"应为其中最长有序因子水平的子集。该列将被创建为一非有序因子列。"

-#: rbindlist.c:388
+#: rbindlist.c:394
#, c-format
msgid ""
"Column %d of item %d is an ordered factor with '%s'<'%s' in its levels. But "
@@ -4450,15 +4451,15 @@ msgstr ""
"的有序因子水平中却 '%5$s'<'%6$s'。由于这种模糊性,该列将被创建为一非有序因子"
"列。"

-#: rbindlist.c:433
+#: rbindlist.c:439
#, c-format
msgid ""
"Failed to allocate working memory for %d factor levels of result column %d "
"when reading item %d of item %d"
msgstr ""
"当读取第%4$d项的第%3$d个子项时,无法为第%2$d列的%1$d个因素水平分配工作内存"

-#: rbindlist.c:523
+#: rbindlist.c:529
#, c-format
msgid "Column %d of item %d: %s"
msgstr "第 %2$d 项的第 %1$d 列: %3$s"
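
The translated messages in this file reorder their arguments with positional conversions such as `%3$d`. Each position has to carry the conversion that matches the type of the argument at that position, so in the class-attribute message the class-name arguments (positions 2 and 5) take `s` while the column and item numbers take `d`. A minimal standalone sketch in plain C, assuming a POSIX `printf` family that supports `%n$` conversions; `format_mismatch` is a hypothetical helper for illustration only and is not part of data.table:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not data.table code): formats a reordered mismatch
 * message. Arguments are passed in the same order as in the C source
 * (column, class, item, first column, first class, first item), while the
 * format string picks them out by position. */
int format_mismatch(char *buf, size_t bufsize,
                    int col, const char *cls, int item,
                    int firstCol, const char *firstCls, int firstItem)
{
  assert(buf != NULL && bufsize > 0);
  /* positions 2 and 5 are strings, so they use 's'; the rest are ints */
  return snprintf(buf, bufsize,
                  "item %3$d, column %1$d (%2$s) vs item %6$d, column %4$d (%5$s)",
                  col, cls, item, firstCol, firstCls, firstItem);
}
```

Calling `format_mismatch(buf, sizeof buf, 1, "Date", 2, 1, "POSIXct, POSIXt", 1)` would write `item 2, column 1 (Date) vs item 1, column 1 (POSIXct, POSIXt)` into `buf`. Using `d` at a string position would be undefined behavior in C, which is why the conversion letter has to follow the argument when a translation reorders it.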
1 change: 1 addition & 0 deletions src/data.table.h
@@ -241,6 +241,7 @@ bool islocked(SEXP x);
SEXP islockedR(SEXP x);
bool need2utf8(SEXP x);
SEXP coerceUtf8IfNeeded(SEXP x);
SEXP concatCharVec(SEXP x, const char *sep);

// types.c
char *end(char *start);
9 changes: 5 additions & 4 deletions src/rbindlist.c
@@ -310,10 +310,11 @@ SEXP rbindlist(SEXP l, SEXP usenamesArg, SEXP fillArg, SEXP idcolArg)
if (firsti==-1) { firsti=i; firstw=w; firstCol=thisCol; }
else {
if (!factor && !int64) {
-if (!R_compute_identical(PROTECT(getAttrib(thisCol, R_ClassSymbol)),
-PROTECT(getAttrib(firstCol, R_ClassSymbol)),
-0)) {
-error(_("Class attribute on column %d of item %d does not match with column %d of item %d."), w+1, i+1, firstw+1, firsti+1);
+SEXP thisClass = PROTECT(getAttrib(thisCol, R_ClassSymbol));
MichaelChirico marked this conversation as resolved.
+SEXP firstClass = PROTECT(getAttrib(firstCol, R_ClassSymbol));
+if (!R_compute_identical(thisClass, firstClass, 0)) {
+error(_("Class attribute on column %d (%s) of item %d does not match with column %d (%s) of item %d."),
+w+1, CHAR(concatCharVec(thisClass, ", ")), i+1, firstw+1, CHAR(concatCharVec(firstClass, ", ")), firsti+1);
}
UNPROTECT(2);
}
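
The hunk above raises the error only when the two class attributes are not identical. As a rough illustration of that check without the R API, the sketch below treats two class vectors as identical only when they have the same length and the same strings in the same order, which is why `("POSIXct", "POSIXt")` and `("Date")` mismatch. `classes_identical` is a hypothetical plain-C stand-in, not what `R_compute_identical` does internally:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for comparing two class vectors (not data.table
 * code): true only if both have the same number of entries and every
 * entry matches, in order. */
bool classes_identical(const char **a, size_t na, const char **b, size_t nb)
{
  if (na != nb) return false;                    /* different lengths never match */
  for (size_t i = 0; i < na; i++) {
    assert(a[i] != NULL && b[i] != NULL);
    if (strcmp(a[i], b[i]) != 0) return false;   /* order matters */
  }
  return true;
}
```

Because the comparison is order-sensitive and length-sensitive, printing both class vectors in the error message, as this PR does, shows the user exactly which side carries which classes.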
35 changes: 35 additions & 0 deletions src/utils.c
@@ -374,3 +374,38 @@ SEXP coerceUtf8IfNeeded(SEXP x) {
return(ans);
}

// Concatenate a character vector into a single CHARSXP, e.g. for printing error messages
// adapted from https://stackoverflow.com/a/58163237
// the returned CHARSXP is PROTECTed; the caller must UNPROTECT(1) after use
SEXP concatCharVec(SEXP x, const char *sep)
{
  char *concatenated = NULL;     /* malloc'd buffer holding the joined string */
  size_t lensep = strlen(sep),   /* length of separator */
         sz = 0;                 /* bytes stored so far, excluding the terminator */

  /* check that a character vector has been passed */
  if (TYPEOF(x) != STRSXP)
    error(_("Internal error: unsupported type '%s' passed to concatCharVec()"), type2char(TYPEOF(x))); // # nocov

  for (R_xlen_t i=0; i<xlength(x); i++) {  /* for each string in x */
    const char *elt = Rf_translateChar(STRING_ELT(x, i));  /* translate once, reuse below */
    size_t len = strlen(elt);
    /* grow the buffer to hold the separator (if any), this string and the terminator */
    char *tmp = realloc(concatenated, sz + (i ? lensep : 0) + len + 1);
    if (!tmp) {  /* validate allocation */
      free(concatenated);  /* error() does not return, so release the old block first */
      error(_("Internal error: memory allocation failure in concatCharVec()")); // # nocov
    }
    concatenated = tmp;  /* assign allocated block to concatenated */
    if (i) {  /* if not first string */
      strcpy(concatenated + sz, sep);  /* copy separator */
      sz += lensep;  /* update stored size */
    }
    strcpy(concatenated + sz, elt);  /* copy string to concatenated */
    sz += len;  /* update stored size */
  }

  /* a zero-length x leaves concatenated NULL, so fall back to the empty string */
  SEXP concatenatedCharVec = PROTECT(mkChar(concatenated ? concatenated : ""));
  free(concatenated);  /* frees the C buffer only; mkChar() has already copied its contents */
Member:

Sorry, I think the comment got misplaced from mobile.

This free applies to the SEXP, right? If so, then it needs to be UNPROTECTed.

Author:

Oh, okay. No, it's actually being called on a char* that is dynamically allocated and used only within concatCharVec().


return concatenatedCharVec; /* return concatenated string */
}
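
The review exchange above turns on ownership: the buffer passed to `free` is an ordinary heap-allocated `char *`, separate from the R object built from it. A standalone sketch of the same join in plain C, without the R API, may make that clearer. `join_strings` is a hypothetical helper, and it sizes the result in a first pass so that it allocates once instead of calling `realloc` per element; that is a swapped-in variation, not the PR's approach:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical standalone helper (not data.table code): joins n C strings
 * with sep into one malloc'd buffer. Returns NULL on allocation failure.
 * The caller owns the result and must free() it. */
char *join_strings(const char **items, size_t n, const char *sep)
{
  assert(sep != NULL);
  size_t lensep = strlen(sep), total = 0;

  /* first pass: total length of all items plus the separators between them */
  for (size_t i = 0; i < n; i++)
    total += strlen(items[i]) + (i ? lensep : 0);

  char *out = malloc(total + 1);   /* +1 for the terminating NUL */
  if (!out) return NULL;

  /* second pass: copy each item, preceded by sep for all but the first */
  char *p = out;
  for (size_t i = 0; i < n; i++) {
    if (i) { memcpy(p, sep, lensep); p += lensep; }
    size_t len = strlen(items[i]);
    memcpy(p, items[i], len);
    p += len;
  }
  *p = '\0';                       /* n == 0 yields the empty string */
  return out;
}
```

With `const char *cls[] = {"POSIXct", "POSIXt"};`, `join_strings(cls, 2, ", ")` yields `POSIXct, POSIXt`, which the caller frees once it has been copied elsewhere. In the PR's version that copy is made by `mkChar()`, so freeing the C buffer afterwards does not affect the returned CHARSXP.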