-
Notifications
You must be signed in to change notification settings - Fork 978
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated error message in rbindlist to display the class attributes of mismatching columns #4822
base: master
Are you sure you want to change the base?
Conversation
… mismatching columns - added 2 tests - added joinCharVec utility function to join each element of a character vector into a single C string - updated translations for the new error message - added *.sublime-project and *.sublime-workspace to .gitignore
Codecov Report
@@ Coverage Diff @@
## master #4822 +/- ##
=======================================
Coverage 99.47% 99.47%
=======================================
Files 73 73
Lines 14589 14611 +22
=======================================
+ Hits 14512 14534 +22
Misses 77 77
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for PR. Some minor feedback left, haven't really looked at the logic.
- renamed joinCharVec to concatCharVec - fixed spacing - added # nocov - fixed calls to 'free' - moved variable declarations to point of use
@jangorecki, @MichaelChirico thanks for the quick feedback! Please let me know if there's anything else. I've converted back to draft since I still need to update the translation files - will do those next if everything else is looking good. |
don't bother with the translations part, we handle that in one go before release. |
src/rbindlist.c
Outdated
if (!R_compute_identical(thisClass, firstClass, 0)) { | ||
#define BUFSIZE 1024 // 'error' function length limit in R's errors.c is 8192 | ||
char thisClassBuf[BUFSIZE]; | ||
char* thisClassStr = concatCharVec(thisClass, ", "); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can concatCharVec()
has an argument length
? So that we can avoid this assigning to thisClassStr
then to thisClassBuf
code, which may be more concise, IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add a length
argument, but on its own I don't think that wouldn't make the code more concise.
The real issue here is the dynamic allocation of the array within concatCharVec()
. I think that memory needs to be freed, but the array also needs to be passed to the error()
function, and I'm not sure if there is a way to free it after error()
has been called.
Hence we're copying the values to the intermediate statically allocated arrays before the dynamic arrays are freed, and then passing those static arrays to error()
.
I'm open to feedback if there is a better way to handle this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think return a SEXP object should solve your concerns as SEXP object could be freed by R.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh, that makes things much simpler! Thanks :)
What needs to be |
} | ||
|
||
SEXP concatenatedCharVec = PROTECT(mkChar(concatenated)); | ||
free(concatenated); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry I think the comment got misplaced from mobile.
This free
applies to the SEXP
right? if so then it needs to be UNPROTECT
ed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, okay. No, it's actually being called on a char*
which is dynamically allocated and used only within concatCharVec()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea and improvement.
There is already concat
in fmelt.c
. Can that be enhanced and reused for this usage? The main enhancement needed being able to be called twice from one error/warning message. It constrains the maximum length of the string which is needed, and this PR doesn't do currently. When its idx
is NULL
, rather than error as it does now, it could concat all the items in order as needed in this PR.
Or, if it were me, I would have just included the first item of the class vector in the message; no concat would be needed. In this example, POSIXct
!= Date
is already clear and helpful to the user. They don't need to know that POSIXct inherits from POSIXt too. I can only think of esoteric constructed examples where the message might state that myClass != myClass (and thus be confusing to the user) because myClass somehow didn't inherit the same way.
If you decide to include all the classes in the message and reuse fmelt.c:concat
, its static
could be removed and its signature added to data.table.h
so it could be used here.
Also please remove the translation changes from PR as Michael said it will be done as part of release procedures in one shot. Or, since it has already been translated, maybe @MichaelChirico is ok to leave the translation in this PR this time? Up to him on that.
yes, please revert the translations since unfortunately they're incorrect 😄 -- we'll let a native speaker handle updating them (and other new/changed messages) in one go before release. Thanks! I've added a small section on Translation to the Contributing wiki: https://github.com/Rdatatable/data.table/wiki/Contributing |
Hi @logworthy would you like to resume this PR? Please take note of related PRs like #5874 -- is the functionality here still needed given recent enhancements? cc @ben-schwen again. |
Definitely an improvement, but would go with Matt's comment and only return the first item of class vector, which makes it much simpler. It won't interfere with my recent PRs since they try to solve the problem and not just improve the error message. |
Thanks... marking as draft until #5570 is merged, unless someone wants to rebase this PR relative to that one first. |
Thought it would be helpful to provide a little more context to this error message.
Old:
Class attribute on column 1 of item 2 does not match with column 1 of item 1.
New:
Class attribute on column 1 (POSIXct, POSIXt) of item 2 does not match with column 1 (Date) of item 1.
I've added some tests and attempted to update the translations.