Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using a function in env= fails #6026

Closed
MichaelChirico opened this issue Mar 28, 2024 · 11 comments · Fixed by #6027
Closed

Using a function in env= fails #6026

MichaelChirico opened this issue Mar 28, 2024 · 11 comments · Fixed by #6027
Assignees
Labels
bug programming parameterizing queries: get, mget, eval, env

Comments

@MichaelChirico
Copy link
Member

DT=data.table(a=1:2, b=3:4)
DT[, f(b), by=a, env=list(f=sum)]
# Error in as.character(jsub[[1L]]) : 
#   cannot coerce type 'builtin' to vector of type 'character'

The issue is that the resulting query, instead of being DT[, sum(b), by=a], becomes DT[, .Primitive("sum")(b), by=a].

The same will happen for any closure:

DT[, f(b), by=a, env=list(f=\(x) sum(x))]
# Error in as.character(jsub[[1L]]) : 
#   cannot coerce type 'closure' to vector of type 'character'

This is not a general issue with [:

DT[, .Primitive("sum")(b), by=a]
#        a    V1
#    <int> <int>
# 1:     1     3
# 2:     2     4
@MichaelChirico
Copy link
Member Author

Just adding that this is a general issue with [, but it's a bit hard to construct examples AFAICT, which is why it hasn't come up. E.g.

DT = data.table(a = 1:3)
eval(substitute(DT[, foo(a)], list(foo = sum)))
# Error in as.character(jsub[[1L]]) : 
#   cannot coerce type 'builtin' to vector of type 'character'

Directly writing .Primitive("sum")(b) as above is not quite the same -- AIUI it's confusing because .Primitive("sum") is just the deparse() output / print string representation of the LHS.

@jangorecki
Copy link
Member

Code without eval call would give better picture what's happening, like verbose via env var

@jangorecki
Copy link
Member

jangorecki commented Jul 28, 2024

I would like to see what vebose says, or even better str() of output from verbose. I am currently not with laptop so cannot check myself.

@MichaelChirico
Copy link
Member Author

MichaelChirico commented Jul 28, 2024

Summarized in the issue:

The issue is that the resulting query, instead of being DT[, sum(b), by=a], becomes DT[, .Primitive("sum")(b), by=a].

c.f.

jsub = substitute(foo(x), list(foo=sum))
jsub
# .Primitive("sum")(x)
jsub[[1L]]
# function (..., na.rm = FALSE)  .Primitive("sum")
# [1] "builtin"
as.character(jsub[[1L]])
# Error in as.character(substitute(foo(x), list(foo = sum))[[1L]]) : 
#   cannot coerce type 'builtin' to vector of type 'character'
jsub[[1L]] == "sum"
# Error in jsub[[1]] == "sum" : 
#   comparison (==) is possible only for atomic and list types

# in the PR, we use deparse(), which _does_ give us a string representation without error:
deparse(jsub[[1L]])
# [1] ".Primitive(\"sum\")"

@jangorecki
Copy link
Member

jangorecki commented Jul 28, 2024

What's wrong about? it is what user requested.
If there is intent to use function then name needs to be passed, not the function itself. This is very different from lapply and others that operates on anonymous function interfaces. Meta programming is about meta programming, function is not meta, name of the function is. Therefore expected usage is as.name("sum") or quote(sum) and not sum. That would introduce inconsistency which is the last thing we want in meta programming interface.

@MichaelChirico
Copy link
Member Author

MichaelChirico commented Jul 28, 2024

That's fine, but [ throws a very confusing error in the issue. Provided PR fixes that. It should be possible to generate this error without using env=, e.g.:

DT=data.table(a=1:2, b=3:4)
eval(substitute(DT[, foo(b), by=a], list(foo=sum)))
# Error in as.character(jsub[[1L]]) : 
#   cannot coerce type 'builtin' to vector of type 'character'

If env= wants to catch incorrect input like list(f=sum), that can be a different issue. I also think this is something that can be addressed in the vignette. Provided PR is useful regardless.

base approach also doesn't have this issue (of course there's not an easy/direct comparison b/c of how our [ operates):

eval(substitute(foo(DT$a), list(foo=sum)))
# [1] 3
eval(substitute(with(DT, foo(a)), list(foo=sum)))
# [1] 3

Also note that the provided PR has nothing to do with how env= is handled per se, so not sure why you think we're introducing inconsistency there.

@jangorecki
Copy link
Member

Inconsistency is when someone will try to actually inject function body via passing function to env argument. Don't have reasonable use case for that but I don't think we can rule out such situation. For env it is actually sufficient to provide "sum" because it is automatically turned into name. So not even need to quote() or as.name(). What seems very reasonable is to add this use case to examples so it is documented to use "sum" rather than sum.

@MichaelChirico
Copy link
Member Author

I think it's a good idea, I think supplying the call programmatically will be common, using f=sum looks pretty natural to me (and I assume others). Having f="sum" documented well up-front will be much appreciated. Filed #6323.

I still think #6027 is needed given how opaque the error is when user approaches the problem incorrectly.

@jangorecki
Copy link
Member

So the error comes from getting the string for a column name (or verbose output), do I understand correctly?

@MichaelChirico
Copy link
Member Author

We have a couple places in [ using check like jsub[[1]]] == "<fun>", implicitly LHS gets coerced to string, but that fails when jsub[[1L]] is an "exotic" object like closure (almost always it's a name or a call like name::name).

@jangorecki
Copy link
Member

got it. thanks. PR looks good

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug programming parameterizing queries: get, mget, eval, env
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants