-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
check coverage across webchem #264
Conversation
I should also note that these functions rely on CTS, which currently just sometimes fail for no apparent reason. Maybe switching to the REST API will help (#257)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Aariq. It's quite a big one, I guess I'll review it in chunks. So far I just tried the check_coverage()
function and it's great! I was wondering if we could find a better name for it, I tend to associate the term "coverage" with test coverage. Just some ideas: since it returns booleans, instead of a command like get()
or check()
, it could be a question, e.g. is_included()
. Or if we go with the command, maybe lookup()
to associate more with looking it up in the register?
Do you imagine this function to be exported on the long term? If yes, then plotting is a good idea. I personally imagined this function to be used by other integrator functions and maybe not even exported at some point, in that case plotting is not really necessary. I'll look into the base graphics, I agree, it would be better not to import another package.
As for the name, I'm not tied to
I wrote this intending it to be exported and used directly by a user with a list of compounds wanting to get some common bit of data (e.g. molecular weight), but without an idea of which of several data sources to use. But I'm actually not sure how useful this is, which is why I asked about it on Slack. Perhaps this is just a rough sketch of some future function that integrates across data sources? |
…ue with _R_CHECK_LENGTH_1_CONDITION_ failing on CRAN
… base R plot I think.
… in a fresh R session.
This reverts commit 04a1399.
How about using the |
The tests are failing because CTS just fails randomly sometimes. Not sure if I should include some kind of workaround for this in the |
…nstead of `choices` ropensci#263
…ch = "best" and from != "name". Match result output is now just the chemical name.
I went ahead and addressed #263 since it seemed in the spirit of the PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Aariq here is the next chunk:) thanks for updating the name for has_entry()
I think it better reflects to purpose of the function. Also data.matrix
, I agree that a lightweight package is better for our needs. I hope it will work in Suggests but if not I think we can live with it in Imports as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Aariq, added a few more comments. I have reviewed everything but integration.R, I'll review it in another session. Thanks for harmonizing the syntax I think this will make webchem much easier to use!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few thoughts for autotranlate()
. Thanks for introducing rlang
, great stuff!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR is almost ready to be merged! I like the functionality of has_entry()
and I think the name is OK, but maybe we could think a little more about that (it's going to stay with us for quite a while). I contemplated the benefit of autotranslate()
for a long time. I understand it's necessary for has_enrty()
and that's fine, but I don't think it should be exported.
This adds two functions that integrate multiple data sources in webchem.
autotranslate()
is a wrapper that you can give a query of any type and the name of a webchem function and it will attempt to translate the query into an identifier the function accepts.check_coverage()
runs aget_*()
or*_query()
from most of the data sources in webchem and returns a tibble indicating which compounds got hits in each data source.Currently it also has the option to plot the results in a sort of table/heatmap, but the plot is made with
ggplot2
. I need help translating that code to base R, because it doesn't make sense to add ggplot2 as a dependency at this time.minor changes
To make all this happen, I fixed some bugs and made the
from=
argument more consistent across the package. (closes #262 and #263)PR task list:
devtools::document()