-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Functions to write to a database #16
Comments
I have thought about if and how to add write functions. dbplyr does have a set of functions that do so, which would probably be the method/syntax of how we go about adding them. So right now the plan is TBD I'd say. SQLstrings is an interesting package I have thought about as an example for how to safely write to the underlying database. |
For MySql and MariaDB is there no way to do it without a csv intermediary ? I'm just curious. |
In R the DBI package has some function to write tables to databases, but they are extremely slow; probably write line by line. I don't know of any approach to this in Julia. |
With a tiny foray into writing with I think |
Hello, @drizk1! I am AFK now, but I can paste here from my phone a draft of what I use in production at my job. Basically for each dataframe I save a CSV file with some odd formatting (to avoid clashes with csv new lines) and then upload it to mariadb. It works really fast and have the IGNORE or REPLACE options, which are good enough for me. Don't mind the type piracies along the way.
Do you have any suggestions on how to do that in a more "universal" manner (ie to other databases)? |
As far as database compatibility from what I've read / seen working on TidierDB, most databases use pretty similar patterns for writing so that shouldn't be an issue. I'm curious, how come you opted to use a csv as opposed to a dataframe and something like As far as workflows, Are these local files that you need to put on the database ? Or is it something you've wrangled from the db but want to make a permanent table |
I came up with this method in R, because the writetable function for mariadb there is terribly slow (the driver writes line by line and there is no option to upsert). When migrating some packages to Julia, I just translated the code and it seemed to work really well. As for the csv file: it is a temporary file where the code runs, and then you upload it to the mariadb server (that is: it is not a file created directly on the mariadb server). (maybe I did not understand your question) |
I found the discussion here: |
Ohh ok well I really like this. My question was unclear initially but its essentially are you writing a table from a sql query for example Table already in db -> sql query wrangle -> write as new table Or Local dataframe -> Julia wrangle -> write as new table Because if it's the latter of the 2, I think makes sense to use the temp csv version you've got |
It's the second case. A table in Julia memory -> csv -> database. If the table is already in mariadb, we can always collect it first to julia memory as a dataframe |
Ahh ok perfect. So this to me sounds quite similar to My thought would be to implement a And then later on we can implement a How does that sound to you? If that sounds good to you, would you be able to throw in a PR adding |
Is this enough? I tested on my MariaDB and it went fine. |
I think for now i will close this. every database is not yet supported, but perhaps fresh issues as they are needed will be ok. |
We now have tools to collect data from databases using the TidierData language, but what about writing to databases?
In R, I had to create a function that saves a temporary csv file and then upload it to MariaDB. I was able to do the same with Julia. This has the advantage of choosing whether to insert with REPLACE or IGNORE duplicate keys.
Do you plan to add these kinds of function to TidierDB? I can help with the MySQL/MariaDB part.
The text was updated successfully, but these errors were encountered: