SQL magic drops column if all row value is NaN #611
Comments
Yeah, issue is Spark's |
One solution is to have the first line of the result be the list of column names. This is annoying to do because |
Example of the problem: Input:
Output:
|
The code that needs fixing is |
If we ever get #598 in, we can drop two of the languages. As a third alternative, I think I have some memory of SQL support in Livy, too? But that would presumably not let us fix this bug (although perhaps Livy has fixed it). |
Hello is there any more news on this? I'm currently facing the same problem. Thanks |
If a column has a null value in every row/record, %%sql will drop that entire column.
To reproduce, create a table where a column has only null values, e.g.
%%sql
insert into table
values (1, null),
(2, null),
(3, null)
I have attached screenshots using results from %%sql and spark.sql()
Screen Shot 2019-12-26 at 2.50.52 pm.pdf
Versions:
Additional context
I believe the problem is that the JSON output doesn't include null values, so when the data is converted into a dict and then into a dataframe, there is no way to know that a column is missing:
https://github.com/jupyter-incubator/sparkmagic/blob/master/sparkmagic/sparkmagic/utils/utils.py#L52
https://github.com/jupyter-incubator/sparkmagic/blob/master/sparkmagic/sparkmagic/livyclientlib/sqlquery.py#L58
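To illustrate the suspected cause, here is a minimal sketch using only the standard library. The JSON lines are made up for illustration (they assume each row is serialized with its null fields left out), and this is not the actual sparkmagic code:

```python
import json

# Hypothetical JSON lines: a two-column table (id, value) where "value"
# is null in every row, so the key never appears in the output.
json_rows = ['{"id": 1}', '{"id": 2}', '{"id": 3}']

records = [json.loads(line) for line in json_rows]

# Inferring the columns from the keys present in the records can never
# recover "value", because no record carries that key.
inferred_columns = sorted({key for record in records for key in record})
print(inferred_columns)  # ['id']
```

Anything that builds the dataframe purely from these records sees only the `id` column.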
We need a way to pick up the schema before populating all the data.
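One way this could look, following the suggestion in the comments to send the column names as the first line of the result, is sketched below. The function name `rows_with_schema` and the payload layout are assumptions for illustration only, not what sparkmagic or Livy actually produce:

```python
import json


def rows_with_schema(lines):
    """Parse a header line of column names, then one JSON object per row.

    Hypothetical helper: assumes the first line is a JSON list of column
    names and every following line is a JSON object for one row.
    """
    lines = iter(lines)
    columns = json.loads(next(lines))
    rows = []
    for line in lines:
        record = json.loads(line)
        # dict.get returns None for keys that were left out of the JSON.
        rows.append([record.get(column) for column in columns])
    return columns, rows


# Made-up payload: the schema line first, then rows with nulls omitted.
payload = ['["id", "value"]', '{"id": 1}', '{"id": 2}', '{"id": 3}']
columns, rows = rows_with_schema(payload)
print(columns)  # ['id', 'value']
print(rows)     # [[1, None], [2, None], [3, None]]
```

Because the column list is known before any row is read, the all-null column survives. The resulting `rows` and `columns` could then be handed to something like `pandas.DataFrame(rows, columns=columns)`.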