AttributeError: 'NoneType' object has no attribute 'to_api_repr' #836

Closed
berniwal opened this issue Dec 12, 2024 · 4 comments · Fixed by #838
Labels
api: bigquery Issues related to the googleapis/python-bigquery-pandas API. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@berniwal

In pandas-gbq 0.24.0 there seems to be a bug when writing a dataframe to a BigQuery table if the dataframe has nested columns that contain only NULL values. The column types should still be unambiguous, since we pass an explicit schema with the type information.

With pandas-gbq 0.24.0 we get "AttributeError: 'NoneType' object has no attribute 'to_api_repr'". With the previous version, 0.23.0, the error does not occur and the script works.

I assume the reason is that "gbq.py" tries to infer a schema for the dataframe even though a schema is provided. The function "_generate_bq_schema" fails even though calling it would not be necessary in this case:

  default_schema = _generate_bq_schema(dataframe)  # <-- throws the error since it cannot infer the schema
  # If table_schema isn't provided, we'll create one for you
  if not table_schema:
      table_schema = default_schema
  # If table_schema is provided, we'll update the default_schema to the provided table_schema
  else:
      table_schema = pandas_gbq.schema.update_schema(
          default_schema, dict(fields=table_schema)
      )
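
One way to avoid the unnecessary inference, sketched under assumptions: infer types only for columns that the user-supplied schema does not cover. The helpers `infer_field` and `resolve_schema` below are hypothetical and not part of pandas-gbq, and the actual fix in the library may work differently. Columns are modelled as plain Python lists so the sketch runs without pandas or BigQuery.

```python
from typing import Any, Optional


def infer_field(name: str, values: list[Any]) -> dict:
    """Hypothetical stand-in for schema inference; fails on all-NULL columns."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        raise ValueError(f"cannot infer a type for column {name!r}")
    type_names = {int: "INTEGER", float: "FLOAT", str: "STRING", bool: "BOOLEAN"}
    # Fall back to STRING for Python types this toy mapping does not know.
    bq_type = type_names.get(type(non_null[0]), "STRING")
    return {"name": name, "type": bq_type, "mode": "NULLABLE"}


def resolve_schema(
    columns: dict[str, list[Any]],
    table_schema: Optional[list[dict]] = None,
) -> dict:
    """Prefer user-supplied fields; infer only the columns they leave out."""
    provided = {f["name"]: f for f in (table_schema or [])}
    fields = []
    for name, values in columns.items():
        if name in provided:
            fields.append(provided[name])  # explicit type given, skip inference
        else:
            fields.append(infer_field(name, values))
    return {"fields": fields}


# "Positions" holds only NULLs, but its type is given explicitly.
columns = {"Id": [123], "Positions": [None]}
user_schema = [
    {
        "name": "Positions",
        "type": "RECORD",
        "mode": "REPEATED",
        "fields": [{"name": "PositionState", "type": "STRING", "mode": "NULLABLE"}],
    },
]

resolved = resolve_schema(columns, user_schema)
print([f["type"] for f in resolved["fields"]])  # ['INTEGER', 'RECORD']
```

With this ordering, inference is never attempted for "Positions", so an all-NULL nested column only fails when no explicit type was given for it.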

Environment details

  • Python version: 3.12.7
  • pip version: 24.2
  • pandas-gbq version: 0.24.0
  • pandas version: 2.2.3
  • numpy version: 1.26.4

Steps to reproduce

  1. Run the script below with the versions listed above -> fails.
  2. Run the same script with pandas-gbq==0.23.0 and all other versions unchanged -> works.

Code example

import pandas_gbq
import pandas as pd
import numpy as np

DESTINATION_TABLE_ID = 'INSERT_YOUR_TABLE_HERE'

schema = [
    {'name': 'Id', 'type': 'INTEGER', 'mode': 'NULLABLE'},
    {
        'name': 'Positions',
        'type': 'RECORD',
        'mode': 'REPEATED',
        'fields': [
            {'name': 'PositionState', 'type': 'STRING', 'mode': 'NULLABLE'},
        ],
    },
]

works_df = pd.DataFrame([{
    'Id': 123,
    'Positions': None,
}])

error_df = pd.DataFrame([{
    'Id': 123,
    'Positions': np.array([{
        'PositionState': None,
    }]),
}])

# Works with warning
# pandas_gbq.to_gbq(works_df, destination_table=DESTINATION_TABLE_ID, table_schema=schema, if_exists='replace')

# Throws error
pandas_gbq.to_gbq(error_df, destination_table=DESTINATION_TABLE_ID, table_schema=schema, if_exists='replace')

Stack trace

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[20], line 2
      1 import pandas_gbq
----> 2 pandas_gbq.to_gbq(dummy_df, destination_table='DESTINATION_TABLE_ID', table_schema=schema)

File ~/anaconda3/envs/py-312/lib/python3.12/site-packages/pandas_gbq/gbq.py:1163, in to_gbq(dataframe, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials, api_method, verbose, private_key, auth_redirect_uri, client_id, client_secret, user_agent, rfc9110_delimiter)
   1160 dataset_id = destination_table_ref.dataset_id
   1161 table_id = destination_table_ref.table_id
-> 1163 default_schema = _generate_bq_schema(dataframe)
   1164 # If table_schema isn't provided, we'll create one for you
   1165 if not table_schema:

File ~/anaconda3/envs/py-312/lib/python3.12/site-packages/pandas_gbq/gbq.py:1249, in _generate_bq_schema(df, default_type)
   1246 fields_json = []
   1248 for field in fields:
-> 1249     fields_json.append(field.to_api_repr())
   1251 return {"fields": fields_json}

File ~/anaconda3/envs/py-312/lib/python3.12/site-packages/google/cloud/bigquery/schema.py:353, in SchemaField.to_api_repr(self)
    350 # If this is a RECORD type, then sub-fields are also included,
    351 # add this to the serialized representation.
    352 if self.field_type.upper() in _STRUCT_TYPES:
--> 353     answer["fields"] = [f.to_api_repr() for f in self.fields]
    355 # Done; return the serialized dictionary.
    356 return answer

AttributeError: 'NoneType' object has no attribute 'to_api_repr'
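
The last frame of the trace suggests that the inferred RECORD field ended up with a `None` entry among its sub-fields, presumably because no type could be inferred for `PositionState`. The failure mode can be reproduced without BigQuery using a small stand-in class. `Field` below is hypothetical: it only mimics the list comprehension shown in the trace and is not the real `SchemaField`.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """Hypothetical stand-in for a schema field with optional sub-fields."""

    name: str
    field_type: str
    fields: list = field(default_factory=list)

    def to_api_repr(self) -> dict:
        answer = {"name": self.name, "type": self.field_type}
        if self.field_type.upper() in {"RECORD", "STRUCT"}:
            # Same shape as the comprehension in the trace:
            # a None sub-field raises AttributeError here.
            answer["fields"] = [f.to_api_repr() for f in self.fields]
        return answer


# Assumption: a sub-field whose type could not be inferred shows up as None.
broken = Field("Positions", "RECORD", fields=[None])

message = ""
try:
    broken.to_api_repr()
except AttributeError as exc:
    message = str(exc)

print(message)
```

If this reading is right, the error comes from serializing the inferred schema, before the user-supplied `table_schema` is ever consulted.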
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery-pandas API. label Dec 12, 2024
@tswast tswast added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. labels Dec 12, 2024
@tswast tswast assigned tswast and unassigned whuffman36 Dec 12, 2024
@tswast
Collaborator

tswast commented Dec 12, 2024

Thanks for the report and the reproducible code sample. I'll look into this, since it's related to some changes I made recently.

@tswast
Collaborator

tswast commented Dec 19, 2024

I believe #838 should fix this.

@tswast
Collaborator

tswast commented Dec 19, 2024

I'm planning on merging #837 today to get this fix in 0.26.0

@berniwal
Author

Thank you for this quick fix! 🥇 Solved my issue!

3 participants