1. Get the memory size of a Pandas DataFrame
```python
df.memory_usage(deep=True).sum()
```
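If you also want a per-column breakdown, the same method works without `.sum()`; a minimal sketch with made-up data:

```python
import pandas as pd

# Hypothetical DataFrame, just for illustration.
df = pd.DataFrame({"city": ["Tokyo", "Paris"], "population": [13_960_000, 2_160_000]})

# deep=True also counts the Python objects behind object-dtype columns (e.g. strings).
print(df.memory_usage(deep=True))        # bytes per column, including the index
print(df.memory_usage(deep=True).sum())  # total bytes
```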
2. Upload a large Pandas DataFrame to a BigQuery table
If your DataFrame is too big, the upload will fail with a "UDF out of memory" error:
```
google.api_core.exceptions.BadRequest: 400 Resources exceeded during query execution: UDF out of memory.; Failed to read Parquet file [...]. This might happen if the file contains a row that is too large, or if the total size of the pages loaded for the queried columns is too large.
```
The solution is as simple as splitting the DataFrame into chunks and uploading them one by one:
```python
import numpy as np
from google.cloud import bigquery

client = bigquery.Client()

# table_id is the destination table, e.g. "project.dataset.table"
for df_chunk in np.array_split(df, 10):
    job_config = bigquery.LoadJobConfig()
    job_config.write_disposition = bigquery.WriteDisposition.WRITE_APPEND
    job = client.load_table_from_dataframe(df_chunk, table_id, job_config=job_config)
    job.result()  # wait for each chunk to finish loading
```
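The hard-coded 10 chunks is arbitrary; one option is to reuse the memory measurement from tip 1 to size the chunks. This is only a rough sketch, and the 100 MB target per chunk is an assumption, not a documented BigQuery limit:

```python
import numpy as np

# Assumption: ~100 MB of in-memory data per chunk is small enough to load safely.
TARGET_CHUNK_BYTES = 100 * 1024 * 1024

def chunk_count(df, target_bytes=TARGET_CHUNK_BYTES):
    total_bytes = df.memory_usage(deep=True).sum()
    return max(1, int(np.ceil(total_bytes / target_bytes)))

# Usage: for df_chunk in np.array_split(df, chunk_count(df)): ...
```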
3. Restore a deleted table in BigQuery
How do you recover a deleted table in BigQuery? Just use the `bq cp` command with a snapshot timestamp (milliseconds since the Unix epoch):

```bash
bq cp dataset.table@1577833205000 dataset.new_table
```

If your `<timestamp>` is not valid, the `bq` command will tell you which `<timestamp>` is valid for this table. Then you can run the command again with that correct `<timestamp>`.
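If you need to produce the millisecond timestamp yourself, a quick sketch in Python (the date here is hypothetical; pick a moment when the table still existed):

```python
from datetime import datetime, timezone

# A moment when the table still existed (hypothetical date).
snapshot_time = datetime(2020, 1, 1, 0, 0, 5, tzinfo=timezone.utc)

# BigQuery snapshot decorators expect milliseconds since the Unix epoch.
timestamp_ms = int(snapshot_time.timestamp() * 1000)
print(timestamp_ms)  # 1577836805000 -> bq cp dataset.table@1577836805000 dataset.new_table
```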