Move spinner init in load_data_to_hdfs.py to avoid error if /data is not populated
adisve committed Apr 10, 2024
1 parent 48c8b66 commit 7969010
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions scripts/spark/setup/load_data_to_hdfs.py
@@ -70,13 +70,14 @@ def get_schema(self):
         return self.parse_schema_file()

     def transfer_data(self):
-        spinner = Halo(text=f"Reading and writing data from /data/output.csv to {self.hdfs_path}")
-        spinner.start()
         try:
             self.start_spark_session()
             schema = self.get_schema()

             logging.info(f"Reading and writing data from /data/output.csv to {self.hdfs_path}")
+            spinner = Halo(text=f"Reading and writing data from /data/output.csv to {self.hdfs_path}")
+            spinner.start()

             df = (self.spark.read.option("header", "true")
                   .option("mode", "DROPMALFORMED")
                   .option("overwrite", "true")
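The ordering this commit moves toward can be sketched in isolation: run the setup steps first and start the spinner only around the long-running copy, so a setup failure never leaves a spinner running. This is a minimal sketch, not the script's actual code. `RecordingSpinner`, `transfer`, and `failing_setup` are hypothetical names introduced for illustration; the stand-in class only mimics the `start`, `succeed`, and `fail` methods of `Halo`, and raising `FileNotFoundError` is an assumed failure mode for an unpopulated /data, since the commit does not show the actual error.

```python
class RecordingSpinner:
    """Hypothetical stand-in for halo.Halo: records calls instead of drawing."""

    def __init__(self, text):
        self.text = text
        self.events = []

    def start(self):
        self.events.append("start")

    def succeed(self, text=None):
        self.events.append("succeed")

    def fail(self, text=None):
        self.events.append("fail")


def transfer(setup, copy, spinner_factory=RecordingSpinner):
    """Run setup first, then start the spinner only around the copy step."""
    spinner = None
    try:
        setup()  # e.g. start the Spark session and parse the schema
        spinner = spinner_factory("Reading and writing data")
        spinner.start()
        copy()
        spinner.succeed("done")
    except Exception as exc:
        # Only touch the spinner if it was actually created.
        if spinner is not None:
            spinner.fail(str(exc))
        return spinner, exc
    return spinner, None


def failing_setup():
    # Assumed failure mode: the input file is missing.
    raise FileNotFoundError("/data/output.csv")


spinner, error = transfer(failing_setup, lambda: None)
ok_spinner, ok_error = transfer(lambda: None, lambda: None)
```

When setup fails, `spinner` stays `None` and the error is returned untouched. When both steps succeed, the spinner is started once and then marked as succeeded.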
