Spark-las is a Spark reader for LAS and LAZ point cloud files.
Read a LAZ file:

```scala
val las_file = spark.read.format("LAS.LAS").load("las_file.laz")
```
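The result is an ordinary Spark DataFrame, so the usual API applies, e.g. to inspect the loaded points:

```scala
las_file.printSchema() // list the point attribute columns
las_file.show(5)       // preview the first five points
```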
Convert a folder of LAZ files into a Parquet file:

```scala
val las_file = spark.read.format("LAS.LAS").load("*.laz")
las_file.write.parquet("point_cloud.parquet")
```
Main features:

- Read LAS files:
  - LAS versions 1.0 to 1.4
  - Point formats 0 to 10
  - Files compressed with laszip (LAZ)
- Predicate pushdown for spatial queries and return number (see the sketch after this list)
- COPC optimisations
- Write LAS and LAZ files
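As a sketch of the predicate pushdown feature, a query like the one below can be answered without decoding the non-matching points. The column names (`X`, `Y`, `ReturnNumber`) are assumptions about the schema exposed by the reader, and the bounding box is arbitrary:

```scala
import org.apache.spark.sql.functions.col

// Hypothetical spatial + return-number query; with pushdown, the reader can
// skip points (or, with COPC, whole octree nodes) outside the bounding box.
val firstReturns = las_file
  .filter(col("X").between(500000, 501000))
  .filter(col("Y").between(6700000, 6701000))
  .filter(col("ReturnNumber") === 1)
```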
Spark-las is published to the GitHub Package Registry; the installation instructions below are based on the data-validator README.md.
- Create a GitHub personal access token with the `read:packages` scope.
- Create an Ivy configuration file:
  ```xml
  <ivysettings>
    <settings defaultResolver="thechain">
      <credentials
        host="maven.pkg.github.com"
        username="YOUR USER NAME"
        passwd="YOUR GITHUB TOKEN"
        realm="GitHub Package Registry"/>
    </settings>
    <resolvers>
      <chain name="thechain">
        <ibiblio name="central" m2compatible="true"
          root="https://repo1.maven.org/maven2/"/>
        <ibiblio name="ghp-dv" m2compatible="true"
          root="https://maven.pkg.github.com/mbunel/spark-las"/>
      </chain>
    </resolvers>
  </ivysettings>
  ```
- Point Spark to this file at startup with the following option:

  ```sh
  --conf spark.jars.ivySettings=$(pwd)/my_ivy.settings
  ```
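  A complete launch could then look like this; the `--packages` coordinates are a hypothetical placeholder, so check the package page for the exact group, artifact name, and version:

  ```sh
  # The coordinates below are an assumption, not the published ones
  spark-shell \
    --conf spark.jars.ivySettings=$(pwd)/my_ivy.settings \
    --packages com.github.mbunel:spark-las_2.12:0.1.0
  ```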
- If necessary, set a proxy:

  ```sh
  --driver-java-options '-Dhttp.proxyHost=proxy.ign.fr -Dhttp.proxyPort=3128 -Dhttps.proxyHost=proxy.ign.fr -Dhttps.proxyPort=3128'
  ```
- Alternatively, download the jar file directly: https://github.com/MBunel/spark-las/packages/2182443
- Use the `--jars` option at Spark startup (example below).
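  The jar file name and path here are illustrative:

  ```sh
  spark-shell --jars /path/to/spark-las.jar
  ```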