Can't write to blob storage from AzureML Spark Cluster
Dance, Cody R. (ALT)
Using Azure ML Spark compute (serverless or attached), it is not possible to write to Azure Data Lake Storage Gen2 blob storage.
The code below produces the error: 'Caused by: org.apache.hadoop.fs.azure.AzureException: com.microsoft.azure.storage.StorageException: This operation is not permitted on a non-empty directory.'
This greatly reduces the usefulness of the Spark integration. Any help would be appreciated.
df = ...  # a pyspark.pandas or pyspark.sql DataFrame
df.to_parquet('azureml://[blah]')  # or df.write.parquet('azureml://[blah]')
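For reference, a fuller sketch of the failing write is below. The azureml:// URI is a hypothetical placeholder (the real path is omitted above), and the workspace's default datastore name is assumed; the spark session is the one provided by the AzureML Spark compute.

import pyspark.pandas as ps

# Hypothetical output URI on the workspace's default ADLS Gen2 datastore (placeholder, not the real path)
output_uri = 'azureml://datastores/workspaceblobstore/paths/output/example.parquet'

# pandas-on-Spark write -- raises the AzureException quoted above
pdf = ps.DataFrame({'a': [1, 2, 3]})
pdf.to_parquet(output_uri)

# equivalent pyspark.sql write -- fails the same way
sdf = pdf.to_spark()
sdf.write.mode('overwrite').parquet(output_uri)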