Can't write to blob storage from AzureML Spark Cluster

Dance, Cody R. (ALT) 35 Reputation points
2023-05-10T18:32:20.2533333+00:00

Using the Azure ML Spark compute (serverless or attached), it is not possible to write to Azure Data Lake Storage Gen2 blob storage.

The code below produces the error 'Caused by: org.apache.hadoop.fs.azure.AzureException: com.microsoft.azure.storage.StorageException: This operation is not permitted on a non-empty directory.'

This greatly reduces the usefulness of the spark integration. Any help would be appreciated.

df = ...  # a pyspark.pandas or pyspark.sql DataFrame
df.to_parquet('azureml://[blah]')  # or df.write.parquet('azureml://[blah]')
