I've run into issues previewing Parquet data with a pipeline Copy Activity when the data was compressed with the zstd codec. Are there plans for ADF to add support for this compression method, which is now the default in Databricks?
Example error:
An error occurred when invoking java, message: java.lang.UnsatisfiedLinkError:D:\Users\_azbatchtask_2\AppData\Local\Temp\libzstd-jni-1.5.5-51665259375217484100.dll: Your organization used Device Guard to block this app. Contact your support person for more info
no zstd-jni-1.5.5-5 in java.library.path
Unsupported OS/arch, cannot find /win/amd64/libzstd-jni-1.5.5-5.dll or load zstd-jni-1.5.5-5 from system libraries. Please try building from source the jar or providing libzstd-jni-1.5.5-5 in your system.
total entry:30
java.lang.ClassLoader.loadLibrary(ClassLoader.java:1864)
java.lang.Runtime.loadLibrary0(Runtime.java:870)
java.lang.System.loadLibrary(System.java:1122)
com.github.luben.zstd.util.Native$1.run(Native.java:69)
com.github.luben.zstd.util.Native$1.run(Native.java:67)
java.security.AccessController.doPrivileged(Native Method)
com.github.luben.zstd.util.Native.loadLibrary(Native.java:67)
com.github.luben.zstd.util.Native.load(Native.java:154)
com.github.luben.zstd.util.Native.load(Native.java:85)
com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18)
com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18)
org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:90)
org.apache.parquet.hadoop.codec.ZstandardCodec.createInputStream(ZstandardCodec.java:83)
org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.decompress(CodecFactory.java:112)
org.apache.parquet.hadoop.ColumnChunkPageReadStore$ColumnChunkPageReader.readDictionaryPage(ColumnChunkPageReadStore.java:236)
org.apache.parquet.column.impl.ColumnReaderBase.<init>(ColumnReaderBase.java:410)
org.apache.parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:46)
org.apache.parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:82)
org.apache.parquet.io.RecordReaderImplementation.<init>(RecordReaderImplementation.java:271)
org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:147)
org.apache.parquet.io.MessageColumnIO$1.visit(MessageColumnIO.java:109)
org.apache.parquet.filter2.compat.FilterCompat$NoOpFilter.accept(FilterCompat.java:177)
org.apache.parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:109)
org.apache.parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:141)
org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:230)
org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:132)
org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136)
com.microsoft.datatransfer.bridge.parquet.ParquetBatchReaderBridge.<init>(ParquetBatchReaderBridge.java:70)
com.microsoft.datatransfer.bridge.parquet.ParquetBatchReaderBridge.open(ParquetBatchReaderBridge.java:64)
com.microsoft.datatransfer.bridge.parquet.ParquetFileBridge.createReader(ParquetFileBridge.java:22)
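From the trace, the failure is zstd-jni being unable to load its native DLL on the integration runtime host (Device Guard blocks it), so the Parquet reader can't decompress the column chunks. Until zstd is supported end to end, a possible workaround is to re-write the affected data with a codec the Copy Activity already reads, such as snappy. A minimal PySpark sketch; the paths here are placeholders, not anything from the error above:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Spark bundles zstd support, so it can read the files ADF cannot.
df = spark.read.parquet("/mnt/source/zstd_data")  # placeholder path

# Re-write with snappy so the ADF Copy Activity can consume the output.
df.write.option("compression", "snappy").mode("overwrite").parquet(
    "/mnt/staging/snappy_data"  # placeholder path
)

# Or change the session default so all subsequent Parquet writes use snappy:
spark.conf.set("spark.sql.parquet.compression.codec", "snappy")
```

This doubles the storage and write cost for the affected datasets, so it's a stopgap rather than a fix; native zstd support in ADF would still be the right answer.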