Hey Li, I ran the same experiment you did and here’s what I found:
- The “UTF-8 without BOM” option on a Delimited Text dataset is honored only by the Copy Activity; Mapping Data Flows currently don’t honor that explicit setting.
- Under the covers, Data Flows support only the Default (UTF-8) encoding (which behaves as BOM-less UTF-8) and the explicit “UTF-8 with BOM” choice. If you switch the dataset to “UTF-8 without BOM”, the data flow engine rejects it with the error you saw.
- In practice, leaving your dataset encoding on Default (UTF-8) lets the data flow read both BOM-less and BOM-prefixed files without complaint.
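As a local sanity check (this uses Python’s `utf-8-sig` codec, not the data flow engine itself, so it’s only an analogy), you can see how a BOM-tolerant UTF-8 reader treats both file variants identically:

```python
# The same CSV payload with and without the 3-byte UTF-8 BOM (EF BB BF)
with_bom = b"\xef\xbb\xbfid,name\n1,Li\n"
without_bom = b"id,name\n1,Li\n"

# 'utf-8-sig' silently strips a leading BOM if present, so both payloads
# decode to the same text -- analogous to how a Default (UTF-8) dataset
# reads both BOM-less and BOM-prefixed files in a data flow source.
assert with_bom.decode("utf-8-sig") == without_bom.decode("utf-8-sig")
```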
Workarounds:
- Keep your Data Flow source dataset set to Default (UTF-8).
- If you really need to strip or add a BOM, do a small Copy Activity or Data Flow Derived Column that removes or prepends the three BOM bytes (EF BB BF) before your main business flow.
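If you go the pre-processing route, the byte-level operation itself is trivial. Here is a minimal Python sketch (the helper names are my own, not an ADF API) of stripping or prepending the UTF-8 BOM:

```python
UTF8_BOM = b"\xef\xbb\xbf"  # the 3-byte UTF-8 byte order mark

def strip_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM if present; otherwise return data unchanged."""
    return data[len(UTF8_BOM):] if data.startswith(UTF8_BOM) else data

def add_bom(data: bytes) -> bytes:
    """Prepend a UTF-8 BOM unless one is already there."""
    return data if data.startswith(UTF8_BOM) else UTF8_BOM + data

assert strip_bom(b"\xef\xbb\xbfa,b\n1,2\n") == b"a,b\n1,2\n"
assert add_bom(b"a,b\n1,2\n") == b"\xef\xbb\xbfa,b\n1,2\n"
```

Both helpers are idempotent, so it’s safe to run them over files whose BOM state you don’t know in advance.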