Error with MySQL variable initialization in Azure ADF pipeline

Jaafar Saleh 0 Reputation points
2024-11-22T13:53:28.75+00:00

Hello,

I am trying to create a counter for a table using variables in mysql query in a pipeline in Azure data factory, which looks like this

SET @counter = 0;
SELECT @counter = @counter + 1 from XXX;

I am getting an error that says

You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '? = 0' at line 1

The variable is treated as a parameter placeholder and it is not correct. I've recently updated the linked services for MySQL connection, and this issue appears only after the update. Note that the query works completely fine in MySQL. I tried the ROW_NUMBER() approach but it is not working when the data is too large

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,922 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Vinodh247 24,091 Reputation points MVP
    2024-11-23T08:39:42.16+00:00

    The error you're encountering suggests that Azure Data Factory (ADF) is treating the variable @counter as a parameter placeholder (?) in the SQL query. This issue arises because of how ADF handles parameters in linked services for databases, and the new MySQL connector you updated to might have stricter handling of variable syntax.

    1. Instead of directly writing the query in the pipeline, encapsulate the logic in a stored procedure in MySQL. Call the stored procedure in the ADF pipeline using a query like: CALL IncrementCounter();
    2. Rewrite the query to initialize and use the variable inline. This might work better with the ADF's parameter parsing logic. However, if this still causes issues due to ADF's query parsing behavior, proceed with alternative methods.

      SELECT (@counter := @counter + 1) AS row_number, col1, col2 FROM your_table, (SELECT @counter := 0) AS init;

    3. If your MySQL version supports window functions, ROW_NUMBER() is a better alternative for large datasets:

      SELECT ROW_NUMBER() OVER (ORDER BY some_column) AS row_number, col1, col2 FROM your_table;

      1. Check and Update Connector/Driver Settings
      2. Workaround with ADF Expressions
      If none of the above work and the counter is essential in the pipeline logic:
      • Fetch the data from MySQL without the counter.
      • Use an ADF Data Flow or Script Activity to add a counter column in post-processing
    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.