Synapse link and security

MrFlinstone 706 Reputation points
2025-09-12T18:04:29.38+00:00

Hi All.

I am looking for a white paper, or any similar guidance, on securing access to Dataverse data that is exposed through Synapse Link, the data lake, and serverless SQL pools in Azure Synapse (formerly SQL DW).

From my investigation, there are two concerns. First, once Synapse Link copies the data to a data lake, users can read it directly from the lake and even export it, and some of that information could be sensitive. Second, when a linked service is created from a Synapse workspace, it gives access to the entire dataset, because access goes through the system principal, which is required to have the Storage Blob Data Reader role assignment. This means end users could reach restricted data via queries against the lake database.

Dynamic data masking would have been great here, but it is not possible with serverless SQL pools. What are the options with a serverless SQL pool, and the options in general, to lock down a Dynamics Synapse Link implementation?

Azure Synapse Analytics

1 answer

  1. Vinodh247 40,031 Reputation points MVP Volunteer Moderator
    2025-09-13T07:03:59.23+00:00

    When you push Dataverse (Dynamics) data out to a lake via Synapse Link, you create a new copy of the data that must be protected independently. Below I list the realities, the constraints (what is and is not supported), and concrete mitigations and architecture patterns you can apply. The focus is on serverless SQL pools, but general options are covered as well.

    1. Synapse Link writes Dataverse data into ADLS Gen2 (the lake) as files. Once files exist in the lake, any identity that can read those files can export or copy them. You must treat the lake as the primary protection boundary.
    2. The Synapse workspace uses a workspace identity / MI (system principal) to read files and run serverless queries. By default that identity is commonly granted Storage Blob Data Reader on the container. If that role is overbroad, downstream users can gain access to data via lake database queries. Restrict the scope of the identity.
    3. Dynamic data masking (DDM) and some other built-in SQL features are not supported on serverless SQL pools (DDM is supported on dedicated pools/Fabric SQL). Do not rely on serverless to provide DDM.
    4. Row-level security and column-level security have limited support for serverless external tables. Serverless can use views to implement some column restrictions, but native RLS/DDM capabilities are fuller in dedicated SQL pools. Confirm your requirements before choosing serverless.
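
To illustrate point 4, here is a minimal Python sketch that only builds the T-SQL text for a column-restricted view and a GRANT on it. The database, table, view, column, and role names are hypothetical, and it assumes the lake database table can be referenced by a three-part name from the database that holds the view. It does not replace storage-level controls: any identity that can read the underlying files can bypass the view.

```python
from typing import List, Sequence


def quote_ident(name: str) -> str:
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def restricted_view_sql(
    source_table: Sequence[str],
    view_name: Sequence[str],
    allowed_columns: Sequence[str],
    grantee: str,
) -> List[str]:
    """Return CREATE VIEW and GRANT statements exposing only allowed columns.

    source_table and view_name are multi-part names, e.g.
    ("lake_db", "dbo", "contact") and ("dbo", "contact_restricted").
    All names here are illustrative assumptions, not taken from the thread.
    """
    if not allowed_columns:
        raise ValueError("allowed_columns must not be empty")
    cols = ", ".join(quote_ident(c) for c in allowed_columns)
    source = ".".join(quote_ident(p) for p in source_table)
    view = ".".join(quote_ident(p) for p in view_name)
    create = f"CREATE VIEW {view} AS SELECT {cols} FROM {source};"
    grant = f"GRANT SELECT ON OBJECT::{view} TO {quote_ident(grantee)};"
    return [create, grant]


if __name__ == "__main__":
    for stmt in restricted_view_sql(
        ("lake_db", "dbo", "contact"),      # hypothetical lake database table
        ("dbo", "contact_restricted"),      # hypothetical view name
        ["contactid", "fullname"],          # columns considered non-sensitive
        "analysts",                         # hypothetical database role
    ):
        print(stmt)
```

The allow-list approach means a newly added sensitive column stays hidden until someone explicitly adds it to the view.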

    Note:

    1. If you require truly enforced masking, row-level enforcement, and enterprise policy that cannot be bypassed, do not rely on serverless SQL pools alone. Use a hardened, curated endpoint (dedicated SQL/Fabric SQL) as the enforcement point. Serverless is great for exploration and low-cost queries, but it is not a drop-in replacement for a secured RLS/DDM-capable engine.
    2. The single biggest operational mistake I see is granting the workspace identity wide storage rights. Lock that down and your risk drops dramatically.
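
As a rough sketch of note 2, the Python below builds a container-level scope string and the matching `az role assignment create` arguments, so the identity gets Storage Blob Data Reader on one container rather than on the whole storage account. The subscription, resource group, account, container, and principal ID values are placeholders, and the helper names are hypothetical.

```python
from typing import List

CONTAINER_MARKER = "/blobServices/default/containers/"


def container_scope(subscription_id: str, resource_group: str,
                    account: str, container: str) -> str:
    """Build the Azure resource ID for a single blob container."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
        f"{CONTAINER_MARKER}{container}"
    )


def is_container_scoped(scope: str) -> bool:
    """True if the scope targets one container, not an account or wider."""
    return CONTAINER_MARKER in scope


def role_assignment_args(principal_id: str, scope: str,
                         role: str = "Storage Blob Data Reader") -> List[str]:
    """Return the Azure CLI argument list; refuse scopes wider than a container."""
    if not is_container_scoped(scope):
        raise ValueError("scope is wider than a single container")
    return ["az", "role", "assignment", "create",
            "--assignee", principal_id,
            "--role", role,
            "--scope", scope]


if __name__ == "__main__":
    # Placeholder values only; substitute your own IDs and names.
    scope = container_scope("<subscription-id>", "<resource-group>",
                            "<storage-account>", "<container>")
    print(" ".join(role_assignment_args("<workspace-mi-object-id>", scope)))
```

A role assignment at container scope still covers every file in that container, so restrictions on individual folders would need ADLS Gen2 ACLs, which this sketch does not cover.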


