Continuous Cross-Tenant Azure SQL Database Synchronization Using Azure Databricks

Mallikarjun appani 6

Dear Microsoft Support Team,

We are working on a requirement to implement a continuous data synchronization pipeline between two Azure SQL Database environments hosted in different Microsoft Entra tenants using Azure Databricks.

Requirement Overview:

We need to establish a near real-time, continuous sync mechanism in which data changes from a source Azure SQL Database in one tenant are continuously replicated to a target Azure SQL Database in another tenant through Azure Databricks.
Could you please suggest the best way to meet this requirement?

Answer 1

Hi Mallikarjun appani,

Thank you for reaching out — this is a very relevant scenario, and we often see similar requirements when syncing data across Azure SQL Databases in different Entra tenants.

For a near real-time / continuous synchronization, the most suitable and scalable approach is to build a Change Data Capture (CDC)-based incremental pipeline using Azure Databricks.

Enable Change Tracking or CDC on the source database
Start by enabling Change Tracking (lightweight) or Change Data Capture (CDC) on the source tables.

Change Tracking helps identify which rows have changed
CDC captures full insert/update/delete history, which is better suited for streaming scenarios
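
As a rough illustration of this step, the sketch below generates the T-SQL you would run on the source database for either option. The database, schema, and table names (SalesDb, dbo, Orders) and the helper function names are hypothetical placeholders; the statements themselves use the standard sys.sp_cdc_enable_db, sys.sp_cdc_enable_table, and CHANGE_TRACKING syntax. Please validate retention and role settings against your own requirements.

```python
def enable_cdc_statements(schema: str, table: str) -> list[str]:
    """Return T-SQL that enables CDC on the database and on one table."""
    return [
        # Run once per database.
        "EXEC sys.sp_cdc_enable_db;",
        # Run once per tracked table; @role_name = NULL means no gating role.
        (
            "EXEC sys.sp_cdc_enable_table "
            f"@source_schema = N'{schema}', "
            f"@source_name = N'{table}', "
            "@role_name = NULL;"
        ),
    ]


def enable_change_tracking_statements(
    database: str, schema: str, table: str, retention_days: int = 2
) -> list[str]:
    """Return T-SQL that enables Change Tracking on the database and one table."""
    return [
        f"ALTER DATABASE [{database}] SET CHANGE_TRACKING = ON "
        f"(CHANGE_RETENTION = {retention_days} DAYS, AUTO_CLEANUP = ON);",
        f"ALTER TABLE [{schema}].[{table}] ENABLE CHANGE_TRACKING "
        "WITH (TRACK_COLUMNS_UPDATED = ON);",
    ]


# Hypothetical names, for illustration only.
for statement in enable_cdc_statements("dbo", "Orders"):
    print(statement)
for statement in enable_change_tracking_statements("SalesDb", "dbo", "Orders"):
    print(statement)
```

You would normally pick one of the two mechanisms per table rather than enabling both.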

Capture and stage changes in ADLS Gen2
Use an ingestion mechanism (for example, incremental extraction or CDC tools) to land an initial full snapshot and the ongoing change data in Azure Data Lake Storage Gen2. This provides a reliable staging layer for downstream processing.
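
If you go with Change Tracking and incremental extraction, the extraction query could look something like the sketch below. It assumes a single-column primary key and a stored "last synced version" watermark that your job persists between runs. The table and column names (dbo.Orders, OrderId) and the function names are hypothetical. CHANGETABLE(CHANGES ...) and the SYS_CHANGE_* columns are the standard Change Tracking constructs.

```python
def change_tracking_query(schema: str, table: str, key_column: str, last_version: int) -> str:
    """Build a query returning rows changed since last_version.

    The LEFT JOIN keeps deleted rows in the result: they have a
    SYS_CHANGE_OPERATION of 'D' and NULLs for the source columns.
    """
    return (
        "SELECT ct.SYS_CHANGE_OPERATION, ct.SYS_CHANGE_VERSION, "
        f"ct.[{key_column}] AS change_key, src.* "
        f"FROM CHANGETABLE(CHANGES [{schema}].[{table}], {int(last_version)}) AS ct "
        f"LEFT JOIN [{schema}].[{table}] AS src "
        f"ON src.[{key_column}] = ct.[{key_column}]"
    )


def next_watermark(change_rows: list[dict], last_version: int) -> int:
    """Return the highest change version seen, or the old watermark if no rows."""
    versions = [row["SYS_CHANGE_VERSION"] for row in change_rows]
    return max(versions, default=last_version)


# Hypothetical usage: 42 is the watermark saved by the previous run.
print(change_tracking_query("dbo", "Orders", "OrderId", 42))
```

The rows returned by this query would then be written to ADLS Gen2 (for example as Parquet files per batch), and the new watermark saved only after the write succeeds.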

Process changes using Azure Databricks
In Databricks, you can use:

Structured Streaming for continuous micro-batch processing
Or AUTO CDC pipelines to simplify change handling

These capabilities are designed to process data continuously and keep downstream systems in sync.

Apply incremental changes to the target Azure SQL Database
Use merge/upsert logic to apply:

Inserts
Updates
Deletes

This ensures the target database remains aligned with the source.
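
To make the merge/upsert step more concrete, here is a minimal sketch that builds a T-SQL MERGE statement applying a batch of staged changes to the target table. It assumes each batch has already been written to a staging table with one row per key and an operation column where 'D' marks a delete. The table names, column names, and the op column convention are all hypothetical.

```python
def build_merge_sql(
    target: str, staging: str, key: str, columns: list[str], op_column: str = "op"
) -> str:
    """Build a MERGE that applies inserts, updates, and deletes from staging."""
    non_key = [c for c in columns if c != key]
    set_clause = ", ".join(f"t.[{c}] = s.[{c}]" for c in non_key)
    insert_cols = ", ".join(f"[{c}]" for c in columns)
    insert_vals = ", ".join(f"s.[{c}]" for c in columns)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {staging} AS s\n"
        f"  ON t.[{key}] = s.[{key}]\n"
        # Existing row flagged as deleted in the change batch.
        f"WHEN MATCHED AND s.[{op_column}] = 'D' THEN DELETE\n"
        # Existing row with any other operation: overwrite non-key columns.
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        # New row, unless the change is a delete for a row we never had.
        f"WHEN NOT MATCHED BY TARGET AND s.[{op_column}] <> 'D' THEN\n"
        f"  INSERT ({insert_cols}) VALUES ({insert_vals});"
    )


# Hypothetical table and column names.
print(build_merge_sql("dbo.Orders", "stg.Orders_changes", "OrderId",
                      ["OrderId", "Status", "Amount"]))
```

Note that MERGE fails if several staging rows match the same target row, so the batch should be reduced to the latest change per key before it is staged.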

Configure secure cross-tenant connectivity
Since this is a cross-tenant setup, make sure to configure:

Microsoft Entra ID (Azure AD) service principal authentication
Appropriate networking (Private Endpoints or firewall rules)
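
For the service principal part, the sketch below shows the shape of the token request and JDBC URL for a database in the other tenant. It only builds the request and does not send it. It assumes a service principal registered in the target tenant with a client secret (in Databricks the secret would typically come from a secret scope rather than code). The tenant ID, client ID, server, and database names are hypothetical placeholders, and the function names are my own.

```python
from urllib.parse import urlencode


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> tuple[str, str]:
    """Return (url, form_body) for an OAuth 2.0 client-credentials token request.

    The tenant_id must be the tenant that owns the database being accessed.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Resource scope for Azure SQL Database.
        "scope": "https://database.windows.net/.default",
    }
    return url, urlencode(form)


def build_jdbc_url(server: str, database: str) -> str:
    """Return an encrypted JDBC URL for an Azure SQL Database logical server."""
    return (
        f"jdbc:sqlserver://{server}.database.windows.net:1433;"
        f"database={database};encrypt=true;trustServerCertificate=false;"
        "hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
    )


# Hypothetical values, for illustration only.
url, body = build_token_request("target-tenant-id", "app-client-id", "app-secret")
print(url)
print(build_jdbc_url("target-sql-server", "TargetDb"))
```

The access token returned by that endpoint would then be passed to the SQL connection instead of a username and password. With Private Endpoints the server name stays the same, but DNS in the Databricks virtual network needs to resolve it to the private IP.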

As an alternative, you could consider Azure SQL Data Sync, which allows synchronization between Azure SQL databases (including cross-tenant scenarios). However, it operates on scheduled intervals (about 5 minutes or more), offers limited flexibility for transformation scenarios, and is planned for retirement by September 30, 2027, so it is not recommended for new designs.

For your requirement (cross-tenant + near real-time + continuous sync), I recommend:

Using Change Tracking / CDC for incremental capture
Processing with Azure Databricks (Structured Streaming or AUTO CDC)
Applying merge/upsert logic to the target Azure SQL Database

Since this involves architecture decisions (cross-tenant networking, security, and design patterns), I would also recommend engaging your Solution Architect (SA) to review and validate the end-to-end architecture based on your organization’s setup and constraints.

References:
Query streaming data in Azure Databricks
Replicate external database using AUTO CDC in Databricks
What is SQL Data Sync
SQL Data Sync retirement guidance

Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.

Mallikarjun appani 6 Reputation points

2026-05-19T04:00:44.7466667+00:00

Thank you for the detailed guidance and recommendations.

We would also like to clarify a few additional points regarding the proposed architecture.

Could you please confirm whether the same approach would work in the following scenarios as well:

When both source and target Azure SQL Databases are accessed through Private Endpoints only

When Microsoft-managed keys (MMK) are enabled

When Customer-managed keys (CMK) / TDE with Key Vault integration are enabled on the Azure SQL Databases

Specifically, we would like to understand if there are any additional considerations or limitations related to:

Cross-tenant connectivity using Private Endpoints

Databricks access and authentication across tenants

CDC / Change Tracking functionality with CMK-enabled databases

Networking or firewall requirements in such setups

Please let us know if there are any recommended best practices or architectural constraints for these scenarios.

Thank you for your support.

Answer 2

A suitable approach is to treat this as a cross-tenant, near real-time replication problem and use change data capture (CDC) into cloud storage plus Azure Databricks ingestion, then write to the target Azure SQL Database.

A high-level architecture that aligns with the documented patterns is:

Enable change capture on the source Azure SQL Database.
- Use a log-based CDC mechanism (for example, Debezium with Kafka, or another log-based CDC tool) to continuously capture inserts, updates, and deletes from the source Azure SQL Database.
- Land both an initial full snapshot and the ongoing change feed into a cloud storage account (for example, Azure Data Lake Storage) that Azure Databricks can access.
Use Azure Databricks AUTO CDC pipelines to ingest from storage.
- Follow the pattern described for replicating an external RDBMS table using AUTO CDC:
  - One-time full copy (once flow) from the snapshot dataset to initialize the target state.
  - Continuous change flow that reads the CDC feed from storage and applies changes.
- This pattern is explicitly recommended for “keeping a target table in sync with an external system of record” and for slowly changing dimensions.
Write from Azure Databricks into the target Azure SQL Database in the other tenant.
- Configure cross-tenant access so that Databricks can authenticate to the target Azure SQL Database (for example, using a service principal in the target tenant and appropriate firewall/network rules).
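- AUTO CDC flows write to a streaming table in Databricks rather than directly to an external database, so apply the processed incremental changes (upserts/deletes) to the target Azure SQL Database table in a separate write step, for example a JDBC write to a staging table followed by a MERGE.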
Choose the appropriate sync mode for Databricks pipelines over the lakehouse layer (if using Lakehouse / Lakebase Provisioned synced tables):
- Use Continuous sync mode when changes must appear in near real time. The pipeline runs once for the initial snapshot and then continuously applies incremental changes.
- Ensure the source Delta tables in the lake (fed by AUTO CDC) have Change Data Feed enabled if using Triggered or Continuous modes.
Handle cross-tenant specifics.
- Cross-tenant synchronization is supported at the application level (Databricks + service principals) even though some PaaS sync services (like SQL Data Sync) have limitations across tenants or clouds.
- Use separate identities and connections for source and target tenants, and manage secrets/credentials in a secure store (for example, Azure Key Vault) that Databricks can access.

This design gives:

Near real-time replication (driven by the CDC feed and Continuous Databricks pipelines).
Decoupling between tenants via storage and Databricks.
Flexibility to extend the pipeline for transformations or additional targets.
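
To illustrate the snapshot-plus-change-feed pattern described in this answer, here is a small plain-Python model of the semantics: load the snapshot once, then apply changes in sequence order, treating deletes as removals and everything else as an upsert. This is not the Databricks AUTO CDC API, only a sketch of the behavior it automates. The field names (id, seq, op) and the 'D' delete marker are hypothetical conventions.

```python
def apply_changes(snapshot, changes, key="id", seq="seq", op="op"):
    """Return the target state after applying a change feed to a snapshot.

    snapshot: list of row dicts (the one-time full copy).
    changes:  list of change dicts carrying a sequence number and an operation.
    """
    # One-time full copy initializes the target state, keyed by primary key.
    state = {row[key]: dict(row) for row in snapshot}

    # Apply changes in sequence order so late-arriving events do not
    # overwrite newer values.
    for change in sorted(changes, key=lambda c: c[seq]):
        if change[op] == "D":
            state.pop(change[key], None)
        else:
            # Insert or update: keep data columns, drop CDC metadata.
            state[change[key]] = {
                k: v for k, v in change.items() if k not in (seq, op)
            }
    return state


snapshot = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
changes = [
    {"id": 2, "seq": 3, "op": "D"},
    {"id": 1, "seq": 2, "op": "U", "name": "a2"},
    {"id": 3, "seq": 1, "op": "I", "name": "c"},
]
print(apply_changes(snapshot, changes))
# {1: {'id': 1, 'name': 'a2'}, 3: {'id': 3, 'name': 'c'}}
```

In a real pipeline the sequence column would come from the CDC tool (for example a log sequence number or commit timestamp), and it needs to be strictly ordered per key for the result to be deterministic.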
