Hello Will!
Thank you for posting on Microsoft Learn.
You need to use spark.sql() together with spark.catalog.setCurrentDatabase():
lake_db = "testdb"
parquet_path = "abfss://<container>@<storageaccount>.dfs.core.windows.net/data/firsttable.parquet"
# Point the Spark session at the target Lake database
spark.catalog.setCurrentDatabase(lake_db)
# Register the existing Parquet files as a table in that database
spark.sql(f"""
CREATE TABLE IF NOT EXISTS firsttable
USING PARQUET
LOCATION '{parquet_path}'
""")
Lake Databases are Spark-managed; you must use the Spark engine, not T-SQL.
Calling spark.catalog.setCurrentDatabase() first makes sure that the table is created in the desired Lake database.
The CREATE EXTERNAL TABLE ... syntax you used is from the T-SQL side and won’t update the Spark-based Lake database catalog, which is why you don’t see the table in the workspace.
The table you create this way will appear in the Synapse Studio > Data > Lake databases > testdb section.
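If you would like to confirm the registration from the notebook before checking Synapse Studio, something like the following could work. This is only a sketch: table_is_registered is a hypothetical helper name, and it assumes the notebook's built-in spark session.

```python
def table_is_registered(spark, database, table):
    """Return True if `table` is listed under `database` in the Spark catalog.

    `spark` is assumed to be the active SparkSession (the built-in `spark`
    object in a Synapse notebook). The helper name is hypothetical.
    """
    # Catalog.listTables(dbName) returns Table entries that expose `.name`.
    names = [t.name.lower() for t in spark.catalog.listTables(database)]
    return table.lower() in names
```

In the notebook you would call it as table_is_registered(spark, "testdb", "firsttable"). The comparison ignores case because the catalog may not return the name in the same case you typed it.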
If you want additional benefits like ACID transactions, consider writing your table as a Delta table instead of plain Parquet:
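delta_path = "abfss://.../data/deltatable"
# Read the existing Parquet data and rewrite it in Delta format
df = spark.read.parquet(parquet_path)
df.write.format("delta").save(delta_path)
# Register the Delta files as a table in the current Lake database
spark.sql(f"""
CREATE TABLE deltatable
USING DELTA
LOCATION '{delta_path}'
""")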
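The Parquet and Delta statements differ only in the format and the location, so if you register several tables you could build the statement with a small helper. This is a rough sketch rather than a required pattern: create_table_sql is a hypothetical name, and the check on the format is my own assumption about what you want to allow.

```python
def create_table_sql(table, fmt, location):
    """Build a CREATE TABLE statement that registers existing files at `location`.

    Hypothetical helper: only the two formats discussed here are accepted.
    """
    fmt = fmt.upper()
    if fmt not in ("PARQUET", "DELTA"):
        raise ValueError(f"unsupported format: {fmt}")
    # The location is embedded in a single-quoted SQL literal, so reject quotes.
    if "'" in location:
        raise ValueError("location must not contain a single quote")
    return (
        f"CREATE TABLE IF NOT EXISTS {table}\n"
        f"USING {fmt}\n"
        f"LOCATION '{location}'"
    )
```

After spark.catalog.setCurrentDatabase(lake_db), you would run it as spark.sql(create_table_sql("firsttable", "parquet", parquet_path)).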