Public Holidays

Worldwide public holiday data sourced from the PyPI holidays package and Wikipedia, covering 38 countries or regions from 1970 to 2099.

Each row describes a holiday for a specific date and country or region, and indicates whether most people have paid time off on that date.

Note

Microsoft provides Azure Open Datasets on an “as is” basis. Microsoft makes no warranties, express or implied, guarantees or conditions with respect to your use of the datasets. To the extent permitted under your local law, Microsoft disclaims all liability for any damages or losses, including direct, consequential, special, indirect, incidental or punitive, resulting from your use of the datasets.

This dataset is provided under the original terms under which Microsoft received the source data. The dataset may include data sourced from Microsoft.

Volume and retention

This dataset is stored in Parquet format. It's a snapshot with holiday information from January 1, 1970 to January 1, 2099. The data size is about 500 KB.
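Because the snapshot covers a fixed date range, it can be useful to clamp a requested query window to that range before asking for data. The following is a minimal sketch using only the Python standard library. The names `COVERAGE_START`, `COVERAGE_END`, and `clamp_to_coverage` are hypothetical helpers, not part of any Azure package, and treating both bounds as inclusive is an assumption.

```python
from datetime import datetime
from typing import Optional, Tuple

# Coverage bounds as stated in this article (assumed to be inclusive).
COVERAGE_START = datetime(1970, 1, 1)
COVERAGE_END = datetime(2099, 1, 1)


def clamp_to_coverage(
    start: datetime, end: datetime
) -> Optional[Tuple[datetime, datetime]]:
    """Return the part of [start, end] that lies inside the coverage range.

    Returns None when the requested window does not overlap the range.
    """
    if start > end:
        raise ValueError("start must not be after end")
    clamped_start = max(start, COVERAGE_START)
    clamped_end = min(end, COVERAGE_END)
    if clamped_start > clamped_end:
        return None
    return clamped_start, clamped_end
```

A window that starts before 1970 is trimmed to begin on January 1, 1970, and a window that lies entirely after January 1, 2099 yields `None`.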

Storage location

This dataset is stored in the East US Azure region. We recommend locating compute resources in East US for affinity.

Additional information

This dataset combines data sourced from Wikipedia (Wikimedia Foundation, Inc.) and the PyPI holidays package.

The combined dataset is provided under the Creative Commons Attribution-ShareAlike 3.0 Unported License.

Email [email protected] if you have any questions about the data source.

Columns

| Name | Data type | Unique | Values (sample) | Description |
|---|---|---|---|---|
| countryOrRegion | string | 38 | Sweden, Norway | Country or region full name. |
| countryRegionCode | string | 35 | SE, NO | Two-letter country or region code (the samples match ISO 3166-1 alpha-2). |
| date | timestamp | 20,665 | 2074-01-01 00:00:00, 2025-12-25 00:00:00 | Date of the holiday. |
| holidayName | string | 483 | Søndag, Söndag | Full name of the holiday. |
| isPaidTimeOff | boolean | 3 | True | Indicates whether most people have paid time off on this date (currently available only for US, GB, and India). NULL means unknown. |
| normalizeHolidayName | string | 438 | Søndag, Söndag | Normalized name of the holiday. |
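The isPaidTimeOff column has three possible states (True, False, and NULL), so a plain truthiness check would silently count unknown rows as "not paid". The sketch below shows one way to keep the three states apart. The rows are invented for illustration and are not taken from the dataset, and `paid_time_off_label` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical rows shaped like the columns above. The isPaidTimeOff values
# are made up for illustration; None stands in for NULL (unknown).
rows = [
    {"countryRegionCode": "US", "holidayName": "Example holiday A", "isPaidTimeOff": True},
    {"countryRegionCode": "US", "holidayName": "Example holiday B", "isPaidTimeOff": False},
    {"countryRegionCode": "GB", "holidayName": "Example holiday C", "isPaidTimeOff": True},
    {"countryRegionCode": "SE", "holidayName": "Example holiday D", "isPaidTimeOff": None},
    {"countryRegionCode": "NO", "holidayName": "Example holiday E", "isPaidTimeOff": None},
]


def paid_time_off_label(value):
    """Map the three isPaidTimeOff states to a readable label."""
    if value is None:
        return "unknown"
    return "paid" if value else "not paid"


# Tally each state separately so unknown rows are not mixed with False rows.
counts = Counter(paid_time_off_label(row["isPaidTimeOff"]) for row in rows)

# A naive truthiness check lumps None together with False.
naive_not_paid = sum(1 for row in rows if not row["isPaidTimeOff"])

print(counts["paid"], counts["not paid"], counts["unknown"])  # 2 1 2
print(naive_not_paid)  # 3
```

The same distinction applies in pandas or Spark, where the NULL state typically needs an explicit null check rather than a boolean filter.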

Preview

| countryOrRegion | holidayName | normalizeHolidayName | countryRegionCode | date |
|---|---|---|---|---|
| Norway | Søndag | Søndag | NO | 12/28/2098 12:00:00 AM |
| Sweden | Söndag | Söndag | SE | 12/28/2098 12:00:00 AM |
| Australia | Boxing Day | Boxing Day | AU | 12/26/2098 12:00:00 AM |
| Hungary | Karácsony másnapja | Karácsony másnapja | HU | 12/26/2098 12:00:00 AM |
| Austria | Stefanitag | Stefanitag | AT | 12/26/2098 12:00:00 AM |
| Canada | Boxing Day | Boxing Day | CA | 12/26/2098 12:00:00 AM |
| Croatia | Sveti Stjepan | Sveti Stjepan | HR | 12/26/2098 12:00:00 AM |
| Czech | 2. svátek vánoční | 2. svátek vánoční | CZ | 12/26/2098 12:00:00 AM |
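The preview renders dates in a month/day/year, 12-hour format, whereas the sample values in the Columns section use an ISO-like format. If the preview text is copied somewhere as plain strings, it can be parsed back into timestamps as sketched below. `PREVIEW_DATE_FORMAT` and `parse_preview_date` are hypothetical names, and the format string is inferred from the preview rows rather than documented by the dataset.

```python
from datetime import datetime

# Format inferred from the preview rows: month/day/year with a 12-hour clock.
PREVIEW_DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_preview_date(text: str) -> datetime:
    """Parse a date string in the style shown in the preview."""
    return datetime.strptime(text, PREVIEW_DATE_FORMAT)


parsed = parse_preview_date("12/28/2098 12:00:00 AM")
print(parsed.isoformat())  # 2098-12-28T00:00:00
print(parsed.weekday())    # 6, that is, Sunday (Monday is 0)
```

The weekday result agrees with the Norwegian holiday name in that row, "Søndag", which means Sunday.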

Data access

Azure Notebooks

# This is a package in preview.
from azureml.opendatasets import PublicHolidays

from datetime import datetime
from dateutil.relativedelta import relativedelta

# Query the holidays that fall within the most recent month.
end_date = datetime.today()
start_date = end_date - relativedelta(months=1)
hol = PublicHolidays(start_date=start_date, end_date=end_date)

# Load the result into a pandas DataFrame and print a summary of its columns.
hol_df = hol.to_pandas_dataframe()
hol_df.info()

Azure Databricks

# This is a package in preview.
# You need to pip install azureml-opendatasets in Databricks cluster. https://learn.microsoft.com/azure/data-explorer/connect-from-databricks#install-the-python-library-on-your-azure-databricks-cluster
from azureml.opendatasets import PublicHolidays

from datetime import datetime
from dateutil.relativedelta import relativedelta

# Query the holidays that fall within the most recent month.
end_date = datetime.today()
start_date = end_date - relativedelta(months=1)
hol = PublicHolidays(start_date=start_date, end_date=end_date)

# Load the result into a Spark DataFrame.
hol_df = hol.to_spark_dataframe()
# Display top 5 rows (display is provided by the notebook environment).
display(hol_df.limit(5))

Azure Synapse

# This is a package in preview.
from azureml.opendatasets import PublicHolidays

from datetime import datetime
from dateutil.relativedelta import relativedelta

# Query the holidays that fall within the most recent month.
end_date = datetime.today()
start_date = end_date - relativedelta(months=1)
hol = PublicHolidays(start_date=start_date, end_date=end_date)

# Load the result into a Spark DataFrame.
hol_df = hol.to_spark_dataframe()
# Display top 5 rows (display is provided by the notebook environment).
display(hol_df.limit(5))

Next steps

View the rest of the datasets in the Open Datasets catalog.