DataLakeDirectoryClient Class
A client to interact with the DataLake directory, even if the directory may not yet exist.
For operations relating to a specific subdirectory or file under the directory, a directory client or file client can be retrieved using the get_sub_directory_client or get_file_client functions.
Inheritance
azure.storage.filedatalake._path_client.PathClient
DataLakeDirectoryClient
Constructor
DataLakeDirectoryClient(account_url: str, file_system_name: str, directory_name: str, credential: str | Dict[str, str] | AzureNamedKeyCredential | AzureSasCredential | TokenCredential | None = None, **kwargs: Any)
Parameters
Name | Description |
---|---|
account_url
Required
|
The URI to the storage account. |
file_system_name
Required
|
The file system for the directory or files. |
directory_name
Required
|
The whole path of the directory, e.g. {directory under file system}/{directory to interact with} |
credential
|
The credentials with which to authenticate. This is optional if the account URL already has a SAS token. The value can be a SAS token string, an instance of an AzureSasCredential or AzureNamedKeyCredential from azure.core.credentials, an account shared access key, or an instance of a TokenCredentials class from azure.identity. If the resource URI already contains a SAS token, this will be ignored in favor of an explicit credential.
Default value: None
|
Keyword-Only Parameters
Name | Description |
---|---|
api_version
|
The Storage API version to use for requests. Default value is the most recent service version that is compatible with the current SDK. Setting to an older version may result in reduced feature compatibility. |
audience
|
The audience to use when requesting tokens for Azure Active Directory authentication. Only has an effect when credential is of type TokenCredential. The value could be https://storage.azure.com/ (default) or https://.blob.core.windows.net. |
Examples
Creating the DataLakeDirectoryClient from a connection string.
from azure.storage.filedatalake import DataLakeDirectoryClient
DataLakeDirectoryClient.from_connection_string(connection_string, "myfilesystem", "mydirectory")
Variables
Name | Description |
---|---|
url
|
The full endpoint URL to the file system, including SAS token if used. |
primary_endpoint
|
The full primary endpoint URL. |
primary_hostname
|
The hostname of the primary endpoint. |
Methods
acquire_lease |
Requests a new lease. If the file or directory does not have an active lease, the DataLake service creates a lease on the file/directory and returns a new lease ID. |
close |
Closes the sockets opened by the client. It need not be called when the client is used as a context manager. |
create_directory |
Create a new directory. |
create_file |
Create a new file and return the file client to be interacted with. |
create_sub_directory |
Create a subdirectory and return the subdirectory client to be interacted with. |
delete_directory |
Marks the specified directory for deletion. |
delete_sub_directory |
Marks the specified subdirectory for deletion. |
exists |
Returns True if a directory exists and returns False otherwise. |
from_connection_string |
Create DataLakeDirectoryClient from a Connection String. |
get_access_control | |
get_directory_properties |
Returns all user-defined metadata, standard HTTP properties, and system properties for the directory. It does not return the content of the directory. |
get_file_client |
Get a client to interact with the specified file. The file need not already exist. |
get_paths |
Returns a generator to list the paths under specified file system and directory. The generator will lazily follow the continuation tokens returned by the service. |
get_sub_directory_client |
Get a client to interact with the specified subdirectory of the current directory. The subdirectory need not already exist. |
remove_access_control_recursive |
Removes the Access Control on a path and sub-paths. |
rename_directory |
Rename the source directory. |
set_access_control |
Set the owner, group, permissions, or access control list for a path. |
set_access_control_recursive |
Sets the Access Control on a path and sub-paths. |
set_http_headers |
Sets system properties on the file or directory. If one property is set for the content_settings, all properties will be overridden. |
set_metadata |
Sets one or more user-defined name-value pairs for the specified file system. Each call to this operation replaces all existing metadata attached to the file system. To remove all metadata from the file system, call this operation with no metadata dict. |
update_access_control_recursive |
Modifies the Access Control on a path and sub-paths. |
acquire_lease
Requests a new lease. If the file or directory does not have an active lease, the DataLake service creates a lease on the file/directory and returns a new lease ID.
acquire_lease(lease_duration: int | None = -1, lease_id: str | None = None, **kwargs) -> DataLakeLeaseClient
Parameters
Name | Description |
---|---|
lease_duration
|
Specifies the duration of the lease, in seconds, or negative one (-1) for a lease that never expires. A non-infinite lease can be between 15 and 60 seconds. A lease duration cannot be changed using renew or change. Default is -1 (infinite lease). |
lease_id
|
Proposed lease ID, in a GUID string format. The DataLake service returns 400 (Invalid request) if the proposed lease ID is not in the correct format. Default value: None |
Keyword-Only Parameters
Name | Description |
---|---|
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
A DataLakeLeaseClient object, that can be run in a context manager. |
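The constraints on lease_duration and lease_id described above can be checked before calling acquire_lease. The following is a minimal sketch, not part of the SDK: the helper names is_valid_lease_duration and new_lease_id are hypothetical, and the check only mirrors the documented rule (-1 for an infinite lease, otherwise 15 to 60 seconds).

```python
import uuid


def is_valid_lease_duration(lease_duration: int) -> bool:
    """Return True for -1 (infinite lease) or a value from 15 to 60 seconds."""
    return lease_duration == -1 or 15 <= lease_duration <= 60


def new_lease_id() -> str:
    """Return a proposed lease ID in GUID string format."""
    return str(uuid.uuid4())


print(is_valid_lease_duration(-1))  # True
print(is_valid_lease_duration(30))  # True
print(is_valid_lease_duration(5))   # False
print(len(new_lease_id()))          # 36
```

Validating locally avoids a round trip that the service would reject; the service remains the authority on what it accepts.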
close
Closes the sockets opened by the client. It need not be called when the client is used as a context manager.
close() -> None
create_directory
Create a new directory.
create_directory(metadata: Dict[str, str] | None = None, **kwargs) -> Dict[str, str | datetime]
Parameters
Name | Description |
---|---|
metadata
|
Name-value pairs associated with the directory as metadata. Default value: None |
Keyword-Only Parameters
Name | Description |
---|---|
content_settings
|
ContentSettings object used to set path properties. |
lease
|
Required if the directory has an active lease. Value can be a DataLakeLeaseClient object or the lease ID as a string. |
umask
|
Optional and only valid if Hierarchical Namespace is enabled for the account. When creating a file or directory and the parent folder does not have a default ACL, the umask restricts the permissions of the file or directory to be created. The resulting permission is given by p & ^u, where p is the permission and u is the umask. For example, if p is 0777 and u is 0057, then the resulting permission is 0720. The default permission is 0777 for a directory and 0666 for a file. The default umask is 0027. The umask must be specified in 4-digit octal notation (e.g. 0766). |
owner
|
The owner of the file or directory. |
group
|
The owning group of the file or directory. |
acl
|
Sets POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, a user or group identifier, and permissions in the format "[scope:][type]:[id]:[permissions]". |
lease_id
|
Proposed lease ID, in a GUID string format. The DataLake service returns 400 (Invalid request) if the proposed lease ID is not in the correct format. |
lease_duration
|
Specifies the duration of the lease, in seconds, or negative one (-1) for a lease that never expires. A non-infinite lease can be between 15 and 60 seconds. A lease duration cannot be changed using renew or change. |
permissions
|
Optional and only valid if Hierarchical Namespace is enabled for the account. Sets POSIX access permissions for the file owner, the file owning group, and others. Each class may be granted read, write, or execute permission. The sticky bit is also supported. Both symbolic (rwxrw-rw-) and 4-digit octal notation (e.g. 0766) are supported. |
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
cpk
|
Encrypts the data on the service-side with the given key. Use of customer-provided keys must be done over HTTPS. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
A dictionary of response headers. |
Examples
Create directory.
directory_client.create_directory()
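The umask arithmetic described for create_directory can be worked through in a few lines. This is a sketch of the documented formula only, reading "p & ^u" as p AND the bitwise complement of u, which is what the worked example (0777 with umask 0057 giving 0720) implies. The helper name apply_umask is hypothetical and is not an SDK function.

```python
def apply_umask(permission: int, umask: int) -> str:
    """Return permission & ~umask as a 4-digit octal string."""
    # Mask to 12 bits so the complement cannot introduce higher bits.
    return format(permission & ~umask & 0o7777, "04o")


# The worked example from the umask description.
print(apply_umask(0o777, 0o057))  # 0720

# Documented defaults: 0777 for a directory, 0666 for a file, umask 0027.
print(apply_umask(0o777, 0o027))  # 0750
print(apply_umask(0o666, 0o027))  # 0640
```

The result is formatted in the same 4-digit octal notation that the umask and permissions parameters expect.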
create_file
Create a new file and return the file client to be interacted with.
create_file(file: FileProperties | str, **kwargs) -> DataLakeFileClient
Parameters
Name | Description |
---|---|
file
Required
|
The file with which to interact. This can either be the name of the file, or an instance of FileProperties. |
Keyword-Only Parameters
Name | Description |
---|---|
content_settings
|
ContentSettings object used to set path properties. |
metadata
|
Name-value pairs associated with the file as metadata. |
lease
|
Required if the file has an active lease. Value can be a DataLakeLeaseClient object or the lease ID as a string. |
umask
|
Optional and only valid if Hierarchical Namespace is enabled for the account. When creating a file or directory and the parent folder does not have a default ACL, the umask restricts the permissions of the file or directory to be created. The resulting permission is given by p & ^u, where p is the permission and u is the umask. For example, if p is 0777 and u is 0057, then the resulting permission is 0720. The default permission is 0777 for a directory and 0666 for a file. The default umask is 0027. The umask must be specified in 4-digit octal notation (e.g. 0766). |
owner
|
The owner of the file or directory. |
group
|
The owning group of the file or directory. |
acl
|
Sets POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, a user or group identifier, and permissions in the format "[scope:][type]:[id]:[permissions]". |
lease_id
|
Proposed lease ID, in a GUID string format. The DataLake service returns 400 (Invalid request) if the proposed lease ID is not in the correct format. |
lease_duration
|
Specifies the duration of the lease, in seconds, or negative one (-1) for a lease that never expires. A non-infinite lease can be between 15 and 60 seconds. A lease duration cannot be changed using renew or change. |
expires_on
|
The time to set the file to expiry. If the type of expires_on is an int, expiration time will be set as the number of milliseconds elapsed from creation time. If the type of expires_on is datetime, expiration time will be set absolute to the time provided. If no time zone info is provided, this will be interpreted as UTC. |
permissions
|
Optional and only valid if Hierarchical Namespace is enabled for the account. Sets POSIX access permissions for the file owner, the file owning group, and others. Each class may be granted read, write, or execute permission. The sticky bit is also supported. Both symbolic (rwxrw-rw-) and 4-digit octal notation (e.g. 0766) are supported. |
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
cpk
|
Encrypts the data on the service-side with the given key. Use of customer-provided keys must be done over HTTPS. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
A DataLakeFileClient with newly created file. |
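The acl keyword takes a comma-separated list of entries in the format "[scope:][type]:[id]:[permissions]". The sketch below shows one way to assemble such a string. The helpers format_ace and format_acl are hypothetical, and the values used ("user", "group", "other", "default", and the rwx strings) are assumed conventional POSIX ACL values rather than values stated in this reference.

```python
def format_ace(ace_type: str, permissions: str, identifier: str = "", scope: str = "") -> str:
    """Build one access control entry: [scope:][type]:[id]:[permissions]."""
    prefix = f"{scope}:" if scope else ""
    return f"{prefix}{ace_type}:{identifier}:{permissions}"


def format_acl(entries: list) -> str:
    """Join access control entries into the comma-separated acl value."""
    return ",".join(entries)


acl = format_acl([
    format_ace("user", "rwx"),                   # owning user (empty id)
    format_ace("group", "r-x"),                  # owning group (empty id)
    format_ace("other", "---"),                  # everyone else
    format_ace("user", "r-x", scope="default"),  # assumed default-scope entry
])
print(acl)  # user::rwx,group::r-x,other::---,default:user::r-x
```

The resulting string could then be passed as the acl keyword argument, subject to whatever validation the service applies.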
create_sub_directory
Create a subdirectory and return the subdirectory client to be interacted with.
create_sub_directory(sub_directory: DirectoryProperties | str, metadata: Dict[str, str] | None = None, **kwargs) -> DataLakeDirectoryClient
Parameters
Name | Description |
---|---|
sub_directory
Required
|
The directory with which to interact. This can either be the name of the directory, or an instance of DirectoryProperties. |
metadata
|
Name-value pairs associated with the subdirectory as metadata. Default value: None |
Keyword-Only Parameters
Name | Description |
---|---|
content_settings
|
ContentSettings object used to set path properties. |
lease
|
Required if the directory has an active lease. Value can be a DataLakeLeaseClient object or the lease ID as a string. |
umask
|
Optional and only valid if Hierarchical Namespace is enabled for the account. When creating a file or directory and the parent folder does not have a default ACL, the umask restricts the permissions of the file or directory to be created. The resulting permission is given by p & ^u, where p is the permission and u is the umask. For example, if p is 0777 and u is 0057, then the resulting permission is 0720. The default permission is 0777 for a directory and 0666 for a file. The default umask is 0027. The umask must be specified in 4-digit octal notation (e.g. 0766). |
owner
|
The owner of the file or directory. |
group
|
The owning group of the file or directory. |
acl
|
Sets POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, a user or group identifier, and permissions in the format "[scope:][type]:[id]:[permissions]". |
lease_id
|
Proposed lease ID, in a GUID string format. The DataLake service returns 400 (Invalid request) if the proposed lease ID is not in the correct format. |
lease_duration
|
Specifies the duration of the lease, in seconds, or negative one (-1) for a lease that never expires. A non-infinite lease can be between 15 and 60 seconds. A lease duration cannot be changed using renew or change. |
permissions
|
Optional and only valid if Hierarchical Namespace is enabled for the account. Sets POSIX access permissions for the file owner, the file owning group, and others. Each class may be granted read, write, or execute permission. The sticky bit is also supported. Both symbolic (rwxrw-rw-) and 4-digit octal notation (e.g. 0766) are supported. |
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
cpk
|
Encrypts the data on the service-side with the given key. Use of customer-provided keys must be done over HTTPS. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
DataLakeDirectoryClient for the subdirectory. |
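Because create_sub_directory returns a DataLakeDirectoryClient and create_file returns a DataLakeFileClient, the two calls can be chained. A minimal sketch, assuming directory_client is an already constructed DataLakeDirectoryClient; the function name create_file_in_sub_directory is hypothetical.

```python
def create_file_in_sub_directory(directory_client, sub_directory: str, file_name: str):
    """Create a subdirectory, then a file inside it, and return the file client."""
    # create_sub_directory returns a client scoped to the new subdirectory.
    sub_directory_client = directory_client.create_sub_directory(sub_directory)
    # create_file returns a client for the newly created file.
    return sub_directory_client.create_file(file_name)
```

The function only relies on the two method names documented here, so it can be exercised in unit tests with a stand-in object that implements those methods.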
delete_directory
Marks the specified directory for deletion.
delete_directory(**kwargs) -> None
Keyword-Only Parameters
Name | Description |
---|---|
lease
|
Required if the directory has an active lease. Value can be a DataLakeLeaseClient object or the lease ID as a string. |
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
None. |
Examples
Delete directory.
new_directory.delete_directory()
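The if_modified_since and if_unmodified_since descriptions state that naive datetimes are assumed to be UTC and that timezone-aware values are converted to UTC. The sketch below illustrates that interpretation with the standard library. It is not the SDK's own conversion code, and the helper name to_utc is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_utc(value: datetime) -> datetime:
    """Interpret a naive datetime as UTC; convert an aware one to UTC."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


naive = datetime(2024, 1, 1, 12, 0)
aware = datetime(2024, 1, 1, 12, 0, tzinfo=timezone(timedelta(hours=2)))

print(to_utc(naive).isoformat())  # 2024-01-01T12:00:00+00:00
print(to_utc(aware).isoformat())  # 2024-01-01T10:00:00+00:00
```

Passing timezone-aware UTC values explicitly removes any ambiguity about how a local time will be read.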
delete_sub_directory
Marks the specified subdirectory for deletion.
delete_sub_directory(sub_directory: DirectoryProperties | str, **kwargs) -> DataLakeDirectoryClient
Parameters
Name | Description |
---|---|
sub_directory
Required
|
The directory with which to interact. This can either be the name of the directory, or an instance of DirectoryProperties. |
Keyword-Only Parameters
Name | Description |
---|---|
lease
|
Required if the directory has an active lease. Value can be a DataLakeLeaseClient object or the lease ID as a string. |
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
DataLakeDirectoryClient for the subdirectory. |
exists
Returns True if a directory exists and returns False otherwise.
exists(**kwargs: Any) -> bool
Keyword-Only Parameters
Name | Description |
---|---|
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
True if a directory exists, False otherwise. |
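A common use of exists is to create a directory only when it is missing. The following is a minimal sketch, assuming directory_client is an already constructed DataLakeDirectoryClient; the function name ensure_directory is hypothetical.

```python
def ensure_directory(directory_client) -> bool:
    """Create the directory if it is missing; return True if it was created."""
    if directory_client.exists():
        return False
    directory_client.create_directory()
    return True
```

This check-then-create sequence is not atomic: another writer could create the directory between the two calls, so callers that need stronger guarantees may prefer the conditional headers (etag and match_condition) described for create_directory.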
from_connection_string
Create DataLakeDirectoryClient from a Connection String.
from_connection_string(conn_str: str, file_system_name: str, directory_name: str, credential: str | Dict[str, str] | AzureNamedKeyCredential | AzureSasCredential | TokenCredential | None = None, **kwargs: Any) -> Self
Parameters
Name | Description |
---|---|
conn_str
Required
|
A connection string to an Azure Storage account. |
file_system_name
Required
|
The name of file system to interact with. |
credential
|
The credentials with which to authenticate. This is optional if the account URL already has a SAS token. The value can be a SAS token string, an instance of an AzureSasCredential or AzureNamedKeyCredential from azure.core.credentials, an account shared access key, or an instance of a TokenCredentials class from azure.identity. If the resource URI already contains a SAS token, this will be ignored in favor of an explicit credential.
Default value: None
|
directory_name
Required
|
The name of directory to interact with. The directory is under file system. |
Keyword-Only Parameters
Name | Description |
---|---|
audience
|
The audience to use when requesting tokens for Azure Active Directory authentication. Only has an effect when credential is of type TokenCredential. The value could be https://storage.azure.com/ (default) or https://.blob.core.windows.net. |
Returns
Type | Description |
---|---|
A DataLakeDirectoryClient. |
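The close method notes that it need not be called when the client is used with a context manager. The sketch below combines that with from_connection_string. It assumes the azure-storage-file-datalake package is installed and that the client supports the with statement as that note implies; the function name directory_exists is hypothetical.

```python
def directory_exists(conn_str: str, file_system_name: str, directory_name: str) -> bool:
    """Open a directory client from a connection string and report whether it exists."""
    # Imported lazily so this module can be loaded without the SDK installed.
    from azure.storage.filedatalake import DataLakeDirectoryClient

    # Leaving the with block closes the client's sockets (assumed behavior,
    # based on the note in the close method's description).
    with DataLakeDirectoryClient.from_connection_string(
        conn_str, file_system_name, directory_name
    ) as directory_client:
        return directory_client.exists()
```

A call would look like directory_exists(connection_string, "myfilesystem", "mydirectory"), where connection_string comes from your own configuration.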
get_access_control
get_access_control(upn: bool | None = None, **kwargs) -> Dict[str, Any]
Parameters
Name | Description |
---|---|
upn
|
Optional. Valid only when Hierarchical Namespace is enabled for the account. If True, the user identity values returned in the x-ms-owner, x-ms-group, and x-ms-acl response headers will be transformed from Azure Active Directory Object IDs to User Principal Names. If False, the values will be returned as Azure Active Directory Object IDs. The default value is False. Note that group and application Object IDs are not translated because they do not have unique friendly names. |
Keyword-Only Parameters
Name | Description |
---|---|
lease
|
Required if the file/directory has an active lease. Value can be a DataLakeLeaseClient object or the lease ID as a string. |
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
response dict containing access control options with no modifications. |
get_directory_properties
Returns all user-defined metadata, standard HTTP properties, and system properties for the directory. It does not return the content of the directory.
get_directory_properties(**kwargs: Any) -> DirectoryProperties
Keyword-Only Parameters
Name | Description |
---|---|
lease
|
Required if the directory or file has an active lease. Value can be a DataLakeLeaseClient object or the lease ID as a string. |
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
cpk
|
Decrypts the data on the service-side with the given key. Use of customer-provided keys must be done over HTTPS. Required if the directory was created with a customer-provided key. |
upn
|
If True, the user identity values returned in the x-ms-owner, x-ms-group, and x-ms-acl response headers will be transformed from Azure Active Directory Object IDs to User Principal Names in the owner, group, and acl fields of DirectoryProperties. If False, the values will be returned as Azure Active Directory Object IDs. The default value is False. Note that group and application Object IDs are not translated because they do not have unique friendly names. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
DirectoryProperties with all user-defined metadata, standard HTTP properties, and system properties for the directory. It does not return the content of the directory. |
Examples
Getting the properties for a file/directory.
props = new_directory.get_directory_properties()
get_file_client
Get a client to interact with the specified file.
The file need not already exist.
get_file_client(file: FileProperties | str) -> DataLakeFileClient
Parameters
Name | Description |
---|---|
file
Required
|
The file with which to interact. This can either be the name of the file, or an instance of FileProperties, e.g. directory/subdirectory/file |
Returns
Type | Description |
---|---|
A DataLakeFileClient. |
get_paths
Returns a generator to list the paths under the specified file system and directory. The generator will lazily follow the continuation tokens returned by the service.
get_paths(*, recursive: bool = True, max_results: int | None = None, upn: bool | None = None, timeout: int | None = None, **kwargs: Any) -> ItemPaged[PathProperties]
Keyword-Only Parameters
Name | Description |
---|---|
recursive
|
Set to True to list paths recursively, or False to list only the immediate children of the directory. The default value is True. |
max_results
|
An optional value that specifies the maximum number of items to return per page. If omitted or greater than 5,000, the response will include up to 5,000 items per page. |
upn
|
If True, the user identity values returned in the x-ms-owner, x-ms-group, and x-ms-acl response headers will be transformed from Azure Active Directory Object IDs to User Principal Names in the owner, group, and acl fields of PathProperties. If False, the values will be returned as Azure Active Directory Object IDs. The default value is None. Note that group and application Object IDs are not translated because they do not have unique friendly names. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. The default value is None. |
Returns
Type | Description |
---|---|
An iterable (auto-paging) response of PathProperties. |
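The paging behavior described for get_paths can be modeled with a small, SDK-free sketch. It is only an illustration of the documented semantics (a per-page cap of 5,000 items and lazy following of continuation tokens); `fake_list_page`, `iter_paths`, and `effective_page_size` are hypothetical names, and the "service" here is a plain Python list, not the real Data Lake endpoint.

```python
from typing import Iterator, List, Optional, Tuple

SERVICE_PAGE_CAP = 5000  # per-page cap stated in the max_results description


def effective_page_size(max_results: Optional[int]) -> int:
    """Page size used when max_results is omitted or exceeds the cap."""
    if max_results is None or max_results > SERVICE_PAGE_CAP:
        return SERVICE_PAGE_CAP
    return max_results


def fake_list_page(
    paths: List[str], token: Optional[int], page_size: int
) -> Tuple[List[str], Optional[int]]:
    """Hypothetical stand-in for one service call: one page plus the next token."""
    start = token or 0
    end = start + page_size
    next_token = end if end < len(paths) else None
    return paths[start:end], next_token


def iter_paths(paths: List[str], max_results: Optional[int] = None) -> Iterator[str]:
    """Yield items lazily, requesting the next page only when it is needed."""
    page_size = effective_page_size(max_results)
    token: Optional[int] = None
    while True:
        page, token = fake_list_page(paths, token, page_size)
        yield from page
        if token is None:  # no continuation token means the listing is complete
            break


if __name__ == "__main__":
    sample = [f"dir/file{i}" for i in range(7)]
    for name in iter_paths(sample, max_results=3):
        print(name)
```

With the real client, iterating the returned ItemPaged object plays the role of `iter_paths`; the caller does not handle tokens directly.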
get_sub_directory_client
Get a client to interact with the specified subdirectory of the current directory.
The subdirectory need not already exist.
get_sub_directory_client(sub_directory: DirectoryProperties | str) -> DataLakeDirectoryClient
Parameters
Name | Description |
---|---|
sub_directory
Required
|
The directory with which to interact. This can either be the name of the directory, or an instance of DirectoryProperties. |
Keyword-Only Parameters
Name | Description |
---|---|
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
A DataLakeDirectoryClient. |
remove_access_control_recursive
Removes the Access Control on a path and sub-paths.
remove_access_control_recursive(acl: str, **kwargs: Any) -> AccessControlChangeResult
Parameters
Name | Description |
---|---|
acl
Required
|
Removes POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, and a user or group identifier in the format "[scope:][type]:[id]". |
Keyword-Only Parameters
Name | Description |
---|---|
progress_hook
|
func(AccessControlChanges)
Callback where the caller can track progress of the operation as well as collect paths that failed to change Access Control. |
continuation_token
|
Optional continuation token that can be used to resume a previously stopped operation. |
batch_size
|
Optional. If the data set size exceeds the batch size, the operation is split into multiple requests so that progress can be tracked. The batch size should be between 1 and 2000. The default when unspecified is 2000. |
max_batches
|
Optional. Defines the maximum number of batches that a single change Access Control operation can execute. If the maximum is reached before all sub-paths are processed, the continuation token can be used to resume the operation. An empty value indicates that the maximum number of batches is unbounded and the operation continues until the end. |
continue_on_failure
|
If set to False, the operation will terminate quickly on encountering user errors (4XX). If True, the operation will ignore user errors and proceed with the operation on other sub-entities of the directory. In case of user errors, a continuation token is only returned when continue_on_failure is True. The default value is False. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
A summary of the recursive operations, including the count of successes and failures, as well as a continuation token in case the operation was terminated prematurely. |
Exceptions
Type | Description |
---|---|
The user can restart the operation using the continuation_token field of AzureError if the token is available. |
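The acl value for remove_access_control_recursive omits permissions, following the "[scope:][type]:[id]" format given above. The sketch below assembles such a string; `removal_ace` and `removal_acl` are hypothetical helpers, and the assumed scope keyword ("default") and type names ("user", "group", "mask", "other") are not stated on this page and should be checked against the service documentation.

```python
from typing import Iterable, Tuple

# Assumed ACE types; verify against the service documentation.
ACE_TYPES = {"user", "group", "mask", "other"}


def removal_ace(ace_type: str, identifier: str = "", default_scope: bool = False) -> str:
    """Build one '[scope:][type]:[id]' entry (no permissions part)."""
    if ace_type not in ACE_TYPES:
        raise ValueError(f"unknown ACE type: {ace_type!r}")
    scope = "default:" if default_scope else ""
    return f"{scope}{ace_type}:{identifier}"


def removal_acl(entries: Iterable[Tuple[str, str, bool]]) -> str:
    """Join (type, id, default_scope) tuples into a comma-separated acl string."""
    return ",".join(removal_ace(t, i, d) for t, i, d in entries)


if __name__ == "__main__":
    # The identifiers below are placeholders, not real object IDs.
    print(removal_acl([("user", "user-object-id", False),
                       ("group", "group-object-id", True)]))
```

The resulting string would be passed as the acl argument, for example `directory_client.remove_access_control_recursive(acl=...)`.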
rename_directory
Rename the source directory.
rename_directory(new_name: str, **kwargs: Any) -> DataLakeDirectoryClient
Parameters
Name | Description |
---|---|
new_name
Required
|
The new name for the directory. The value must have the following format: "{filesystem}/{directory}/{subdirectory}". |
Keyword-Only Parameters
Name | Description |
---|---|
source_lease
|
A lease ID for the source path. If specified, the source path must have an active lease and the lease ID must match. |
lease
|
Required if the file/directory has an active lease. Value can be a LeaseClient object or the lease ID as a string. |
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
source_if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
source_if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
source_etag
|
The source ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
source_match_condition
|
The source match condition to use upon the etag. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
A DataLakeDirectoryClient with the renamed directory. |
Examples
Rename the source directory.
new_dir_name = "testdir2"
print("Renaming the directory named '{}' to '{}'.".format(dir_name, new_dir_name))
# new_name must start with the file system name: "{filesystem}/{directory}"
new_directory = directory_client.rename_directory(
    new_name=directory_client.file_system_name + '/' + new_dir_name)
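Because new_name must follow the "{filesystem}/{directory}/{subdirectory}" format, it can help to build the value in one place. The helper below is a minimal sketch; `build_new_name` is a hypothetical function, not part of the SDK, and it only normalizes slashes.

```python
def build_new_name(file_system_name: str, *segments: str) -> str:
    """Join a file system name and path segments into '{filesystem}/{dir}/...'."""
    file_system = file_system_name.strip("/")
    if not file_system:
        raise ValueError("file_system_name must not be empty")
    parts = [file_system]
    for segment in segments:
        # Drop empty pieces so stray leading/trailing slashes do not matter.
        parts.extend(piece for piece in segment.split("/") if piece)
    if len(parts) < 2:
        raise ValueError("at least one directory segment is required")
    return "/".join(parts)


if __name__ == "__main__":
    print(build_new_name("myfilesystem", "parent", "testdir2"))
```

The result could then be passed as `rename_directory(new_name=...)`, with `directory_client.file_system_name` supplying the first argument.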
set_access_control
Set the owner, group, permissions, or access control list for a path.
set_access_control(owner: str | None = None, group: str | None = None, permissions: str | None = None, acl: str | None = None, **kwargs) -> Dict[str, str | datetime]
Parameters
Name | Description |
---|---|
owner
|
Optional. The owner of the file or directory. |
group
|
Optional. The owning group of the file or directory. |
permissions
|
Optional and only valid if Hierarchical Namespace is enabled for the account. Sets POSIX access permissions for the file owner, the file owning group, and others. Each class may be granted read, write, or execute permission. The sticky bit is also supported. Both symbolic (rwxrw-rw-) and 4-digit octal notation (e.g. 0766) are supported. permissions and acl are mutually exclusive. |
acl
|
Optional. Sets POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, a user or group identifier, and permissions in the format "[scope:][type]:[id]:[permissions]". permissions and acl are mutually exclusive. |
Keyword-Only Parameters
Name | Description |
---|---|
lease
|
Required if the file/directory has an active lease. Value can be a LeaseClient object or the lease ID as a string. |
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
response dict containing access control options (Etag and last modified). |
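The permissions parameter accepts both symbolic (rwxrw-rw-) and 4-digit octal (0766) notation. The sketch below shows how the two relate; `symbolic_to_octal` is a hypothetical helper, and its handling of the sticky bit (a trailing "t" or "T" in the last position) follows the usual POSIX convention, which is an assumption about what the service accepts.

```python
def symbolic_to_octal(symbolic: str) -> str:
    """Convert a 9-character symbolic permission string to 4-digit octal."""
    if len(symbolic) != 9:
        raise ValueError("expected 9 characters, e.g. 'rwxrw-rw-'")
    sticky = 0
    digits = []
    for start in (0, 3, 6):  # owner, owning group, others
        read, write, execute = symbolic[start:start + 3]
        value = 0
        if read == "r":
            value += 4
        elif read != "-":
            raise ValueError(f"bad read flag: {read!r}")
        if write == "w":
            value += 2
        elif write != "-":
            raise ValueError(f"bad write flag: {write!r}")
        if execute == "x":
            value += 1
        elif start == 6 and execute == "t":  # sticky bit with execute
            value += 1
            sticky = 1
        elif start == 6 and execute == "T":  # sticky bit without execute
            sticky = 1
        elif execute != "-":
            raise ValueError(f"bad execute flag: {execute!r}")
        digits.append(str(value))
    return f"{sticky}{''.join(digits)}"


if __name__ == "__main__":
    print(symbolic_to_octal("rwxrw-rw-"))
```

Either form could be passed as `set_access_control(permissions=...)`; remember that permissions and acl are mutually exclusive.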
set_access_control_recursive
Sets the Access Control on a path and sub-paths.
set_access_control_recursive(acl: str, **kwargs: Any) -> AccessControlChangeResult
Parameters
Name | Description |
---|---|
acl
Required
|
Sets POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, a user or group identifier, and permissions in the format "[scope:][type]:[id]:[permissions]". |
Keyword-Only Parameters
Name | Description |
---|---|
progress_hook
|
func(AccessControlChanges)
Callback where the caller can track progress of the operation as well as collect paths that failed to change Access Control. |
continuation_token
|
Optional continuation token that can be used to resume a previously stopped operation. |
batch_size
|
Optional. If the data set size exceeds the batch size, the operation is split into multiple requests so that progress can be tracked. The batch size should be between 1 and 2000. The default when unspecified is 2000. |
max_batches
|
Optional. Defines the maximum number of batches that a single change Access Control operation can execute. If the maximum is reached before all sub-paths are processed, the continuation token can be used to resume the operation. An empty value indicates that the maximum number of batches is unbounded and the operation continues until the end. |
continue_on_failure
|
If set to False, the operation will terminate quickly on encountering user errors (4XX). If True, the operation will ignore user errors and proceed with the operation on other sub-entities of the directory. In case of user errors, a continuation token is only returned when continue_on_failure is True. The default value is False. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
A summary of the recursive operations, including the count of successes and failures, as well as a continuation token in case the operation was terminated prematurely. |
Exceptions
Type | Description |
---|---|
The user can restart the operation using the continuation_token field of AzureError if the token is available. |
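For set_access_control_recursive, each entry carries a permissions part, following the "[scope:][type]:[id]:[permissions]" format described above. The sketch below assembles and validates such a string; `make_ace` and `make_acl` are hypothetical helpers, and the assumed scope keyword ("default") and type names are not listed on this page.

```python
import re
from typing import Iterable, Tuple

# Assumed ACE types; verify against the service documentation.
ACE_TYPES = {"user", "group", "mask", "other"}
PERMISSIONS = re.compile(r"[r-][w-][x-]")  # e.g. 'rwx', 'r-x', '---'


def make_ace(ace_type: str, permissions: str, identifier: str = "",
             default_scope: bool = False) -> str:
    """Build one '[scope:][type]:[id]:[permissions]' entry."""
    if ace_type not in ACE_TYPES:
        raise ValueError(f"unknown ACE type: {ace_type!r}")
    if not PERMISSIONS.fullmatch(permissions):
        raise ValueError(f"bad permissions: {permissions!r}")
    scope = "default:" if default_scope else ""
    return f"{scope}{ace_type}:{identifier}:{permissions}"


def make_acl(entries: Iterable[Tuple]) -> str:
    """Join entries (tuples of make_ace arguments) into an acl string."""
    return ",".join(make_ace(*entry) for entry in entries)


if __name__ == "__main__":
    print(make_acl([("user", "rwx"), ("group", "r-x"), ("other", "---")]))
```

The resulting string would be passed as the acl argument, for example `directory_client.set_access_control_recursive(acl=...)`.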
set_http_headers
Sets system properties on the file or directory.
If one property is set for the content_settings, all properties will be overridden.
set_http_headers(content_settings: ContentSettings | None = None, **kwargs) -> Dict[str, Any]
Parameters
Name | Description |
---|---|
content_settings
|
ContentSettings object used to set file/directory properties. The default value is None. |
Keyword-Only Parameters
Name | Description |
---|---|
lease
|
If specified, the operation only succeeds if the lease on the file/directory is active and matches this ID. |
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
file/directory-updated property dict (Etag and last modified) |
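The if_modified_since and if_unmodified_since descriptions state that naive datetimes are treated as UTC and that timezone-aware values are converted to UTC. The sketch below reproduces that rule with the standard library so the value sent is unambiguous; `to_utc` is a hypothetical helper that mirrors the documented behavior and is not the SDK's internal code.

```python
from datetime import datetime, timedelta, timezone


def to_utc(value: datetime) -> datetime:
    """Return a timezone-aware UTC datetime following the documented rule."""
    if value.tzinfo is None or value.utcoffset() is None:
        # Naive datetimes are assumed to already be in UTC.
        return value.replace(tzinfo=timezone.utc)
    # Aware datetimes are converted to UTC.
    return value.astimezone(timezone.utc)


if __name__ == "__main__":
    naive = datetime(2024, 1, 1, 12, 0)
    aware = datetime(2024, 1, 1, 12, 0, tzinfo=timezone(timedelta(hours=2)))
    print(to_utc(naive).isoformat())
    print(to_utc(aware).isoformat())
```

Passing an explicitly UTC value, for example `if_modified_since=to_utc(some_datetime)`, avoids surprises when local time is used elsewhere in an application.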
set_metadata
Sets one or more user-defined name-value pairs for the specified file or directory. Each call to this operation replaces all existing metadata attached to the file or directory. To remove all metadata from the file or directory, call this operation with an empty metadata dict.
set_metadata(metadata: Dict[str, str], **kwargs) -> Dict[str, str | datetime]
Parameters
Name | Description |
---|---|
metadata
Required
|
A dict containing name-value pairs to associate with the file or directory as metadata. Example: {'category':'test'} |
Keyword-Only Parameters
Name | Description |
---|---|
lease
|
If specified, the operation only succeeds if the lease on the file/directory is active and matches this ID. |
if_modified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has been modified since the specified time. |
if_unmodified_since
|
A DateTime value. Azure expects the date value passed in to be UTC. If timezone is included, any non-UTC datetimes will be converted to UTC. If a date is passed in without timezone info, it is assumed to be UTC. Specify this header to perform the operation only if the resource has not been modified since the specified date/time. |
etag
|
An ETag value, or the wildcard character (*). Used to check if the resource has changed, and act according to the condition specified by the match_condition parameter. |
match_condition
|
The match condition to use upon the etag. |
cpk
|
Encrypts the data on the service-side with the given key. Use of customer-provided keys must be done over HTTPS. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
file/directory-updated property dict (Etag and last modified). |
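Because each set_metadata call replaces all existing metadata, adding or changing a single key means sending the full merged dict. The sketch below shows that read-merge-write pattern on plain dicts; `merged_metadata` is a hypothetical helper, and in practice the existing values would presumably come from the metadata attribute of the properties returned by get_directory_properties.

```python
from typing import Dict


def merged_metadata(existing: Dict[str, str], updates: Dict[str, str]) -> Dict[str, str]:
    """Return a new dict with updates applied on top of the existing metadata."""
    merged = dict(existing)  # copy so the caller's dict is not mutated
    merged.update(updates)
    return merged


if __name__ == "__main__":
    current = {"category": "test"}
    to_send = merged_metadata(current, {"owner": "data-team"})
    # Sending only {"owner": "data-team"} would drop the "category" entry.
    print(to_send)
```

The merged dict would then be passed as `set_metadata(metadata=to_send)`.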
update_access_control_recursive
Modifies the Access Control on a path and sub-paths.
update_access_control_recursive(acl: str, **kwargs: Any) -> AccessControlChangeResult
Parameters
Name | Description |
---|---|
acl
Required
|
Modifies POSIX access control rights on files and directories. The value is a comma-separated list of access control entries. Each access control entry (ACE) consists of a scope, a type, a user or group identifier, and permissions in the format "[scope:][type]:[id]:[permissions]". |
Keyword-Only Parameters
Name | Description |
---|---|
progress_hook
|
func(AccessControlChanges)
Callback where the caller can track progress of the operation as well as collect paths that failed to change Access Control. |
continuation_token
|
Optional continuation token that can be used to resume a previously stopped operation. |
batch_size
|
Optional. If the data set size exceeds the batch size, the operation is split into multiple requests so that progress can be tracked. The batch size should be between 1 and 2000. The default when unspecified is 2000. |
max_batches
|
Optional. Defines the maximum number of batches that a single change Access Control operation can execute. If the maximum is reached before all sub-paths are processed, the continuation token can be used to resume the operation. An empty value indicates that the maximum number of batches is unbounded and the operation continues until the end. |
continue_on_failure
|
If set to False, the operation will terminate quickly on encountering user errors (4XX). If True, the operation will ignore user errors and proceed with the operation on other sub-entities of the directory. In case of user errors, a continuation token is only returned when continue_on_failure is True. The default value is False. |
timeout
|
Sets the server-side timeout for the operation in seconds. For more details see https://learn.microsoft.com/rest/api/storageservices/setting-timeouts-for-blob-service-operations. This value is not tracked or validated on the client. To configure client-side network timeouts see here. |
Returns
Type | Description |
---|---|
A summary of the recursive operations, including the count of successes and failures, as well as a continuation token in case the operation was terminated prematurely. |
Exceptions
Type | Description |
---|---|
The user can restart the operation using the continuation_token field of AzureError if the token is available. |
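The interaction between batch_size and max_batches in the recursive access control operations can be worked through as simple arithmetic. The sketch below is a model of the documented semantics only; `plan_batches` is a hypothetical helper, and it assumes every batch is full, which the real service does not guarantee.

```python
import math
from typing import Dict, Optional, Union


def plan_batches(total_paths: int, batch_size: int = 2000,
                 max_batches: Optional[int] = None) -> Dict[str, Union[int, bool]]:
    """Estimate how many batches run and whether a continuation is needed."""
    if not 1 <= batch_size <= 2000:
        raise ValueError("batch_size should be between 1 and 2000")
    if max_batches is not None and max_batches < 1:
        raise ValueError("max_batches must be at least 1 when given")
    needed = math.ceil(total_paths / batch_size)
    # An empty max_batches means the operation is unbounded.
    run = needed if max_batches is None else min(needed, max_batches)
    processed = min(total_paths, run * batch_size)
    return {
        "batches_run": run,
        "paths_processed": processed,
        "needs_continuation": processed < total_paths,
    }


if __name__ == "__main__":
    print(plan_batches(5000))
    print(plan_batches(5000, batch_size=1000, max_batches=2))
```

When needs_continuation would be true, the continuation token from the returned result is what a follow-up call takes as its continuation_token argument.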
Attributes
api_version
location_mode
The location mode that the client is currently using.
By default this will be "primary". Options include "primary" and "secondary".
primary_endpoint
primary_hostname
secondary_endpoint
The full secondary endpoint URL if configured.
If not available a ValueError will be raised. To explicitly specify a secondary hostname, use the optional secondary_hostname keyword argument on instantiation.
secondary_hostname
The hostname of the secondary endpoint.
If not available this will be None. To explicitly specify a secondary hostname, use the optional secondary_hostname keyword argument on instantiation.
url
The full endpoint URL to this entity, including SAS token if used.
This could be either the primary endpoint or the secondary endpoint, depending on the current location_mode. The value is returned as a str.