Important
This feature is in Beta.
This article describes serverless GPU compute on Databricks and provides recommended use cases, guidance for how to set up GPU compute resources, and feature limitations.
What is serverless GPU compute?
Serverless GPU compute is part of the serverless compute offering and is specialized for custom single-node and multi-node deep learning workloads. You can use serverless GPU compute to train and fine-tune custom models with the frameworks of your choice.
Serverless GPU compute includes:
- An integrated experience across Notebooks, Unity Catalog, and MLflow: You can develop your code interactively using Notebooks.
- Support for A10 GPUs.
The pre-installed packages on serverless GPU compute are not a replacement for Databricks Runtime ML. While there are common packages, not all Databricks Runtime ML dependencies and libraries are reflected in the serverless GPU compute environment.
Recommended use cases
Databricks recommends serverless GPU compute for any model training use case that requires training customizations and GPUs.
For example:
- Deep-learning-based forecasting workloads
- Fine-tuning
- Computer vision
- Computer audio
- Recommender systems
Requirements
- A workspace in one of the following Azure-supported regions:
  - eastus
  - eastus2
  - centralus
  - northcentralus
  - westcentralus
What's installed
Serverless GPU compute for notebooks uses environment versions, which provide a stable client API to ensure application compatibility. This allows Databricks to upgrade the server independently, delivering performance improvements, security enhancements, and bug fixes without requiring any code changes to workloads.
Serverless GPU compute uses environment version 3 in addition to the following packages:
- CUDA 12.4
- torch 2.6.0
- torchvision 0.21.0
See Serverless environment version 3 for the packages included in system environment version 3.
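To confirm that a notebook session matches the versions listed above, you can compare the installed package metadata against them. The following is a minimal sketch, not an official check: the `EXPECTED` dictionary and the `check_versions` helper are illustrative names, and the sketch assumes that a local build suffix such as `+cu124` may follow the version number.

```python
from importlib import metadata

# Versions listed in this article (illustrative; adjust if the environment changes).
EXPECTED = {"torch": "2.6.0", "torchvision": "0.21.0"}


def check_versions(expected):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Assumption: a local build suffix such as "+cu124" may be present,
        # so only the part before "+" is compared.
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (found, matches)
    return report


for name, (found, ok) in check_versions(EXPECTED).items():
    status = "OK" if ok else "differs or missing"
    print(f"{name}: installed={found}, expected={EXPECTED[name]} ({status})")
```

When PyTorch is installed, `torch.version.cuda` reports the CUDA version that PyTorch was built against, which you can compare with the CUDA version listed above.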
Note
Base environments are not supported for serverless GPU compute. To set up the environment for serverless GPU compute, specify the dependencies directly in the Environment side panel or pip install them.
Add libraries to the environment
You can install additional libraries to the serverless GPU compute environment. See Add dependencies to the notebook.
Set up serverless GPU compute
You can select serverless GPU compute from the notebook environment in your workspace.
After you open your notebook:
- Open the Environment side panel.
- Select A10 from the Accelerator field.
- Select 3 as the Environment version.
- Select Apply and then Confirm that you want to apply the serverless GPU compute to your notebook environment. After connecting to a resource, notebooks immediately begin using the available compute.
Note
Connection to your compute auto-terminates after 60 minutes of inactivity.
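After the notebook connects, you might want to confirm that a GPU is visible to PyTorch before starting a long run. The following sketch uses standard `torch.cuda` calls; the `describe_accelerator` helper is a hypothetical name, and the exact device name string reported on serverless GPU compute is not specified in this article.

```python
def describe_accelerator():
    """Summarize the GPUs that PyTorch can see in the current session."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch_installed": False, "cuda_available": False, "devices": []}

    available = torch.cuda.is_available()
    devices = []
    if available:
        # One entry per visible GPU, for example a name containing "A10".
        devices = [
            torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())
        ]
    return {"torch_installed": True, "cuda_available": available, "devices": devices}


info = describe_accelerator()
print(info)
```

If `cuda_available` is `False`, check that the Accelerator field is set and that the environment change was applied.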
Limitations
- Serverless GPU compute only supports A10 compute.
- Private Link is not supported. This includes storage or pip repositories behind Private Link.
- Serverless GPU compute is not supported for compliance security profile workspaces (such as HIPAA or PCI). Processing regulated data is not supported at this time.
Notebook examples
The following notebook provides a simple example of how to run deep learning training using PyTorch and serverless GPU compute.
Serverless GPU compute notebook
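For orientation, a PyTorch training loop of the kind such a notebook might contain can be sketched as follows. This is not the contents of the notebook above: the synthetic data, the one-layer model, the hyperparameters, and the `train_tiny_model` name are all assumptions chosen to keep the example small. The code uses the GPU when one is available and falls back to the CPU otherwise.

```python
def train_tiny_model(steps=50, seed=0):
    """Fit y = 3x + 1 with a single linear layer; return (first_loss, last_loss)."""
    try:
        import torch
    except ImportError:
        return None  # PyTorch is not installed in this environment.

    torch.manual_seed(seed)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Synthetic regression data (illustrative only).
    x = torch.randn(256, 1, device=device)
    y = 3.0 * x + 1.0 + 0.1 * torch.randn(256, 1, device=device)

    model = torch.nn.Linear(1, 1).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = torch.nn.MSELoss()

    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)  # full-batch forward pass
        loss.backward()              # compute gradients
        optimizer.step()             # update weights
        losses.append(loss.item())
    return losses[0], losses[-1]


result = train_tiny_model()
if result is not None:
    print(f"first loss={result[0]:.4f}, last loss={result[1]:.4f}")
```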
The following notebook provides an example of how to efficiently fine-tune the Qwen2-0.5B model with PyTorch and serverless GPU compute using:
- Transformer reinforcement learning (TRL) for supervised fine-tuning.
- Liger Kernels for memory-efficient training with optimized Triton kernels.
- LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning.
Serverless GPU compute fine-tuning notebook
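To see why LoRA is described as parameter-efficient, it helps to count parameters for one weight matrix. LoRA freezes a weight matrix of shape `d_out x d_in` and trains two low-rank factors of shapes `d_out x r` and `r x d_in`, so it trains `r * (d_in + d_out)` values instead of `d_in * d_out`. The sketch below works through that arithmetic; the hidden size of 896 and the rank of 8 are assumptions for illustration and are not taken from the notebook.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Trainable values LoRA adds for one frozen d_out x d_in weight matrix."""
    # Factor B has shape (d_out, rank); factor A has shape (rank, d_in).
    return rank * (d_in + d_out)


hidden = 896  # assumed hidden size, for illustration only
rank = 8      # hypothetical LoRA rank

full = hidden * hidden  # values in one square projection matrix
lora = lora_trainable_params(hidden, hidden, rank)

print(f"full fine-tuning: {full} values")
print(f"LoRA (rank {rank}): {lora} values ({100 * lora / full:.2f}% of full)")
```

Under these assumptions, LoRA trains 14,336 values for a matrix that holds 802,816, which is about 1.79 percent.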
The following notebook provides an example of how to fine-tune an embedding model. This example uses contrastive learning to fine-tune an embedding model, gte-large-en-v1.5, on a single A10G.
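Contrastive learning for embeddings is often implemented with an InfoNCE-style loss that uses the other items in a batch as negatives. The article does not state which loss the notebook uses, so the following is a generic sketch under that assumption, written in plain Python for readability. The `info_nce_loss` name and the temperature value are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(queries, positives, temperature=0.05):
    """Mean InfoNCE loss: positives[i] matches queries[i]; the rest are negatives."""
    losses = []
    for i, query in enumerate(queries):
        logits = [cosine(query, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Cross-entropy with the matching positive as the correct class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce_loss(queries, [[1.0, 0.0], [0.0, 1.0]])
misaligned = info_nce_loss(queries, [[0.0, 1.0], [1.0, 0.0]])
print(f"aligned pairs: {aligned:.6f}, misaligned pairs: {misaligned:.6f}")
```

The loss is close to zero when each query is most similar to its own positive and grows when a query is more similar to another item in the batch, which is the signal that pulls matching pairs together during fine-tuning.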