Manage production scorers

Important

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Azure Databricks previews.

After you set up production monitoring, you can manage your scorers throughout their lifecycle. This page covers how to list, update, stop, restart, and delete scorers.

For the full API parameter reference, see Scorer lifecycle management API reference.

Scorer lifecycle

A scorer's lifecycle is tied to an MLflow experiment. Scorers are immutable: each lifecycle operation returns a new scorer instance rather than modifying the original.

State | Description | API
Unregistered | Scorer function is defined but not known to the server. |
Registered | Scorer is registered to the active MLflow experiment. | .register()
Active | Scorer is running with a sample rate > 0. | .start()
Stopped | Scorer is registered but not running (sample rate = 0). | .stop()
Deleted | The scorer has been removed from the server and is no longer associated with the experiment. | delete_scorer()

Lifecycle example

The following example demonstrates a scorer moving through all lifecycle states:

from mlflow.genai.scorers import Safety, ScorerSamplingConfig, delete_scorer

# Register → Start → Update → Stop → Delete
safety_judge = Safety().register(name="safety_check")
safety_judge = safety_judge.start(
    sampling_config=ScorerSamplingConfig(sample_rate=1.0),
)
safety_judge = safety_judge.update(
    sampling_config=ScorerSamplingConfig(sample_rate=0.8),
)
safety_judge = safety_judge.stop()
delete_scorer(name="safety_check")

Manage scorers

The following APIs are available to manage scorers.

API | Description | Example
list_scorers() | List all registered scorers for the current experiment. | List scorers
get_scorer() | Retrieve a registered scorer by name. | Get and update a scorer
Scorer.update() | Modify the sampling configuration of an active scorer. This is an immutable operation. | Get and update a scorer
backfill_scorer() | Retroactively apply new or updated metrics to historical traces. | Backfill historical traces with scorers
delete_scorer() | Delete a registered scorer by name. | Stop and delete scorers

List scorers

To view all registered scorers for your experiment:

from mlflow.genai.scorers import list_scorers

# List all registered scorers
scorers = list_scorers()
for scorer in scorers:
    print(f"Name: {scorer.name}")
    print(f"Sample rate: {scorer.sample_rate}")
    print(f"Filter: {scorer.filter_string}")
    print("---")
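
When an experiment has many scorers, it can help to group the listing by state. The following is a minimal sketch, not part of the MLflow API: group_by_state is a hypothetical helper, and the SimpleNamespace objects stand in for the result of list_scorers(). It assumes that a missing or None sample_rate can be treated as 0, which also means a scorer that was registered but never started is reported as stopped.

```python
from types import SimpleNamespace


def group_by_state(scorers):
    """Split scorers into active (sample_rate > 0) and stopped (sample_rate == 0)."""
    groups = {"active": [], "stopped": []}
    for s in scorers:
        # Assumption: a missing or None sample rate is treated as 0.
        rate = getattr(s, "sample_rate", None) or 0.0
        key = "active" if rate > 0 else "stopped"
        groups[key].append(s.name)
    return groups


# Stand-in objects for illustration; in practice, pass the result of list_scorers().
example = [
    SimpleNamespace(name="safety_check", sample_rate=0.8),
    SimpleNamespace(name="databricks_mentions", sample_rate=0.0),
]
print(group_by_state(example))
# {'active': ['safety_check'], 'stopped': ['databricks_mentions']}
```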

Get and update a scorer

Use get_scorer() to retrieve a scorer by name, then update() to modify its configuration. Because scorers are immutable, update() returns a new instance.

from mlflow.genai.scorers import get_scorer, ScorerSamplingConfig

# Get existing scorer and update its configuration (immutable operation)
safety_judge = get_scorer(name="safety_monitor")
updated_judge = safety_judge.update(sampling_config=ScorerSamplingConfig(sample_rate=0.8))

# The original scorer remains unchanged; update() returns a new scorer instance
print(f"Original sample rate: {safety_judge.sample_rate}")  # Original rate
print(f"Updated sample rate: {updated_judge.sample_rate}")   # New rate

Stop and delete scorers

Stopping a scorer sets its sample rate to 0 but keeps it registered. Deleting a scorer removes it from the server entirely.

from mlflow.genai.scorers import get_scorer, delete_scorer, ScorerSamplingConfig

# Get existing scorer
databricks_scorer = get_scorer(name="databricks_mentions")

# Stop monitoring (sets sample_rate to 0, keeps scorer registered)
stopped_scorer = databricks_scorer.stop()
print(f"Sample rate after stop: {stopped_scorer.sample_rate}")  # 0

# Restart monitoring from a stopped scorer
restarted_scorer = stopped_scorer.start(sampling_config=ScorerSamplingConfig(sample_rate=0.5))

# Or remove scorer entirely from the server
delete_scorer(name=databricks_scorer.name)

Immutable updates

Scorers, including LLM judges, are immutable objects. Updating a scorer creates an updated copy rather than modifying the original. This immutability helps ensure that scorers meant for production are not accidentally modified.

from mlflow.genai.scorers import Safety, ScorerSamplingConfig

original_judge = Safety().register(name="safety")
original_judge = original_judge.start(
    sampling_config=ScorerSamplingConfig(sample_rate=0.3),
)

# Update returns new instance
updated_judge = original_judge.update(
    sampling_config=ScorerSamplingConfig(sample_rate=0.8),
)

# Original remains unchanged
print(f"Original: {original_judge.sample_rate}")  # 0.3
print(f"Updated: {updated_judge.sample_rate}")    # 0.8
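
To see the same pattern without a server connection, the sketch below reproduces it with a frozen dataclass. ToyScorer is a hypothetical stand-in, not MLflow's implementation, and its update() takes a bare sample rate rather than a ScorerSamplingConfig. It only illustrates why the return value has to be captured.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ToyScorer:
    """Hypothetical stand-in for a scorer; not the MLflow class."""
    name: str
    sample_rate: float = 0.0

    def update(self, sample_rate: float) -> "ToyScorer":
        # Build a modified copy instead of changing this instance.
        return replace(self, sample_rate=sample_rate)


original = ToyScorer(name="safety", sample_rate=0.3)

# Calling update() without capturing the result has no visible effect.
original.update(sample_rate=0.8)
print(original.sample_rate)  # 0.3

updated = original.update(sample_rate=0.8)
print(updated.sample_rate)   # 0.8

# A frozen dataclass rejects in-place modification.
try:
    original.sample_rate = 1.0
except FrozenInstanceError:
    print("in-place modification rejected")
```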

Best practices

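  • Check a scorer's state before operating on it by reading sample_rate: a value greater than 0 means the scorer is active, and 0 means it is stopped.
  • Use the immutable pattern. Assign the results of .start(), .update(), and .stop() to variables, because the original instance is not modified.
  • Understand the difference between .stop() (preserves registration) and delete_scorer() (removes the scorer entirely).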
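
The practices above can be combined into one small helper that reads sample_rate, picks the matching lifecycle call, and always returns the resulting instance. This is a sketch under stated assumptions: set_sample_rate, StubScorer, and StubConfig are hypothetical names introduced here, the stubs only mimic the start/update/stop interface shown on this page, and the 0 to 1 range check is an assumption about valid sample rates.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StubConfig:
    """Stand-in for ScorerSamplingConfig (hypothetical, for illustration)."""
    sample_rate: float


@dataclass(frozen=True)
class StubScorer:
    """Stand-in that mimics the start/update/stop interface described above."""
    name: str
    sample_rate: float = 0.0

    def start(self, sampling_config):
        return replace(self, sample_rate=sampling_config.sample_rate)

    def update(self, sampling_config):
        return replace(self, sample_rate=sampling_config.sample_rate)

    def stop(self):
        return replace(self, sample_rate=0.0)


def set_sample_rate(scorer, rate, make_config=StubConfig):
    """Return a scorer running at `rate`, choosing the call from its current state."""
    # Assumption: valid sample rates lie between 0 and 1 inclusive.
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"sample rate must be between 0 and 1, got {rate}")
    is_active = scorer.sample_rate > 0
    if rate == 0.0:
        # Stop an active scorer (it stays registered); leave a stopped one as is.
        return scorer.stop() if is_active else scorer
    config = make_config(sample_rate=rate)
    if is_active:
        return scorer.update(sampling_config=config)
    return scorer.start(sampling_config=config)


registered = StubScorer(name="safety_check")
started = set_sample_rate(registered, 0.5)   # stopped -> start()
adjusted = set_sample_rate(started, 0.8)     # active  -> update()
stopped = set_sample_rate(adjusted, 0.0)     # active  -> stop()
print(started.sample_rate, adjusted.sample_rate, stopped.sample_rate)
```

With a real scorer, the intent is to pass the object returned by get_scorer() and make_config=ScorerSamplingConfig. Treat that as untested against a live workspace.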

Next steps