HyperDriveConfig Class

Configuration that defines a HyperDrive run.

A HyperDrive configuration includes information about hyperparameter space sampling, the termination policy, the primary metric, the runs to resume from, the estimator, and the compute target on which to execute the experiment runs.

Initialize the HyperDriveConfig.

Inheritance
builtins.object
HyperDriveConfig

Constructor

HyperDriveConfig(hyperparameter_sampling, primary_metric_name, primary_metric_goal, max_total_runs, max_concurrent_runs=None, max_duration_minutes=10080, policy=None, estimator=None, run_config=None, resume_from=None, resume_child_runs=None, pipeline=None, debug_flag=None, custom_run_id=None)

Parameters

Name Description
estimator

An estimator that will be called with sampled hyperparameters. Specify only one of the following parameters: estimator, run_config, or pipeline.

Default value: None
hyperparameter_sampling
Required

The hyperparameter sampling space.

policy

The early termination policy to use. If None (the default), no early termination policy is used.

The MedianStoppingPolicy with delay_evaluation of 5 is a good termination policy to start with. These are conservative settings that can provide 25%-35% savings with no loss on the primary metric (based on our evaluation data).

Default value: None
primary_metric_name
Required
str

The name of the primary metric reported by the experiment runs.

primary_metric_goal
Required

Either PrimaryMetricGoal.MINIMIZE or PrimaryMetricGoal.MAXIMIZE. This parameter determines if the primary metric is to be minimized or maximized when evaluating runs.

max_total_runs
Required
int

The maximum total number of runs to create. This is the upper bound; there may be fewer runs when the sample space is smaller than this value. If both max_total_runs and max_duration_minutes are specified, the hyperparameter tuning experiment terminates when the first of these two thresholds is reached.

max_concurrent_runs
int

The maximum number of runs to execute concurrently. If None, all runs are launched in parallel. The number of concurrent runs is limited by the resources available in the specified compute target, so ensure that the compute target has enough resources for the desired concurrency.

Default value: None
max_duration_minutes
int

The maximum duration of the HyperDrive run. Once this time is exceeded, any runs still executing are cancelled. If both max_total_runs and max_duration_minutes are specified, the hyperparameter tuning experiment terminates when the first of these two thresholds is reached.

Default value: 10080
resume_from

A hyperdrive run or a list of hyperdrive runs that will be inherited as data points to warm start the new run.

Default value: None
resume_child_runs
Run or list[Run]

A hyperdrive child run or a list of hyperdrive child runs that will be resumed as new child runs of the new hyperdrive run.

Default value: None
run_config

An object for setting up configuration for script/notebook runs. Specify only one of the following parameters: estimator, run_config, or pipeline.

Default value: None
pipeline

A pipeline object for setting up configuration for pipeline runs. The pipeline object will be called with the sampled hyperparameters to submit pipeline runs. Specify only one of the following parameters: estimator, run_config, or pipeline.

Default value: None
custom_run_id
str

A custom run ID provided by the user.

Default value: None
debug_flag
Default value: None
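
The interaction between max_total_runs, max_duration_minutes, and the size of the sample space described above can be modeled in a few lines. This is a minimal sketch of the documented behavior, not the service's implementation: the function names planned_run_count and limit_reached are hypothetical, and treating a limit as reached when the value equals the threshold is an assumption. The default of 10080 minutes corresponds to seven days.

```python
def planned_run_count(sample_space_size, max_total_runs):
    """Upper bound on the number of child runs.

    sample_space_size is the number of distinct points in a finite
    sample space, or None when the space is not finite.
    Hypothetical helper; not part of the azureml SDK.
    """
    if sample_space_size is None:
        return max_total_runs
    return min(sample_space_size, max_total_runs)


def limit_reached(runs_created, elapsed_minutes,
                  max_total_runs, max_duration_minutes=10080):
    """Return True once either documented threshold has been reached."""
    return (runs_created >= max_total_runs
            or elapsed_minutes >= max_duration_minutes)


# A sample space with 3 points cannot produce 4 distinct runs.
print(planned_run_count(sample_space_size=3, max_total_runs=4))

# The run count threshold is hit first, long before the duration limit.
print(limit_reached(runs_created=4, elapsed_minutes=30, max_total_runs=4))
```

Whichever threshold is reached first ends the experiment, so a generous max_duration_minutes does not extend an experiment that has already created max_total_runs runs.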

Remarks

The example below shows how to create a HyperDriveConfig object to use for hyperparameter tuning. In the example, the primary metric name matches a value logged in the training script.


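   from azureml.train.hyperdrive import HyperDriveConfig, PrimaryMetricGoal

   # src, ps, and early_termination_policy are assumed to be defined earlier:
   # a script run configuration, a hyperparameter sampling object, and an
   # early termination policy, respectively.
   hd_config = HyperDriveConfig(run_config=src,
                                hyperparameter_sampling=ps,
                                policy=early_termination_policy,
                                primary_metric_name='validation_acc',
                                primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                                max_total_runs=4,
                                max_concurrent_runs=4)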
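
To illustrate what the primary metric goal means for comparing runs, the sketch below ranks child runs by a logged metric. It is self-contained and does not use the SDK: the Goal enum is a stand-in for PrimaryMetricGoal, best_run is a hypothetical helper, and the metric values are made up for illustration.

```python
from enum import Enum


class Goal(Enum):
    """Stand-in for PrimaryMetricGoal so the sketch runs without the SDK."""
    MAXIMIZE = "maximize"
    MINIMIZE = "minimize"


def best_run(metric_by_run, goal):
    """Return the run ID whose primary metric is best under the given goal."""
    pick = max if goal is Goal.MAXIMIZE else min
    return pick(metric_by_run, key=metric_by_run.get)


# Made-up validation_acc values for four child runs.
validation_acc = {"run_0": 0.81, "run_1": 0.87, "run_2": 0.79, "run_3": 0.84}

print(best_run(validation_acc, Goal.MAXIMIZE))
```

With a loss-style metric, the same helper would be called with Goal.MINIMIZE, which selects the run with the lowest value instead.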

For more information about working with HyperDriveConfig, see the tutorial Tune hyperparameters for your model.

Attributes

estimator

Return the estimator used in the HyperDrive run.

Value is None if the run uses a script run configuration or a pipeline.

Returns

Type Description

The estimator.

pipeline

Return the pipeline used in the HyperDrive run.

Value is None if the run uses a script run configuration or estimator.

Returns

Type Description

The pipeline.

run_config

Return the script/notebook configuration used in the HyperDrive run.

Value is None if the run uses an estimator or pipeline.

Returns

Type Description

The run configuration.
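
Because only one of estimator, run_config, or pipeline is specified at construction, two of the three attributes above are expected to be None. The sketch below shows one way to find the one that is set. It assumes only the attribute names documented here: active_run_source is a hypothetical helper, and a SimpleNamespace with a placeholder string stands in for a real HyperDriveConfig.

```python
from types import SimpleNamespace


def active_run_source(config):
    """Return (attribute_name, value) for the run source that is not None.

    Hypothetical helper; not part of the azureml SDK.
    """
    names = ("estimator", "run_config", "pipeline")
    found = [(name, getattr(config, name, None)) for name in names]
    found = [(name, value) for name, value in found if value is not None]
    if len(found) != 1:
        raise ValueError(
            "Expected exactly one of estimator, run_config, or pipeline "
            "to be set, found %d" % len(found)
        )
    return found[0]


# Stand-in for a HyperDriveConfig created with run_config=...
fake_config = SimpleNamespace(
    estimator=None, run_config="my-script-run-config", pipeline=None
)

print(active_run_source(fake_config))
```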

source_directory

Return the source directory from the config to run.

Returns

Type Description
str

The source directory.