Jobs - Create
Create Job
Creates a Job.
PUT https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Media/mediaServices/{accountName}/transforms/{transformName}/jobs/{jobName}?api-version=2022-07-01
URI Parameters
Name | In | Required | Type | Description |
---|---|---|---|---|
accountName | path | True | string | The Media Services account name. |
jobName | path | True | string | The Job name. |
resourceGroupName | path | True | string | The name of the resource group within the Azure subscription. |
subscriptionId | path | True | string | The unique identifier for a Microsoft Azure subscription. |
transformName | path | True | string | The Transform name. |
api-version | query | True | string | The version of the API to be used with the client request. |
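The following is a minimal sketch of how the path and query parameters above combine into the request URI. The function name `build_job_url` and its argument names are illustrative assumptions; only the URI template and the `api-version` value come from this page.

```python
from urllib.parse import quote, urlencode

# API version taken from the request line on this page.
API_VERSION = "2022-07-01"


def build_job_url(subscription_id, resource_group, account_name,
                  transform_name, job_name, api_version=API_VERSION):
    """Fill the URI template with percent-encoded path segments."""
    path = (
        f"/subscriptions/{quote(subscription_id, safe='')}"
        f"/resourceGroups/{quote(resource_group, safe='')}"
        f"/providers/Microsoft.Media/mediaServices/{quote(account_name, safe='')}"
        f"/transforms/{quote(transform_name, safe='')}"
        f"/jobs/{quote(job_name, safe='')}"
    )
    query = urlencode({"api-version": api_version})
    return f"https://management.azure.com{path}?{query}"


if __name__ == "__main__":
    print(build_job_url(
        "00000000-0000-0000-0000-000000000000",
        "contosoresources", "contosomedia", "exampleTransform", "job1",
    ))
```

Encoding each segment separately keeps a name containing a reserved character from altering the path structure.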
Request Body
Name | Required | Type | Description |
---|---|---|---|
properties.input | True | JobInput | The inputs for the Job. |
properties.outputs | True | JobOutput[] | The outputs for the Job. |
properties.correlationData | | object | Customer-provided key-value pairs that will be returned in Job and JobOutput state events. |
properties.description | | string | Optional customer-supplied description of the Job. |
properties.priority | | | Priority with which the job should be processed. Higher-priority jobs are processed before lower-priority jobs. If not set, the default is Normal. |
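As a rough illustration of the request body described above, the sketch below assembles the `properties` object for a Job with one input asset and one or more output assets. The helper name `build_job_body` is hypothetical; the `@odata.type` strings are the ones used in the sample request on this page. Optional properties are omitted when not supplied, so the service default applies.

```python
import json


def build_job_body(input_asset, output_assets, correlation_data=None,
                   description=None, priority=None):
    """Build a Jobs - Create request body for asset-based input and outputs."""
    if not output_assets:
        # properties.outputs is required, so reject an empty list early.
        raise ValueError("at least one output asset name is required")
    properties = {
        "input": {
            "@odata.type": "#Microsoft.Media.JobInputAsset",
            "assetName": input_asset,
        },
        "outputs": [
            {"@odata.type": "#Microsoft.Media.JobOutputAsset", "assetName": name}
            for name in output_assets
        ],
    }
    # Optional properties are only sent when the caller provides them.
    if correlation_data is not None:
        properties["correlationData"] = dict(correlation_data)
    if description is not None:
        properties["description"] = description
    if priority is not None:
        properties["priority"] = priority
    return {"properties": properties}


if __name__ == "__main__":
    body = build_job_body("job1-InputAsset", ["job1-OutputAsset"],
                          correlation_data={"key1": "value1"})
    print(json.dumps(body, indent=2))
```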
Responses
Name | Type | Description |
---|---|---|
201 Created | | Created |
Other Status Codes | | Detailed error information. |
Examples
Create a Job
Sample request
PUT https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/contosoresources/providers/Microsoft.Media/mediaServices/contosomedia/transforms/exampleTransform/jobs/job1?api-version=2022-07-01
{
"properties": {
"input": {
"@odata.type": "#Microsoft.Media.JobInputAsset",
"assetName": "job1-InputAsset"
},
"outputs": [
{
"@odata.type": "#Microsoft.Media.JobOutputAsset",
"assetName": "job1-OutputAsset"
}
],
"correlationData": {
"key1": "value1",
"Key 2": "Value 2"
}
}
}
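One possible way to prepare this request with the Python standard library is sketched below. It only constructs the request object and does not send it. The function name `prepare_create_job_request` is hypothetical, and the bearer-token `Authorization` header is an assumption about how the caller authenticates; this page does not describe authentication.

```python
import json
import urllib.request


def prepare_create_job_request(url, body, access_token):
    """Return a PUT request carrying the JSON body; nothing is sent here."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            # Assumption: the caller already holds a valid access token.
            "Authorization": f"Bearer {access_token}",
        },
    )


if __name__ == "__main__":
    request = prepare_create_job_request(
        "https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/contosoresources/providers/Microsoft.Media/mediaServices/contosomedia"
        "/transforms/exampleTransform/jobs/job1?api-version=2022-07-01",
        {"properties": {}},
        "<access-token>",
    )
    print(request.get_method(), request.full_url)
```

Passing the prepared object to `urllib.request.urlopen` would perform the call; a 201 status indicates the Job was created.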
Sample response
{
"name": "job1",
"id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/contosoresources/providers/Microsoft.Media/mediaservices/contosomedia/transforms/exampleTransform/jobs/job1",
"type": "Microsoft.Media/mediaservices/transforms/jobs",
"properties": {
"created": "2022-10-17T23:14:33.6140749Z",
"state": "Queued",
"input": {
"@odata.type": "#Microsoft.Media.JobInputAsset",
"files": [],
"inputDefinitions": [],
"assetName": "job1-InputAsset"
},
"lastModified": "2022-10-17T23:14:33.6140749Z",
"outputs": [
{
"@odata.type": "#Microsoft.Media.JobOutputAsset",
"state": "Queued",
"progress": 0,
"label": "BuiltInStandardEncoderPreset_0",
"assetName": "job1-OutputAsset"
}
],
"priority": "Normal",
"correlationData": {
"key1": "value1",
"Key 2": "Value 2"
}
},
"systemData": {
"createdBy": "[email protected]",
"createdByType": "User",
"createdAt": "2022-10-17T23:14:33.6140749Z",
"lastModifiedBy": "[email protected]",
"lastModifiedByType": "User",
"lastModifiedAt": "2022-10-17T23:14:33.6140749Z"
}
}
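A response like the one above can be reduced to the fields most callers poll for. The sketch below is illustrative only: `summarize_job` is a hypothetical helper, and it assumes the response has the shape shown in the sample (a `properties` object with `state` and an `outputs` list).

```python
import json


def summarize_job(response_text):
    """Extract the job name, overall state, and per-output progress."""
    job = json.loads(response_text)
    props = job["properties"]
    return {
        "name": job["name"],
        "state": props["state"],
        "outputs": [
            {
                "label": output.get("label"),
                "state": output["state"],
                "progress": output.get("progress", 0),
            }
            for output in props["outputs"]
        ],
    }


if __name__ == "__main__":
    sample = json.dumps({
        "name": "job1",
        "properties": {
            "state": "Queued",
            "outputs": [{"state": "Queued", "progress": 0,
                         "label": "BuiltInStandardEncoderPreset_0"}],
        },
    })
    print(summarize_job(sample))
```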
Definitions
Name | Description |
---|---|
AacAudio | Describes Advanced Audio Codec (AAC) audio encoding settings. |
AacAudioProfile | The encoding profile to be used when encoding audio with AAC. |
AbsoluteClipTime | Specifies the clip time as an absolute time position in the media file. The absolute time can point to a different position depending on whether the media file starts from a timestamp of zero or not. |
AnalysisResolution | Specifies the maximum resolution at which your video is analyzed. The default behavior is "SourceResolution," which will keep the input video at its original resolution when analyzed. Using "StandardDefinition" will resize input videos to standard definition while preserving the appropriate aspect ratio. It will only resize if the video is of higher resolution. For example, a 1920x1080 input would be scaled to 640x360 before processing. Switching to "StandardDefinition" will reduce the time it takes to process high resolution video. It may also reduce the cost of using this component (see https://azure.microsoft.com/en-us/pricing/details/media-services/#analytics for details). However, faces that end up being too small in the resized video may not be detected. |
AttributeFilter | The type of AttributeFilter to apply to the TrackAttribute in order to select the tracks. |
Audio | Defines the common properties for all audio codecs. |
AudioAnalysisMode | Determines the set of audio analysis operations to be performed. If unspecified, the Standard AudioAnalysisMode would be chosen. |
AudioAnalyzerPreset | The Audio Analyzer preset applies a pre-defined set of AI-based analysis operations, including speech transcription. Currently, the preset supports processing of content with a single audio track. |
AudioOverlay | Describes the properties of an audio overlay. |
AudioTrackDescriptor | A TrackSelection to select audio tracks. |
BlurType | Blur type |
BuiltInStandardEncoderPreset | Describes a built-in preset for encoding the input video with the Standard Encoder. |
ChannelMapping | Optional designation for single channel audio tracks. Can be used to combine the tracks into stereo or multi-channel audio tracks. |
Complexity | Allows you to configure the encoder settings to control the balance between speed and quality. Example: set Complexity as Speed for faster encoding but less compression efficiency. |
CopyAudio | A codec flag, which tells the encoder to copy the input audio bitstream. |
CopyVideo | A codec flag, which tells the encoder to copy the input video bitstream without re-encoding. |
createdByType | The type of identity that created the resource. |
DDAudio | Describes Dolby Digital Audio Codec (AC3) audio encoding settings. The current implementation of Dolby Digital Audio supports: audio channel numbers of 1 (mono), 2 (stereo), and 6 (5.1side); audio sampling frequency rates of 32K/44.1K/48K Hz; and audio bitrate values as supported by the AC3 specification: 32000, 40000, 48000, 56000, 64000, 80000, 96000, 112000, 128000, 160000, 192000, 224000, 256000, 320000, 384000, 448000, 512000, 576000, 640000 bps. |
Deinterlace | Describes the de-interlacing settings. |
DeinterlaceMode | The deinterlacing mode. Defaults to AutoPixelAdaptive. |
DeinterlaceParity | The field parity for de-interlacing, defaults to Auto. |
EncoderNamedPreset | The built-in preset to be used for encoding videos. |
EntropyMode | The entropy mode to be used for this layer. If not specified, the encoder chooses the mode that is appropriate for the profile and level. |
ErrorAdditionalInfo | The resource management error additional info. |
ErrorDetail | The error detail. |
ErrorResponse | Error response |
FaceDetectorPreset | Describes all the settings to be used when analyzing a video in order to detect (and optionally redact) all the faces present. |
FaceRedactorMode | This mode provides the ability to choose between the following settings: 1) Analyze - For detection only. This mode generates a metadata JSON file marking appearances of faces throughout the video. Where possible, appearances of the same person are assigned the same ID. 2) Combined - Additionally redacts (blurs) detected faces. 3) Redact - This enables a 2-pass process, allowing for selective redaction of a subset of detected faces. It takes in the metadata file from a prior analyze pass, along with the source video, and a user-selected subset of IDs that require redaction. |
Fade |
Describes the properties of a Fade effect applied to the input media. |
Filters |
Describes all the filtering operations, such as de-interlacing, rotation etc. that are to be applied to the input media before encoding. |
From |
An InputDefinition that looks across all of the files provided to select tracks specified by the IncludedTracks property. Generally used with the AudioTrackByAttribute and VideoTrackByAttribute to allow selection of a single track across a set of input files. |
From |
An InputDefinition that looks at each input file provided to select tracks specified by the IncludedTracks property. Generally used with the AudioTrackByAttribute and VideoTrackByAttribute to select tracks from each file given. |
H264Complexity |
Tells the encoder how to choose its encoding settings. The default value is Balanced. |
H264Layer |
Describes the settings to be used when encoding the input video into a desired output bitrate layer with the H.264 video codec. |
H264Rate |
The video rate control mode |
H264Video |
Describes all the properties for encoding a video with the H.264 codec. |
H264Video |
We currently support Baseline, Main, High, High422, High444. Default is Auto. |
H265Complexity |
Tells the encoder how to choose its encoding settings. Quality will provide for a higher compression ratio but at a higher cost and longer compute time. Speed will produce a relatively larger file but is faster and more economical. The default value is Balanced. |
H265Layer |
Describes the settings to be used when encoding the input video into a desired output bitrate layer with the H.265 video codec. |
H265Video |
Describes all the properties for encoding a video with the H.265 codec. |
H265Video |
We currently support Main. Default is Auto. |
Image |
Describes the basic properties for generating thumbnails from the input video |
Image |
Describes the properties for an output image file. |
Input |
An InputDefinition for a single file. TrackSelections are scoped to the file specified. |
Insights |
Defines the type of insights that you want the service to generate. The allowed values are 'AudioInsightsOnly', 'VideoInsightsOnly', and 'AllInsights'. The default is AllInsights. If you set this to AllInsights and the input is audio only, then only audio insights are generated. Similarly if the input is video only, then only video insights are generated. It is recommended that you not use AudioInsightsOnly if you expect some of your inputs to be video only; or use VideoInsightsOnly if you expect some of your inputs to be audio only. Your Jobs in such conditions would error out. |
Interleave |
Sets the interleave mode of the output to control how audio and video are stored in the container format. Example: set InterleavedOutput as NonInterleavedOutput to produce audio-only and video-only outputs in separate MP4 files. |
Job |
A Job resource type. The progress and state can be obtained by polling a Job or subscribing to events using EventGrid. |
Job |
Details of JobOutput errors. |
Job |
Helps with categorization of errors. |
Job |
Error code describing the error. |
Job |
Details of JobOutput errors. |
Job |
Represents an Asset for input into a Job. |
Job |
Represents input files for a Job. |
Job |
Represents HTTPS job input. |
Job |
Describes a list of inputs to a Job. |
Job |
A Sequence contains an ordered list of Clips where each clip is a JobInput. The Sequence will be treated as a single input. |
Job |
Represents an Asset used as a JobOutput. |
Job |
Indicates that it may be possible to retry the Job. If retry is unsuccessful, please contact Azure support via Azure Portal. |
Job |
Describes the state of the JobOutput. |
Jpg |
Describes the settings for producing JPEG thumbnails. |
Jpg |
Describes the properties for producing a series of JPEG images from the input video. |
Jpg |
Describes the settings to produce a JPEG image from the input video. |
Mp4Format |
Describes the properties for an output ISO MP4 file. |
Multi |
Describes the properties for producing a collection of GOP aligned multi-bitrate files. The default behavior is to produce one output file for each video layer which is muxed together with all the audios. The exact output files produced can be controlled by specifying the outputFiles collection. |
Output |
Represents an output file produced. |
Png |
Describes the settings for producing PNG thumbnails. |
Png |
Describes the properties for producing a series of PNG images from the input video. |
Png |
Describes the settings to produce a PNG image from the input video. |
Preset |
An object of optional configuration settings for encoder. |
Priority |
Sets the relative priority of the TransformOutputs within a Transform. This sets the priority that the service uses for processing TransformOutputs. The default priority is Normal. |
Rectangle |
Describes the properties of a rectangular window applied to the input media before processing it. |
Rotation |
The rotation, if any, to be applied to the input video, before it is encoded. Default is Auto |
Select |
Select audio tracks from the input by specifying an attribute and an attribute filter. |
Select |
Select audio tracks from the input by specifying a track identifier. |
Select |
Select video tracks from the input by specifying an attribute and an attribute filter. |
Select |
Select video tracks from the input by specifying a track identifier. |
Standard |
Describes all the settings to be used when encoding the input video with the Standard Encoder. |
Stretch |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize |
system |
Metadata pertaining to creation and last modification of the resource. |
Track |
The TrackAttribute to filter the tracks by. |
Transport |
Describes the properties for generating an MPEG-2 Transport Stream (ISO/IEC 13818-1) output video file(s). |
Utc |
Specifies the clip time as a Utc time position in the media file. The Utc time can point to a different position depending on whether the media file starts from a timestamp of zero or not. |
Video |
Describes the basic properties for encoding the input video. |
Video |
A video analyzer preset that extracts insights (rich metadata) from both audio and video, and outputs a JSON format file. |
Video |
Describes the properties of a video overlay. |
Video |
The Video Sync Mode |
Video |
A TrackSelection to select video tracks. |
AacAudio
Describes Advanced Audio Codec (AAC) audio encoding settings.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
bitrate |
integer |
The bitrate, in bits per second, of the output encoded audio. |
channels |
integer |
The number of channels in the audio. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
profile |
The encoding profile to be used when encoding audio with AAC. |
|
samplingRate |
integer |
The sampling rate to use for encoding in hertz. |
AacAudioProfile
The encoding profile to be used when encoding audio with AAC.
Name | Type | Description |
---|---|---|
AacLc |
string |
Specifies that the output audio is to be encoded into AAC Low Complexity profile (AAC-LC). |
HeAacV1 |
string |
Specifies that the output audio is to be encoded into HE-AAC v1 profile. |
HeAacV2 |
string |
Specifies that the output audio is to be encoded into HE-AAC v2 profile. |
AbsoluteClipTime
Specifies the clip time as an absolute time position in the media file. The absolute time can point to a different position depending on whether the media file starts from a timestamp of zero or not.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
time |
string |
The time position on the timeline of the input media. It is usually specified as an ISO 8601 duration, e.g., PT30S for 30 seconds. |
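To illustrate the `time` value, the sketch below converts a whole number of seconds into an ISO 8601 duration string such as PT30S. The helper names are hypothetical, and the `@odata.type` string `#Microsoft.Media.AbsoluteClipTime` is an assumption modeled on the type names in the sample request; the discriminator value is truncated in the table above.

```python
def seconds_to_iso8601(total_seconds):
    """Format non-negative whole seconds as an ISO 8601 duration (PT...)."""
    if total_seconds < 0:
        raise ValueError("duration must not be negative")
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    text = "PT"
    if hours:
        text += f"{hours}H"
    if minutes:
        text += f"{minutes}M"
    # Always emit a seconds component when nothing else was written.
    if seconds or text == "PT":
        text += f"{seconds}S"
    return text


def absolute_clip_time(total_seconds):
    """Build a clip-time object; the type string is an assumed value."""
    return {
        "@odata.type": "#Microsoft.Media.AbsoluteClipTime",  # assumption
        "time": seconds_to_iso8601(total_seconds),
    }


if __name__ == "__main__":
    print(absolute_clip_time(30))
```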
AnalysisResolution
Specifies the maximum resolution at which your video is analyzed. The default behavior is "SourceResolution," which will keep the input video at its original resolution when analyzed. Using "StandardDefinition" will resize input videos to standard definition while preserving the appropriate aspect ratio. It will only resize if the video is of higher resolution. For example, a 1920x1080 input would be scaled to 640x360 before processing. Switching to "StandardDefinition" will reduce the time it takes to process high resolution video. It may also reduce the cost of using this component (see https://azure.microsoft.com/en-us/pricing/details/media-services/#analytics for details). However, faces that end up being too small in the resized video may not be detected.
Name | Type | Description |
---|---|---|
SourceResolution |
string |
|
StandardDefinition |
string |
AttributeFilter
The type of AttributeFilter to apply to the TrackAttribute in order to select the tracks.
Name | Type | Description |
---|---|---|
All |
string |
All tracks will be included. |
Bottom |
string |
The first track will be included when the attribute is sorted in ascending order. Generally used to select the smallest bitrate. |
Top |
string |
The first track will be included when the attribute is sorted in descending order. Generally used to select the largest bitrate. |
ValueEquals |
string |
Any tracks that have an attribute equal to the value given will be included. |
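The selection rules in this table can be mimicked locally to reason about which tracks a filter would pick. The sketch below is a client-side illustration only, not the service's implementation; the function `select_tracks` and the dictionary-based track records are hypothetical.

```python
def select_tracks(tracks, attribute, filter_type, value=None):
    """Apply an AttributeFilter-style rule to a list of track dictionaries."""
    if filter_type == "All":
        return list(tracks)
    if filter_type == "Top":
        # First track when sorted in descending order (e.g. largest bitrate).
        return sorted(tracks, key=lambda t: t[attribute], reverse=True)[:1]
    if filter_type == "Bottom":
        # First track when sorted in ascending order (e.g. smallest bitrate).
        return sorted(tracks, key=lambda t: t[attribute])[:1]
    if filter_type == "ValueEquals":
        return [t for t in tracks if t[attribute] == value]
    raise ValueError(f"unknown filter type: {filter_type}")


if __name__ == "__main__":
    audio_tracks = [
        {"id": 1, "bitrate": 128000},
        {"id": 2, "bitrate": 64000},
        {"id": 3, "bitrate": 256000},
    ]
    print(select_tracks(audio_tracks, "bitrate", "Top"))
    print(select_tracks(audio_tracks, "bitrate", "Bottom"))
```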
Audio
Defines the common properties for all audio codecs.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
bitrate |
integer |
The bitrate, in bits per second, of the output encoded audio. |
channels |
integer |
The number of channels in the audio. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
samplingRate |
integer |
The sampling rate to use for encoding in hertz. |
AudioAnalysisMode
Determines the set of audio analysis operations to be performed. If unspecified, the Standard AudioAnalysisMode would be chosen.
Name | Type | Description |
---|---|---|
Basic |
string |
This mode performs speech-to-text transcription and generation of a VTT subtitle/caption file. The output of this mode includes an Insights JSON file including only the keywords, transcription, and timing information. Automatic language detection and speaker diarization are not included in this mode. |
Standard |
string |
Performs all operations included in the Basic mode, additionally performing language detection and speaker diarization. |
AudioAnalyzerPreset
The Audio Analyzer preset applies a pre-defined set of AI-based analysis operations, including speech transcription. Currently, the preset supports processing of content with a single audio track.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
audioLanguage |
string |
The language for the audio payload in the input using the BCP-47 format of 'language tag-region' (e.g., 'en-US'). If you know the language of your content, it is recommended that you specify it. The language must be specified explicitly for AudioAnalysisMode::Basic, since automatic language detection is not included in basic mode. If the language isn't specified or set to null, automatic language detection will choose the first language detected and process with the selected language for the duration of the file. It does not currently support dynamically switching between languages after the first language is detected. The automatic detection works best with audio recordings with clearly discernible speech. If automatic detection fails to find the language, transcription falls back to 'en-US'. The list of supported languages is available here: https://go.microsoft.com/fwlink/?linkid=2109463 |
experimentalOptions |
object |
Dictionary containing key-value pairs for parameters not exposed in the preset itself. |
mode |
Determines the set of audio analysis operations to be performed. If unspecified, the Standard AudioAnalysisMode would be chosen. |
AudioOverlay
Describes the properties of an audio overlay.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
audioGainLevel |
number |
The gain level of audio in the overlay. The value should be in the range [0, 1.0]. The default is 1.0. |
end |
string |
The end position, with reference to the input video, at which the overlay ends. The value should be in ISO 8601 format. For example, PT30S to end the overlay at 30 seconds into the input video. If not specified or the value is greater than the input video duration, the overlay will be applied until the end of the input video if the overlay media duration is greater than the input video duration, else the overlay will last as long as the overlay media duration. |
fadeInDuration |
string |
The duration over which the overlay fades in onto the input video. The value should be in ISO 8601 duration format. If not specified the default behavior is to have no fade in (same as PT0S). |
fadeOutDuration |
string |
The duration over which the overlay fades out of the input video. The value should be in ISO 8601 duration format. If not specified the default behavior is to have no fade out (same as PT0S). |
inputLabel |
string |
The label of the job input which is to be used as an overlay. The Input must specify exactly one file. You can specify an image file in JPG, PNG, GIF or BMP format, or an audio file (such as a WAV, MP3, WMA or M4A file), or a video file. See https://aka.ms/mesformats for the complete list of supported audio and video file formats. |
start |
string |
The start position, with reference to the input video, at which the overlay starts. The value should be in ISO 8601 format. For example, PT05S to start the overlay at 5 seconds into the input video. If not specified the overlay starts from the beginning of the input video. |
AudioTrackDescriptor
A TrackSelection to select audio tracks.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
channelMapping |
Optional designation for single channel audio tracks. Can be used to combine the tracks into stereo or multi-channel audio tracks. |
BlurType
Blur type
Name | Type | Description |
---|---|---|
Black |
string |
Black: Black out filter |
Box |
string |
Box: debug filter, bounding box only |
High |
string |
High: Confuse blur filter |
Low |
string |
Low: box-car blur filter |
Med |
string |
Med: Gaussian blur filter |
BuiltInStandardEncoderPreset
Describes a built-in preset for encoding the input video with the Standard Encoder.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
configurations |
Optional configuration settings for the encoder. Configurations are supported only for the ContentAwareEncoding and H265ContentAwareEncoding BuiltInStandardEncoderPreset values. |
|
presetName |
The built-in preset to be used for encoding videos. |
ChannelMapping
Optional designation for single channel audio tracks. Can be used to combine the tracks into stereo or multi-channel audio tracks.
Name | Type | Description |
---|---|---|
BackLeft |
string |
The Back Left Channel. Sometimes referred to as the Left Surround Channel. |
BackRight |
string |
The Back Right Channel. Sometimes referred to as the Right Surround Channel. |
Center |
string |
The Center Channel. |
FrontLeft |
string |
The Front Left Channel. |
FrontRight |
string |
The Front Right Channel. |
LowFrequencyEffects |
string |
Low Frequency Effects Channel. Sometimes referred to as the subwoofer. |
StereoLeft |
string |
The Left Stereo channel. Sometimes referred to as Down Mix Left. |
StereoRight |
string |
The Right Stereo channel. Sometimes referred to as Down Mix Right. |
Complexity
Allows you to configure the encoder settings to control the balance between speed and quality. Example: set Complexity as Speed for faster encoding but less compression efficiency.
Name | Type | Description |
---|---|---|
Balanced |
string |
Configures the encoder to use settings that achieve a balance between speed and quality. |
Quality |
string |
Configures the encoder to use settings optimized to produce higher quality output at the expense of slower overall encode time. |
Speed |
string |
Configures the encoder to use settings optimized for faster encoding. Quality is sacrificed to decrease encoding time. |
CopyAudio
A codec flag, which tells the encoder to copy the input audio bitstream.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
CopyVideo
A codec flag, which tells the encoder to copy the input video bitstream without re-encoding.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
createdByType
The type of identity that created the resource.
Name | Type | Description |
---|---|---|
Application |
string |
|
Key |
string |
|
ManagedIdentity |
string |
|
User |
string |
DDAudio
Describes Dolby Digital Audio Codec (AC3) audio encoding settings. The current implementation of Dolby Digital Audio supports: audio channel numbers of 1 (mono), 2 (stereo), and 6 (5.1side); audio sampling frequency rates of 32K/44.1K/48K Hz; and audio bitrate values as supported by the AC3 specification: 32000, 40000, 48000, 56000, 64000, 80000, 96000, 112000, 128000, 160000, 192000, 224000, 256000, 320000, 384000, 448000, 512000, 576000, 640000 bps.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
bitrate |
integer |
The bitrate, in bits per second, of the output encoded audio. |
channels |
integer |
The number of channels in the audio. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
samplingRate |
integer |
The sampling rate to use for encoding in hertz. |
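The supported values listed in the DDAudio description lend themselves to a client-side check before a job is submitted. The sketch below is one way to do that; the function name `validate_dd_audio` and the constant names are hypothetical, while the allowed values are copied from the description above.

```python
# Allowed values copied from the DDAudio description.
DD_CHANNELS = {1, 2, 6}
DD_SAMPLING_RATES_HZ = {32000, 44100, 48000}
DD_BITRATES_BPS = {
    32000, 40000, 48000, 56000, 64000, 80000, 96000, 112000, 128000, 160000,
    192000, 224000, 256000, 320000, 384000, 448000, 512000, 576000, 640000,
}


def validate_dd_audio(channels, sampling_rate, bitrate):
    """Return a list of problems; an empty list means the settings look valid."""
    problems = []
    if channels not in DD_CHANNELS:
        problems.append(f"unsupported channel count: {channels}")
    if sampling_rate not in DD_SAMPLING_RATES_HZ:
        problems.append(f"unsupported sampling rate: {sampling_rate} Hz")
    if bitrate not in DD_BITRATES_BPS:
        problems.append(f"unsupported bitrate: {bitrate} bps")
    return problems


if __name__ == "__main__":
    print(validate_dd_audio(2, 48000, 192000))
    print(validate_dd_audio(3, 22050, 100000))
```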
Deinterlace
Describes the de-interlacing settings.
Name | Type | Description |
---|---|---|
mode |
The deinterlacing mode. Defaults to AutoPixelAdaptive. |
|
parity |
The field parity for de-interlacing, defaults to Auto. |
DeinterlaceMode
The deinterlacing mode. Defaults to AutoPixelAdaptive.
Name | Type | Description |
---|---|---|
AutoPixelAdaptive |
string |
Apply automatic pixel adaptive de-interlacing on each frame in the input video. |
Off |
string |
Disables de-interlacing of the source video. |
DeinterlaceParity
The field parity for de-interlacing, defaults to Auto.
Name | Type | Description |
---|---|---|
Auto |
string |
Automatically detect the order of fields. |
BottomFieldFirst |
string |
Apply bottom field first processing of input video. |
TopFieldFirst |
string |
Apply top field first processing of input video. |
EncoderNamedPreset
The built-in preset to be used for encoding videos.
Name | Type | Description |
---|---|---|
AACGoodQualityAudio |
string |
Produces a single MP4 file containing only AAC stereo audio encoded at 192 kbps. |
AdaptiveStreaming |
string |
Produces a set of GOP aligned MP4 files with H.264 video and stereo AAC audio. Auto-generates a bitrate ladder based on the input resolution, bitrate and frame rate. The auto-generated preset will never exceed the input resolution. For example, if the input is 720p, output will remain 720p at best. |
ContentAwareEncoding |
string |
Produces a set of GOP-aligned MP4s by using content-aware encoding. Given any input content, the service performs an initial lightweight analysis of the input content, and uses the results to determine the optimal number of layers, appropriate bitrate and resolution settings for delivery by adaptive streaming. This preset is particularly effective for low and medium complexity videos, where the output files will be at lower bitrates but at a quality that still delivers a good experience to viewers. The output will contain MP4 files with video and audio interleaved. |
ContentAwareEncodingExperimental |
string |
Exposes an experimental preset for content-aware encoding. Given any input content, the service attempts to automatically determine the optimal number of layers, appropriate bitrate and resolution settings for delivery by adaptive streaming. The underlying algorithms will continue to evolve over time. The output will contain MP4 files with video and audio interleaved. |
CopyAllBitrateNonInterleaved |
string |
Copy all video and audio streams from the input asset as non-interleaved video and audio output files. This preset can be used to clip an existing asset or convert a group of key-frame-aligned (GOP-aligned) MP4 files into an asset that can be streamed. |
DDGoodQualityAudio |
string |
Produces a single MP4 file containing only DD (Dolby Digital) stereo audio encoded at 192 kbps. |
H264MultipleBitrate1080p |
string |
Produces a set of 8 GOP-aligned MP4 files, ranging from 6000 kbps to 400 kbps, and stereo AAC audio. Resolution starts at 1080p and goes down to 180p. |
H264MultipleBitrate720p |
string |
Produces a set of 6 GOP-aligned MP4 files, ranging from 3400 kbps to 400 kbps, and stereo AAC audio. Resolution starts at 720p and goes down to 180p. |
H264MultipleBitrateSD |
string |
Produces a set of 5 GOP-aligned MP4 files, ranging from 1900 kbps to 400 kbps, and stereo AAC audio. Resolution starts at 480p and goes down to 240p. |
H264SingleBitrate1080p |
string |
Produces an MP4 file where the video is encoded with H.264 codec at 6750 kbps and a picture height of 1080 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. |
H264SingleBitrate720p |
string |
Produces an MP4 file where the video is encoded with H.264 codec at 4500 kbps and a picture height of 720 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. |
H264SingleBitrateSD |
string |
Produces an MP4 file where the video is encoded with H.264 codec at 2200 kbps and a picture height of 480 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. |
H265AdaptiveStreaming |
string |
Produces a set of GOP aligned MP4 files with H.265 video and stereo AAC audio. Auto-generates a bitrate ladder based on the input resolution, bitrate and frame rate. The auto-generated preset will never exceed the input resolution. For example, if the input is 720p, output will remain 720p at best. |
H265ContentAwareEncoding |
string |
Produces a set of GOP-aligned MP4s by using content-aware encoding. Given any input content, the service performs an initial lightweight analysis of the input content, and uses the results to determine the optimal number of layers, appropriate bitrate and resolution settings for delivery by adaptive streaming. This preset is particularly effective for low and medium complexity videos, where the output files will be at lower bitrates but at a quality that still delivers a good experience to viewers. The output will contain MP4 files with video and audio interleaved. |
H265SingleBitrate1080p |
string |
Produces an MP4 file where the video is encoded with H.265 codec at 3500 kbps and a picture height of 1080 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. |
H265SingleBitrate4K |
string |
Produces an MP4 file where the video is encoded with H.265 codec at 9500 kbps and a picture height of 2160 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. |
H265SingleBitrate720p |
string |
Produces an MP4 file where the video is encoded with H.265 codec at 1800 kbps and a picture height of 720 pixels, and the stereo audio is encoded with AAC-LC codec at 128 kbps. |
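The single-bitrate presets in this table differ only in codec, picture height, and video bitrate, so they can be captured in a small lookup. The sketch below picks the tallest single-bitrate preset that does not exceed the input height. That selection policy and the name `pick_single_bitrate_preset` are illustrative assumptions, not behavior of the service; the preset names, heights, and bitrates come from the table above.

```python
# (picture height in pixels, preset name, video bitrate in kbps), tallest first.
SINGLE_BITRATE_PRESETS = {
    "H264": [
        (1080, "H264SingleBitrate1080p", 6750),
        (720, "H264SingleBitrate720p", 4500),
        (480, "H264SingleBitrateSD", 2200),
    ],
    "H265": [
        (2160, "H265SingleBitrate4K", 9500),
        (1080, "H265SingleBitrate1080p", 3500),
        (720, "H265SingleBitrate720p", 1800),
    ],
}


def pick_single_bitrate_preset(codec, input_height):
    """Return the tallest preset whose height does not exceed the input."""
    candidates = SINGLE_BITRATE_PRESETS[codec]
    for height, name, _bitrate_kbps in candidates:
        if height <= input_height:
            return name
    # Input is smaller than every preset: fall back to the smallest one.
    return candidates[-1][1]


if __name__ == "__main__":
    print(pick_single_bitrate_preset("H264", 900))
    print(pick_single_bitrate_preset("H265", 2160))
```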
EntropyMode
The entropy mode to be used for this layer. If not specified, the encoder chooses the mode that is appropriate for the profile and level.
Name | Type | Description |
---|---|---|
Cabac |
string |
Context Adaptive Binary Arithmetic Coder (CABAC) entropy encoding. |
Cavlc |
string |
Context Adaptive Variable Length Coder (CAVLC) entropy encoding. |
ErrorAdditionalInfo
The resource management error additional info.
Name | Type | Description |
---|---|---|
info |
object |
The additional info. |
type |
string |
The additional info type. |
ErrorDetail
The error detail.
Name | Type | Description |
---|---|---|
additionalInfo |
The error additional info. |
|
code |
string |
The error code. |
details |
The error details. |
|
message |
string |
The error message. |
target |
string |
The error target. |
ErrorResponse
Error response
Name | Type | Description |
---|---|---|
error |
The error object. |
FaceDetectorPreset
Describes all the settings to be used when analyzing a video in order to detect (and optionally redact) all the faces present.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
blurType |
Blur type |
|
experimentalOptions |
object |
Dictionary containing key value pairs for parameters not exposed in the preset itself |
mode |
This mode provides the ability to choose between the following settings: 1) Analyze - For detection only. This mode generates a metadata JSON file marking appearances of faces throughout the video. Where possible, appearances of the same person are assigned the same ID. 2) Combined - Additionally redacts (blurs) detected faces. 3) Redact - This enables a 2-pass process, allowing for selective redaction of a subset of detected faces. It takes in the metadata file from a prior analyze pass, along with the source video, and a user-selected subset of IDs that require redaction. |
|
resolution |
Specifies the maximum resolution at which your video is analyzed. The default behavior is "SourceResolution," which will keep the input video at its original resolution when analyzed. Using "StandardDefinition" will resize input videos to standard definition while preserving the appropriate aspect ratio. It will only resize if the video is of higher resolution. For example, a 1920x1080 input would be scaled to 640x360 before processing. Switching to "StandardDefinition" will reduce the time it takes to process high resolution video. It may also reduce the cost of using this component (see https://azure.microsoft.com/en-us/pricing/details/media-services/#analytics for details). However, faces that end up being too small in the resized video may not be detected. |
FaceRedactorMode
This mode provides the ability to choose between the following settings: 1) Analyze - For detection only. This mode generates a metadata JSON file marking appearances of faces throughout the video. Where possible, appearances of the same person are assigned the same ID. 2) Combined - Additionally redacts (blurs) detected faces. 3) Redact - This enables a 2-pass process, allowing for selective redaction of a subset of detected faces. It takes in the metadata file from a prior analyze pass, along with the source video, and a user-selected subset of IDs that require redaction.
Name | Type | Description |
---|---|---|
Analyze |
string |
Analyze mode detects faces and outputs a metadata file with the results. Allows editing of the metadata file before faces are blurred with Redact mode. |
Combined |
string |
Combined mode does the Analyze and Redact steps in one pass when editing the analyzed faces is not desired. |
Redact |
string |
Redact mode consumes the metadata file from Analyze mode and redacts the faces found. |
Fade
Describes the properties of a Fade effect applied to the input media.
Name | Type | Description |
---|---|---|
duration |
string |
The duration of the fade effect in the video. The value can be in ISO 8601 format (for example, PT05S to fade a color in or out over 5 seconds), a frame count (for example, 10 to fade over 10 frames from the start time), or a value relative to the stream duration (for example, 10% to fade over 10% of the stream duration). |
fadeColor |
string |
The color for the fade in or out. It can be one of the CSS Level 1 colors (https://developer.mozilla.org/en-US/docs/Web/CSS/color_value/color_keywords) or an RGB/hex value, for example rgb(255,0,0), 0xFF0000, or #FF0000. |
start |
string |
The position in the input video at which to start the fade. The value can be in ISO 8601 format (for example, PT05S to start at 5 seconds), a frame count (for example, 10 to start at the 10th frame), or a value relative to the stream duration (for example, 10% to start at 10% of the stream duration). The default is 0. |
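The start and duration fields accept three value forms: an ISO 8601 duration, a frame count, or a percentage of the stream duration. The following is a minimal sketch of how a client might tell them apart before submitting a request. The helper name classify_position and the simplified duration pattern are assumptions made for illustration; they are not part of the service API.

```python
import re

# Patterns for the three documented value forms. The ISO 8601 pattern is a
# simplified assumption that covers only time-based durations such as PT05S.
_ISO_DURATION = re.compile(r"^PT(?=\d)(?:\d+H)?(?:\d+M)?(?:\d+(?:\.\d+)?S)?$")
_FRAME_COUNT = re.compile(r"^\d+$")
_PERCENT = re.compile(r"^\d+(?:\.\d+)?%$")


def classify_position(value: str) -> str:
    """Return 'duration', 'frames', or 'percent' for a Fade start/duration value."""
    if _ISO_DURATION.match(value):
        return "duration"
    if _FRAME_COUNT.match(value):
        return "frames"
    if _PERCENT.match(value):
        return "percent"
    raise ValueError(f"unrecognized position value: {value!r}")


# A Fade object that uses the three properties from the table above.
fade_in = {"start": "0", "duration": "PT05S", "fadeColor": "#FF0000"}
kinds = {key: classify_position(fade_in[key]) for key in ("start", "duration")}
print(kinds)
```

A check like this only catches malformed strings early; the service remains the authority on which values it accepts.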
Filters
Describes all the filtering operations, such as de-interlacing, rotation etc. that are to be applied to the input media before encoding.
Name | Type | Description |
---|---|---|
crop |
The parameters for the rectangular window with which to crop the input video. |
|
deinterlace |
The de-interlacing settings. |
|
fadeIn |
Describes the properties of a Fade effect applied to the input media. |
|
fadeOut |
Describes the properties of a Fade effect applied to the input media. |
|
overlays | Overlay[]: |
The properties of overlays to be applied to the input video. These could be audio, image or video overlays. |
rotation |
The rotation, if any, to be applied to the input video before it is encoded. The default is Auto. |
FromAllInputFile
An InputDefinition that looks across all of the files provided to select tracks specified by the IncludedTracks property. Generally used with the AudioTrackByAttribute and VideoTrackByAttribute to allow selection of a single track across a set of input files.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
includedTracks | TrackDescriptor[]: |
The list of TrackDescriptors which define the metadata and selection of tracks in the input. |
FromEachInputFile
An InputDefinition that looks at each input file provided to select tracks specified by the IncludedTracks property. Generally used with the AudioTrackByAttribute and VideoTrackByAttribute to select tracks from each file given.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
includedTracks | TrackDescriptor[]: |
The list of TrackDescriptors which define the metadata and selection of tracks in the input. |
H264Complexity
Tells the encoder how to choose its encoding settings. The default value is Balanced.
Name | Type | Description |
---|---|---|
Balanced |
string |
Tells the encoder to use settings that achieve a balance between speed and quality. |
Quality |
string |
Tells the encoder to use settings that are optimized to produce higher quality output at the expense of slower overall encode time. |
Speed |
string |
Tells the encoder to use settings that are optimized for faster encoding. Quality is sacrificed to decrease encoding time. |
H264Layer
Describes the settings to be used when encoding the input video into a desired output bitrate layer with the H.264 video codec.
Name | Type | Description |
---|---|---|
adaptiveBFrame |
boolean |
Whether or not adaptive B-frames are to be used when encoding this layer. If not specified, the encoder will turn it on whenever the video profile permits its use. |
bFrames |
integer |
The number of B-frames to be used when encoding this layer. If not specified, the encoder chooses an appropriate number based on the video profile and level. |
bitrate |
integer |
The average bitrate in bits per second at which to encode the input video when generating this layer. This is a required field. |
bufferWindow |
string |
The VBV buffer window length. The value should be in ISO 8601 format. The value should be in the range [0.1-100] seconds. The default is 5 seconds (for example, PT5S). |
crf |
number |
The value of CRF to be used when encoding this layer. This setting takes effect when RateControlMode of video codec is set at CRF mode. The range of CRF value is between 0 and 51, where lower values would result in better quality, at the expense of higher file sizes. Higher values mean more compression, but at some point quality degradation will be noticed. Default value is 23. |
entropyMode |
The entropy mode to be used for this layer. If not specified, the encoder chooses the mode that is appropriate for the profile and level. |
|
frameRate |
string |
The frame rate (in frames per second) at which to encode this layer. The value can be in the form of M/N where M and N are integers (For example, 30000/1001), or in the form of a number (For example, 30, or 29.97). The encoder enforces constraints on allowed frame rates based on the profile and level. If it is not specified, the encoder will use the same frame rate as the input video. |
height |
string |
The height of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example, 50% means the output video has half as many pixels in height as the input. |
label |
string |
The alphanumeric label for this layer, which can be used in multiplexing different video and audio layers, or in naming the output file. |
level |
string |
We currently support Level up to 6.2. The value can be Auto, or a number that matches the H.264 profile. If not specified, the default is Auto, which lets the encoder choose the Level that is appropriate for this layer. |
maxBitrate |
integer |
The maximum bitrate (in bits per second), at which the VBV buffer should be assumed to refill. If not specified, defaults to the same value as bitrate. |
profile |
We currently support Baseline, Main, High, High422, High444. Default is Auto. |
|
referenceFrames |
integer |
The number of reference frames to be used when encoding this layer. If not specified, the encoder determines an appropriate number based on the encoder complexity setting. |
slices |
integer |
The number of slices to be used when encoding this layer. If not specified, the default is zero, which means that the encoder will use a single slice for each frame. |
width |
string |
The width of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example, 50% means the output video has half as many pixels in width as the input. |
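As an illustration of the H264Layer properties above, the following sketch builds layer objects as plain dictionaries. The helper make_h264_layer is hypothetical. It enforces only the one documented requirement (bitrate is required and expressed in bits per second) and leaves optional fields out so that the service can apply its documented defaults.

```python
from typing import Any, Dict, Optional


def make_h264_layer(bitrate: int, height: str, width: str,
                    max_bitrate: Optional[int] = None,
                    label: Optional[str] = None) -> Dict[str, Any]:
    """Build an H264Layer dict; bitrate is required and given in bits per second."""
    if bitrate <= 0:
        raise ValueError("bitrate is required and must be positive (bits per second)")
    layer: Dict[str, Any] = {"bitrate": bitrate, "height": height, "width": width}
    # Optional fields are omitted when unset so the service applies its defaults
    # (for example, maxBitrate defaults to the same value as bitrate).
    if max_bitrate is not None:
        layer["maxBitrate"] = max_bitrate
    if label is not None:
        layer["label"] = label
    return layer


layers = [
    # Absolute dimensions in pixels.
    make_h264_layer(3_000_000, "1080", "1920", label="hd"),
    # Relative dimensions: half the input height and width.
    make_h264_layer(1_000_000, "50%", "50%", max_bitrate=1_500_000, label="half"),
]
print(layers)
```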
H264RateControlMode
The video rate control mode
Name | Type | Description |
---|---|---|
ABR |
string |
Average Bitrate (ABR) mode that hits the target bitrate: Default mode. |
CBR |
string |
Constant Bitrate (CBR) mode that tightens bitrate variations around target bitrate. |
CRF |
string |
Constant Rate Factor (CRF) mode that targets constant subjective quality. |
H264Video
Describes all the properties for encoding a video with the H.264 codec.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
complexity |
Tells the encoder how to choose its encoding settings. The default value is Balanced. |
|
keyFrameInterval |
string |
The distance between two key frames. The value should be non-zero in the range [0.5, 20] seconds, specified in ISO 8601 format. The default is 2 seconds (PT2S). Note that this setting is ignored if VideoSyncMode.Passthrough is set, where the KeyFrameInterval value will follow the input source setting. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
layers |
The collection of output H.264 layers to be produced by the encoder. |
|
rateControlMode |
The video rate control mode |
|
sceneChangeDetection |
boolean |
Whether or not the encoder should insert key frames at scene changes. If not specified, the default is false. This flag should be set to true only when the encoder is being configured to produce a single output video. |
stretchMode |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize |
|
syncMode |
The Video Sync Mode |
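The keyFrameInterval description above gives a range of [0.5, 20] seconds in ISO 8601 format. A client-side check along those lines might look like the sketch below. The function name is hypothetical, and the pattern deliberately accepts only seconds-only durations such as PT2S, which is a simplification of ISO 8601.

```python
import re

# Simplified assumption: only seconds-only durations such as PT2S or PT0.5S.
_SECONDS = re.compile(r"^PT(\d+(?:\.\d+)?)S$")


def key_frame_interval_seconds(value: str) -> float:
    """Parse a seconds-only ISO 8601 duration and check the documented range."""
    match = _SECONDS.match(value)
    if match is None:
        raise ValueError(f"expected a duration such as PT2S, got {value!r}")
    seconds = float(match.group(1))
    if not 0.5 <= seconds <= 20:
        raise ValueError(f"keyFrameInterval must be within [0.5, 20] seconds, got {seconds}")
    return seconds


print(key_frame_interval_seconds("PT2S"))
```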
H264VideoProfile
We currently support Baseline, Main, High, High422, High444. Default is Auto.
Name | Type | Description |
---|---|---|
Auto |
string |
Tells the encoder to automatically determine the appropriate H.264 profile. |
Baseline |
string |
Baseline profile |
High |
string |
High profile. |
High422 |
string |
High 4:2:2 profile. |
High444 |
string |
High 4:4:4 predictive profile. |
Main |
string |
Main profile |
H265Complexity
Tells the encoder how to choose its encoding settings. Quality will provide for a higher compression ratio but at a higher cost and longer compute time. Speed will produce a relatively larger file but is faster and more economical. The default value is Balanced.
Name | Type | Description |
---|---|---|
Balanced |
string |
Tells the encoder to use settings that achieve a balance between speed and quality. |
Quality |
string |
Tells the encoder to use settings that are optimized to produce higher quality output at the expense of slower overall encode time. |
Speed |
string |
Tells the encoder to use settings that are optimized for faster encoding. Quality is sacrificed to decrease encoding time. |
H265Layer
Describes the settings to be used when encoding the input video into a desired output bitrate layer with the H.265 video codec.
Name | Type | Description |
---|---|---|
adaptiveBFrame |
boolean |
Specifies whether or not adaptive B-frames are to be used when encoding this layer. If not specified, the encoder will turn it on whenever the video profile permits its use. |
bFrames |
integer |
The number of B-frames to be used when encoding this layer. If not specified, the encoder chooses an appropriate number based on the video profile and level. |
bitrate |
integer |
The average bitrate in bits per second at which to encode the input video when generating this layer. For example, a target bitrate of 3000 kbps (3 Mbps) means this value should be 3000000. This is a required field. |
bufferWindow |
string |
The VBV buffer window length. The value should be in ISO 8601 format. The value should be in the range [0.1-100] seconds. The default is 5 seconds (for example, PT5S). |
crf |
number |
The value of CRF to be used when encoding this layer. This setting takes effect when RateControlMode of video codec is set at CRF mode. The range of CRF value is between 0 and 51, where lower values would result in better quality, at the expense of higher file sizes. Higher values mean more compression, but at some point quality degradation will be noticed. Default value is 28. |
frameRate |
string |
The frame rate (in frames per second) at which to encode this layer. The value can be in the form of M/N where M and N are integers (For example, 30000/1001), or in the form of a number (For example, 30, or 29.97). The encoder enforces constraints on allowed frame rates based on the profile and level. If it is not specified, the encoder will use the same frame rate as the input video. |
height |
string |
The height of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example, 50% means the output video has half as many pixels in height as the input. |
label |
string |
The alphanumeric label for this layer, which can be used in multiplexing different video and audio layers, or in naming the output file. |
level |
string |
We currently support Level up to 6.2. The value can be Auto, or a number that matches the H.265 profile. If not specified, the default is Auto, which lets the encoder choose the Level that is appropriate for this layer. |
maxBitrate |
integer |
The maximum bitrate (in bits per second), at which the VBV buffer should be assumed to refill. If not specified, defaults to the same value as bitrate. |
profile |
We currently support Main. Default is Auto. |
|
referenceFrames |
integer |
The number of reference frames to be used when encoding this layer. If not specified, the encoder determines an appropriate number based on the encoder complexity setting. |
slices |
integer |
The number of slices to be used when encoding this layer. If not specified, the default is zero, which means that the encoder will use a single slice for each frame. |
width |
string |
The width of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example, 50% means the output video has half as many pixels in width as the input. |
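The frameRate property accepts either an M/N form (such as 30000/1001) or a plain number (such as 30 or 29.97). The sketch below shows one way to normalize both forms on the client side with Python's fractions module. The function name parse_frame_rate is an assumption for illustration.

```python
from fractions import Fraction


def parse_frame_rate(value: str) -> Fraction:
    """Parse a frameRate string given as 'M/N' or as a plain number."""
    # Fraction accepts both '30000/1001' and decimal strings such as '29.97'.
    rate = Fraction(value)
    if rate <= 0:
        raise ValueError(f"frame rate must be positive, got {value!r}")
    return rate


ntsc = parse_frame_rate("30000/1001")
print(ntsc, float(ntsc))
```

Keeping the value as an exact fraction avoids rounding 30000/1001 to 29.97, which is a slightly different rate.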
H265Video
Describes all the properties for encoding a video with the H.265 codec.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
complexity |
Tells the encoder how to choose its encoding settings. Quality will provide for a higher compression ratio but at a higher cost and longer compute time. Speed will produce a relatively larger file but is faster and more economical. The default value is Balanced. |
|
keyFrameInterval |
string |
The distance between two key frames. The value should be non-zero in the range [0.5, 20] seconds, specified in ISO 8601 format. The default is 2 seconds (PT2S). Note that this setting is ignored if VideoSyncMode.Passthrough is set, where the KeyFrameInterval value will follow the input source setting. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
layers |
The collection of output H.265 layers to be produced by the encoder. |
|
sceneChangeDetection |
boolean |
Specifies whether or not the encoder should insert key frames at scene changes. If not specified, the default is false. This flag should be set to true only when the encoder is being configured to produce a single output video. |
stretchMode |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize |
|
syncMode |
The Video Sync Mode |
H265VideoProfile
We currently support Main. Default is Auto.
Name | Type | Description |
---|---|---|
Auto |
string |
Tells the encoder to automatically determine the appropriate H.265 profile. |
Main |
string |
Main profile (https://x265.readthedocs.io/en/default/cli.html?highlight=profile#profile-level-tier) |
Main10 |
string |
Main 10 profile (https://en.wikipedia.org/wiki/High_Efficiency_Video_Coding#Main_10) |
Image
Describes the basic properties for generating thumbnails from the input video
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
keyFrameInterval |
string |
The distance between two key frames. The value should be non-zero in the range [0.5, 20] seconds, specified in ISO 8601 format. The default is 2 seconds (PT2S). Note that this setting is ignored if VideoSyncMode.Passthrough is set, where the KeyFrameInterval value will follow the input source setting. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
range |
string |
The position relative to transform preset start time in the input video at which to stop generating thumbnails. The value can be in ISO 8601 format (For example, PT5M30S to stop at 5 minutes and 30 seconds from start time), or a frame count (For example, 300 to stop at the 300th frame from the frame at start time. If this value is 1, it means only producing one thumbnail at start time), or a relative value to the stream duration (For example, 50% to stop at half of stream duration from start time). The default value is 100%, which means to stop at the end of the stream. |
start |
string |
The position in the input video from where to start generating thumbnails. The value can be in ISO 8601 format (For example, PT05S to start at 5 seconds), or a frame count (For example, 10 to start at the 10th frame), or a relative value to stream duration (For example, 10% to start at 10% of stream duration). Also supports a macro {Best}, which tells the encoder to select the best thumbnail from the first few seconds of the video and will only produce one thumbnail, no matter what other settings are for Step and Range. The default value is macro {Best}. |
step |
string |
The intervals at which thumbnails are generated. The value can be in ISO 8601 format (for example, PT05S for one image every 5 seconds), a frame count (for example, 30 for one image every 30 frames), or a value relative to the stream duration (for example, 10% for one image every 10% of the stream duration). Note: the Step value affects the first generated thumbnail, which may not be exactly the one specified at the transform preset start time. This is because the encoder tries to select the best thumbnail between the start time and the Step position from the start time as the first output. Because the default value is 10%, the first generated thumbnail of a long stream might be far from the one specified at the start time. Select a reasonable Step value if the first thumbnail is expected to be close to the start time, or set Range to 1 if only one thumbnail is needed at the start time. |
stretchMode |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize |
|
syncMode |
The Video Sync Mode |
ImageFormat
Describes the properties for an output image file.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filenamePattern |
string |
The file naming pattern used for the creation of output files. The following macros are supported in the file name: {Basename} - An expansion macro that uses the name of the input video file. If the base name (the file suffix is not included) of the input video file is less than 32 characters long, the base name of the input video file is used. If the base name of the input video file exceeds 32 characters, it is truncated to the first 32 characters. {Extension} - The appropriate extension for this format. {Label} - The label assigned to the codec/layer. {Index} - A unique index for thumbnails. Only applicable to thumbnails. {AudioStream} - The string "Audio" plus the audio stream number (starting from 1). {Bitrate} - The audio/video bitrate in kbps. Not applicable to thumbnails. {Codec} - The type of the audio/video codec. {Resolution} - The video resolution. Any unsubstituted macros will be collapsed and removed from the filename. |
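To make the macro rules in filenamePattern concrete, the following sketch imitates the documented expansion on the client side: {Basename} is the input file name without its suffix, limited to 32 characters, and macros without a value are removed. This is an illustration of the description only, not the service's implementation. The function name, the sample pattern, and the assumption that {Extension} includes the leading dot are all hypothetical.

```python
import re


def expand_filename_pattern(pattern: str, input_filename: str, **macros: str) -> str:
    """Expand {Macro} placeholders following the filenamePattern description."""
    # {Basename}: input file name without its suffix, truncated to 32 characters.
    basename = input_filename.rsplit(".", 1)[0][:32]
    values = {"Basename": basename, **macros}
    # Macros without a supplied value collapse to an empty string.
    return re.sub(r"\{(\w+)\}", lambda m: values.get(m.group(1), ""), pattern)


# Assumption: {Extension} expands to a value that includes the leading dot.
name = expand_filename_pattern(
    "Thumbnail-{Basename}-{Index}{Extension}",
    "holiday_video.mp4",
    Index="0001",
    Extension=".png",
)
print(name)
```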
InputFile
An InputDefinition for a single file. TrackSelections are scoped to the file specified.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filename |
string |
Name of the file that this input definition applies to. |
includedTracks | TrackDescriptor[]: |
The list of TrackDescriptors which define the metadata and selection of tracks in the input. |
InsightsType
Defines the type of insights that you want the service to generate. The allowed values are 'AudioInsightsOnly', 'VideoInsightsOnly', and 'AllInsights'. The default is AllInsights. If you set this to AllInsights and the input is audio only, then only audio insights are generated. Similarly, if the input is video only, then only video insights are generated. It is recommended that you not use AudioInsightsOnly if you expect some of your inputs to be video only, or VideoInsightsOnly if you expect some of your inputs to be audio only. Jobs in such conditions would fail with an error.
Name | Type | Description |
---|---|---|
AllInsights |
string |
Generate both audio and video insights. Fails if either audio or video Insights fail. |
AudioInsightsOnly |
string |
Generate audio only insights. Ignore video even if present. Fails if no audio is present. |
VideoInsightsOnly |
string |
Generate video only insights. Ignore audio if present. Fails if no video is present. |
InterleaveOutput
Sets the interleave mode of the output to control how audio and video are stored in the container format. Example: set InterleaveOutput to NonInterleavedOutput to produce audio-only and video-only outputs in separate MP4 files.
Name | Type | Description |
---|---|---|
InterleavedOutput |
string |
The output includes both audio and video. |
NonInterleavedOutput |
string |
The output is video-only or audio-only. |
Job
A Job resource type. The progress and state can be obtained by polling a Job or subscribing to events using EventGrid.
Name | Type | Description |
---|---|---|
id |
string |
Fully qualified resource ID for the resource. Ex - /subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/{resourceProviderNamespace}/{resourceType}/{resourceName} |
name |
string |
The name of the resource |
properties.correlationData |
object |
Customer provided key, value pairs that will be returned in Job and JobOutput state events. |
properties.created |
string |
The UTC date and time when the customer has created the Job, in 'YYYY-MM-DDThh:mm:ssZ' format. |
properties.description |
string |
Optional customer supplied description of the Job. |
properties.endTime |
string |
The UTC date and time at which this Job finished processing. |
properties.input | JobInput: |
The inputs for the Job. |
properties.lastModified |
string |
The UTC date and time when the customer has last updated the Job, in 'YYYY-MM-DDThh:mm:ssZ' format. |
properties.outputs | JobOutput[]: |
The outputs for the Job. |
properties.priority |
Priority with which the job should be processed. Higher priority jobs are processed before lower priority jobs. If not set, the default is normal. |
|
properties.startTime |
string |
The UTC date and time at which this Job began processing. |
properties.state |
The current state of the job. |
|
systemData |
The system metadata relating to this resource. |
|
type |
string |
The type of the resource. E.g. "Microsoft.Compute/virtualMachines" or "Microsoft.Storage/storageAccounts" |
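The Job description notes that progress and state can be obtained by polling. The sketch below shows one possible polling loop. It takes a caller-supplied function that returns the Job resource as a dictionary, so it makes no assumption about the HTTP client in use. The function name wait_for_job, the polling interval, and the set of terminal state names (taken here from the JobState enumeration as Finished, Error, and Canceled) are assumptions of this example.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal values of properties.state (from the JobState enumeration).
TERMINAL_STATES = {"Finished", "Error", "Canceled"}


def wait_for_job(get_job: Callable[[], Dict[str, Any]],
                 interval_seconds: float = 10.0,
                 max_polls: int = 60) -> Dict[str, Any]:
    """Call get_job until properties.state is terminal, then return the Job."""
    for _ in range(max_polls):
        job = get_job()
        if job["properties"]["state"] in TERMINAL_STATES:
            return job
        time.sleep(interval_seconds)
    raise TimeoutError("job did not reach a terminal state in time")


# Stand-in for a GET on the Job resource; replace with a real HTTP call.
_responses = iter(["Queued", "Processing", "Finished"])
job = wait_for_job(lambda: {"properties": {"state": next(_responses)}},
                   interval_seconds=0)
print(job["properties"]["state"])
```

For long-running jobs, subscribing to events is usually the lighter option, since polling issues a request on every interval.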
JobError
Details of JobOutput errors.
Name | Type | Description |
---|---|---|
category |
Helps with categorization of errors. |
|
code |
Error code describing the error. |
|
details |
An array of details about specific errors that led to this reported error. |
|
message |
string |
A human-readable language-dependent representation of the error. |
retry |
Indicates that it may be possible to retry the Job. If retry is unsuccessful, please contact Azure support via Azure Portal. |
JobErrorCategory
Helps with categorization of errors.
Name | Type | Description |
---|---|---|
Account |
string |
The error is related to account information. |
Configuration |
string |
The error is configuration related. |
Content |
string |
The error is related to data in the input files. |
Download |
string |
The error is download related. |
Service |
string |
The error is service related. |
Upload |
string |
The error is upload related. |
JobErrorCode
Error code describing the error.
Name | Type | Description |
---|---|---|
ConfigurationUnsupported |
string |
There was a problem with the combination of input files and the configuration settings applied. Fix the configuration settings and retry with the same input, or change the input to match the configuration. |
ContentMalformed |
string |
There was a problem with the input content (for example, zero-byte files or corrupt/non-decodable files). Check the input files. |
ContentUnsupported |
string |
There was a problem with the format of the input (not a valid media file, or an unsupported file/codec). Check the validity of the input files. |
DownloadNotAccessible |
string |
While trying to download the input files, the files were not accessible. Check the availability of the source. |
DownloadTransientError |
string |
While trying to download the input files, there was an issue during transfer (storage service or network errors). See the details and check your source. |
IdentityUnsupported |
string |
There was an error verifying the account identity. Check and fix the identity configuration and retry. If unsuccessful, contact support. |
ServiceError |
string |
Fatal service error. Contact support. |
ServiceTransientError |
string |
Transient error. Retry the Job; if the retry is unsuccessful, contact support. |
UploadNotAccessible |
string |
While trying to upload the output files, the destination was not reachable. Check the availability of the destination. |
UploadTransientError |
string |
While trying to upload the output files, there was an issue during transfer (storage service or network errors). See the details and check your destination. |
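The error codes above suggest different follow-up actions. The sketch below maps each code to a short suggested action derived from its description. The mapping, the action strings, and the function name suggested_action are assumptions made for illustration. In particular, treating every code with "Transient" in its name as retryable is a simplification; the retry property of JobError remains the authoritative signal.

```python
from typing import Any, Dict

# Suggested follow-up per JobErrorCode, paraphrased from the descriptions above.
# Treating all transient codes as retryable is an assumption of this sketch.
ACTIONS = {
    "ConfigurationUnsupported": "fix configuration or input",
    "ContentMalformed": "check input files",
    "ContentUnsupported": "check input files",
    "DownloadNotAccessible": "check source availability",
    "DownloadTransientError": "retry",
    "IdentityUnsupported": "fix identity configuration",
    "ServiceError": "contact support",
    "ServiceTransientError": "retry",
    "UploadNotAccessible": "check destination availability",
    "UploadTransientError": "retry",
}


def suggested_action(error: Dict[str, Any]) -> str:
    """Return a suggested follow-up for a JobError-shaped dict."""
    return ACTIONS.get(error.get("code", ""), "inspect error details")


print(suggested_action({"code": "ServiceTransientError", "message": "example"}))
```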
JobErrorDetail
Details of JobOutput errors.
Name | Type | Description |
---|---|---|
code |
string |
Code describing the error detail. |
message |
string |
A human-readable representation of the error. |
JobInputAsset
Represents an Asset for input into a Job.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
assetName |
string |
The name of the input Asset. |
end | ClipTime: |
Defines a point on the timeline of the input media at which processing will end. Defaults to the end of the input media. |
files |
string[] |
List of files. Required for JobInputHttp. Maximum of 4000 characters each. Query strings will not be returned in service responses to prevent sensitive data exposure. |
inputDefinitions | InputDefinition[]: |
Defines a list of InputDefinitions. For each InputDefinition, it defines a list of track selections and related metadata. |
label |
string |
A label that is assigned to a JobInputClip, that is used to satisfy a reference used in the Transform. For example, a Transform can be authored so as to take an image file with the label 'xyz' and apply it as an overlay onto the input video before it is encoded. When submitting a Job, exactly one of the JobInputs should be the image file, and it should have the label 'xyz'. |
start | ClipTime: |
Defines a point on the timeline of the input media at which processing will start. Defaults to the beginning of the input media. |
JobInputClip
Represents input files for a Job.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
end | ClipTime: |
Defines a point on the timeline of the input media at which processing will end. Defaults to the end of the input media. |
files |
string[] |
List of files. Required for JobInputHttp. Maximum of 4000 characters each. Query strings will not be returned in service responses to prevent sensitive data exposure. |
inputDefinitions | InputDefinition[]: |
Defines a list of InputDefinitions. For each InputDefinition, it defines a list of track selections and related metadata. |
label |
string |
A label that is assigned to a JobInputClip, that is used to satisfy a reference used in the Transform. For example, a Transform can be authored so as to take an image file with the label 'xyz' and apply it as an overlay onto the input video before it is encoded. When submitting a Job, exactly one of the JobInputs should be the image file, and it should have the label 'xyz'. |
start | ClipTime: |
Defines a point on the timeline of the input media at which processing will start. Defaults to the beginning of the input media. |
JobInputHttp
Represents HTTPS job input.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
baseUri |
string |
Base URI for HTTPS job input. It will be concatenated with provided file names. If no base uri is given, then the provided file list is assumed to be fully qualified uris. Maximum length of 4000 characters. The query strings will not be returned in service responses to prevent sensitive data exposure. |
end | ClipTime: |
Defines a point on the timeline of the input media at which processing will end. Defaults to the end of the input media. |
files |
string[] |
List of files. Required for JobInputHttp. Maximum of 4000 characters each. Query strings will not be returned in service responses to prevent sensitive data exposure. |
inputDefinitions | InputDefinition[]: |
Defines a list of InputDefinitions. For each InputDefinition, it defines a list of track selections and related metadata. |
label |
string |
A label that is assigned to a JobInputClip, that is used to satisfy a reference used in the Transform. For example, a Transform can be authored so as to take an image file with the label 'xyz' and apply it as an overlay onto the input video before it is encoded. When submitting a Job, exactly one of the JobInputs should be the image file, and it should have the label 'xyz'. |
start | ClipTime: |
Defines a point on the timeline of the input media at which processing will start. Defaults to the beginning of the input media. |
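The baseUri and files properties of JobInputHttp combine as described above: when a base URI is given, it is concatenated with each file name; otherwise each entry in files is treated as a fully qualified URI. The sketch below mirrors that rule on the client side. The function name and the example host are hypothetical, and the dictionary omits @odata.type for brevity.

```python
from typing import Any, Dict, List

MAX_LENGTH = 4000  # documented limit for baseUri and for each file entry


def resolve_input_urls(job_input: Dict[str, Any]) -> List[str]:
    """Return the URL implied by baseUri and files for each input file."""
    base_uri = job_input.get("baseUri", "")
    files = job_input.get("files", [])
    for value in [base_uri, *files]:
        if len(value) > MAX_LENGTH:
            raise ValueError("baseUri and each file entry are limited to 4000 characters")
    # With no baseUri, each entry is assumed to already be a fully qualified URI.
    return [base_uri + name for name in files]


http_input = {
    "baseUri": "https://www.contoso.com/media/",  # hypothetical host
    "files": ["intro.mp4", "main.mp4"],
}
print(resolve_input_urls(http_input))
```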
JobInputs
Describes a list of inputs to a Job.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
inputs | JobInput[]: |
List of inputs to a Job. |
JobInputSequence
A Sequence contains an ordered list of Clips where each clip is a JobInput. The Sequence will be treated as a single input.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
inputs | JobInputClip[]: |
JobInputs that make up the timeline. |
JobOutputAsset
Represents an Asset used as a JobOutput.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
assetName |
string |
The name of the output Asset. |
endTime |
string |
The UTC date and time at which this Job Output finished processing. |
error |
If the JobOutput is in the Error state, it contains the details of the error. |
|
label |
string |
A label that is assigned to a JobOutput in order to help uniquely identify it. This is useful when your Transform has more than one TransformOutput, whereby your Job has more than one JobOutput. In such cases, when you submit the Job, you will add two or more JobOutputs, in the same order as TransformOutputs in the Transform. Subsequently, when you retrieve the Job, either through events or on a GET request, you can use the label to easily identify the JobOutput. If a label is not provided, a default value of '{presetName}_{outputIndex}' will be used, where the preset name is the name of the preset in the corresponding TransformOutput and the output index is the relative index of this JobOutput within the Job. Note that this index is the same as the relative index of the corresponding TransformOutput within its Transform. |
presetOverride | Preset: |
A preset used to override the preset in the corresponding transform output. |
progress |
integer |
If the JobOutput is in a Processing state, this contains the Job completion percentage. The value is an estimate and not intended to be used to predict Job completion times. To determine if the JobOutput is complete, use the State property. |
startTime |
string |
The UTC date and time at which this Job Output began processing. |
state |
Describes the state of the JobOutput. |
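The default label rule described for JobOutputAsset can be sketched as follows. This is a minimal Python illustration. The function names and the preset names `PresetA` and `PresetB` are hypothetical, and the index is assumed to be zero-based, which the text above does not state.

```python
def default_output_label(preset_name: str, output_index: int) -> str:
    """Mirror the documented default of '{presetName}_{outputIndex}'."""
    return f"{preset_name}_{output_index}"

def find_output(outputs, label):
    """Return the first JobOutput dict whose label matches, or None."""
    for output in outputs:
        if output.get("label") == label:
            return output
    return None

# Hypothetical outputs as they might appear in a retrieved Job.
outputs = [
    {"assetName": "job1-OutputAsset", "label": default_output_label("PresetA", 0)},
    {"assetName": "job1-Thumbnails", "label": default_output_label("PresetB", 1)},
]
thumbnails = find_output(outputs, "PresetB_1")
print(thumbnails["assetName"])
```

Setting an explicit label on each JobOutput at submission time avoids depending on the generated default.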
JobRetry
Indicates that it may be possible to retry the Job. If retry is unsuccessful, please contact Azure support via Azure Portal.
Name | Type | Description |
---|---|---|
DoNotRetry |
string |
Issue needs to be investigated and then the job resubmitted with corrections or retried once the underlying issue has been corrected. |
MayRetry |
string |
Issue may be resolved after waiting for a period of time and resubmitting the same Job. |
JobState
Describes the state of the JobOutput.
Name | Type | Description |
---|---|---|
Canceled |
string |
The job was canceled. This is a final state for the job. |
Canceling |
string |
The job is in the process of being canceled. This is a transient state for the job. |
Error |
string |
The job has encountered an error. This is a final state for the job. |
Finished |
string |
The job is finished. This is a final state for the job. |
Processing |
string |
The job is processing. This is a transient state for the job. |
Queued |
string |
The job is in a queued state, waiting for resources to become available. This is a transient state. |
Scheduled |
string |
The job is being scheduled to run on an available resource. This is a transient state, between queued and processing states. |
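A client that polls a Job usually needs to know whether a state is final. The sketch below groups the states exactly as the table above describes them. The function name `is_final` is hypothetical.

```python
# States the table describes as final.
FINAL_STATES = {"Canceled", "Error", "Finished"}
# States the table describes as transient.
TRANSIENT_STATES = {"Canceling", "Processing", "Queued", "Scheduled"}

def is_final(state: str) -> bool:
    """Return True when no further state changes are expected."""
    if state in FINAL_STATES:
        return True
    if state in TRANSIENT_STATES:
        return False
    raise ValueError(f"unknown JobState: {state!r}")

for state in ("Queued", "Scheduled", "Processing", "Finished"):
    print(state, "final" if is_final(state) else "transient")
```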
JpgFormat
Describes the settings for producing JPEG thumbnails.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filenamePattern |
string |
The file naming pattern used for the creation of output files. The following macros are supported in the file name: {Basename} - An expansion macro that will use the name of the input video file. If the base name (the file suffix is not included) of the input video file is less than 32 characters long, the base name of the input video file will be used. If the length of the base name of the input video file exceeds 32 characters, the base name is truncated to the first 32 characters in total length. {Extension} - The appropriate extension for this format. {Label} - The label assigned to the codec/layer. {Index} - A unique index for thumbnails. Only applicable to thumbnails. {AudioStream} - The string "Audio" plus the audio stream number (starting from 1). {Bitrate} - The audio/video bitrate in kbps. Not applicable to thumbnails. {Codec} - The type of the audio/video codec. {Resolution} - The video resolution. Any unsubstituted macros will be collapsed and removed from the filename. |
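To make the macro rules concrete, here is a local approximation in Python. It is only a sketch of the documented rules, not the service's implementation: `expand_pattern` is a hypothetical helper, the example pattern is illustrative, and it assumes that {Extension} expands to a value that includes the leading dot.

```python
import re

def expand_pattern(pattern: str, values: dict) -> str:
    """Approximate macro expansion for a filenamePattern (illustrative only)."""
    values = dict(values)
    if "Basename" in values:
        # Base names longer than 32 characters are truncated to the first 32.
        values["Basename"] = values["Basename"][:32]

    def substitute(match):
        # Macros with no value are removed from the file name.
        return str(values.get(match.group(1), ""))

    return re.sub(r"\{(\w+)\}", substitute, pattern)

name = expand_pattern(
    "Thumbnail-{Basename}-{Index}{Bitrate}{Extension}",
    {"Basename": "a" * 40, "Index": 3, "Extension": ".jpg"},
)
print(name)
```

Here {Bitrate} has no value, so it disappears from the result, which matches the note that unsubstituted macros are removed.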
JpgImage
Describes the properties for producing a series of JPEG images from the input video.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
keyFrameInterval |
string |
The distance between two key frames. The value should be non-zero in the range [0.5, 20] seconds, specified in ISO 8601 format. The default is 2 seconds (PT2S). Note that this setting is ignored if VideoSyncMode.Passthrough is set, where the KeyFrameInterval value will follow the input source setting. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
layers |
Jpg |
A collection of output JPEG image layers to be produced by the encoder. |
range |
string |
The position relative to transform preset start time in the input video at which to stop generating thumbnails. The value can be in ISO 8601 format (For example, PT5M30S to stop at 5 minutes and 30 seconds from start time), or a frame count (For example, 300 to stop at the 300th frame from the frame at start time. If this value is 1, it means only producing one thumbnail at start time), or a relative value to the stream duration (For example, 50% to stop at half of stream duration from start time). The default value is 100%, which means to stop at the end of the stream. |
spriteColumn |
integer |
Sets the number of columns used in the thumbnail sprite image. The number of rows is calculated automatically, and a VTT file is generated with the coordinate mappings for each thumbnail in the sprite. Note: this value should be a positive integer, chosen so that the output image resolution does not exceed the JPEG maximum pixel resolution limit of 65535x65535. |
start |
string |
The position in the input video from where to start generating thumbnails. The value can be in ISO 8601 format (For example, PT05S to start at 5 seconds), or a frame count (For example, 10 to start at the 10th frame), or a relative value to stream duration (For example, 10% to start at 10% of stream duration). Also supports a macro {Best}, which tells the encoder to select the best thumbnail from the first few seconds of the video and will only produce one thumbnail, no matter what other settings are for Step and Range. The default value is macro {Best}. |
step |
string |
The intervals at which thumbnails are generated. The value can be in ISO 8601 format (For example, PT05S for one image every 5 seconds), or a frame count (For example, 30 for one image every 30 frames), or a relative value to stream duration (For example, 10% for one image every 10% of stream duration). Note: The Step value affects the first generated thumbnail, which may not be exactly the one specified at the transform preset start time. This is because the encoder tries to select the best thumbnail between the start time and the Step position from the start time as the first output. Because the default value is 10%, if the stream has a long duration, the first generated thumbnail might be far from the one specified at the start time. Select a reasonable value for Step if the first thumbnail is expected to be close to the start time, or set the Range value to 1 if only one thumbnail is needed at the start time. |
stretchMode |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize |
|
syncMode |
The Video Sync Mode |
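The start, step, and range properties each accept several value forms. The Python sketch below classifies a value by form before it is sent. The function name is hypothetical, and the duration pattern is a simplification that only covers time-based ISO 8601 durations such as PT5M30S.

```python
import re

def classify_position(value: str) -> str:
    """Classify a start/step/range value by its documented form."""
    if value == "{Best}":
        return "macro"  # documented for start only
    if re.fullmatch(r"\d+(\.\d+)?%", value):
        return "percent"  # relative to the stream duration
    if re.fullmatch(r"\d+", value):
        return "frames"  # a frame count
    # Simplified ISO 8601 duration check: hours, minutes, seconds only.
    if re.fullmatch(r"PT(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?", value):
        return "duration"
    raise ValueError(f"unrecognized value: {value!r}")

for sample in ("{Best}", "PT05S", "PT5M30S", "300", "50%"):
    print(sample, "->", classify_position(sample))
```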
JpgLayer
Describes the settings to produce a JPEG image from the input video.
Name | Type | Description |
---|---|---|
height |
string |
The height of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example 50% means the output video has half as many pixels in height as the input. |
label |
string |
The alphanumeric label for this layer, which can be used in multiplexing different video and audio layers, or in naming the output file. |
quality |
integer |
The compression quality of the JPEG output. Range is from 0-100 and the default is 70. |
width |
string |
The width of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example 50% means the output video has half as many pixels in width as the input. |
Mp4Format
Describes the properties for an output ISO MP4 file.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filenamePattern |
string |
The file naming pattern used for the creation of output files. The following macros are supported in the file name: {Basename} - An expansion macro that will use the name of the input video file. If the base name (the file suffix is not included) of the input video file is less than 32 characters long, the base name of the input video file will be used. If the length of the base name of the input video file exceeds 32 characters, the base name is truncated to the first 32 characters in total length. {Extension} - The appropriate extension for this format. {Label} - The label assigned to the codec/layer. {Index} - A unique index for thumbnails. Only applicable to thumbnails. {AudioStream} - The string "Audio" plus the audio stream number (starting from 1). {Bitrate} - The audio/video bitrate in kbps. Not applicable to thumbnails. {Codec} - The type of the audio/video codec. {Resolution} - The video resolution. Any unsubstituted macros will be collapsed and removed from the filename. |
outputFiles |
The list of output files to produce. Each entry in the list is a set of audio and video layer labels to be muxed together. |
MultiBitrateFormat
Describes the properties for producing a collection of GOP aligned multi-bitrate files. The default behavior is to produce one output file for each video layer which is muxed together with all the audios. The exact output files produced can be controlled by specifying the outputFiles collection.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filenamePattern |
string |
The file naming pattern used for the creation of output files. The following macros are supported in the file name: {Basename} - An expansion macro that will use the name of the input video file. If the base name (the file suffix is not included) of the input video file is less than 32 characters long, the base name of the input video file will be used. If the length of the base name of the input video file exceeds 32 characters, the base name is truncated to the first 32 characters in total length. {Extension} - The appropriate extension for this format. {Label} - The label assigned to the codec/layer. {Index} - A unique index for thumbnails. Only applicable to thumbnails. {AudioStream} - The string "Audio" plus the audio stream number (starting from 1). {Bitrate} - The audio/video bitrate in kbps. Not applicable to thumbnails. {Codec} - The type of the audio/video codec. {Resolution} - The video resolution. Any unsubstituted macros will be collapsed and removed from the filename. |
outputFiles |
The list of output files to produce. Each entry in the list is a set of audio and video layer labels to be muxed together. |
OutputFile
Represents an output file produced.
Name | Type | Description |
---|---|---|
labels |
string[] |
The list of labels that describe how the encoder should multiplex video and audio into an output file. For example, if the encoder is producing two video layers with labels v1 and v2, and one audio layer with label a1, then an array like '[v1, a1]' tells the encoder to produce an output file with the video track represented by v1 and the audio track represented by a1. |
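The v1/v2/a1 example above might translate into a format definition like the one below. This is a hedged Python sketch: the discriminator `#Microsoft.Media.Mp4Format` follows the naming pattern of the sample request and is an assumption, the filename pattern is illustrative, and `check_output_files` is a hypothetical local helper.

```python
def check_output_files(output_files, known_labels):
    """Return labels referenced by outputFiles that no codec or layer defines."""
    missing = []
    for entry in output_files:
        for label in entry["labels"]:
            if label not in known_labels and label not in missing:
                missing.append(label)
    return missing

mp4_format = {
    "@odata.type": "#Microsoft.Media.Mp4Format",  # assumed discriminator value
    "filenamePattern": "Video-{Basename}-{Label}-{Bitrate}{Extension}",
    "outputFiles": [
        {"labels": ["v1", "a1"]},  # first file: video layer v1 with audio a1
        {"labels": ["v2", "a1"]},  # second file: video layer v2 with audio a1
    ],
}

# Labels assumed to be defined on the codecs and layers of the same preset.
missing_labels = check_output_files(mp4_format["outputFiles"], {"v1", "v2", "a1"})
print(missing_labels)
```

Checking labels locally before submitting can catch a mistyped label early.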
PngFormat
Describes the settings for producing PNG thumbnails.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filenamePattern |
string |
The file naming pattern used for the creation of output files. The following macros are supported in the file name: {Basename} - An expansion macro that will use the name of the input video file. If the base name (the file suffix is not included) of the input video file is less than 32 characters long, the base name of the input video file will be used. If the length of the base name of the input video file exceeds 32 characters, the base name is truncated to the first 32 characters in total length. {Extension} - The appropriate extension for this format. {Label} - The label assigned to the codec/layer. {Index} - A unique index for thumbnails. Only applicable to thumbnails. {AudioStream} - The string "Audio" plus the audio stream number (starting from 1). {Bitrate} - The audio/video bitrate in kbps. Not applicable to thumbnails. {Codec} - The type of the audio/video codec. {Resolution} - The video resolution. Any unsubstituted macros will be collapsed and removed from the filename. |
PngImage
Describes the properties for producing a series of PNG images from the input video.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
keyFrameInterval |
string |
The distance between two key frames. The value should be non-zero in the range [0.5, 20] seconds, specified in ISO 8601 format. The default is 2 seconds (PT2S). Note that this setting is ignored if VideoSyncMode.Passthrough is set, where the KeyFrameInterval value will follow the input source setting. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
layers |
Png |
A collection of output PNG image layers to be produced by the encoder. |
range |
string |
The position relative to transform preset start time in the input video at which to stop generating thumbnails. The value can be in ISO 8601 format (For example, PT5M30S to stop at 5 minutes and 30 seconds from start time), or a frame count (For example, 300 to stop at the 300th frame from the frame at start time. If this value is 1, it means only producing one thumbnail at start time), or a relative value to the stream duration (For example, 50% to stop at half of stream duration from start time). The default value is 100%, which means to stop at the end of the stream. |
start |
string |
The position in the input video from where to start generating thumbnails. The value can be in ISO 8601 format (For example, PT05S to start at 5 seconds), or a frame count (For example, 10 to start at the 10th frame), or a relative value to stream duration (For example, 10% to start at 10% of stream duration). Also supports a macro {Best}, which tells the encoder to select the best thumbnail from the first few seconds of the video and will only produce one thumbnail, no matter what other settings are for Step and Range. The default value is macro {Best}. |
step |
string |
The intervals at which thumbnails are generated. The value can be in ISO 8601 format (For example, PT05S for one image every 5 seconds), or a frame count (For example, 30 for one image every 30 frames), or a relative value to stream duration (For example, 10% for one image every 10% of stream duration). Note: The Step value affects the first generated thumbnail, which may not be exactly the one specified at the transform preset start time. This is because the encoder tries to select the best thumbnail between the start time and the Step position from the start time as the first output. Because the default value is 10%, if the stream has a long duration, the first generated thumbnail might be far from the one specified at the start time. Select a reasonable value for Step if the first thumbnail is expected to be close to the start time, or set the Range value to 1 if only one thumbnail is needed at the start time. |
stretchMode |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize |
|
syncMode |
The Video Sync Mode |
PngLayer
Describes the settings to produce a PNG image from the input video.
Name | Type | Description |
---|---|---|
height |
string |
The height of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example 50% means the output video has half as many pixels in height as the input. |
label |
string |
The alphanumeric label for this layer, which can be used in multiplexing different video and audio layers, or in naming the output file. |
width |
string |
The width of the output video for this layer. The value can be absolute (in pixels) or relative (in percentage). For example 50% means the output video has half as many pixels in width as the input. |
PresetConfigurations
An object of optional configuration settings for encoder.
Name | Type | Description |
---|---|---|
complexity |
Allows you to configure the encoder settings to control the balance between speed and quality. Example: set Complexity as Speed for faster encoding but less compression efficiency. |
|
interleaveOutput |
Sets the interleave mode of the output to control how audio and video are stored in the container format. Example: set InterleavedOutput as NonInterleavedOutput to produce audio-only and video-only outputs in separate MP4 files. |
|
keyFrameIntervalInSeconds |
number |
The key frame interval in seconds. Example: set KeyFrameIntervalInSeconds as 2 to reduce the playback buffering for some players. |
maxBitrateBps |
integer |
The maximum bitrate in bits per second (threshold for the top video layer). Example: set MaxBitrateBps as 6000000 to avoid producing very high bitrate outputs for contents with high complexity. |
maxHeight |
integer |
The maximum height of output video layers. Example: set MaxHeight as 720 to produce output layers up to 720P even if the input is 4K. |
maxLayers |
integer |
The maximum number of output video layers. Example: set MaxLayers as 4 to make sure at most 4 output layers are produced to control the overall cost of the encoding job. |
minBitrateBps |
integer |
The minimum bitrate in bits per second (threshold for the bottom video layer). Example: set MinBitrateBps as 200000 to have a bottom layer that covers users with low network bandwidth. |
minHeight |
integer |
The minimum height of output video layers. Example: set MinHeight as 360 to avoid output layers of smaller resolutions like 180P. |
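Combining the example values from the table above gives a configuration like the following Python sketch. The property values come from the descriptions in the table. The `check_bounds` helper and its rule that minimums should not exceed maximums are assumptions added for illustration.

```python
preset_configurations = {
    "complexity": "Speed",  # faster encoding, less compression efficiency
    "interleaveOutput": "NonInterleavedOutput",  # separate audio and video files
    "keyFrameIntervalInSeconds": 2,
    "maxBitrateBps": 6000000,
    "maxHeight": 720,
    "maxLayers": 4,
    "minBitrateBps": 200000,
    "minHeight": 360,
}

def check_bounds(config: dict) -> list:
    """Return the names of min/max pairs where the minimum exceeds the maximum."""
    problems = []
    for low, high in (("minBitrateBps", "maxBitrateBps"), ("minHeight", "maxHeight")):
        if low in config and high in config and config[low] > config[high]:
            problems.append(f"{low} > {high}")
    return problems

bound_problems = check_bounds(preset_configurations)
print(bound_problems)
```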
Priority
Sets the relative priority of the TransformOutputs within a Transform. This sets the priority that the service uses for processing TransformOutputs. The default priority is Normal.
Name | Type | Description |
---|---|---|
High |
string |
Used for TransformOutputs that should take precedence over others. |
Low |
string |
Used for TransformOutputs that can be generated after Normal and High priority TransformOutputs. |
Normal |
string |
Used for TransformOutputs that can be generated at Normal priority. |
Rectangle
Describes the properties of a rectangular window applied to the input media before processing it.
Name | Type | Description |
---|---|---|
height |
string |
The height of the rectangular region in pixels. This can be absolute pixel value (e.g. 100), or relative to the size of the video (For example, 50%). |
left |
string |
The number of pixels from the left-margin. This can be absolute pixel value (e.g. 100), or relative to the size of the video (For example, 50%). |
top |
string |
The number of pixels from the top-margin. This can be absolute pixel value (e.g. 100), or relative to the size of the video (For example, 50%). |
width |
string |
The width of the rectangular region in pixels. This can be absolute pixel value (e.g. 100), or relative to the size of the video (For example, 50%). |
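Since each Rectangle property accepts either a pixel count or a percentage, a client may want to resolve the values against a known video size. The Python sketch below does that under two assumptions that the table does not spell out: left and width are taken relative to the video width, top and height relative to the video height, and fractional results are rounded to the nearest pixel. Both helper names are hypothetical.

```python
def to_pixels(value: str, full_size: int) -> int:
    """Resolve '100' or '50%' against a full dimension in pixels."""
    value = value.strip()
    if value.endswith("%"):
        return round(full_size * float(value[:-1]) / 100)
    return int(value)

def resolve_rectangle(rect: dict, video_width: int, video_height: int) -> dict:
    """Convert all four Rectangle properties to absolute pixel values."""
    return {
        "left": to_pixels(rect["left"], video_width),
        "top": to_pixels(rect["top"], video_height),
        "width": to_pixels(rect["width"], video_width),
        "height": to_pixels(rect["height"], video_height),
    }

crop = resolve_rectangle(
    {"left": "10%", "top": "100", "width": "50%", "height": "50%"}, 1920, 1080
)
print(crop)
```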
Rotation
The rotation, if any, to be applied to the input video, before it is encoded. Default is Auto
Name | Type | Description |
---|---|---|
Auto |
string |
Automatically detect and rotate as needed. |
None |
string |
Do not rotate the video. If the output format supports it, any metadata about rotation is kept intact. |
Rotate0 |
string |
Do not rotate the video but remove any metadata about the rotation. |
Rotate180 |
string |
Rotate 180 degrees clockwise. |
Rotate270 |
string |
Rotate 270 degrees clockwise. |
Rotate90 |
string |
Rotate 90 degrees clockwise. |
SelectAudioTrackByAttribute
Select audio tracks from the input by specifying an attribute and an attribute filter.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
attribute |
The TrackAttribute to filter the tracks by. |
|
channelMapping |
Optional designation for single channel audio tracks. Can be used to combine the tracks into stereo or multi-channel audio tracks. |
|
filter |
The type of AttributeFilter to apply to the TrackAttribute in order to select the tracks. |
|
filterValue |
string |
The value to filter the tracks by. Only used when AttributeFilter.ValueEquals is specified for the Filter property. |
SelectAudioTrackById
Select audio tracks from the input by specifying a track identifier.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
channelMapping |
Optional designation for single channel audio tracks. Can be used to combine the tracks into stereo or multi-channel audio tracks. |
|
trackId |
integer |
Track identifier to select |
SelectVideoTrackByAttribute
Select video tracks from the input by specifying an attribute and an attribute filter.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
attribute |
The TrackAttribute to filter the tracks by. |
|
filter |
The type of AttributeFilter to apply to the TrackAttribute in order to select the tracks. |
|
filterValue |
string |
The value to filter the tracks by. Only used when AttributeFilter.ValueEquals is specified for the Filter property. For TrackAttribute.Bitrate, this should be an integer value in bits per second (e.g. '1500000'). The TrackAttribute.Language is not supported for video tracks. |
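A bitrate-based video track selection could be assembled as in the Python sketch below. The discriminator `#Microsoft.Media.SelectVideoTrackByAttribute` is an assumption based on the naming pattern in the sample request, and the helper name is hypothetical.

```python
def select_video_by_bitrate(bits_per_second: int) -> dict:
    """Build a video track selection that matches one exact bitrate."""
    if bits_per_second <= 0:
        raise ValueError("bitrate must be a positive integer in bits per second")
    return {
        "@odata.type": "#Microsoft.Media.SelectVideoTrackByAttribute",  # assumed
        "attribute": "Bitrate",  # Language is not supported for video tracks
        "filter": "ValueEquals",
        "filterValue": str(bits_per_second),  # sent as a string, e.g. '1500000'
    }

selection = select_video_by_bitrate(1500000)
print(selection)
```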
SelectVideoTrackById
Select video tracks from the input by specifying a track identifier.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
trackId |
integer |
Track identifier to select |
StandardEncoderPreset
Describes all the settings to be used when encoding the input video with the Standard Encoder.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
codecs | Codec[]: |
The list of codecs to be used when encoding the input video. |
experimentalOptions |
object |
Dictionary containing key value pairs for parameters not exposed in the preset itself |
filters |
One or more filtering operations that are applied to the input media before encoding. |
|
formats | Format[]: |
The list of outputs to be produced by the encoder. |
StretchMode
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize
Name | Type | Description |
---|---|---|
AutoFit |
string |
Pad the output (with either letterbox or pillar box) to honor the output resolution, while ensuring that the active video region in the output has the same aspect ratio as the input. For example, if the input is 1920x1080 and the encoding preset asks for 1280x1280, then the output will be at 1280x1280, which contains an inner rectangle of 1280x720 at aspect ratio of 16:9, and letterbox regions 280 pixels high at the top and bottom. |
AutoSize |
string |
Override the output resolution, and change it to match the display aspect ratio of the input, without padding. For example, if the input is 1920x1080 and the encoding preset asks for 1280x1280, then the value in the preset is overridden, and the output will be at 1280x720, which maintains the input aspect ratio of 16:9. |
None |
string |
Strictly respect the output resolution without considering the pixel aspect ratio or display aspect ratio of the input video. |
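The arithmetic behind the three modes can be worked through in a short Python sketch. It assumes square pixels (so the display aspect ratio equals width divided by height) and rounds to whole pixels. The function name and the returned tuple layout are hypothetical.

```python
from fractions import Fraction

def output_geometry(mode, in_w, in_h, req_w, req_h):
    """Return (frame_size, active_size, padding_per_side) as (w, h) tuples."""
    aspect = Fraction(in_w, in_h)
    # Largest region with the input aspect ratio that fits the requested size.
    if Fraction(req_w, req_h) > aspect:
        active = (round(req_h * aspect), req_h)
    else:
        active = (req_w, round(req_w / aspect))
    if mode == "AutoSize":
        return active, active, (0, 0)  # requested resolution is overridden
    if mode == "AutoFit":
        pad = ((req_w - active[0]) // 2, (req_h - active[1]) // 2)
        return (req_w, req_h), active, pad  # padded to the requested size
    if mode == "None":
        return (req_w, req_h), (req_w, req_h), (0, 0)  # aspect ratio ignored
    raise ValueError(f"unknown StretchMode: {mode!r}")

for mode in ("AutoFit", "AutoSize", "None"):
    print(mode, output_geometry(mode, 1920, 1080, 1280, 1280))
```

For the 1920x1080 input and 1280x1280 request used in the table, AutoSize yields a 1280x720 frame, while AutoFit keeps the 1280x1280 frame with a 1280x720 active region and 280 pixels of padding on each side of it.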
systemData
Metadata pertaining to creation and last modification of the resource.
Name | Type | Description |
---|---|---|
createdAt |
string |
The timestamp of resource creation (UTC). |
createdBy |
string |
The identity that created the resource. |
createdByType |
The type of identity that created the resource. |
|
lastModifiedAt |
string |
The timestamp of resource last modification (UTC) |
lastModifiedBy |
string |
The identity that last modified the resource. |
lastModifiedByType |
The type of identity that last modified the resource. |
TrackAttribute
The TrackAttribute to filter the tracks by.
Name | Type | Description |
---|---|---|
Bitrate |
string |
The bitrate of the track. |
Language |
string |
The language of the track. |
TransportStreamFormat
Describes the properties for generating an MPEG-2 Transport Stream (ISO/IEC 13818-1) output video file(s).
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
filenamePattern |
string |
The file naming pattern used for the creation of output files. The following macros are supported in the file name: {Basename} - An expansion macro that will use the name of the input video file. If the base name (the file suffix is not included) of the input video file is less than 32 characters long, the base name of the input video file will be used. If the length of the base name of the input video file exceeds 32 characters, the base name is truncated to the first 32 characters in total length. {Extension} - The appropriate extension for this format. {Label} - The label assigned to the codec/layer. {Index} - A unique index for thumbnails. Only applicable to thumbnails. {AudioStream} - The string "Audio" plus the audio stream number (starting from 1). {Bitrate} - The audio/video bitrate in kbps. Not applicable to thumbnails. {Codec} - The type of the audio/video codec. {Resolution} - The video resolution. Any unsubstituted macros will be collapsed and removed from the filename. |
outputFiles |
The list of output files to produce. Each entry in the list is a set of audio and video layer labels to be muxed together. |
UtcClipTime
Specifies the clip time as a UTC time position in the media file. The UTC time can point to a different position depending on whether the media file starts from a timestamp of zero or not.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
time |
string |
The time position on the timeline of the input media based on UTC time. |
Video
Describes the basic properties for encoding the input video.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
keyFrameInterval |
string |
The distance between two key frames. The value should be non-zero in the range [0.5, 20] seconds, specified in ISO 8601 format. The default is 2 seconds (PT2S). Note that this setting is ignored if VideoSyncMode.Passthrough is set, where the KeyFrameInterval value will follow the input source setting. |
label |
string |
An optional label for the codec. The label can be used to control muxing behavior. |
stretchMode |
The resizing mode - how the input video will be resized to fit the desired output resolution(s). Default is AutoSize |
|
syncMode |
The Video Sync Mode |
VideoAnalyzerPreset
A video analyzer preset that extracts insights (rich metadata) from both audio and video, and outputs a JSON format file.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
audioLanguage |
string |
The language for the audio payload in the input using the BCP-47 format of 'language tag-region' (e.g. 'en-US'). If you know the language of your content, it is recommended that you specify it. The language must be specified explicitly for AudioAnalysisMode::Basic, since automatic language detection is not included in basic mode. If the language isn't specified or set to null, automatic language detection will choose the first language detected and process with the selected language for the duration of the file. It does not currently support dynamically switching between languages after the first language is detected. The automatic detection works best with audio recordings with clearly discernible speech. If automatic detection fails to find the language, transcription will fall back to 'en-US'. The list of supported languages is available here: https://go.microsoft.com/fwlink/?linkid=2109463 |
experimentalOptions |
object |
Dictionary containing key value pairs for parameters not exposed in the preset itself |
insightsToExtract |
Defines the type of insights that you want the service to generate. The allowed values are 'AudioInsightsOnly', 'VideoInsightsOnly', and 'AllInsights'. The default is AllInsights. If you set this to AllInsights and the input is audio only, then only audio insights are generated. Similarly if the input is video only, then only video insights are generated. It is recommended that you not use AudioInsightsOnly if you expect some of your inputs to be video only; or use VideoInsightsOnly if you expect some of your inputs to be audio only. Your Jobs in such conditions would error out. |
|
mode |
Determines the set of audio analysis operations to be performed. If unspecified, the Standard AudioAnalysisMode would be chosen. |
VideoOverlay
Describes the properties of a video overlay.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |
audioGainLevel |
number |
The gain level of audio in the overlay. The value should be in the range [0, 1.0]. The default is 1.0. |
cropRectangle |
An optional rectangular window used to crop the overlay image or video. |
|
end |
string |
The end position, with reference to the input video, at which the overlay ends. The value should be in ISO 8601 format. For example, PT30S to end the overlay at 30 seconds into the input video. If not specified or the value is greater than the input video duration, the overlay will be applied until the end of the input video if the overlay media duration is greater than the input video duration, else the overlay will last as long as the overlay media duration. |
fadeInDuration |
string |
The duration over which the overlay fades in onto the input video. The value should be in ISO 8601 duration format. If not specified the default behavior is to have no fade in (same as PT0S). |
fadeOutDuration |
string |
The duration over which the overlay fades out of the input video. The value should be in ISO 8601 duration format. If not specified the default behavior is to have no fade out (same as PT0S). |
inputLabel |
string |
The label of the job input which is to be used as an overlay. The Input must specify exactly one file. You can specify an image file in JPG, PNG, GIF or BMP format, or an audio file (such as a WAV, MP3, WMA or M4A file), or a video file. See https://aka.ms/mesformats for the complete list of supported audio and video file formats. |
opacity |
number |
The opacity of the overlay. This is a value in the range [0, 1.0]. The default is 1.0, which means the overlay is opaque. |
position |
The location in the input video where the overlay is applied. |
|
start |
string |
The start position, with reference to the input video, at which the overlay starts. The value should be in ISO 8601 format. For example, PT05S to start the overlay at 5 seconds into the input video. If not specified the overlay starts from the beginning of the input video. |
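As an illustration of the VideoOverlay properties, the Python sketch below builds an overlay that references a job input labeled 'xyz' and checks the documented [0, 1.0] ranges. The discriminator `#Microsoft.Media.VideoOverlay` is assumed from the naming pattern in the sample request, and `make_video_overlay` is a hypothetical helper.

```python
def make_video_overlay(input_label, start=None, end=None,
                       opacity=1.0, audio_gain_level=1.0):
    """Build a VideoOverlay body with range checks (illustrative helper)."""
    for name, value in (("opacity", opacity), ("audioGainLevel", audio_gain_level)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in the range [0, 1.0]")
    overlay = {
        "@odata.type": "#Microsoft.Media.VideoOverlay",  # assumed discriminator
        "inputLabel": input_label,  # must match the label of a job input
        "opacity": opacity,
        "audioGainLevel": audio_gain_level,
    }
    # start and end are ISO 8601 values; omitting start begins at the
    # beginning of the input video.
    if start is not None:
        overlay["start"] = start
    if end is not None:
        overlay["end"] = end
    return overlay

overlay = make_video_overlay("xyz", start="PT05S", end="PT30S", opacity=0.5)
print(overlay)
```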
VideoSyncMode
The Video Sync Mode
Name | Type | Description |
---|---|---|
Auto |
string |
This is the default method. Chooses between Cfr and Vfr depending on muxer capabilities. For output format MP4, the default mode is Cfr. |
Cfr |
string |
Input frames will be repeated and/or dropped as needed to achieve exactly the requested constant frame rate. Recommended when the output frame rate is explicitly set at a specified value |
Passthrough |
string |
The presentation timestamps on frames are passed through from the input file to the output file writer. Recommended when the input source has a variable frame rate and you are attempting to produce multiple layers for adaptive streaming in the output which have aligned GOP boundaries. Note: if two or more frames in the input have duplicate timestamps, then the output will also have the same behavior. |
Vfr |
string |
Similar to the Passthrough mode, but if the input has frames that have duplicate timestamps, then only one frame is passed through to the output, and others are dropped. Recommended when the number of output frames is expected to be equal to the number of input frames, for example, when the output is used to calculate a quality metric like PSNR against the input. |
VideoTrackDescriptor
A TrackSelection to select video tracks.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
The discriminator for derived types. |