Video Decoding on HoloLens 2 using Media Foundation has Delay

Nick 0 Reputation points
2025-11-07T20:06:18.45+00:00

Hello,

I am currently developing a HoloLens 2 application that receives and decodes H.264-encoded video from a drone camera over a UDP connection. I can decode and display the video in the app, but it lags about two seconds behind real time. For decoding, I'm using a custom native plugin built on Media Foundation.

This is what the overall pipeline looks like:

Receive data asynchronously --> parse NAL units --> send parsed units to the decoder --> every Unity frame (in Update()), use LoadRawTextureData() to upload the most recently decoded frame to a Texture2D for display
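For context, the "parse NAL units" step is essentially an Annex-B start-code scan. A minimal, portable sketch of that step (function and struct names are illustrative, not from my actual plugin):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct NalUnit {
    size_t offset;  // byte offset of the payload (start code excluded)
    size_t length;  // payload length in bytes
};

// Scan an Annex-B buffer for 00 00 01 / 00 00 00 01 start codes and
// return the NAL unit payloads they delimit.
std::vector<NalUnit> SplitNalUnits(const uint8_t* data, size_t size) {
    std::vector<size_t> codePos;     // where each start code begins
    std::vector<size_t> payloadPos;  // where each payload begins
    size_t i = 0;
    while (i + 3 <= size) {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
            codePos.push_back(i);
            payloadPos.push_back(i + 3);
            i += 3;
        } else if (i + 4 <= size && data[i] == 0 && data[i + 1] == 0 &&
                   data[i + 2] == 0 && data[i + 3] == 1) {
            codePos.push_back(i);
            payloadPos.push_back(i + 4);
            i += 4;
        } else {
            ++i;
        }
    }
    std::vector<NalUnit> units;
    for (size_t n = 0; n < payloadPos.size(); ++n) {
        // A payload runs until the next start code (or end of buffer).
        size_t end = (n + 1 < payloadPos.size()) ? codePos[n + 1] : size;
        units.push_back({payloadPos[n], end - payloadPos[n]});
    }
    return units;
}
```

In my plugin the parsed units are then queued for the decoder; this sketch only shows the framing logic.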

I would like the video to display in real time, so I've tried to locate the delay using Unity timestamps. First I tested the drone's response time: the interval from sending the start-streaming command to receiving the first packet of data is about 80 milliseconds, nowhere near enough to account for the 2-second delay.

The measured time to parse the data was less than 10 milliseconds. In the Unity code, the call into the native decoder function took only a few milliseconds, and uploading the decoded data to a Unity Texture2D also took just a few milliseconds.

So I'm wondering: am I measuring the time all these processes take incorrectly? Or is there some other place where this delay could be coming from that I'm forgetting? Or am I doing something wrong altogether?

I would greatly appreciate any help/suggestions.

HoloLens | Development

1 answer

  1. Q&A Assist
    2025-11-07T20:06:31.8566667+00:00
    AI-generated content. This answer was created with AI from Q&A Assist.

    It sounds like you have a well-structured pipeline for receiving and decoding video data, but the 2-second delay you're experiencing could be attributed to several factors beyond the measurements you've already taken. Here are a few suggestions to investigate further:

    1. Buffering: Media Foundation decoders often buffer several frames internally before emitting output. Check whether your pipeline requests low-latency mode, for example by setting the MF_LOW_LATENCY attribute on the decoder MFT.
    2. Latency in UDP: While the initial response time from the drone is quick, UDP can introduce variability in packet arrival times. Ensure that you are handling packet loss and reordering correctly, as this can lead to delays in displaying the video.
    3. Decoding Latency: Even if the call into the decoder returns quickly, the decoder may hold frames internally, for example for B-frame reordering, before releasing output. Check whether it waits for a certain number of frames before producing any, and make sure you drain all available output after each input rather than pulling one frame at a time.
    4. Unity Frame Rate: Ensure that your Unity app is running at a consistent frame rate. If the frame rate drops, it could lead to perceived delays in video display.
    5. Synchronization: Make sure that the timestamps for the video frames are being handled correctly. If the frames are being displayed out of order or if there is a mismatch in timing, it could contribute to the delay.
    6. Testing with Different Settings: Experiment with different settings for the video stream, such as resolution and frame rate, to see if they affect the delay.

    By systematically checking these areas, you may be able to identify the source of the delay and make adjustments to achieve real-time video display.
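    On point 1 (buffering), here is a minimal sketch of how low-latency mode can be requested, assuming the plugin talks to the H.264 decoder directly through IMFTransform; whether this helps depends on the decoder in use, and `EnableLowLatency` is an illustrative name, not an existing API:

    ```cpp
    #include <mfapi.h>
    #include <mftransform.h>

    // Request low-latency decoding from the H.264 decoder MFT (Windows 8+).
    // Without this, the Microsoft decoder can hold several frames of
    // internal reordering/lookahead buffer before producing any output.
    HRESULT EnableLowLatency(IMFTransform* decoder) {
        IMFAttributes* attrs = nullptr;
        HRESULT hr = decoder->GetAttributes(&attrs);  // may be E_NOTIMPL for some MFTs
        if (SUCCEEDED(hr)) {
            hr = attrs->SetUINT32(MF_LOW_LATENCY, TRUE);
            attrs->Release();
        }
        return hr;
    }
    ```

    Relatedly, after each ProcessInput call it is worth looping on ProcessOutput until it returns MF_E_TRANSFORM_NEED_MORE_INPUT, so decoded frames do not accumulate inside the transform. Also check what profile the drone streams: a stream with B-frames forces reordering delay at the decoder regardless of these settings.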


