Fine-tuning job stuck in "Training" status for over 24 hours - Microsoft Learning Exercise
Hi there! I need some help with a fine-tuning job I started for a Microsoft Learning module. I'm working through the exercise at https://microsoftlearning.github.io/mslearn-ai-studio/Instructions/05-Finetune-model.html where you fine-tune a language model.
I started a fine-tuning job yesterday evening using the small travel assistant JSONL file from the tutorial, but it's still showing as "Training started" more than 24 hours later.
I'm pretty sure something's wrong, because a dataset this small shouldn't take this long to train. I also cancelled an earlier training run after it had been stuck on "Running" for 3 hours. Do you know if there are any issues with the fine-tuning service right now? Any ideas on what might be happening or how I can fix it?
Thanks for your help!