Hello Nidhi Priya,
Welcome to Microsoft Q&A and thank you for the additional information.
What you are trying to achieve is learning from user feedback at runtime, not traditional model training. This is a common pattern in LLM-based automation systems, and it’s best handled outside the model itself.
- GPT-5 models do not currently support fine-tuning in Azure OpenAI / AI Foundry.
- Even when fine-tuning is available (for older models), it is not suitable for per-user or per-interaction learning.
- LLMs are stateless; any “memory” must be implemented externally.
Recommended approach (production-proven pattern)
The correct solution is a retrieval-augmented memory layer, not fine-tuning.
- Capture user corrections explicitly
  Whenever a meeting goes to review and the user selects the correct task:
  - Store meeting metadata (title, participants, keywords, context)
  - Store the correct task ID / name
  - Scope this data per tenant or per user
  This becomes trusted ground-truth feedback.
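As a minimal sketch of the capture step, a correction record could look like this. The `CorrectionRecord` fields, the in-memory `feedback_store` dict, and the sample IDs are all illustrative stand-ins for whatever persistent store you use:

```python
from dataclasses import dataclass, field, asdict
import time

@dataclass
class CorrectionRecord:
    # Metadata captured when the user confirms the correct task in review
    tenant_id: str
    meeting_title: str
    participants: list
    keywords: list
    correct_task_id: str
    timestamp: float = field(default_factory=time.time)

# In-memory stand-in for a per-tenant feedback store (use a real database in production)
feedback_store: dict = {}

def record_correction(rec: CorrectionRecord) -> None:
    # Scope ground-truth feedback per tenant
    feedback_store.setdefault(rec.tenant_id, []).append(asdict(rec))

rec = CorrectionRecord("contoso", "Sprint 42 planning",
                       ["alice", "bob"], ["sprint", "planning"], "TASK-118")
record_correction(rec)
```

Scoping the store by tenant ID up front makes it easy to keep one tenant's feedback from influencing another's mappings later.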
- Use embeddings + vector storage
  Generate embeddings for:
  - Meeting descriptions
  - Task names
  - Confirmed meeting → task mappings
  Store these in a vector-capable store such as:
  - Azure AI Search (vector search)
  - Cosmos DB with vector indexing
  - Any managed vector database
  This enables semantic similarity matching.
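To illustrate the similarity-matching idea without an Azure dependency, here is a toy sketch: `toy_embedding` is a bag-of-words stand-in for a real embedding call (e.g. an Azure OpenAI embedding model), and `vector_store` stands in for Azure AI Search or Cosmos DB:

```python
import math
from collections import Counter

def toy_embedding(text: str) -> Counter:
    # Stand-in for a real embedding call; a bag-of-words vector is
    # enough to demonstrate the indexing-and-similarity pattern.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse vectors
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Stand-in for a vector-capable store: (embedding, confirmed task mapping) pairs
vector_store = []

def index_mapping(meeting_text: str, task_id: str) -> None:
    vector_store.append((toy_embedding(meeting_text), task_id))

index_mapping("weekly sprint planning with the web team", "TASK-118")
index_mapping("quarterly budget review with finance", "TASK-204")
```

In production you would replace `toy_embedding` with an embeddings API call and `vector_store` with a vector index, but the shape of the data (embedding plus confirmed mapping) stays the same.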
- Dynamic retrieval at inference time
  For each new meeting:
  - Retrieve similar past meetings and their corrected mappings
  - Pass the retrieved examples to the LLM as context
  - Ask the model to choose the best task using both the current input and historical feedback
  This gives you adaptive behavior without retraining.
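The retrieval step can be sketched as follows. The `embed` helper, the in-memory `memory` list, and the prompt wording are all illustrative; in a real system the ranked examples would come from your vector store and the prompt would go to the chat model:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding call
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Confirmed meeting → task mappings, as retrieved from the memory layer
memory = [
    ("weekly sprint planning with the web team", "TASK-118"),
    ("quarterly budget review with finance", "TASK-204"),
]

def build_prompt(new_meeting: str, k: int = 2) -> str:
    # Rank past mappings by similarity and inject the top-k as few-shot context
    q = embed(new_meeting)
    ranked = sorted(memory, key=lambda m: cosine(q, embed(m[0])), reverse=True)
    examples = "\n".join(f'Meeting: "{t}" -> Task: {task}' for t, task in ranked[:k])
    return (
        "Here are past meetings and their user-confirmed tasks:\n"
        f"{examples}\n\n"
        f'New meeting: "{new_meeting}"\n'
        "Choose the best matching task."
    )
```

Because the examples are retrieved at inference time, a correction made today influences tomorrow's prompts with no retraining step in between.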
- Confidence-based automation
  - High confidence → auto-map
  - Low confidence → send to review
  Each review automatically improves future results because it feeds the memory layer.
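The routing logic itself is a small threshold check. The threshold value here is an assumption you would tune for your workload, and the confidence score could come from the retrieval similarity, the model's self-reported confidence, or both:

```python
CONFIDENCE_THRESHOLD = 0.8  # assumed value; tune per workload

def route(task_id: str, confidence: float) -> tuple:
    # High confidence → auto-map; otherwise queue for human review,
    # which in turn feeds the memory layer with a new correction.
    if confidence >= CONFIDENCE_THRESHOLD:
        return ("auto_map", task_id)
    return ("review", task_id)
```

For example, `route("TASK-118", 0.93)` auto-maps, while `route("TASK-204", 0.42)` goes to review.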
Why fine-tuning is not recommended here
- Not supported for GPT-5
- Requires offline datasets and retraining cycles
- Cannot learn immediately from individual corrections
- Difficult to scope per user or tenant
- Slower iteration compared to retrieval-based memory
Fine-tuning works best for static domain behavior, not dynamic, user-driven feedback loops.
This is a memory and retrieval problem, not a model training problem:
- Use vector search + stored user feedback
- Keep the model stateless
- Let learning happen through retrieval, not weight updates
This architecture aligns with how copilots and adaptive AI systems are implemented on Azure today.
I hope this helps. Do let me know if you have any further queries.
Thank you!