Hi @G Sanil
Thanks for your clear and detailed question about the Text Split skill in Azure AI Search — it's a great one, and you're spot on with your observations.
Quick answers to your questions:
- Yes — when textSplitMode is set to sentences, the skill splits the text so that each sentence becomes its own individual chunk. It breaks strictly on sentence-ending punctuation (., ?, !, etc., respecting the language set in defaultLanguageCode).
- Correct — parameters like maximumPageLength, pageOverlapLength, and maximumPagesToTake are ignored in sentences mode. They only apply when textSplitMode is set to pages. The skill simply performs a clean punctuation-based split in sentences mode.
- Recommended use cases for sentences mode:
  - You need very fine-grained, precise chunks where preserving exact sentence boundaries is important.
  - Your content consists of short, self-contained sentences and you want natural language units rather than fixed-length chunks.
  - You're doing highly targeted semantic search or analysis where one sentence = one meaningful retrieval unit.
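To make the sentences-mode setup concrete, here is a minimal sketch of the skill definition written as a Python dict, which serializes to the JSON a skillset expects. The context, the source path, and the targetName are assumptions for illustration, so adjust them to your own enrichment tree.

```python
import json

# Text Split skill in sentences mode. The page-only parameters
# (maximumPageLength, pageOverlapLength, maximumPagesToTake) are omitted
# because they only apply when textSplitMode is "pages".
sentences_skill = {
    "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
    "context": "/document",  # assumed context
    "textSplitMode": "sentences",
    "defaultLanguageCode": "en",
    "inputs": [
        {"name": "text", "source": "/document/content"}  # assumed source path
    ],
    "outputs": [
        # "textItems" is the skill's output; "sentences" is a hypothetical target name
        {"name": "textItems", "targetName": "sentences"}
    ],
}

# Serialize to the JSON fragment that would go in the skillset's "skills" array.
skill_json = json.dumps(sentences_skill, indent=2)
print(skill_json)
```

Each entry in the resulting output collection would then be one sentence, which you can map to your index or pass to a downstream skill.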
That said, for most vector search / RAG scenarios, Microsoft recommends using pages mode instead. It gives you better control over chunk size (to stay within embedding model token limits) and supports overlap for improved context. Sentences mode often creates a much larger number of chunks, which can increase indexing time and storage costs.
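For comparison, a pages-mode definition could look like the sketch below, again as a Python dict. The values 2000 and 500 are illustrative rather than a recommendation for your data, and the context, source path, and targetName are assumptions. The helper `estimate_chunk_count` is hypothetical: it is only back-of-the-envelope arithmetic for how page length and overlap affect chunk count, and the real skill will differ somewhat because it tries not to cut sentences in half.

```python
import json
import math

# Text Split skill in pages mode with overlap between consecutive chunks.
pages_skill = {
    "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
    "context": "/document",  # assumed context
    "textSplitMode": "pages",
    "maximumPageLength": 2000,  # illustrative value
    "pageOverlapLength": 500,   # illustrative value
    "defaultLanguageCode": "en",
    "inputs": [
        {"name": "text", "source": "/document/content"}  # assumed source path
    ],
    "outputs": [
        {"name": "textItems", "targetName": "pages"}  # hypothetical target name
    ],
}


def estimate_chunk_count(text_length: int, page_length: int, overlap: int) -> int:
    """Rough chunk count for fixed-size pages with overlap (hypothetical helper).

    Each new page advances by (page_length - overlap) characters.
    """
    if text_length <= 0:
        return 0
    if overlap >= page_length:
        raise ValueError("overlap must be smaller than page_length")
    step = page_length - overlap
    return max(1, math.ceil((text_length - overlap) / step))


print(json.dumps(pages_skill, indent=2))
print(estimate_chunk_count(
    10_000,
    pages_skill["maximumPageLength"],
    pages_skill["pageOverlapLength"],
))
```

Running the estimate for your typical document length is a quick way to compare the likely number of chunks against what sentences mode would produce.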
References:
https://learn.microsoft.com/en-us/azure/search/cognitive-search-skill-textsplit
https://learn.microsoft.com/en-us/azure/search/vector-search-how-to-chunk-documents
If this answer helped, please click "Accept the answer" and "Yes", as this can be beneficial to other community members.
If you have any other questions, let me know in the comments and I would be happy to help.