GetStringTypeExW function (stringapiset.h)
Note
This API may have incomplete/outdated information for certain Unicode characters, particularly those in the supplementary range. For more accurate and comprehensive Unicode character type information, consider using equivalent ICU APIs such as u_charType, u_islower, u_isspace, and u_ispunct. For guidance on using ICU APIs on Windows, see Getting Started with ICU on Windows.
Retrieves character type information for the characters in the specified source string. For each character in the string, the function sets one or more bits in the corresponding 16-bit element of the output array. Each bit identifies a given character type, for example, letter, digit, or neither.
Syntax
BOOL GetStringTypeExW(
[in] LCID Locale,
[in] DWORD dwInfoType,
[in] _In_NLS_string_(cchSrc)LPCWCH lpSrcStr,
[in] int cchSrc,
[out] LPWORD lpCharType
);
Parameters
[in] Locale
Locale identifier that specifies the locale. This value uniquely defines the ANSI code page. You can use the MAKELCID macro to create a locale identifier or use one of the following predefined values.
- LOCALE_INVARIANT
- LOCALE_SYSTEM_DEFAULT
- LOCALE_USER_DEFAULT
Windows Vista and later: The following custom locale identifiers are also supported.
- LOCALE_CUSTOM_DEFAULT
- LOCALE_CUSTOM_UI_DEFAULT
- LOCALE_CUSTOM_UNSPECIFIED
[in] dwInfoType
Flags specifying the character type information to retrieve. For possible flag values, see the dwInfoType parameter of GetStringTypeW. For detailed information about the character type bits, see Remarks for GetStringTypeW.
[in] lpSrcStr
Pointer to the string for which to retrieve the character types. The string is assumed to be null-terminated if cchSrc is set to any negative value.
[in] cchSrc
Size, in characters, of the string indicated by lpSrcStr. The size refers to bytes for the ANSI version of the function or wide characters for the Unicode version. If the size includes a terminating null character, the function retrieves character type information for that character. If the application sets the size to any negative integer, the source string is assumed to be null-terminated and the function calculates the size automatically with an additional character for the null termination.
[out] lpCharType
Pointer to an array of 16-bit values. The array must be large enough to receive one 16-bit value for each character in the source string. If cchSrc is not a negative number, lpCharType should be an array of words with cchSrc elements. If cchSrc is set to a negative number, lpCharType must be an array of words with one element for each character in lpSrcStr plus one element for the terminating null character. When the function returns, this array contains one word corresponding to each character in the source string.
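The sizing rule for lpCharType can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper names type_word_count and classify_alloc are hypothetical, LOCALE_USER_DEFAULT and CT_CTYPE1 are chosen only as examples, and the part that calls GetStringTypeExW is compiled only when _WIN32 is defined.

```c
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: number of 16-bit elements lpCharType needs.
   A negative cchSrc means "null-terminated", and the terminating null
   receives its own element, hence the + 1. */
size_t type_word_count(const wchar_t *src, int cchSrc)
{
    if (cchSrc < 0)
        return wcslen(src) + 1;
    return (size_t)cchSrc;
}

#ifdef _WIN32
/* Hypothetical wrapper: allocates the output array and retrieves
   CT_CTYPE1 information. Returns NULL on failure; the caller frees
   the returned array. */
WORD *classify_alloc(const wchar_t *src, int cchSrc, size_t *count)
{
    size_t n = type_word_count(src, cchSrc);
    WORD *types = (WORD *)calloc(n ? n : 1, sizeof(WORD));

    if (types == NULL)
        return NULL;
    if (!GetStringTypeExW(LOCALE_USER_DEFAULT, CT_CTYPE1, src, cchSrc, types)) {
        free(types);
        return NULL;
    }
    *count = n;
    return types;
}
#endif
```

Computing the element count in one place keeps the allocation and the cchSrc argument consistent, which avoids undersizing the array when a negative cchSrc is passed.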
Return value
Returns a nonzero value if successful, or 0 otherwise. To get extended error information, the application can call GetLastError, which can return one of the following error codes:
- ERROR_INVALID_FLAGS. The values supplied for flags were not valid.
- ERROR_INVALID_PARAMETER. Any of the parameter values was invalid.
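A caller might check the return value and report the two documented error codes along these lines. This is a sketch under stated assumptions: string_type_error_text and report_types are hypothetical names, the message strings are illustrative, and the numeric fallbacks repeat the winerror.h values only so that the helper also builds outside Windows.

```c
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Values from winerror.h, repeated only for non-Windows builds. */
#define ERROR_INVALID_PARAMETER 87L
#define ERROR_INVALID_FLAGS     1004L
#endif

/* Hypothetical helper: maps the documented error codes to text. */
const char *string_type_error_text(unsigned long code)
{
    switch (code) {
    case ERROR_INVALID_FLAGS:
        return "ERROR_INVALID_FLAGS: the flag values were not valid";
    case ERROR_INVALID_PARAMETER:
        return "ERROR_INVALID_PARAMETER: a parameter value was invalid";
    default:
        return "unexpected error code";
    }
}

#ifdef _WIN32
/* Hypothetical wrapper: returns 1 on success, 0 on failure.
   types must have room for wcslen(src) + 1 elements because a
   negative cchSrc also classifies the terminating null. */
int report_types(const wchar_t *src, WORD *types)
{
    if (!GetStringTypeExW(LOCALE_USER_DEFAULT, CT_CTYPE1, src, -1, types)) {
        /* Read the error code immediately, before any other API call. */
        DWORD err = GetLastError();
        fprintf(stderr, "GetStringTypeExW failed: %s\n",
                string_type_error_text(err));
        return 0;
    }
    return 1;
}
#endif
```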
Remarks
For an overview of the use of the string functions, see Strings.
Using the ANSI code page for the supplied locale, this function translates the source string from ANSI to Unicode. It then analyzes each Unicode character for character type information.
The ANSI version of this function converts the source string to Unicode and calls the corresponding GetStringTypeW function. Thus the words in the output buffer correspond not to the original ANSI string but to its Unicode equivalent. The conversion from ANSI to Unicode can change the string length. For example, a pair of ANSI characters can map to a single Unicode character. Therefore, the correspondence between the words in the output buffer and the characters in the original ANSI string is not always one-to-one, as is the case for multibyte character strings. For this reason, the ANSI version of this function is of limited use for multibyte character strings, and the Unicode version is recommended instead.
This function circumvents a limitation caused by the difference in parameters between GetStringTypeA and GetStringTypeW. Because of the parameter difference, an application cannot automatically invoke the proper ANSI or Unicode version of a GetStringType* function through the use of the #define UNICODE switch. GetStringTypeEx, on the other hand, behaves properly with regard to that switch, and is therefore the recommended function.
When the ANSI version of this function is used with a Unicode-only locale identifier, the function can succeed because the operating system uses the system code page. However, characters that are undefined in the system code page appear in the string as a question mark (?).
The values of the lpSrcStr and lpCharType parameters must not be the same. If they are the same, the function fails with ERROR_INVALID_PARAMETER.
The Locale parameter is only used to perform string conversion to Unicode. It has nothing to do with the CTYPE* values supplied by the application. These values are solely determined by Unicode code points, and do not vary on a locale basis. For example, Greek letters are specified as C1_ALPHA for any value of Locale.
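The locale independence described above can be checked with a short sketch. The names is_alpha_type and greek_alpha_is_locale_independent are hypothetical, the two locales are arbitrary choices for illustration, and the fallback definitions repeat Windows header values only so that the bit test builds on other platforms.

```c
#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins for non-Windows builds; C1_ALPHA value is from winnls.h. */
typedef unsigned short WORD;
#define C1_ALPHA 0x0100
#endif

/* Hypothetical helper: tests the C1_ALPHA bit of a CT_CTYPE1 word. */
int is_alpha_type(WORD type)
{
    return (type & C1_ALPHA) != 0;
}

#ifdef _WIN32
/* Hypothetical check: classifies GREEK SMALL LETTER ALPHA (U+03B1)
   under two unrelated locales. Returns 1 if both calls succeed,
   both words are identical, and C1_ALPHA is set; otherwise 0. */
int greek_alpha_is_locale_independent(void)
{
    const WCHAR alpha[1] = { 0x03B1 };
    WORD en = 0, ja = 0;
    LCID enUS = MAKELCID(MAKELANGID(LANG_ENGLISH, SUBLANG_ENGLISH_US),
                         SORT_DEFAULT);
    LCID jaJP = MAKELCID(MAKELANGID(LANG_JAPANESE, SUBLANG_DEFAULT),
                         SORT_DEFAULT);

    /* cchSrc is 1, so exactly one output word is written per call. */
    if (!GetStringTypeExW(enUS, CT_CTYPE1, alpha, 1, &en))
        return 0;
    if (!GetStringTypeExW(jaJP, CT_CTYPE1, alpha, 1, &ja))
        return 0;

    return en == ja && is_alpha_type(en);
}
#endif
```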
Requirements
Requirement | Value |
---|---|
Minimum supported client | Windows 2000 Professional [desktop apps \| UWP apps] |
Minimum supported server | Windows 2000 Server [desktop apps \| UWP apps] |
Target Platform | Windows |
Header | stringapiset.h (include Windows.h) |
Library | Kernel32.lib |
DLL | Kernel32.dll |