This function applies the Unicode Bidirectional Text Algorithm (BiDi) to a run of text (which is supplied as an array of Unicode characters).
Parameter | Description |
---|---|
nr_of_chars | The number of characters in the input chars array and the size of the types and levels arrays. |
chars | A pointer to a memory buffer that holds the Unicode characters (input array). |
chars_format | Format of the characters in the memory buffer. Can be one of the following: |
types | The resulting array of character types (output array). Each element of this array identifies the Unicode Script of the corresponding Unicode character in the chars array. |
levels | The resulting array of character directional levels (output array). Each element of this array describes the directional level of the corresponding Unicode character in the chars array. |
start_level | Start embedding level for the run of text supplied via the chars array. Set to an even value (0, 2, 4, ..., 60) for left-to-right. Set to an odd value (1, 3, ..., 61) for right-to-left. The value 255 is special: it tells the function to divide the supplied text into paragraphs and to determine each paragraph's embedding level by finding the first character in the paragraph with a strong bidirectional category. If that character is strongly left-to-right, the paragraph's embedding level will be 0; otherwise (i.e. if the character is strongly right-to-left), the paragraph's embedding level will be 1. Review the Comments section below for more information on Unicode's bidirectional types and embedding levels. |
flags | Flags to configure the behavior of the function. |
The chars, types and levels arrays are allocated and freed by the user. Their size must be nr_of_chars.
If successful, the function returns 1. If not successful (e.g. an error occurs or an invalid input parameter is supplied), the function returns 0.
This function implements the following rules of the Unicode Bidirectional Text Algorithm:
The function does not implement the L1, L2, L3 and L4 rules (Reordering Resolved Levels) because these rules act on a per-line basis and are applied after any line wrapping is applied to the paragraph. More details on the Unicode Bidirectional Text Algorithm can be found in the Unicode Standard and/or on the Unicode website.
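Because the reordering rules are left to the caller, the following is a minimal sketch of rule L2 (reversing runs from the highest level down to the lowest odd level) applied to one already-wrapped line. The names `reverse_range` and `reorder_line` are hypothetical and not part of this library, and the `unsigned char` element type for levels is an assumption; rules L1, L3 and L4 are not covered.

```c
#include <stddef.h>

/* Reverse the half-open range [first, last) of an index array. */
static void reverse_range(size_t *order, size_t first, size_t last)
{
    while (first + 1 < last) {
        size_t tmp = order[first];
        order[first++] = order[--last];
        order[last] = tmp;
    }
}

/*
 * Hypothetical helper: fills order[] so that order[v] is the logical index
 * of the character shown at visual position v (counted from the left).
 * levels[] holds the resolved levels of one line; unsigned char is assumed.
 */
static void reorder_line(const unsigned char *levels, size_t n, size_t *order)
{
    unsigned highest = 0;
    unsigned lowest_odd = 0;   /* 0 means "no odd level seen yet" */

    for (size_t i = 0; i < n; i++) {
        order[i] = i;
        if (levels[i] > highest)
            highest = levels[i];
        if ((levels[i] & 1u) && (lowest_odd == 0 || levels[i] < lowest_odd))
            lowest_odd = levels[i];
    }
    if (lowest_odd == 0)
        return;                /* purely left-to-right line: nothing to do */

    /* Rule L2: from the highest level down to the lowest odd level,
       reverse every contiguous run of characters at that level or higher. */
    for (unsigned level = highest; level >= lowest_odd; level--) {
        size_t i = 0;
        while (i < n) {
            if (levels[i] >= level) {
                size_t start = i;
                while (i < n && levels[i] >= level)
                    i++;
                reverse_range(order, start, i);
            } else {
                i++;
            }
        }
    }
}
```

For a line with levels 0, 1, 1, 2, 2, 1, 0 this produces the visual order 0, 5, 3, 4, 2, 1, 6: the level-2 pair keeps its left-to-right order inside the reversed level-1 run.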
Unicode characters have a "bidirectional type". There are many types, but they are divided into three categories: strong, weak, and neutral.
Characters with a strong bidirectional type know their directionality. For example, the characters in most alphabets are strongly left-to-right, and the characters in the Hebrew and Arabic alphabets (and some others) are strongly right-to-left.
Characters with a weak bidirectional type determine their directionality according to their proximity to other characters with strong directionality.
Characters with a neutral bidirectional type determine their directionality from either the surrounding strong text or the embedding level.
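The three categories can be illustrated with a small classification sketch. The enum below is hypothetical: its member names follow the bidirectional type abbreviations used in the Unicode Standard, but the numeric values are arbitrary and are not the values this function writes to any output array. Explicit formatting types (embeddings, overrides) are left out for brevity.

```c
/* Hypothetical enum of bidirectional types; values are arbitrary. */
typedef enum {
    BT_L, BT_R, BT_AL,                                  /* strong  */
    BT_EN, BT_ES, BT_ET, BT_AN, BT_CS, BT_NSM, BT_BN,   /* weak    */
    BT_B, BT_S, BT_WS, BT_ON                            /* neutral */
} bidi_type;

typedef enum { CAT_STRONG, CAT_WEAK, CAT_NEUTRAL } bidi_category;

/* Map a bidirectional type to its category. */
static bidi_category bidi_category_of(bidi_type t)
{
    switch (t) {
    case BT_L: case BT_R: case BT_AL:
        return CAT_STRONG;
    case BT_EN: case BT_ES: case BT_ET: case BT_AN:
    case BT_CS: case BT_NSM: case BT_BN:
        return CAT_WEAK;
    default:
        return CAT_NEUTRAL;  /* BT_B, BT_S, BT_WS, BT_ON */
    }
}
```

For instance, a Latin letter (L) and an Arabic letter (AL) are both strong, a European digit (EN) is weak, and a space (WS) is neutral.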
The Unicode Bidirectional Algorithm works in terms of "levels" of right-to-left text embedded with left-to-right text, and vice versa.
Text at an even level is rendered left-to-right. Text at an odd level is rendered right-to-left.
The Unicode Bidirectional Algorithm works on paragraphs, so the first step is to divide text into paragraphs. The paragraph embedding level can be determined by finding the first character in the paragraph with a strong bidirectional category. If the character is strongly left-to-right, the paragraph embedding level is 0, otherwise (i.e. if the character is strongly right-to-left), the embedding level is 1.
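The first-strong-character rule described above can be sketched as follows. The `strength` enum and the `paragraph_level` function are hypothetical names used only for illustration; the sketch assumes each character of the paragraph has already been classified, and it falls back to level 0 when the paragraph contains no strong character, which is the default the Unicode algorithm prescribes.

```c
#include <stddef.h>

/* Hypothetical per-character classification for this sketch. */
typedef enum { P_STRONG_LTR, P_STRONG_RTL, P_OTHER } strength;

/* Return the paragraph embedding level: 0 (left-to-right) or 1 (right-to-left). */
static unsigned char paragraph_level(const strength *classes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (classes[i] == P_STRONG_LTR)
            return 0;
        if (classes[i] == P_STRONG_RTL)
            return 1;
    }
    return 0;  /* no strong character found: default to left-to-right */
}
```

Only the first strong character matters: a paragraph that opens with a digit or punctuation followed by a Hebrew letter gets level 1, even if Latin text follows later.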
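Embedding goes on from there: contained text with the opposite directionality is at the next embedding level, and text with the original directionality that is contained by the text with the opposite directionality is at the next embedding level after that. For example, in a left-to-right paragraph (level 0), an embedded right-to-left phrase is at level 1, and a left-to-right phrase embedded inside that phrase is at level 2.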
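The relationship between level parity and nesting can be sketched with two small helpers. Both names are hypothetical; `next_embedding_level` simply picks the smallest level above the current one that has the requested direction, and it does not enforce the maximum level mentioned for start_level.

```c
#include <stdbool.h>

/* Odd levels are rendered right-to-left, even levels left-to-right. */
static bool level_is_rtl(unsigned level)
{
    return (level & 1u) != 0;
}

/* Smallest level greater than `current` whose direction matches `want_rtl`. */
static unsigned next_embedding_level(unsigned current, bool want_rtl)
{
    unsigned next = current + 1;
    if (level_is_rtl(next) != want_rtl)
        next++;
    return next;
}
```

Starting from a left-to-right paragraph at level 0, a right-to-left embedding lands on level 1, and a left-to-right embedding inside it lands on level 2. A left-to-right embedding directly inside level 0 skips to level 2, because level 1 has the wrong direction.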