Initialization Files

Font and Attribute Mapping Table (Fontmap) for Unicode Text Output

The font and attribute mapping table, or simply Fontmap, is a configuration file that helps D-Type Unicode Text Module process Unicode text. It assigns suitable fonts and layout attributes to Unicode scripts and/or characters defined within the Unicode Standard, so that the international scripts and characters in your text documents are displayed properly and rendered using the most suitable fonts and layout attributes. Conceptually, it is useful to think of a Fontmap as one mega-font (consisting of many individual fonts), customizable by your application, which D-Type Unicode Text Module uses to display the characters in your text documents. Your application can have as many Fontmaps as necessary and may use different Fontmaps for different purposes and/or documents.

fontmap.inf

# Font and Attribute Mapping Table (Fontmap) for Unicode Text Output
#
# This font and attribute mapping table (or Fontmap) is read when D-Type
# Unicode Text Module lays out Unicode text. The Fontmap allows D-Type
# clients to explicitly assign various scripts and/or characters defined
# within the Unicode Standard to a default set of text layout attributes
# and fonts. This makes it possible to properly display any user-supplied
# text, in both horizontal and vertical writing mode.
#
# When processing the Fontmap, D-Type Unicode Text Module reads the rows
# with mapping instructions one at a time. As soon as the first matching
# Unicode script or character range is found, D-Type Unicode Text Module
# applies the mapping instructions in that row and stops reading and
# processing any subsequent rows.
#
# Each row in the Fontmap consists of precisely three fields, separated
# by a single vertical bar, i.e. pipe (|):
#
# * 1st Field - Unicode script or character range
# * 2nd Field - Assigned set of text layout attributes
# * 3rd Field - Assigned font(s)
#
# Example:
#
# SCRIPTCODE:latn|ATTRIBS:ro=8,ts=10025|FONTNAME:Times;+FONTNAME:Arial
#
# Any given Unicode script or character range can be assigned a single
# set of text layout attributes and one or more fonts. If multiple fonts
# are assigned, it means they are all suitable for the display of the
# specified Unicode script or character range. The order in which fonts
# are listed is important -- the first suitable font, as determined by
# the instructions described below, is always chosen as the default.
#
# -----------
# First Field
# -----------
#
# Within the first field, the CHARCODE, CHARSPAN or SCRIPTCODE
# instruction specifies a character range or Unicode script:
#
# * CHARCODE selects a single Unicode character using an 8-digit
#   hexadecimal integer.
#
#   Example: CHARCODE:0000004E
#
# * CHARSPAN selects a range of Unicode characters using two
#   8-digit hexadecimal integers (first and last).
#
#   Example: CHARSPAN:0000004E-0000005A
#
#   Note: CHARCODE and CHARSPAN have a higher priority than
#   SCRIPTCODE described below.
#
# * SCRIPTCODE selects a Unicode script using a 4 character script
#   code (e.g. latn, cyrl, arab, kana etc). See ISO 15924 Code Lists
#   for a list of available script codes.
#
#   Example: SCRIPTCODE:arab
#
#   Note 1: Script code 0000 is a special code which means any script.
#   Fonts associated with this script are usually fallback fonts, i.e.
#   fonts that are used in the absence of a more suitable alternative.
#   As such, SCRIPTCODE:0000 should be specified after all other
#   SCRIPTCODE instructions.
#
#   Note 2: The best fallback fonts are those that are specifically
#   designed to support a large number of Unicode characters and
#   scripts, such as Arial Unicode MS and Code2000.
#
# ------------
# Second Field
# ------------
#
# Within the second field, the ATTRIBS instruction specifies the
# associated set of text layout attributes. This set is supplied as a
# comma separated list of keyword/value pairs. Currently two attribute
# keywords are supported:
#
# * ro (relative orientation)
#
#   The ro attribute keyword corresponds to D-Type Power Engine's
#   pdRelativeOrientation property and D-Type Unicode Text Module's
#   TX_ATTR_ORIENTATION attribute identifier. This attribute allows
#   D-Type clients to set the most appropriate relative text orientation
#   (portrait/landscape) and baseline (e.g. default, shifted) for the
#   associated Unicode script or character range and font(s), in both
#   horizontal and vertical writing mode.
#
# * ts (text shaping)
#
#   The ts attribute keyword corresponds to D-Type Power Engine's
#   pdTextShaping property and D-Type Unicode Text Module's TX_ATTR_SHAPING
#   attribute identifier. When required, this attribute allows D-Type
#   clients to explicitly set the text shaping method for the associated
#   Unicode script or character range and font(s). If this attribute is
#   not specified, the shaping method will be set automatically based on
#   the Unicode script.
#
# Examples:
#
#   ATTRIBS:ro=2
#   ATTRIBS:ro=8,ts=10025
#
# -----------
# Third Field
# -----------
#
# Within the third field, the associated font can be identified by its
# name or Unique Font Identifier (fuid). To accomplish this, one of
# the following four instructions can be used:
#
# * FONTFUID
#
#   Specifies the font by its Unique Font Identifier (fuid). This
#   instruction also performs some basic validation on the font file
#   and attempts to activate it. As a result, missing, inaccessible,
#   invalid and/or corrupt fonts are rejected as suitable fonts.
#
#   Example: FONTFUID:S001R_ARIAL_001A
#
# * FASTFUID
#
#   Same as FONTFUID but noticeably faster since it does not perform any
#   validation on the font file and does not attempt to activate it. Thus,
#   any specified font is accepted as a suitable font.
#
#   Example: FASTFUID:S001R_ARIAL_001A
#
# * FONTNAME
#
#   Specifies the font by its name. This instruction also performs some
#   basic validation on the font file and attempts to activate it. As a
#   result, missing, inaccessible, invalid and/or corrupt fonts are
#   rejected as suitable fonts. This instruction is slower than FONTFUID
#   and FASTFUID since it must access the font file and parse its header
#   to extract the font name.
#
#   Example: FONTNAME:Times New Roman
#
# * FASTNAME
#
#   Same as FONTNAME but does not perform any validation on the font and
#   does not attempt to activate it. Thus, any specified font is accepted
#   as a suitable font.
#
#   Example: FASTNAME:Times New Roman
#
# Multiple Fonts:
#
# To associate multiple fonts with the same Unicode script or character
# range, two different methods are available:
#
# Method 1: Within the third field, multiple font instructions can be
# concatenated using the ;+ operator, as shown in the following example:
#
# SCRIPTCODE:kana|ATTRIBS:ro=2|FONTNAME:Code2000;+FONTFUID:F0011_ARPLSH_NS0
#
# Method 2: Multiple rows that target the same Unicode script or character
# range, but assign different fonts, can be added to the Fontmap. This is
# shown in the following example:
#
# SCRIPTCODE:kana|ATTRIBS:ro=2|FONTNAME:Code2000
# SCRIPTCODE:kana|ATTRIBS:ro=2|FONTFUID:F0011_ARPLSH_NS0
#
# Note that the first method is more compact as it requires only one row
# per Unicode script or character range. However, the second method is
# slightly more flexible as it provides the ability to specify different
# text layout attributes for different fonts.
#
# ----------------
# Additional Notes
# ----------------
#
# * It is the client's responsibility to ensure that the fonts listed in
#   the Fontmap have a sufficient number of glyphs to adequately represent
#   all of the Unicode scripts and character ranges they are associated
#   with. Only associate a font with a Unicode script or character range
#   if you know it provides adequate support for that particular script or
#   character range. You can use D-Type Font Viewer to check your font's
#   support for the intended Unicode script or character range.
#
# * For more information on the Unique Font Identifiers (fuid),
#   see the Configure Initial Font List document.
#
# * For more information on Unicode Scripts and script codes, relative
#   orientation and text shaping, see the manual.
#

{
SCRIPTCODE:zzzz|ATTRIBS:ts=0|FASTFUID:F0010_CMSANS_SS0
SCRIPTCODE:zyyy|ATTRIBS:ro=8|FASTFUID:F0010_CMSANS_SS0
SCRIPTCODE:latn|ATTRIBS:ro=8|FASTFUID:F0010_CMSANS_SS0
SCRIPTCODE:cyrl|ATTRIBS:ro=8|FASTFUID:F0010_CMSANS_SS0
SCRIPTCODE:grek|ATTRIBS:ro=8|FASTFUID:F0010_CMSANS_SS0
SCRIPTCODE:hani|ATTRIBS:ro=2|FONTNAME:Arial Unicode MS;+FONTFUID:F0011_ARPLSH_NS0;+FONTNAME:Code2000
SCRIPTCODE:hira|ATTRIBS:ro=2|FONTNAME:Arial Unicode MS;+FONTNAME:Code2000;+FONTFUID:F0011_ARPLSH_NS0
SCRIPTCODE:kana|ATTRIBS:ro=2|FONTNAME:Arial Unicode MS;+FONTNAME:Code2000;+FONTFUID:F0011_ARPLSH_NS0
SCRIPTCODE:hang|ATTRIBS:ro=8|FONTNAME:Arial Unicode MS;+FONTNAME:Code2000
SCRIPTCODE:hebr|ATTRIBS:ro=8|FONTFUID:F0010_DEJAVU_NS0;+FONTNAME:Arial;+FONTNAME:Tahoma;+FONTNAME:Times New Roman
SCRIPTCODE:arab|ATTRIBS:ro=8|FONTNAME:Arial;+FONTNAME:Tahoma;+FONTNAME:Times New Roman;+FONTNAME:KacstBook
SCRIPTCODE:deva|ATTRIBS:ro=8|FONTNAME:Raghindi;+FONTNAME:Thyaka Rabison
SCRIPTCODE:thai|ATTRIBS:ro=8|FONTNAME:Tahoma;+FONTNAME:Norasi;+FONTNAME:Loma;+FONTNAME:Thonburi;+FONTNAME:AngsanaDSE
SCRIPTCODE:0000|ATTRIBS:ro=8|FONTNAME:Arial Unicode MS;+FONTNAME:Code2000;+FONTFUID:F0010_CMSANS_SS0
CHARCODE:0000FF01|ATTRIBS:ro=2|FONTNAME:Arial Unicode MS;+FASTFUID:F0011_ARPLSH_NS0;+FONTNAME:Code2000
CHARCODE:0000FF0C|ATTRIBS:ro=2|FONTNAME:Arial Unicode MS;+FASTFUID:F0011_ARPLSH_NS0;+FONTNAME:Code2000
CHARCODE:0000FF1A|ATTRIBS:ro=2|FONTNAME:Arial Unicode MS;+FASTFUID:F0011_ARPLSH_NS0;+FONTNAME:Code2000
CHARCODE:0000FF1B|ATTRIBS:ro=2|FONTNAME:Arial Unicode MS;+FASTFUID:F0011_ARPLSH_NS0;+FONTNAME:Code2000
CHARCODE:0000FF1F|ATTRIBS:ro=2|FONTNAME:Arial Unicode MS;+FASTFUID:F0011_ARPLSH_NS0;+FONTNAME:Code2000
CHARCODE:0000FF3B|ATTRIBS:ro=8|FONTNAME:Arial Unicode MS;+FASTFUID:F0011_ARPLSH_NS0;+FONTNAME:Code2000
CHARCODE:0000FF3D|ATTRIBS:ro=8|FONTNAME:Arial Unicode MS;+FASTFUID:F0011_ARPLSH_NS0;+FONTNAME:Code2000
CHARSPAN:0000FF08-0000FF09|ATTRIBS:ro=8|FONTNAME:Arial Unicode MS;+FASTFUID:F0011_ARPLSH_NS0;+FONTNAME:Code2000
CHARSPAN:00003001-00003002|ATTRIBS:ro=2|FONTNAME:Arial Unicode MS;+FASTFUID:F0011_ARPLSH_NS0;+FONTNAME:Code2000
CHARSPAN:00003008-00003011|ATTRIBS:ro=8|FONTNAME:Arial Unicode MS;+FASTFUID:F0011_ARPLSH_NS0;+FONTNAME:Code2000
CHARSPAN:00003014-00003019|ATTRIBS:ro=8|FONTNAME:Arial Unicode MS;+FASTFUID:F0011_ARPLSH_NS0;+FONTNAME:Code2000
}

# END OF FILE

Configuration File Structure

The Fontmap is a standard UTF-8 text file that can be edited using any text editor. A CR, LF or CR+LF sequence serves as the newline. Any other characters with an ASCII value less than 32 (i.e. control characters) are ignored. The null character (ASCII value 0) is invalid.

The Fontmap consists of precisely one mapping table, which contains multiple table rows. A single opening curly bracket on its own line signals the start of the mapping table, while a single closing curly bracket on its own line signals the table's end. Table rows are separated by a single newline character.

Each table row consists of precisely three fields, separated by a single vertical bar (pipe character). The fields must be specified in precisely the same order as shown below. Their content cannot contain the newline character (CR, LF or CR+LF), pipe or opening/closing curly bracket.*

* Note: However, starting with D-Type 5.0.3.1 the caret character (^) can be used as the escape character so that any byte value can be encoded. The caret character must be followed by precisely two more characters that represent two hexadecimal digits. Thus, each of these characters must be either a digit ('0' - '9') or a letter in the 'a' - 'f' (or 'A' - 'F') range. These two digits make up a single hexadecimal value in the 0 - 255 range, which is the encoded byte value. For example, the sequence ^4B encodes the byte value 75 (or ASCII character 'K') and the sequence ^2C encodes the byte value 44 (or ASCII character ','). If one of the characters that follow the caret is not a hexadecimal digit, the entire escape sequence is considered invalid (as soon as the first bad digit is encountered) and the normal processing of the text file resumes.
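The caret escape rule can be sketched in Python as follows. This is only an illustration, not D-Type's implementation: the function name decode_caret_escapes is hypothetical, and the handling of an invalid sequence (the caret and any digit already read are discarded, and processing resumes at the first bad character) is an assumption, since the text above does not spell out what is emitted in that case.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")

def decode_caret_escapes(field: str) -> bytes:
    """Decode ^XX escape sequences in a Fontmap field into raw bytes."""
    out = bytearray()
    i = 0
    while i < len(field):
        if field[i] != "^":
            # Ordinary character: copy its UTF-8 bytes unchanged.
            out += field[i].encode("utf-8")
            i += 1
            continue
        # Read up to two hexadecimal digits after the caret.
        j = i + 1
        while j < len(field) and j < i + 3 and field[j] in HEX_DIGITS:
            j += 1
        if j == i + 3:
            # Valid escape: two hex digits encode one byte value (0 - 255).
            out.append(int(field[i + 1:j], 16))
        # Assumption: an invalid escape emits nothing; normal processing
        # resumes at the first character that was not a hex digit.
        i = j
    return bytes(out)
```

For example, decode_caret_escapes("^4B") yields the single byte 75 ('K'), and ^7C could be used to place a pipe character inside a field.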

Lines outside the mapping table represent comments. Comments are optional but useful. To make it clearer that a text line is a comment, we recommend placing the # character at its beginning. Comments are completely ignored by D-Type Unicode Text Module.
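As a rough illustration of the file structure described above, the following Python sketch pulls the table rows out of a Fontmap's text. The function name extract_table_rows is hypothetical. Stripping surrounding spaces before comparing a line with a curly bracket is a leniency assumed here, not something the format description promises.

```python
def extract_table_rows(text: str) -> list[str]:
    """Return the rows found between the { and } lines of a Fontmap."""
    # CR, LF and CR+LF all act as newlines.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    rows = []
    inside_table = False
    for line in lines:
        # Remaining control characters (ASCII value below 32) are ignored.
        line = "".join(ch for ch in line if ord(ch) >= 32)
        stripped = line.strip()  # assumption: tolerate surrounding spaces
        if stripped == "{":
            inside_table = True
        elif stripped == "}":
            inside_table = False
        elif inside_table and stripped:
            rows.append(stripped)
        # Anything outside the table is a comment and is skipped.
    return rows
```

Lines such as "# END OF FILE" never reach the result because they sit outside the curly brackets.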

Explanation of Configuration Instructions

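Each table row in the Fontmap consists of precisely three fields, separated by a single vertical bar (pipe character):

1st Field: Unicode script or character range
2nd Field: Assigned set of text layout attributes
3rd Field: Assigned font(s)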

Example

SCRIPTCODE:latn|ATTRIBS:ro=8,ts=10025|FONTNAME:Times;+FONTNAME:Arial

Any given Unicode script or character range can be assigned a single set of text layout attributes and one or more fonts. If multiple fonts are assigned, it means they are all suitable for the display of the specified Unicode script or character range. The order in which fonts are listed is important — the first suitable font, as determined by the instructions described below, is always chosen as the default.
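To make the row layout concrete, here is a minimal Python sketch that splits one table row into its three fields and separates the font instructions joined with the ;+ operator. The FontmapRow type and the split_row function are hypothetical names used only for illustration.

```python
from typing import NamedTuple

class FontmapRow(NamedTuple):
    target: str        # 1st field: Unicode script or character range
    attribs: str       # 2nd field: text layout attributes
    fonts: list[str]   # 3rd field: font instructions, in listed order

def split_row(row: str) -> FontmapRow:
    """Split a table row on the pipe character into its three fields."""
    fields = row.split("|")
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, found {len(fields)}: {row!r}")
    target, attribs, fonts = fields
    # Multiple font instructions are concatenated with the ;+ operator.
    return FontmapRow(target, attribs, fonts.split(";+"))
```

Applied to the example row above, the fonts list holds FONTNAME:Times followed by FONTNAME:Arial, preserving the order that determines the default font.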

When processing the Fontmap, D-Type Unicode Text Module reads table rows one at a time. As soon as the first matching Unicode script or character range is found, D-Type Unicode Text Module applies the mapping instructions in that row and stops reading and processing any subsequent rows.
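The first-match rule might be sketched in Python as shown below. The names row_matches and find_first_match are hypothetical. How the first-match rule combines with the note in the fontmap.inf comments (CHARCODE and CHARSPAN have a higher priority than SCRIPTCODE) is not stated in this document, so the two-pass ordering used here is an assumption.

```python
from typing import Optional

def row_matches(row: str, codepoint: int, script: str) -> bool:
    """Check whether a row's first field covers the given character."""
    keyword, _, value = row.split("|")[0].partition(":")
    if keyword == "CHARCODE":
        return codepoint == int(value, 16)
    if keyword == "CHARSPAN":
        first, last = value.split("-")
        return int(first, 16) <= codepoint <= int(last, 16)
    if keyword == "SCRIPTCODE":
        # Script code 0000 means any script.
        return value in ("0000", script)
    return False

def find_first_match(rows: list[str], codepoint: int, script: str) -> Optional[str]:
    """Return the first matching row, or None if no row matches."""
    # Assumption: character rows are consulted before script rows,
    # each group in file order.
    char_rows = [r for r in rows if not r.startswith("SCRIPTCODE:")]
    script_rows = [r for r in rows if r.startswith("SCRIPTCODE:")]
    for row in char_rows + script_rows:
        if row_matches(row, codepoint, script):
            return row  # stop at the first match
    return None
```

With this ordering, a CHARCODE row placed after SCRIPTCODE:0000 (as in the sample fontmap.inf) still takes effect for its character.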

First Field

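Within the first field, the CHARCODE, CHARSPAN or SCRIPTCODE instruction specifies a character range or Unicode script:

CHARCODE

Selects a single Unicode character using an 8-digit hexadecimal integer.

Example

CHARCODE:0000004E

CHARSPAN

Selects a range of Unicode characters using two 8-digit hexadecimal integers (first and last).

Example

CHARSPAN:0000004E-0000005A

Note: CHARCODE and CHARSPAN have a higher priority than SCRIPTCODE described below.

SCRIPTCODE

Selects a Unicode script using a 4-character script code (e.g. latn, cyrl, arab, kana etc). See ISO 15924 Code Lists for a list of available script codes.

Example

SCRIPTCODE:arab

Note 1: Script code 0000 is a special code which means any script. Fonts associated with this script are usually fallback fonts, i.e. fonts that are used in the absence of a more suitable alternative. As such, SCRIPTCODE:0000 should be specified after all other SCRIPTCODE instructions.

Note 2: The best fallback fonts are those that are specifically designed to support a large number of Unicode characters and scripts, such as Arial Unicode MS and Code2000.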
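Based on the instruction formats given in the fontmap.inf comments (8-digit hexadecimal values for CHARCODE and CHARSPAN, a 4-character code for SCRIPTCODE), a first-field parser could look roughly like this in Python. The function parse_target and its tuple return values are hypothetical. Rejecting values that are not exactly 8 hexadecimal digits or 4 characters is a strictness assumed here.

```python
import string

def _hex8(text: str) -> int:
    """Convert an 8-digit hexadecimal string to an integer."""
    if len(text) != 8 or any(c not in string.hexdigits for c in text):
        raise ValueError(f"expected 8 hexadecimal digits: {text!r}")
    return int(text, 16)

def parse_target(field: str) -> tuple:
    """Parse the first field into ('chars', first, last) or ('script', code)."""
    keyword, _, value = field.partition(":")
    if keyword == "CHARCODE":
        code = _hex8(value)
        return ("chars", code, code)       # a range of one character
    if keyword == "CHARSPAN":
        first, last = value.split("-")     # first and last character
        return ("chars", _hex8(first), _hex8(last))
    if keyword == "SCRIPTCODE":
        if len(value) != 4:
            raise ValueError(f"expected a 4-character script code: {value!r}")
        return ("script", value)
    raise ValueError(f"unknown first-field instruction: {field!r}")
```

Representing CHARCODE as a one-character range lets later code treat CHARCODE and CHARSPAN uniformly.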

Second Field

Within the second field, the ATTRIBS instruction specifies the associated set of text layout attributes. This set is supplied as a comma separated list of keyword/value pairs. Currently two attribute keywords are supported:

ro (relative orientation)

The ro attribute keyword corresponds to D-Type Power Engine's pdRelativeOrientation property and D-Type Unicode Text Module's TX_ATTR_ORIENTATION attribute identifier. This attribute allows D-Type clients to set the most appropriate relative text orientation (portrait/landscape) and baseline (e.g. default, shifted) for the associated Unicode script or character range and font(s), in both horizontal and vertical writing mode.

ts (text shaping)

The ts attribute keyword corresponds to D-Type Power Engine's pdTextShaping property and D-Type Unicode Text Module's TX_ATTR_SHAPING attribute identifier. When required, this attribute allows D-Type clients to explicitly set the text shaping method for the associated Unicode script or character range and font(s). If this attribute is not specified, the shaping method will be set automatically based on the Unicode script.

Example 1:

ATTRIBS:ro=2

Example 2:

ATTRIBS:ro=8,ts=10025
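The keyword/value list in the second field can be read with a few lines of Python. This is a sketch under assumptions: the function name parse_attribs is hypothetical, and treating the values as decimal integers is inferred from the examples above rather than stated by the format description.

```python
SUPPORTED_KEYWORDS = ("ro", "ts")  # relative orientation, text shaping

def parse_attribs(field: str) -> dict[str, int]:
    """Parse a second field such as ATTRIBS:ro=8,ts=10025 into a dict."""
    keyword, sep, body = field.partition(":")
    if keyword != "ATTRIBS" or not sep:
        raise ValueError(f"not an ATTRIBS instruction: {field!r}")
    attribs = {}
    for pair in body.split(","):
        key, eq, value = pair.partition("=")
        if key not in SUPPORTED_KEYWORDS or not eq:
            raise ValueError(f"unsupported attribute: {pair!r}")
        attribs[key] = int(value)  # assumption: values are decimal integers
    return attribs
```

A missing ts key in the result corresponds to the case where the shaping method is set automatically based on the Unicode script.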

Third Field

Within the third field, the associated font can be identified by its name or Unique Font Identifier (fuid). To accomplish this, one of the following four instructions can be used:

FONTFUID

Specifies the font by its Unique Font Identifier (fuid). This instruction also performs some basic validation on the font file and attempts to activate it. As a result, missing, inaccessible, invalid and/or corrupt fonts are rejected as suitable fonts.

Example

FONTFUID:S001R_ARIAL_001A

This instruction selects a font whose Unique Font Identifier in D-Type's Font Catalog is S001R_ARIAL_001A.

FASTFUID

Same as FONTFUID but noticeably faster since it does not perform any validation on the font file and does not attempt to activate it. Thus, any specified font is accepted as a suitable font.

Example

FASTFUID:S001R_ARIAL_001A

This instruction selects a font whose Unique Font Identifier in D-Type's Font Catalog is S001R_ARIAL_001A.

FONTNAME

Specifies the font by its name. This instruction also performs some basic validation on the font file and attempts to activate it. As a result, missing, inaccessible, invalid and/or corrupt fonts are rejected as suitable fonts. This instruction is slower than FONTFUID and FASTFUID since it must access the font file and parse its header to extract the font name.

Example

FONTNAME:Times New Roman

This instruction selects a font whose name is Times New Roman.

FASTNAME

Same as FONTNAME but does not perform any validation on the font and does not attempt to activate it. Thus, any specified font is accepted as a suitable font.

Example

FASTNAME:Times New Roman

This instruction selects a font whose name is Times New Roman.
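The difference between the validating instructions (FONTFUID, FONTNAME) and the fast ones (FASTFUID, FASTNAME) can be illustrated with a small Python sketch that picks the first suitable font from a list. The function pick_default_font is hypothetical, and the is_usable callback is only a stand-in for the validation and activation that D-Type performs internally.

```python
from typing import Callable, Optional

VALIDATING = {"FONTFUID", "FONTNAME"}  # font is validated and activated
FAST = {"FASTFUID", "FASTNAME"}        # font is accepted without checks

def pick_default_font(
    instructions: list[str],
    is_usable: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the first font instruction that yields a suitable font."""
    for instruction in instructions:
        keyword, _, value = instruction.partition(":")
        if keyword in FAST:
            return instruction  # any specified font is accepted
        if keyword in VALIDATING and is_usable(keyword, value):
            return instruction  # font passed validation
        # Otherwise the font is rejected; try the next one in the list.
    return None
```

Because a fast instruction is always accepted, any fonts listed after it would never be reached in this sketch.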

Multiple Fonts

To associate multiple fonts with the same Unicode script or character range, two different methods are available:

Method 1

Within the third field, multiple font instructions can be concatenated using the ;+ operator, as shown in the following example:

SCRIPTCODE:kana|ATTRIBS:ro=2|FONTNAME:Code2000;+FONTFUID:F0011_ARPLSH_NS0

Here we have a single table row that targets the kana Unicode script. Within the third field, however, there are two suitable fonts: the first one specified using the FONTNAME instruction (Code2000) and the second one specified using the FONTFUID instruction (F0011_ARPLSH_NS0).

Method 2

Multiple table rows that target the same Unicode script or character range, but assign different fonts, can be added to the Fontmap. This is shown in the following example:

SCRIPTCODE:kana|ATTRIBS:ro=2|FONTNAME:Code2000
SCRIPTCODE:kana|ATTRIBS:ro=2|FONTFUID:F0011_ARPLSH_NS0

Here we have two table rows that target the same Unicode script (kana). In the first row, the first suitable font is specified using the FONTNAME instruction (Code2000). In the second row, the second suitable font is specified using the FONTFUID instruction (F0011_ARPLSH_NS0). The final result is precisely the same as in the previous example.

Note that the first method is more compact as it requires only one row per Unicode script or character range. However, the second method is slightly more flexible as it provides the ability to specify different text layout attributes for different fonts.
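The equivalence of the two methods can be checked with a short Python sketch that collects the font instructions assigned to a given first field. The function fonts_for_target is hypothetical, and it assumes that rows targeting the same script or character range contribute their fonts in file order, which is consistent with the statement above that both methods give the same result.

```python
def fonts_for_target(rows: list[str], target: str) -> list[str]:
    """Collect font instructions for one first-field value, in file order."""
    fonts = []
    for row in rows:
        first, _attribs, third = row.split("|")
        if first == target:
            # A row may carry one font or several joined with ;+
            fonts.extend(third.split(";+"))
    return fonts

method_1 = [
    "SCRIPTCODE:kana|ATTRIBS:ro=2|FONTNAME:Code2000;+FONTFUID:F0011_ARPLSH_NS0",
]
method_2 = [
    "SCRIPTCODE:kana|ATTRIBS:ro=2|FONTNAME:Code2000",
    "SCRIPTCODE:kana|ATTRIBS:ro=2|FONTFUID:F0011_ARPLSH_NS0",
]
```

Calling fonts_for_target with SCRIPTCODE:kana on either list returns the same two font instructions in the same order. The sketch ignores the second field, so it does not show the extra flexibility of Method 2, where each row may carry its own attributes.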

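Important Notes

It is the client's responsibility to ensure that the fonts listed in the Fontmap have a sufficient number of glyphs to adequately represent all of the Unicode scripts and character ranges they are associated with. Only associate a font with a Unicode script or character range if you know it provides adequate support for that particular script or character range. You can use D-Type Font Viewer to check your font's support for the intended Unicode script or character range.

For more information on the Unique Font Identifiers (fuid), see the Configure Initial Font List document.

For more information on Unicode Scripts and script codes, relative orientation and text shaping, see the manual.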

 
