Character Set File Format

This is the format of the character set files used to convert Unicode characters to the operating system character set. The file must be in ASCII format. Line endings do not matter, nor does the case of the commands. These files are stored in the CSet folder inside the SoftLogik folder, and new ones may be created using an existing file as a template. It is recommended that all of the commands be included as shown (but with the values you want, if any) in every file.

This should be the first line in every character set file.

NAME "CP1250"
The name of the character set as it is used in scripts. It must be unique among all existing character sets, and it is recommended that it contain no spaces, since spaces in script variables can be tricky to deal with. The file name is normally the same as this name with a .txt extension, although the two may differ, and the extension is not important to PageStream.

GUINAME "Windows CP1250(EE)"
The name of the character set as it appears in the CharSet popup in the Type panel of the Preferences dialog box, as well as in any other location where a character set may be specified (such as the ASCII text filter's Import and Export dialog box).

BASESET "Windows"
The built-in character set to use as a starting point. It is not mandatory, but if it is given, only those characters which differ from the built-in character set need be listed. As the first 128 characters are almost always the same, this saves some redundancy. Also, some character sets differ from a built-in set only by very minor variations in the upper 128 characters. If you want to use a BASESET, choose the one closest to the character set you are trying to support. It must be one of Amiga, Macintosh, Windows, or MSDos.
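The effect of a base set can be pictured as a table merge. The following is a minimal sketch only, not PageStream's implementation: the function name build_table and the stand-in base table are hypothetical, and the real built-in sets are not reproduced here.

```python
def build_table(base, overrides):
    """Return a full OS-code -> Unicode table from a base set plus overrides."""
    table = dict(base)        # copy so the base set is left untouched
    table.update(overrides)   # entries listed in the file win over the base
    return table

# Stand-in base set (assumption): the first 128 codes map to themselves,
# as in ASCII. A real built-in set would also define the upper 128 codes.
base = {code: code for code in range(128)}

# Only the characters that differ from the base set are listed.
overrides = {207: 323}

table = build_table(base, overrides)
```

Because the merge copies the base table first, the same built-in set could be shared by several character set files without being modified.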

Defines what the host platform for this character set requires for a new line. Amiga character sets normally have a newlinetype of LF, Macintosh character sets a newlinetype of CR, and Windows (and MSDos) character sets a newlinetype of CRLF. The field is mainly used by the ASCII filter when it exports PageStream's newline to a text file.
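As an illustration of what the newline type means on export, here is a small sketch. It assumes the internal newline is represented as "\n", which is an assumption made for this example; the names NEWLINE_SEQUENCES and export_newlines are hypothetical and are not part of PageStream.

```python
# Conventional character sequences for the three newline types named above.
NEWLINE_SEQUENCES = {"LF": "\n", "CR": "\r", "CRLF": "\r\n"}

def export_newlines(text, newline_type):
    """Rewrite every line ending in text to the given newline type."""
    # Case does not matter, so "crlf" and "CRLF" are treated alike.
    ending = NEWLINE_SEQUENCES[newline_type.upper()]
    # Normalise first so that existing CRLF or CR endings are not doubled.
    normalised = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalised.replace("\n", ending)
```

For example, export_newlines("one\ntwo", "CRLF") gives "one\r\ntwo", the form a Windows text file would normally expect.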

The start of the character map.

# This line is for information only
Everything after the # symbol is treated as a comment and provides information for anyone reading the contents of the ASCII file. The comment is ignored by PageStream.

207,323
The operating system character number followed, after the comma separator, by the required Unicode character. Both numbers are given in decimal.

0x2E 0x002E # FULL STOP
This time the character numbers are given in hexadecimal (hex) format, and a space acts as the separator. The comment after the # explains that this entry defines the full stop punctuation mark. Unicode maps found on the internet are often expressed in hex rather than decimal, and in that case it is easier to use the hex format directly than to convert each number to decimal.
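The entry rules described above (an optional # comment, a comma or space separator, decimal or hex numbers) can be sketched as a small reader. This is an illustrative sketch rather than PageStream's own parser; the function name parse_map_line is hypothetical, and lines that are not two-number entries are simply skipped.

```python
import re

def parse_map_line(line):
    """Parse one map entry; return (os_code, unicode_code) or None."""
    # Drop everything from the first '#' onward, then trim whitespace.
    body = line.split("#", 1)[0].strip()
    if not body:
        return None  # blank line or comment-only line
    # Accept a comma or a run of whitespace as the separator.
    parts = [p for p in re.split(r"[,\s]+", body) if p]
    if len(parts) != 2:
        return None  # not a two-number map entry
    try:
        # Base 0 lets int() accept both decimal ("207") and hex ("0x2E").
        return int(parts[0], 0), int(parts[1], 0)
    except ValueError:
        return None  # e.g. a command line rather than a map entry
```

With this sketch, parse_map_line("0x2E 0x002E # FULL STOP") returns the pair (46, 46). One limitation of relying on int() with base 0 is that decimal numbers written with leading zeros, such as 007, are rejected and the line is skipped.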


Character Set File Format  Sub-Section  url:PGSuser/charactersets#fileformat
  created:2006-04-18 16:37:46   last updated:2006-08-13 21:19:03
  Copyright © 1985-2017 GrasshopperLLC. All Rights Reserved.
