4

One of the problems with PGN is that it requires ASCII, while names of players often contain non-ASCII characters. (And of course, many names are not written in the Latin alphabet ...)

The background to this question is the rendering of non-ASCII characters in Arena 3.5.1 (on Windows 7), as in the image below. The names for white and black are Mittelsträß, Jürg and 王, 玥, respectively. Arena reads the UTF-8-encoded file as if it were encoded as Windows code page 1252. Editors such as Notepad++, Sublime Text and jEdit identify the file as UTF-8 (which is the default encoding of all my plain text files anyway).

[Screenshot: Arena 3.5.1 displaying the non-ASCII player names incorrectly]
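The misrendering can be reproduced outside Arena. Here is a minimal Python sketch, assuming the reader simply decodes the UTF-8 bytes as Windows-1252 (that is what the screenshot suggests; I don't know what Arena actually does internally):

```python
# The White tag value, encoded as UTF-8 the way my editors save it.
white = "Mittelsträß, Jürg"
utf8_bytes = white.encode("utf-8")

# Decode the same bytes as Windows-1252 (assumed to be what Arena does).
misread = utf8_bytes.decode("cp1252")
print(misread)

# As long as every byte maps to a cp1252 character, the damage is reversible.
restored = misread.encode("cp1252").decode("utf-8")
assert restored == white
```

Each non-ASCII letter takes two bytes in UTF-8, so it shows up as two characters when the bytes are read as a single-byte code page.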

Here's an example PGN for testing:

[Event "Office Armageddon"]
[Site "Böblingen GER"]
[Date "??.??.??"]
[Round "?"]
[White "Mittelsträß, Jürg"]
[Black "王, 玥"]
[Result "*"]
[SetUp "1"]
[Mode "OTB"]
[Termination "unterminated"]

1. d2-d4 d7-d5 2. e2-e3 c7-c5 3. Bf1-b5+ Bc8-d7 4. a2-a4 Ng8-f6 5. d4xc5 e7-e5 6. b2-b4 Nb8-c6 7. Ng1-f3 Qd8-c7 8. Nb1-c3 a7-a6 9.  * 

(SCID assumes a system encoding if a PGN file is encoded in UTF-8 without a byte order mark (BOM), but the BOM makes no difference in Arena 3.5.1.)
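For anyone who wants to test both variants, this is a small Python sketch of the difference between the two files. The helper name `has_utf8_bom` is my own, not part of any chess tool:

```python
import codecs

pgn_tag = '[White "Mittelsträß, Jürg"]\n'

# Plain UTF-8: no signature at the start of the data.
without_bom = pgn_tag.encode("utf-8")

# "utf-8-sig" prepends the three-byte BOM (EF BB BF) when encoding.
with_bom = pgn_tag.encode("utf-8-sig")


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


print(has_utf8_bom(without_bom), has_utf8_bom(with_bom))
```

Apart from those three leading bytes, the two encodings of the same text are identical.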

So I'd like to know if there is a Unicode-compatible alternative (ideally UTF-8) that is supported by some chess programs. (If not, PGN needs an overhaul.)

Tsundoku

2 Answers

2

https://sourceforge.net/p/pgn4web/wiki/PGN_Support/

If you read that page, you will notice:

In order for pgn4web to display those characters correctly, the PGN file should be saved in Unicode UTF-8 format.

You can save a PGN file as UTF-8: just open Notepad, choose Save As..., and select UTF-8 under Encoding. You can then add the appropriate player names. I've tested this, and it works with Arena Chess and a few others.
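If you would rather not go through Notepad, the same thing can be done with a few lines of Python. This is only a sketch; the function name `save_pgn_utf8` and its `with_bom` switch are made up for illustration:

```python
import os
import tempfile


def save_pgn_utf8(path, pgn_text, with_bom=False):
    """Write PGN text as UTF-8, optionally with a leading BOM."""
    encoding = "utf-8-sig" if with_bom else "utf-8"
    # newline="\n" keeps line endings exactly as they are in pgn_text.
    with open(path, "w", encoding=encoding, newline="\n") as fh:
        fh.write(pgn_text)


# Usage: write a tag pair to a temporary file and read the raw bytes back.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.pgn")
    save_pgn_utf8(target, '[Black "王, 玥"]\n')
    with open(target, "rb") as fh:
        raw = fh.read()

print(raw)
```

Whether a given chess program then reads the file correctly is a separate question; this only controls how the bytes are written.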

Hope this solves your problem. ~CSS

cascading-style
  • I'm surprised that this is supposed to work in Arena. UTF-8 is the default encoding for all my plain text files, including PGN. When a PGN file contains an ü, Arena renders it as Ã¼ (regardless of whether I save the file as "UTF-8" or "UTF-8 without BOM"). Which Arena version are you using? – Tsundoku Nov 22 '16 at 22:46
  • @ChristopheStrobbe Arena 3.7.1 Alpha – cascading-style Nov 23 '16 at 00:14
  • I'll update my own version of Arena (3.0 Build 2542) in the next few days and test my UTF-8 encoded PGN again. – Tsundoku Nov 23 '16 at 10:31
  • I now have Arena 3.5.1 build 2861 (20/12/2015), on Windows 7 (32 bits), and ü is still rendered as Ã¼. – Tsundoku Nov 27 '16 at 17:37
  • I have updated my question with a screenshot of how Arena 3.5.1 build 2861 displays non-ASCII characters. It clearly treats my UTF-8 file as Windows Codepage 1252. – Tsundoku Nov 30 '16 at 12:46
  • 1
    So, yes, pgn4web games viewer supports UTF-8 and even supports Chinese characters. Thanks. – Tsundoku Dec 07 '16 at 18:06
  • Could you add a screenshot of Arena with my example PGN rendering correctly? – Tsundoku Jul 28 '17 at 13:23
2

PGN databases created by ChessBase programs are encoded as UTF-8 by default, too.

Archäopath
  • Do you have an example of a UTF-8 encoded PGN file that uses non-ASCII characters? As long as you use only ASCII in such files, the UTF-8 encoding is irrelevant, as far as I know. – Tsundoku Nov 23 '16 at 18:35
  • 2
    https://ingram-braun.net/public/programming/web/testpages/chess-game-viewer/replay/ – Archäopath Nov 23 '16 at 19:32
  • Yes, I created a test file with German special characters for PGN viewers embedded in web pages. The above website (sorry, hit Enter too early) shows the results. And I have no problems typing non-Latin characters in CB 14. – Archäopath Nov 23 '16 at 19:42
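
The remark in the comments that the UTF-8 encoding is irrelevant for ASCII-only files can be checked with a short Python sketch. The helper name `is_pure_ascii` is only illustrative:

```python
def is_pure_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range."""
    return all(byte < 0x80 for byte in data)


ascii_tag = b'[Site "Boeblingen GER"]\n'
utf8_tag = '[Site "Böblingen GER"]\n'.encode("utf-8")

# ASCII-only bytes decode to the same text under UTF-8 and Windows-1252.
same_for_ascii = ascii_tag.decode("utf-8") == ascii_tag.decode("cp1252")

# Once a non-ASCII character is present, the two decodings diverge.
same_for_utf8 = utf8_tag.decode("utf-8") == utf8_tag.decode("cp1252")

print(is_pure_ascii(ascii_tag), same_for_ascii)
print(is_pure_ascii(utf8_tag), same_for_utf8)
```

So a PGN file only exercises a program's UTF-8 support if it actually contains characters outside ASCII.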