Fossil

View Ticket

Ticket Hash: 6b498a792c0f9ca02b5b2c11f7ca9a6a2c01a0b1
Title: Cyrillic symbols do not display correctly
Status: Fixed
Type: Code_Defect
Severity: Severe
Priority: Medium
Subsystem:
Resolution: Fixed
Last Modified: 2010-02-16 20:03:28
Version Found In: c4c231069e
Description:
Occurs in wiki pages and other pages when the web server is started with fossil ui reponame.
The browser encoding is set to UTF-8.

<hr><i>ron added on 2010-02-03 12:17:56:</i><br>
Can you please post an example of UTF8 Cyrillic text which doesn't appear correctly?

Are you sure the font you are using for the web page supports the characters you want to display?

<hr><i>anonymous added on 2010-02-03 13:49:25:</i><br>
Example of cyrillic text: 
Пример печати примечаний

<hr><i>anonymous added on 2010-02-03 13:51:58:</i><br>
Example2 of cyrillic text: Пример печати комментария

<hr><i>anonymous added on 2010-02-03 13:54:10:</i><br>
I can't reproduce this problem :(

<hr><i>ron added on 2010-02-03 14:00:57:</i><br>
Nor can I.  Maybe it was a font problem.

<hr><i>anonymous added on 2010-02-03 14:01:22:</i><br>
Possible workaround: 
In Windows, before starting fossil ui reponame from cmd.exe, set the cmd.exe encoding to UTF-8 with chcp 65001

<hr><i>anonymous added on 2010-02-04 06:01:18:</i><br>
In my repository I can reproduce this. When creating a new ticket, the symbol 'й' in the title field may be printed incorrectly.
The same problem occurs in the comment field of a ticket, and chcp 65001 does not fix it.

<hr><i>ron added on 2010-02-04 06:09:06:</i><br>
It works fine for me, using [49cffc0187].  I can only assume you don't have the browser set to use UTF8 or something.  I can create a page with that name, and I can show content with that name.



<hr><i>anonymous added on 2010-02-04 06:37:25:</i><br>
The problem is present on Windows XP only.

Steps to reproduce the problem:

In Windows, start cmd.exe
In cmd.exe console:
fossil new test
fossil ui test

In the browser, select Tickets > New Ticket, and in the field 'Enter a one-line summary of the problem:' type Cyrillic text containing the symbol 'й', for example 'Это комментарий'.
Also type the text 'Это комментарий' in the detailed description field. Click the 'Submit' button.
Then, in the View Ticket form, look at the corrupted text in the Title field and the Description & Comments field.

<hr><i>anonymous added on 2010-02-04 06:43:37:</i><br>
It is also reproduced in Firefox, IE and Chrome

<hr><i>ron added on 2010-02-04 06:55:05:</i><br>
Can you please try this version: http://ronware.org/fossil.zip

I do not have Windows, but it runs correctly under Wine on Linux ... not that that means anything, but please give it a try.

<hr><i>anonymous added on 2010-02-04 07:50:02:</i><br>
I tried the version from http://ronware.org/fossil.zip and it has the same defect.

I can send you a screenshot with an example of the corrupted text.

<hr><i>ron added on 2010-02-04 09:10:57:</i><br>
Try that fossil.zip again -- I updated it with a small fix.  Also, please email me (ron @ ronware.org) the images of the corrupted text.

<hr><i>anonymous added on 2010-02-04 09:31:17:</i><br>
Screenshot emailed; the updated version has the defect too.

<hr><i>ron added on 2010-02-04 13:37:25:</i><br>
Can repro now, on XP (in a VirtualBox).

<hr><i>ron added on 2010-02-04 14:55:03:</i><br>
I think I understand the problem to some extent.

The final character is U+0439. Its low byte in UCS2 is 0x39 (which is ASCII '9'; the single-quote character is 0x27), and its UTF8 encoding is 0xD0 0xB9.  I don't know where, but my guess is that some cleaning code is stripping out the last byte.

What I don't understand is why it should fail on Windows but not on Linux.  I did confirm that XP can take the troublesome string and convert it to UCS2 and back to UTF8 without any loss, so that isn't the problem.
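
The byte values under discussion, and the lossless UCS2 round trip, can be checked directly. The following is a minimal sketch in Python rather than anything from Fossil itself; the helper name `roundtrip_via_ucs2` is made up for illustration, and UTF-16LE stands in for UCS2 (the two agree for characters in the Basic Multilingual Plane, which covers Cyrillic).

```python
# Inspect the bytes of U+0439 (CYRILLIC SMALL LETTER SHORT I) in two encodings.
ch = "\u0439"  # 'й'

utf8 = ch.encode("utf-8")
ucs2 = ch.encode("utf-16-le")  # stands in for UCS2; identical for BMP characters

print(utf8.hex(" "))                   # d0 b9
print(ucs2.hex(" "))                   # 39 04
print(hex(ord("'")), hex(ord("9")))    # 0x27 0x39


def roundtrip_via_ucs2(text: str) -> bytes:
    """UTF-8 -> UCS2 -> UTF-8, mirroring the lossless check described above.

    Hypothetical helper for illustration only.
    """
    raw = text.encode("utf-8")
    wide = raw.decode("utf-8").encode("utf-16-le")
    return wide.decode("utf-16-le").encode("utf-8")


sample = "Это комментарий"
print(roundtrip_via_ucs2(sample) == sample.encode("utf-8"))  # True
```

So 0x39 only appears as the low byte of the UCS2 form; in UTF8 the trailing byte of 'й' is 0xB9.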

<hr><i>ron added on 2010-02-15 20:29:15:</i><br>
The routine 'fossilize' seems to be causing grief:

<nowiki><pre>
fossilize:
[Это комментарий]
after fossilization:
[╨¡\╤é╨\╛\s╨║╨\╛╨╝╨╝╨\╡╨\╜\╤é╨░\╤Ç╨\╕╨]

removing backslashes:
[╨¡╤é╨╛s╨║╨╛╨╝╨╝╨╡╨╜╤é╨░╤Ç╨╕╨]
</pre></nowiki>
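
The box-drawing characters in this output are what UTF8 bytes look like when a console prints them in an OEM code page. Here is a minimal sketch in Python that maps the "removing backslashes" line back to bytes and compares it with the original string. It assumes code page 437, which matches the characters shown but is not stated in the ticket, and it assumes the lone `s` is what remains of the `\s` escape for the space.

```python
text = "Это комментарий"
utf8 = text.encode("utf-8")

# What a CP437 console would show for the intact UTF-8 bytes (assumed code page).
expected_display = utf8.decode("cp437")

# The "removing backslashes" line from the output above, without the brackets.
observed_display = "╨¡╤é╨╛s╨║╨╛╨╝╨╝╨╡╨╜╤é╨░╤Ç╨╕╨"

# Map the display back to bytes; 's' is assumed to be the remnant of '\s' (space).
observed = observed_display.encode("cp437").replace(b"s", b" ")

missing = utf8[len(observed):]
print(utf8.startswith(observed))  # True
print(missing.hex())              # b9
```

Under those assumptions, every byte survives except the final 0xB9, the second byte of 'й'.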

The last character is getting munged.  I am guessing that fossilize and unfossilize should be made UTF8 aware...
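
One way to picture a UTF8-safe escape routine is a sketch that works on raw bytes and only ever rewrites a fixed set of ASCII values. The code below is not Fossil's fossilize or unfossilize. The escape table and the names `escape_bytes` and `unescape_bytes` are assumptions made for illustration, loosely modelled on the `\s` visible in the output above.

```python
# Hypothetical escape table: exact ASCII byte values only, not Fossil's source.
ESCAPES = {
    0x00: b"\\0",
    0x5C: b"\\\\",  # backslash
    0x20: b"\\s",   # space
    0x09: b"\\t",
    0x0A: b"\\n",
    0x0D: b"\\r",
}
# Second byte of each escape sequence -> original byte value.
UNESCAPES = {seq[1]: value for value, seq in ESCAPES.items()}


def escape_bytes(data: bytes) -> bytes:
    """Escape listed ASCII bytes; all other bytes, including >= 0x80, pass through."""
    out = bytearray()
    for b in data:
        out += ESCAPES.get(b, bytes([b]))
    return bytes(out)


def unescape_bytes(data: bytes) -> bytes:
    """Reverse escape_bytes; an unknown escape keeps the byte after the backslash."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0x5C and i + 1 < len(data):
            out.append(UNESCAPES.get(data[i + 1], data[i + 1]))
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


raw = "Это комментарий".encode("utf-8")
escaped = escape_bytes(raw)
print(unescape_bytes(escaped) == raw)  # True
```

Because the lookup compares exact byte values, the lead and continuation bytes of multi-byte UTF8 sequences are never treated as special characters, and the round trip preserves the trailing 0xB9 of 'й'.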

<hr><i>ron added on 2010-02-16 10:08:30:</i><br>
Interesting:  this bug ONLY happens on XP.  I tried Vista and Windows 7, and on both it is ok.

I wonder if it is actually a problem in IE 6?  I'll try Firefox on XP to see if that fixes the problem.

<hr><i>ron added on 2010-02-16 10:56:14:</i><br>
No, Firefox has the same problem (as the OP noted above).  So... what's the difference between XP and later Windows?

<hr><i>ron added on 2010-02-16 20:03:28:</i><br>
checkin [0e2281fc8a]