21 Jul 2008

For every instance of a comma or hyphen I get the – character in the output.

How can I correct this in Jojo?


26 Jul 2008
Posts: 43

I'm still having problems: after cutting and pasting text from a web page, I have to go into MySQL and replace the – character with a hyphen or quote mark.

I thought Xinha would be set up to escape these characters and clean proprietary code from Word, etc.

How do I resolve this?


26 Jul 2008
Posts: 43

PS - For pages, I noticed that the pg_body table field has its punctuation replaced by the above character, but pg_body_code does not.

The same thing happens for articles: the ar_body content gets contaminated but the ar_bbbody content is fine.

In addition, if I go back to re-edit the pages in Xinha, the content gets repopulated with these characters again.

Is there a setting I haven't configured correctly? For example, should I be using UTF-8 encoding, or is there something in Xinha I need to change?


1 Jul 2009
Posts: 4

I think you need to use UTF-8 on your database. I use Jojo on a WAMP server with UTF-8 on the database and it displays Azerbaijani characters correctly.
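
To illustrate why the encoding matters, here is a rough Python sketch (not Jojo or Xinha code) of one common way this symptom arises: text stored as UTF-8 but read back as Windows-1252. That this is the cause in your case is only an assumption, and the function names are made up for the example.

```python
# Sketch: how a UTF-8 / Windows-1252 mismatch mangles "smart" punctuation.
# Assumption: the stray characters come from UTF-8 bytes being read as cp1252.

def misread_utf8_as_cp1252(text: str) -> str:
    """Encode as UTF-8, then wrongly decode the bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair_mojibake(text: str) -> str:
    """Undo the misreading above; return the input unchanged if it does not apply."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


curly = "it\u2019s"  # "it's" written with a typographic apostrophe (U+2019)
mangled = misread_utf8_as_cp1252(curly)
print(mangled)                            # the apostrophe becomes three junk characters
print(repair_mojibake(mangled) == curly)  # True
```

If the database, the connection and the page all agree on UTF-8, the round trip never goes through the wrong decoder in the first place, so nothing needs repairing.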



1 Jul 2009
Posts: 379

No web text editor that I'm aware of can cope with apostrophes and quote marks (particularly when imported from Word) and also deal effectively with non-English text without some manual intervention.

Xinha has one solution - Paste as Plain Text, which will convert most things (but not all - it struggles with soft line breaks) to the proper HTML entities.

It also has a Word format cleaning button, but it's not very effective and doesn't deal with HTML entities.

Neither is a magic bullet. Office programs (Open Office included) are crap. If you use them as a source, you have to clean the text and reformat.

I looked at implementing a full 'convert everything' option in Xinha, but it was too aggressive - it rendered entire Cyrillic texts into Unicode strings, which gave the server a heart attack.
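
To show the difference in rough terms, here is a small Python sketch. It is not Xinha's code: the mapping and function names are hypothetical, and it assumes "unicode strings" means numeric character references. It contrasts converting only a few Word-style punctuation marks with converting every non-ASCII character.

```python
# Hypothetical mapping: a handful of Word-style punctuation marks only.
WORD_PUNCTUATION = {
    "\u2018": "&lsquo;",   # left single quote
    "\u2019": "&rsquo;",   # right single quote / apostrophe
    "\u201c": "&ldquo;",   # left double quote
    "\u201d": "&rdquo;",   # right double quote
    "\u2013": "&ndash;",   # en dash
    "\u2014": "&mdash;",   # em dash
    "\u2026": "&hellip;",  # ellipsis
}


def convert_punctuation_only(text: str) -> str:
    """Replace only the mapped punctuation; leave all other characters alone."""
    return "".join(WORD_PUNCTUATION.get(ch, ch) for ch in text)


def convert_everything(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


sample = "\u201cПривет\u201d"  # a Cyrillic word inside curly quotes
print(convert_punctuation_only(sample))  # &ldquo;Привет&rdquo;
print(convert_everything(sample))        # &#8220;&#1055;&#1088;&#1080;&#1074;&#1077;&#1090;&#8221;
```

The second form turns each Cyrillic letter into seven characters of markup, so a long non-English text balloons in size and is unreadable in the database, which is roughly the problem described above.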
