Cardbox Talk




Set UTF-8 as default for Cardbox?


Charles Welling

18-May-2013 09:56

There's no need to elaborate on the fact that UTF-8 is the preferred encoding. Cardbox, however, still uses Windows-1252 as the default code page for its databases, for (so the Help file tells us) compatibility with older versions.

As the present version of Cardbox no longer supports the older non-Unicode versions of Windows and there's no reason why people should still use older versions of Cardbox, I'd say: let's make UTF-8 the default setting for Cardbox and let's finally rid ourselves of all other code pages.

Conversion to UTF-8 when rebuilding a database is NOT a watertight and permanent solution. The FMT does not register that a database was converted to UTF-8, and (re-)creating a database from such an FMT turns the resultant database back into Windows-1252. I did just that recently and imported a large number of records from a UTF-8 database into the new, supposedly UTF-8, database. The result was that all Unicode characters were gone. I did not find out until it was too late.
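The kind of loss described above can be reproduced outside Cardbox. The following Python sketch is not Cardbox's import code. It only assumes that text is forced into Windows-1252 (Python's `cp1252` codec) with every character that does not fit being replaced, which is one plausible way such characters could disappear. The function name `squeeze_into_cp1252` and the sample text are made up for the illustration.

```python
# Illustration only: what happens to text that is forced into Windows-1252.
# This does not show Cardbox internals; the "replace" policy is an assumption.

def squeeze_into_cp1252(text: str) -> str:
    """Round-trip text through Windows-1252, replacing anything that does not fit."""
    return text.encode("cp1252", errors="replace").decode("cp1252")

original = "café Ωmega Привет"
converted = squeeze_into_cp1252(original)

# Each character that cannot be represented becomes a single "?",
# so the two strings can be compared position by position.
lost = sum(1 for before, after in zip(original, converted) if before != after)
print(f"{lost} of {len(original)} characters were replaced")
```

Characters that Windows-1252 does contain, such as the accented "é", survive the round trip. The Greek and Cyrillic letters do not, and once the text has been stored in that form the originals cannot be recovered from it.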

Does anyone support my suggestion?


18-May-2013 10:35

I like your suggestion. I sent a similar request to Cardbox Support some years ago, but it was never honoured.

However, there is one pitfall: not all fonts support Unicode properly. So if you make Unicode the default, you introduce something that runs into the limitations of a default Windows setup.

Perhaps a better approach would be a warning that appears when you import Unicode records into a non-Unicode database.
An alternative could be a Cardbox option that you can switch on or off: "Create new databases in Unicode"

Btw: when you add a new field, its indexing still defaults to manual. Today the default should be automatic.


Charles Welling

18-May-2013 14:36

The fonts should be no problem. There are quite a few that support Unicode and what's more important: using a non-Unicode font does not change the database but only affects the display. So that's more or less harmless.

A warning that Unicode records are imported into a Windows-1252 database would help. When importing from a file the user is given a choice, but the import from a Cardbox window is automatic, which can lead, as in the case I described in my first post, to a certain amount of lost data.

Anyhow, it would be consistent with the international use of Cardbox, and with its ability to correctly store and index many different languages, if Cardbox presented Unicode as the first, best, obvious and default choice.

Christopher Spry

20-May-2013 11:41

Yes, UTF-8 does seem to be the way forward. I'm in favour.

When rebuilding a database, could Cardbox check the fonts being used and warn if any would be a problem, so that users could edit their formats to deal with this before doing the conversion? Font names in FMTs seem to be held in plain text.


© 2010 Cardbox Software Limited