>Postgres – Converting Encodings

>I’ve run into a number of problems recently with dealing with old databases of ours which are encoded with LATIN1. Now, with postgres 8.3 (maybe before) you’ll get a message if you try to create LATIN1 saying something like

createdb: database creation failed: ERROR:  encoding LATIN1 does not match server's locale en_US.UTF-8
DETAIL: The server's LC_CTYPE setting requires encoding UTF8.

I got bored trying to work out why – it seems to be that postgres now prevents what it shouldn’t have allowed in the past but did. But if you do want to convert from the old to the new locale – how do you do it? Remarkably simple it turns out.

Do a pg_dump of your existing database. Then take the dump file and run it through iconv – something like this.

 
iconv -f latin1 -t utf8 original.sql > converted.sql

It’s pretty obvious what the options mean (-f = from, -t = to). iconv comes as standard on Mac and should be available for most linux distos (seems to be installed on ubuntu server by default).

However, before you get too excited – you should ensure that whatever apps are using that database will cope with the new encoding for input and output. That may be ‘non-trivial’ 😉

Leave a Reply

Your email address will not be published. Required fields are marked *