Emacs, Nxml-Mode and Unicode

I've run into this too many times and fixed it just as many... grr. And I always forget how. Time to write it down.

I use James Clark's excellent nxml-mode to edit pretty much anything that's vaguely XML. For example, I usually convert HTML I have to edit into XHTML so I can use this mode.

Problem is, if you just start writing XML in that mode, the resulting file will be Unicode encoded. There is a fix for this: write proper XML :) Basically, first add this to your .emacs file:

(unify-8859-on-decoding-mode)

Next, make sure you have a proper XML header, with an explicit encoding, as the very first line of the file:

<?xml version="1.0" encoding="utf-8"?>

Tada! File is saved in proper form.
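
If you'd rather check than trust the "Tada", here is a minimal sketch of a command that reports which coding system Emacs intends to use when it saves the current buffer. The command name my-show-save-encoding is made up for this example; buffer-file-coding-system is the standard Emacs variable it reads. Treat it as a quick sanity check, not as part of the fix itself.

```elisp
;; Hypothetical helper for ~/.emacs; the name is an assumption, pick your own.
(defun my-show-save-encoding ()
  "Show the coding system the current buffer will be saved with."
  (interactive)
  ;; buffer-file-coding-system is buffer-local and holds a symbol
  ;; naming the coding system Emacs will use on save.
  (message "This buffer will be saved as: %s" buffer-file-coding-system))
```

Run it with M-x my-show-save-encoding in the XML buffer. The same information shows up via C-h v buffer-file-coding-system, so the helper is only a convenience.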

Just to be all proper and stuff, I use this header for my HTML/XHTML:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">