Wednesday, 31 October 2007

Unicode and fonts

Kenichi Handa has been hard at work on a Unicode-based Emacs for some years now. For Windows users, there is nothing radical in the default build: a few more languages are supported, along with a wider range of Unicode characters, but the Windows-specific code has only been updated enough to keep it working. In future, optimisations and simplifications can be made, since the internal encodings of Emacs and Windows are both based on Unicode. Messing around with code pages to get fonts displaying will be a thing of the past, and with the new font backend it already can be.

While work progressed in parallel on Emacs 22 and the Unicode codebase, several other developments were happening outside the core Emacs development team. Multiple terminal support (multi-tty) has already been merged with the CVS trunk, though it doesn't mean much to Windows users. Limitations in the way Windows handles console output mean that it is unlikely ever to provide much in the way of new features on Windows, though the multi-tty feature may make it possible to rid ourselves of the runemacs.exe hack without sacrificing console support.

Another new development was support for FreeType font rendering under X. On the face of it, this doesn't seem to mean much to Windows users either, but after being merged with the Unicode branch, it has been refactored into a new font backend design, with better support for Unicode fonts. Fonts are no longer defined by their character set; instead, Emacs can use font meta-data to determine which Unicode subranges each font supports. Currently the font backend is not enabled by default; it has to be switched on with a configure option. A backend has been implemented for Windows native fonts, and is ready for testing.

  • Checking out the Unicode branch:

    cvs co -r emacs-unicode-2 emacs

  • Building Emacs with font backend support:

    cd emacs\nt

    configure --enable-font-backend

    make bootstrap
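To make the point about font meta-data more concrete, here is a minimal sketch in C of how a per-font coverage mask could answer "does this font support this character?". It is not the Emacs implementation: the table, the bit assignments and the name font_covers_char are all hypothetical, and a real backend would fill the mask from whatever meta-data the platform reports for the font.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subrange table: each Unicode block is assigned one bit
   in a per-font coverage mask.  The bit numbers are made up for this
   sketch and do not follow any real font format. */
struct subrange {
    uint32_t first;   /* first code point in the block, inclusive */
    uint32_t last;    /* last code point in the block, inclusive */
    unsigned bit;     /* bit in the coverage mask for this block */
};

static const struct subrange subranges[] = {
    { 0x0000, 0x007F, 0 },   /* Basic Latin */
    { 0x0370, 0x03FF, 1 },   /* Greek and Coptic */
    { 0x0400, 0x04FF, 2 },   /* Cyrillic */
    { 0x4E00, 0x9FFF, 3 },   /* CJK Unified Ideographs */
};

/* Return 1 if the font's coverage mask claims the subrange containing
   code point c, and 0 if it does not or if c is in no listed subrange. */
int font_covers_char(uint32_t coverage, uint32_t c)
{
    for (size_t i = 0; i < sizeof subranges / sizeof subranges[0]; i++) {
        if (c >= subranges[i].first && c <= subranges[i].last) {
            assert(subranges[i].bit < 32);
            return (int)((coverage >> subranges[i].bit) & 1u);
        }
    }
    return 0;
}
```

With a mask of (1u << 0) | (1u << 1), the function accepts 'a' and U+03B1 (Greek alpha) but rejects U+0416 (Cyrillic Zhe), so a font can be selected by what it actually covers rather than by a character set name.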

Future work

The new font backend is noticeably slower on Windows. Much of this is probably because the old font code cached the font metrics for ASCII characters of fixed-width fonts, whereas the new font backend does no such caching yet. We can probably do a better job of caching by calculating which ranges of characters the fixed width applies to, rather than just ASCII. We might even allow multiple such range/width combinations to be associated with a font, to speed up CJK text display (where characters in the font are one of two widths).
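The range/width idea could look something like the following sketch in C. None of these names exist in Emacs; width_cache, width_cache_add and width_cache_lookup are invented for illustration, and the limit of four ranges per font is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-font cache of range/width pairs.  The limit of four
   ranges is an arbitrary choice for this sketch. */
#define MAX_WIDTH_RANGES 4

struct width_range {
    unsigned int first;   /* first code point, inclusive */
    unsigned int last;    /* last code point, inclusive */
    int width;            /* width in pixels shared by the whole range */
};

struct width_cache {
    struct width_range ranges[MAX_WIDTH_RANGES];
    size_t count;         /* number of ranges currently stored */
};

/* Record that every character in [first, last] has the given width.
   Returns 1 on success, 0 if the cache is already full. */
int width_cache_add(struct width_cache *cache,
                    unsigned int first, unsigned int last, int width)
{
    assert(first <= last);
    if (cache->count >= MAX_WIDTH_RANGES)
        return 0;
    cache->ranges[cache->count].first = first;
    cache->ranges[cache->count].last = last;
    cache->ranges[cache->count].width = width;
    cache->count++;
    return 1;
}

/* Return the cached width for code point c, or -1 if no range covers
   it and the caller has to ask the system for the real metrics. */
int width_cache_lookup(const struct width_cache *cache, unsigned int c)
{
    for (size_t i = 0; i < cache->count; i++) {
        if (c >= cache->ranges[i].first && c <= cache->ranges[i].last)
            return cache->ranges[i].width;
    }
    return -1;
}
```

A CJK font might register one range for the half-width characters and another for the ideographs, so that most lookups during redisplay are answered from the cache. The -1 return keeps the slow path explicit for anything the ranges do not cover.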

There is currently no support for BDF fonts in the new font backend. BDF fonts will be given a backend of their own, hopefully mostly reusable on other platforms.

A Uniscribe font backend may be introduced to enable some of the more advanced layout features in Windows XP and later. The new font backend design makes it easier to add new support like this without complex dynamic loading logic to support older versions of Windows.

