Inspired by Behdad's post “State of Text Rendering”, I have been working on a Pango-based text renderer for OBS called text-pango. Along the way you run into all the fun of lightly documented GNOME projects (Pango, Fontconfig, and HarfBuzz are all heavily influenced by the GNOME community).
First, let's take a look at the libraries we will be discussing. The lowest-level library in this
stack is FreeType. This library does all the legwork
of loading font files and extracting useful information in the form of an
FT_Face, which exposes all
of the various tables and information in the font on how to render specific glyphs (the images inside the font).
As you might expect, FreeType provides a naive
FT_Get_Char_Index to convert a Unicode codepoint to a
glyph index via the font’s character mapping (cmap) tables. If you are reading this then you probably know that
this mapping is only correct for the simplest of scripts; it doesn’t account for ligatures, connected
scripts like Arabic, combining characters, or even where to place the glyph image on the line.
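
To see why a per-codepoint lookup falls short, here is a toy sketch in Python (standard library only). It does not call FreeType; the dictionary and its glyph indices are made up for illustration, and the only behaviour borrowed from FT_Get_Char_Index is that an unmapped codepoint yields glyph index 0.

```python
# Hypothetical character map: codepoint -> glyph index.
# The indices are invented for this example; a real font defines its own.
TOY_CMAP = {
    ord("f"): 71,
    ord("i"): 74,
    ord("e"): 70,
    0x0301: 310,  # COMBINING ACUTE ACCENT
}


def naive_glyphs(text, cmap=TOY_CMAP):
    """Map each codepoint to one glyph index, with no knowledge of context.

    Unmapped codepoints become 0, the "missing glyph" index.
    """
    return [cmap.get(ord(ch), 0) for ch in text]


# "fi" stays two separate glyphs even if the font ships an "fi" ligature.
ligature_attempt = naive_glyphs("fi")

# "e" + combining accent gives two glyphs and no hint of where the accent goes.
accent_attempt = naive_glyphs("e\u0301")
```

The result is always exactly one glyph per codepoint, in input order, with no positions attached. Ligature substitution, contextual forms, and mark placement all need the extra font tables that a shaping engine reads.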
To clear this hurdle we move up the stack to HarfBuzz. The most
important element at this step is the function
hb_shape(), which takes a font
(created from an
FT_Face in most cases via the
hb_ft_* family of functions, but others are available)
and a buffer of Unicode codepoints. Once it completes you can read the glyph indices with hb_buffer_get_glyph_infos() and the precise glyph positions with
hb_buffer_get_glyph_positions(). With this it's easy, right?
Load a FreeType font, give your text buffer and font to HarfBuzz, and get perfect text back. Unfortunately,
HarfBuzz mentions a crucial restriction only in a tiny one-line blurb in its documentation for
hb_shape(): “[buffers] are sequences
of Unicode characters that use the same font and have the same text direction, script and language.”
We will call these buffers of the same font/direction/script/language ‘runs’ of text, and to divide our
text into runs we will need a few more libraries.
This is the layer at which Pango comes in. It comes equipped with a handy
pango_itemize() that you will probably never need to call directly, as there are much higher-level
and more useful functions which deal with this internally (such as layouts). It does precisely what is needed to segment our text into runs of
consistent font/direction/script/language. In order to accomplish this it leverages a few more libraries.
First is the amazing GNU FriBidi, which correctly breaks any text buffer into
runs of the appropriate direction à la the Unicode Bidirectional Algorithm.
Pango also allows for language specification and fallback, and attempts to guess script (scripts are writing systems)
spans in the text via GLib, as defined by
Unicode Standard Annex #24 (depending on how it was compiled, this may be
provided by the ICU library).
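
To make the idea of direction runs concrete, here is a toy sketch in Python (standard library only). It is not FriBidi and does not implement the Unicode Bidirectional Algorithm; it only groups characters by their strong bidi class and lets neutral characters such as spaces and digits stick to the preceding run. The function names are made up for illustration.

```python
import unicodedata


def strong_direction(ch):
    """Return "ltr" or "rtl" for strongly directional characters, else None."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class == "L":
        return "ltr"
    if bidi_class in ("R", "AL"):
        return "rtl"
    return None  # neutral or weak: spaces, punctuation, digits, ...


def direction_runs(text, base="ltr"):
    """Split text into (direction, substring) runs.

    Simplification: a neutral or weak character inherits the direction of
    the run before it (or the base direction at the start of the text).
    The real algorithm resolves these using the characters on both sides
    and the embedding levels.
    """
    runs = []
    current = base
    for ch in text:
        current = strong_direction(ch) or current
        if runs and runs[-1][0] == current:
            runs[-1][1].append(ch)
        else:
            runs.append((current, [ch]))
    return [(direction, "".join(chars)) for direction, chars in runs]


# Latin text, three Hebrew letters, then Latin text again.
mixed = direction_runs("abc \u05d0\u05d1\u05d2 def")
```

Each run produced this way still has to be split further by script, language, and font before it satisfies the one-run-per-buffer rule that shaping requires, which is the work pango_itemize() does in one pass.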
Interestingly, in all of these cases we have had to specify a font (presumably from FreeType). But what if that font doesn’t cover all those codepoints? At the HarfBuzz layer we presume it does, and HarfBuzz will do all it can to compose/decompose characters to fit what's available in the given font. You might say “FreeType must have functions for checking that”, and indeed FreeType exposes those character set and glyph tables. But it's infeasible to scan every font on the system for glyph coverage every time you want to render some text.
This is where Fontconfig
comes in. It provides two key functions: caching and querying font metrics and codepoint coverage, and providing user-configurable
font fallback. When you query Fontconfig with an FcPattern (a collection of attributes describing
a font: family, size, weight, style, coverage, and more) via
FcFontSort(), you receive a
list of fonts prioritized by their compatibility with your input pattern. Before calling this function you run the pattern through FcConfigSubstitute() and
FcDefaultSubstitute(), which allow user configuration and defaults to take
place before the metrics checking.
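
Once you have a prioritized font list with coverage information, fallback amounts to picking, for each character, the first font in the list that covers it. The sketch below shows that idea in Python (standard library only). It does not call Fontconfig; the font names and coverage sets are invented, and a plain set of codepoints stands in for the coverage data Fontconfig caches.

```python
def assign_fonts(text, sorted_fonts):
    """Group text into (font_name, substring) runs.

    sorted_fonts is a list of (name, coverage) pairs in priority order,
    where coverage is a set of codepoints. A character that no font
    covers is assigned the font name None.
    """
    runs = []
    for ch in text:
        font = next(
            (name for name, coverage in sorted_fonts if ord(ch) in coverage),
            None,
        )
        if runs and runs[-1][0] == font:
            runs[-1][1].append(ch)
        else:
            runs.append((font, [ch]))
    return [(font, "".join(chars)) for font, chars in runs]


# Hypothetical sorted font list: a Latin font first, an emoji font second.
FONTS = [
    ("Example Sans", set(range(0x20, 0x7F))),
    ("Example Emoji", {0x1F600}),
]

fallback_runs = assign_fonts("hi \U0001F600", FONTS)
```

Here the Latin text stays in the first font and only the emoji falls through to the second, so the font boundary becomes one more place where the text must be split into runs before shaping.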
And there you have it. Beautiful text rendered every time. Except what's this? Everything we have gone through so far simply translates your text characters into their corresponding glyph indexes in appropriate fonts. We have yet to actually deal with drawing anything to a texture, and indeed there is another kink here too. Font glyphs, as you might expect, come in a variety of formats: bitmaps, Bézier curves, embedded PNG, JPEG, or TIFF images, even sets of other glyphs and full SVGs.
Each operating system supports rendering engines to draw these glyphs, from GDI+ and DirectWrite on Windows to Core Text on macOS and Cairo on all platforms. I briefly cover Cairo as it’s what's used on almost all Free Software stacks, even those that might otherwise leverage OS interfaces. Cairo comes with many different rendering primitives, which usually lets it map font glyph descriptions to drawing primitives fairly directly. Cairo recently added support for decoding PNGs, giving it the ability to render fonts containing these (found in emoji fonts using CBDT/CBLC or sbix) in addition to the more traditional bitmap and Bézier-based vector fonts.
Unfortunately it doesn’t always end so nicely. We are in the middle of a four-way dispute over color emoji. As mentioned, the Free Software stack relies on Cairo for rendering in most cases. However, some font “standards” require renderers to be capable of full SVG rasterization, e.g. SVGinOT fonts. As such, these fonts are often left only to browsers, which can afford to include everything and the kitchen sink and in fact usually rely on multiple renderers already.
I hope this was an interesting whirlwind tour of text rendering in free software. This is mostly just a dump of all the things I’ve noticed as I examined implementing things on a few levels before deciding on simply using Pango. I hope it informs others about the issues they may face or need to solve in their own font rendering travels, or just informs them of what's out there. Good luck, and have fun~