The ruby I’m talking about is not the Ruby programming language. I’m a native Mandarin speaker, so this element is pretty relevant to me. If you have seen how East Asian glyphs are annotated with pronunciation guides, then this shouldn’t be too foreign, but just in case you have no idea what I’m talking about, here’s an image from the manga Slam Dunk.
See those little glosses along the main text? Those are known as ruby characters (sometimes also called rubi). These short runs of text alongside the base text are usually used in East Asian documents as pronunciation guides.
East Asian writing systems
The Japanese writing system is relatively complex, in that there are 2 kinds of syllabary, hiragana and katakana. There are also kanji, which are adopted from Chinese Han characters. As a single kanji could potentially have a variety of readings, ruby characters help indicate how it should be read. There is also the romanised version known as romaji.
For Chinese, there are 2 phonetic systems used for ruby characters: Pinyin, the romanisation used in mainland China, and Zhuyin, the phonetic symbols used in Taiwan.
The Korean language also adopts some Chinese Han characters, and these are known as hanja. The Korean alphabet is hangul, which is the official script of both South and North Korea. Hangul is usually used as the ruby annotation for hanja characters. There is also a romanised version known as romaja.
East Asian languages on the web
The W3C recommendation for ruby characters came about in 2001 to address formatting for such writing systems on the web. The specification defines markup required to define the association between the base text and the ruby text. This also allows for styling of content marked up in this manner with CSS.
The initial working draft of the CSS Ruby module also came about in 2001 and became a candidate recommendation in 2003. The standard has evolved over the years and now we have the CSS Ruby Layout Module Level 1.
You can refer to the HTML specification for ruby for all the details. I found that the Japanese version of the HTML specification provides better examples though. Even if you don’t understand Japanese, there are images that make it much clearer how each element should be used.
Browser support for <ruby>
As of the time of writing, caniuse.com shows there is at least partial support in all major browsers, with Firefox fully supporting this element. No love from Opera Mini though. If you want details on how browsers support and render <ruby>, check out the W3C browser tests.
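There are a number of ruby elements defined in the HTML specification. The summary below describes what each of these elements does.
<ruby>: Wraps the base text together with its ruby annotations.
<rbc>: Ruby base container, which holds the <rb> elements in the complex ruby markup use-case. Only one <rbc> element should appear in a <ruby> element.
<rb>: Marks the base text. Multiple <rb> elements are allowed in an <rbc> element, and each <rb> element has a corresponding <rt> element.
<rtc>: Ruby text container, which holds the <rt> elements in the complex ruby markup use-case. A maximum of 2 <rtc> elements can appear in a <ruby> element.
<rt>: Marks the ruby text. Cannot have <ruby> elements as children. Has the rbspan attribute for use in complex ruby markup, which allows an <rt> element to span multiple <rb> elements.
<rp>: Ruby parentheses, shown as a fallback by browsers that do not support ruby.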
Basic ruby markup is supported by all major browsers except Opera Mini.
<ruby> <rb>大马女子篮球队</rb> <rt>dà mǎ nǚ zǐ lán qiú duì</rt> </ruby>
The complex ruby markup (use of the <rbc> and <rtc> elements) is only fully supported by Firefox, so if you’re using any other browser, the example should have parentheses around the ruby annotations, and the formatting may look quite bad. In Firefox, the <rp> element tells the browser to hide those parentheses, since they are only there as a fallback.
<ruby> <ruby xml:lang="zh"> <rbc> <rb>大</rb><rp>(</rp><rt>dà</rt><rp>)</rp> <rb>马</rb><rp>(</rp><rt>mǎ</rt><rp>)</rp> <rb>女</rb><rp>(</rp><rt>nǚ</rt><rp>)</rp> <rb>子</rb><rp>(</rp><rt>zǐ</rt><rp>)</rp> <rb>篮</rb><rp>(</rp><rt>lán</rt><rp>)</rp> <rb>球</rb><rp>(</rp><rt>qiú</rt><rp>)</rp> <rb>队</rb><rp>(</rp><rt>duì</rt><rp>)</rp> </rbc> </ruby> <rtc xml:lang="en" style="ruby-position: under;"> <rp>(</rp><rt>Malaysia Women's Basketball Team</rt><rp>)</rp> </rtc> </ruby>
For the benefit of people who do not have Firefox installed, here’s how the markup is supposed to be rendered.
If you noticed the additional ruby-position style applied to the bottom ruby text, that’s because the default position of ruby text is above the base text. In this case, there are 2 lines of ruby text and they would overlap each other. Setting ruby-position: under; moves that line under the base text instead.
Styling <ruby> elements
There are 3 formatting properties we can use with ruby elements: ruby-position, ruby-merge and ruby-align.
The ruby-position property only applies to ruby annotation containers, such as <rtc>, and controls the position of ruby text with respect to its base. As of the time of writing, this only works in Firefox.
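To make ruby-position a bit more concrete, here is a minimal sketch that sets it from a stylesheet rather than inline. The class name annotation-under is made up for illustration, and the sketch relies on ruby-position being an inherited property, so setting it on the <ruby> element should carry down to its annotations.

```html
<style>
  /* "annotation-under" is a hypothetical class name, not part of any spec */
  .annotation-under {
    ruby-position: under; /* the default is over (above the base text) */
  }
</style>

<!-- The <rp> parentheses only show up where ruby is unsupported -->
<ruby class="annotation-under">
  <rb>篮球</rb><rp>(</rp><rt>lán qiú</rt><rp>)</rp>
</ruby>
```

In a browser that supports the property, the pinyin should sit below 篮球 instead of above it.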
The ruby-merge property controls how ruby boxes should be rendered if there are multiple ruby containers adjacent to each other. This property is essentially not implemented in any browser, simply because the only value that works is separate, which is the default value.
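As a rough sketch of two markups that should be equivalent under separate, here are two characters borrowed from the earlier example. Since ruby-merge isn't really implemented anywhere, treat this as illustrative rather than tested.

```html
<!-- One ruby element per character -->
<ruby><rb>篮</rb><rt>lán</rt></ruby><ruby><rb>球</rb><rt>qiú</rt></ruby>

<!-- One ruby element holding paired bases and annotations -->
<ruby style="ruby-merge: separate;">
  <rb>篮</rb><rb>球</rb>
  <rt>lán</rt><rt>qiú</rt>
</ruby>
```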
The above two code examples mean exactly the same thing. Each annotation box is in the same column as its corresponding base box, and this style is called “mono ruby”.
There is also the collapse value, which concatenates all ruby annotation boxes within the same ruby segment on the same line. This combined annotation box spans across all of the corresponding base boxes.
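Along the same lines, this is a sketch of two markups that should be equivalent under collapse, again using characters from the earlier example and again untested, given the lack of browser support.

```html
<!-- A single base with a single annotation -->
<ruby><rb>篮球</rb><rt>lánqiú</rt></ruby>

<!-- Paired bases and annotations, with the annotations merged into one box -->
<ruby style="ruby-merge: collapse;">
  <rb>篮</rb><rb>球</rb>
  <rt>lán</rt><rt>qiú</rt>
</ruby>
```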
The above two code examples mean exactly the same thing. The annotations are combined into a single box that spans all of their base boxes, and this style is called “group ruby”.
The auto value cedes the rendering style to the user agent, depending on the length of the annotation box with respect to the base box.
The ruby-align property dictates the distribution of ruby boxes when their contents do not fill their respective boxes exactly. Currently, Firefox is the only browser that supports this property.
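Here is a small sketch of the ruby-align values defined in the CSS Ruby module, applied to a deliberately short annotation so the difference has room to show. The class names are made up for illustration.

```html
<style>
  /* Hypothetical class names, one per ruby-align value */
  .align-start   { ruby-align: start; }
  .align-center  { ruby-align: center; }
  .align-between { ruby-align: space-between; }
  .align-around  { ruby-align: space-around; } /* the initial value */
</style>

<!-- The annotation is much shorter than the base, so it has space to distribute -->
<ruby class="align-start"><rb>大马女子篮球队</rb><rt>dà mǎ</rt></ruby>
<ruby class="align-center"><rb>大马女子篮球队</rb><rt>dà mǎ</rt></ruby>
<ruby class="align-between"><rb>大马女子篮球队</rb><rt>dà mǎ</rt></ruby>
<ruby class="align-around"><rb>大马女子篮球队</rb><rt>dà mǎ</rt></ruby>
```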
Support for the <ruby> element has improved quite a lot since 2010, and although the more complex markup and styling options are limited to Firefox, at least nearly all browsers can display basic ruby markup now. Below is a list of relevant resources for the HTML <ruby> element if you’re interested in finding out more.