XML safe entities with TinyMCE

Currently I use RSS 1.0, with the full content of each blog entry stored in the content:encoded property of its item. Of course, in XML documents everything must be nested properly, ampersands need to be encoded, etc. People find all kinds of strange ways of getting around this requirement so that tag soup can be placed in RSS documents. The content:encoded property kind of allows that: if there is a mismatched tag, the RSS file will still work.

However, the requirement for properly encoded characters remains. TinyMCE to the rescue! By default, TinyMCE will ensure that anything that needs to be an entity is an entity. If you type £, it generates &pound;. If you type &, it generates &amp;. XML doesn’t support all the entities that HTML does, though: it predefines only &amp;, &lt;, &gt;, &quot; and &apos;, so &pound; is an invalid XML entity unless a DTD declares it. However, the numeric version of the entity, &#163;, works and is valid.
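To make the rule concrete, here is a minimal sketch of XML-safe encoding in plain JavaScript. It is not how TinyMCE implements it; the helper name xmlEncode and the choice to encode every non-ASCII character numerically are assumptions made for illustration.

```javascript
// The five named entities that XML itself predefines.
const XML_NAMED = { "&": "amp", "<": "lt", ">": "gt", '"': "quot", "'": "apos" };

// Hypothetical helper: encode text so it is safe inside an XML document.
// Characters with a predefined XML name use that name; any other
// non-ASCII character becomes a numeric entity such as &#163;.
function xmlEncode(text) {
  let out = "";
  for (const ch of text) {              // for...of iterates by code point
    const code = ch.codePointAt(0);
    if (XML_NAMED[ch]) {
      out += "&" + XML_NAMED[ch] + ";";
    } else if (code > 127) {
      out += "&#" + code + ";";
    } else {
      out += ch;
    }
  }
  return out;
}

console.log(xmlEncode("£5 & <b>"));     // prints "&#163;5 &amp; &lt;b&gt;"
```

An HTML-only name such as &pound; never appears in the output, which is what an XML parser without a DTD needs.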

To ensure valid RSS, I tried setting the entity_encoding : "numeric" option on TinyMCE. It kind of worked: it produced valid numeric entities instead of named entities. This option seems to have a serious bug, however: when you type an angle bracket, it doesn’t encode it. The "hack" that solved it was to use named entity encoding but to configure the names to produce numeric entities, keeping real names only for the first few characters in the list (nbsp, amp, quot, lt, gt).

entities : "160,nbsp,38,amp,34,quot,60,lt,62,gt,162,#162,8364,#8364,163,#163,165,#165,169,#169,174,#174,8482,#8482,8240,#8240,181,#181,183,#183,8226,#8226,8230,#8230,8242,#8242,8243,#8243,167,#167,182,#182,223,#223,8249,#8249,8250,#8250,171,#171,187,#187,8216,#8216,8217,#8217,8220,#8220,8221,#8221,8218,#8218,8222,#8222,8804,#8804,8805,#8805,8211,#8211,8212,#8212,175,#175,8254,#8254,164,#164,166,#166,168,#168,161,#161,191,#191,710,#710,732,#732,176,#176,8722,#8722,177,#177,247,#247,8260,#8260,215,#215,185,#185,178,#178,179,#179,188,#188,189,#189,190,#190,402,#402,8747,#8747,8721,#8721,8734,#8734,8730,#8730,8764,#8764,8773,#8773,8776,#8776,8800,#8800,8801,#8801,8712,#8712,8713,#8713,8715,#8715,8719,#8719,8743,#8743,8744,#8744,172,#172,8745,#8745,8746,#8746,8706,#8706,8704,#8704,8707,#8707,8709,#8709,8711,#8711,8727,#8727,8733,#8733,8736,#8736,180,#180,184,#184,170,#170,186,#186,8224,#8224,8225,#8225,192,#192,194,#194,195,#195,196,#196,197,#197,198,#198,199,#199,200,#200,202,#202,203,#203,204,#204,206,#206,207,#207,208,#208,209,#209,210,#210,212,#212,213,#213,214,#214,216,#216,338,#338,217,#217,219,#219,220,#220,376,#376,222,#222,224,#224,226,#226,227,#227,228,#228,229,#229,230,#230,231,#231,232,#232,234,#234,235,#235,236,#236,238,#238,239,#239,240,#240,241,#241,242,#242,244,#244,245,#245,246,#246,248,#248,339,#339,249,#249,251,#251,252,#252,254,#254,255,#255,914,#914,915,#915,916,#916,917,#917,918,#918,919,#919,920,#920,921,#921,922,#922,923,#923,924,#924,925,#925,926,#926,927,#927,928,#928,929,#929,931,#931,932,#932,933,#933,934,#934,935,#935,936,#936,937,#937,945,#945,946,#946,947,#947,948,#948,949,#949,950,#950,951,#951,952,#952,953,#953,954,#954,955,#955,956,#956,957,#957,958,#958,959,#959,960,#960,961,#961,962,#962,963,#963,964,#964,965,#965,966,#966,967,#967,968,#968,969,#969,8501,#8501,982,#982,8476,#8476,977,#977,978,#978,8472,#8472,8465,#8465,8592,#8592,8593,#8593,8594,#8594,8595,#8595,8596,#8596,8629,#8629,8656,#8656,8657,#8657,8658,#8658,8659,#8659,8660,#8660,8756,#8756,8834,#8834,8835,#8835,8836,#8836,8838,#8838,8839,#8839,8853,#8853,8855,#8855,8869,#8869,8901,#8901,8968,#8968,8969,#8969,8970,#8970,8971,#8971,9001,#9001,9002,#9002,9674,#9674,9824,#9824,9827,#9827,9829,#9829,9830,#9830,8194,#8194,8195,#8195,8201,#8201,8204,#8204,8205,#8205,8206,#8206,8207,#8207,173,#173,233,#233,",
entity_encoding : "named",
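The entities value follows a simple pattern: pairs of "code,name", where the name is either a real entity name or "#" followed by the same code. Rather than typing it by hand, it could be generated. The sketch below assumes a hypothetical helper, buildEntities, which is not part of TinyMCE, and uses only the first few codes from the list above.

```javascript
// Hypothetical helper: build a TinyMCE "entities" value in which a few
// code points keep their names and every other one maps to "#<code>".
function buildEntities(named, numericCodes) {
  const parts = [];
  for (const [code, name] of named) {
    parts.push(code, name);             // e.g. 38,amp
  }
  for (const code of numericCodes) {
    parts.push(code, "#" + code);       // e.g. 163,#163
  }
  return parts.join(",");
}

const entities = buildEntities(
  [[160, "nbsp"], [38, "amp"], [34, "quot"], [60, "lt"], [62, "gt"]],
  [162, 8364, 163, 165, 169]            // first few codes from the list above
);

console.log(entities);
// prints "160,nbsp,38,amp,34,quot,60,lt,62,gt,162,#162,8364,#8364,163,#163,165,#165,169,#169"
```

The resulting string would then be passed as the entities option alongside entity_encoding : "named", exactly as in the hand-written version.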
