I found a solution to the garbage characters showing up in my blog: the MTStripControlChars plugin, which translates the (would-be) Windows-1252 characters into the corresponding Unicode numeric entities. This fixed most of the weird garbage characters, but not all of them. The plugin's mapping table is incomplete, so I had to customize the MTStripControlChars file to add other character mappings, such as the copyright symbol, the registered trademark symbol, the letter 'e' with an accent (é), and a few others. If you want, you can download my MTStripControlChars.pl, where I added some more character mappings.
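To give a feel for what that translation amounts to, here is a minimal sketch in Python. It is not the plugin's Perl code, and the function name `cp1252_to_entities` is made up for illustration. It assumes the input is raw Windows-1252 bytes and leans on Python's built-in `cp1252` codec to find the matching Unicode code point.

```python
def cp1252_to_entities(raw: bytes) -> str:
    """Turn Windows-1252 bytes into ASCII text with numeric entities.

    Hypothetical helper for illustration only; the real plugin uses a
    hand-written Perl mapping table instead of a codec.
    """
    out = []
    for byte in raw:
        if byte < 0x80:
            # Plain ASCII passes through untouched.
            out.append(chr(byte))
            continue
        try:
            ch = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            # A few bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) are undefined
            # in Python's cp1252 codec; this sketch simply drops them.
            continue
        out.append(f"&#{ord(ch)};")
    return "".join(out)


# Curly quotes (0x93, 0x94) and the copyright sign (0xA9).
print(cp1252_to_entities(b"\x93Hi\x94 \xa9 2007"))
```

Dropping undefined bytes is just one possible choice; replacing them with a placeholder would work equally well.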
I had to break out the old ASCII chart of characters and perform some decimal-to-hexadecimal conversions, which then went into the MTStripControlChars.pl file. I haven't done decimal-to-hex conversions since my Assembly programming class in college. Ah, the memories. Anyway, you then simply put <$insert-MT-tag strip_controlchars="2"$> into various locations in the blog's template and presto bango, it works!
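If you would rather not do the decimal-to-hex conversion by hand, a couple of lines of Python will do it. This is only a convenience sketch; the helper name `to_hex_escape` is an assumption of mine and is not part of the plugin.

```python
def to_hex_escape(code: int) -> str:
    """Return the \\xNN escape for a byte value given in decimal."""
    if not 0 <= code <= 0xFF:
        raise ValueError("expected a single byte value (0-255)")
    # "02X" means: at least two digits, zero-padded, uppercase hex.
    return "\\x" + format(code, "02X")


print(to_hex_escape(169))  # \xA9, the copyright sign
print(int("A9", 16))       # 169, converting back the other way
```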
For example, for the blog's content you would change <$MTEntryBody$> to <$MTEntryBody strip_controlchars="2"$>. Just repeat for the Comments and Trackback sections.
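If you have a lot of templates, the same edit could be scripted. The following Python sketch is one way to do it, under a few assumptions: the function name `add_strip_attr` is hypothetical, the template is handled as a plain string, and the tag names other than MTEntryBody are whatever your own templates use for comments and trackbacks (MTCommentBody below is only an example).

```python
import re


def add_strip_attr(template: str, tag: str, level: str = "2") -> str:
    """Add strip_controlchars="<level>" to every <$tag ...$> in template.

    Tags that already carry the attribute are left alone, so running
    the function twice gives the same result as running it once.
    """
    pattern = re.compile(r"<\$" + re.escape(tag) + r"(\s[^$]*)?\$>")

    def repl(match):
        attrs = match.group(1) or ""
        if "strip_controlchars=" in attrs:
            return match.group(0)
        return f'<${tag}{attrs.rstrip()} strip_controlchars="{level}"$>'

    return pattern.sub(repl, template)


template = "<div><$MTEntryBody$></div><p><$MTCommentBody$></p>"
for name in ("MTEntryBody", "MTCommentBody"):  # second name is an example
    template = add_strip_attr(template, name)
print(template)
```

Keep a backup of the original templates before trying anything like this.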
Here is a useful chart that helped with the character mappings I did. I didn't map all of the characters, so some of them may look like gobbledygook unless I decide to go crazy and add them all to the MT plugin. Actually, the characters will display just fine, but you will see a stray character in front of each one - just ignore it. The ones I mapped don't display the stray character, or are probably already supported by most browsers. To add new entries to the plugin, you use this chart to look up the appropriate code and then add the new entry below the line that starts the mapping hash: my %windows_1252 = (
So, for example, I added this line of code for the copyright symbol (©):
'\xA9' => '&#169;',
(The comma after the last mapping is optional; Perl accepts it either way.)
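Writing those entries by hand gets tedious, so here is a small Python sketch that prints one for any character that exists in Windows-1252. It assumes the mapping values are decimal numeric entities, as described at the top of this post, and it copies the quoting style of the line above. The name `mapping_line` is hypothetical.

```python
def mapping_line(ch: str) -> str:
    """Build one mapping entry for a single Windows-1252 character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # Raises UnicodeEncodeError if the character is not in Windows-1252.
    byte = ch.encode("cp1252")[0]
    # Key: the byte as a two-digit hex escape. Value: the entity built
    # from the character's Unicode code point in decimal.
    return f"'\\x{byte:02X}' => '&#{ord(ch)};',"


for ch in "©®é™":
    print(mapping_line(ch))
```

Note that for the characters in the 128 - 159 range (such as ™) the byte value and the Unicode code point differ, which is exactly why the mapping table is needed.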
| Explanation of Symbol | Entity Encoding | Entity Looks Like | ASCII Encoding | ASCII Looks Like | Unicode Encoding | Unicode Looks Like | ALT+ASCII Number Key Combination | ALT+ASCII Looks Like |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Standard Keyboard Characters | | | | | | | | |
| double quotes | " | " | " | " | " | " | ALT+0034 | " |
| ampersand | & | & | & | & | & | & | n/a | n/a |
| less than sign | < | < | < | < | < | < | n/a | n/a |
| greater than sign | > | > | > | > | > | > | n/a | n/a |
| Codes 128 - 159 (Windows-1252, not ASCII) Not Supported By Some Browsers | | | | | | | | |
| euro sign | € | € | € | € | € | € | ALT+0128 | € |
| single low-9 quotation mark | n/a | n/a | ‚ | ‚ | ‚ | ‚ | ALT+0130 | ‚ |
| latin small f with hook - function | n/a | n/a | ƒ | ƒ | ƒ | ƒ | ALT+0131 | ƒ |
| double low-9 quotation mark | „ | „ | „ | „ | „ | „ | ALT+0132 | „ |
| horizontal ellipsis | … | … | … | … | … | … | ALT+0133 | … |
| dagger | † | † | † | † | † | † | ALT+0134 | † |
| double dagger | ‡ | ‡ | ‡ | ‡ | ‡ | ‡ | ALT+0135 | ‡ |
| circumflex accent | n/a | n/a | ˆ | ˆ | ˆ | ˆ | ALT+0136 | ˆ |
| per thousand sign | ‰ | ‰ | ‰ | ‰ | ‰ | ‰ | ALT+0137 | ‰ |
| latin cap S with caron | n/a | n/a | Š | Š | Š | Š | ALT+0138 | Š |
| left single angle quote | n/a | n/a | ‹ | ‹ | ‹ | ‹ | ALT+0139 | ‹ |
| latin cap ligature OE | n/a | n/a | Œ | Œ | Œ | Œ | ALT+0140 | Œ |
| latin cap Z with caron | n/a | n/a | Ž | Ž | Ž | Ž | ALT+0142 | Ž |
| left single quotation mark | ‘ | ‘ | ‘ | ‘ | ‘ | ‘ | ALT+0145 | ‘ |
| right single quotation mark | ’ | ’ | ’ | ’ | ’ | ’ | ALT+0146 | ’ |
| left double quotation mark | “ | “ | “ | “ | “ | “ | ALT+0147 | “ |
| right double quotation mark | ” | ” | ” | ” | ” | ” | ALT+0148 | ” |
| bullet | • | • | • | • | • | • | ALT+0149 | • |
| en dash | &ndash; | – | – | – | – | – | ALT+0150 | – |
| em dash | &mdash; | — | — | — | — | — | ALT+0151 | — |
| small tilde | n/a | n/a | ˜ | ˜ | ˜ | ˜ | ALT+0152 | ˜ |
| trade mark sign | ™ | ™ | ™ | ™ | ™ | ™ | ALT+0153 | ™ |
| latin small letter s with caron | n/a | n/a | š | š | š | š | ALT+0154 | š |
| right single angle quote | n/a | n/a | › | › | › | › | ALT+0155 | › |
| latin small letter oe | n/a | n/a | œ | œ | œ | œ | ALT+0156 | œ |
| latin small z with caron | n/a | n/a | ž | ž | ž | ž | ALT+0158 | ž |
| latin capital letter Y with diaeresis | n/a | n/a | Ÿ | Ÿ | Ÿ | Ÿ | ALT+0159 | Ÿ |
| Codes 160 - 255 (ISO 8859-1 / Latin-1, not ASCII) Supported By Most Browsers | | | | | | | | |
| non-breaking space | &nbsp; | | &#160; | | &#160; | | ALT+0160 | |
| inverted exclamation mark | ¡ | ¡ | ¡ | ¡ | ¡ | ¡ | ALT+0161 | ¡ |
| cent sign | ¢ | ¢ | ¢ | ¢ | ¢ | ¢ | ALT+0162 | ¢ |
| pound sign | £ | £ | £ | £ | £ | £ | ALT+0163 | £ |
| currency sign | ¤ | ¤ | ¤ | ¤ | ¤ | ¤ | ALT+0164 | ¤ |
| yen sign | ¥ | ¥ | ¥ | ¥ | ¥ | ¥ | ALT+0165 | ¥ |
| broken vertical bar | ¦ | ¦ | ¦ | ¦ | ¦ | ¦ | ALT+0166 | ¦ |
| section sign | § | § | § | § | § | § | ALT+0167 | § |
| spacing diaeresis - umlaut | ¨ | ¨ | ¨ | ¨ | ¨ | ¨ | ALT+0168 | ¨ |
| copyright sign | © | © | © | © | © | © | ALT+0169 | © |
| feminine ordinal indicator | ª | ª | ª | ª | ª | ª | ALT+0170 | ª |
| left double angle quotes | « | « | « | « | « | « | ALT+0171 | « |
| not sign | ¬ | ¬ | ¬ | ¬ | ¬ | ¬ | ALT+0172 | ¬ |
| registered trade mark sign | ® | ® | ® | ® | ® | ® | ALT+0174 | ® |
| spacing macron - overline | ¯ | ¯ | ¯ | ¯ | ¯ | ¯ | ALT+0175 | ¯ |
| degree sign | ° | ° | ° | ° | ° | ° | ALT+0176 | ° |
| plus-or-minus sign | ± | ± | ± | ± | ± | ± | ALT+0177 | ± |
| superscript two - squared | ² | ² | ² | ² | ² | ² | ALT+0178 | ² |
| superscript three - cubed | ³ | ³ | ³ | ³ | ³ | ³ | ALT+0179 | ³ |
| acute accent - spacing acute | ´ | ´ | ´ | ´ | ´ | ´ | ALT+0180 | ´ |
| micro sign | µ | µ | µ | µ | µ | µ | ALT+0181 | µ |
| pilcrow sign - paragraph sign | ¶ | ¶ | ¶ | ¶ | ¶ | ¶ | ALT+0182 | ¶ |
| middle dot - Georgian comma | · | · | · | · | · | · | ALT+0183 | · |
| spacing cedilla | ¸ | ¸ | ¸ | ¸ | ¸ | ¸ | ALT+0184 | ¸ |
| superscript one | ¹ | ¹ | ¹ | ¹ | ¹ | ¹ | ALT+0185 | ¹ |
| masculine ordinal indicator | º | º | º | º | º | º | ALT+0186 | º |
| right double angle quotes | » | » | » | » | » | » | ALT+0187 | » |
| fraction one quarter | ¼ | ¼ | ¼ | ¼ | ¼ | ¼ | ALT+0188 | ¼ |
| fraction one half | ½ | ½ | ½ | ½ | ½ | ½ | ALT+0189 | ½ |
| fraction three quarters | ¾ | ¾ | ¾ | ¾ | ¾ | ¾ | ALT+0190 | ¾ |
| inverted question mark | ¿ | ¿ | ¿ | ¿ | ¿ | ¿ | ALT+0191 | ¿ |
| latin capital letter A with grave | À | À | À | À | À | À | ALT+0192 | À |
| latin capital letter A with acute | Á | Á | Á | Á | Á | Á | ALT+0193 | Á |
| latin capital letter A with circumflex | Â | Â | Â | Â | Â | Â | ALT+0194 | Â |
| latin capital letter A with tilde | Ã | Ã | Ã | Ã | Ã | Ã | ALT+0195 | Ã |
| latin capital letter A with diaeresis | Ä | Ä | Ä | Ä | Ä | Ä | ALT+0196 | Ä |
| latin capital letter A with ring above | Å | Å | Å | Å | Å | Å | ALT+0197 | Å |
| latin capital letter AE | Æ | Æ | Æ | Æ | Æ | Æ | ALT+0198 | Æ |
| latin capital letter C with cedilla | Ç | Ç | Ç | Ç | Ç | Ç | ALT+0199 | Ç |
| latin capital letter E with grave | È | È | È | È | È | È | ALT+0200 | È |
| latin capital letter E with acute | É | É | É | É | É | É | ALT+0201 | É |
| latin capital letter E with circumflex | Ê | Ê | Ê | Ê | Ê | Ê | ALT+0202 | Ê |
| latin capital letter E with diaeresis | Ë | Ë | Ë | Ë | Ë | Ë | ALT+0203 | Ë |
| latin capital letter I with grave | Ì | Ì | Ì | Ì | Ì | Ì | ALT+0204 | Ì |
| latin capital letter I with acute | Í | Í | Í | Í | Í | Í | ALT+0205 | Í |
| latin capital letter I with circumflex | Î | Î | Î | Î | Î | Î | ALT+0206 | Î |
| latin capital letter I with diaeresis | Ï | Ï | Ï | Ï | Ï | Ï | ALT+0207 | Ï |
| latin capital letter ETH | Ð | Ð | Ð | Ð | Ð | Ð | ALT+0208 | Ð |
| latin capital letter N with tilde | Ñ | Ñ | Ñ | Ñ | Ñ | Ñ | ALT+0209 | Ñ |
| latin capital letter O with grave | Ò | Ò | Ò | Ò | Ò | Ò | ALT+0210 | Ò |
| latin capital letter O with acute | Ó | Ó | Ó | Ó | Ó | Ó | ALT+0211 | Ó |
| latin capital letter O with circumflex | Ô | Ô | Ô | Ô | Ô | Ô | ALT+0212 | Ô |
| latin capital letter O with tilde | Õ | Õ | Õ | Õ | Õ | Õ | ALT+0213 | Õ |
| latin capital letter O with diaeresis | Ö | Ö | Ö | Ö | Ö | Ö | ALT+0214 | Ö |
| multiplication sign | × | × | × | × | × | × | ALT+0215 | × |
| latin capital letter O with slash | Ø | Ø | Ø | Ø | Ø | Ø | ALT+0216 | Ø |
| latin capital letter U with grave | Ù | Ù | Ù | Ù | Ù | Ù | ALT+0217 | Ù |
| latin capital letter U with acute | Ú | Ú | Ú | Ú | Ú | Ú | ALT+0218 | Ú |
| latin capital letter U with circumflex | Û | Û | Û | Û | Û | Û | ALT+0219 | Û |
| latin capital letter U with diaeresis | Ü | Ü | Ü | Ü | Ü | Ü | ALT+0220 | Ü |
| latin capital letter Y with acute | Ý | Ý | Ý | Ý | Ý | Ý | ALT+0221 | Ý |
| latin capital letter THORN | Þ | Þ | Þ | Þ | Þ | Þ | ALT+0222 | Þ |
| latin small letter sharp s - ess-zed | ß | ß | ß | ß | ß | ß | ALT+0223 | ß |
| latin small letter a with grave | à | à | à | à | à | à | ALT+0224 | à |
| latin small letter a with acute | á | á | á | á | á | á | ALT+0225 | á |
| latin small letter a with circumflex | â | â | â | â | â | â | ALT+0226 | â |
| latin small letter a with tilde | ã | ã | ã | ã | ã | ã | ALT+0227 | ã |
| latin small letter a with diaeresis | ä | ä | ä | ä | ä | ä | ALT+0228 | ä |
| latin small letter a with ring above | å | å | å | å | å | å | ALT+0229 | å |
| latin small letter ae | æ | æ | æ | æ | æ | æ | ALT+0230 | æ |
| latin small letter c with cedilla | ç | ç | ç | ç | ç | ç | ALT+0231 | ç |
| latin small letter e with grave | è | è | è | è | è | è | ALT+0232 | è |
| latin small letter e with acute | é | é | é | é | é | é | ALT+0233 | é |
| latin small letter e with circumflex | ê | ê | ê | ê | ê | ê | ALT+0234 | ê |
| latin small letter e with diaeresis | ë | ë | ë | ë | ë | ë | ALT+0235 | ë |
| latin small letter i with grave | ì | ì | ì | ì | ì | ì | ALT+0236 | ì |
| latin small letter i with acute | í | í | í | í | í | í | ALT+0237 | í |
| latin small letter i with circumflex | î | î | î | î | î | î | ALT+0238 | î |
| latin small letter i with diaeresis | ï | ï | ï | ï | ï | ï | ALT+0239 | ï |
| latin small letter eth | ð | ð | ð | ð | ð | ð | ALT+0240 | ð |
| latin small letter n with tilde | ñ | ñ | ñ | ñ | ñ | ñ | ALT+0241 | ñ |
| latin small letter o with grave | ò | ò | ò | ò | ò | ò | ALT+0242 | ò |
| latin small letter o with acute | ó | ó | ó | ó | ó | ó | ALT+0243 | ó |
| latin small letter o with circumflex | ô | ô | ô | ô | ô | ô | ALT+0244 | ô |
| latin small letter o with tilde | õ | õ | õ | õ | õ | õ | ALT+0245 | õ |
| latin small letter o with diaeresis | ö | ö | ö | ö | ö | ö | ALT+0246 | ö |
| division sign | ÷ | ÷ | ÷ | ÷ | ÷ | ÷ | ALT+0247 | ÷ |
| latin small letter o with slash | ø | ø | ø | ø | ø | ø | ALT+0248 | ø |
| latin small letter u with grave | ù | ù | ù | ù | ù | ù | ALT+0249 | ù |
| latin small letter u with acute | ú | ú | ú | ú | ú | ú | ALT+0250 | ú |
| latin small letter u with circumflex | û | û | û | û | û | û | ALT+0251 | û |
| latin small letter u with diaeresis | ü | ü | ü | ü | ü | ü | ALT+0252 | ü |
| latin small letter y with acute | ý | ý | ý | ý | ý | ý | ALT+0253 | ý |
| latin small letter thorn | þ | þ | þ | þ | þ | þ | ALT+0254 | þ |
| latin small letter y with diaeresis | ÿ | ÿ | ÿ | ÿ | ÿ | ÿ | ALT+0255 | ÿ |
| Unicode Characters Supported By Most Browsers | | | | | | | | |
| not equal to | ≠ | ≠ | n/a | n/a | ≠ | ≠ | n/a | n/a |
| less-than or equal to | ≤ | ≤ | n/a | n/a | ≤ | ≤ | n/a | n/a |
| greater-than or equal to | ≥ | ≥ | n/a | n/a | ≥ | ≥ | n/a | n/a |
| black spade suit | ♠ | ♠ | n/a | n/a | ♠ | ♠ | n/a | n/a |
| black club suit, shamrock | ♣ | ♣ | n/a | n/a | ♣ | ♣ | n/a | n/a |
| black heart suit, valentine | ♥ | ♥ | n/a | n/a | ♥ | ♥ | n/a | n/a |
| black diamond suit | ♦ | ♦ | n/a | n/a | ♦ | ♦ | n/a | n/a |



Dear Tom,
Your MT plugin for stripping control characters is just what I'm looking for, but unfortunately I'm not able to download it... I get an error message of some kind.
Is there any possibility that you would email me the .pl file?
Seems to be one great piece of work you've done there!
Thanks in advance!
Best regards,
* hilde *
The MTStripControlChars.pl download should be fixed now. I had to rename it from .pl to .txt.
Hope it works for you! It seems to have fixed the problem for me. I was afraid I'd have to convert to UTF-8 - not a fun project - lots of issues moving MT to UTF-8.
Hi Tom,
I downloaded the plug-in to eliminate the garbage characters that are displayed whenever we post entries with prime and double prime quotation marks. However, it doesn't work. Any idea why? You can check the entries at:
http://72.10.53.208
Thanks
Wendy