I’ve been working on a web app that uses UTF-8 encoding and have been surprised at how little information is available about how to do internationalization that works with all browsers. (A.J. Flavell’s “FORM submission and i18n” article and related charset issues site were quite helpful.) Consider this my small contribution. Here’s the scenario for my app: users can enter a search string and it will search a database for matching entries. The search form includes a few other contols so there are a number of variations and potentially 20 named fields. I only want to show the relevant name/value pairs on the URL if possible. If I submit the form with the GET method, all fields are shown on the URL, even if their values are null, which is fairly ugly. I could use a POST, but then the URL can’t be sent to others and generate the same result. The search is also an idempotent transaction (it is just retrieving data and has no side-effects), so I’d prefer to use GET.
escape() function in the past to fix up ASCII characters that are not URL safe, but
escape() does not handle unicode characters well. In IE, unicode characters are suported, but the function generates a %unnnn format which is not well understood by servers. It would give %u03B1 for the previous example. What to do?
I found the
I just happened to think that all my mozilla and IE5.5 bookmarklets should probably be converted to using
encodeURIComponent() instead of
escape(). That would allow searching for non-ASCII characters.