Thursday, 18 August 2011

UTF-8 encoding of URI parameters in Tomcat

I spent quite a long time today trying to fix an encoding issue in one of our simple services. The service resource takes a couple of path parameters and can take two optional parameters from the servletRequest. One of the parameters is a String, which may contain non-ASCII characters encoded as UTF-8. Later on, the service checks whether the given String exists in a cache, and if it does, it lets us know in the response.

It turned out that the service wasn't able to match strings against the cache if they contained non-ASCII characters. To make a long story short, after lots of debugging I was sure that the problem was with the encoding of the request parameter, not the string from the cache. I wrote a Filter class that should be the first thing to pick up the request and, before anything else can read its parameters, set the encoding on the request to UTF-8. This, however, did not solve the issue (to read more about encoding filters, go to: http://blog.sidu.in/2007/05/tomcat-and-utf-8-encoded-uri-parameters.html). The service worked fine with Jetty, so I started looking for Tomcat-specific solutions, and finally found the following bit of XML:



<Connector
    maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
    enableLookups="false" redirectPort="8443" acceptCount="100"
    connectionTimeout="20000" disableUploadTimeout="true"
    URIEncoding="UTF-8"
/>


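What this does is set the URI encoding to "UTF-8" for the Connector in Tomcat's conf/server.xml. It makes sure that request parameters passed in the URL are decoded as UTF-8 before they are put into the parameter map of the servletRequest object. That is also why the filter made no difference: setCharacterEncoding on the request only covers the request body, while parameters in the URL are decoded by the Connector. A one-line change in a config file, outside of our codebase, after a whole day of investigation work, whew!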
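To see why the cache lookup failed, here is a minimal sketch in plain Java (no Tomcat involved). It assumes the Connector was falling back to ISO-8859-1, the documented default for URIEncoding in Tomcat versions of that era. The class name, the sample value "café" and the one-entry cache are made up for illustration; this is not our service code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class UriDecodingDemo {

    // Decodes a percent-encoded parameter value with the given charset,
    // roughly what a container has to do before filling the parameter map.
    static String decode(String rawValue, String charset) {
        try {
            return URLDecoder.decode(rawValue, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical cache with a single non-ASCII entry: "café".
        Set<String> cache = new HashSet<>(Collections.singleton("caf\u00e9"));

        // "café" as a browser would send it: UTF-8 bytes, percent-encoded.
        String raw = "caf%C3%A9";

        // Assumed old default: each byte becomes its own character.
        String asLatin1 = decode(raw, "ISO-8859-1");
        // What URIEncoding="UTF-8" asks for: the two bytes form one character.
        String asUtf8 = decode(raw, "UTF-8");

        System.out.println(cache.contains(asLatin1)); // false
        System.out.println(cache.contains(asUtf8));   // true
    }
}
```

The wrongly decoded value is one character longer than the cached one, so no amount of fiddling with the cache side would ever make the two match.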
