Garbled Chinese characters in Tomcat 7 vs. Tomcat 8

Author: admin  Category: IT Operations  Published: 2019-07-22 15:51

Background:

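The same program runs its queries fine on Tomcat 7 but returns no results on Tomcat 8. Debugging showed that the parameters arriving at the controller were garbled.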

Analysis:

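Our program uses the filter org.springframework.web.filter.CharacterEncodingFilter, and every encoding in the project is consistently UTF-8. However, the setCharacterEncoding call made by that filter only affects the request body, so it works for POST parameters but has no effect on GET query parameters. And our query happens to use GET!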

I then found two Tomcat configuration attributes, both set on the HTTP Connector:

Tomcat 7:

URIEncoding

This specifies the character encoding used to decode the URI bytes, after %xx decoding the URL. If not specified, ISO-8859-1 will be used.

This attribute sets the encoding used to decode URL parameters. If it is not specified, the default is ISO-8859-1.
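To make the effect of this default concrete, here is a minimal sketch of what happens when a UTF-8 encoded query value is percent-decoded with the wrong charset. It uses java.net.URLDecoder as a stand-in for Tomcat's own decoder, which is an assumption made for illustration; the class name QueryDecodingDemo and the sample value are hypothetical as well.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class: URLDecoder stands in for the container's URI decoding.
class QueryDecodingDemo {

    // Percent-decode a raw query value with the given charset name,
    // roughly the role URIEncoding plays for the connector.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) throws Exception {
        String original = "\u4e2d\u6587"; // the two characters of "Chinese" in Chinese

        // A browser sending UTF-8 puts these bytes on the wire.
        String raw = URLEncoder.encode(original, "UTF-8");
        System.out.println(raw); // %E4%B8%AD%E6%96%87

        String asUtf8 = decode(raw, "UTF-8");
        String asLatin1 = decode(raw, "ISO-8859-1");

        System.out.println(asUtf8.equals(original));   // true
        System.out.println(asLatin1.equals(original)); // false: each byte became one char
        System.out.println(asLatin1.length());         // 6
    }
}
```

A value decoded the second way no longer matches anything stored as UTF-8, which is one way a query can come back empty.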

useBodyEncodingForURI

This specifies if the encoding specified in contentType should be used for URI query parameters, instead of using the URIEncoding. This setting is present for compatibility with Tomcat 4.1.x, where the encoding specified in the contentType, or explicitly set using Request.setCharacterEncoding method was also used for the parameters from the URL. The default value is false.
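This attribute controls whether the encoding given in contentType, rather than URIEncoding, is used to decode the parameters in the URL. It exists mainly for compatibility with Tomcat 4.1.x, and its default is false. An encoding set explicitly with setCharacterEncoding is treated the same way as one declared in contentType.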

Tomcat 8:

URIEncoding

This specifies the character encoding used to decode the URI bytes, after %xx decoding the URL. If not specified, UTF-8 will be used unless the org.apache.catalina.STRICT_SERVLET_COMPLIANCE system property is set to true in which case ISO-8859-1 will be used.
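This attribute sets the encoding used to decode URL parameters. If it is not specified, the default is UTF-8, unless the system property org.apache.catalina.STRICT_SERVLET_COMPLIANCE is set to true, in which case ISO-8859-1 is used.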

useBodyEncodingForURI

This specifies if the encoding specified in contentType should be used for URI query parameters, instead of using the URIEncoding. This setting is present for compatibility with Tomcat 4.1.x, where the encoding specified in the contentType, or explicitly set using Request.setCharacterEncoding method was also used for the parameters from the URL. The default value is false.
Notes: 1) This setting is applied only to the query string of a request. Unlike URIEncoding it does not affect the path portion of a request URI. 2) If request character encoding is not known (is not provided by a browser and is not set by SetCharacterEncodingFilter or a similar filter using Request.setCharacterEncoding method), the default encoding is always “ISO-8859-1”. The URIEncoding setting has no effect on this default.

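Same as above. In addition, this attribute only affects the query parameters in the URL, not the path portion. If the request's character encoding cannot be determined, the default is "ISO-8859-1".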
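The rules quoted above can be summarized as a small decision function. This is only a simplified model of the documented behavior, not Tomcat's actual code: the class QueryCharsetRules, the method queryCharset and its parameters are all hypothetical, and it assumes the note about the ISO-8859-1 fallback applies to Tomcat 7 in the same way as to Tomcat 8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical model of the documented rules; not taken from Tomcat's source.
class QueryCharsetRules {

    /**
     * Returns the charset that would be used to decode query parameters.
     *
     * @param tomcatMajor           major version, e.g. 7 or 8
     * @param strictCompliance      value of org.apache.catalina.STRICT_SERVLET_COMPLIANCE
     * @param uriEncoding           the URIEncoding attribute, or null if not set
     * @param useBodyEncodingForURI the useBodyEncodingForURI attribute
     * @param requestEncoding       encoding from contentType or setCharacterEncoding,
     *                              or null if unknown
     */
    static Charset queryCharset(int tomcatMajor, boolean strictCompliance,
                                String uriEncoding, boolean useBodyEncodingForURI,
                                String requestEncoding) {
        if (useBodyEncodingForURI) {
            // The request encoding replaces URIEncoding for the query string.
            // If it is unknown, the fallback is ISO-8859-1, whatever URIEncoding says.
            return requestEncoding != null
                    ? Charset.forName(requestEncoding)
                    : StandardCharsets.ISO_8859_1;
        }
        if (uriEncoding != null) {
            return Charset.forName(uriEncoding);
        }
        // No explicit setting: the default depends on the version.
        boolean utf8ByDefault = tomcatMajor >= 8 && !strictCompliance;
        return utf8ByDefault ? StandardCharsets.UTF_8 : StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        // A UTF-8 filter with default connector settings on each version.
        System.out.println(queryCharset(7, false, null, false, "UTF-8")); // ISO-8859-1
        System.out.println(queryCharset(8, false, null, false, "UTF-8")); // UTF-8
    }
}
```

With default connector settings the filter's UTF-8 plays no part in decoding the query string on either version; only the version default differs.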

Conclusions:

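(1) URIEncoding and useBodyEncodingForURI are alternatives: for query parameters, enabling useBodyEncodingForURI means the request encoding is used instead of URIEncoding. useBodyEncodingForURI exists mainly for compatibility with old versions, so prefer URIEncoding.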

(2) Before Tomcat 8, URL parameters are decoded as ISO-8859-1 by default; in Tomcat 8 the default is UTF-8.

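(3) Calling request.setCharacterEncoding() in a filter only handles POST requests, that is, the request body. It has no effect on GET parameters unless useBodyEncodingForURI is enabled.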

(4) The definitive fix for garbled characters in Tomcat:

Before Tomcat 8: set URIEncoding plus a filter for POST, or set useBodyEncodingForURI plus a filter for POST.

Tomcat 8 and later: a filter for POST is all you need. Tomcat 9 behaves the same as Tomcat 8.
