How to fix garbled text when opening txt files on a phone: repairing garbled txt documents (Part 2)


recipients implementing this specification MUST support the character sets "ISO-8859-1" and "UTF-8".
In addition, [RFC 5987] section 3.2.1 specifies that percent-encoding follows the definition in RFC 3986, section 2.1, excerpted below:
A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component. A percent-encoded octet is encoded as a character triplet, consisting of the percent character "%" followed by the two hexadecimal digits representing that octet's numeric value. For example, "%20" is the percent-encoding for the binary octet "00100000" (ABNF: %x20), which in US-ASCII corresponds to the space character (SP). Section 2.4 describes when percent-encoding and decoding is applied.
Note that [RFC 3986] explicitly states that the space character is percent-encoded as %20.
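The rule quoted above can be illustrated with a minimal sketch. The class name `PercentEncoder` and its `encode` method are hypothetical names chosen for this example, not part of the JDK. It assumes the input is first converted to UTF-8 octets and that only the RFC 3986 "unreserved" characters (letters, digits, `-`, `.`, `_`, `~`) are left as they are; every other octet becomes a `%XX` triplet.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not a JDK class.
final class PercentEncoder {
    // RFC 3986 section 2.3 "unreserved" characters, which never need escaping.
    private static final String UNRESERVED =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    private PercentEncoder() {}

    /** Percent-encodes every UTF-8 octet of the input that is not an unreserved character. */
    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        for (byte b : input.getBytes(StandardCharsets.UTF_8)) {
            int octet = b & 0xFF; // treat the byte as an unsigned value
            if (octet < 0x80 && UNRESERVED.indexOf((char) octet) >= 0) {
                out.append((char) octet);
            } else {
                // "%" followed by the two hexadecimal digits of the octet
                out.append('%').append(HEX[octet >> 4]).append(HEX[octet & 0x0F]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("down test.txt")); // down%20test.txt
        // Two Chinese characters, written as Unicode escapes to keep the source ASCII-only.
        System.out.println(encode("\u4E0B\u8F7D")); // %E4%B8%8B%E8%BD%BD
    }
}
```

With this approach the space (octet 0x20) always comes out as `%20`, matching the example in the RFC text.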
However, another document, RFC 1866, section 8.2.1 "The form-urlencoded Media Type", specifies:
The default encoding for all forms is `application/x-www-form-urlencoded'. A form data set is represented in this media type as follows: 1. The form field names and values are escaped: space characters are replaced by `+', and then reserved characters are escaped as per [URL]
This requires that, in messages of type application/x-www-form-urlencoded, spaces be replaced with + and all other characters be escaped as defined in [URL]. Here [URL] refers to RFC 1738, and the latest URL-related document among its revisions is precisely [RFC 3986].
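The form-urlencoded behavior described here can be observed directly with the JDK's `java.net.URLEncoder` and `java.net.URLDecoder`. The sketch below assumes Java 10 or later (for the overloads that take a `Charset`); the wrapper class `FormEncodingDemo` and its method names are made up for this example.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; only URLEncoder and URLDecoder are real JDK APIs.
final class FormEncodingDemo {
    private FormEncodingDemo() {}

    /** Encodes a value the application/x-www-form-urlencoded way (a space becomes '+'). */
    static String formEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /** Decodes a form-urlencoded value; both '+' and "%20" come back as a space. */
    static String formDecode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(formEncode("down test"));    // down+test
        System.out.println(formEncode("1+1"));          // 1%2B1 (a literal plus is escaped)
        System.out.println(formDecode("down+test"));    // down test
        System.out.println(formDecode("down%20test"));  // down test
    }
}
```

Because a literal `+` is always escaped as `%2B`, any `+` left in the encoder's output can only stand for a space.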
This is why many documents describe the percent-encoded form of a space (white space) as either + or %20, for example:
w3schools: URL encoding normally replaces a space with a plus (+) sign or with %20.
MDN: Depending on the context, the character ' ' is translated to a '+' (like in the percent-encoding version used in an application/x-www-form-urlencoded message), or in '%20' like on URLs.
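So the question is: during development, how should we percent-encode the space character?
The author recommends following the latest document. The rule defined in [RFC 1866] applies only to the application/x-www-form-urlencoded type; as far as the definition of percent-encoding itself is concerned, [RFC 3986] is the authority. Therefore, wherever percent-encoding is required, the space character should be encoded as %20. There is also a Stack Overflow answer that supports this view: When to encode space to plus (+) or %20?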
3. Code in practice
With the theory in place, the code follows naturally. Here it is:
    @GetMapping("/downloadFile")
    public String download(String serverFileName, HttpServletRequest request, HttpServletResponse response) throws IOException {
        request.setCharacterEncoding("utf-8");
        response.setContentType("application/octet-stream");
        String clientFileName = fileService.getClientFileName(serverFileName);
        // Percent-encode the real file name; URLEncoder emits '+' for spaces, so turn those into %20
        String percentEncodedFileName = URLEncoder.encode(clientFileName, "utf-8").replaceAll("\\+", "%20");
        // Assemble the value of the Content-Disposition header
        StringBuilder contentDispositionValue = new StringBuilder();
        contentDispositionValue.append("attachment; filename=")
                .append(percentEncodedFileName)
                .append(";")
                .append("filename*=")
                .append("utf-8''")
                .append(percentEncodedFileName);
        response.setHeader("Content-disposition", contentDispositionValue.toString());
        // Write the file stream to the response
        try (InputStream inputStream = fileService.getInputStream(serverFileName);
             OutputStream outputStream = response.getOutputStream()) {
            IOUtils.copy(inputStream, outputStream);
        }
        return "OK!";
    }

The code is simple, but two points are worth explaining:
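Why is .replaceAll("\\+", "%20") still needed after calling URLEncoder.encode(clientFileName, "utf-8")? As established above, wherever percent-encoding is required, the space character should be encoded as %20, yet the documentation of the URLEncoder class explicitly states that it converts the space into +:
The space character " " is converted into a plus sign "{@code +}".
This is not really the JDK's fault, because its documentation says that it follows application/x-www-form-urlencoded (PHP has a similar function that behaves the same way):
Translates a string into {@code application/x-www-form-urlencoded} format using a specific encoding scheme. This method uses the
So here we use .replaceAll("\\+", "%20") to post-process the + signs so that the result complies with the percent-encoding rules of [RFC 3986]. Every step is spelled out here to make the point clear. You could of course implement your own PercentEncoder class instead; how far to go is up to you.
In the [RFC 6266] standard the value of filename= does not need to be encoded, so why is the value after filename= percent-encoded here? Recall from [RFC 6266] that when filename and filename* both appear, the latter is used, and a browser too old to support the new standard uses the former. Mainstream browsers now update themselves automatically, so most of them support the new standard, with the exception of old versions of IE. Old versions of IE handle the value by percent-decoding it and using the result. That is why the value of filename= is deliberately percent-encoded here: for compatibility with old versions of IE. PS: the author's own tests show that IE11 and Edge already support the new standard.
4. Browser testing
Based on the 2019 browser market share in China reported by statcounter (figure below), the author designed a file name containing Chinese, English, and spaces, 下载-down test .txt, to use for testing.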
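To see what header value the download handler above would send for this test file name, the encoding steps can be reproduced in a small standalone sketch. `ContentDispositionDemo` and its methods are hypothetical names for this example. It assumes Java 10 or later for `URLEncoder.encode(String, Charset)` and uses a plain `replace` instead of the regex-based `replaceAll`, which is equivalent for this purpose.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical standalone demo; it mirrors the encoding steps of the handler above.
final class ContentDispositionDemo {
    private ContentDispositionDemo() {}

    /** Form-encodes the name, then rewrites '+' (which can only mean a space) as %20. */
    static String percentEncode(String fileName) {
        return URLEncoder.encode(fileName, StandardCharsets.UTF_8).replace("+", "%20");
    }

    /** Builds the header value with both the legacy filename and the filename* parameter. */
    static String attachmentHeader(String fileName) {
        String encoded = percentEncode(fileName);
        return "attachment; filename=" + encoded + ";filename*=utf-8''" + encoded;
    }

    public static void main(String[] args) {
        // "\u4E0B\u8F7D" is the two-character Chinese prefix of the test file name.
        System.out.println(attachmentHeader("\u4E0B\u8F7D-down test .txt"));
    }
}
```

For the test file name this prints `attachment; filename=%E4%B8%8B%E8%BD%BD-down%20test%20.txt;filename*=utf-8''%E4%B8%8B%E8%BD%BD-down%20test%20.txt`. One caveat of relying on `URLEncoder`: it leaves `*` unescaped, although `*` is not among the characters RFC 5987 allows unencoded in an extended value, so file names containing `*` would need extra handling.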

