Encoding error when using requests in Python 3

Python 3 encoding

Background

Sending a request with requests raises the following exception:

UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-1: Body ('你好') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.
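
The error can be reproduced without any network access. The following is a minimal sketch, assuming the str body reaches the standard library's http.client unchanged (that module is where the message above originates). The host name "localhost" is only a placeholder: the encoding step runs before any socket is opened, so nothing is contacted.

```python
import http.client

# Creating the connection object does not open a socket; the body is
# encoded to Latin-1 before anything is sent.
conn = http.client.HTTPConnection("localhost")  # placeholder host, never contacted

error_message = None
try:
    conn.request("POST", "/", body="你好")  # str body containing non-Latin-1 characters
except UnicodeEncodeError as err:
    error_message = str(err)
    print(error_message)
finally:
    conn.close()
```

Passing `"你好".encode("utf-8")` as the body instead avoids the exception, because a bytes body is sent as is.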

Cause

  • By default a str body is transmitted as Latin-1: encode('latin-1') is called on it before the request is sent. The code responsible lives in the standard library's http.client, which requests uses underneath:

        if isinstance(body, str):
            # RFC 2616 Section 3.7.1 says that text default has a
            # default charset of iso-8859-1.
            body = _encode(body, 'body')

        def _encode(data, name='data'):
            """Call data.encode("latin-1") but show a better error message."""
            try:
                return data.encode("latin-1")
            except UnicodeEncodeError as err:
                raise UnicodeEncodeError(
                    err.encoding,
                    err.object,
                    err.start,
                    err.end,
                    "%s (%.20r) is not valid Latin-1. Use %s.encode('utf-8') "
                    "if you want to send it encoded in UTF-8." %
                    (name.title(), data[err.start:err.end], name)) from None

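  • The data is transferred as JSON

A str can be encoded directly as Latin-1 as long as every character falls within the Latin-1 range (code points 0 to 255). If it contains Chinese characters, the encoding fails.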
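
To make the Latin-1 limit concrete, here is a small sketch. The helper name `can_encode_latin1` is made up for illustration and is not part of requests or the standard library.

```python
def can_encode_latin1(text: str) -> bool:
    """Return True if every character of text fits in Latin-1 (code points 0-255)."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(can_encode_latin1("hello"))  # ASCII only
print(can_encode_latin1("café"))   # é is U+00E9, inside the Latin-1 range
print(can_encode_latin1("你好"))    # U+4F60 and U+597D are outside the range
```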

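  • The ensure_ascii parameter of json.dumps

This parameter controls whether non-ASCII characters are escaped. With the default ensure_ascii=True they are escaped as \uXXXX sequences; with ensure_ascii=False they are kept as is.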

  • If ``ensure_ascii`` is false, then the return value can contain non-ASCII characters if they appear in strings contained in ``obj``. Otherwise, all such characters are escaped in JSON strings.

        In [26]: json.dumps('你好')
        Out[26]: '"\\u4f60\\u597d"'
        In [27]: json.dumps('你好', ensure_ascii=False)
        Out[27]: '"你好"'
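
The escaped form loses no information. A short sketch, using a made-up payload, showing that both outputs decode back to the same object and that only the escaped one is pure ASCII:

```python
import json

original = {"greeting": "你好"}  # illustrative payload

escaped = json.dumps(original)                      # ensure_ascii=True is the default
literal = json.dumps(original, ensure_ascii=False)  # keeps the characters as is

print(escaped)
print(literal)

# Both forms decode to the same Python object.
assert json.loads(escaped) == json.loads(literal) == original

# The escaped form is pure ASCII, so it also fits in Latin-1.
escaped.encode("ascii")
```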

So for a value a that contains Chinese characters (here a = '你好'), encoding the result gives:

        # ensure_ascii=False
        json.dumps(a, ensure_ascii=False).encode('latin-1')
        UnicodeEncodeError: 'latin-1' codec can't encode characters in position 1-2: ordinal not in range(256)

        # ensure_ascii=True (the default)
        json.dumps(a).encode('latin-1')
        b'"\\u4f60\\u597d"'

Conclusion

  • When serializing to JSON, do not change ensure_ascii casually (for example to False)
  • A str body in a requests call is encoded as Latin-1 by default
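
If the unescaped form is needed, the error message itself points to the way out: encode the body explicitly as UTF-8 and send bytes. The following is a minimal sketch, not the only option. The helper `build_json_body` and the payload are made up for illustration, and the charset in the header is an assumption about what the receiving server expects.

```python
import json


def build_json_body(payload):
    """Serialize payload to UTF-8 bytes and return it with a matching header."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    headers = {"Content-Type": "application/json; charset=utf-8"}
    return body, headers


body, headers = build_json_body({"greeting": "你好"})
print(body)

# With requests, the result would be passed along these lines
# (url is a placeholder):
#   requests.post(url, data=body, headers=headers)
```

Because the body is already bytes, no Latin-1 encoding step is applied to it. requests also accepts a json= argument that serializes the payload itself, which avoids calling json.dumps by hand.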

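Author: 朝圣

Permalink: www.zh-noone.cn/2019/12/python3编码

Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 3.0. Please credit the source when reposting!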
