The configuration file is encoded as UTF-8 and contains the following:

    [section1]
    p1=中文字符串

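Reading it with config.readfp(open('cfg.ini')) produces garbled text.
The built-in open function in Python 2 cannot take an encoding, so use codecs.open instead and specify utf-8 as the encoding. This was tested successfully in Eclipse PyDev.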
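However, UTF-8 text files come in two forms: with a BOM and without one.
Older versions of Windows Notepad always write a BOM when saving as UTF-8. So that files edited in Notepad are still read correctly,
it is best to specify the encoding as utf-8-sig. The complete code is as follows: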

    import codecs
    import ConfigParser

    cfgfile = "cfg.ini"
    config = ConfigParser.ConfigParser()
    # utf-8-sig skips a leading BOM if present and otherwise behaves like utf-8
    config.readfp(codecs.open(cfgfile, "r", "utf-8-sig"))
    p1 = config.get("section1", "p1")
    print p1

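To see why the codec choice matters, here is a minimal sketch that writes the sample configuration to temporary files, once with a BOM and once without, and parses each. The helper names `read_p1` and `write_sample` are hypothetical, and the `try`/`except` import plus the `read_file` fallback are assumptions added so the sketch also runs on Python 3, where the module is named `configparser`. The non-ASCII value is written with `\u` escapes so that the source file needs no encoding declaration.

```python
import codecs
import os
import tempfile

try:
    import configparser  # Python 3 module name
except ImportError:
    import ConfigParser as configparser  # Python 2 module name


def read_p1(path, encoding):
    """Parse the INI file at `path` with `encoding` and return section1.p1."""
    config = configparser.ConfigParser()
    # Python 3 renamed readfp to read_file; fall back to readfp on Python 2.
    reader = getattr(config, "read_file", None) or config.readfp
    with codecs.open(path, "r", encoding) as fp:
        reader(fp)
    return config.get("section1", "p1")


def write_sample(with_bom):
    """Create a temporary UTF-8 INI file, optionally starting with a BOM."""
    body = u"[section1]\np1=\u4e2d\u6587\u5b57\u7b26\u4e32\n".encode("utf-8")
    fd, path = tempfile.mkstemp(suffix=".ini")
    with os.fdopen(fd, "wb") as fp:
        fp.write((codecs.BOM_UTF8 if with_bom else b"") + body)
    return path


bom_path = write_sample(with_bom=True)
plain_path = write_sample(with_bom=False)
try:
    # utf-8-sig reads both variants correctly.
    value_bom = read_p1(bom_path, "utf-8-sig")
    value_plain = read_p1(plain_path, "utf-8-sig")
    # Plain utf-8 leaves U+FEFF in front of "[section1]", so the parser
    # does not recognise the first line as a section header.
    try:
        read_p1(bom_path, "utf-8")
        bom_with_plain_utf8_failed = False
    except configparser.MissingSectionHeaderError:
        bom_with_plain_utf8_failed = True
finally:
    os.remove(bom_path)
    os.remove(plain_path)
```

In this sketch, both files yield the same value for `p1` under utf-8-sig, while the BOM file read as plain utf-8 is rejected because the parser sees no section header on the first line.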
About encodings.utf_8_sig:
This module implements a variant of the UTF-8 codec: On encoding a UTF-8 encoded BOM will be prepended to the UTF-8 encoded bytes. For the stateful encoder this is only done once (on the first write to the byte stream). For decoding an optional UTF-8 encoded BOM at the start of the data will be skipped.
(When decoding, a BOM at the start of the data is skipped if one is present.)
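The codec behaviour described above can be checked directly on byte strings, without any file or parser involved. This is a small sketch; the variable names are illustrative only, and the sample text is written with `\u` escapes so it runs unchanged on Python 2 and Python 3.

```python
import codecs

text = u"[section1]\np1=\u4e2d\u6587\u5b57\u7b26\u4e32\n"
plain_bytes = text.encode("utf-8")
bom_bytes = codecs.BOM_UTF8 + plain_bytes

# Decoding: utf-8-sig drops a leading BOM and also accepts input without one.
decoded_bom = bom_bytes.decode("utf-8-sig")
decoded_plain = plain_bytes.decode("utf-8-sig")

# Plain utf-8 keeps the BOM as the character U+FEFF at the start of the text.
decoded_raw = bom_bytes.decode("utf-8")

# Encoding: utf-8-sig prepends the three BOM bytes EF BB BF.
encoded_sig = text.encode("utf-8-sig")
```

Here `decoded_bom` and `decoded_plain` are both equal to `text`, whereas `decoded_raw` starts with an extra U+FEFF character, which is what ends up in front of `[section1]` when a BOM file is read as plain utf-8.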