【Solved】Extra blank lines in content written with csv writerow in Python | 在路上

 OneDayDayUp 2016-10-15

【Problem】

In Python, I write rows to a csv file with the csv module's writerow:

    #output all info dict list
    outputFp = open(gConst['csvFilename'], 'a+')
    csvWriter = csv.writer(outputFp, dialect='excel')
    for eachInfoDict in itemInfoDictList:
        fieldList = []
        fieldList.append(eachInfoDict['Lead Source'])
        ...
        logging.info("fieldList=%s", fieldList)
        csvWriter.writerow(fieldList)
    outputFp.close()

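However, in the resulting csv every row is followed by an extra blank line:

[Image: csv redundant row]

Opened in Excel, it looks like this:

[Image: redundant new line]

The goal is to get rid of these extra blank lines.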

【Solution Process】

1. First I looked up writerow in the documentation:

http://docs./2/library/csv.html#writer-objects

13.1.4. Writer Objects

Writer objects (DictWriter instances and objects returned by the writer() function) have the following public methods. A row must be a sequence of strings or numbers for Writer objects and a dictionary mapping fieldnames to strings or numbers (by passing them through str() first) for DictWriter objects. Note that complex numbers are written out surrounded by parens. This may cause some problems for other programs which read CSV files (assuming they support complex numbers at all).

csvwriter.writerow(row)

Write the row parameter to the writer’s file object, formatted according to the current dialect.

csvwriter.writerows(rows)

Write all the rows parameters (a list of row objects as described above) to the writer’s file object, formatted according to the current dialect.

 

That did not seem to help much, though.

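2. Then I noticed what the line endings in the output csv actually look like:

Each line ends with a CR,

and only after that comes a CRLF line break:

[Image: end of line is CR then CR LF]

So I needed to find out which part writes the CR and which part writes the CRLF.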
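One way to check what csv.writer emits on its own is to let it write into an in-memory buffer, where no platform newline translation is involved. The sketch below is only an illustration: it assumes Python 3 and `io.StringIO` (the code in this post is Python 2), and the sample rows are made up.

```python
import csv
import io

# In-memory text buffer: nothing between csv.writer and the buffer
# rewrites line endings, so we see exactly what the writer emits.
buf = io.StringIO()
writer = csv.writer(buf, dialect='excel')
writer.writerow(['Lead Source', 'Name'])   # hypothetical sample data
writer.writerow(['Web', 'Alice'])

raw_output = buf.getvalue()
print(repr(raw_output))

# The line terminator configured by the 'excel' dialect.
print(repr(csv.excel.lineterminator))
```

Each row already ends in CRLF here, which suggests that the CRLF comes from csv.writer itself and that the extra CR is added later, when the file object writes the text out.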

3. Then I came across:

Extraneous newlines with csv.writer on Windows

which says that opening the file in binary mode is enough.

So I changed this:

    #init output file
    # 'a+': read,write,append
    # 'w' : clear before, then write
    outputFp = open(gConst['csvFilename'], 'w')
    csvWriter = csv.writer(outputFp, dialect='excel')
    # itemInfoDict = {
    ...
    csvWriter.writerow(csvHeaderList)
    outputFp.close()
    #output all info dict list
    outputFp = open(gConst['csvFilename'], 'a+')
    csvWriter = csv.writer(outputFp, dialect='excel')
    for eachInfoDict in itemInfoDictList:
        fieldList = []
        fieldList.append(eachInfoDict['Lead Source'])
        ...
    outputFp.close()

to this:

    #init output file
    # 'a+': read,write,append
    # 'w' : clear before, then write
    outputFp = open(gConst['csvFilename'], 'wb')
    csvWriter = csv.writer(outputFp, dialect='excel')
    ...
    csvWriter.writerow(csvHeaderList)
    outputFp.close()
    #output all info dict list
    outputFp = open(gConst['csvFilename'], 'a+')
    csvWriter = csv.writer(outputFp, dialect='excel')
    for eachInfoDict in itemInfoDictList:
        fieldList = []
        ...
        logging.info("fieldList=%s", fieldList)
        csvWriter.writerow(fieldList)
    outputFp.close()

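I tried it, and the result was:

The header row is fine now. Its file handle was opened in binary mode with wb, so there is no extra CR.

The remaining rows, however, were written through a handle opened in text mode with a+, so they still have the extra CR:

[Image: header is ok but other lines still have the extra CR]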

4. So I changed it once more, to:

    outputFp = open(gConst['csvFilename'], 'wb')
    csvWriter = csv.writer(outputFp, dialect='excel')
    ...
    csvWriter.writerow(csvHeaderList)
    outputFp.close()
    ...
    ...
    #output all info dict list
    #outputFp = open(gConst['csvFilename'], 'a+')
    outputFp = open(gConst['csvFilename'], 'ab+')
    csvWriter = csv.writer(outputFp, dialect='excel')
    ......
    csvWriter.writerow(fieldList)
    outputFp.close()

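I tried it, and this time it finally worked:

The csv no longer contains the extra CR, only the CRLF at the end of each line:

[Image: csv only last has the CRLF]

Excel also displays it correctly, with no extra blank lines:

[Image: excel can show ok no extra empty line]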
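The code in this post is Python 2, where binary mode is the fix. For readers on Python 3, csv.writer expects a text-mode file instead, and the csv documentation recommends opening it with `newline=''` so that line endings are not translated. Below is a minimal sketch of the same "write the header, then append the rows" flow under that assumption. The file path, column names other than 'Lead Source', and row values are hypothetical.

```python
import csv
import os
import tempfile

# Hypothetical file name and sample data, for illustration only.
csv_path = os.path.join(tempfile.mkdtemp(), 'output.csv')

# Header: 'w' clears the file first; newline='' disables newline translation.
with open(csv_path, 'w', newline='') as fp:
    csv.writer(fp, dialect='excel').writerow(['Lead Source', 'Name'])

# Data rows: append with the same newline='' setting.
with open(csv_path, 'a', newline='') as fp:
    writer = csv.writer(fp, dialect='excel')
    for row in [['Web', 'Alice'], ['Phone', 'Bob']]:
        writer.writerow(row)

# Read the raw bytes back to inspect the line endings.
with open(csv_path, 'rb') as fp:
    file_bytes = fp.read()
print(file_bytes)
```

As in step 3 above, both handles need the same treatment: if only the header is written with `newline=''`, the appended rows can still pick up the extra CR on Windows.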

 

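【Summary】

When using the csv writer in Python (Python 2 here), be careful how you open the file.

Open it in binary mode, that is, with a b in the mode string, such as wb or ab+.

Do not open it in text mode, that is, without the b (w, w+, a+ and so on). Otherwise, writing to the csv with writerow produces an extra CR, which shows up as extra blank lines.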
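The reason behind this rule is that csv.writer ends each row with CRLF, and a text-mode file on Windows then turns every LF it writes into CRLF, so CRLF becomes CR CR LF. This can be reproduced on any platform with a small simulation. The sketch below assumes Python 3 and uses `io.TextIOWrapper` with `newline='\r\n'` as a stand-in for Windows text mode. The helper name `write_rows` and the sample data are hypothetical.

```python
import csv
import io

def write_rows(newline):
    """Write two rows through a text layer and return the resulting bytes."""
    raw = io.BytesIO()
    # newline='\r\n' mimics Windows text mode: each '\n' written becomes '\r\n'.
    # newline='' leaves the line endings untouched.
    text = io.TextIOWrapper(raw, encoding='utf-8', newline=newline)
    writer = csv.writer(text, dialect='excel')
    writer.writerow(['a', 'b'])
    writer.writerow(['1', '2'])
    text.flush()  # push buffered text down into the BytesIO
    return raw.getvalue()

translated = write_rows('\r\n')   # simulated text mode on Windows
untouched = write_rows('')        # no newline translation
print(translated)
print(untouched)
```

The first result contains CR CR LF after every row, the pattern seen in the broken csv above. The second contains only CRLF.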

 

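Note: for details on file open modes, binary versus text, see:

【Detailed Explanation】File operations in Python: readline reads a single line, readlines reads all lines, file open modes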
