Python 3 quick reference: serialization and deserialization code for csv/xls/json/pickle and more

All formats

# pandas beats everything else
# Only consider the alternatives when your environment is constrained
import pandas as pd
import numpy as np

df = pd.read_clipboard()
df = pd.read_csv(data_or_path)
dfs = pd.read_html(data_or_path)  # returns a list of DataFrames
df = pd.read_json(data_or_path)
df = pd.read_msgpack(data_or_path)  # removed in pandas 1.0
df = pd.read_pickle(data_or_path)

df.to_clipboard()
df.to_csv()
df.to_csv(fn)
df.to_excel(fn)
df.to_html()
df.to_html(fn)
df.to_json()
df.to_json(fn)
df.to_msgpack()  # removed in pandas 1.0
df.to_msgpack(fn)  # removed in pandas 1.0
df.to_pickle(fn)

# orient when converting to a dict (df.to_dict)
# orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}
# Determines the type of the values of the dictionary.
# dict (default) : dict like {column -> {index -> value}}
# list : dict like {column -> [values]}
# series : dict like {column -> Series(values)}
# split : dict like {index -> [index], columns -> [columns], data -> [values]}
# records : list like [{column -> value}, ... , {column -> value}]
# index : dict like {index -> {column -> value}}


# orient when converting to JSON (df.to_json)
# orient : string
# The format of the JSON string
# split : dict like {index -> [index], columns -> [columns], data -> [values]}
# records : list like [{column -> value}, ... , {column -> value}]
# index : dict like {index -> {column -> value}}
# columns : dict like {column -> {index -> value}}
# values : just the values array
# table : dict like {'schema': {schema}, 'data': {data}} describing the data, and the data component is like orient=records.

# The rest is not listed one by one; for details see
# http://pandas.pydata.org/pandas-docs/stable/api.html#id12
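To make the `orient` options concrete, here is a minimal sketch on a tiny made-up DataFrame. The column names and values are purely illustrative and not from the original post, and it assumes pandas is installed.

```python
import json
import pandas as pd

# Hypothetical sample data, only for illustration
df = pd.DataFrame({"name": ["a", "b"], "score": [1, 2]})

as_dict = df.to_dict()                     # {column -> {index -> value}}
as_list = df.to_dict(orient="list")        # {column -> [values]}
as_records = df.to_dict(orient="records")  # [{column -> value}, ...]

# to_json returns a JSON string when no path is given
records_json = df.to_json(orient="records")
split_json = df.to_json(orient="split")

print(as_list)
print(as_records)
print(json.loads(split_json))
```

`records` is usually the handiest shape when the result is going to be passed on as a list of plain row objects, while `split` keeps the index and column labels separate from the data.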

csv

# Read
import csv
with open('test.csv', 'r', newline='', encoding='utf-8') as f:
    rows = list(csv.reader(f))
# For CSV files saved by Excel on Simplified Chinese Windows, use encoding='cp936' instead

# Write
import csv
info = []
# encoding='utf_8_sig' writes UTF-8 with a BOM so that Excel does not show garbled text;
# use encoding='utf-8' for plain UTF-8 without a BOM
with open('ret.csv', 'w', newline='', encoding='utf_8_sig') as f:
    csv.writer(f).writerows(info)
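As a quick check of the BOM behaviour, the following sketch writes a small CSV with `utf_8_sig` and reads it back. The sample rows and the temporary file location are made up for illustration.

```python
import csv
import os
import tempfile

# Hypothetical sample rows, including non-ASCII text
rows = [["name", "city"], ["Li", "北京"]]

path = os.path.join(tempfile.mkdtemp(), "ret.csv")

# utf_8_sig writes a BOM at the start of the file
with open(path, "w", newline="", encoding="utf_8_sig") as f:
    csv.writer(f).writerows(rows)

# Confirm that the file really starts with the UTF-8 BOM bytes
with open(path, "rb") as f:
    has_bom = f.read(3) == b"\xef\xbb\xbf"

# Reading with utf_8_sig strips the BOM again
with open(path, "r", newline="", encoding="utf_8_sig") as f:
    loaded = list(csv.reader(f))

print(has_bom, loaded)
```

Reading such a file with plain `utf-8` would leave the BOM character glued to the first cell, so `utf_8_sig` is the safer choice on the reading side whenever a BOM might be present.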

xls

Use pandas whenever you can; otherwise handling things like dates and times takes extra effort.

# Read a single row or column
import xlrd
xls = xlrd.open_workbook('test.xls')
sheet = xls.sheet_by_name('Sheet1')
rowA = sheet.row_values(0)  # first row
colA = sheet.col_values(0)  # first column

# Read everything
import xlrd
xls = xlrd.open_workbook('test.xls')
sheet = xls.sheet_by_name('Sheet1')
data = [sheet.row_values(x) for x in range(sheet.nrows)]

# Write
# Skipped; use pandas, or write a CSV encoded as UTF-8 with BOM instead

json

# Read
import json
with open('data.json', 'r', encoding='utf-8') as f:
    info = json.load(f)

# Write
import json
with open('data.json', 'w', encoding='utf-8') as f:
    json.dump(info, f)
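A small round-trip sketch for JSON follows. The sample dict and the temporary path are made up, and `ensure_ascii=False` and `indent=2` are optional choices that keep non-ASCII text readable in the file rather than anything the original snippets require.

```python
import json
import os
import tempfile

# Hypothetical sample data, only for illustration
info = {"name": "测试", "tags": ["a", "b"], "count": 3}
path = os.path.join(tempfile.mkdtemp(), "data.json")

# ensure_ascii=False keeps Chinese characters as-is instead of \uXXXX escapes
with open(path, "w", encoding="utf-8") as f:
    json.dump(info, f, ensure_ascii=False, indent=2)

with open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

# Look at the raw file content to see the unescaped text
with open(path, "r", encoding="utf-8") as f:
    raw = f.read()

print(loaded == info)
```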

pickle

# Read
import pickle
with open('data.pkl', 'rb') as f:
    info = pickle.load(f)

# Write
import pickle
with open('data.pkl', 'wb') as f:
    pickle.dump(info, f)
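The sketch below shows why pickle is sometimes preferable to JSON: it round-trips Python types such as `date`, `set` and `tuple` unchanged. The sample object and the temporary path are illustrative only. Only unpickle data you trust, because loading a pickle can execute arbitrary code.

```python
import os
import pickle
import tempfile
from datetime import date

# Hypothetical sample object with types that JSON cannot represent directly
info = {"when": date(2020, 1, 2), "ids": {1, 2, 3}, "pair": (1, "x")}
path = os.path.join(tempfile.mkdtemp(), "data.pkl")

with open(path, "wb") as f:
    pickle.dump(info, f, protocol=pickle.HIGHEST_PROTOCOL)

with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded == info)
```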

Plain text

# Read
with open(fn, 'r', encoding='utf-8') as f:
    txt = f.read()

# Write
with open(fn, 'w', encoding='utf-8') as f:
    f.write(txt)

Binary

# Read
with open(fn, 'rb') as f:
    data = f.read()

# Write
with open(fn, 'wb') as f:
    f.write(data)
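As an alternative to the `open()` calls for plain text and binary files, `pathlib` offers one-line helpers that open and close the file for you. This is a sketch with made-up file names and contents, not something taken from the original post.

```python
import tempfile
from pathlib import Path

# Hypothetical temporary directory and file names
base = Path(tempfile.mkdtemp())

# Plain text: pass the encoding explicitly in both directions
text_path = base / "note.txt"
text_path.write_text("hello\n世界", encoding="utf-8")
txt = text_path.read_text(encoding="utf-8")

# Binary: bytes in, bytes out
bin_path = base / "blob.bin"
bin_path.write_bytes(b"\x00\x01\x02")
data = bin_path.read_bytes()

print(txt, data)
```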

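P.S. I have been using many of these for a long time without ever organizing them in one place, so having them together makes things much more convenient. Most of the content was actually filled in at the beginning of the month, but it felt too thin to publish on its own. Since I was posting another article today anyway, this one comes along as an extra.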

From my personal blog: fy0.me/t/15
