Python Regular Expressions
From the column: python研究
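A regular expression is a special sequence of characters that lets you check whether a string matches a given pattern. For the basics, see "Python 正則表達式 | 菜鳥教程". Many people find regular expressions hard to write, so here are two good online testers: "a Python regular expression editor" and "PHP, PCRE, Python, Golang and JavaScript". Enough preamble, let's get hands-on: today we work through a simple case with re.findall.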
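Before the case study, here is a minimal sketch of how re.findall behaves when a pattern has two alternatives with one capturing group in each, which is the shape of pattern used in this article. The pattern and the sample text are made up for illustration only.

```python
import re

# Hypothetical pattern: two alternatives, one capturing group in each.
pattern = r'name=(\w+)|id=(\d+)'
text = 'name=alice id=42 name=bob'

# With more than one group, findall returns one tuple per match,
# with one slot per group.
matches = re.findall(pattern, text)
print(matches)
# [('alice', ''), ('', '42'), ('bob', '')]
```

The group that did not take part in a match comes back as an empty string, so checking which slot is non-empty tells you which alternative matched.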
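The requirement comes from a question posted on 天善智能 (a vertical community platform focused on business intelligence (BI), data analysis, and big data): "怎麼使用python取出一個目錄下面所有文件的指定內容" (how to use Python to extract specified content from every file under a directory). A DBA colleague needed this solved. After analyzing the requirement, the first step is to test the regular expression in one of the web-based testers.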
With the regular expression worked out, we can tidy up the code, for example:
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Author: Jack.Z
import re
import os

# Group 1 captures a static method name (get...); group 2 captures the
# MongoUtil call assigned to a DBCollection variable.
regex_string = r'static.*(get.*)\(.*\)?{|DBCollection co[n]?llection = (MongoUtil.get.*);'


def detection(content):
    return re.findall(regex_string, content)


def read_file(filename):
    with open(filename, 'r') as text:
        content = text.read()
    result_list = detection(content)
    count = 0
    for item in result_list:
        count += 1
        # index() returns the first occurrence, so duplicate matches
        # share one position.
        if result_list.index(item) + 1 < len(result_list):
            # Pair a method match with the collection match right after it.
            if item[0] and result_list[count][1]:
                print("Function: " + item[0])
                print("SQL: " + result_list[count][1])
                print("")
            else:
                continue


if __name__ == '__main__':
    # java_file = '/Users/jack/Downloads/service/IndexCapitalService.java'
    java_file = '/Users/jack/Downloads/IndexInfoService.java'
    print("File name: " + os.path.basename(java_file))
    read_file(java_file)
Output:
$ python file_operate.py
File name: IndexGradeService.java
Function: getIndexStatistics
SQL: MongoUtil.getGGf10dbCollection("t_f10_index_fdnav")
Function: getIndexGradeFundList
SQL: MongoUtil.getGGf10dbCollection("t_f10_index_fund_basicinfo")
Function: getFirstTradeDay
SQL: MongoUtil.getGGStockBaseCollection("gg_date")
Function: getFundCodes
SQL: MongoUtil.getGGf10dbCollection("t_api_fund_detail_new")
Function: getNewDate
SQL: MongoUtil.getGGf10dbCollection("t_api_fund_detail_new")
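To see why the script pairs each match with the one that follows it, here is a small sketch that runs the same pattern on an in-memory Java snippet. The class DemoService, its methods, and the collection names are invented for this example, and the backslashes in the pattern are an assumption about how the original was written.

```python
import re

# The article's pattern, assuming the parentheses after the method name
# were escaped in the original source.
regex_string = r'static.*(get.*)\(.*\)?{|DBCollection co[n]?llection = (MongoUtil.get.*);'

# Hypothetical Java source, used only to illustrate the matching.
java_source = '''
public class DemoService {
    public static List<String> getFundCodes(String date) {
        DBCollection collection = MongoUtil.getGGf10dbCollection("t_demo_fund");
        return null;
    }

    public static String getNewDate() {
        DBCollection conllection = MongoUtil.getGGStockBaseCollection("t_demo_date");
        return null;
    }
}
'''

result_list = re.findall(regex_string, java_source)

# Method matches fill slot 0 and collection matches fill slot 1, so a
# method is paired with the collection match that immediately follows it.
pairs = []
for current, following in zip(result_list, result_list[1:]):
    if current[0] and following[1]:
        pairs.append((current[0], following[1]))

for function_name, collection_call in pairs:
    print("Function: " + function_name)
    print("SQL: " + collection_call)
```

The zip over neighbouring items is only a compact way of writing the same "current item plus next item" check that the script does with a counter.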
A further refined version:
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Author: Jack.Z
import re
import os

# regex_string = r'static.*(get.*)\(.*\)?{|DBCollection co[n]?llection = (MongoUtil.get.*);'
# The second alternative no longer requires the DBCollection type name,
# only whitespace before the variable.
regex_string = r'static.*(get.*)\(.*\)?{|\s+co[n]?llection = (MongoUtil.get.*);'


def detection(content):
    return re.findall(regex_string, content)


def read_file(filename):
    with open(filename, 'r') as text:
        content = text.read()
    result_list = detection(content)
    # print(result_list)
    count = 0
    for item in result_list:
        count += 1
        if count < len(result_list):
            if item[0] and result_list[count][1]:
                # A method match followed by a collection match.
                print("Function: " + item[0])
                print("SQL: " + result_list[count][1])
            elif not item[0] and result_list[count][1]:
                # A collection match followed by another collection match.
                print("SQL: " + result_list[count][1])
            else:
                continue


if __name__ == '__main__':
    # java_file = '/Users/jack/Downloads/service/IndexCapitalService.java'
    # java_file = '/Users/jack/Downloads/service/IndexGradeService.java'
    # java_file = '/Users/jack/Downloads/service/IndexInfoService.java'
    # java_file = '/Users/jack/Downloads/service/IndustryUtil.java'
    # java_file = '/Users/jack/Downloads/IndexInfoService.java'
    java_file = '/Users/jack/Downloads/FundManagerService.java'
    print("File name: " + os.path.basename(java_file))
    read_file(java_file)
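The stated requirement covers every file under a directory, while the scripts above read one file at a time. The following is a rough sketch of one way to extend them, assuming Python 3, UTF-8 source files, and a ".java" suffix filter. The helper names extract_pairs and scan_directory are hypothetical, and the pairing here remembers the most recent method name instead of comparing neighbouring matches, which is a different technique from the one in the article.

```python
import os
import re

# Pattern from the refined version, assuming the escapes shown here.
regex_string = r'static.*(get.*)\(.*\)?{|\s+co[n]?llection = (MongoUtil.get.*);'


def extract_pairs(content):
    """Return (function_name, collection_call) pairs found in Java source text."""
    pairs = []
    current_function = ''
    for function_name, collection_call in re.findall(regex_string, content):
        if function_name:
            # Remember the most recent method seen.
            current_function = function_name
        elif collection_call:
            # Attribute the collection call to that method.
            pairs.append((current_function, collection_call))
    return pairs


def scan_directory(root_dir, suffix='.java'):
    """Walk root_dir recursively and map each matching file path to its pairs."""
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in sorted(filenames):
            if filename.endswith(suffix):
                path = os.path.join(dirpath, filename)
                # Undecodable bytes are replaced rather than raising an error.
                with open(path, 'r', encoding='utf-8', errors='replace') as text:
                    results[path] = extract_pairs(text.read())
    return results
```

Calling scan_directory on a directory of Java sources returns a dictionary keyed by file path, which can then be printed in the same "Function / SQL" layout as before. A collection call that appears before any method match is reported with an empty function name.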