How to Profile the Performance of Python Programs

04-29

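Sometimes you need to optimize the performance of Python-side code, and the first step is to find out where the bottleneck is. Fortunately, this is very easy to do.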

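The usual approach is to first find which functions are the bottleneck, and then look at which lines inside those functions the time is actually spent on. Python's built-in cProfile module answers the first question, and line profiler[0] answers the second.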

The first step can be done like this (test.py):

    def bar(n):
        a = 0
        for i in range(n):
            a += i**2
        return a

    def foo():
        ret = 0
        for i in range(1000):
            ret += bar(i)
        return ret

    def c_profile(codestr):
        # Run the given statement under cProfile and print per-function stats.
        import cProfile
        p = cProfile.Profile()
        p.run(codestr)
        p.print_stats()

    if __name__ == "__main__":
        c_profile("foo()")

    # python test.py

Output:

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000    0.157    0.157 <string>:1(<module>)
      1000    0.157    0.000    0.157    0.000 test.py:10(bar)
         1    0.000    0.000    0.157    0.157 test.py:16(foo)
         1    0.000    0.000    0.157    0.157 {built-in method exec}
         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
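The report above is ordered by name, which is fine for two functions but harder to read on a larger program. As a minimal sketch, assuming you want the most expensive functions listed first, the standard-library pstats module can sort the same data by cumulative time. The helper name profile_sorted and its parameters are hypothetical and not part of the original test.py:

```python
import cProfile
import io
import pstats


def bar(n):
    a = 0
    for i in range(n):
        a += i ** 2
    return a


def foo():
    ret = 0
    for i in range(1000):
        ret += bar(i)
    return ret


def profile_sorted(func, sort_key="cumulative", limit=5):
    """Hypothetical helper: profile func() and return (result, report text).

    The report is sorted by sort_key and truncated to `limit` entries.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    result = func()
    profiler.disable()

    # Collect the report into a string instead of printing it directly.
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats(sort_key).print_stats(limit)
    return result, buf.getvalue()


if __name__ == "__main__":
    _, report = profile_sorted(foo)
    print(report)
```

Passing "tottime" as sort_key instead ranks functions by the time spent in their own bodies, which is often the quicker way to spot a function like bar.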

This shows that essentially all of the time is spent in the two functions foo and bar. Next, modify the code to profile foo and bar line by line:

    # The `profile` decorator is injected by kernprof at run time; no import is needed.
    @profile
    def bar(n):
        a = 0
        for i in range(n):
            a += i**2
        return a

    @profile
    def foo():
        ret = 0
        for i in range(1000):
            ret += bar(i)
        return ret

    if __name__ == "__main__":
        foo()

    # kernprof -lv test.py

Output:

    Timer unit: 1e-06 s

    Total time: 0.525531 s
    File: test.py
    Function: bar at line 9

    Line #      Hits         Time  Per Hit   % Time  Line Contents
    ==============================================================
         9                                           @profile
        10                                           def bar(n):
        11      1000          495      0.5      0.1      a = 0
        12    500500       169433      0.3     32.2      for i in range(n):
        13    499500       355155      0.7     67.6          a += i ** 2
        14      1000          448      0.4      0.1      return a

    Total time: 0.804509 s
    File: test.py
    Function: foo at line 16

    Line #      Hits         Time  Per Hit   % Time  Line Contents
    ==============================================================
        16                                           @profile
        17                                           def foo():
        18         1            5      5.0      0.0      ret = 0
        19      1001          552      0.6      0.1      for i in range(1000):
        20      1000       803951    804.0     99.9          ret += bar(i)
        21         1            1      1.0      0.0      return ret

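From the data above, the bottleneck is easy to pinpoint: the call to bar accounts for 99.9% of the time in foo, and inside bar the line a += i ** 2 takes 67.6% of the time, with the loop itself taking most of the rest.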

====

[0]: GitHub - rkern/line_profiler: Line-by-line profiling for Python