[Debug] Clang / LLVM: A Few Notes on the continue Statement
Part 1: The Symptoms
Let's start with a few C examples and use them to pin down the problem:
1.c
1. int main()
2. {
3.   int i;
4.
5.   for (i = 0; i < 256; i++)
6.   {
7.     continue;
8.   }
9. }
Compile it with clang -g 1.c (the clang used here is clang version 5.0.0 (trunk 296253)) and debug it with gdb:
(gdb) b main
Breakpoint 1 at 0x100000f8b: file 1.c, line 5.
(gdb) r
Thread 2 hit Breakpoint 1, main () at 1.c:5
5         for (i = 0; i < 256; i++)
(gdb) n
7           continue;
(gdb) n
5         for (i = 0; i < 256; i++)
(gdb) n
7           continue;
(gdb) n
5         for (i = 0; i < 256; i++)
(gdb) n
7           continue;
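A few things to note here.
- We stop normally at the continue statement on line 7.
- We do not stop at the closing brace of the for loop on line 8. The brace only structures the code, so next jumps straight back to the loop header. That is understandable.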
2.c
1. int main()
2. {
3.   int i;
4.
5.   for (i = 0; i < 256; i++)
6.   {
7.     i++;
8.   }
9. }
Compile with clang 2.c -g, then debug with gdb a.out.
(gdb) b main
Breakpoint 1 at 0x100000f7b: file 2.c, line 5.
(gdb) r
Thread 2 hit Breakpoint 1, main () at 2.c:5
5         for (i = 0; i < 256; i++)
(gdb) n
7           i++;
(gdb) n
5         for (i = 0; i < 256; i++)
(gdb) n
7           i++;
(gdb) n
5         for (i = 0; i < 256; i++)
(gdb) n
7           i++;
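This also matches our expectation: we stop normally on line 7, and next then jumps straight back to the loop header instead of stopping at the closing brace of the for loop on line 8.
Next, the third example: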
3.c
1. int main()
2. {
3.   int i;
4.
5.   for (i = 0; i < 256; i++)
6.   {
7.     i++;
8.     continue;
9.   }
10. }
Compile with clang -g 3.c and debug with gdb:
(gdb) b main
Breakpoint 1 at 0x100000f7b: file 3.c, line 5.
(gdb) r
Thread 2 hit Breakpoint 1, main () at 3.c:5
5         for (i = 0; i < 256; i++)
(gdb) n
7           i++;
(gdb) n
5         for (i = 0; i < 256; i++)
(gdb) n
7           i++;
(gdb) n
5         for (i = 0; i < 256; i++)
(gdb) n
7           i++;
(gdb) n
5         for (i = 0; i < 256; i++)
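Interesting! After line 7 executes, next stops neither at the continue statement on line 8 nor at the closing brace of the for loop on line 9. It jumps straight back to the loop header on line 5. If Clang / LLVM has decided that the debugger should stop at continue, it should stop here too. If it has decided that the debugger should not stop at continue, then it should not stop at the continue on line 7 of 1.c either.
Let's look at one more example, 4.c: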
1. int main()
2. {
3.   int i;
4.
5.   for (i = 0; i < 256; i++)
6.   {
7.   }
8. }
Compile with clang 4.c -g, then debug with gdb a.out:
(gdb) b main
Breakpoint 1 at 0x100000f8b: file 4.c, line 5.
(gdb) r
Thread 2 hit Breakpoint 1, main () at 4.c:5
5         for (i = 0; i < 256; i++)
(gdb) n
7         }
(gdb) n
5         for (i = 0; i < 256; i++)
(gdb) n
7         }
(gdb) n
5         for (i = 0; i < 256; i++)
(gdb) n
7         }
(gdb) n
5         for (i = 0; i < 256; i++)
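Interesting again. With an empty for loop, we stop at the loop's closing brace on line 7. If Clang / LLVM has decided that the brace only structures the code and is not a place to stop, then we should not stop on line 7 here. And if stopping at the closing brace is the right behavior, the earlier examples should have stopped at their closing braces as well, but they did not.
Part 2: Approach and Process of Solving the Problem
I think this part is the most important one: how to start from a symptom and find the corresponding problem (or rather, the logic that handles it) in the vast LLVM code base. I believe the same approach carries over to other fields.
We have four examples. To solve the problem we need more detailed, lower-level information about them, so that we can see what actually happens. For Clang / LLVM, the first such level is LLVM IR, the intermediate representation. So my first step was to get the LLVM IR of the four examples and look at the for loops. Getting LLVM IR out of Clang is easy: pass -S -emit-llvm. For example, clang 1.c -g -S -emit-llvm produces 1.ll.
The relevant LLVM IR follows.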
1.ll
for.cond:                                         ; preds = %for.inc, %entry
  %0 = load i32, i32* %i, align 4, !dbg !17
  %cmp = icmp slt i32 %0, 256, !dbg !19
  br i1 %cmp, label %for.body, label %for.end, !dbg !20

for.body:                                         ; preds = %for.cond
  br label %for.inc, !dbg !21 ; the continue statement; the debugger stops here

for.inc:                                          ; preds = %for.body
  %1 = load i32, i32* %i, align 4, !dbg !23
  %inc = add nsw i32 %1, 1, !dbg !23
  store i32 %inc, i32* %i, align 4, !dbg !23
  br label %for.cond, !dbg !24, !llvm.loop !25

for.end:                                          ; preds = %for.cond
  %2 = load i32, i32* %retval, align 4, !dbg !27
  ret i32 %2, !dbg !27

!21 = !DILocation(line: 7, column: 5, scope: !22) ; debug info for the continue statement: line 7, column 5
2.ll
for.cond:                                         ; preds = %for.inc, %entry
  %0 = load i32, i32* %i, align 4, !dbg !17
  %cmp = icmp slt i32 %0, 256, !dbg !19
  br i1 %cmp, label %for.body, label %for.end, !dbg !20

for.body:                                         ; preds = %for.cond
  %1 = load i32, i32* %i, align 4, !dbg !21
  %inc = add nsw i32 %1, 1, !dbg !21
  store i32 %inc, i32* %i, align 4, !dbg !21
  br label %for.inc, !dbg !23 ; the closing brace of the for loop; the debugger does not stop here

for.inc:                                          ; preds = %for.body
  %2 = load i32, i32* %i, align 4, !dbg !24
  %inc1 = add nsw i32 %2, 1, !dbg !24
  store i32 %inc1, i32* %i, align 4, !dbg !24
  br label %for.cond, !dbg !25, !llvm.loop !26

for.end:                                          ; preds = %for.cond
  %3 = load i32, i32* %retval, align 4, !dbg !28
  ret i32 %3, !dbg !28

!23 = !DILocation(line: 8, column: 3, scope: !22)
Is this puzzling? Two points stand out:
- The LLVM IR does carry debug info for the closing brace of the loop (!dbg !23, line 8, column 3). So why does the debugger not stop at the closing brace on line 8 in the 2.c example?
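- The closing brace here is, just like the continue statement in 1.ll, a br label %for.inc, that is, an unconditional branch. Yet the debugger stops at the continue in 1.ll and does not stop at the closing brace here.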
Next, 3.ll:
for.cond:                                         ; preds = %for.inc, %entry
  %0 = load i32, i32* %i, align 4, !dbg !17
  %cmp = icmp slt i32 %0, 256, !dbg !19
  br i1 %cmp, label %for.body, label %for.end, !dbg !20

for.body:                                         ; preds = %for.cond
  %1 = load i32, i32* %i, align 4, !dbg !21
  %inc = add nsw i32 %1, 1, !dbg !21
  store i32 %inc, i32* %i, align 4, !dbg !21
  br label %for.inc, !dbg !23 ; the continue statement, yet the debugger does not stop here

for.inc:                                          ; preds = %for.body
  %2 = load i32, i32* %i, align 4, !dbg !24
  %inc1 = add nsw i32 %2, 1, !dbg !24
  store i32 %inc1, i32* %i, align 4, !dbg !24
  br label %for.cond, !dbg !25, !llvm.loop !26

for.end:                                          ; preds = %for.cond
  %3 = load i32, i32* %retval, align 4, !dbg !28
  ret i32 %3, !dbg !28

!23 = !DILocation(line: 8, column: 5, scope: !22)
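In 3.c we do not stop at the continue statement on line 8; that is the problem we found. Comparing with 1.ll, where the same continue does stop, one difference shows up: the br label %for.inc for continue here is not alone in its basic block. The same holds for br label %for.inc, !dbg !23 in 2.ll, the closing brace of the for loop that does not stop either: it is not alone in its basic block.
So when I first got this far, my suspicion was: does stopping on this unconditional branch depend on whether it sits alone in a basic block, rather than on whether it comes from a continue or from the closing brace of the for loop?
Then, 4.ll: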
for.cond:                                         ; preds = %for.inc, %entry
  %0 = load i32, i32* %i, align 4, !dbg !17
  %cmp = icmp slt i32 %0, 256, !dbg !19
  br i1 %cmp, label %for.body, label %for.end, !dbg !20

for.body:                                         ; preds = %for.cond
  br label %for.inc, !dbg !21 ; the closing brace of the for loop; the debugger stops here

for.inc:                                          ; preds = %for.body
  %1 = load i32, i32* %i, align 4, !dbg !23
  %inc = add nsw i32 %1, 1, !dbg !23
  store i32 %inc, i32* %i, align 4, !dbg !23
  br label %for.cond, !dbg !24, !llvm.loop !25

for.end:                                          ; preds = %for.cond
  %2 = load i32, i32* %retval, align 4, !dbg !27
  ret i32 %2, !dbg !27

!21 = !DILocation(line: 7, column: 3, scope: !22)
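In 4.c, with the empty for loop, we can stop at the closing brace, and 4.ll has exactly the same structure as 1.ll. Whenever the debugger can stop, the branch is alone in its basic block; whenever it cannot, the branch is not alone. This supports the conclusion from 3.ll: the behavior has nothing to do with continue versus the closing brace of the for loop. It depends only on whether the unconditional branch is the sole instruction in its basic block. If it is, the debugger can stop there; if not, it cannot.
That is what I learned from the LLVM IR. Looking back, continuing along this path would have been correct, even though the information was still incomplete. Instead, this is where I went down a wrong path. I started compiling the LLVM IR with llc, the LLVM back-end compiler, to see what the IR was turned into. The command was simple: llc 1.ll -debug > log 2>&1. In the log I found this: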
Subtarget features: SSELevel 5, 3DNowLevel 1, 64bit 1
********** Begin Constant Hoisting **********
********** Function: main
********** End Constant Hoisting **********
*** Interleaved Access Pass: main
MERGING MOSTLY EMPTY BLOCKS - BEFORE:

for.body:                                         ; preds = %for.cond
  br label %for.inc, !dbg !21

for.inc:                                          ; preds = %for.body
  %1 = load i32, i32* %i, align 4, !dbg !23
  %inc = add nsw i32 %1, 1, !dbg !23
  store i32 %inc, i32* %i, align 4, !dbg !23
  br label %for.cond, !dbg !24, !llvm.loop !25
AFTER:

for.inc:                                          ; preds = %for.cond
  %1 = load i32, i32* %i, align 4, !dbg !21
  %inc = add nsw i32 %1, 1, !dbg !21
  store i32 %inc, i32* %i, align 4, !dbg !21
  br label %for.cond, !dbg !22, !llvm.loop !23
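This is what sent me far down the wrong path. I found that such a single-instruction basic block gets merged with the block it jumps to, and the debug info gets updated. Looking back, I made at least two mistakes here:
- At O0 there should be no merging across basic blocks; that is something done when optimization is on.
- When I saw the debug info being merged, I did not verify the generated object file. Its debug info was in fact no longer correct. I rashly assumed the updated debug info was right and that the debugger would still stop at the continue statement. Had I checked at that point, I would have seen that it no longer stops there: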
(gdb) b main
Breakpoint 1 at 0x100000f8b: file 1.c, line 5.
(gdb) r
Thread 2 hit Breakpoint 1, main () at 1.c:5
5         for (i = 0; i < 256; i++)
(gdb) n
9       }
(gdb) n
0x00007fffa5ea2235 in ?? ()
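Only then did I realize that llc does not default to -O0 as I had assumed. So I forced llc -O0 1.ll -debug, and the basic block merging was gone. That is when I realized how far off course I had gone.
Next, I ran llc -O0 1.ll -print-after-all > log 2>&1 to print the IR after every pass. I found that up to Machine Code generation no basic blocks were merged and the debug info was fully preserved. So the problem could not be in the IR-processing stage; it had to be in the Machine Code stage. Here is the Machine Code for 1.ll: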
# Machine code for function main: NoPHIs, TracksLiveness
Frame Objects:
  fi#0: size=4, align=4, at location [SP+8]
  fi#1: size=4, align=4, at location [SP+8]

BB#0: derived from LLVM BB %entry
        MOV32mi <fi#0>, 1, %noreg, 0, %noreg, 0; mem:ST4[%retval]
        MOV32mi <fi#1>, 1, %noreg, 0, %noreg, 0; mem:ST4[%i] dbg:1.c:5:10
    Successors according to CFG: BB#1

BB#1: derived from LLVM BB %for.cond
    Predecessors according to CFG: BB#0 BB#3
        CMP32mi <fi#1>, 1, %noreg, 0, %noreg, 256, %EFLAGS<imp-def>; mem:LD4[%i] dbg:1.c:5:17
        JGE_1 <BB#4>, %EFLAGS<imp-use>; dbg:1.c:5:3
    Successors according to CFG: BB#4 BB#2

BB#2: derived from LLVM BB %for.body
    Predecessors according to CFG: BB#1
        JMP_1 <BB#3>; dbg:1.c:7:5 ======> the continue statement
    Successors according to CFG: BB#3

BB#3: derived from LLVM BB %for.inc
    Predecessors according to CFG: BB#2
        %vreg6<def> = MOV32rm <fi#1>, 1, %noreg, 0, %noreg; mem:LD4[%i] GR32:%vreg6 dbg:1.c:5:25
        %vreg5<def> = COPY %vreg6; GR32:%vreg5,%vreg6 dbg:1.c:5:25
        %vreg5<def,tied1> = ADD32ri8 %vreg5<tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg5 dbg:1.c:5:25
        MOV32mr <fi#1>, 1, %noreg, 0, %noreg, %vreg5<kill>; mem:ST4[%i] GR32:%vreg5 dbg:1.c:5:25
        JMP_1 <BB#1>; dbg:1.c:5:3
    Successors according to CFG: BB#1

BB#4: derived from LLVM BB %for.end
    Predecessors according to CFG: BB#1
        %vreg2<def> = MOV32rm <fi#0>, 1, %noreg, 0, %noreg; mem:LD4[%retval] GR32:%vreg2 dbg:1.c:9:1
        %EAX<def> = COPY %vreg2; GR32:%vreg2 dbg:1.c:9:1
        RETQ %EAX<imp-use>; dbg:1.c:9:1

# End machine code for function main.
BB#2 still contains the branch for the continue statement, along with its line 7 debug info.
Next, the Machine Code for 2.ll:
# Machine code for function main: IsSSA, TracksLiveness
Frame Objects:
  fi#0: size=4, align=4, at location [SP+8]
  fi#1: size=4, align=4, at location [SP+8]

BB#0: derived from LLVM BB %entry
        MOV32mi <fi#0>, 1, %noreg, 0, %noreg, 0; mem:ST4[%retval]
        MOV32mi <fi#1>, 1, %noreg, 0, %noreg, 0; mem:ST4[%i] dbg:2.c:5:10
    Successors according to CFG: BB#1

BB#1: derived from LLVM BB %for.cond
    Predecessors according to CFG: BB#0 BB#3
        CMP32mi <fi#1>, 1, %noreg, 0, %noreg, 256, %EFLAGS<imp-def>; mem:LD4[%i] dbg:2.c:5:17
        JGE_1 <BB#4>, %EFLAGS<imp-use>; dbg:2.c:5:3
    Successors according to CFG: BB#4 BB#2

BB#2: derived from LLVM BB %for.body
    Predecessors according to CFG: BB#1
        %vreg6<def> = MOV32rm <fi#1>, 1, %noreg, 0, %noreg; mem:LD4[%i] GR32:%vreg6 dbg:2.c:7:6
        %vreg5<def,tied1> = ADD32ri8 %vreg6<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg5,%vreg6 dbg:2.c:7:6
        MOV32mr <fi#1>, 1, %noreg, 0, %noreg, %vreg5<kill>; mem:ST4[%i] GR32:%vreg5 dbg:2.c:7:6
    Successors according to CFG: BB#3 ======> the instruction for the closing brace of the for loop is gone!

BB#3: derived from LLVM BB %for.inc
    Predecessors according to CFG: BB#2
        %vreg10<def> = MOV32rm <fi#1>, 1, %noreg, 0, %noreg; mem:LD4[%i] GR32:%vreg10 dbg:2.c:5:25
        %vreg9<def,tied1> = ADD32ri8 %vreg10<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg9,%vreg10 dbg:2.c:5:25
        MOV32mr <fi#1>, 1, %noreg, 0, %noreg, %vreg9<kill>; mem:ST4[%i] GR32:%vreg9 dbg:2.c:5:25
        JMP_1 <BB#1>; dbg:2.c:5:3
    Successors according to CFG: BB#1

BB#4: derived from LLVM BB %for.end
    Predecessors according to CFG: BB#1
        %vreg2<def> = MOV32rm <fi#0>, 1, %noreg, 0, %noreg; mem:LD4[%retval] GR32:%vreg2 dbg:2.c:9:1
        %EAX<def> = COPY %vreg2; GR32:%vreg2 dbg:2.c:9:1
        RETQ %EAX<imp-use>; dbg:2.c:9:1

# End machine code for function main.
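In the Machine Code stage the unconditional branch of the second example has been deleted. According to the dump, the successor of BB#2 is BB#3, which is the block laid out right after it, so control can simply fall through with no extra unconditional branch. That is understandable.
In the Machine Code for 1.ll, the successor of BB#2 is also BB#3, again the block laid out right after it, so no unconditional branch is needed there either, yet LLVM keeps it. In any case, at this point I felt I was getting close to the truth, because the problem was converging, unlike the earlier hunt for basic block merging, which only led further away.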
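The fall-through notion used here can be made concrete with a small standalone model. This is only a sketch: Block, isFallThrough and branchIsRedundant are hypothetical names invented for this illustration, not LLVM classes or functions, and a block is identified simply by its position in layout order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model, not LLVM's MachineBasicBlock: a block is identified by
// its index in layout order and records the index of the block that its
// unconditional branch jumps to.
struct Block {
  std::size_t branchTarget;
};

// True if block `to` is laid out immediately after block `from`, so control
// would reach it by falling through even without a jump instruction.
bool isFallThrough(const std::vector<Block> &layout, std::size_t from,
                   std::size_t to) {
  assert(from < layout.size() && to < layout.size());
  return to == from + 1;
}

// True if the unconditional branch ending block `from` could be dropped
// without changing control flow.
bool branchIsRedundant(const std::vector<Block> &layout, std::size_t from) {
  return isFallThrough(layout, from, layout[from].branchTarget);
}
```

With the five blocks laid out as BB#0 to BB#4, the jump from BB#2 to BB#3 is redundant in this sense in all four examples, while the jump from BB#3 back to BB#1 is not. Redundancy alone therefore cannot explain why 1.ll keeps the jump and 2.ll drops it.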
Then the Machine Code for 3.ll:
# Machine code for function main: IsSSA, TracksLiveness
Frame Objects:
  fi#0: size=4, align=4, at location [SP+8]
  fi#1: size=4, align=4, at location [SP+8]

BB#0: derived from LLVM BB %entry
        MOV32mi <fi#0>, 1, %noreg, 0, %noreg, 0; mem:ST4[%retval]
        MOV32mi <fi#1>, 1, %noreg, 0, %noreg, 0; mem:ST4[%i] dbg:3.c:5:10
    Successors according to CFG: BB#1

BB#1: derived from LLVM BB %for.cond
    Predecessors according to CFG: BB#0 BB#3
        CMP32mi <fi#1>, 1, %noreg, 0, %noreg, 256, %EFLAGS<imp-def>; mem:LD4[%i] dbg:3.c:5:17
        JGE_1 <BB#4>, %EFLAGS<imp-use>; dbg:3.c:5:3
    Successors according to CFG: BB#4 BB#2

BB#2: derived from LLVM BB %for.body
    Predecessors according to CFG: BB#1
        %vreg6<def> = MOV32rm <fi#1>, 1, %noreg, 0, %noreg; mem:LD4[%i] GR32:%vreg6 dbg:3.c:7:6
        %vreg5<def,tied1> = ADD32ri8 %vreg6<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg5,%vreg6 dbg:3.c:7:6
        MOV32mr <fi#1>, 1, %noreg, 0, %noreg, %vreg5<kill>; mem:ST4[%i] GR32:%vreg5 dbg:3.c:7:6
    Successors according to CFG: BB#3 =====> the unconditional jump for the continue statement is gone!

BB#3: derived from LLVM BB %for.inc
    Predecessors according to CFG: BB#2
        %vreg10<def> = MOV32rm <fi#1>, 1, %noreg, 0, %noreg; mem:LD4[%i] GR32:%vreg10 dbg:3.c:5:25
        %vreg9<def,tied1> = ADD32ri8 %vreg10<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg9,%vreg10 dbg:3.c:5:25
        MOV32mr <fi#1>, 1, %noreg, 0, %noreg, %vreg9<kill>; mem:ST4[%i] GR32:%vreg9 dbg:3.c:5:25
        JMP_1 <BB#1>; dbg:3.c:5:3
    Successors according to CFG: BB#1

BB#4: derived from LLVM BB %for.end
    Predecessors according to CFG: BB#1
        %vreg2<def> = MOV32rm <fi#0>, 1, %noreg, 0, %noreg; mem:LD4[%retval] GR32:%vreg2 dbg:3.c:10:1
        %EAX<def> = COPY %vreg2; GR32:%vreg2 dbg:3.c:10:1
        RETQ %EAX<imp-use>; dbg:3.c:10:1

# End machine code for function main.
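As expected, there is no instruction for the continue statement, because, as in 2.ll, the branch is not alone in its basic block. If our guess is right, 4.ll will keep the unconditional jump for the closing brace, because that branch is alone in its basic block.
The Machine Code for 4.ll: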
# Machine code for function main: NoPHIs, TracksLiveness, NoVRegs
Frame Objects:
  fi#-1: size=8, align=16, fixed, at location [SP-8]
  fi#0: size=4, align=4, at location [SP-12]
  fi#1: size=4, align=4, at location [SP-16]

BB#0: derived from LLVM BB %entry
    Live Ins: %RBP
        PUSH64r %RBP<kill>, %RSP<imp-def>, %RSP<imp-use>; flags: FrameSetup
        CFI_INSTRUCTION <call frame instruction>
        CFI_INSTRUCTION <call frame instruction>
        %RBP<def> = MOV64rr %RSP; flags: FrameSetup
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %RBP, 1, %noreg, -4, %noreg, 0; mem:ST4[%retval]
        MOV32mi %RBP, 1, %noreg, -8, %noreg, 0; mem:ST4[%i] dbg:4.c:5:10
    Successors according to CFG: BB#1

BB#1: derived from LLVM BB %for.cond
    Live Ins: %RBP
    Predecessors according to CFG: BB#0 BB#3
        CMP32mi %RBP, 1, %noreg, -8, %noreg, 256, %EFLAGS<imp-def>; mem:LD4[%i] dbg:4.c:5:17
        JGE_1 <BB#4>, %EFLAGS<imp-use>; dbg:4.c:5:3
    Successors according to CFG: BB#4 BB#2

BB#2: derived from LLVM BB %for.body
    Live Ins: %RBP
    Predecessors according to CFG: BB#1
        JMP_1 <BB#3>; dbg:4.c:7:3 ===> the closing brace of the for loop
    Successors according to CFG: BB#3

BB#3: derived from LLVM BB %for.inc
    Live Ins: %RBP
    Predecessors according to CFG: BB#2
        %EAX<def> = MOV32rm %RBP, 1, %noreg, -8, %noreg; mem:LD4[%i] dbg:4.c:5:25
        %EAX<def,tied1> = ADD32ri8 %EAX<tied0>, 1, %EFLAGS<imp-def>; dbg:4.c:5:25
        MOV32mr %RBP, 1, %noreg, -8, %noreg, %EAX<kill>; mem:ST4[%i] dbg:4.c:5:25
        JMP_1 <BB#1>; dbg:4.c:5:3
    Successors according to CFG: BB#1

BB#4: derived from LLVM BB %for.end
    Live Ins: %RBP
    Predecessors according to CFG: BB#1
        %EAX<def> = MOV32rm %RBP, 1, %noreg, -4, %noreg; mem:LD4[%retval] dbg:4.c:8:1
        %RBP<def> = POP64r %RSP<imp-def>, %RSP<imp-use>; flags: FrameDestroy dbg:4.c:8:1
        RETQ %EAX<imp-use,kill>; dbg:4.c:8:1

# End machine code for function main.
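So the guess holds completely:
- The change does not happen in the IR; it happens when Machine Code is generated.
- If an unconditional branch is not alone in its basic block and its target is the natural fall-through basic block, LLVM deletes the branch instruction.
The Machine Basic Block class is easy to find: include/llvm/CodeGen/MachineBasicBlock.h. Reading patiently through its interface, I came across this function: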
  /// Return true if the specified MBB will be emitted immediately after this
  /// block, such that if this block exits by falling through, control will
  /// transfer to the specified MBB. Note that MBB need not be a successor at
  /// all, for example if this block ends with an unconditional branch to some
  /// other block.
  bool isLayoutSuccessor(const MachineBasicBlock *MBB) const;
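When I read this comment I felt I was finally getting somewhere: it matches what we described before, an unconditional branch whose target is the natural fall-through block. So the next thing I did was search the whole project for calls to this function. There are many hits, but they can be filtered: the code we want is unrelated to IR transformations, unrelated to optimization, target-independent, related to unconditional branches, and drops the branch while Machine Code is being generated.
Following this line of thought, I finally found the source in lib/CodeGen/SelectionDAG/FastISel.cpp: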
/// Emit an unconditional branch to the given block, unless it is the immediate
/// (fall-through) successor, and update the CFG.
void FastISel::fastEmitBranch(MachineBasicBlock *MSucc,
                              const DebugLoc &DbgLoc) {
  if (FuncInfo.MBB->getBasicBlock()->size() > 1 &&
      FuncInfo.MBB->isLayoutSuccessor(MSucc)) {
    // For more accurate line information if this is the only instruction
    // in the block then emit it, otherwise we have the unconditional
    // fall-through case, which needs no instructions.
  } else {
    // The unconditional branch case.
    TII.InsertBranch(*FuncInfo.MBB, MSucc, nullptr,
                     SmallVector<MachineOperand, 0>(), DbgLoc);
  }
  if (FuncInfo.BPI) {
    auto BranchProbability = FuncInfo.BPI->getEdgeProbability(
        FuncInfo.MBB->getBasicBlock(), MSucc->getBasicBlock());
    FuncInfo.MBB->addSuccessor(MSucc, BranchProbability);
  } else
    FuncInfo.MBB->addSuccessorWithoutProb(MSucc);
}
This part:
  if (FuncInfo.MBB->getBasicBlock()->size() > 1 &&
      FuncInfo.MBB->isLayoutSuccessor(MSucc)) {
    // For more accurate line information if this is the only instruction
    // in the block then emit it, otherwise we have the unconditional
    // fall-through case, which needs no instructions.
  }
is exactly the condition we summarized earlier, and it matches our guess completely. In the end, this is where everything comes from.
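To check the condition against the four examples, it can be restated as a standalone predicate. This is a sketch for illustration only: emitsUnconditionalBranch and its parameters are hypothetical names, not part of LLVM, and the block sizes passed in are counted by hand from the for.body blocks in the IR listings above (1 instruction in 1.ll and 4.ll, 4 instructions in 2.ll and 3.ll).

```cpp
#include <cassert>
#include <cstddef>

// Standalone restatement of the condition in FastISel::fastEmitBranch.
// The function name and parameters are hypothetical, for illustration only.
//   irBlockSize             - instructions in the IR basic block, including
//                             the terminating branch
//   targetIsLayoutSuccessor - whether the branch target is the block laid
//                             out immediately after the current one
// Returns true if a jump instruction (carrying its debug location) is emitted.
bool emitsUnconditionalBranch(std::size_t irBlockSize,
                              bool targetIsLayoutSuccessor) {
  assert(irBlockSize >= 1 && "a well-formed block has at least a terminator");
  // The jump is skipped only when the block holds other instructions AND
  // control would fall through to the target anyway.
  bool canFallThrough = irBlockSize > 1 && targetIsLayoutSuccessor;
  return !canFallThrough;
}
```

Under these assumptions, emitsUnconditionalBranch(1, true) is true, which corresponds to 1.c and 4.c: the jump survives with its line entry, so the debugger has somewhere to stop. emitsUnconditionalBranch(4, true) is false, which corresponds to 2.c and 3.c: no jump is emitted, so the line of the continue or the closing brace has no instruction attached to it.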
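Looking back, there was a detour: I assumed that llc runs no optimization passes by default, and that cost a lot of time. It was avoidable: verify the result of every step carefully and assume nothing. The overall approach was right, though: work inward from the symptom, pin down the problem, identify the root cause, and only then go looking for the code. Diving straight into the LLVM source would have ended the way my hunt for basic block merging did, lost in the code, and with bad luck never finding the way out. So know the route before you start searching the code.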