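In C, is there any difference, either in efficiency or in the compiled structure, between the two pieces of code below, one written with if / else if and one with consecutive ifs? (See the supplementary content for details.)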
int compare(int a, int b)
{
    if (a < b) return -1;
    else if (a > b) return 1;
    else return 0;
}
int compare(int a, int b)
{
    if (a < b) return -1;
    if (a > b) return 1;
    return 0;
}
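Related thread: "What is the difference between conditions and returns in a function?" (answer by RednaxelaFX)
I wrote three more functions, included here as well: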
int f1(int a, int b)
{
    return (a > b) | -(b > a);
}
int f2(int a, int b)
{
    return (a > b) - (a < b);
}
int f3(int a, int b)
{
    __asm__ __volatile__ (
        "sub %1, %0\n"
        "jno 1f\n"
        "cmc\n"
        "rcr %0\n"
        "1: "
        : "+r"(a)
        : "r"(b)
        : "cc");
    return a;
}
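Before comparing speed, it is worth confirming that the branch-free variants return the same -1 / 0 / 1 as the branchy version. The following is a minimal sketch, not part of the original answer: the names cmp_ref, cmp_or, cmp_sub and check_variants are hypothetical, and the bodies are copied from compare, f1 and f2.

```c
#include <assert.h>
#include <limits.h>

/* Branchy reference, same logic as compare() in the question. */
int cmp_ref(int a, int b)
{
    if (a < b) return -1;
    if (a > b) return 1;
    return 0;
}

/* Branch-free variants, same expressions as f1 and f2. */
int cmp_or(int a, int b)  { return (a > b) | -(b > a); }
int cmp_sub(int a, int b) { return (a > b) - (a < b); }

/* Assert that all three agree on every pair drawn from a few edge values. */
void check_variants(void)
{
    static const int v[] = { INT_MIN, INT_MIN + 1, -1, 0, 1, INT_MAX - 1, INT_MAX };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            int r = cmp_ref(v[i], v[j]);
            assert(cmp_or(v[i], v[j]) == r);
            assert(cmp_sub(v[i], v[j]) == r);
        }
    }
}
```

Only comparisons are used here, so no pair can overflow, including INT_MIN against INT_MAX.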
int compare1(int a, int b)
{
    if (a < b) return -1;
    else if (a > b) return 1;
    else return 0;
}
int compare2(int a, int b)
{
    if (a < b) return -1;
    if (a > b) return 1;
    return 0;
}
Compile command:
gcc -g -O2 test.c -o test -lrt
Tested with randomly generated numbers; the results (Ubuntu 12.04) are:
f1: diff:10269370, start:655642, end:10925012, ret:0
f2: diff:10187843, start:10975174, end:21163017, ret:0
f3: diff:15027995, start:21183528, end:36211523, ret:0
compare1: diff:11095015, start:36236973, end:47331988, ret:0
compare2: diff:10858906, start:47362289, end:58221195, ret:0
Note: diff is the elapsed time (end minus start).
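The timing code itself is not shown in this answer. As a rough idea of how such a measurement could be set up, here is a sketch under stated assumptions: it uses the standard clock() function rather than whatever timer the original used, and cmp_fn, fill_random and time_cmp are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

typedef int (*cmp_fn)(int, int);

/* One of the functions under test, as in the question. */
int compare2(int a, int b)
{
    if (a < b) return -1;
    if (a > b) return 1;
    return 0;
}

/* Fill buf with n pseudo-random ints from a fixed seed so runs are repeatable. */
void fill_random(int *buf, size_t n, unsigned seed)
{
    srand(seed);
    for (size_t i = 0; i < n; i++)
        buf[i] = rand() - RAND_MAX / 2;
}

/* Call cmp on adjacent pairs, `rounds` times over; return the CPU clock
   ticks used. The sum of results is stored in *ret through a volatile
   accumulator so the compiler cannot discard the calls. */
clock_t time_cmp(cmp_fn cmp, const int *buf, size_t n, int rounds, int *ret)
{
    assert(buf != NULL && ret != NULL && n >= 2);
    volatile int sink = 0;
    clock_t start = clock();
    for (int r = 0; r < rounds; r++)
        for (size_t i = 0; i + 1 < n; i++)
            sink += cmp(buf[i], buf[i + 1]);
    clock_t end = clock();
    *ret = sink;
    return end - start;
}
```

Running time_cmp on each candidate over the same buffer gives tick counts that can be compared the same way as the diff column above. Calling through a function pointer keeps the compiler from inlining the comparison, which may or may not match how the original test was built.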
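f1 and f2 come out slightly ahead, but mostly by less than 10%; simple algorithms like these are not going to differ by much. The assembly for compare1 and compare2 is exactly the same: both use 5 instructions, so there is no difference in efficiency.
[Disassembly of compare1 and compare2: addresses 0x4005a0 to 0x4005ac and 0x4005b0 to 0x4005bc; instruction text not preserved]
f1 uses eight instructions, yet its performance here is surprisingly good, probably because they are all simple instructions:
xorl %eax, %eax
cmpl %edi, %esi
setg %al
xorl %edx, %edx
negl %eax
cmpl %esi, %edi
setg %dl
orl %edx, %eax
f2 uses six instructions and also performs well:
xorl %eax, %eax
cmpl %esi, %edi
setl %dl
setg %al
movzbl %dl, %edx
subl %edx, %eax
Note that there is one more very common implementation:
return (a < b) ? -1 : (a > b);
Its assembly is the same as that of compare1 / compare2, so it is not listed separately.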
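The complete repository, including source, assembly, and data:
https://github.com/geekan/c-algorithm/tree/master/integer_comparison
Reference: Efficient integer compare function
Conclusion first: without compiler optimization there is a slight difference; with optimization turned on, the two are exactly the same.
Test environment: win7 + gcc4.9.2
Test code: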
int compare1(int a, int b)
{
    if (a < b) return -1;
    else if (a > b) return 1;
    else return 0;
}
int compare2(int a, int b)
{
    if (a < b) return -1;
    if (a > b) return 1;
    return 0;
}
Output of objdump -d after gcc -c:
Disassembly of section .text:
00000000 <_compare1>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 8b 45 08 mov 0x8(%ebp),%eax
6: 3b 45 0c cmp 0xc(%ebp),%eax
9: 7d 07 jge 12 <_compare1+0x12>
b: b8 ff ff ff ff mov $0xffffffff,%eax
10: eb 14 jmp 26 <_compare1+0x26>
12: 8b 45 08 mov 0x8(%ebp),%eax
15: 3b 45 0c cmp 0xc(%ebp),%eax
18: 7e 07 jle 21 <_compare1+0x21>
1a: b8 01 00 00 00 mov $0x1,%eax
1f: eb 05 jmp 26 <_compare1+0x26>
21: b8 00 00 00 00 mov $0x0,%eax
26: 5d pop %ebp
27: c3 ret
00000028 <_compare2>:
28: 55 push %ebp
29: 89 e5 mov %esp,%ebp
2b: 8b 45 08 mov 0x8(%ebp),%eax
2e: 3b 45 0c cmp 0xc(%ebp),%eax
31: 7d 07 jge 3a <_compare2+0x12>
33: b8 ff ff ff ff mov $0xffffffff,%eax
38: eb 14 jmp 4e <_compare2+0x26>
3a: 8b 45 08 mov 0x8(%ebp),%eax
3d: 3b 45 0c cmp 0xc(%ebp),%eax
40: 7e 07 jle 49 <_compare2+0x21>
42: b8 01 00 00 00 mov $0x1,%eax
47: eb 05 jmp 4e <_compare2+0x26>
49: b8 00 00 00 00 mov $0x0,%eax
4e: 5d pop %ebp
4f: c3 ret
After compiling with -O2:
Disassembly of section .text:
00000000 <_compare1>:
0: 8b 44 24 08 mov 0x8(%esp),%eax
4: 39 44 24 04 cmp %eax,0x4(%esp)
8: ba ff ff ff ff mov $0xffffffff,%edx
d: 0f 9f c0 setg %al
10: 0f b6 c0 movzbl %al,%eax
13: 0f 4c c2 cmovl %edx,%eax
16: c3 ret
17: 89 f6 mov %esi,%esi
19: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi
00000020 <_compare2>:
20: 8b 44 24 08 mov 0x8(%esp),%eax
24: 39 44 24 04 cmp %eax,0x4(%esp)
28: ba ff ff ff ff mov $0xffffffff,%edx
2d: 0f 9f c0 setg %al
30: 0f b6 c0 movzbl %al,%eax
33: 0f 4c c2 cmovl %edx,%eax
36: c3 ret
Suppose the source code is as shown in Figure 1:
int compare1(int a, int b) {
    if (a < b) return -1;
    else if (a > b) return 1;
    else return 0;
}
int compare2(int a, int b) {
    if (a < b) return -1;
    if (a > b) return 1;
    return 0;
}
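Below is the LLVM IR produced by clang 3.7 without any optimization flags; it is a direct translation of the original semantics (Figure 2). As soon as -O1 or higher is used, however, the two functions produce identical IR (Figure 3), and once the IR is identical, the assembly generated from it will be identical too, barring a compiler bug.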
; Function Attrs: nounwind ssp uwtable
define i32 @compare1(i32 %a, i32 %b) #0 {
entry:
%retval = alloca i32, align 4
%a.addr = alloca i32, align 4
%b.addr = alloca i32, align 4
store i32 %a, i32* %a.addr, align 4
store i32 %b, i32* %b.addr, align 4
%0 = load i32, i32* %a.addr, align 4
%1 = load i32, i32* %b.addr, align 4
%cmp = icmp slt i32 %0, %1
br i1 %cmp, label %if.then, label %if.else
if.then: ; preds = %entry
store i32 -1, i32* %retval
br label %return
if.else: ; preds = %entry
%2 = load i32, i32* %a.addr, align 4
%3 = load i32, i32* %b.addr, align 4
%cmp1 = icmp sgt i32 %2, %3
br i1 %cmp1, label %if.then.2, label %if.else.3
if.then.2: ; preds = %if.else
store i32 1, i32* %retval
br label %return
if.else.3: ; preds = %if.else
store i32 0, i32* %retval
br label %return
return: ; preds = %if.else.3, %if.then.2, %if.then
%4 = load i32, i32* %retval
ret i32 %4
}
; Function Attrs: nounwind ssp uwtable
define i32 @compare2(i32 %a, i32 %b) #0 {
entry:
%retval = alloca i32, align 4
%a.addr = alloca i32, align 4
%b.addr = alloca i32, align 4
store i32 %a, i32* %a.addr, align 4
store i32 %b, i32* %b.addr, align 4
%0 = load i32, i32* %a.addr, align 4
%1 = load i32, i32* %b.addr, align 4
%cmp = icmp slt i32 %0, %1
br i1 %cmp, label %if.then, label %if.end
if.then: ; preds = %entry
store i32 -1, i32* %retval
br label %return
if.end: ; preds = %entry
%2 = load i32, i32* %a.addr, align 4
%3 = load i32, i32* %b.addr, align 4
%cmp1 = icmp sgt i32 %2, %3
br i1 %cmp1, label %if.then.2, label %if.end.3
if.then.2: ; preds = %if.end
store i32 1, i32* %retval
br label %return
if.end.3: ; preds = %if.end
store i32 0, i32* %retval
br label %return
return: ; preds = %if.end.3, %if.then.2, %if.then
%4 = load i32, i32* %retval
ret i32 %4
}
; Function Attrs: nounwind readnone ssp uwtable
define i32 @compare1(i32 %a, i32 %b) #0 {
entry:
%cmp = icmp slt i32 %a, %b
%cmp1 = icmp sgt i32 %a, %b
%. = zext i1 %cmp1 to i32
%retval.0 = select i1 %cmp, i32 -1, i32 %.
ret i32 %retval.0
}
; Function Attrs: nounwind readnone ssp uwtable
define i32 @compare2(i32 %a, i32 %b) #0 {
entry:
%cmp = icmp slt i32 %a, %b
%cmp1 = icmp sgt i32 %a, %b
%. = zext i1 %cmp1 to i32
%retval.0 = select i1 %cmp, i32 -1, i32 %.
ret i32 %retval.0
}
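The optimized IR above reduces both functions to two comparisons, a zero-extension and a select. As a rough illustration of that shape, here is a hand-written C sketch (not compiler output; compare_select and check_compare_select are hypothetical names) that follows the IR line by line:

```c
#include <assert.h>

/* Mirrors the optimized IR: two comparisons feeding one select. */
int compare_select(int a, int b)
{
    int cmp  = a < b;        /* %cmp  = icmp slt i32 %a, %b        */
    int cmp1 = a > b;        /* %cmp1 = icmp sgt, then zext to i32 */
    return cmp ? -1 : cmp1;  /* select i1 %cmp, i32 -1, i32 %.     */
}

/* Spot-check the three possible outcomes. */
void check_compare_select(void)
{
    assert(compare_select(1, 2) == -1);
    assert(compare_select(2, 1) == 1);
    assert(compare_select(2, 2) == 0);
}
```

This is the same expression as the ternary form `(a < b) ? -1 : (a > b)`, which is why that form compiles to the same code as compare1 and compare2.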
With cygwin gcc and default options, the two are the same:
00401190 <_compare1>:
401190: 55 push %ebp
401191: 89 e5 mov %esp,%ebp
401193: 8b 45 08 mov 0x8(%ebp),%eax
401196: 3b 45 0c cmp 0xc(%ebp),%eax
401199: 7d 07 jge 4011a2 <_compare1+0x12>
40119b: b8 ff ff ff ff mov $0xffffffff,%eax
4011a0: eb 14 jmp 4011b6 <_compare1+0x26>
4011a2: 8b 45 08 mov 0x8(%ebp),%eax
4011a5: 3b 45 0c cmp 0xc(%ebp),%eax
4011a8: 7e 07 jle 4011b1 <_compare1+0x21>
4011aa: b8 01 00 00 00 mov $0x1,%eax
4011af: eb 05 jmp 4011b6 <_compare1+0x26>
4011b1: b8 00 00 00 00 mov $0x0,%eax
4011b6: 5d pop %ebp
4011b7: c3 ret
004011b8 <_compare2>:
4011b8: 55 push %ebp
4011b9: 89 e5 mov %esp,%ebp
4011bb: 8b 45 08 mov 0x8(%ebp),%eax
4011be: 3b 45 0c cmp 0xc(%ebp),%eax
4011c1: 7d 07 jge 4011ca <_compare2+0x12>
4011c3: b8 ff ff ff ff mov $0xffffffff,%eax
4011c8: eb 14 jmp 4011de <_compare2+0x26>
4011ca: 8b 45 08 mov 0x8(%ebp),%eax
4011cd: 3b 45 0c cmp 0xc(%ebp),%eax
4011d0: 7e 07 jle 4011d9 <_compare2+0x21>
4011d2: b8 01 00 00 00 mov $0x1,%eax
4011d7: eb 05 jmp 4011de <_compare2+0x26>
4011d9: b8 00 00 00 00 mov $0x0,%eax
4011de: 5d pop %ebp
4011df: c3 ret
/* update: may overflow (signed overflow is undefined behavior in C).
   For example, when int is 32-bit, compare(2147483647, -1)
   will in practice return a negative number (-2147483648). */
int compare(int a, int b)
{
    return (a - b);
}
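The overflow noted in the comment can be avoided by doing the subtraction in a wider type and then reducing the result to its sign. A minimal sketch, assuming long long is wider than int on the target (compare_wide is a hypothetical name, not from the original answer):

```c
#include <assert.h>
#include <limits.h>

/* Assumption: long long is strictly wider than int, so a - b always fits. */
static_assert(sizeof(long long) > sizeof(int), "need a wider type than int");

/* Subtract in the wider type, then map the difference to -1, 0 or 1. */
int compare_wide(int a, int b)
{
    long long d = (long long)a - (long long)b;
    return (d > 0) - (d < 0);
}
```

With this version compare_wide(2147483647, -1) is positive, as expected. The comparison-only forms such as (a > b) - (a < b) avoid the problem as well, without needing a wider type.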
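Wouldn't looking at the assembly settle it? if...else if...else is a parallel arrangement, while if...if is a serial one. When the conditions carry equal weight, prefer the parallel form to cut unnecessary overhead.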