
Is the size of char in C 4 or 1?

Here is the code:

#include <stdio.h>

int main(void)
{
    const char *s = "hello";
    /* sizeof yields a size_t, so print it with %zu */
    printf("sizeof char is %zu, sizeof char* is %zu, sizeof 'a' is %zu\n",
           sizeof(char), sizeof(char*), sizeof('a'));
    printf("sizeof s is %zu\n", sizeof(*s+0));
    printf("sizeof s is %zu\n", sizeof(*s));
    return 0;
}

Compiled with gcc 4.4.6, the program prints:

sizeof char is 1, sizeof char* is 8, sizeof 'a' is 4
sizeof s is 4
sizeof s is 1

Could any of the experts here explain this?


First, sizeof(char) is always 1.

The C99 standard defines sizeof like this:

6.5.3.4 The sizeof operator

When applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1. When applied to an operand that has array type, the result is the total number of bytes in the array. When applied to an operand that has structure or union type, the result is the total number of bytes in such an object, including internal and trailing padding.

Also, the definition of char itself never actually says that a char is one byte; it only requires it to be large enough:

6.2.5 Types

An object declared as type char is large enough to store any member of the basic execution character set. If a member of the basic execution character set is stored in a char object, its value is guaranteed to be positive. If any other character is stored in a char object, the resulting value is implementation-defined but shall be within the range of values that can be represented in that type.

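So what sizeof really tells us is how many chars something occupies, not how many bytes. But the second half of the sizeof text uses bytes for both arrays and structs, so the only sensible reading is that one char is one byte. (My guess is the standards committee folks got a bit dizzy while writing it?)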

Next, sizeof('a') is always sizeof(int).

6.4.4.4 Character constants

An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'.

An integer character constant has type int.

Next, let me explain sizeof(*s+0).

What happens here is integer promotion:

6.3.1 Arithmetic operands

The following may be used in an expression wherever an int or unsigned int may be used:

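- An object or expression with an integer type whose integer conversion rank is less than the rank of int and unsigned int.

In other words, integer types smaller than int (char included) are promoted to int when they take part in arithmetic, so the result is naturally an int as well.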

Finally, a word on how byte and bit relate:

3.6

1 byte

addressable unit of data storage large enough to hold any member of the basic character set of the execution environment

2 NOTE 1 It is possible to express the address of each individual byte of an object uniquely.

3 NOTE 2 A byte is composed of a contiguous sequence of bits, the number of which is implementation-defined. The least significant bit is called the low-order bit; the most significant bit is called the high-order bit.

But "implementation-defined" does not mean anything goes; the standard at least sets a lower bound:

large enough to hold any member of the basic character set

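According to 5.2.1, the basic character set has 95 members: 26*2 uppercase and lowercase letters, 10 digits, 29 symbols, and 4 special characters, so at least 7 bits are needed. Add the requirement from the definition of char in 6.2.5 that these values be positive, and one more bit is needed for the sign: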

If a member of the basic execution character set is stored in a char object, its value is guaranteed to be positive

That means a byte must have at least 8 bits. So in theory a system with 1 byte = 9 bits could exist, but a system with 1 byte = 7 bits cannot.


Why are some C features not supported by C++? - 陳碩's answer


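In C, 'a' is treated as an int constant, so sizeof('a') is 4 (that is, sizeof(int)).

*s is a char, but 0 is an int; when the two are added, the char is implicitly converted (thanks for pointing it out, the right word is "promoted") to int, so the sum is an int.

sizeof(char) is always 1, no matter how many bits it has...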


A char is one byte; how many bits are in a byte depends on the CPU.


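#include <stdio.h>

int main(void)
{
    const char *s = "hello";
    printf("sizeof char is %zu, sizeof char* is %zu, sizeof 'a' is %zu\n",
           sizeof(char), sizeof(char*), sizeof('a'));
    // char is a character, 1 byte; char* is a pointer, 8 bytes on a 64-bit machine; 'a' is an int, 4 bytes
    printf("sizeof s is %zu\n", sizeof(*s+0));
    // *s is the first element of the array, 'h'; adding 0 promotes it to int, 4 bytes
    printf("sizeof s is %zu\n", sizeof(*s));
    // *s is the first element of the array, 'h', 1 byte
    return 0;
}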


OP, it's usually 8 bits.

