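What is the real frontend in Clang?

clang -cc1 is the frontend in the narrower sense, yet it too just invokes the various FrontendActions. A FrontendAction feels like a driver in its own right: it parses, runs semantic analysis, and can even call the backend to generate code (-emit-obj). So how do you define "the frontend" in Clang more precisely?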


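In short:

clang -cc1 is the frontend. The -emit-obj you mention corresponds to one of the Actions that CreateFrontendBaseAction can create, but that Action is not taken by default. It only runs when an option is passed in that changes the default Action.

In detail, stepping through the code:

Once inside the Driver, we have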

return ExecuteCC1Tool(argv, argv[1] + 4);

clang/driver.cpp at master · llvm-mirror/clang · GitHub

which hands control to cc1. That function looks like this:

static int ExecuteCC1Tool(ArrayRef<const char *> argv, StringRef Tool) {
  void *GetExecutablePathVP = (void *)(intptr_t) GetExecutablePath;
  if (Tool == "")
    return cc1_main(argv.slice(2), argv[0], GetExecutablePathVP);
  if (Tool == "as")
    return cc1as_main(argv.slice(2), argv[0], GetExecutablePathVP);

  // Reject unknown tools.
  llvm::errs() << "error: unknown integrated tool '" << Tool << "'\n";
  return 1;
}

clang/driver.cpp at master · llvm-mirror/clang · GitHub
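To make the `argv[1] + 4` part concrete, here is a minimal sketch, assuming the driver only takes this path when `argv[1]` starts with `-cc1` (so skipping 4 characters leaves `""` for `-cc1` and `"as"` for `-cc1as`). The names `splitCC1Arg` and `dispatchTool` are hypothetical helpers made up for illustration; they are not part of Clang, and the dispatcher returns a label instead of running anything.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper (not Clang code): returns true and stores the tool
// suffix when `arg` begins with "-cc1".
bool splitCC1Arg(const char *arg, std::string &tool) {
  if (std::strncmp(arg, "-cc1", 4) != 0)
    return false;
  tool = arg + 4;  // Same pointer arithmetic as `argv[1] + 4`.
  return true;
}

// Hypothetical dispatcher with the same shape as ExecuteCC1Tool.
// It only reports which entry point would be chosen.
std::string dispatchTool(const std::string &tool) {
  if (tool.empty())
    return "cc1_main";
  if (tool == "as")
    return "cc1as_main";
  return "unknown";  // The real function prints an error and returns 1.
}
```

With this model, `-cc1` selects the compiler frontend entry point, `-cc1as` selects the integrated assembler, and any other suffix is rejected.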

Next, once we are inside cc1_main, the statement worth noting is

std::unique_ptr<CompilerInstance> Clang(new CompilerInstance());

clang/cc1_main.cpp at master · llvm-mirror/clang · GitHub

The Clang object here is a CompilerInstance, and CompilerInstance is a helper class:

/// CompilerInstance - Helper class for managing a single instance of the Clang
/// compiler.

Looking at its declaration, we can see that components such as PP (the preprocessor) and Sema are each held separately.

clang/CompilerInstance.h at master · llvm-mirror/clang · GitHub

Continuing further down, we find

// Execute the frontend actions.
Success = ExecuteCompilerInvocation(Clang.get());

and this is the part we want to focus on.

clang/cc1_main.cpp at master · llvm-mirror/clang · GitHub

Next we step into ExecuteCompilerInvocation, where we find the FrontendAction you mentioned:

// Create and execute the frontend action.
std::unique_ptr<FrontendAction> Act(CreateFrontendAction(*Clang));

clang/ExecuteCompilerInvocation.cpp at master · llvm-mirror/clang · GitHub

CreateFrontendAction lives in this same file. The part to look at is the base action:

// Create the underlying action.
FrontendAction *Act = CreateFrontendBaseAction(CI);

clang/ExecuteCompilerInvocation.cpp at master · llvm-mirror/clang · GitHub

CreateFrontendBaseAction also happens to be in this file:

static FrontendAction *CreateFrontendBaseAction(CompilerInstance &CI) {
  using namespace clang::frontend;
  StringRef Action("unknown");
  (void)Action;

  switch (CI.getFrontendOpts().ProgramAction) {
  case ASTDeclList:            return new ASTDeclListAction();
  case ASTDump:                return new ASTDumpAction();
  case ASTPrint:               return new ASTPrintAction();
  case ASTView:                return new ASTViewAction();
  case DumpRawTokens:          return new DumpRawTokensAction();
  case DumpTokens:             return new DumpTokensAction();
  case EmitAssembly:           return new EmitAssemblyAction();
  case EmitBC:                 return new EmitBCAction();
  case EmitHTML:               return new HTMLPrintAction();
  case EmitLLVM:               return new EmitLLVMAction();
  case EmitLLVMOnly:           return new EmitLLVMOnlyAction();
  case EmitCodeGenOnly:        return new EmitCodeGenOnlyAction();
  case EmitObj:                return new EmitObjAction();
  case FixIt:                  return new FixItAction();
  case GenerateModule:         return new GenerateModuleAction;
  case GeneratePCH:            return new GeneratePCHAction;
  case GeneratePTH:            return new GeneratePTHAction();
  case InitOnly:               return new InitOnlyAction();
  case ParseSyntaxOnly:        return new SyntaxOnlyAction();
  case ModuleFileInfo:         return new DumpModuleInfoAction();
  case VerifyPCH:              return new VerifyPCHAction();
  // ...

clang/ExecuteCompilerInvocation.cpp at master · llvm-mirror/clang · GitHub
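The switch above is a plain factory: one enum value in, one action object out. As a rough sketch of that pattern only, here is a toy version. `ToyAction`, its subclasses, `createToyAction`, and the `runsCodeGen` flag are all hypothetical names invented for this example; the real FrontendAction classes have a much richer interface.

```cpp
#include <memory>
#include <string>

// Toy stand-ins for illustration only; these are NOT Clang's real classes.
enum class ProgramAction { ParseSyntaxOnly, EmitAssembly, EmitObj };

struct ToyAction {
  virtual ~ToyAction() = default;
  virtual std::string name() const = 0;
  // True when the action would continue past parsing and semantic
  // analysis into code generation.
  virtual bool runsCodeGen() const = 0;
};

struct ToySyntaxOnlyAction : ToyAction {
  std::string name() const override { return "SyntaxOnly"; }
  bool runsCodeGen() const override { return false; }
};

struct ToyEmitAssemblyAction : ToyAction {
  std::string name() const override { return "EmitAssembly"; }
  bool runsCodeGen() const override { return true; }
};

struct ToyEmitObjAction : ToyAction {
  std::string name() const override { return "EmitObj"; }
  bool runsCodeGen() const override { return true; }
};

// Same shape as the switch in CreateFrontendBaseAction, but returning an
// owning pointer directly instead of a raw `new`.
std::unique_ptr<ToyAction> createToyAction(ProgramAction kind) {
  switch (kind) {
  case ProgramAction::ParseSyntaxOnly:
    return std::make_unique<ToySyntaxOnlyAction>();
  case ProgramAction::EmitAssembly:
    return std::make_unique<ToyEmitAssemblyAction>();
  case ProgramAction::EmitObj:
    return std::make_unique<ToyEmitObjAction>();
  }
  return nullptr;  // Not reached for the enumerators above.
}
```

The point of the sketch is that whether code generation happens at all is decided purely by which enum value reaches the factory.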

So we have finally found the EmitObj you mentioned. But what is the default ProgramAction for clang -cc1?

Let's go back to cc1_main:

bool Success = CompilerInvocation::CreateFromArgs(
    Clang->getInvocation(), Argv.begin(), Argv.end(), Diags);

clang/cc1_main.cpp at master · llvm-mirror/clang · GitHub

What does this statement do? Stepping into CreateFromArgs, look at this line:

InputKind DashX = ParseFrontendArgs(Res.getFrontendOpts(), Args, Diags);

clang/CompilerInvocation.cpp at master · llvm-mirror/clang · GitHub

Then we step into that function:

static InputKind ParseFrontendArgs(FrontendOptions &Opts, ArgList &Args,
                                   DiagnosticsEngine &Diags) {
  using namespace options;
  Opts.ProgramAction = frontend::ParseSyntaxOnly;
  if (const Arg *A = Args.getLastArg(OPT_Action_Group)) {
    switch (A->getOption().getID()) {
    default:
      llvm_unreachable("Invalid option in group!");
    case OPT_ast_list:
      Opts.ProgramAction = frontend::ASTDeclList; break;
    case OPT_ast_dump:
    case OPT_ast_dump_lookups:
      Opts.ProgramAction = frontend::ASTDump; break;
    case OPT_ast_print:
      Opts.ProgramAction = frontend::ASTPrint; break;
    case OPT_ast_view:
      Opts.ProgramAction = frontend::ASTView; break;
    case OPT_dump_raw_tokens:
      Opts.ProgramAction = frontend::DumpRawTokens; break;
    case OPT_dump_tokens:
      Opts.ProgramAction = frontend::DumpTokens; break;
    case OPT_S:
      Opts.ProgramAction = frontend::EmitAssembly; break;
    case OPT_emit_llvm_bc:
      Opts.ProgramAction = frontend::EmitBC; break;
    case OPT_emit_html:
      Opts.ProgramAction = frontend::EmitHTML; break;
    case OPT_emit_llvm:
      Opts.ProgramAction = frontend::EmitLLVM; break;
    case OPT_emit_llvm_only:
      Opts.ProgramAction = frontend::EmitLLVMOnly; break;
    case OPT_emit_codegen_only:
      Opts.ProgramAction = frontend::EmitCodeGenOnly; break;
    case OPT_emit_obj:
      Opts.ProgramAction = frontend::EmitObj; break;
    // ...

clang/CompilerInvocation.cpp at master · llvm-mirror/clang · GitHub

So the default behavior is

Opts.ProgramAction = frontend::ParseSyntaxOnly;

which means the matching case in the CreateFrontendBaseAction we saw earlier is

case ParseSyntaxOnly: return new SyntaxOnlyAction();
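The selection logic traced above boils down to two rules: start from ParseSyntaxOnly, and let the last action option on the command line override it (that is what `getLastArg` on the action group amounts to). Here is a minimal sketch of just those two rules. `parseProgramAction` is a hypothetical function written for this example, it handles only three of the flags from the excerpt, and it matches plain strings instead of going through Clang's option tables.

```cpp
#include <string>
#include <vector>

// Toy subset of the frontend action kinds; not Clang's real enum.
enum class ProgramAction { ParseSyntaxOnly, EmitAssembly, EmitLLVM, EmitObj };

// Hypothetical model of the selection in ParseFrontendArgs:
// begin with the default and let the last recognized action flag win.
ProgramAction parseProgramAction(const std::vector<std::string> &args) {
  ProgramAction action = ProgramAction::ParseSyntaxOnly;  // The default.
  for (const std::string &arg : args) {
    if (arg == "-S")
      action = ProgramAction::EmitAssembly;
    else if (arg == "-emit-llvm")
      action = ProgramAction::EmitLLVM;
    else if (arg == "-emit-obj")
      action = ProgramAction::EmitObj;
    // Any other argument leaves the chosen action untouched.
  }
  return action;
}
```

Under this model, an argument list with no action flag stays at ParseSyntaxOnly, `-emit-obj` alone selects EmitObj, and `-emit-obj` followed by `-S` ends up at EmitAssembly because the later flag wins.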

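So cc1 is the frontend. It simply has a set of Actions, and it only switches away from the default Action when the corresponding option is passed to it on entry. If you are interested, you can dig further into what each Action actually does, and into how the Driver steps through the frontend and the backend, but I won't go into that here.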

