Saturday, April 3, 2010

How does a C program work

Being a new user on Stack Overflow, I cannot post more than one link, so I ended up writing my answer here. The question is "Some general C questions".

The question seemed intriguing, so I did some research; what I found is explained below.

MSDN has some material on the C language; check out the [C language reference on MSDN][1]. In my view, when you are programming on Windows, it is worth searching MSDN through Google. You will find many useful samples of OS APIs along the way, which makes your job much easier. The new MSDN layout, with related links on the left side of each help page, also helps you discover things.

Now for the structure of a program and how it executes. When you build, the linker looks in the standard locations for the standard C library. With Visual Studio on Windows, those functions live in msvcrXXX.dll, where XXX is the version: for VS 2010 it is 100 (version 10), i.e. msvcr100.dll. VS 2010 can also be told to use VC90. More is [here][2] and in the [Run-time library reference][3]. By default the DLL's code is not copied into your program; only a reference (a stub) to each function is inserted, using the corresponding .lib import library. At run time, the DLL that actually implements the function is loaded into memory and the stub is resolved to the function's address in the DLL. The DLL is mapped into your program's virtual address space. By the way, a DLL is like a shared library on UNIX: once it is loaded into memory, other programs can share its code too. You can use [depends.exe][4] to see the implicit dependencies of your program. On Windows, you can also load a DLL explicitly at run time with LoadLibrary and then look up its functions with GetProcAddress.
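To make the explicit-loading case concrete, here is a minimal sketch. It assumes a Windows build; the function name `page_size_via_explicit_load` and the choice of kernel32.dll with GetSystemInfo are mine, purely for illustration. On other platforms the function just returns 0 so that the file still compiles.

```c
#ifdef _WIN32
#include <windows.h>

/* Signature of GetSystemInfo, needed to call it through a pointer. */
typedef void (WINAPI *GetSystemInfoFn)(LPSYSTEM_INFO);
#endif

/* Hypothetical helper: returns the page size obtained through an explicitly
   loaded kernel32.dll, or 0 if loading fails or this is not Windows. */
unsigned long page_size_via_explicit_load(void) {
#ifdef _WIN32
    unsigned long page_size = 0;
    GetSystemInfoFn get_info;
    SYSTEM_INFO info;

    /* Load the DLL (or just bump its reference count if already loaded). */
    HMODULE dll = LoadLibraryA("kernel32.dll");
    if (dll == NULL) {
        return 0;
    }

    /* Look the function up by name instead of relying on an import stub. */
    get_info = (GetSystemInfoFn)GetProcAddress(dll, "GetSystemInfo");
    if (get_info != NULL) {
        get_info(&info);
        page_size = info.dwPageSize;
    }

    FreeLibrary(dll);
    return page_size;
#else
    return 0; /* explicit DLL loading as shown here is Windows-only */
#endif
}
```

Because the function is found by name at run time, no .lib file for the DLL is needed at build time; the price is that a missing DLL or function only shows up when the program runs.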

For functions that are not part of standard C, such as OS APIs, check the documentation to find the library that implements them. For example, to get system information you call [GetSystemInfo][5]. Its MSDN page tells you that it lives in kernel32.dll, with kernel32.lib as the import library. At build time you pass kernel32.lib to the linker (through cl.exe when using VS), and at run time the Windows loader locates kernel32.dll, since it is a standard OS library. The address of each imported function is normally resolved only once, not on every call. Linker options such as /LIBPATH let you specify a custom location for .lib files; DLLs are found at run time through the DLL search order, which includes the application's directory and the directories on PATH.

For libraries that are neither part of standard C nor part of the OS, check their documentation to find the library name, and make sure the DLL is present on the system running the application. Usually a VS setup project does all this and puts everything in a package that you can deploy on the target machine.

You can also choose to link libraries statically. This saves the time spent loading DLLs and resolving imports at run time, but it increases program size, since the library code is embedded in your EXE.
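As a small illustration of the implicit case, the sketch below calls GetSystemInfo directly and leaves it to the import library and the loader to connect the call to kernel32.dll. The function name `processor_count` and the file name in the comment are assumptions of mine; on non-Windows platforms the function simply returns 0.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Possible build line with Visual C++, assuming the file is sysinfo.c
   and you add your own main():
       cl sysinfo.c kernel32.lib
   Naming kernel32.lib explicitly is harmless even where the toolchain
   already links it by default. */

/* Hypothetical helper: number of processors reported by the OS,
   or 0 when not built for Windows. */
unsigned long processor_count(void) {
#ifdef _WIN32
    SYSTEM_INFO info;

    /* An ordinary call: the linker recorded an import for GetSystemInfo,
       and the loader fills in its real address when the program starts. */
    GetSystemInfo(&info);
    return info.dwNumberOfProcessors;
#else
    return 0;
#endif
}
```

If you run depends.exe on the resulting EXE, kernel32.dll should show up among its dependencies.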
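With Visual C++, the C runtime itself is one library you can link either way: /MD selects the DLL version and /MT the static one. As a rough sketch, the compiler's predefined `_DLL` macro (set when /MD or /MDd is used) lets a program report which choice it was built with. The function name `crt_linkage` is my own.

```c
/* Hypothetical helper: describes how the C runtime was linked.
   _MSC_VER identifies Visual C++; _DLL is defined by that compiler
   only when the DLL version of the runtime (/MD, /MDd) is selected. */
const char *crt_linkage(void) {
#if defined(_MSC_VER) && defined(_DLL)
    return "dynamic CRT (/MD or /MDd)";
#elif defined(_MSC_VER)
    return "static CRT (/MT or /MTd)";
#else
    return "not built with Visual C++";
#endif
}
```

Building the same source once with /MD and once with /MT and comparing the EXE sizes is a quick way to see the size trade-off for yourself.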

For the EXE structure, check out [Peering inside the PE][6] and [Portable Executable structure][7]. All Windows EXEs are structured this way or in a slight variant. In an executable built from a C program, the actual entry point is not the main() function. The compiler and its runtime library add initialization and termination routines to your executable; see [GNU CC Init/Terminate][8] and [Sun Init/Terminate][9]. As you continue working on C programs, you will get to know these things better. In short, together with the OS loader, the initialization routines set up the stack and static data, get the required DLLs loaded, and generally prepare the environment. They also process the arguments received. After all this is done, main is called with those arguments. Once main returns, the termination routines are called; they pass the return value back to the OS, close open handles, and perform general cleanup. The implementation is very much compiler and OS dependent.
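The sequence described above can be modelled in a few lines of portable C. This is only a toy model, not the code of any real runtime: `toy_startup`, `toy_atexit`, `toy_record`, `sample_main` and the two cleanup functions are all names I made up. The one behaviour borrowed from standard C is that handlers registered with atexit run in reverse order of registration.

```c
#include <stddef.h>

#define MAX_HANDLERS 8

typedef void (*handler_fn)(void);
typedef int (*main_fn)(int argc, char **argv);

static handler_fn handlers[MAX_HANDLERS];
static size_t handler_count;

/* Trace of events, so the order of execution can be inspected afterwards. */
char toy_trace[16];
static size_t trace_len;

void toy_record(char event) {
    if (trace_len + 1 < sizeof toy_trace) {
        toy_trace[trace_len++] = event;
        toy_trace[trace_len] = '\0';
    }
}

/* Toy counterpart of atexit(): remember a function to run after main. */
int toy_atexit(handler_fn fn) {
    if (handler_count == MAX_HANDLERS) {
        return -1;
    }
    handlers[handler_count++] = fn;
    return 0;
}

/* Toy counterpart of the real entry point that wraps main. */
int toy_startup(main_fn user_main, int argc, char **argv) {
    int status;

    /* 1. Initialization: a real runtime prepares static data, I/O, etc. */
    handler_count = 0;
    trace_len = 0;
    toy_trace[0] = '\0';
    toy_record('i');

    /* 2. Call the user's main with the processed arguments. */
    status = user_main(argc, argv);

    /* 3. Termination: run cleanup handlers, last registered first. */
    while (handler_count > 0) {
        handlers[--handler_count]();
    }

    /* 4. Hand main's return value back to the caller (the OS, in reality). */
    return status;
}

void close_files(void) { toy_record('f'); }
void free_memory(void) { toy_record('m'); }

/* Stand-in for a user's main(): registers two cleanup handlers. */
int sample_main(int argc, char **argv) {
    (void)argv;
    toy_record('M');
    toy_atexit(close_files);
    toy_atexit(free_memory);
    return argc - 1;
}
```

Calling `toy_startup(sample_main, 1, args)` with a one-element argument array returns 0 and leaves `toy_trace` as "iMmf": initialization, main, then the handlers in reverse order of registration.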

  [1]: http://msdn.microsoft.com/en-us/library/fw5abdx6(v=VS.80).aspx
  [2]: http://msdn.microsoft.com/en-us/library/abx4dbyh(VS.80).aspx
  [3]: http://msdn.microsoft.com/en-us/library/59ey50w6(v=VS.80).aspx
  [4]: http://www.dependencywalker.com/
  [5]: http://msdn.microsoft.com/en-us/library/ms724381(VS.85).aspx
  [6]: http://msdn.microsoft.com/en-us/library/ms809762.aspx
  [7]: http://www.skynet.ie/~caolan/publink/winresdump/winresdump/doc/pefile.html
  [8]: http://gcc.gnu.org/onlinedocs/gccint/Initialization.html
  [9]: http://docs.sun.com/app/docs/doc/817-3677/6mj8mbtbi?a=view

PS: Even though Stack Overflow did not allow me to post this on their site, I like their editor, which creates a nice structure. In future I will consider writing my entries there first and then copy-pasting them into Windows Live Writer for better alignment and posting.

The toughest thing to find was a reference for the initialization and termination routines. There is definitely a lack of standard documentation on the Windows platform for the internals of the tools we use.
