Linux boot process: when is the Linux kernel loaded into physical memory, and how is it loaded?

In this article I will answer these questions.

I assume the reader is new to kernel concepts and programming, so I will try to explain what a kernel is and how it comes into the picture after the computer is powered on. By kernel I mean the Linux kernel only. In the first part of the blog I will discuss the basic theoretical concepts of the kernel, and after acquiring some basic knowledge we will see how to write a kernel module.

                            



So let's see what happens when a computer is powered on. As you may know, the CPU fetches and executes the instructions of a running program from RAM, the physical memory. But RAM is volatile, so at power-on it holds nothing useful. Where, then, do the instructions that get loaded into RAM come from? Typically from the hard disk (HDD). PXE network boot is another option, but for the sake of simplicity we will not consider it here. The computer is a dumb machine, though, so how does it know where on the HDD to look for the first instructions?


What is BIOS?                                                   



Here comes the BIOS (Basic Input/Output System). It performs some basic hardware checks, known as the power-on self-test (POST), and then decides whether it is safe to go ahead with the boot process. It then searches for the boot loader program and loads it. The order in which devices are searched, the boot sequence, can be changed in the BIOS setup.

                                             
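The BIOS itself is firmware stored in a non-volatile chip (ROM or flash) on the motherboard, which is why it is available even when RAM is empty. Next the BIOS reads the MBR (Master Boot Record), the first sector of the boot disk, where it finds the first stage of the boot loader program. On Linux this is typically GRUB (GRand Unified Bootloader). A system can have multiple kernel images installed, in which case GRUB shows a menu and asks which one to boot.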
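
To make the MBR step concrete, here is a minimal Python sketch (Python is used only for brevity) of the check conventionally applied to the first sector: it is 512 bytes long and ends with the signature bytes 0x55 0xAA. The 446-byte boot code area and 64-byte partition table are the classic MBR layout. The function names and the fake sector are made up for illustration, and no real disk is read.

```python
SECTOR_SIZE = 512             # classic MBR sector size in bytes
BOOT_CODE_SIZE = 446          # room for the first-stage boot loader
PARTITION_TABLE_SIZE = 64     # four 16-byte partition entries
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a bootable sector


def looks_bootable(sector: bytes) -> bool:
    """Return True if the sector has the size and signature of an MBR."""
    return len(sector) == SECTOR_SIZE and sector[-2:] == BOOT_SIGNATURE


def make_fake_mbr() -> bytes:
    """Build an empty in-memory sector with a valid signature (illustration only)."""
    return bytes(BOOT_CODE_SIZE) + bytes(PARTITION_TABLE_SIZE) + BOOT_SIGNATURE


if __name__ == "__main__":
    print(looks_bootable(make_fake_mbr()))     # sector ends in 55 AA
    print(looks_bootable(bytes(SECTOR_SIZE)))  # all zeros, no signature
```

The small size of the boot code area is the reason the MBR can only hold the first stage of a boot loader such as GRUB; the rest is loaded from elsewhere on the disk.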

                          
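Next GRUB loads the selected kernel image from disk into RAM and hands control to it. This is the point at which the Linux kernel is loaded into physical memory. The kernel then initializes itself and mounts the root file system. The kernel is the core part of the operating system: it talks directly to the hardware and does crucial administrative tasks behind the curtain that make our life easier.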


strace: On Linux, strace can be used to monitor the interactions between a process and the kernel. Each line of its output is one system call made by the process, so you get a glimpse of the tasks that the kernel does behind the curtain.
$  strace ls
execve("/bin/ls", ["ls"], [/* 21 vars */]) = 0
brk(0)                                  = 0x8c31000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb78c7000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=65354, ...}) = 0
...
...
...
Kernels broadly come in two types:
1) Microkernel (used by some real-time operating systems, for example QNX)
2) Monolithic kernel (the Linux kernel is of this type)
We will discuss them later on.

For the time being let's visit some gcc and computer architecture concepts. Here I will assume all of you have good knowledge of C programming and gcc.


Machine code & Assembly code

     
With gcc -S (capital S) you can generate the assembly code for a C file; for a hypothetical file hello.c, gcc -S hello.c writes hello.s. The assembly is then translated into machine code, which the CPU can execute. The code is machine dependent because each processor family has its own instruction set, which we call its Instruction Set Architecture (ISA). Interaction with the hardware is done using the instructions in that set.
                        

Protection Rings

Since the kernel does very critical tasks and has direct access to the hardware, code that does not run in kernel space must be given less privilege, so that nobody can tamper with the kernel or the hardware with malicious intent.

                       

                              Image courtesy: Wikipedia
So the hardware enforces protection levels, or protection rings. Ring 0 is the most privileged and Ring 3 the least. The x86 architecture provides four rings, but Linux generally uses only two of them: user space code runs in Ring 3 and kernel code runs in Ring 0.

When user code needs to do a privileged task, there is a gate mechanism, the system call, through which it can ask Ring 0 code to do the work on its behalf. Typically the C library (glibc), not the application itself, performs this job.

For instance, in user space code the statement printf("Hello World"); asks the glibc library to perform the print operation. glibc formats and buffers the text and eventually uses this gate mechanism (the write system call), because the operation needs access to the hardware and requires high privilege, and user space code is prevented from running in a high-privilege mode for security reasons.
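
The layering can be sketched without writing a full C program. The snippet below uses Python's ctypes module to call the write() wrapper in the C library that is already loaded into the process (glibc on most Linux systems, which is an assumption here); the wrapper is what actually crosses into the kernel. The helper name write_via_libc is made up for this illustration.

```python
import ctypes
import os

# Load the C library already linked into this process (assumed to be glibc).
libc = ctypes.CDLL(None, use_errno=True)
libc.write.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_size_t]
libc.write.restype = ctypes.c_ssize_t


def write_via_libc(fd: int, data: bytes) -> int:
    """Call the C library's write() wrapper, which issues the write system call."""
    written = libc.write(fd, data, len(data))
    if written < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return written


if __name__ == "__main__":
    write_via_libc(1, b"Hello World\n")  # fd 1 is standard output
```

Running this script under strace should show a write(1, "Hello World\n", 12) line, which is the moment the request passes from Ring 3 to Ring 0.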

What happens further?

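The very first process is the swapper process, which has pid 0. It is created by the kernel itself rather than spawned by another process, and it obviously runs in kernel context, since at that point no other process exists and the OS is not ready yet. The name is historical: in older Unix systems this process handled swapping. In Linux it sets up scheduling and memory management (paging) during boot and afterwards acts as the idle task, running whenever nothing else is runnable. The next process that is spawned is the init process, which has pid 1.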
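
On a running Linux system you can see this relationship in /proc/1/status, which typically reports a parent pid (PPid) of 0 for pid 1. The following is a small sketch; the helper names are made up here, and the process name you see depends on the init system in use.

```python
def parse_status(text: str) -> dict:
    """Turn the 'Key: value' lines of a /proc/<pid>/status file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def describe_pid(pid: int) -> str:
    """Summarise the name and parent of one process from /proc."""
    with open(f"/proc/{pid}/status") as f:
        info = parse_status(f.read())
    return f"pid {info['Pid']}: name={info['Name']} parent={info['PPid']}"


if __name__ == "__main__":
    try:
        print(describe_pid(1))
    except OSError as exc:  # for example, when not running on Linux
        print("could not read /proc:", exc)
```

Note that pid 0 itself has no /proc entry, so it only shows up indirectly as the parent of other processes.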

What is Linux Run Level


Depending on the run level, different sets of services are enabled. Windows users who have booted a machine into safe mode to troubleshoot a problem have in effect booted the OS into a more restrictive run level. A run level is thus a mode of operation that determines which programs, services and resources the OS makes available; in a given run level some services or hardware may not be usable. On systems that use SysV init, the /etc/inittab file is typically consulted to find the default run level. After that the scripts under /etc/rc.d/ are executed (on Red Hat-style systems, those in /etc/rc.d/rcN.d for run level N), which define the set of services and the degree of restriction with which the OS will boot.
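
As a rough sketch of how the default run level is recorded, the snippet below parses inittab-style text, where each entry has the form id:runlevels:action:process and the entry whose action is initdefault names the default run level. The sample text is made-up data in the style of a SysV init system, not read from a real machine, and the function name is hypothetical.

```python
from typing import Optional


def default_runlevel(inittab_text: str) -> Optional[str]:
    """Return the run level named by the initdefault entry, or None."""
    for raw in inittab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(":", 3)  # id:runlevels:action:process
        if len(parts) == 4 and parts[2] == "initdefault":
            return parts[1]
    return None


# Sample data for illustration only.
SAMPLE = """\
# Default run level
id:3:initdefault:
si::sysinit:/etc/rc.d/rc.sysinit
l3:3:wait:/etc/rc.d/rc 3
"""

if __name__ == "__main__":
    print(default_runlevel(SAMPLE))  # prints 3
```

Many current distributions use systemd instead of SysV init and have no /etc/inittab at all, so treat this as an illustration of the file format rather than something to run against every system.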

The init process runs as a daemon for the whole lifetime of the system, and every other user space process is a descendant of init.
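
You can check this ancestry for any process by following parent links through /proc/<pid>/stat. The sketch below does so for the current process; on an ordinary Linux system the chain usually ends at pid 1, though inside containers it may stop earlier. The function names are made up for this example.

```python
import os


def parse_stat(line: str):
    """Return (pid, comm, ppid) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left, _, rest = line.partition("(")
    comm, _, tail = rest.rpartition(")")
    fields = tail.split()  # fields[0] is the state, fields[1] is the ppid
    return int(left), comm, int(fields[1])


def ancestors(pid: int):
    """Follow parent links from pid until pid 0 is reached or /proc runs out."""
    chain = []
    while pid > 0:
        try:
            with open(f"/proc/{pid}/stat") as f:
                this_pid, comm, ppid = parse_stat(f.read())
        except OSError:
            break  # process vanished or /proc is unavailable
        chain.append((this_pid, comm))
        pid = ppid
    return chain


if __name__ == "__main__":
    for pid, comm in ancestors(os.getpid()):
        print(pid, comm)
```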
