Linux Kernel Exploration ToolKit – Part 2 – Queue (FIFO)

A common programming pattern in any operating system kernel is producer and consumer. In this pattern, a producer creates data—say, networking packets to be processed or error messages to be read—while a consumer, in turn, reads, processes, or otherwise consumes the data. Often the easiest way to implement this pattern is with a queue. The producer pushes data onto the queue and the consumer pulls data off the queue. The consumer retrieves the data in the order it was enqueued. That is, the first data on the queue is the first data off the queue. For this reason, queues are also called FIFOs, short for first-in, first-out. See the figure below for an example of a standard queue.

The Linux kernel’s generic queue implementation is called kfifo and is implemented in <kernel/kfifo.c> and declared in <linux/kfifo.h>.


Linux’s kfifo works like most other queue abstractions, providing two primary operations: enqueue (in) and dequeue (out). The kfifo object maintains two offsets into the queue: an in offset and an out offset. The in offset is the location in the queue to which the next enqueue will occur. The out offset is the location in the queue from which the next dequeue will occur. The out offset is always less than or equal to the in offset. It wouldn’t make sense for it to be greater; otherwise, you could dequeue data that had not yet been enqueued.

The enqueue (in) operation copies data into the queue, starting at the in offset. When complete, the in offset is incremented by the amount of data enqueued. The dequeue (out) operation copies data out of the queue, starting from the out offset. When complete, the out offset is incremented by the amount of data dequeued. When the out offset is equal to the in offset, the queue is empty: no more data can be dequeued until more data is enqueued. When the in offset is equal to the length of the queue, no more data can be enqueued until the queue is reset.
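The offset arithmetic above can be sketched in plain user-space C. This is a toy model, not the kernel's code: the names (toy_fifo, toy_in, toy_out, toy_len) are made up for illustration, but the idea mirrors what kfifo does, with a power-of-two buffer so that masking a free-running counter yields a buffer position:

```c
#define FIFO_SIZE 8  /* must be a power of 2 */

/* toy model of the kfifo idea: 'in' and 'out' are free-running counters */
struct toy_fifo {
        unsigned int in;   /* next enqueue position */
        unsigned int out;  /* next dequeue position */
        unsigned char data[FIFO_SIZE];
};

/* number of bytes currently queued: out never passes in */
static unsigned int toy_len(const struct toy_fifo *f)
{
        return f->in - f->out;
}

/* enqueue up to len bytes; returns how many bytes were actually copied */
static unsigned int toy_in(struct toy_fifo *f, const void *from, unsigned int len)
{
        const unsigned char *src = from;
        unsigned int avail = FIFO_SIZE - toy_len(f);
        unsigned int n = len < avail ? len : avail;

        for (unsigned int i = 0; i < n; i++)
                f->data[(f->in + i) & (FIFO_SIZE - 1)] = src[i];
        f->in += n;
        return n;
}

/* dequeue up to len bytes; returns how many bytes were actually copied */
static unsigned int toy_out(struct toy_fifo *f, void *to, unsigned int len)
{
        unsigned char *dst = to;
        unsigned int queued = toy_len(f);
        unsigned int n = len < queued ? len : queued;

        for (unsigned int i = 0; i < n; i++)
                dst[i] = f->data[(f->out + i) & (FIFO_SIZE - 1)];
        f->out += n;
        return n;
}
```

Note that out only ever catches up to in, which is why out is always less than or equal to in; the unsigned subtraction in toy_len stays correct even when the counters wrap around.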

kfifo declaration

struct __kfifo {
        unsigned int    in;
        unsigned int    out;
        unsigned int    mask;
        unsigned int    esize;
        void            *data;
};

Creating a Queue

To use a kfifo, you must first define and initialize it. As with most kernel objects, you can do this dynamically or statically. The most common method is dynamic.

/**
 * kfifo_alloc - dynamically allocates a new fifo buffer
 * @fifo: pointer to the fifo
 * @size: the number of elements in the fifo, this must be a power of 2
 * @gfp_mask: get_free_pages mask, passed to kmalloc()
 *
 * This macro dynamically allocates a new fifo buffer.
 *
 * The number of elements will be rounded up to a power of 2.
 * The fifo will be released with kfifo_free().
 * Return 0 if no error, otherwise an error code.
 */
#define kfifo_alloc(fifo, size, gfp_mask) \
__kfifo_int_must_check_helper( \
({ \
        typeof((fifo) + 1) __tmp = (fifo); \
        struct __kfifo *__kfifo = &__tmp->kfifo; \
        __is_kfifo_ptr(__tmp) ? \
        __kfifo_alloc(__kfifo, size, sizeof(*__tmp->type), gfp_mask) : \
        -EINVAL; \
}) \
)

What is gfp_mask?

The low-level kernel memory allocation functions take a set of flags describing how that allocation is to be performed. Among other things, these GFP_ (“get free page”) flags control whether the allocation process can sleep and wait for memory, whether high memory can be used, and so on.  See this article for the full set.

If you want to allocate the buffer yourself, you can:

/**
 * kfifo_init - initialize a fifo using a preallocated buffer
 * @fifo: the fifo to assign the buffer
 * @buffer: the preallocated buffer to be used
 * @size: the size of the internal buffer, this has to be a power of 2
 *
 * This macro initializes a fifo using a preallocated buffer.
 *
 * The number of elements will be rounded up to a power of 2.
 * Return 0 if no error, otherwise an error code.
 */
#define kfifo_init(fifo, buffer, size) \
({ \
        typeof((fifo) + 1) __tmp = (fifo); \
        struct __kfifo *__kfifo = &__tmp->kfifo; \
        __is_kfifo_ptr(__tmp) ? \
        __kfifo_init(__kfifo, buffer, size, sizeof(*__tmp->type)) : \
        -EINVAL; \
})

Statically declaring a kfifo is simpler, but less common:

/**
 * DECLARE_KFIFO - macro to declare a fifo object
 * @fifo: name of the declared fifo
 * @type: type of the fifo elements
 * @size: the number of elements in the fifo, this must be a power of 2
 */
#define DECLARE_KFIFO(fifo, type, size) STRUCT_KFIFO(type, size) fifo

/**
 * INIT_KFIFO - Initialize a fifo declared by DECLARE_KFIFO
 * @fifo: name of the declared fifo datatype
 */
#define INIT_KFIFO(fifo) \
(void)({ \
        typeof(&(fifo)) __tmp = &(fifo); \
        struct __kfifo *__kfifo = &__tmp->kfifo; \
        __kfifo->in = 0; \
        __kfifo->out = 0; \
        __kfifo->mask = __is_kfifo_ptr(__tmp) ? 0 : ARRAY_SIZE(__tmp->buf) - 1;\
        __kfifo->esize = sizeof(*__tmp->buf); \
        __kfifo->data = __is_kfifo_ptr(__tmp) ? NULL : __tmp->buf; \
})

This creates a static kfifo named fifo holding size elements of the given type. As before, size must be a power of 2.

There are various functions that operate on a kfifo; see <linux/kfifo.h> for the full set. The most commonly used functions are described below.

Enqueuing Data

When your kfifo is created and initialized, enqueuing data into the queue is performed via the kfifo_in() function:

unsigned int kfifo_in(struct kfifo *fifo, const void *from, unsigned int len);

This function copies the len bytes starting at from into the queue represented by fifo. On success it returns the number of bytes enqueued. If less than len bytes are free in the queue, the function copies only as many bytes as are available. Thus the return value can be less than len, or even zero if nothing was copied.

Dequeuing Data

When you add data to a queue with kfifo_in(), you can remove it with kfifo_out():

unsigned int kfifo_out(struct kfifo *fifo, void *to, unsigned int len);

This function copies at most len bytes from the queue pointed at by fifo to the buffer pointed at by to. On success the function returns the number of bytes copied. If less than len bytes are in the queue, the function copies less than requested. When dequeued, data is no longer accessible from the queue. This is the normal usage of a queue, but if you want to “peek” at data within the queue without removing it, you can use kfifo_out_peek():

unsigned int kfifo_out_peek(struct kfifo *fifo, void *to, unsigned int len, unsigned offset);

This works the same as kfifo_out(), except that the out offset is not incremented, and thus the dequeued data is available to read on a subsequent call to kfifo_out(). The parameter offset specifies an index into the queue; specify zero to read from the head of the queue, as kfifo_out() does.

Obtaining the Size of a Queue

To obtain the total size in bytes of the buffer used to store a kfifo’s queue, call kfifo_size():

static inline unsigned int kfifo_size(struct kfifo *fifo);

In another example of horrible kernel naming, use kfifo_len() to obtain the number of bytes enqueued in a kfifo:
static inline unsigned int kfifo_len(struct kfifo *fifo);
To find out the number of bytes available to write into a kfifo, call kfifo_avail():
static inline unsigned int kfifo_avail(struct kfifo *fifo);
Finally, kfifo_is_empty() and kfifo_is_full() return nonzero if the given kfifo is
empty or full, respectively, and zero if not:
static inline int kfifo_is_empty(struct kfifo *fifo);
static inline int kfifo_is_full(struct kfifo *fifo);

Resetting and Destroying the Queue

To reset a kfifo, jettisoning all the contents of the queue, call kfifo_reset():
static inline void kfifo_reset(struct kfifo *fifo);
To destroy a kfifo allocated with kfifo_alloc(), call kfifo_free():
void kfifo_free(struct kfifo *fifo);
If you created your kfifo with kfifo_init(), it is your responsibility to free the associated
buffer. How you do so depends on how you created it.
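Putting the dynamic lifecycle together, a kernel-module sketch might look like the following. This is an illustrative fragment under the current kfifo API (module name and error handling are my own, trimmed for brevity), not code from a real driver:

```c
#include <linux/module.h>
#include <linux/kfifo.h>

static struct kfifo fifo;

static int __init example_init(void)
{
        /* allocate an 8 KB buffer; GFP_KERNEL allocations may sleep */
        return kfifo_alloc(&fifo, 8192, GFP_KERNEL);
}

static void __exit example_exit(void)
{
        kfifo_reset(&fifo); /* jettison any queued data */
        kfifo_free(&fifo);  /* release the buffer from kfifo_alloc() */
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");
```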

Example Queue Usage

With these interfaces under our belt, let’s take a look at a simple example of using a kfifo. Assume we created a kfifo pointed at by fifo with a queue size of 8KB. We can now enqueue data onto the queue. In this example, we enqueue simple integers. In your own code, you will likely enqueue more complicated, task-specific structures. Using integers in this example, let’s see exactly how the kfifo works:

unsigned int i;

/* enqueue [0, 32) to the kfifo named 'fifo' */
for (i = 0; i < 32; i++)
        kfifo_in(fifo, &i, sizeof(i));

The kfifo named fifo now contains 0 through 31, inclusive. We can take a peek at the first item in the queue and verify it is 0:

unsigned int val;
int ret;

ret = kfifo_out_peek(fifo, &val, sizeof(val), 0);
if (ret != sizeof(val))
        return -EINVAL;

printk(KERN_INFO "%u\n", val); /* should print 0 */
To dequeue and print all the items in the kfifo, we can use kfifo_out(). Note that the loop condition must use kfifo_len(), which returns the number of bytes queued, not kfifo_avail(), which returns the free space:

/* while there is data in the queue ... */
while (kfifo_len(fifo)) {
        unsigned int val;
        int ret;

        /* ... read it, one integer at a time */
        ret = kfifo_out(fifo, &val, sizeof(val));
        if (ret != sizeof(val))
                return -EINVAL;
        printk(KERN_INFO "%u\n", val);
}

This prints 0 through 31, inclusive, and in that order. (If this code snippet printed the numbers backward, from 31 to 0, we would have a stack, not a queue.)


1) Linux Kernel Development, 3rd Edition (Robert Love)



Linux Kernel Exploration ToolKit – Part 1 – Linked lists

I thought to write about the Linux kernel architecture and the various components in it. But before digging into that, let me introduce several built-in “code entities” in the Linux kernel. Though it is written in C, the kernel does not use libc; it has its own entities. As with any large software project, the Linux kernel provides these “code entities” to encourage code reuse. Kernel developers should use these data structures whenever possible and not “roll their own” solutions.

In the following sections, we cover the most useful of these generic data structures and other kernel code components.

Linked List:

As you know, in C the size of an array is static, and inserting a new element into an array (other than at the end) is potentially expensive. Linked lists have their own strengths and weaknesses, but they happen to be strong where arrays are weak.

An array allocates memory for all its elements lumped together as one block of memory. In contrast, a linked list allocates space for each element separately in its own block of memory, called a “linked list element” or “node”. The list gets its overall structure by using pointers to connect all its nodes together like the links in a chain. The first element is often represented by a special pointer called the head.

Types of Linked Lists:

Singly Linked List:

Each node contains two fields: a “data” field to store whatever element type the list holds for its client, and a “next” field which is a pointer used to link one node to the next node.

/* a node in a singly linked list */
struct node {
        void *data;          /* the payload */
        struct node *next;   /* pointer to the next element */
};


Doubly linked list

In some linked lists, each element also contains a pointer to the previous element.These lists are called doubly linked lists because they are linked both forward and backward.

/* a node in a doubly linked list */
struct node {
        void *data;          /* the payload */
        struct node *next;   /* pointer to the next element */
        struct node *prev;   /* pointer to the previous element */
};


Circular Linked Lists:

Normally, because the last element in a linked list has no next element, it is set to point to a special value, such as NULL, to indicate it is the last element in the list. In some linked lists, the last element does not point to a special value. Instead, it points back to the first element. This linked list is called a circular linked list because the list is cyclic. Circular linked lists can come in both doubly and singly linked versions. In a circular doubly linked list, the first node’s “previous” pointer points at the last node.
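The circular doubly linked case can be sketched in a few lines of user-space C. The names here (cnode, clist_*) are hypothetical, not the kernel's API; the point is that with a sentinel head node, "empty" is simply head->next == head, and the last node's next pointer is the head itself:

```c
/* a node in a circular doubly linked list (toy version) */
struct cnode {
        struct cnode *next;
        struct cnode *prev;
};

/* an empty circular list is a head that points at itself */
static void clist_init(struct cnode *head)
{
        head->next = head;
        head->prev = head;
}

static int clist_empty(const struct cnode *head)
{
        return head->next == head;
}

/* splice n between two known-adjacent nodes */
static void clist_insert(struct cnode *n, struct cnode *prev, struct cnode *next)
{
        next->prev = n;
        n->next = next;
        n->prev = prev;
        prev->next = n;
}
```

Because every node, including the head, always has valid next and prev pointers, insertion and removal never need NULL checks.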

Circular singly linked list:

Circular doubly linked list:

Linux Kernel Linked List implementation:

The Linux kernel’s linked list implementation is unique: it is fundamentally a circular doubly linked list. Using this type of linked list provides the greatest flexibility.

For example, suppose we have a structure as shown below:

struct todo_tasks {
        char *task_name;
        unsigned int name_len;
        short int status;
};

The common pattern for storing this structure in a linked list is to embed the list pointers in the structure, as shown below:

struct todo_tasks {
        char *task_name;
        unsigned int name_len;
        short int status;
        struct todo_tasks *next, *prev;
};


The Linux kernel approach is different. Instead of turning the structure into a linked list, the Linux approach is to embed a linked list node in the structure!

The struct list_head type is declared in <linux/types.h>, and the list routines are implemented in a single include file, <linux/list.h>. There is no separate .c file with a library of support routines; all of the code for handling linked lists is simple enough to be implemented as inline functions. This makes it easy to use this implementation in any other (GPLv2-licensed) project.

The list_head structure is shown below

struct list_head {
        struct list_head *next, *prev;
};


The next pointer points to the next list node, and the prev pointer points to the previous list node. Yet, seemingly, this is not particularly useful.

struct todo_tasks {
        char *task_name;
        unsigned int name_len;
        short int status;
        struct list_head tasks_list;
};


With this, tasks_list.next in todo_tasks points to the next element, and tasks_list.prev points to the previous one. Now this is becoming useful.

The kernel provides a family of routines to manipulate linked lists. For example, the list_add() method adds a new node to an existing linked list. These methods are generic: they accept only list_head structures.

So how do we find the address of the parent structure containing a kernel list node?

The kernel uses the code below to do that:

/**
 * list_entry - get the struct for this entry
 * @ptr:        the &struct list_head pointer.
 * @type:       the type of the struct this is embedded in.
 * @member:     the name of the list_struct within the struct.
 */
#define list_entry(ptr, type, member) \
        container_of(ptr, type, member)

/**
 * container_of - cast a member of a structure out to the containing structure
 * @ptr:        the pointer to the member.
 * @type:       the type of the container struct this is embedded in.
 * @member:     the name of the member within the struct.
 */
#define container_of(ptr, type, member) ({                      \
        const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
        (type *)( (char *)__mptr - offsetof(type,member) );})

#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)

Don’t get confused by the code snippets above; the technique is quite simple and well known. Given a pointer to a struct list_head embedded in a data structure, the list_entry() macro computes a pointer to the containing data structure. To achieve this, we need to figure out where in the data structure the list_head is (the offset of the list_head), then simply subtract that offset from the actual pointer passed to the macro.

Now the question is: how can we compute the offset of a member within a structure? Suppose that in the data structure struct todo_tasks above we want to find the offset of the member tasks_list. This is how you do it:

(unsigned long)(&((struct todo_tasks *)0)->tasks_list)

Take memory address 0 and cast it to whatever type of structure you have, in this case struct todo_tasks. Then take the address of the member you’re interested in. This gives the offset of the member within the structure. Since we already know the absolute memory address of this member for a particular instance of the structure (for example, in struct todo_tasks *task1, by way of &task1->tasks_list), subtracting this offset points us to the address of the structure itself. That’s all there is to it.

To get a better handle on this, I suggest you play around with the code below.

#include <stdio.h>
#include <stdlib.h>

struct foobar {
        unsigned int foo;
        char bar;
        char boo;
};

int main(int argc, char **argv)
{
        struct foobar tmp;

        printf("address of &tmp is= %p\n\n", &tmp);

        printf("address of tmp->foo= %p \t offset of tmp->foo= %lu\n",
               &tmp.foo, (unsigned long) &((struct foobar *)0)->foo);
        printf("address of tmp->bar= %p \t offset of tmp->bar= %lu\n",
               &tmp.bar, (unsigned long) &((struct foobar *)0)->bar);
        printf("address of tmp->boo= %p \t offset of tmp->boo= %lu\n\n",
               &tmp.boo, (unsigned long) &((struct foobar *)0)->boo);

        printf("computed address of &tmp using:\n");
        printf("\taddress and offset of tmp->foo= %p\n",
               (struct foobar *) (((char *) &tmp.foo) - ((unsigned long) &((struct foobar *)0)->foo)));
        printf("\taddress and offset of tmp->bar= %p\n",
               (struct foobar *) (((char *) &tmp.bar) - ((unsigned long) &((struct foobar *)0)->bar)));
        printf("\taddress and offset of tmp->boo= %p\n",
               (struct foobar *) (((char *) &tmp.boo) - ((unsigned long) &((struct foobar *)0)->boo)));

        return 0;
}
Output from this code is:

address of &tmp is= 0xbff43dfc

address of tmp->foo= 0xbff43dfc          offset of tmp->foo= 0

address of tmp->bar= 0xbff43e00          offset of tmp->bar= 4

address of tmp->boo= 0xbff43e01          offset of tmp->boo= 5

computed address of &tmp using:

address and offset of tmp->foo= 0xbff43dfc

address and offset of tmp->bar= 0xbff43dfc

address and offset of tmp->boo= 0xbff43dfc

A couple of things to note here about kernel lists:

  1. You can put struct list_head anywhere in your structure.
  2. You can name the struct list_head variable anything you wish.
  3. You can have multiple lists!

So for example, the declaration below is also a valid one:

struct todo_tasks {
        char *task_name;
        unsigned int name_len;
        short int status;
        int sub_tasks;
        int subtasks_completed;
        struct list_head completed_subtasks; /* list structure */
        int subtasks_waiting;
        struct list_head waiting_subtasks;   /* another list of same or different items! */
        struct list_head todo_list;          /* list of todo_tasks */
};


 Manipulating Linked Lists:

The kernel provides a family of functions to manipulate linked lists. They all take pointers to one or more list_head structures. The functions are implemented as inline functions in generic C and can be found in <linux/list.h>.

Interestingly, all these functions are O(1). This means they execute in constant time, regardless of the size of the list or any other inputs. For example, it takes the same amount of time to add or remove an entry to or from a list whether that list has 3 or 3,000 entries.

For example, to add a node to a linked list:

list_add(struct list_head *new, struct list_head *head)

This function adds the new node to the given list immediately after the head node. Because the list is circular and generally has no concept of first or last nodes, you can pass any element for head. If you do pass the “last” element, however, this function can be used to implement a stack.

To add a node to the end of a linked list:

list_add_tail(struct list_head *new, struct list_head *head)

This function adds the new node to the given list immediately before the head node. As with list_add(), because the lists are circular, you can generally pass any element for head. This function can be used to implement a queue, however, if you pass the “first” element. 
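To see why list_add() gives stack order and list_add_tail() gives queue order, here is a user-space sketch of the two insertions (hypothetical l_* names, mirroring the kernel's logic rather than quoting it):

```c
/* toy circular doubly linked list with a sentinel head */
struct lnode {
        struct lnode *next, *prev;
};

static void l_init(struct lnode *h)
{
        h->next = h->prev = h;
}

/* splice n between two known-adjacent nodes */
static void l_insert(struct lnode *n, struct lnode *prev, struct lnode *next)
{
        next->prev = n;
        n->next = next;
        n->prev = prev;
        prev->next = n;
}

/* like list_add(): insert right after head -> LIFO (stack) order */
static void l_add(struct lnode *n, struct lnode *head)
{
        l_insert(n, head, head->next);
}

/* like list_add_tail(): insert right before head -> FIFO (queue) order */
static void l_add_tail(struct lnode *n, struct lnode *head)
{
        l_insert(n, head->prev, head);
}
```

Both helpers call the same low-level splice; only the choice of neighbors (head/head->next versus head->prev/head) changes, which is exactly how the kernel implements the pair.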

I think the best way to get familiar with the list functions is to simply scan <linux/list.h> and the kernel's list documentation; there you will find examples of adding, deleting, and traversing kernel lists.


1) Linux Kernel Development, 3rd Edition (Robert Love)

2) The Linux Kernel Primer: A Top-Down Approach for x86 and PowerPC Architectures




What are the different types of Operating systems?

In the last post I mentioned what an operating system is and the types of kernels. In this one I will describe the various types of operating systems. I am not getting into all the details; I am just giving an overview of what is what.

Operating systems are generally categorized based on:

1) How the data is entered into the system

2) Number of users the OS can support

3) Number of tasks the OS can run in parallel

4) Response time for a service

5) Special-purpose systems

6) Type of architecture of the computer

1)    How the data is entered into the system

1.a  Batch Operating system

In the old days, from 1955-65, the most common input devices were the card reader and the tape drive, and the common output devices were the printer, the tape drive, and the card punch. The user did not interact directly with the computer system. The programmer's responsibility was to prepare the program and the data to be processed, punch it onto cards, and finally submit it to the computer operator. Some time later, when the computer finished whatever job it was running, the operator printed the result and handed it over to the programmer. Much computer time was wasted while operators walked around the machine room, and the method adopted to reduce this wasted time was the batch system. The operating system used at that time was very simple, as its main job was to transfer control automatically from one job to the next. The basic idea was to collect similar jobs and read them onto a magnetic tape. After collecting a batch of jobs, the tape was brought into the machine room and mounted on a tape drive, and the OS ran the series of jobs sequentially without user intervention. Nowadays batch systems have been almost entirely replaced by interactive and GUI systems, although batch processing is still used internally: if a task is repetitive, a batch process can be used to automate much of the work in any OS. Example: mainframe OSes.

1.b  Interactive Operating system

An interactive operating system is one that allows users to run interactive programs. Pretty much all operating systems found on PCs are interactive OSes.

2 & 3) Number of users the OS can support and number of tasks the OS can run in parallel

a. Single-user, single-task

As the name implies, this operating system is designed to manage the computer so that one user can effectively do one thing at a time. DOS and the Palm OS for Palm handheld computers are good examples of single-user, single-task operating systems.

b. Single-user, multi-tasking OS

This OS allows a single user to simultaneously run multiple applications on their computer. This is the type of operating system found on most personal desktop and laptop computers. The personal editions of the Windows (Microsoft) and Macintosh (Apple) platforms are the most popular single-user, multi-tasking OSes. For example, it’s entirely possible for a Windows user to be writing a note in a word processor while downloading a file from the Internet while printing the text of an e-mail message.

c. Multi-user OS

A multi-user operating system allows many different users to take advantage of the computer’s resources simultaneously. The operating system must make sure that the requirements of the various users are balanced, and that each of the programs they are using has sufficient and separate resources so that a problem with one user doesn’t affect the entire community of users. Unix, Solaris, Linux, and Windows NT are examples of multi-user operating systems.

4)    Response time for a service 

a. Real-time operating system (RTOS)

This type of OS is intended to serve real-time application requests.

When an application makes calls to the OS, the time the service takes to send a response back is normally variable. In an RTOS the services are implemented in such a way that, irrespective of the load, the response time is bounded and fixed.

If the response time is fixed, then the application can be designed with deterministic and predictable behavior.

There are mainly three types of RTOS.

In hard real-time systems, the most critical type, failure to meet the time constraints will lead to system failure (imagine a nuclear reactor control system or a missile).

In firm real-time systems, the time constraints must be met in most cases, but the system can tolerate missing a low number of deadlines (consider a food processing plant control system).

In soft real-time systems, performance is degraded when time constraints are not met, but the system will not catastrophically fail (this is often the case with real-time animation of visual displays).

Examples: RTLinux, VxWorks, Windows CE, QNX

5)    Special purpose systems 

a. Embedded OS

An embedded OS runs on a computer system designed to perform one or a few dedicated functions, unlike a general-purpose OS.

Examples include industrial machines, automobiles, medical equipment, cameras, household appliances, airplanes, vending machines, and toys (as well as the more obvious cellular phones and PDAs).

Examples: embedded Linux, Android, iOS

Embedded OSes often come with RTOS capability.

6)   Type of architecture of the computer 

6.a. Multiprocessor Operating system

At any given time there is a technological limit on the speed at which a single processor can operate. If the system workload cannot be handled satisfactorily by a single processor, the response is to apply multiple processors to the problem; this is known as a multiprocessor environment. It also provides increased reliability and economy of scale.

An operating system capable of supporting and utilizing more than one computer processor is called a multiprocessor operating system.

Multiprocessor systems are broadly divided into:

6.a.1 Tightly coupled:

Processors work in close association, possibly sharing memory and node peripherals. This is the typical multiprocessor model.

6.a.2 Loosely coupled:

Each processor has its own memory and copy of the OS. This is again classified into multicomputers (clusters) and distributed systems (wide-area multicomputers).

The tightly coupled multiple-processor systems in use today are of two types: asymmetric (master/slave) and symmetric (SMP).
6.a.1.1 Asymmetric (master/slave)

A master processor controls the OS, and the other processors either look to the master for instruction or have predefined tasks. This scheme defines a master-slave relationship: the master processor schedules and allocates work to the slave processors.

This arrangement allows the parallel execution of a single task by allocating several subtasks to multiple processors concurrently. Since the operating system is executed only by the master processor, this system is relatively simple to develop and efficient to use.

The problem with this model is that with many CPUs, the master will become a bottleneck. After all, it must handle all system calls from all CPUs. If, say, 10% of all time is spent handling system calls, then 10 CPUs will pretty much saturate the master, and with 20 CPUs it will be completely overloaded. Thus this model is simple and workable for small multiprocessors, but for large ones it fails.

6.a.1.2 Symmetric (SMP)

The most common systems use symmetric multiprocessing (SMP), in which each processor performs all tasks within the operating system. SMP means that all processors are peers; no master-slave relationship exists between processors.

Since there is no master, SMP introduces its own problems. In particular, if two or more CPUs are running operating system code at the same time, disaster will result. Imagine two CPUs simultaneously picking the same process to run or claiming the same free memory page. The simplest way around these problems is to associate a mutex (i.e., a lock) with the operating system, making the whole system one big critical region. When a CPU wants to run operating system code, it must first acquire the mutex; if the mutex is locked, it just waits. In this way, any CPU can run the operating system, but only one at a time.

This model works, but is almost as bad as the master-slave model. Again, suppose that 10% of all run time is spent inside the operating system. With 20 CPUs, there will be long queues of CPUs waiting to get in. Fortunately, it is easy to improve. Many parts of the operating system are independent of one another. For example, there is no problem with one CPU running the scheduler while another CPU is handling a file system call and a third one is processing a page fault.
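The "one big critical region" idea can be sketched in user-space C with POSIX threads. This is an analogy, not kernel code: each thread plays the role of a CPU, and a single global mutex (hypothetical name big_kernel_lock) plays the role of the big lock protecting all shared state:

```c
#include <pthread.h>

/* hypothetical stand-ins: one "big kernel lock" and some shared state */
static pthread_mutex_t big_kernel_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_state;

/* each thread models a CPU making repeated "system calls" */
static void *cpu(void *arg)
{
        long iters = *(long *)arg;

        for (long i = 0; i < iters; i++) {
                pthread_mutex_lock(&big_kernel_lock);   /* enter the OS */
                shared_state++;                         /* one big critical region */
                pthread_mutex_unlock(&big_kernel_lock); /* leave the OS */
        }
        return NULL;
}

/* run n "CPUs" concurrently; returns the final shared state */
static long run_cpus(int n, long iters)
{
        pthread_t t[32];

        shared_state = 0;
        for (int i = 0; i < n; i++)
                pthread_create(&t[i], NULL, cpu, &iters);
        for (int i = 0; i < n; i++)
                pthread_join(t[i], NULL);
        return shared_state;
}
```

Without the lock the increments would race and the count would typically come up short; with it, correctness is restored at the cost of serializing every "kernel" entry, which is exactly the scalability problem described above.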

Virtually all modern operating systems, including Windows, Windows XP, Mac OS X, and Linux, now provide support for SMP.

6.a.2 Loosely coupled:

6.a.2.1 Multicomputer (cluster):

A cluster consists of two or more computers working together to provide a higher level of availability, reliability, and scalability than can be obtained by using a single computer. To the end users it appears as a single computer (appliance).

Clustering can be structured asymmetrically or symmetrically.

In asymmetric clustering, one machine is in hot-standby mode while the other is running the applications. The hot-standby host does nothing but monitor and record the active server's state. If that server fails, the hot-standby host becomes the active server.

In symmetric mode, two or more hosts are running applications, and are monitoring each other. This mode is obviously more efficient, as it uses all of the available hardware.

6.a.2.2 Distributed system (OS):

These are similar to multicomputers in that each node has its own private memory, with no shared physical memory in the system. However, distributed systems are even more loosely coupled than multicomputers.

To start with, the nodes of a multicomputer generally have a CPU, RAM, a network interface, and perhaps a hard disk for paging. In contrast, each node in a distributed system is a complete computer with a full complement of peripherals.

Next, the nodes of a multicomputer are normally in a single room, so they can communicate by a dedicated high-speed network, whereas the nodes of a distributed system may be spread around the world.

Finally, all the nodes of a multicomputer run the same operating system, share a single file system, and are under a common administration, whereas the nodes of a distributed system may run different operating systems, each have their own file system, and be under different administrators.

Network operating system vs distributed system operating system

At first glance a network operating system appears like a distributed operating system. Here is the difference.

A network operating system provides an environment in which users, who are aware of the multiplicity of machines, can access remote resources by either logging in to the appropriate remote machine or transferring data from the remote machine to their own machines.

 A computer running a network operating system acts autonomously from all other computers on the network, although it is aware of the network and is able to communicate with other networked computers. A distributed operating system provides a less autonomous environment: The different operating systems communicate closely enough to provide the illusion that only a single operating system controls the network.

Here is a comparison of multiprocessor systems:

Item                    | Multiprocessor   | Cluster (multicomputer)  | Distributed system
------------------------|------------------|--------------------------|-----------------------
Node configuration      | CPU              | CPU, RAM, net interface  | Complete computer
Node peripherals        | All shared       | Shared, except maybe disk| Full set per node
Location                | Same rack        | Same room                | Possibly worldwide
Internode communication | Shared RAM       | Dedicated interconnect   | Traditional network
Operating systems       | One, shared      | Multiple, same           | Possibly all different
File systems            | One, shared      | One, shared              | Each node has own
Administration          | One organization | One organization         | Many organizations

For some of the sections, I have referred to:

1) Operating System Concepts – Silberschatz, Galvin, Gagne

2) Modern Operating Systems -Tanenbaum

3) Distributed Operating Systems -Tanenbaum

In the next post, I will talk about virtualization in operating systems.


What is an Operating system? – Types of Kernel


Welcome :)

Let us start with the definition of an operating system.

An operating system is software that manages the computer hardware and provides an environment for application programs to run. Perhaps the most visible aspect of an operating system is the interface it provides to the human user.

We can primarily divide an operating system into a kernel (the core plus system services) and a user interface (command line or graphical).

Though every operating system has a kernel, it is buried behind a lot of other software, and most users don't even know it exists.

Is the user interface part of the operating system?

Well, for an end user a Linux distribution (say, Ubuntu) is an operating system, while for a programmer the Linux kernel itself is a perfectly valid OS, depending on what you're trying to achieve. For instance, embedded systems are often just a kernel with a very small number of specialized processes running on top of it. In that case the kernel itself becomes the OS.

So what is a kernel?

The kernel is a piece of software that constitutes the central core of a computer operating system. It has complete control over everything that occurs in the system.

Okay, but what does this mean for us? What exactly is an OS kernel, and why should we care about it?

There is no rule that states, "a kernel is mandatory". We could simply load and execute programs at specific addresses without any "kernel"; in fact, all of the early computer systems started this way. However, it was eventually realized that convenience and efficiency could be increased by retaining small utility programs, such as program loaders and debuggers, in memory between applications. These programs gradually evolved into operating system kernels.

More details: accessing the hardware directly can be very complex, so kernels usually implement a set of hardware abstractions. These abstractions hide the complexity and provide a clean, uniform interface to the underlying hardware, which makes life easier for application programmers.

The kernel also provides secure access to the machine's hardware for the various programs that run on it. Since there are many programs, and access to the hardware is limited, the kernel is responsible for deciding when, and for how long, a program may use a piece of hardware.

It does a lot of other things too; we will see them later.

Kernel Space vs User Space

Because of its critical nature, the kernel code is usually loaded into a protected area of memory, which prevents it from being overwritten by other, less frequently used parts of the operating system or by application programs. The kernel performs its tasks in kernel space.

Everything a user normally does, such as writing text in a text editor or running programs in a GUI (graphical user interface), is done in user space. This separation prevents user data and kernel data from interfering with each other and thereby degrading performance or causing the system to become unstable (and possibly crash).

System calls act as the interface between user space and kernel space. We will discuss system calls later.

Because the kernel, by definition, has direct control over everything, it needs to run in supervisor (kernel) mode. This is enforced by the CPU hardware.

Kernel Mode

In Kernel mode, the executing code has complete and unrestricted access to the underlying hardware. It can execute any CPU instruction and reference any memory address. Kernel mode is generally reserved for the lowest-level, most trusted functions of the operating system. Crashes in kernel mode are catastrophic; they will halt the entire PC.

User Mode

In User mode, the executing code has no ability to directly access hardware or reference memory. Code running in user mode must delegate to system APIs to access hardware or memory. Due to the protection afforded by this sort of isolation, crashes in user mode are always recoverable. Most of the code running on your computer will execute in user mode.

x86 CPU hardware actually provides four protection rings (0, 1, 2, and 3), though most operating systems use only ring 0 for the kernel and ring 3 for user programs.

Kernel Types

Kernels can be classified into three broad categories:

1. Monolithic kernel

2. Microkernel

3. Hybrid kernel (this one is a bit controversial; we will see why)

Monolithic Kernel

A monolithic kernel has a simple design: it is a single large process running entirely in a single address space, in kernel space, and it is shipped as a single static binary. All kernel services exist and execute in the kernel address space, and the kernel can invoke functions directly.

Applications run in user space (ring 3).

One consequence of all parts of the kernel running in the same address space is that a bug anywhere in the kernel affects the entire address space; in other words, a bug in the subsystem that takes care of networking might crash the kernel as a whole, forcing the user to reboot the system.

Examples: MS-DOS, BSD, HP-UX, AIX, Windows 98, Linux (with module support).

Micro Kernel

Microkernels were designed to fix the above problem: they try to limit the amount of damage a bug can cause. They do this by moving parts of the kernel out of the dangerous kernel space into user space, where those parts run in isolated processes (so-called ‘servers’) and, as a consequence, do not influence each other’s functioning. The networking bug that crashed the monolithic kernel in the example above has far less severe results in a microkernel design: the subsystem in question will crash, but all other subsystems will continue to function. In fact, many microkernel operating systems have a mechanism that automatically reloads crashed servers.

While this seems a very elegant design, it has two major downsides compared to monolithic kernels: added complexity and performance penalties. In a microkernel design, only a small subset of the tasks a monolithic kernel performs resides in kernel space, while all other tasks live in user space. Generally, the part residing in kernel space (the actual ‘microkernel’) takes care of the communication between the servers running in user space; this is called inter-process communication (IPC). The servers provide functionality such as sound, display, disk access, networking, and so on.

This complexity also costs performance. Simply put, the communication between the servers of a microkernel takes time. In a monolithic design this communication is not needed, as all the ‘servers’ are tied into one big piece of code instead of several separate pieces. The result is that a monolithic kernel will generally outperform a microkernel (provided they are similar feature-wise). This explains why Torvalds chose to write Linux in a monolithic fashion; in the early ’90s, computer resources were much more limited than they are today, so anything that could increase performance was a welcome addition.

This scheme adds a lot of complexity to the overall system. A good analogy (Microkernels: augmented criticism) is to take a goat (the monolithic kernel), chop it into small parts (the servers), put each of those parts into a hygienic plastic bag (the isolation), and then link the individual bags to one another with string (the IPC). The total weight of the end result will be that of the original goat, plus that of the plastic bags and string. Therefore, while a microkernel may appear simple at a very local level, at a global level it will be much more complex than a similar monolithic kernel.

Examples: Mach, MINIX, QNX, L4.

Hybrid Kernel

As an answer to these concerns, a new type of kernel design was devised. This design combines the monolithic and microkernel design in that it has characteristics of both. It keeps some subsystems in kernel space to increase performance, while keeping others out of kernel space to improve stability.

The part of a hybrid kernel that runs in kernel space is in fact structured as if it were a microkernel; as a consequence, parts that run in kernel space can be ‘moved out’ to user space relatively easily. Microsoft has demonstrated this flexibility by moving large parts of the Windows audio subsystem from kernel space to user space.

The hybrid design is controversial and has been heavily criticized. Linus Torvalds has said the term ‘hybrid’ was devised purely for marketing reasons: a hybrid kernel is just a monolithic kernel trying to be a microkernel, or a microkernel trying to be monolithic.

Hybrid kernels are used in most commercial operating systems, such as Microsoft Windows NT, 2000, XP, Vista, and 7. Apple’s Mac OS X uses a hybrid kernel called XNU, which is based on code from Carnegie Mellon’s Mach kernel and FreeBSD’s monolithic kernel.

Hybrid kernels should not be confused with monolithic kernels that can load modules after booting (such as Linux).

Some monolithic kernels can be compiled to be modular (e.g. Linux); what matters is that a module is inserted into, and runs from, the same address space that handles the core functionality.
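To make the modularity concrete, here is the classic minimal loadable module. Note it builds only against the kernel headers through kbuild (a Makefile containing `obj-m += hello.o`, built with `make -C /lib/modules/$(uname -r)/build M=$PWD modules`), not as an ordinary user-space program; once inserted, its code runs in the same kernel address space as everything else.

```c
/* hello.c – minimal loadable kernel module */
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Minimal hello-world module");

static int __init hello_init(void)
{
    pr_info("hello: module loaded\n");   /* visible via dmesg */
    return 0;                            /* 0 = success */
}

static void __exit hello_exit(void)
{
    pr_info("hello: module unloaded\n");
}

module_init(hello_init);
module_exit(hello_exit);
```

Load it with `sudo insmod hello.ko`, watch the messages with `dmesg`, and remove it with `sudo rmmod hello`.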

Monolithic kernel vs. microkernel, with emphasis on Linux

1) Supporters of microkernels point out that monolithic kernels have the disadvantage that an error in the kernel can crash the entire system, whereas with a microkernel, if a kernel service crashes, a crash of the system as a whole can often be prevented by merely restarting the service that caused the error. Although this sounds sensible, it is questionable how important it is in practice, because operating systems with monolithic kernels, such as Linux, have become extremely stable and can run for years without crashing.

2) Monolithic kernels also appear to have the disadvantage that their source code can become extremely large. For example, the source code for Linux kernel version 2.6.0 is 212 MB and contains 5.93 million lines. This adds to the complexity of maintaining the kernel, and it also makes it difficult for new generations of computer science students to study and comprehend the kernel. However, advocates of monolithic kernels claim that, in spite of their size, such kernels are easier to design correctly, and thus can be improved more quickly than microkernel-based systems.

Moreover, the size of the compiled kernel is only a tiny fraction of that of the source code: roughly 1.1 MB, for example, in the case of Linux version 2.4 on a typical Red Hat Linux 9 desktop installation. Contributing to the small size of the compiled Linux kernel is its ability to dynamically load modules at runtime, so that the basic kernel contains only those components necessary to start the system and to load modules.

The monolithic Linux kernel can be made extremely small not only because of its ability to dynamically load modules but also because of its ease of customization. In fact, there are some versions that are small enough to fit, together with a large number of utilities and other programs, on a single floppy disk and still provide a fully functional operating system (one of the most popular of which is muLinux). This ability to miniaturize its kernel has also led to rapid growth in the use of Linux in embedded systems.

3)     Another disadvantage cited for monolithic kernels is that they are not easily portable; that is, they must be rewritten for each new architecture (i.e., processor type) that the operating system is to be used on. However, in practice, this has not appeared to be a major disadvantage, and it has not prevented Linux from being ported to numerous processors.

This topic is not complete without reading this

Conclusion: which kernel type should I choose?

I have described the pros and cons of the kernel design types; decide for yourself based on your requirements. Some researchers claim that, with some tweaking of IPC and clever communication, microkernel performance is comparable with that of a monolithic kernel.

Besides monolithic, micro, and hybrid kernels there are other kernel types, such as exokernels and nanokernels. We will discuss these later.

In the next post we will look at the different types of operating systems :)