Fork Paging
From Wikipedia, the free encyclopedia
fork and vfork (which stands for "virtual memory fork") are standard system calls used for creating new processes. This article, "Fork Paging", discusses how paging relates to these process creation techniques.
Fork
fork is a standard UNIX system call used to create a new process. Whenever a process issues a fork system call, a new process is created. The process that called fork is the parent process, and the process created by the call is the child process. The two are distinct processes, each with its own process ID. The child is given its own separate memory space (address space) in which to execute, and that address space starts out as an exact copy of the parent's.
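The parent/child relationship described above can be sketched in C. This is a minimal illustration, not a definitive implementation; the function name `fork_creates_distinct_process` is hypothetical and chosen only for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fork() produced a child with its own
 * process ID whose parent is the caller, 0 if not, -1 on error. */
int fork_creates_distinct_process(void)
{
    pid_t parent = getpid();
    pid_t pid = fork();

    if (pid < 0)
        return -1;                      /* fork failed, no child exists */

    if (pid == 0) {
        /* Child: fork() returned 0 here. Report through the exit status
         * whether our PID differs from the parent's and whether our
         * parent is the process that called fork(). */
        int ok = (getpid() != parent) && (getppid() == parent);
        _exit(ok ? 0 : 1);
    }

    /* Parent: fork() returned the child's PID. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return pid != parent && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The child reports its result through its exit status because, after the fork, the two processes no longer share writable memory.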
Address Space Copy
What does "a copy of the parent process's address space" mean? Let us examine the internals of a process. When an executable file is run, it becomes a process. An executable file contains binary code and data grouped into a number of blocks. In an ELF file these blocks are called sections, and the loader maps groups of sections into memory as segments. Each section stores a particular type of data. A few sections of a typical ELF executable file are listed below.
- .text - Section containing the code (in its machine code form).
- .bss - Section for uninitialized data.
- .data - Section containing initialized data.
- .symtab - Section containing the program symbols (function names, variable names, etc.).
- .interp - Section containing the path of the interpreter to be used (for example /lib/ld-linux.so.2).
For further analysis of an ELF file, the "readelf" command can be used. When such a file is loaded for execution, its segments are loaded into memory. The entire executable does not need to occupy contiguous memory locations. A process's address space is divided into equal-sized units called pages (typically 4 KB), and physical memory is divided into units of the same size called frames. Hence, when the executable is loaded, different parts of it are placed in different frames, which need not be contiguous. Consider an ELF executable file named "hello" of size 10K. If the page size supported by the OS is 4K, the file is split into 3 pages holding 4K, 4K and 2K respectively (the last page is only partly filled). These 3 pages can be placed in any 3 free frames in memory.
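The arithmetic in the "hello" example is a ceiling division, which can be sketched as follows. The helper names `pages_needed`, `bytes_in_last_page` and `system_page_size` are hypothetical; `sysconf(_SC_PAGESIZE)` is the standard POSIX way to ask the running system for its page size.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: number of pages needed to hold `size` bytes when
 * each page holds `page_size` bytes (a ceiling division). */
size_t pages_needed(size_t size, size_t page_size)
{
    return (size + page_size - 1) / page_size;
}

/* Hypothetical helper: number of bytes occupied in the last page
 * (0 when size is 0, a full page when size is an exact multiple). */
size_t bytes_in_last_page(size_t size, size_t page_size)
{
    if (size == 0)
        return 0;
    size_t rem = size % page_size;
    return rem == 0 ? page_size : rem;
}

/* Page size reported by the running system, or -1 on error. */
long system_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}
```

For the 10K file and 4K pages used above, `pages_needed(10 * 1024, 4096)` gives 3 and `bytes_in_last_page(10 * 1024, 4096)` gives 2048, matching the 4K, 4K, 2K split.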
Fork and page sharing
In the simplest implementation, when a fork() system call is issued, the OS creates a copy of all the pages belonging to the parent process and loads them into a separate memory location for the child process. But this is not needed in certain cases. Consider a child that executes an "execv" system call (which is used to run an executable file from within a C program) and exits once that command finishes. When the child is needed only to execute a command on behalf of the parent process, there is no need to copy the parent process's pages, since execv replaces the address space of the process that invoked it with that of the command to be executed.
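The fork-then-execv pattern described above might look like the following sketch. The function name `run_command` is hypothetical, and the exit code 127 for a failed execv is a common shell convention assumed here, not something the text prescribes.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the executable at `path` with arguments `argv`
 * in a child process and return its exit status, or -1 on error. */
int run_command(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork failed */

    if (pid == 0) {
        /* Child: execv replaces this process's address space with the
         * new program, so any pages copied from the parent are discarded. */
        execv(path, argv);
        _exit(127);                /* reached only if execv itself failed */
    }

    /* Parent: wait for the command to finish. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, calling it with "/bin/sh" and the argument vector {"sh", "-c", "exit 3", NULL} should return 3 on a system that provides /bin/sh.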
In such cases, a technique called copy-on-write is used. With this technique, the parent process's pages are not copied for the child when a fork is done. Instead, the pages are shared between the child and the parent, and are marked read-only so that the OS is notified of any attempt to write to them. Whenever a process (parent or child) modifies a page, a separate copy of that particular page alone is made for the process that performed the modification. That process then uses the newly copied page rather than the shared one in all future references. The other process (the one that did not modify the shared page) continues to use the original version of the page. The technique is called copy-on-write because a page is copied only when some process writes to it.
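The effect a program can observe is that a write made after fork() stays private to the process that made it. The sketch below checks only that visible behaviour. It cannot tell whether the kernel copied the page lazily (copy-on-write) or eagerly. The names `shared_value` and `child_write_is_private` are hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A variable living in a data page that both processes see after fork(). */
static int shared_value = 1;

/* Hypothetical helper: returns 1 if a write made by the child is not
 * visible to the parent, 0 if it is, -1 on error. */
int child_write_is_private(void)
{
    shared_value = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this write is what would trigger the page copy under
         * copy-on-write; the child then works on its private copy. */
        shared_value = 2;
        _exit(shared_value);       /* report the value the child sees */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    int child_saw = WIFEXITED(status) ? WEXITSTATUS(status) : -1;

    /* Parent: its own copy of the page must still hold the old value. */
    return child_saw == 2 && shared_value == 1;
}
```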
Vfork and page sharing
vfork is another UNIX system call used to create a new process. When a vfork() system call is issued, the parent process is suspended until the child process either calls a function from the exec family or terminates. With vfork, too, the pages are shared between the parent and the child process, but vfork does not use the copy-on-write technique. Hence, if the child process modifies any of the shared pages, no new page is created, and the modification is visible to the parent process as well. Since no page copying is involved (and so no additional memory is consumed), this technique is highly efficient when a process needs a child only to execute a command while the parent waits.
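A minimal sketch of the only usage pattern generally considered safe with vfork() follows: the child does nothing except call execv or _exit. The function name `run_command_vfork` is hypothetical, and the exit code 127 for a failed execv is an assumed convention. Some systems may require a feature-test macro, or may warn that vfork() is deprecated.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: like a fork-and-exec wrapper, but using vfork().
 * Returns the command's exit status, or -1 on error. */
int run_command_vfork(const char *path, char *const argv[])
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;                 /* vfork failed */

    if (pid == 0) {
        /* Child: it runs on the parent's pages while the parent is
         * suspended, so it must not modify data or return from this
         * function. It may only exec or call _exit(). */
        execv(path, argv);
        _exit(127);                /* reached only if execv itself failed */
    }

    /* Parent: resumed once the child has called execv or _exit. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```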
Application usage
On some systems, vfork() is the same as fork().
The vfork() function differs from fork() only in that the child process can share code and data with the calling process (parent process). This speeds cloning activity significantly at a risk to the integrity of the parent process if vfork() is misused.
The use of vfork() for any purpose except as a prelude to an immediate call to a function from the exec family, or to _exit(), is not advised.
The vfork() function can be used to create new processes without fully copying the address space of the old process. If a forked process is simply going to call exec, the data space copied from the parent to the child by fork() is not used. This is particularly inefficient in a paged environment, making vfork() particularly useful. Depending upon the size of the parent's data space, vfork() can give a significant performance improvement over fork().
The vfork() function can normally be used just like fork(). It does not work, however, to return while running in the child's context from the caller of vfork() since the eventual return from vfork() would then return to a no longer existent stack frame. Be careful, also, to call _exit() rather than exit() if you cannot exec, since exit() flushes and closes standard I/O channels, thereby damaging the parent process' standard I/O data structures. (Even with fork(), it is wrong to call exit(), since buffered data would then be flushed twice.)
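The double flush mentioned in the parenthetical note can be observed with ordinary fork(). In the sketch below, five bytes are left in a stdio buffer before forking. The helper name `bytes_written_after_fork` and its parameter `child_uses_exit` are hypothetical. The sketch assumes that the temporary file is fully buffered, which is the usual default for regular files.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: buffer 5 bytes, fork, let the child terminate with
 * exit() or _exit(), then flush in the parent. Returns the number of bytes
 * that ended up in the file, or -1 on error. */
long bytes_written_after_fork(int child_uses_exit)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    fputs("hello", fp);            /* stays in the stdio buffer for now */

    pid_t pid = fork();
    if (pid < 0) {
        fclose(fp);
        return -1;
    }

    if (pid == 0) {
        if (child_uses_exit)
            exit(0);               /* flushes the child's copy of the buffer */
        _exit(0);                  /* terminates without touching stdio */
    }

    if (waitpid(pid, NULL, 0) < 0) {
        fclose(fp);
        return -1;
    }

    fflush(fp);                    /* parent flushes its own copy */
    fseek(fp, 0, SEEK_END);
    long size = ftell(fp);
    fclose(fp);
    return size;
}
```

On a typical system the call with `child_uses_exit` set to 0 should report 5 bytes, and the call with it set to 1 should report 10, because parent and child share the file offset and each flushes its own copy of the buffer.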
If signal handlers are invoked in the child process after vfork(), they must follow the same rules as other code in the child process.[1]
References
- [1] UNIX Specification Version 2, 1997. http://www.vfork.org