Linux is a relatively new operating system that has begun to enjoy a lot of attention from the business and academic worlds. As the operating system matures, its feature set, capabilities and performance grow but, as a necessary side effect, so do its size and complexity. The table in Figure 1.1 shows the total gzipped size of the kernel source code, together with the size in bytes and lines of code of the mm/ part of the kernel tree. These figures do not include the machine-dependent code or any of the buffer management code. They do not pretend to be a strong metric for complexity, but they still serve as a small indicator.
As is the habit of Open Source projects in general, new developers who ask questions are sometimes told to refer to the source with the polite acronym RTFS (footnote 1.1), or are referred to the kernel newbies mailing list (http://www.kernelnewbies.org). With the Linux Virtual Memory (VM) manager, this was a suitable response for earlier kernels, as the time required to understand the VM could be measured in weeks. The books available on the operating system devoted enough space to memory management to make the relatively small amount of code easy to navigate.
This is no longer the case. The books that describe the operating system, such as ``Understanding the Linux Kernel'' [bovet00], tend to be an overview of all subsystems without giving specific attention to one topic, with the notable exception of device drivers [rubini01]. Increasingly, to get a comprehensive view of how the kernel functions, the developer or researcher is required to read through the source code line by line, which requires a large investment of time. This is especially true as the implementations of several VM algorithms diverge considerably from the papers describing them.
The documentation on the Memory Manager that exists today is relatively poor. It is not an area of the kernel that many wish to get involved in, for a variety of reasons ranging from the amount of code involved, to the complexity of the subject of memory management, to the difficulty of debugging the kernel with an unstable VM. In this thesis, a comprehensive guide to the VM as implemented in the late 2.4 kernels is given. A companion document called ``Code Commentary On The Linux Virtual Memory Manager'', hereafter referred to as the companion document, provides a detailed tour of the code. It is envisioned that with this pair of documents, the time required to gain a working understanding of the VM, even of later VMs, will be measured in weeks instead of the estimated eight months currently required by even an experienced developer.