3.3 Pages
Every physical page frame in the system has an associated struct page
which is used to keep track of its status. In the 2.2 kernel [bovet00], this
structure resembled its equivalent in System V [goodheart94] to some extent
but, like the other UNIX families, it has changed considerably. It is declared
as follows in include/linux/mm.h:
    typedef struct page {
        struct list_head list;
        struct address_space *mapping;
        unsigned long index;
        struct page *next_hash;
        atomic_t count;
        unsigned long flags;
        struct list_head lru;
        struct page **pprev_hash;
        struct buffer_head * buffers;

    #if defined(CONFIG_HIGHMEM) || defined(WANT_PAGE_VIRTUAL)
        void *virtual;
    #endif /* CONFIG_HIGMEM || WANT_PAGE_VIRTUAL */
    } mem_map_t;
- list Pages may belong to many lists and this field is used as the
list head. For example, pages in a mapping will be on one of
three circular linked lists kept by the address_space. These
are clean_pages, dirty_pages and locked_pages. In the slab
allocator, this field is used to store pointers to the slab and
cache the page belongs to. It is also used to link blocks of
free pages together.
- mapping When files or devices are memory mapped, their inode has an
associated address_space. This field will point to this
address space if the page belongs to the file.
- index This field has two uses and its meaning depends on the state of
the page. If the page is part of a file mapping, it is the offset
within the file. This includes pages that are part of the swap
cache, where the address_space is the swap address space
(swapper_space). Secondly, if a block of pages is being freed
for a particular process, the order of the block (the power of
two number of pages being freed) is stored in index. This is
set in the function __free_pages_ok().
- next_hash Pages that are part of a file mapping are hashed on the inode
and offset. This field links pages together that share the same
hash bucket.
- count The reference count to the page. If it drops to 0, the page may
be freed. Any greater and it is in use by one or more processes
or by the kernel, for example while waiting for I/O.
- flags Flags which describe the status of the page. All of them are
declared in include/linux/mm.h and are listed and
described in Table 3.1. A number of macros are defined for
testing, clearing and setting the bits, all of which are listed
in Table 3.2.
- lru For the page replacement policy, pages that may be swapped out
will exist on either the active_list or the inactive_list
declared in page_alloc.c. This is the list head for
these LRU lists.
- pprev_hash The complement to next_hash, so that the hash chain can
operate as a doubly linked list.
- buffers If a page has buffers for a block device associated with it,
this field is used to keep track of the buffer_head.
- virtual Normally only pages from ZONE_NORMAL may be directly mapped
by the kernel. To address pages in ZONE_HIGHMEM, kmap()
is used to map the page for the kernel. Only a fixed number of
pages may be mapped this way. When the page is mapped, this
field holds its virtual address.
The type mem_map_t is a typedef for struct page so that the struct page
can be easily referred to within the mem_map array.
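The relationship between next_hash and pprev_hash described above can be illustrated with a small userspace sketch. It is not the kernel's code: the cut-down structure and the sketch_hash_add() and sketch_hash_remove() helpers are hypothetical names introduced here, and locking is ignored. The point it tries to show is that pprev_hash stores the address of whichever pointer currently points at the page, so a page can be unlinked from its bucket without walking the chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down page: only the fields needed for the hash chain. */
struct hash_page_sketch {
    unsigned long index;                  /* offset within the file */
    struct hash_page_sketch *next_hash;   /* next page in the same bucket */
    struct hash_page_sketch **pprev_hash; /* address of the pointer that points at us */
};

/* Insert page at the head of the bucket whose head pointer is *bucket. */
static void sketch_hash_add(struct hash_page_sketch **bucket,
                            struct hash_page_sketch *page)
{
    struct hash_page_sketch *next = *bucket;

    *bucket = page;
    page->next_hash = next;
    page->pprev_hash = bucket;
    if (next)
        next->pprev_hash = &page->next_hash;
}

/* Unlink page without walking the chain or knowing which bucket holds it. */
static void sketch_hash_remove(struct hash_page_sketch *page)
{
    struct hash_page_sketch *next = page->next_hash;
    struct hash_page_sketch **pprev = page->pprev_hash;

    assert(pprev != NULL);                /* page must currently be hashed */
    if (next)
        next->pprev_hash = pprev;
    *pprev = next;
    page->next_hash = NULL;
    page->pprev_hash = NULL;
}
```

Because removal only touches the neighbouring pointers, its cost does not depend on the length of the chain, which is presumably why the back pointer is worth its space in every struct page.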
Table 3.1: Flags Describing Page Status
Table 3.2: Macros For Testing, Setting and Clearing Page Status Bits
Bit name      | Set               | Test           | Clear
PG_active     | SetPageActive     | PageActive     | ClearPageActive
PG_arch_1     | n/a               | n/a            | n/a
PG_checked    | SetPageChecked    | PageChecked    | n/a
PG_dirty      | SetPageDirty      | PageDirty      | ClearPageDirty
PG_error      | SetPageError      | PageError      | ClearPageError
PG_highmem    | n/a               | PageHighMem    | n/a
PG_launder    | SetPageLaunder    | PageLaunder    | ClearPageLaunder
PG_locked     | LockPage          | PageLocked     | UnlockPage
PG_lru        | TestSetPageLRU    | PageLRU        | TestClearPageLRU
PG_referenced | SetPageReferenced | PageReferenced | ClearPageReferenced
PG_reserved   | SetPageReserved   | PageReserved   | ClearPageReserved
PG_skip       | n/a               | n/a            | n/a
PG_slab       | PageSetSlab       | PageSlab       | PageClearSlab
PG_unused     | n/a               | n/a            | n/a
PG_uptodate   | SetPageUptodate   | PageUptodate   | ClearPageUptodate
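To make the Set, Test and Clear columns concrete, the following is a minimal userspace sketch of how one such triple could be built on top of bit operations on the flags field. Everything prefixed with sketch_ or Sketch is hypothetical, the bit position is chosen only for illustration, and the helpers are plain non-atomic operations, so this should be read as an approximation of the idea rather than the kernel's definitions.

```c
#include <assert.h>

/* Bit position chosen for illustration only; not taken from the kernel headers. */
enum { SKETCH_PG_dirty = 4 };

/* Hypothetical, cut-down page holding only the status word. */
struct flag_page_sketch {
    unsigned long flags;
};

/* Non-atomic stand-ins for bit helpers operating on one unsigned long. */
static void sketch_set_bit(int nr, unsigned long *addr)
{
    *addr |= 1UL << nr;
}

static void sketch_clear_bit(int nr, unsigned long *addr)
{
    *addr &= ~(1UL << nr);
}

static int sketch_test_bit(int nr, const unsigned long *addr)
{
    return (int)((*addr >> nr) & 1UL);
}

/* One Set / Test / Clear triple, mirroring the shape of a row in Table 3.2. */
#define SketchSetPageDirty(page)   sketch_set_bit(SKETCH_PG_dirty, &(page)->flags)
#define SketchPageDirty(page)      sketch_test_bit(SKETCH_PG_dirty, &(page)->flags)
#define SketchClearPageDirty(page) sketch_clear_bit(SKETCH_PG_dirty, &(page)->flags)
```

Wrapping each bit in named macros keeps callers independent of the bit numbering, so a flag can be moved or retired without touching the code that tests it.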
3.3.1 Mapping Pages to Zones
Up until as recently as kernel 2.4.18, each page stored a reference to its
zone in page->zone, which was later considered wasteful. In the most
recent kernels, this has been removed and instead the top ZONE_SHIFT
(8 in the x86) bits of page->flags are used to determine
the zone a page belongs to. First a zone_table of zones is set up. It
is declared in mm/page_alloc.c as
    zone_t *zone_table[MAX_NR_ZONES*MAX_NR_NODES];
    EXPORT_SYMBOL(zone_table);
MAX_NR_ZONES is the maximum number of zones that can be in a
node, i.e. 3. MAX_NR_NODES is the maximum number of nodes that may
exist. This table is treated like a multi-dimensional array. During
free_area_init_core(), all the pages in a node are initialized.
First it sets the value for the table:

    zone_table[nid * MAX_NR_ZONES + j] = zone;

Here nid is the node ID, j is the zone index and zone
is the zone_t struct. For each page, the function
set_page_zone() is called as

    set_page_zone(page, nid * MAX_NR_ZONES + j);

where page is the page to be set. In this way, the index into the
zone_table is stored in the page itself.
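The scheme described above, a flat table indexed by nid * MAX_NR_ZONES + j with the index packed into the top bits of the flags, can be sketched in userspace as follows. This is an illustration under stated assumptions rather than the kernel's implementation: all sketch_ and SKETCH_ names are hypothetical, the node limit is an arbitrary small value, and the index is assumed to occupy the top 8 bits of an unsigned long.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_MAX_NR_ZONES 3   /* per the text: at most 3 zones in a node */
#define SKETCH_MAX_NR_NODES 4   /* assumption: small value for the example */
#define SKETCH_ZONE_BITS    8   /* assumption: index kept in the top 8 bits */
#define SKETCH_SHIFT (sizeof(unsigned long) * CHAR_BIT - SKETCH_ZONE_BITS)

/* Hypothetical stand-ins for zone_t and struct page. */
struct zone_sketch { int nid; int zone_idx; };
struct zone_page_sketch { unsigned long flags; };

/* Flat table treated as a [node][zone] array. */
static struct zone_sketch *sketch_zone_table[SKETCH_MAX_NR_ZONES * SKETCH_MAX_NR_NODES];

/* Store the table index in the top bits of flags, leaving the low bits alone. */
static void sketch_set_page_zone(struct zone_page_sketch *page,
                                 unsigned long table_idx)
{
    assert(table_idx < SKETCH_MAX_NR_ZONES * SKETCH_MAX_NR_NODES);
    page->flags &= ~(~0UL << SKETCH_SHIFT);   /* clear any old index */
    page->flags |= table_idx << SKETCH_SHIFT; /* install the new one */
}

/* Recover the zone by reading the index back out of flags. */
static struct zone_sketch *sketch_page_zone(const struct zone_page_sketch *page)
{
    return sketch_zone_table[page->flags >> SKETCH_SHIFT];
}
```

Storing a small index instead of a pointer costs a shift and a table lookup on each query, but it removes a pointer-sized field from every struct page, which matters because one such structure exists for every physical page frame.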
Mel
2003-01-14