Next: 3.3 Pages Up: 3. Describing Physical Memory Previous: 3.1 Nodes   Contents   Index


3.2 Zones

Each zone is described by a struct zone_struct, typedef'd as zone_t. It keeps track of information such as page usage statistics, free area information and locks. It is declared as follows in include/linux/mmzone.h:

typedef struct zone_struct {
        spinlock_t         lock;
        unsigned long      free_pages;
        unsigned long      pages_min, pages_low, pages_high;
        int                need_balance;

        free_area_t        free_area[MAX_ORDER];

        wait_queue_head_t  *wait_table;
        unsigned long      wait_table_size;
        unsigned long      wait_table_shift;

        struct pglist_data *zone_pgdat;
        struct page        *zone_mem_map;
        unsigned long      zone_start_paddr;
        unsigned long      zone_start_mapnr;

        char               *name;
        unsigned long      size;
} zone_t;

lock Spinlock to protect the zone
free_pages Total number of free pages in the zone
pages_min,pages_low,pages_high Zone watermarks, described in the next section
need_balance A flag that tells kswapd to balance the zone
free_area Free area bitmaps used by the buddy allocator

wait_table A hash table of wait queues of processes waiting on a page to be freed. This is of importance to wait_on_page() and unlock_page(). While processes could all wait on one queue, this would cause a thundering herd: every waiting process would be woken at once and race for pages that are still locked.

wait_table_size Size of the hash table

wait_table_shift Defined as the number of bits in a long minus the binary logarithm of the table size. When the hash is calculated, it is shifted right by this number of bits so that the hash index falls inside the table.

zone_pgdat Points to the parent pg_data_t
zone_mem_map The first page in mem_map this zone refers to
zone_start_paddr Same principle as node_start_paddr
zone_start_mapnr Same principle as node_start_mapnr
name The string name of the zone, DMA, Normal or HighMem
size The size of the zone in pages
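
The role of wait_table_size and wait_table_shift can be sketched in user-space C. This is only an illustration, not the kernel's code: the helper names table_bits(), table_shift() and wait_hash() are hypothetical, the multiplier is an assumed golden-ratio style constant chosen for the example, and the table size is assumed to be a power of two. It also assumes the shift is the number of bits in a long minus the base-2 logarithm of the table size, which is what keeps the index inside the table.

```c
#include <limits.h>
#include <stdint.h>

/* Illustrative stand-ins for this sketch, not the kernel's definitions. */
#define BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/* Assumed multiplier: a large odd constant picked to spread the bits. */
#if ULONG_MAX > 0xffffffffUL
#define HASH_MULTIPLIER 0x9e37fffffffc0001UL
#else
#define HASH_MULTIPLIER 0x9e370001UL
#endif

/* Base-2 logarithm of a power-of-two table size. */
static unsigned long table_bits(unsigned long table_size)
{
    unsigned long bits = 0;

    while ((1UL << bits) < table_size)
        bits++;
    return bits;
}

/* The shift: bits in a long minus log2 of the table size. */
static unsigned long table_shift(unsigned long table_size)
{
    return BITS_PER_LONG - table_bits(table_size);
}

/* Map a page address to a slot index in the wait table. */
static unsigned long wait_hash(const void *page, unsigned long shift)
{
    unsigned long hash = (unsigned long)(uintptr_t)page;

    /* A one-entry table has a shift equal to the word width; shifting
     * by that amount is undefined in C, so return the only slot. */
    if (shift >= BITS_PER_LONG)
        return 0;

    hash *= HASH_MULTIPLIER;   /* scatter the address bits */
    return hash >> shift;      /* keep only the top log2(size) bits */
}
```

Because only the top log2(size) bits survive the shift, wait_hash(page, table_shift(256)) always yields a value below 256, so waiters on different pages are spread over separate queues rather than piling onto one.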


3.2.1 Zone Watermarks

When available memory in the system is low, the pageout daemon kswapd is woken up to start freeing pages (see Chapter 11). If memory gets too low, the allocating process will free up memory synchronously. The parameters affecting pageout behavior are similar to those used by FreeBSD [mckusick96] and Solaris [mauro01].

Each zone has three watermarks which help track how much pressure the zone is under. They are pages_low, pages_min and pages_high. The value of pages_min is calculated in the function free_area_init_core() during memory initialisation and is based on a ratio to the size of the zone in pages. It is calculated initially as $ZoneSizeInPages / 128$. The lowest value it can take is 20 pages (80KiB on an x86) and the highest is 255 pages (just under 1MiB on an x86).
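
The pages_min calculation can be written out as a short sketch. The constant and function names below are hypothetical, chosen for this example only; the ratio and the two bounds are the values given in the text.

```c
/* Values taken from the text: ratio 128, floor 20 pages, ceiling 255 pages. */
#define BALANCE_RATIO 128UL
#define BALANCE_MIN    20UL
#define BALANCE_MAX   255UL

/* Hypothetical helper: derive pages_min from the zone size in pages. */
static unsigned long calc_pages_min(unsigned long zone_size_pages)
{
    unsigned long pages_min = zone_size_pages / BALANCE_RATIO;

    if (pages_min < BALANCE_MIN)
        pages_min = BALANCE_MIN;      /* small zones get the floor */
    else if (pages_min > BALANCE_MAX)
        pages_min = BALANCE_MAX;      /* large zones get the ceiling */

    return pages_min;
}
```

Assuming 4KiB pages, a 16MiB zone holds 4096 pages and gets a pages_min of 32. A 4MiB zone (1024 pages) would compute 8 and is raised to 20, while an 896MiB zone (229376 pages) would compute 1792 and is capped at 255.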

pages_low When the number of free pages drops to pages_low, kswapd is woken up by the buddy allocator to start freeing pages. This is equivalent to when lotsfree is reached in Solaris and freemin in FreeBSD. The value is twice the value of pages_min.

pages_min When pages_min is reached, the allocator will do the kswapd work in a synchronous fashion. There is no real equivalent in Solaris, but the closest is desfree or minfree, which determine how often the pageout scanner is woken up.

pages_high Once kswapd is woken, it won't consider the zone to be "balanced", and won't go back to sleep, until pages_high pages are free. In Solaris, this is called lotsfree and in BSD, it is called free_target. The value is three times the value of pages_min.

Whatever the pageout parameters are called in each operating system, the meaning is the same: they help determine how hard the pageout daemon or processes work to free up pages.
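
How the three watermarks partition a zone's state can be sketched as follows. This is an illustration under stated assumptions rather than the allocator's actual logic: the type and function names are hypothetical, pages_low and pages_high are assumed to be two and three times pages_min, and the exact boundary comparisons (<= versus <) are a guess made for the example.

```c
/* Hypothetical names for the three situations described in the text. */
enum zone_pressure {
    ZONE_OK,              /* enough free pages, nothing to do        */
    ZONE_WAKE_KSWAPD,     /* at or below pages_low: wake kswapd      */
    ZONE_DIRECT_RECLAIM   /* at or below pages_min: free pages
                             synchronously in the allocating process */
};

struct watermarks {
    unsigned long pages_min, pages_low, pages_high;
};

/* Derive all three watermarks from pages_min (assumed 1x, 2x, 3x). */
static struct watermarks make_watermarks(unsigned long pages_min)
{
    struct watermarks w;

    w.pages_min  = pages_min;
    w.pages_low  = pages_min * 2;
    w.pages_high = pages_min * 3;
    return w;
}

/* Classify the zone by its current number of free pages. */
static enum zone_pressure classify(unsigned long free_pages,
                                   const struct watermarks *w)
{
    if (free_pages <= w->pages_min)
        return ZONE_DIRECT_RECLAIM;
    if (free_pages <= w->pages_low)
        return ZONE_WAKE_KSWAPD;
    return ZONE_OK;
}

/* Once woken, kswapd keeps working until pages_high pages are free. */
static int kswapd_may_sleep(unsigned long free_pages,
                            const struct watermarks *w)
{
    return free_pages >= w->pages_high;
}
```

With pages_min set to 32, the watermarks are 32, 64 and 96 pages. A zone with 64 free pages would wake kswapd, and kswapd would keep freeing pages until 96 are free. The gap between pages_low and pages_high gives the daemon some hysteresis, so it is not woken and put back to sleep around a single threshold.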


Mel 2003-01-14