Next: 3.2 Zones Up: 3. Describing Physical Memory Previous: 3. Describing Physical Memory   Contents   Index


3.1 Nodes

Each node in memory is described by a pg_data_t struct. When allocating a page, Linux uses a node-local allocation policy to allocate memory from the node closest to the running CPU. As processes tend to run on the same CPU or can be explicitly bound, it is likely the memory from the current node will be used.

The struct is declared as follows in include/linux/mmzone.h:

typedef struct pglist_data {
        zone_t node_zones[MAX_NR_ZONES];
        zonelist_t node_zonelists[GFP_ZONEMASK+1];
        int nr_zones;
        struct page *node_mem_map;
        unsigned long *valid_addr_bitmap;
        struct bootmem_data *bdata;
        unsigned long node_start_paddr;
        unsigned long node_start_mapnr;
        unsigned long node_size;
        int node_id;
        struct pglist_data *node_next;
} pg_data_t;

node_zones The zones for this node, usually ZONE_HIGHMEM, ZONE_NORMAL, ZONE_DMA

node_zonelists This is the order of zones that allocations are preferred from. build_zonelists() in page_alloc.c does the work when called by free_area_init_core(). So a failed allocation from ZONE_HIGHMEM may fall back to ZONE_NORMAL, and from there to ZONE_DMA.

nr_zones Number of zones in this node, between 1 and 3. Not all nodes will have three. A CPU bank may not have ZONE_DMA, for example.

node_mem_map Pointer to the first struct page of the array describing the physical block this node represents

valid_addr_bitmap A bitmap which describes "holes" in the memory node, that is, address ranges for which no memory exists

bdata This is only of interest to the boot memory allocator

node_start_paddr The starting physical address of the node. An unsigned long is a poor fit for this field: on ia32 with PAE, for example, physical addresses can be wider than 32 bits and no longer fit in it. A more suitable solution would be to record this as a Page Frame Number (pfn), which could be trivially defined as (page_phys_addr >> PAGE_SHIFT)

node_start_mapnr This gives the page offset of this node's lmem_map within the global mem_map. lmem_map is the mapping of page frames for this node and is itself contained within mem_map

node_size The total number of pages in this node

node_id The ID of the node, starts at 0

node_next Pointer to next node in a NULL terminated list
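
The fallback order held in node_zonelists can be pictured with a small model. This is only a sketch under stated assumptions: toy_zone, toy_zonelist and toy_alloc_page() are hypothetical names invented for illustration. They are not the kernel's zone_t, zonelist_t or page allocator, which also take watermarks and GFP flags into account.

```c
#include <stddef.h>

/* Hypothetical stand-in for zone_t: just a name and a free page count. */
struct toy_zone {
        const char *name;
        unsigned long free_pages;
};

#define TOY_MAX_ZONES 3

/* Hypothetical stand-in for zonelist_t: zones in order of preference,
 * terminated by a NULL entry. */
struct toy_zonelist {
        struct toy_zone *zones[TOY_MAX_ZONES + 1];
};

/* Take one page from the first zone in the list that has a free page.
 * Returns the zone that satisfied the request, or NULL if every zone
 * in the list is exhausted. */
struct toy_zone *toy_alloc_page(struct toy_zonelist *zl)
{
        struct toy_zone **zp;

        for (zp = zl->zones; *zp != NULL; zp++) {
                if ((*zp)->free_pages > 0) {
                        (*zp)->free_pages--;
                        return *zp;
                }
        }
        return NULL;
}
```

With a list ordered as highmem, normal, dma and no free pages in the highmem zone, toy_alloc_page() would return the normal zone, which mirrors the fallback described for node_zonelists.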
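
The remark about node_start_paddr can be made concrete with a little arithmetic. The sketch below assumes 4KiB pages (a PAGE_SHIFT of 12, as on ia32) and uses fixed-width types to stand in for a 32-bit unsigned long. phys_to_pfn() and truncate_to_32bit() are hypothetical helpers written for this illustration, not kernel functions.

```c
#include <stdint.h>

/* Assumed 4KiB pages; the kernel defines PAGE_SHIFT per architecture. */
#define PAGE_SHIFT 12

/* Convert a physical address to a page frame number. A 64-bit type is
 * used for the address so that PAE addresses above 4GiB are preserved. */
uint64_t phys_to_pfn(uint64_t page_phys_addr)
{
        return page_phys_addr >> PAGE_SHIFT;
}

/* What a 32-bit unsigned long would retain of the same address. */
uint32_t truncate_to_32bit(uint64_t page_phys_addr)
{
        return (uint32_t)page_phys_addr;
}
```

A node starting at the 4GiB boundary (0x100000000) truncates to 0 when stored in 32 bits, while its pfn, 0x100000, fits comfortably. Even the highest 36-bit address yields a pfn of only 24 bits.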

All nodes in the system are maintained on a list called pgdat_list. Up until late 2.4 kernels (those after 2.4.18), blocks of code that traversed the list looked something like this:

        pg_data_t * pgdat;
        pgdat = pgdat_list;
        do {
              /* do something with pg_data_t */
              ...
        } while ((pgdat = pgdat->node_next));

In more recent kernels, a macro for_each_pgdat, which is trivially defined as a for loop, is provided to make the code more readable.
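
As a rough illustration, such a macro could be written along the following lines. This is a sketch rather than a copy of the kernel header: the pg_data_t here is a trimmed stand-in holding only the fields needed to walk the list, and count_nodes() is a hypothetical helper added to show the macro in use.

```c
#include <stddef.h>

/* Trimmed, hypothetical stand-in for pg_data_t with only the fields
 * needed to walk the node list. */
typedef struct pglist_data {
        int node_id;
        struct pglist_data *node_next;
} pg_data_t;

/* Head of the NULL terminated node list. */
pg_data_t *pgdat_list = NULL;

/* Visit every node on pgdat_list in order. */
#define for_each_pgdat(pgdat) \
        for (pgdat = pgdat_list; pgdat != NULL; pgdat = pgdat->node_next)

/* Hypothetical helper: count the nodes on pgdat_list. */
int count_nodes(void)
{
        pg_data_t *pgdat;
        int count = 0;

        for_each_pgdat(pgdat)
                count++;
        return count;
}
```

One design point worth noting in this sketch is that the for loop tests the pointer before the body runs, so an empty list is handled without a special case, whereas a do/while form enters its body once regardless.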


Mel 2003-01-14