When the max_mapped number of pages has been found in the page cache,
swap_out() (see Figure 11.4) is called to start
swapping out process pages. Starting from the mm pointed to by swap_mm
and the address mm->swap_address, the page tables are
searched forward until nr_pages have been freed.
All pages are examined regardless of where they are in the lists or when they were last referenced, but pages which are part of the active_list or have been recently referenced are skipped over. Examining hot pages is somewhat costly, but the cost is small in comparison to linearly searching all processes for the PTEs that reference a particular struct page.
Once it has been decided to swap out pages from a process, an attempt will be made to swap out at least SWAP_CLUSTER_MAX pages, and the full list of mm_structs will only be examined once to avoid constant looping when no pages are available. Writing out the pages in bulk like this increases the chance that pages close together in the process address space will be written out to adjacent slots on disk.
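The scanning policy described above can be sketched in user space. This is a simplified model under stated assumptions, not the kernel code: every demo_ name, the array standing in for page tables, and the sizes DEMO_TASK_SIZE and DEMO_CLUSTER are hypothetical. It shows only the three ideas from the text: the scan resumes from a saved position, it stops once a batch of pages has been freed, and each address space is visited at most once per call.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TASK_SIZE 8   /* hypothetical: pages per address space */
#define DEMO_CLUSTER   4   /* hypothetical stand-in for the batch size */

struct demo_mm {
    struct demo_mm *next;            /* circular list of address spaces */
    size_t swap_address;             /* where the previous scan stopped */
    int swappable[DEMO_TASK_SIZE];   /* 1 if the page may be swapped out */
};

/* Scan one address space from its saved position until the end of the
 * address space is reached or the remaining quota (*want) is used up. */
static int demo_swap_out_mm(struct demo_mm *mm, int *want)
{
    int freed = 0;

    while (mm->swap_address < DEMO_TASK_SIZE && *want > 0) {
        size_t i = mm->swap_address++;

        if (mm->swappable[i]) {
            mm->swappable[i] = 0;    /* "swap out" the page */
            freed++;
            (*want)--;
        }
    }
    return freed;
}

/* Resume at *cursor, examine each of the nr_mms address spaces at most
 * once, and stop early when DEMO_CLUSTER pages have been freed. */
static int demo_swap_out(struct demo_mm **cursor, int nr_mms)
{
    int want = DEMO_CLUSTER;
    int freed = 0;
    int counter = nr_mms;

    assert(cursor != NULL && *cursor != NULL);
    while (counter-- > 0 && want > 0) {
        struct demo_mm *mm = *cursor;

        freed += demo_swap_out_mm(mm, &want);
        if (mm->swap_address >= DEMO_TASK_SIZE) {
            mm->swap_address = 0;    /* fully searched: reset its position */
            *cursor = mm->next;      /* and move on to the next mm */
        }
    }
    return freed;
}
```

A second call to demo_swap_out() continues from wherever the first one stopped, and it returns fewer than DEMO_CLUSTER pages rather than looping when the list has been exhausted.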
swap_mm is initialised to point to init_mm and the swap_address is initialised to 0 the first time it is used. A task has been fully searched when the swap_address is equal to TASK_SIZE. Once a task has been selected to swap pages from, the reference count to the mm_struct is incremented so that it will not be freed early, and swap_out_mm() is called with the selected mm as a parameter. This function walks each VMA the process holds and calls swap_out_vma() for it. Walking by VMA avoids having to traverse the entire page table, which will be largely sparse. swap_out_pgd() and swap_out_pmd() walk the page tables for the given VMA until finally try_to_swap_out() is called on the actual page and PTE.
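To illustrate why the walk is driven by VMAs rather than by the page table itself, the following is a minimal sketch using a toy two-level table. It is not the kernel's code: the demo_ names, the table sizes, and the use of plain ints as PTEs are all assumptions made for the example. Only table entries covered by a VMA are visited, so unmapped parts of the address space cost nothing.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PTES_PER_TABLE 4   /* hypothetical: PTEs per second-level table */
#define DEMO_TABLES         4   /* hypothetical: first-level entries */

struct demo_vma {
    size_t start, end;          /* page indices covering [start, end) */
    struct demo_vma *next;
};

struct demo_mm {
    struct demo_vma *mmap;      /* list of mapped regions */
    int *tables[DEMO_TABLES];   /* NULL where nothing is mapped */
};

/* Stand-in for the per-page work: report whether the PTE is in use. */
static int demo_try_page(const int *pte)
{
    return *pte != 0;
}

/* Visit the PTEs of one second-level table that lie inside [start, end). */
static int demo_walk_table(const int *table, size_t base,
                           size_t start, size_t end)
{
    int hits = 0;

    for (size_t i = 0; i < DEMO_PTES_PER_TABLE; i++) {
        size_t page = base + i;

        if (page >= start && page < end)
            hits += demo_try_page(&table[i]);
    }
    return hits;
}

/* Descend only into the first-level entries that the VMA overlaps. */
static int demo_walk_vma(const struct demo_mm *mm, const struct demo_vma *vma)
{
    int hits = 0;

    assert(vma->start < vma->end);
    size_t first = vma->start / DEMO_PTES_PER_TABLE;
    size_t last = (vma->end - 1) / DEMO_PTES_PER_TABLE;
    assert(last < DEMO_TABLES);

    for (size_t t = first; t <= last; t++) {
        if (mm->tables[t] == NULL)
            continue;           /* no second-level table allocated here */
        hits += demo_walk_table(mm->tables[t], t * DEMO_PTES_PER_TABLE,
                                vma->start, vma->end);
    }
    return hits;
}

/* Walk every VMA of the address space in turn. */
static int demo_walk_mm(const struct demo_mm *mm)
{
    int hits = 0;

    for (const struct demo_vma *vma = mm->mmap; vma != NULL; vma = vma->next)
        hits += demo_walk_vma(mm, vma);
    return hits;
}
```

In this model the first-level entries with no VMA over them are never touched, which is the saving the text refers to.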
try_to_swap_out() first checks that the page is not part of the active_list, has not been recently referenced and is not part of a zone that we are not interested in. Once it has been established that this is a page to be swapped out, it is removed from the page tables of the process and further work is performed. It is at this point that the PTE is checked to see if it is dirty. If it is, the struct page flags will be updated to reflect that so that the page will get laundered. Pages with buffers are not handled further as they cannot be swapped out to backing storage, so the PTE for the process is simply established again and the page will be flushed later.
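The order of these checks can be sketched as follows. This is an illustrative model only: struct demo_page, struct demo_pte, the boolean fields and the simple zone comparison are hypothetical simplifications, and the real function does more work than is shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_page {
    bool active;        /* page is on the active list */
    bool dirty;         /* page must be written before it can be freed */
    bool has_buffers;   /* page has buffers attached */
    int zone;           /* hypothetical zone identifier */
};

struct demo_pte {
    bool present;       /* mapping is established */
    bool young;         /* recently referenced */
    bool dirty;         /* written to through this mapping */
    struct demo_page *page;
};

enum demo_result {
    DEMO_SKIPPED,       /* page was not a candidate */
    DEMO_RESTORED,      /* unmapped, then mapping was put back */
    DEMO_UNMAPPED       /* unmapped and ready for further work */
};

static enum demo_result demo_try_to_swap_out(struct demo_pte *pte,
                                             int wanted_zone)
{
    struct demo_page *page = pte->page;

    assert(pte->present && page != NULL);

    /* Skip hot pages and pages from zones that are not of interest. */
    if (page->active || pte->young || page->zone != wanted_zone)
        return DEMO_SKIPPED;

    /* Remove the mapping, then carry the PTE dirty bit over to the page
     * so that the page is written out before it is freed. */
    pte->present = false;
    if (pte->dirty)
        page->dirty = true;

    /* Pages with buffers are not handled here: restore the mapping. */
    if (page->has_buffers) {
        pte->present = true;
        return DEMO_RESTORED;
    }
    return DEMO_UNMAPPED;
}
```

Note that in this sketch the dirty state is transferred to the page even when the mapping is restored, matching the order in which the text describes the steps.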
If this is the first time the page has been swapped, a swap entry is allocated for it with get_swap_page() and the page is added to the swap cache. If the page is already part of the swap cache, the current process's reference to it is simply dropped; when the reference count reaches 0, the page will be freed. Once the page is in the swap cache, the PTE in the process page tables is updated with the information needed to get the page from swap again. This is important because it means the PTEs for a process can never be swapped out or discarded.
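One way to picture a PTE that holds "the information needed to get the page from swap again" is a non-present entry whose remaining bits name a swap area and a slot within it. The sketch below assumes a 32-bit entry with an invented bit layout; the real layout is architecture specific, and the demo_ names and DEMO_ constants are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of a 32-bit non-present PTE:
 *   bit 0      present flag (clear for a swapped-out page)
 *   bits 1-6   swap area number ("type")
 *   bits 8-31  slot offset within that swap area */
#define DEMO_PRESENT     0x1u
#define DEMO_TYPE_SHIFT  1
#define DEMO_TYPE_MASK   0x3fu
#define DEMO_OFF_SHIFT   8

/* Build a non-present PTE that records where the page lives in swap. */
static uint32_t demo_swap_pte(unsigned type, uint32_t offset)
{
    assert(type <= DEMO_TYPE_MASK);
    assert(offset < (1u << (32 - DEMO_OFF_SHIFT)));
    return ((uint32_t)type << DEMO_TYPE_SHIFT) | (offset << DEMO_OFF_SHIFT);
}

static int demo_pte_present(uint32_t pte)
{
    return (pte & DEMO_PRESENT) != 0;
}

/* Recover the swap area number from a non-present PTE. */
static unsigned demo_pte_swap_type(uint32_t pte)
{
    return (pte >> DEMO_TYPE_SHIFT) & DEMO_TYPE_MASK;
}

/* Recover the slot offset from a non-present PTE. */
static uint32_t demo_pte_swap_offset(uint32_t pte)
{
    return pte >> DEMO_OFF_SHIFT;
}
```

Because the location of the page is stored in the PTE itself, losing the PTE would lose the only record of where the page went, which is why the text notes that the PTEs cannot be discarded.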