LinuxConf.Au: A Matter of Hygiene: Automatic Page Migration for Linux
The Linux 2.6 kernels continue to evolve their support for NUMA platforms. Mechanisms and interfaces now (roughly 2.6.16 and later) support "NUMA friendly" placement of kernel and user data, both automatically and under explicit kernel subsystem and/or application control, and kernel developers are enhancing their subsystems to use these interfaces. As a result, the kernel does a fairly good job of allocating memory close to where it will be referenced -- at least initially. However, once automatic load balancing kicks in, all bets are off. The scheduler takes no note of NUMA locality when migrating tasks between nodes, other than the fact that internode load balancing takes place somewhat less frequently than intranode balancing due to scheduling domain thresholds.
This paper will present two series of patches developed by the author, "Migrate on Fault" and "Automatic Page Migration", that together permit a task to pull pages local to itself after being migrated to a new node. These patches build on the direct migration mechanisms in recent upstream kernels to support "lazy migration" -- i.e., pages are migrated in the context of the referencing task, in the fault path. The "automatic migration" series enhances the scheduler to notify a task that it has been migrated to a new node, so that it can arrange for lazy migration of the pages that it references. Both "migrate on fault" and "automatic migration" can be enabled or disabled on a per-cpuset basis.
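The interaction between the two patch series can be illustrated with a toy user-space model. This is only a sketch of the idea, not the kernel implementation: the `Page` and `Task` classes, the `notify_moved` and `touch` methods, and the `auto_migrate` flag (standing in for the per-cpuset switch) are all hypothetical names invented for the illustration, and "unmapping" is reduced to a boolean.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    node: int             # NUMA node currently holding the page
    mapped: bool = True   # False means the next touch takes a fault


@dataclass
class Task:
    node: int
    pages: list = field(default_factory=list)
    migrations: int = 0   # pages pulled local in the fault path

    def allocate(self, count):
        # First-touch placement: new pages land on the task's current node.
        self.pages.extend(Page(self.node) for _ in range(count))

    def notify_moved(self, new_node, auto_migrate=True):
        # Hypothetical stand-in for the scheduler notification: record the
        # new node and, if enabled, unmap remote pages so that later touches
        # will fault.
        self.node = new_node
        if auto_migrate:
            for page in self.pages:
                if page.node != new_node:
                    page.mapped = False

    def touch(self, index):
        # Returns True when the access ends up local to the task's node.
        page = self.pages[index]
        if not page.mapped:
            # Fault path: migrate in the context of the referencing task.
            if page.node != self.node:
                page.node = self.node
                self.migrations += 1
            page.mapped = True
        return page.node == self.node


task = Task(node=0)
task.allocate(4)
task.notify_moved(1)        # load balancer moved the task to node 1
task.touch(0)
task.touch(2)
print([p.node for p in task.pages], task.migrations)  # [1, 0, 1, 0] 2
```

Only the pages the task actually touches after the move are migrated; untouched pages stay on the old node, which is the sense in which the migration is "lazy". With `auto_migrate=False`, accesses after the move simply remain remote.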
The paper will present the results of running the McCalpin STREAM benchmark to show that enabling automatic page migration increases the overall bandwidth of a NUMA platform by allowing the system to return to a state of maximum locality after load perturbations. This is the "hygiene" aspect of the mechanisms. The paper will also present the results of other workloads that show the cost/benefit of the patches when disabled and when enabled.