X-Git-Url: http://ftp.safe.ca/?a=blobdiff_plain;f=Documentation%2Fmemory-hotplug.txt;h=bbc8a6a3692169a53aeb9ac1ab271018a53531d0;hb=44a1f2085e8fe07b3aecdab7c391ca057d75da0f;hp=5fbcc22c98e944c1ec628bca656ec1eae73cec5e;hpb=6867c9310d5dab6897638a89c7e31addfcb22043;p=safe%2Fjmp%2Flinux-2.6 diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt index 5fbcc22..bbc8a6a 100644 --- a/Documentation/memory-hotplug.txt +++ b/Documentation/memory-hotplug.txt @@ -2,7 +2,8 @@ Memory Hotplug ============== -Last Updated: Jul 28 2007 +Created: Jul 28 2007 +Add description of notifier of memory hotplug Oct 11 2007 This document is about memory hotplug including how-to-use and current status. Because Memory Hotplug is still under development, contents of this text will @@ -24,7 +25,8 @@ be changed often. 6.1 Memory offline and ZONE_MOVABLE 6.2. How to offline memory 7. Physical memory remove -8. Future Work List +8. Memory hotplug event notifier +9. Future Work List Note(1): x86_64's has special implementation for memory hotplug. This text does not describe it. @@ -71,13 +73,13 @@ this phase is triggered automatically. ACPI can notify this event. If not, (see Section 4.). Logical Memory Hotplug phase is to change memory state into -avaiable/unavailable for users. Amount of memory from user's view is +available/unavailable for users. Amount of memory from user's view is changed by this phase. The kernel makes all memory in it as free pages when a memory range is available. In this document, this phase is described as online/offline. -Logical Memory Hotplug phase is triggred by write of sysfs file by system +Logical Memory Hotplug phase is triggered by write of sysfs file by system administrator. For the hot-add case, it must be executed after Physical Hotplug phase by hand. (However, if you writes udev's hotplug scripts for memory hotplug, these @@ -122,7 +124,7 @@ config options. This option can be kernel module too. -------------------------------- -3 sysfs files for memory hotplug +4 sysfs files for memory hotplug -------------------------------- All sections have their device information under /sys/devices/system/memory as @@ -136,11 +138,12 @@ For example, assume 1GiB section size. A device for a memory starting at (0x100000000 / 1Gib = 4) This device covers address range [0x100000000 ... 0x140000000) -Under each section, you can see 3 files. +Under each section, you can see 4 files. /sys/devices/system/memory/memoryXXX/phys_index /sys/devices/system/memory/memoryXXX/phys_device /sys/devices/system/memory/memoryXXX/state +/sys/devices/system/memory/memoryXXX/removable 'phys_index' : read-only and contains section id, same as XXX. 'state' : read-write @@ -148,10 +151,20 @@ Under each section, you can see 3 files. at write: user can specify "online", "offline" command 'phys_device': read-only: designed to show the name of physical memory device. This is not well implemented now. +'removable' : read-only: contains an integer value indicating + whether the memory section is removable or not + removable. A value of 1 indicates that the memory + section is removable and a value of 0 indicates that + it is not removable. NOTE: These directories/files appear after physical memory hotplug phase. +If CONFIG_NUMA is enabled the +/sys/devices/system/memory/memoryXXX memory section +directories can also be accessed via symbolic links located in +the /sys/devices/system/node/node* directories. For example: +/sys/devices/system/node/node0/memory9 -> ../../memory/memory9 -------------------------------- 4. Physical memory hot-add phase @@ -307,13 +320,62 @@ Need more implementation yet.... - Notification completion of remove works by OS to firmware. - Guard from remove if not yet. +-------------------------------- +8. Memory hotplug event notifier +-------------------------------- +Memory hotplug has event notifer. There are 6 types of notification. + +MEMORY_GOING_ONLINE + Generated before new memory becomes available in order to be able to + prepare subsystems to handle memory. The page allocator is still unable + to allocate from the new memory. + +MEMORY_CANCEL_ONLINE + Generated if MEMORY_GOING_ONLINE fails. + +MEMORY_ONLINE + Generated when memory has successfully brought online. The callback may + allocate pages from the new memory. + +MEMORY_GOING_OFFLINE + Generated to begin the process of offlining memory. Allocations are no + longer possible from the memory but some of the memory to be offlined + is still in use. The callback can be used to free memory known to a + subsystem from the indicated memory section. + +MEMORY_CANCEL_OFFLINE + Generated if MEMORY_GOING_OFFLINE fails. Memory is available again from + the section that we attempted to offline. + +MEMORY_OFFLINE + Generated after offlining memory is complete. + +A callback routine can be registered by + hotplug_memory_notifier(callback_func, priority) + +The second argument of callback function (action) is event types of above. +The third argument is passed by pointer of struct memory_notify. + +struct memory_notify { + unsigned long start_pfn; + unsigned long nr_pages; + int status_change_nid; +} + +start_pfn is start_pfn of online/offline memory. +nr_pages is # of pages of online/offline memory. +status_change_nid is set node id when N_HIGH_MEMORY of nodemask is (will be) +set/clear. It means a new(memoryless) node gets new memory by online and a +node loses all memory. If this is -1, then nodemask status is not changed. +If status_changed_nid >= 0, callback should create/discard structures for the +node if necessary. + -------------- -8. Future Work +9. Future Work -------------- - allowing memory hot-add to ZONE_MOVABLE. maybe we need some switch like sysctl or new control file. - showing memory section and physical device relationship. - - showing memory section and node relationship (maybe good for NUMA) - showing memory section is under ZONE_MOVABLE or not - test and make it better memory offlining. - support HugeTLB page migration and offlining.