X-Git-Url: http://ftp.safe.ca/?a=blobdiff_plain;f=Documentation%2Fkprobes.txt;h=48b3de90eb1eb03ceb1929e4228ae3a57debc526;hb=59a252ff8c0f2fa32c896f69d56ae33e641ce7ad;hp=0541fe1de70482e74fa2f10c409130b1ac21ef4b;hpb=d27a4dddd96f4ee898f8d1d597d38f8f4079bbb0;p=safe%2Fjmp%2Flinux-2.6 diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt index 0541fe1..48b3de9 100644 --- a/Documentation/kprobes.txt +++ b/Documentation/kprobes.txt @@ -14,6 +14,7 @@ CONTENTS 8. Kprobes Example 9. Jprobes Example 10. Kretprobes Example +Appendix A: The kprobes debugfs interface 1. Concepts: Kprobes, Jprobes, Return Probes @@ -36,6 +37,11 @@ registration function such as register_kprobe() specifies where the probe is to be inserted and what handler is to be called when the probe is hit. +There are also register_/unregister_*probes() functions for batch +registration/unregistration of a group of *probes. These functions +can speed up unregistration process when you have to unregister +a lot of probes at once. + The next three subsections explain how the different types of probes work. They explain certain things that you'll need to know in order to make the best use of Kprobes -- e.g., the @@ -91,11 +97,12 @@ handler has run. Up to MAX_STACK_SIZE bytes are copied -- e.g., 64 bytes on i386. Note that the probed function's args may be passed on the stack -or in registers (e.g., for x86_64 or for an i386 fastcall function). -The jprobe will work in either case, so long as the handler's -prototype matches that of the probed function. +or in registers. The jprobe will work in either case, so long as the +handler's prototype matches that of the probed function. + +1.3 Return Probes -1.3 How Does a Return Probe Work? +1.3.1 How Does a Return Probe Work? When you call register_kretprobe(), Kprobes establishes a kprobe at the entry to the function. When the probed function is called and this @@ -106,9 +113,9 @@ At boot time, Kprobes registers a kprobe at the trampoline. When the probed function executes its return instruction, control passes to the trampoline and that probe is hit. Kprobes' trampoline -handler calls the user-specified handler associated with the kretprobe, -then sets the saved instruction pointer to the saved return address, -and that's where execution resumes upon return from the trap. +handler calls the user-specified return handler associated with the +kretprobe, then sets the saved instruction pointer to the saved return +address, and that's where execution resumes upon return from the trap. While the probed function is executing, its return address is stored in an object of type kretprobe_instance. Before calling @@ -130,27 +137,56 @@ zero when the return probe is registered, and is incremented every time the probed function is entered but there is no kretprobe_instance object available for establishing the return probe. +1.3.2 Kretprobe entry-handler + +Kretprobes also provides an optional user-specified handler which runs +on function entry. This handler is specified by setting the entry_handler +field of the kretprobe struct. Whenever the kprobe placed by kretprobe at the +function entry is hit, the user-defined entry_handler, if any, is invoked. +If the entry_handler returns 0 (success) then a corresponding return handler +is guaranteed to be called upon function return. If the entry_handler +returns a non-zero error then Kprobes leaves the return address as is, and +the kretprobe has no further effect for that particular function instance. + +Multiple entry and return handler invocations are matched using the unique +kretprobe_instance object associated with them. Additionally, a user +may also specify per return-instance private data to be part of each +kretprobe_instance object. This is especially useful when sharing private +data between corresponding user entry and return handlers. The size of each +private data object can be specified at kretprobe registration time by +setting the data_size field of the kretprobe struct. This data can be +accessed through the data field of each kretprobe_instance object. + +In case probed function is entered but there is no kretprobe_instance +object available, then in addition to incrementing the nmissed count, +the user entry_handler invocation is also skipped. + 2. Architectures Supported Kprobes, jprobes, and return probes are implemented on the following architectures: - i386 -- x86_64 (AMD-64, E64MT) +- x86_64 (AMD-64, EM64T) - ppc64 -- ia64 (Support for probes on certain instruction types is still in progress.) +- ia64 (Does not support probes on instruction slot1.) - sparc64 (Return probes not yet implemented.) +- arm +- ppc 3. Configuring Kprobes When configuring the kernel using make menuconfig/xconfig/oldconfig, -ensure that CONFIG_KPROBES is set to "y". Under "Kernel hacking", -look for "Kprobes". You may have to enable "Kernel debugging" -(CONFIG_DEBUG_KERNEL) before you can enable Kprobes. +ensure that CONFIG_KPROBES is set to "y". Under "Instrumentation +Support", look for "Kprobes". -You may also want to ensure that CONFIG_KALLSYMS and perhaps even -CONFIG_KALLSYMS_ALL are set to "y", since kallsyms_lookup_name() -is a handy, version-independent way to find a function's address. +So that you can load and unload Kprobes-based instrumentation modules, +make sure "Loadable module support" (CONFIG_MODULES) and "Module +unloading" (CONFIG_MODULE_UNLOAD) are set to "y". + +Also make sure that CONFIG_KALLSYMS and perhaps even CONFIG_KALLSYMS_ALL +are set to "y", since kallsyms_lookup_name() is used by the in-kernel +kprobe address resolution code. If you need to insert a probe in the middle of a function, you may find it useful to "Compile the kernel with debug info" (CONFIG_DEBUG_INFO), @@ -160,9 +196,11 @@ code mapping. 4. API Reference The Kprobes API includes a "register" function and an "unregister" -function for each type of probe. Here are terse, mini-man-page -specifications for these functions and the associated probe handlers -that you'll write. See the latter half of this document for examples. +function for each type of probe. The API also includes "register_*probes" +and "unregister_*probes" functions for (un)registering arrays of probes. +Here are terse, mini-man-page specifications for these functions and +the associated probe handlers that you'll write. See the files in the +samples/kprobes/ sub-directory for examples. 4.1 register_kprobe @@ -176,6 +214,27 @@ occurs during execution of kp->pre_handler or kp->post_handler, or during single-stepping of the probed instruction, Kprobes calls kp->fault_handler. Any or all handlers can be NULL. +NOTE: +1. With the introduction of the "symbol_name" field to struct kprobe, +the probepoint address resolution will now be taken care of by the kernel. +The following will now work: + + kp.symbol_name = "symbol_name"; + +(64-bit powerpc intricacies such as function descriptors are handled +transparently) + +2. Use the "offset" field of struct kprobe if the offset into the symbol +to install a probepoint is known. This field is used to calculate the +probepoint. + +3. Specify either the kprobe "symbol_name" OR the "addr". If both are +specified, kprobe registration will fail with -EINVAL. + +4. With CISC architectures (such as i386 and x86_64), the kprobes code +does not validate if the kprobe.addr is at an instruction boundary. +Use "offset" with caution. + register_kprobe() returns 0 on success, or a negative errno otherwise. User's pre-handler (kp->pre_handler): @@ -218,9 +277,9 @@ Kprobes runs the handler whose address is jp->entry. The handler should have the same arg list and return type as the probed function; and just before it returns, it must call jprobe_return(). (The handler never actually returns, since jprobe_return() returns -control to Kprobes.) If the probed function is declared asmlinkage, -fastcall, or anything else that affects how args are passed, the -handler's declaration must match. +control to Kprobes.) If the probed function is declared asmlinkage +or anything else that affects how args are passed, the handler's +declaration must match. register_jprobe() returns 0 on success, or a negative errno otherwise. @@ -248,6 +307,13 @@ of interest: - ret_addr: the return address - rp: points to the corresponding kretprobe object - task: points to the corresponding task struct +- data: points to per return-instance private data; see "Kretprobe + entry-handler" for details. + +The regs_return_value(regs) macro provides a simple abstraction to +extract the return value from the appropriate register as defined by +the architecture's ABI. + The handler's return value is currently ignored. 4.4 unregister_*probe @@ -260,20 +326,57 @@ void unregister_kretprobe(struct kretprobe *rp); Removes the specified probe. The unregister function can be called at any time after the probe has been registered. +NOTE: +If the functions find an incorrect probe (ex. an unregistered probe), +they clear the addr field of the probe. + +4.5 register_*probes + +#include +int register_kprobes(struct kprobe **kps, int num); +int register_kretprobes(struct kretprobe **rps, int num); +int register_jprobes(struct jprobe **jps, int num); + +Registers each of the num probes in the specified array. If any +error occurs during registration, all probes in the array, up to +the bad probe, are safely unregistered before the register_*probes +function returns. +- kps/rps/jps: an array of pointers to *probe data structures +- num: the number of the array entries. + +NOTE: +You have to allocate(or define) an array of pointers and set all +of the array entries before using these functions. + +4.6 unregister_*probes + +#include +void unregister_kprobes(struct kprobe **kps, int num); +void unregister_kretprobes(struct kretprobe **rps, int num); +void unregister_jprobes(struct jprobe **jps, int num); + +Removes each of the num probes in the specified array at once. + +NOTE: +If the functions find some incorrect probes (ex. unregistered +probes) in the specified array, they clear the addr field of those +incorrect probes. However, other probes in the array are +unregistered correctly. + 5. Kprobes Features and Limitations -As of Linux v2.6.12, Kprobes allows multiple probes at the same -address. Currently, however, there cannot be multiple jprobes on -the same function at the same time. +Kprobes allows multiple probes at the same address. Currently, +however, there cannot be multiple jprobes on the same function at +the same time. In general, you can install a probe anywhere in the kernel. In particular, you can probe interrupt handlers. Known exceptions are discussed in this section. -For obvious reasons, it's a bad idea to install a probe in -the code that implements Kprobes (mostly kernel/kprobes.c and -arch/*/kernel/kprobes.c). A patch in the v2.6.13 timeframe instructs -Kprobes to reject such requests. +The register_*probe functions will return -EINVAL if you attempt +to install a probe in the code that implements Kprobes (mostly +kernel/kprobes.c and arch/*/kernel/kprobes.c, but also functions such +as do_page_fault and notifier_call_chain). If you install a probe in an inline-able function, Kprobes makes no attempt to chase down all inline instances of the function and @@ -290,18 +393,14 @@ from the accidental ones. Don't drink and probe. Kprobes makes no attempt to prevent probe handlers from stepping on each other -- e.g., probing printk() and then calling printk() from a -probe handler. As of Linux v2.6.12, if a probe handler hits a probe, -that second probe's handlers won't be run in that instance. - -In Linux v2.6.12 and previous versions, Kprobes' data structures are -protected by a single lock that is held during probe registration and -unregistration and while handlers are run. Thus, no two handlers -can run simultaneously. To improve scalability on SMP systems, -this restriction will probably be removed soon, in which case -multiple handlers (or multiple instances of the same handler) may -run concurrently on different CPUs. Code your handlers accordingly. - -Kprobes does not use semaphores or allocate memory except during +probe handler. If a probe handler hits a probe, that second probe's +handlers won't be run in that instance, and the kprobe.nmissed member +of the second probe will be incremented. + +As of Linux v2.6.15-rc1, multiple handlers (or multiple instances of +the same handler) may run concurrently on different CPUs. + +Kprobes does not use mutexes or allocate memory except during registration and unregistration. Probe handlers are run with preemption disabled. Depending on the @@ -316,11 +415,21 @@ address instead of the real return address for kretprobed functions. (As far as we can tell, __builtin_return_address() is used only for instrumentation and error reporting.) -If the number of times a function is called does not match the -number of times it returns, registering a return probe on that -function may produce undesirable results. We have the do_exit() -and do_execve() cases covered. do_fork() is not an issue. We're -unaware of other specific cases where this could be a problem. +If the number of times a function is called does not match the number +of times it returns, registering a return probe on that function may +produce undesirable results. In such a case, a line: +kretprobe BUG!: Processing kretprobe d000000000041aa8 @ c00000000004f48c +gets printed. With this information, one will be able to correlate the +exact instance of the kretprobe that caused the problem. We have the +do_exit() case covered. do_execve() and do_fork() are not an issue. +We're unaware of other specific cases where this could be a problem. + +If, upon entry to or exit from a function, the CPU is running on +a stack other than that of the current task, registering a return +probe on that function may produce undesirable results. For this +reason, Kprobes doesn't support return probes (or kprobes or jprobes) +on the x86_64 version of __switch_to(); the registration functions +return -EINVAL. 6. Probe Overhead @@ -347,242 +456,54 @@ k = 0.77 usec; j = 1.31; r = 1.26; kr = 1.45; jr = 1.99 7. TODO -a. SystemTap (http://sourceware.org/systemtap): Work in progress -to provide a simplified programming interface for probe-based -instrumentation. -b. Improved SMP scalability: Currently, work is in progress to handle -multiple kprobes in parallel. -c. Kernel return probes for sparc64. -d. Support for other architectures. -e. User-space probes. +a. SystemTap (http://sourceware.org/systemtap): Provides a simplified +programming interface for probe-based instrumentation. Try it out. +b. Kernel return probes for sparc64. +c. Support for other architectures. +d. User-space probes. +e. Watchpoint probes (which fire on data references). 8. Kprobes Example -Here's a sample kernel module showing the use of kprobes to dump a -stack trace and selected i386 registers when do_fork() is called. ------ cut here ----- -/*kprobe_example.c*/ -#include -#include -#include -#include -#include - -/*For each probe you need to allocate a kprobe structure*/ -static struct kprobe kp; - -/*kprobe pre_handler: called just before the probed instruction is executed*/ -int handler_pre(struct kprobe *p, struct pt_regs *regs) -{ - printk("pre_handler: p->addr=0x%p, eip=%lx, eflags=0x%lx\n", - p->addr, regs->eip, regs->eflags); - dump_stack(); - return 0; -} - -/*kprobe post_handler: called after the probed instruction is executed*/ -void handler_post(struct kprobe *p, struct pt_regs *regs, unsigned long flags) -{ - printk("post_handler: p->addr=0x%p, eflags=0x%lx\n", - p->addr, regs->eflags); -} - -/* fault_handler: this is called if an exception is generated for any - * instruction within the pre- or post-handler, or when Kprobes - * single-steps the probed instruction. - */ -int handler_fault(struct kprobe *p, struct pt_regs *regs, int trapnr) -{ - printk("fault_handler: p->addr=0x%p, trap #%dn", - p->addr, trapnr); - /* Return 0 because we don't handle the fault. */ - return 0; -} - -int init_module(void) -{ - int ret; - kp.pre_handler = handler_pre; - kp.post_handler = handler_post; - kp.fault_handler = handler_fault; - kp.addr = (kprobe_opcode_t*) kallsyms_lookup_name("do_fork"); - /* register the kprobe now */ - if (!kp.addr) { - printk("Couldn't find %s to plant kprobe\n", "do_fork"); - return -1; - } - if ((ret = register_kprobe(&kp) < 0)) { - printk("register_kprobe failed, returned %d\n", ret); - return -1; - } - printk("kprobe registered\n"); - return 0; -} - -void cleanup_module(void) -{ - unregister_kprobe(&kp); - printk("kprobe unregistered\n"); -} - -MODULE_LICENSE("GPL"); ------ cut here ----- - -You can build the kernel module, kprobe-example.ko, using the following -Makefile: ------ cut here ----- -obj-m := kprobe-example.o -KDIR := /lib/modules/$(shell uname -r)/build -PWD := $(shell pwd) -default: - $(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules -clean: - rm -f *.mod.c *.ko *.o ------ cut here ----- - -$ make -$ su - -... -# insmod kprobe-example.ko - -You will see the trace data in /var/log/messages and on the console -whenever do_fork() is invoked to create a new process. +See samples/kprobes/kprobe_example.c 9. Jprobes Example -Here's a sample kernel module showing the use of jprobes to dump -the arguments of do_fork(). ------ cut here ----- -/*jprobe-example.c */ -#include -#include -#include -#include -#include -#include - -/* - * Jumper probe for do_fork. - * Mirror principle enables access to arguments of the probed routine - * from the probe handler. - */ - -/* Proxy routine having the same arguments as actual do_fork() routine */ -long jdo_fork(unsigned long clone_flags, unsigned long stack_start, - struct pt_regs *regs, unsigned long stack_size, - int __user * parent_tidptr, int __user * child_tidptr) -{ - printk("jprobe: clone_flags=0x%lx, stack_size=0x%lx, regs=0x%p\n", - clone_flags, stack_size, regs); - /* Always end with a call to jprobe_return(). */ - jprobe_return(); - /*NOTREACHED*/ - return 0; -} - -static struct jprobe my_jprobe = { - .entry = (kprobe_opcode_t *) jdo_fork -}; - -int init_module(void) -{ - int ret; - my_jprobe.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("do_fork"); - if (!my_jprobe.kp.addr) { - printk("Couldn't find %s to plant jprobe\n", "do_fork"); - return -1; - } - - if ((ret = register_jprobe(&my_jprobe)) <0) { - printk("register_jprobe failed, returned %d\n", ret); - return -1; - } - printk("Planted jprobe at %p, handler addr %p\n", - my_jprobe.kp.addr, my_jprobe.entry); - return 0; -} - -void cleanup_module(void) -{ - unregister_jprobe(&my_jprobe); - printk("jprobe unregistered\n"); -} - -MODULE_LICENSE("GPL"); ------ cut here ----- - -Build and insert the kernel module as shown in the above kprobe -example. You will see the trace data in /var/log/messages and on -the console whenever do_fork() is invoked to create a new process. -(Some messages may be suppressed if syslogd is configured to -eliminate duplicate messages.) +See samples/kprobes/jprobe_example.c 10. Kretprobes Example -Here's a sample kernel module showing the use of return probes to -report failed calls to sys_open(). ------ cut here ----- -/*kretprobe-example.c*/ -#include -#include -#include -#include - -static const char *probed_func = "sys_open"; - -/* Return-probe handler: If the probed function fails, log the return value. */ -static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs) -{ - // Substitute the appropriate register name for your architecture -- - // e.g., regs->rax for x86_64, regs->gpr[3] for ppc64. - int retval = (int) regs->eax; - if (retval < 0) { - printk("%s returns %d\n", probed_func, retval); - } - return 0; -} - -static struct kretprobe my_kretprobe = { - .handler = ret_handler, - /* Probe up to 20 instances concurrently. */ - .maxactive = 20 -}; - -int init_module(void) -{ - int ret; - my_kretprobe.kp.addr = - (kprobe_opcode_t *) kallsyms_lookup_name(probed_func); - if (!my_kretprobe.kp.addr) { - printk("Couldn't find %s to plant return probe\n", probed_func); - return -1; - } - if ((ret = register_kretprobe(&my_kretprobe)) < 0) { - printk("register_kretprobe failed, returned %d\n", ret); - return -1; - } - printk("Planted return probe at %p\n", my_kretprobe.kp.addr); - return 0; -} - -void cleanup_module(void) -{ - unregister_kretprobe(&my_kretprobe); - printk("kretprobe unregistered\n"); - /* nmissed > 0 suggests that maxactive was set too low. */ - printk("Missed probing %d instances of %s\n", - my_kretprobe.nmissed, probed_func); -} - -MODULE_LICENSE("GPL"); ------ cut here ----- - -Build and insert the kernel module as shown in the above kprobe -example. You will see the trace data in /var/log/messages and on the -console whenever sys_open() returns a negative value. (Some messages -may be suppressed if syslogd is configured to eliminate duplicate -messages.) +See samples/kprobes/kretprobe_example.c For additional information on Kprobes, refer to the following URLs: http://www-106.ibm.com/developerworks/library/l-kprobes.html?ca=dgr-lnxw42Kprobe http://www.redhat.com/magazine/005mar05/features/kprobes/ +http://www-users.cs.umn.edu/~boutcher/kprobes/ +http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf (pages 101-115) + + +Appendix A: The kprobes debugfs interface + +With recent kernels (> 2.6.20) the list of registered kprobes is visible +under the /debug/kprobes/ directory (assuming debugfs is mounted at /debug). + +/debug/kprobes/list: Lists all registered probes on the system + +c015d71a k vfs_read+0x0 +c011a316 j do_fork+0x0 +c03dedc5 r tcp_v4_rcv+0x0 + +The first column provides the kernel address where the probe is inserted. +The second column identifies the type of probe (k - kprobe, r - kretprobe +and j - jprobe), while the third column specifies the symbol+offset of +the probe. If the probed function belongs to a module, the module name +is also specified. Following columns show probe status. If the probe is on +a virtual address that is no longer valid (module init sections, module +virtual addresses that correspond to modules that've been unloaded), +such probes are marked with [GONE]. + +/debug/kprobes/enabled: Turn kprobes ON/OFF + +Provides a knob to globally turn registered kprobes ON or OFF. By default, +all kprobes are enabled. By echoing "0" to this file, all registered probes +will be disarmed, till such time a "1" is echoed to this file.