safe/jmp/linux-2.6
16 years agoedac: use the shorter LIST_HEAD for brevity
Robert P. J. Day [Tue, 29 Apr 2008 08:03:17 +0000 (01:03 -0700)]
edac: use the shorter LIST_HEAD for brevity

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Acked-by: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoedac: add e752x parameter for sysbus_parity selection
Peter Tyser [Tue, 29 Apr 2008 08:03:15 +0000 (01:03 -0700)]
edac: add e752x parameter for sysbus_parity selection

Add a module parameter "sysbus_parity" to allow forcing system bus parity
error checking on or off.  Also add support to automatically disable system
bus parity errors for processors which do not support it.

If the sysbus_parity parameter is specified, sysbus parity detection will be
forced on or off.  If it is not specified, the driver will attempt to look at
the CPU identifier string and determine if the CPU supports system bus parity.
 A blacklist was used instead of a whitelist so that system bus parity would
be enabled by default and to minimize the chances of breaking things for those
people already using the driver which for some reason have a processor that
does not have a valid CPU identifier string.

[akpm@linux-foundation.org: coding-style fixes]
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Peter Tyser <ptyser@xes-inc.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoedac: new support for Intel 3100 chipset
Andrei Konovalov [Tue, 29 Apr 2008 08:03:13 +0000 (01:03 -0700)]
edac: new support for Intel 3100 chipset

Add Intel 3100 chipset support to e752x EDAC driver.

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrei Konovalov <akonovalov@ru.mvista.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoidr: create idr_layer_cache at boot time
Akinobu Mita [Tue, 29 Apr 2008 08:03:13 +0000 (01:03 -0700)]
idr: create idr_layer_cache at boot time

Avoid a possible kmem_cache_create() failure by creating idr_layer_cache
unconditionary at boot time rather than creating it on-demand when idr_init()
is called the first time.

This change also enables us to eliminate the check every time idr_init() is
called.

[akpm@linux-foundation.org: rename init_id_cache() to idr_init_cache()]
[akpm@linux-foundation.org: fix alpha build]
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoRemove duplicated unlikely() in IS_ERR()
Hirofumi Nakagawa [Tue, 29 Apr 2008 08:03:09 +0000 (01:03 -0700)]
Remove duplicated unlikely() in IS_ERR()

Some drivers have duplicated unlikely() macros.  IS_ERR() already has
unlikely() in itself.

This patch cleans up such pointless code.

Signed-off-by: Hirofumi Nakagawa <hnakagawa@miraclelinux.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jeff@garzik.org>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: add async notification support to /dev/random
Jeff Dike [Tue, 29 Apr 2008 08:03:08 +0000 (01:03 -0700)]
random: add async notification support to /dev/random

Add async notification support to /dev/random.

A little test case is below.  Without this patch, you get:

$ ./async-random
Drained the pool
Found more randomness

With it, you get:

$ ./async-random
Drained the pool
SIGIO
Found more randomness

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <errno.h>
#include <fcntl.h>

static void handler(int sig)
{
        printf("SIGIO\n");
}

int main(int argc, char **argv)
{
        int fd, n, err, flags;

        if(signal(SIGIO, handler) < 0){
                perror("setting SIGIO handler");
                exit(1);
        }

        fd = open("/dev/random", O_RDONLY);
        if(fd < 0){
                perror("open");
                exit(1);
        }

        flags = fcntl(fd, F_GETFL);
        if (flags < 0){
                perror("getting flags");
                exit(1);
        }

        flags |= O_NONBLOCK;
        if (fcntl(fd, F_SETFL, flags) < 0){
                perror("setting flags");
                exit(1);
        }

        while((err = read(fd, &n, sizeof(n))) > 0) ;

        if(err == 0){
                printf("random returned 0\n");
                exit(1);
        }
        else if(errno != EAGAIN){
                perror("read");
                exit(1);
        }

        flags |= O_ASYNC;
        if (fcntl(fd, F_SETFL, flags) < 0){
                perror("setting flags");
                exit(1);
        }

        if (fcntl(fd, F_SETOWN, getpid()) < 0) {
                perror("Setting SIGIO");
                exit(1);
        }

        printf("Drained the pool\n");
        read(fd, &n, sizeof(n));
        printf("Found more randomness\n");

        return(0);
}

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: simplify and rename credit_entropy_store
Matt Mackall [Tue, 29 Apr 2008 08:03:07 +0000 (01:03 -0700)]
random: simplify and rename credit_entropy_store

- emphasize bits in the name
- make zero bits lock-free
- simplify logic

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: make mixing interface byte-oriented
Matt Mackall [Tue, 29 Apr 2008 08:03:05 +0000 (01:03 -0700)]
random: make mixing interface byte-oriented

Switch add_entropy_words to a byte-oriented interface, eliminating numerous
casts and byte/word size rounding issues.  This also reduces the overall
bit/byte/word confusion in this code.

We now mix a byte at a time into the word-based pool.  This takes four times
as many iterations, but should be negligible compared to hashing overhead.
This also increases our pool churn, which adds some depth against some
theoretical failure modes.

The function name is changed to emphasize pool mixing and deemphasize entropy
(the samples mixed in may not contain any).  extract is added to the core
function to make it clear that it extracts from the pool.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: simplify add_ptr logic
Matt Mackall [Tue, 29 Apr 2008 08:03:04 +0000 (01:03 -0700)]
random: simplify add_ptr logic

The add_ptr variable wasn't used in a sensible way, use only i instead.
i got reused later for a different purpose, use j instead.

While we're here, put tap0 first in the tap list and add a comment.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: remove some prefetch logic
Matt Mackall [Tue, 29 Apr 2008 08:03:03 +0000 (01:03 -0700)]
random: remove some prefetch logic

The urandom output pool (ie the fast path) fits in one cacheline, so
this is pretty unnecessary. Further, the output path has already
fetched the entire pool to hash it before calling in here.

(This was the only user of prefetch_range in the kernel, and it passed
in words rather than bytes!)

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: eliminate redundant new_rotate variable
Matt Mackall [Tue, 29 Apr 2008 08:03:02 +0000 (01:03 -0700)]
random: eliminate redundant new_rotate variable

- eliminate new_rotate
- move input_rotate masking
- simplify input_rotate update
- move input_rotate update to end of inner loop for readability

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: remove cacheline alignment for locks
Matt Mackall [Tue, 29 Apr 2008 08:03:01 +0000 (01:03 -0700)]
random: remove cacheline alignment for locks

Earlier changes greatly reduce the number of times we grab the lock
per output byte, so we shouldn't need this particular hack any more.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: make backtracking attacks harder
Matt Mackall [Tue, 29 Apr 2008 08:03:00 +0000 (01:03 -0700)]
random: make backtracking attacks harder

At each extraction, we change (poolbits / 16) + 32 bits in the pool,
or 96 bits in the case of the secondary pools. Thus, a brute-force
backtracking attack on the pool state is less difficult than breaking
the hash. In certain cases, this difficulty may be is reduced to 2^64
iterations.

Instead, hash the entire pool in one go, then feedback the whole hash
(160 bits) in one go. This will make backtracking at least as hard as
inverting the hash.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: improve variable naming, clear extract buffer
Matt Mackall [Tue, 29 Apr 2008 08:02:59 +0000 (01:02 -0700)]
random: improve variable naming, clear extract buffer

- split the SHA variables apart into hash and workspace
- rename data to extract
- wipe extract and workspace after hashing

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: reuse rand_initialize
Matt Mackall [Tue, 29 Apr 2008 08:02:58 +0000 (01:02 -0700)]
random: reuse rand_initialize

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: use unlocked_ioctl
Matt Mackall [Tue, 29 Apr 2008 08:02:58 +0000 (01:02 -0700)]
random: use unlocked_ioctl

No locking actually needed.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: consolidate wakeup logic
Matt Mackall [Tue, 29 Apr 2008 08:02:56 +0000 (01:02 -0700)]
random: consolidate wakeup logic

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agorandom: clean up checkpatch complaints
Matt Mackall [Tue, 29 Apr 2008 08:02:55 +0000 (01:02 -0700)]
random: clean up checkpatch complaints

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoremove aoedev_isbusy()
Adrian Bunk [Tue, 29 Apr 2008 08:02:54 +0000 (01:02 -0700)]
remove aoedev_isbusy()

Remove the no longer used aoedev_isbusy().

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agogeneralize asm-generic/ioctl.h to allow overriding values
Robert P. J. Day [Tue, 29 Apr 2008 08:02:54 +0000 (01:02 -0700)]
generalize asm-generic/ioctl.h to allow overriding values

In the spirit of a number of other asm-generic header files,
generalize asm-generic/ioctl.h to allow arch-specific ioctl.h headers
to simply override _IOC_SIZEBITS and/or _IOC_DIRBITS before including
this header file, allowing a number of ioctl.h header files to be
shortened considerably.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agonbd: delete superfluous test for __GNUC__
Robert P. J. Day [Tue, 29 Apr 2008 08:02:52 +0000 (01:02 -0700)]
nbd: delete superfluous test for __GNUC__

Since <linux/compiler.h> already tests for __GNUC__, there's no point in nbd.h
repeating that test.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoNBD: add partition support
Laurent Vivier [Tue, 29 Apr 2008 08:02:51 +0000 (01:02 -0700)]
NBD: add partition support

Permit the use of partitions with network block devices (NBD).

A new parameter is introduced to define how many partition we want to be able
to manage per network block device.  This parameter is "max_part".

For instance, to manage 63 partitions / loop device, we will do:

   [on the server side]
# nbd-server 1234 /dev/sdb
   [on the client side]
# modprobe nbd max_part=63
# ls -l /dev/nbd*
brw-rw---- 1 root disk 43,   0 2008-03-25 11:14 /dev/nbd0
brw-rw---- 1 root disk 43,  64 2008-03-25 11:11 /dev/nbd1
brw-rw---- 1 root disk 43, 640 2008-03-25 11:11 /dev/nbd10
brw-rw---- 1 root disk 43, 704 2008-03-25 11:11 /dev/nbd11
brw-rw---- 1 root disk 43, 768 2008-03-25 11:11 /dev/nbd12
brw-rw---- 1 root disk 43, 832 2008-03-25 11:11 /dev/nbd13
brw-rw---- 1 root disk 43, 896 2008-03-25 11:11 /dev/nbd14
brw-rw---- 1 root disk 43, 960 2008-03-25 11:11 /dev/nbd15
brw-rw---- 1 root disk 43, 128 2008-03-25 11:11 /dev/nbd2
brw-rw---- 1 root disk 43, 192 2008-03-25 11:11 /dev/nbd3
brw-rw---- 1 root disk 43, 256 2008-03-25 11:11 /dev/nbd4
brw-rw---- 1 root disk 43, 320 2008-03-25 11:11 /dev/nbd5
brw-rw---- 1 root disk 43, 384 2008-03-25 11:11 /dev/nbd6
brw-rw---- 1 root disk 43, 448 2008-03-25 11:11 /dev/nbd7
brw-rw---- 1 root disk 43, 512 2008-03-25 11:11 /dev/nbd8
brw-rw---- 1 root disk 43, 576 2008-03-25 11:11 /dev/nbd9
# nbd-client localhost 1234 /dev/nbd0
Negotiation: ..size = 80418240KB
bs=1024, sz=80418240

-------NOTE, RFC: partition table is not automatically read.
The driver sets bdev->bd_invalidated to 1 to force the read of the partition
table of the device, but this is done only on an open of the device.
So we have to do a "touch /dev/nbdX" or something like that.
It can't be done from the nbd-client or nbd driver because at this
level we can't ask to read the partition table and to serve the request
at the same time (-> deadlock)

If someone has a better idea, I'm open to any suggestion.
-------NOTE, RFC

# fdisk -l /dev/nbd0

Disk /dev/nbd0: 82.3 GB, 82348277760 bytes
255 heads, 63 sectors/track, 10011 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

     Device Boot      Start         End      Blocks   Id  System
/dev/nbd0p1   *           1        9965    80043831   83  Linux
/dev/nbd0p2            9966       10011      369495    5  Extended
/dev/nbd0p5            9966       10011      369463+  82  Linux swap / Solaris

# ls -l /dev/nbd0*
brw-rw---- 1 root disk 43,   0 2008-03-25 11:16 /dev/nbd0
brw-rw---- 1 root disk 43,   1 2008-03-25 11:16 /dev/nbd0p1
brw-rw---- 1 root disk 43,   2 2008-03-25 11:16 /dev/nbd0p2
brw-rw---- 1 root disk 43,   5 2008-03-25 11:16 /dev/nbd0p5
# mount /dev/nbd0p1 /mnt
# ls /mnt
bin    dev   initrd      lost+found  opt   sbin     sys  var
boot   etc   initrd.img  media       proc  selinux  tmp  vmlinuz
cdrom  home  lib         mnt         root  srv      usr
# umount /mnt
# nbd-client -d /dev/nbd0
# ls -l /dev/nbd0*
brw-rw---- 1 root disk 43, 0 2008-03-25 11:16 /dev/nbd0
-------NOTE
On "nbd-client -d", we can do an iocl(BLKRRPART) to update partition table:
as the size of the device is 0, we don't have to serve the partition manager
request (-> no deadlock).
-------NOTE

Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoNBD: allow nbd to be used locally
Laurent Vivier [Tue, 29 Apr 2008 08:02:46 +0000 (01:02 -0700)]
NBD: allow nbd to be used locally

This patch allows Network Block Device to be mounted locally (nbd-client to
nbd-server over 127.0.0.1).

It creates a kthread to avoid the deadlock described in NBD tools
documentation.  So, if nbd-client hangs waiting for pages, the kblockd thread
can continue its work and free pages.

I have tested the patch to verify that it avoids the hang that always occurs
when writing to a localhost nbd connection.  I have also tested to verify that
no performance degradation results from the additional thread and queue.

Patch originally from Laurent Vivier.

Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoedd: add default mode CONFIG_EDD_OFF=n, override with edd={on,off}
Tim Gardner [Tue, 29 Apr 2008 08:02:45 +0000 (01:02 -0700)]
edd: add default mode CONFIG_EDD_OFF=n, override with edd={on,off}

Add a kernel parameter option to 'edd' to enable/disable BIOS Enhanced Disk
Drive Services.  CONFIG_EDD_OFF disables EDD while still compiling EDD into
the kernel.  Default behavior can be forced using 'edd=on' or 'edd=off' as
a kernel parameter.

[akpm@linux-foundation.org: fix kernel-parameters.txt]
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agosysctl: add the ->permissions callback on the ctl_table_root
Pavel Emelyanov [Tue, 29 Apr 2008 08:02:44 +0000 (01:02 -0700)]
sysctl: add the ->permissions callback on the ctl_table_root

When reading from/writing to some table, a root, which this table came from,
may affect this table's permissions, depending on who is working with the
table.

The core hunk is at the bottom of this patch.  All the rest is just pushing
the ctl_table_root argument up to the sysctl_perm() function.

This will be mostly (only?) used in the net sysctls.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Denis V. Lunev <den@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agosysctl: clean from unneeded extern and forward declarations
Pavel Emelyanov [Tue, 29 Apr 2008 08:02:41 +0000 (01:02 -0700)]
sysctl: clean from unneeded extern and forward declarations

The do_sysctl_strategy isn't used outside kernel/sysctl.c, so this can be
static and without a prototype in header.

Besides, move this one and parse_table() above their callers and drop the
forward declarations of the latter call.

One more "besides" - fix two checkpatch warnings: space before a ( and an
extra space at the end of a line.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Denis V. Lunev <den@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agosysctl: merge equal proc_sys_read and proc_sys_write
Pavel Emelyanov [Tue, 29 Apr 2008 08:02:40 +0000 (01:02 -0700)]
sysctl: merge equal proc_sys_read and proc_sys_write

Many (most of) sysctls do not have a per-container sense.  E.g.
kernel.print_fatal_signals, vm.panic_on_oom, net.core.netdev_budget and so on
and so forth.  Besides, tuning then from inside a container is not even
secure.  On the other hand, hiding them completely from the container's tasks
sometimes causes user-space to stop working.

When developing net sysctl, the common practice was to duplicate a table and
drop the write bits in table->mode, but this approach was not very elegant,
lead to excessive memory consumption and was not suitable in general.

Here's the alternative solution.  To facilitate the per-container sysctls
ctl_table_root-s were introduced.  Each root contains a list of
ctl_table_header-s that are visible to different namespaces.  The idea of this
set is to add the permissions() callback on the ctl_table_root to allow ctl
root limit permissions to the same ctl_table-s.

The main user of this functionality is the net-namespaces code, but later this
will (should) be used by more and more namespaces, containers and control
groups.

Actually, this idea's core is in a single hunk in the third patch.  First two
patches are cleanups for sysctl code, while the third one mostly extends the
arguments set of some sysctl functions.

This patch:

These ->read and ->write callbacks act in a very similar way, so merge these
paths to reduce the number of places to patch later and shrink the .text size
(a bit).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: "David S. Miller" <davem@davemloft.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Denis V. Lunev <den@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoinclude/linux/sysctl.h: remove empty #else
Adrian Bunk [Tue, 29 Apr 2008 08:02:38 +0000 (01:02 -0700)]
include/linux/sysctl.h: remove empty #else

Remove an empty #else.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agosysctl: allow embedded targets to disable sysctl_check.c
Holger Schurig [Tue, 29 Apr 2008 08:02:36 +0000 (01:02 -0700)]
sysctl: allow embedded targets to disable sysctl_check.c

Disable sysctl_check.c for embedded targets. This saves about about 11 kB
in .text and another 11 kB in .data on a PXA255 embedded platform.

Signed-off-by: Holger Schurig <hs4233@mail.mn-solutions.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agodrivers: use non-racy method for proc entries creation (2)
Denis V. Lunev [Tue, 29 Apr 2008 08:02:35 +0000 (01:02 -0700)]
drivers: use non-racy method for proc entries creation (2)

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Neil Brown <neilb@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agodrivers: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:34 +0000 (01:02 -0700)]
drivers: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoparisc: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:32 +0000 (01:02 -0700)]
parisc: use non-racy method for proc entries creation

Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to
main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agokernel: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:31 +0000 (01:02 -0700)]
kernel: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoisdn: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:30 +0000 (01:02 -0700)]
isdn: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Add correct ->owner to proc_fops to fix reading/module unloading race.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Karsten Keil <kkeil@suse.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agonetdev: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:29 +0000 (01:02 -0700)]
netdev: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoacpi: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:27 +0000 (01:02 -0700)]
acpi: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Add correct ->owner to proc_fops to fix reading/module unloading race.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agopowerpc: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:26 +0000 (01:02 -0700)]
powerpc: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Add correct ->owner to proc_fops to fix reading/module unloading race.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoparisc: use non-racy method for /proc/pcxl_dma creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:25 +0000 (01:02 -0700)]
parisc: use non-racy method for /proc/pcxl_dma creation

Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to
main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoia64: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:25 +0000 (01:02 -0700)]
ia64: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agocris: use non-racy method for /proc/system_profile creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:23 +0000 (01:02 -0700)]
cris: use non-racy method for /proc/system_profile creation

Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to
main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoavr32: proc: use non-racy method for /proc/tlb creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:22 +0000 (01:02 -0700)]
avr32: proc: use non-racy method for /proc/tlb creation

Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to
main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoarm: use non-racy method for /proc/davinci_clocks creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:21 +0000 (01:02 -0700)]
arm: use non-racy method for /proc/davinci_clocks creation

Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to
main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agos390: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:20 +0000 (01:02 -0700)]
s390: use non-racy method for proc entries creation

Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to
main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agousb: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:19 +0000 (01:02 -0700)]
usb: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoscsi: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:17 +0000 (01:02 -0700)]
scsi: use non-racy method for proc entries creation

Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to
main tree.

Add correct ->owner to proc_fops to fix reading/module unloading race.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agosamples: use non-racy method for /proc/marker-example creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:16 +0000 (01:02 -0700)]
samples: use non-racy method for /proc/marker-example creation

Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to
main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agozorro: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:16 +0000 (01:02 -0700)]
zorro: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Add correct ->owner to proc_fops to fix reading/module unloading race.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agosound: use non-racy method for /proc/driver/snd-page-alloc creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:13 +0000 (01:02 -0700)]
sound: use non-racy method for /proc/driver/snd-page-alloc creation

Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to
main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agomm: use non-racy method for /proc/swaps creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:13 +0000 (01:02 -0700)]
mm: use non-racy method for /proc/swaps creation

Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to
main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agosysvipc: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:12 +0000 (01:02 -0700)]
sysvipc: use non-racy method for proc entries creation

Use proc_create_data() to make sure that ->proc_fops and ->data be setup
before gluing PDE to main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agojbd2: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:11 +0000 (01:02 -0700)]
jbd2: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoreiserfs: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:09 +0000 (01:02 -0700)]
reiserfs: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

/proc entry owner is also added.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoext4: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:08 +0000 (01:02 -0700)]
ext4: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoafs: use non-racy method for proc entries creation
Denis V. Lunev [Tue, 29 Apr 2008 08:02:07 +0000 (01:02 -0700)]
afs: use non-racy method for proc entries creation

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agonfs: use proc_create to setup de->proc_fops
Denis V. Lunev [Tue, 29 Apr 2008 08:02:07 +0000 (01:02 -0700)]
nfs: use proc_create to setup de->proc_fops

Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to
main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agonfsd: use proc_create to setup de->proc_fops
Denis V. Lunev [Tue, 29 Apr 2008 08:02:04 +0000 (01:02 -0700)]
nfsd: use proc_create to setup de->proc_fops

Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to
main tree.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: introduce proc_create_data to setup de->data
Denis V. Lunev [Tue, 29 Apr 2008 08:02:00 +0000 (01:02 -0700)]
proc: introduce proc_create_data to setup de->data

This set of patches fixes an proc ->open'less usage due to ->proc_fops flip in
the most part of the kernel code.  The original OOPS is described in the
commit 2d3a4e3666325a9709cc8ea2e88151394e8f20fc:

    Typical PDE creation code looks like:

     pde = create_proc_entry("foo", 0, NULL);
     if (pde)
     pde->proc_fops = &foo_proc_fops;

    Notice that PDE is first created, only then ->proc_fops is set up to
    final value. This is a problem because right after creation
    a) PDE is fully visible in /proc , and
    b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's
       possible to ->read without ->open (see one class of oopses below).

    The fix is new API called proc_create() which makes sure ->proc_fops are
    set up before gluing PDE to main tree. Typical new code looks like:

     pde = proc_create("foo", 0, NULL, &foo_proc_fops);
     if (!pde)
     return -ENOMEM;

    Fix most networking users for a start.

    In the long run, create_proc_entry() for regular files will go.

In addition to this, proc_create_data is introduced to fix reading from
proc without PDE->data. The race is basically the same as above.

create_proc_entries is replaced in the entire kernel code as new method
is also simply better.

This patch:

The problem is the same as for de->proc_fops.  Right now PDE becomes visible
without data set.  So, the entry could be looked up without data.  This, in
most cases, will simply OOPS.

proc_create_data call is created to address this issue.  proc_create now
becomes a wrapper around it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Chris Mason <chris.mason@oracle.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Karsten Keil <kkeil@suse.de>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Pierre Peiffer <peifferp@gmail.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: convert /proc/tty/ldiscs to seq_file interface
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:58 +0000 (01:01 -0700)]
proc: convert /proc/tty/ldiscs to seq_file interface

Note: THIS_MODULE and header addition aren't technically needed because
      this code is not modular, but let's keep it anyway because people
      can copy this code into modular code.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: remove ->get_info infrastructure
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:58 +0000 (01:01 -0700)]
proc: remove ->get_info infrastructure

Now that last dozen or so users of ->get_info were removed, ditch it too.
Everyone sane shouldd have switched to seq_file interface long ago.

P.S.: Co-existing 3 interfaces (->get_info/->read_proc/->proc_fops) for proc
      is long-standing crap, BTW, thus
      a) put ->read_proc/->write_proc/read_proc_entry() users on death row,
      b) new such users should be rejected,
      c) everyone is encouraged to convert his favourite ->read_proc user or
         I'll do it, lazy bastards.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: switch /proc/scsi/device_info to seq_file interface
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:55 +0000 (01:01 -0700)]
proc: switch /proc/scsi/device_info to seq_file interface

Note 1: 0644 should be used, but root bypasses permissions, so writing
to /proc/scsi/device_info still works.
Note 2: looks like scsi_dev_info_list is unprotected
Note 3: probably make proc whine about "unwriteable but with ->write hook"
entries. Probably.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: switch /proc/ip2mem to seq_file interface
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:55 +0000 (01:01 -0700)]
proc: switch /proc/ip2mem to seq_file interface

/******************************************/
/* Remove useless comment, while I am it. */
/******************************************/

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: convert /proc/bus/nubus to seq_file interface
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:54 +0000 (01:01 -0700)]
proc: convert /proc/bus/nubus to seq_file interface

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: switch /proc/irda/irnet to seq_file interface
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:52 +0000 (01:01 -0700)]
proc: switch /proc/irda/irnet to seq_file interface

Probably interface misuse, because of the way iterating over hashbin is done.
However! Printing of socket number ("IrNET socket %d - ", i++") made conversion
to proper ->start/->next difficult enough to do blindly without hardware.
Said that, please apply.

Remove useless comment while I am it.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Samuel Ortiz <samuel@sortiz.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: switch /proc/excite/unit_id to seq_file interface
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:50 +0000 (01:01 -0700)]
proc: switch /proc/excite/unit_id to seq_file interface

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Koeller <thomas.koeller@baslerweb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: switch /proc/bus/ecard/devices to seq_file interface
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:49 +0000 (01:01 -0700)]
proc: switch /proc/bus/ecard/devices to seq_file interface

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Yani Ioannou <yani.ioannou@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: remove /proc/mac_iop
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:47 +0000 (01:01 -0700)]
proc: remove /proc/mac_iop

Entry creation was commented for a long time and right now it stands on
the way of ->get_info removal, so unless nobody objects...

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Simon Arlott <simon@fire.lp0.eu>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Joern Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: switch /proc/apm to seq_file interface
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:46 +0000 (01:01 -0700)]
proc: switch /proc/apm to seq_file interface

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: switch /proc/bus/zorro/devices to seq_file interface
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:45 +0000 (01:01 -0700)]
proc: switch /proc/bus/zorro/devices to seq_file interface

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Josef Sipek <jsipek@fsl.cs.sunysb.edu>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: remove proc_root from drivers
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:44 +0000 (01:01 -0700)]
proc: remove proc_root from drivers

Remove proc_root export.  Creation and removal works well if parent PDE is
supplied as NULL -- it worked always that way.

So, one useless export removed and consistency added, some drivers created
PDEs with &proc_root as parent but removed them as NULL and so on.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: remove proc_root_driver
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:44 +0000 (01:01 -0700)]
proc: remove proc_root_driver

Use creation by full path: "driver/foo".

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: remove proc_root_fs
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:42 +0000 (01:01 -0700)]
proc: remove proc_root_fs

Use creation by full path instead: "fs/foo".

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: remove proc_bus
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:41 +0000 (01:01 -0700)]
proc: remove proc_bus

Remove proc_bus export and variable itself. Using pathnames works fine
and is slightly more understandable and greppable.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: drop several "PDE valid/invalid" checks
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:41 +0000 (01:01 -0700)]
proc: drop several "PDE valid/invalid" checks

proc-misc code is noticeably full of "if (de)" checks when PDE passed is
always valid.  Remove them.

Addition of such check in proc_lookup_de() is for failed lookup case.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: less special case in xlate code
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:40 +0000 (01:01 -0700)]
proc: less special case in xlate code

If valid "parent" is passed to proc_create/remove_proc_entry(), then name of
PDE should consist of only one path component, otherwise creation or or
removal will fail.  However, if NULL is passed as parent then create/remove
accept full path as a argument.  This is arbitrary restriction -- all
infrastructure is in place.

So, patch allows the following to succeed:

create_proc_entry("foo/bar", 0, pde_baz);
remove_proc_entry("baz/foo/bar", &proc_root);

Also makes the following to behave identically:

create_proc_entry("foo/bar", 0, NULL);
create_proc_entry("foo/bar", 0, &proc_root);

Discrepancy noticed by Den Lunev (IIRC).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: simplify locking in remove_proc_entry()
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:39 +0000 (01:01 -0700)]
proc: simplify locking in remove_proc_entry()

proc_subdir_lock protects only modifying and walking through PDE lists, so
after we've found PDE to remove and actually removed it from lists, there is
no need to hold proc_subdir_lock for the rest of operation.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoprocfs: mem permission cleanup
Roland McGrath [Tue, 29 Apr 2008 08:01:38 +0000 (01:01 -0700)]
procfs: mem permission cleanup

This cleans up the permission checks done for /proc/PID/mem i/o calls.  It
puts all the logic in a new function, check_mem_permission().

The old code repeated the (!MAY_PTRACE(task) || !ptrace_may_attach(task))
magical expression multiple times.  The new function does all that work in one
place, with clear comments.

The old code called security_ptrace() twice on successful checks, once in
MAY_PTRACE() and once in __ptrace_may_attach().  Now it's only called once,
and only if all other checks have succeeded.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: switch to proc_create()
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:37 +0000 (01:01 -0700)]
proc: switch to proc_create()

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoprocfs task exe symlink
Matt Helsley [Tue, 29 Apr 2008 08:01:36 +0000 (01:01 -0700)]
procfs task exe symlink

The kernel implements readlink of /proc/pid/exe by getting the file from
the first executable VMA.  Then the path to the file is reconstructed and
reported as the result.

Because of the VMA walk the code is slightly different on nommu systems.
This patch avoids separate /proc/pid/exe code on nommu systems.  Instead of
walking the VMAs to find the first executable file-backed VMA we store a
reference to the exec'd file in the mm_struct.

That reference would prevent the filesystem holding the executable file
from being unmounted even after unmapping the VMAs.  So we track the number
of VM_EXECUTABLE VMAs and drop the new reference when the last one is
unmapped.  This avoids pinning the mounted filesystem.

[akpm@linux-foundation.org: improve comments]
[yamamoto@valinux.co.jp: fix dup_mmap]
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: David Howells <dhowells@redhat.com>
Cc:"Eric W. Biederman" <ebiederm@xmission.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoproc: print more information when removing non-empty directories
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:35 +0000 (01:01 -0700)]
proc: print more information when removing non-empty directories

This usually saves one recompile to insert similar printk like below. :)

Sample nastygram:

remove_proc_entry: removing non-empty directory '/proc/foo', leaking at least 'bar'
------------[ cut here ]------------
WARNING: at fs/proc/generic.c:776 remove_proc_entry+0x18a/0x200()
Modules linked in: foo(-) container fan battery dock sbs ac sbshc backlight ipv6 loop af_packet amd_rng sr_mod i2c_amd8111 i2c_amd756 cdrom i2c_core button thermal processor
Pid: 3034, comm: rmmod Tainted: G   M     2.6.25-rc1 #5

Call Trace:
 [<ffffffff80231974>] warn_on_slowpath+0x64/0x90
 [<ffffffff80232a6e>] printk+0x4e/0x60
 [<ffffffff802d6c8a>] remove_proc_entry+0x18a/0x200
 [<ffffffff8045cd88>] mutex_lock_nested+0x1c8/0x2d0
 [<ffffffff8025f0f0>] __try_stop_module+0x0/0x40
 [<ffffffff8025effd>] sys_delete_module+0x14d/0x200
 [<ffffffff8045df3d>] lockdep_sys_exit_thunk+0x35/0x67
 [<ffffffff8031c307>] __up_read+0x27/0xa0
 [<ffffffff8045decc>] trace_hardirqs_on_thunk+0x35/0x3a
 [<ffffffff8020b6ab>] system_call_after_swapgs+0x7b/0x80

---[ end trace 10ef850597e89c54 ]---

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agokeys: make key_serial() a function if CONFIG_KEYS=y
David Howells [Tue, 29 Apr 2008 08:01:34 +0000 (01:01 -0700)]
keys: make key_serial() a function if CONFIG_KEYS=y

Make key_serial() an inline function rather than a macro if CONFIG_KEYS=y.
This prevents double evaluation of the key pointer and also provides better
type checking.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agokeys: explicitly include required slab.h header file.
Robert P. J. Day [Tue, 29 Apr 2008 08:01:32 +0000 (01:01 -0700)]
keys: explicitly include required slab.h header file.

Since these two source files invoke kmalloc(), they should explicitly
include <linux/slab.h>.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agokeys: make the keyring quotas controllable through /proc/sys
David Howells [Tue, 29 Apr 2008 08:01:32 +0000 (01:01 -0700)]
keys: make the keyring quotas controllable through /proc/sys

Make the keyring quotas controllable through /proc/sys files:

 (*) /proc/sys/kernel/keys/root_maxkeys
     /proc/sys/kernel/keys/root_maxbytes

     Maximum number of keys that root may have and the maximum total number of
     bytes of data that root may have stored in those keys.

 (*) /proc/sys/kernel/keys/maxkeys
     /proc/sys/kernel/keys/maxbytes

     Maximum number of keys that each non-root user may have and the maximum
     total number of bytes of data that each of those users may have stored in
     their keys.

Also increase the quotas as a number of people have been complaining that it's
not big enough.  I'm not sure that it's big enough now either, but on the
other hand, it can now be set in /etc/sysctl.conf.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <kwc@citi.umich.edu>
Cc: <arunsr@cse.iitk.ac.in>
Cc: <dwalsh@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agokeys: don't generate user and user session keyrings unless they're accessed
David Howells [Tue, 29 Apr 2008 08:01:31 +0000 (01:01 -0700)]
keys: don't generate user and user session keyrings unless they're accessed

Don't generate the per-UID user and user session keyrings unless they're
explicitly accessed.  This solves a problem during a login process whereby
set*uid() is called before the SELinux PAM module, resulting in the per-UID
keyrings having the wrong security labels.

This also cures the problem of multiple per-UID keyrings sometimes appearing
due to PAM modules (including pam_keyinit) setuiding and causing user_structs
to come into and go out of existence whilst the session keyring pins the user
keyring.  This is achieved by first searching for extant per-UID keyrings
before inventing new ones.

The serial bound argument is also dropped from find_keyring_by_name() as it's
not currently made use of (setting it to 0 disables the feature).

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <kwc@citi.umich.edu>
Cc: <arunsr@cse.iitk.ac.in>
Cc: <dwalsh@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agokeys: allow clients to set key perms in key_create_or_update()
Arun Raghavan [Tue, 29 Apr 2008 08:01:28 +0000 (01:01 -0700)]
keys: allow clients to set key perms in key_create_or_update()

The key_create_or_update() function provided by the keyring code has a default
set of permissions that are always applied to the key when created.  This
might not be desirable to all clients.

Here's a patch that adds a "perm" parameter to the function to address this,
which can be set to KEY_PERM_UNDEF to revert to the current behaviour.

Signed-off-by: Arun Raghavan <arunsr@cse.iitk.ac.in>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agokeys: switch to proc_create()
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:27 +0000 (01:01 -0700)]
keys: switch to proc_create()

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agokeys: add keyctl function to get a security label
David Howells [Tue, 29 Apr 2008 08:01:26 +0000 (01:01 -0700)]
keys: add keyctl function to get a security label

Add a keyctl() function to get the security label of a key.

The following is added to Documentation/keys.txt:

 (*) Get the LSM security context attached to a key.

long keyctl(KEYCTL_GET_SECURITY, key_serial_t key, char *buffer,
    size_t buflen)

     This function returns a string that represents the LSM security context
     attached to a key in the buffer provided.

     Unless there's an error, it always returns the amount of data it could
     produce, even if that's too big for the buffer, but it won't copy more
     than requested to userspace. If the buffer pointer is NULL then no copy
     will take place.

     A NUL character is included at the end of the string if the buffer is
     sufficiently big.  This is included in the returned count.  If no LSM is
     in force then an empty string will be returned.

     A process must have view permission on the key for this function to be
     successful.

[akpm@linux-foundation.org: declare keyctl_get_security()]
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Paul Moore <paul.moore@hp.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Cc: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agokeys: allow the callout data to be passed as a blob rather than a string
David Howells [Tue, 29 Apr 2008 08:01:24 +0000 (01:01 -0700)]
keys: allow the callout data to be passed as a blob rather than a string

Allow the callout data to be passed as a blob rather than a string for
internal kernel services that call any request_key_*() interface other than
request_key().  request_key() itself still takes a NUL-terminated string.

The functions that change are:

request_key_with_auxdata()
request_key_async()
request_key_async_with_auxdata()

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Paul Moore <paul.moore@hp.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agokeys: check starting keyring as part of search
Kevin Coffman [Tue, 29 Apr 2008 08:01:22 +0000 (01:01 -0700)]
keys: check starting keyring as part of search

Check the starting keyring as part of the search to (a) see if that is what
we're searching for, and (b) to check it is still valid for searching.

The scenario: User in process A does things that cause things to be created in
its process session keyring.  The user then does an su to another user and
starts a new process, B.  The two processes now share the same process session
keyring.

Process B does an NFS access which results in an upcall to gssd.  When gssd
attempts to instantiate the context key (to be linked into the process session
keyring), it is denied access even though it has an authorization key.

The order of calls is:

   keyctl_instantiate_key()
      lookup_user_key()     (the default: case)
         search_process_keyrings(current)
    search_process_keyrings(rka->context)   (recursive call)
       keyring_search_aux()

keyring_search_aux() verifies the keys and keyrings underneath the top-level
keyring it is given, but that top-level keyring is neither fully validated nor
checked to see if it is the thing being searched for.

This patch changes keyring_search_aux() to:
1) do more validation on the top keyring it is given and
2) check whether that top-level keyring is the thing being searched for

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Paul Moore <paul.moore@hp.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Kevin Coffman <kwc@citi.umich.edu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agokeys: increase the payload size when instantiating a key
David Howells [Tue, 29 Apr 2008 08:01:19 +0000 (01:01 -0700)]
keys: increase the payload size when instantiating a key

Increase the size of a payload that can be used to instantiate a key in
add_key() and keyctl_instantiate_key().  This permits huge CIFS SPNEGO blobs
to be passed around.  The limit is raised to 1MB.  If kmalloc() can't allocate
a buffer of sufficient size, vmalloc() will be tried instead.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Paul Moore <paul.moore@hp.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Kevin Coffman <kwc@citi.umich.edu>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoelf: fix shadowed variables in fs/binfmt_elf.c
WANG Cong [Tue, 29 Apr 2008 08:01:18 +0000 (01:01 -0700)]
elf: fix shadowed variables in fs/binfmt_elf.c

Fix these sparse warings:
fs/binfmt_elf.c:1749:29: warning: symbol 'tmp' shadows an earlier one
fs/binfmt_elf.c:1734:28: originally declared here
fs/binfmt_elf.c:2009:26: warning: symbol 'vma' shadows an earlier one
fs/binfmt_elf.c:1892:24: originally declared here

[akpm@linux-foundation.org: chose better variable name]
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoBINFMT: fill_elf_header cleanup - use straight memset first
Cyrill Gorcunov [Tue, 29 Apr 2008 08:01:18 +0000 (01:01 -0700)]
BINFMT: fill_elf_header cleanup - use straight memset first

This patch does simplify fill_elf_header function by setting
to zero the whole elf header first. So we fillup the fields
we really need only.

before:
   text    data     bss     dec     hex filename
  11735      80       0   11815    2e27 fs/binfmt_elf.o

after:
   text    data     bss     dec     hex filename
  11710      80       0   11790    2e0e fs/binfmt_elf.o

viola, 25 bytes of text is freed

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoELF: Use EI_NIDENT instead of numeric value
Cyrill Gorcunov [Tue, 29 Apr 2008 08:01:17 +0000 (01:01 -0700)]
ELF: Use EI_NIDENT instead of numeric value

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoipmi: fix return from atca_oem_poweroff_hook
Adrian Bunk [Tue, 29 Apr 2008 08:01:17 +0000 (01:01 -0700)]
ipmi: fix return from atca_oem_poweroff_hook

A void returning function returned the return value of another void
returning function...

Spotted by sparse.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoipmi: make alloc_recv_msg static
Adrian Bunk [Tue, 29 Apr 2008 08:01:14 +0000 (01:01 -0700)]
ipmi: make alloc_recv_msg static

Make the needlessly global ipmi_alloc_recv_msg() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoipmi: make comment match actual preprocessor check
Robert P. J. Day [Tue, 29 Apr 2008 08:01:14 +0000 (01:01 -0700)]
ipmi: make comment match actual preprocessor check

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoipmi: remove ->write_proc code
Alexey Dobriyan [Tue, 29 Apr 2008 08:01:13 +0000 (01:01 -0700)]
ipmi: remove ->write_proc code

IPMI code theoretically allows ->write_proc users, but nobody uses this thus
far.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoipmi: remove unused target and action in Makefile
Denis Cheng [Tue, 29 Apr 2008 08:01:13 +0000 (01:01 -0700)]
ipmi: remove unused target and action in Makefile

Kbuild system handles this automatically.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoIPMI: Style fixes in the misc code
Corey Minyard [Tue, 29 Apr 2008 08:01:12 +0000 (01:01 -0700)]
IPMI: Style fixes in the misc code

Lots of style fixes for the miscellaneous IPMI files.  No functional
changes.  Basically fixes everything reported by checkpatch and fixes the
comment style.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoIPMI: Style fixes in the system interface code
Corey Minyard [Tue, 29 Apr 2008 08:01:10 +0000 (01:01 -0700)]
IPMI: Style fixes in the system interface code

Lots of style fixes for the IPMI system interface driver.  No functional
changes.  Basically fixes everything reported by checkpatch and fixes the
comment style.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: Rocky Craig <rocky.craig@hp.com>
Cc: Hannes Schulz <schulz@schwaar.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoipmi: style fixes in the base code
Corey Minyard [Tue, 29 Apr 2008 08:01:09 +0000 (01:01 -0700)]
ipmi: style fixes in the base code

Lots of style fixes for the base IPMI driver.  No functional changes.
Basically fixes everything reported by checkpatch and fixes the comment
style.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>