safe/jmp/linux-2.6
15 years agoSUNRPC: Ensure we close the socket on EPIPE errors too...
Trond Myklebust [Wed, 11 Mar 2009 19:29:24 +0000 (15:29 -0400)]
SUNRPC: Ensure we close the socket on EPIPE errors too...

As long as one task is holding the socket lock, then calls to
xprt_force_disconnect(xprt) will not succeed in shutting down the socket.
In particular, this would mean that a server initiated shutdown will not
succeed until the lock is relinquished.
In order to avoid the deadlock, we should ensure that xs_tcp_send_request()
closes the socket on EPIPE errors too.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: xs_tcp_connect_worker{4,6}: merge common code
Trond Myklebust [Wed, 11 Mar 2009 18:38:04 +0000 (14:38 -0400)]
SUNRPC: xs_tcp_connect_worker{4,6}: merge common code

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Add a sysctl to control the duration of the socket linger timeout
Trond Myklebust [Wed, 11 Mar 2009 18:38:03 +0000 (14:38 -0400)]
SUNRPC: Add a sysctl to control the duration of the socket linger timeout

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Add the equivalent of the linger and linger2 timeouts to RPC sockets
Trond Myklebust [Wed, 11 Mar 2009 18:38:03 +0000 (14:38 -0400)]
SUNRPC: Add the equivalent of the linger and linger2 timeouts to RPC sockets

This fixes a regression against FreeBSD servers as reported by Tomas
Kasparek. Apparently when using RPC over a TCP socket, the FreeBSD servers
don't ever react to the client closing the socket, and so commit
e06799f958bf7f9f8fae15f0c6f519953fb0257c (SUNRPC: Use shutdown() instead of
close() when disconnecting a TCP socket) causes the setup to hang forever
whenever the client attempts to close and then reconnect.

We break the deadlock by adding a 'linger2' style timeout to the socket,
after which, the client will abort the connection using a TCP 'RST'.

The default timeout is set to 15 seconds. A subsequent patch will put it
under user control by means of a systctl.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Ensure that xs_nospace return values are propagated
Trond Myklebust [Wed, 11 Mar 2009 18:38:01 +0000 (14:38 -0400)]
SUNRPC: Ensure that xs_nospace return values are propagated

If xs_nospace() finds that the socket has disconnected, it attempts to
return ENOTCONN, however that value is then squashed by the callers.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Delay, then retry on connection errors.
Trond Myklebust [Wed, 11 Mar 2009 18:38:01 +0000 (14:38 -0400)]
SUNRPC: Delay, then retry on connection errors.

Enforce the comment in xs_tcp_connect_worker4/xs_tcp_connect_worker6 that
we should delay, then retry on certain connection errors.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Return EAGAIN instead of ENOTCONN when waking up xprt->pending
Trond Myklebust [Wed, 11 Mar 2009 18:38:00 +0000 (14:38 -0400)]
SUNRPC: Return EAGAIN instead of ENOTCONN when waking up xprt->pending

While we should definitely return socket errors to the task that is
currently trying to send data, there is no need to propagate the same error
to all the other tasks on xprt->pending. Doing so actually slows down
recovery, since it causes more than one tasks to attempt socket recovery.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Handle socket errors correctly
Trond Myklebust [Wed, 11 Mar 2009 18:38:00 +0000 (14:38 -0400)]
SUNRPC: Handle socket errors correctly

Ensure that we pick up and handle socket errors as they occur.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Handle ECONNREFUSED correctly in xprt_transmit()
Trond Myklebust [Wed, 11 Mar 2009 18:37:59 +0000 (14:37 -0400)]
SUNRPC: Handle ECONNREFUSED correctly in xprt_transmit()

If we get an ECONNREFUSED error, we currently go to sleep on the
'xprt->sending' wait queue. The problem is that no timeout is set there,
and there is nothing else that will wake the task up later.

We should deal with ECONNREFUSED in call_status, given that is where we
also deal with -EHOSTDOWN, and friends.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Don't disconnect if a connection is still in progress.
Trond Myklebust [Wed, 11 Mar 2009 18:37:58 +0000 (14:37 -0400)]
SUNRPC: Don't disconnect if a connection is still in progress.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Ensure we set XPRT_CLOSING only after we've sent a tcp FIN...
Trond Myklebust [Wed, 11 Mar 2009 18:37:58 +0000 (14:37 -0400)]
SUNRPC: Ensure we set XPRT_CLOSING only after we've sent a tcp FIN...

...so that we can distinguish between when we need to shutdown and when we
don't. Also remove the call to xs_tcp_shutdown() from xs_tcp_connect(),
since xprt_connect() makes the same test.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Avoid an unnecessary task reschedule on ENOTCONN
Trond Myklebust [Wed, 11 Mar 2009 18:37:57 +0000 (14:37 -0400)]
SUNRPC: Avoid an unnecessary task reschedule on ENOTCONN

If the socket is unconnected, and xprt_transmit() returns ENOTCONN, we
currently give up the lock on the transport channel. Doing so means that
the lock automatically gets assigned to the next task in the xprt->sending
queue, and so that task needs to be woken up to do the actual connect.

The following patch aims to avoid that unnecessary task switch.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFS: load the rpc/rdma transport module automatically
Tom Talpey [Wed, 11 Mar 2009 18:37:56 +0000 (14:37 -0400)]
NFS: load the rpc/rdma transport module automatically

When mounting an NFS/RDMA server with the "-o proto=rdma" or
"-o rdma" options, attempt to dynamically load the necessary
"xprtrdma" client transport module. Doing so improves usability,
while avoiding a static module dependency and any unnecesary
resources.

Signed-off-by: Tom Talpey <tmtalpey@gmail.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: dynamically load RPC transport modules on-demand
Tom Talpey [Wed, 11 Mar 2009 18:37:56 +0000 (14:37 -0400)]
SUNRPC: dynamically load RPC transport modules on-demand

Provide an api to attempt to load any necessary kernel RPC
client transport module automatically. By convention, the
desired module name is "xprt"+"transport name". For example,
when NFS mounting with "-o proto=rdma", attempt to load the
"xprtrdma" module.

Signed-off-by: Tom Talpey <tmtalpey@gmail.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoXPRTRDMA: correct an rpc/rdma inline send marshaling error
Tom Talpey [Wed, 11 Mar 2009 18:37:55 +0000 (14:37 -0400)]
XPRTRDMA: correct an rpc/rdma inline send marshaling error

Certain client rpc's which contain both lengthy page-contained
metadata and a non-empty xdr_tail buffer require careful handling
to avoid overlapped memory copying. Rearranging of existing rpcrdma
marshaling code avoids it; this fixes an NFSv4 symlink creation error
detected with connectathon basic/test8 to multiple servers.

Signed-off-by: Tom Talpey <tmtalpey@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSVCRDMA: remove faulty assertions in rpc/rdma chunk validation.
Tom Talpey [Wed, 11 Mar 2009 18:37:55 +0000 (14:37 -0400)]
SVCRDMA: remove faulty assertions in rpc/rdma chunk validation.

Certain client-provided RPCRDMA chunk alignments result in an
additional scatter/gather entry, which triggered nfs/rdma server
assertions incorrectly. OpenSolaris nfs/rdma client connectathon
testing was blocked by these in the special/locking section.

Signed-off-by: Tom Talpey <tmtalpey@gmail.com>
Cc: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFS: Kill the "defined but not used" compile error on nommu machines
Trond Myklebust [Wed, 11 Mar 2009 18:37:54 +0000 (14:37 -0400)]
NFS: Kill the "defined but not used" compile error on nommu machines

Bryan Wu reports that when compiling NFS on nommu machines he gets a
"defined but not used" error on nfs_file_mmap().

The easiest fix is simply to get rid of the special casing in NFS, and
just always call generic_file_mmap() to set up the file.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFS: Throttle page dirtying while we're flushing to disk
Trond Myklebust [Wed, 11 Mar 2009 18:10:30 +0000 (14:10 -0400)]
NFS: Throttle page dirtying while we're flushing to disk

The following patch is a combination of a patch by myself and Peter
Staubach.

Trond: If we allow other processes to dirty pages while a process is doing
a consistency sync to disk, we can end up never making progress.

Peter: Attached is a patch which addresses a continuing problem with
the NFS client generating out of order WRITE requests.  While
this is compliant with all of the current protocol
specifications, there are servers in the market which can not
handle out of order WRITE requests very well.  Also, this may
lead to sub-optimal block allocations in the underlying file
system on the server.  This may cause the read throughputs to
be reduced when reading the file from the server.

Peter: There has been a lot of work recently done to address out of
order issues on a systemic level.  However, the NFS client is
still susceptible to the problem.  Out of order WRITE
requests can occur when pdflush is in the middle of writing
out pages while the process dirtying the pages calls
generic_file_buffered_write which calls
generic_perform_write which calls
balance_dirty_pages_rate_limited which ends up calling
writeback_inodes which ends up calling back into the NFS
client to writes out dirty pages for the same file that
pdflush happens to be working with.

Signed-off-by: Peter Staubach <staubach@redhat.com>
[modification by Trond to merge the two similar patches]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFS: cleanup - remove struct nfs_inode->ncommit
Trond Myklebust [Wed, 11 Mar 2009 18:10:29 +0000 (14:10 -0400)]
NFS: cleanup - remove struct nfs_inode->ncommit

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFSv4: Simplify some cache consistency post-op GETATTRs
Trond Myklebust [Wed, 11 Mar 2009 18:10:28 +0000 (14:10 -0400)]
NFSv4: Simplify some cache consistency post-op GETATTRs

Certain asynchronous operations such as write() do not expect
(or care) that other metadata such as the file owner, mode, acls, ...
change. All they want to do is update and/or check the change attribute,
ctime, and mtime.
By skipping the file owner and group update, we also avoid having to do a
potential idmapper upcall for these asynchronous RPC calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFSv4: A referral is assumed to always point to a directory.
Trond Myklebust [Wed, 11 Mar 2009 18:10:28 +0000 (14:10 -0400)]
NFSv4: A referral is assumed to always point to a directory.

Fix a bug whereby we would fail to create a mount point for a referral.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFSv4: Make decode_getfattr() set fattr->valid to reflect what was decoded
Trond Myklebust [Wed, 11 Mar 2009 18:10:27 +0000 (14:10 -0400)]
NFSv4: Make decode_getfattr() set fattr->valid to reflect what was decoded

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFSv4: Clean up decode_getfattr()
Trond Myklebust [Wed, 11 Mar 2009 18:10:26 +0000 (14:10 -0400)]
NFSv4: Clean up decode_getfattr()

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFS: Fix the type of struct nfs_fattr->mode
Trond Myklebust [Wed, 11 Mar 2009 18:10:26 +0000 (14:10 -0400)]
NFS: Fix the type of struct nfs_fattr->mode

There is no point in using anything other than umode_t, since we copy the
content pretty much directly into inode->i_mode.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFS: Shrink the struct nfs_fattr
Trond Myklebust [Wed, 11 Mar 2009 18:10:25 +0000 (14:10 -0400)]
NFS: Shrink the struct nfs_fattr

We don't need the bitmap[] field anymore, since the 'valid' field tells us
all we need to know about which attributes were filled in...
Also move the pre-op attributes in order to improve the structure packing.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFSv4: Support NFSv4 optional attributes in the struct nfs_fattr
Trond Myklebust [Wed, 11 Mar 2009 18:10:24 +0000 (14:10 -0400)]
NFSv4: Support NFSv4 optional attributes in the struct nfs_fattr

Currently, filling struct nfs_fattr is more or less an all or nothing
operation, since NFSv2 and NFSv3 have only mandatory attributes.
In NFSv4, some attributes are optional, and so we may simply not be able to
fill in those fields. Furthermore, NFSv4 allows you to specify which
attributes you are interested in retrieving, thus permitting you to
optimise away retrieval of attributes that you know will no change...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFSv4: Ignore errors on the post-op attributes in SETATTR calls
Trond Myklebust [Wed, 11 Mar 2009 18:10:23 +0000 (14:10 -0400)]
NFSv4: Ignore errors on the post-op attributes in SETATTR calls

There is no need to fail or retry a SETATTR call just because the post-op
GETATTR failed.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFS: flush cached directory information slightly more readily.
NeilBrown [Wed, 11 Mar 2009 18:10:23 +0000 (14:10 -0400)]
NFS: flush cached directory information slightly more readily.

If cached directory contents becomes incorrect, there is no way to
flush the contents.  This contrasts with files where file locking is
the recommended way to ensure cache consistency between multiple
applications (a read-lock always flushes the cache).

Also while changes to files often change the size of the file (thus
triggering a cache flush), changes to directories often do not change
the apparent size (as the size is often rounded to a block size).

So it is particularly important with directories to avoid the
possibility of an incorrect cache wherever possible.

When the link count on a directory changes it implies a change in the
number of child directories, and so a change in the contents of this
directory.  So use that as a trigger to flush cached contents.

When the ctime changes but the mtime does not, there are two possible
reasons.
 1/ The owner/mode information has been changed.
 2/ utimes has been used to set the mtime backwards.

In the first case, a data-cache flush is not required.
In the second case it is.

So on the basis that correctness trumps performance, flush the
directory contents cache in this case also.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFS: Minor __nfs_revalidate_inode cleanup
Suresh Jayaraman [Wed, 11 Mar 2009 18:10:22 +0000 (14:10 -0400)]
NFS: Minor __nfs_revalidate_inode cleanup

Remove redundant NFS_STALE() check, a leftover due to the commit
691beb13cdc88358334ef0ba867c080a247a760f

Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Avoid spurious wake-up during UDP connect processing
Chuck Lever [Wed, 11 Mar 2009 18:10:21 +0000 (14:10 -0400)]
SUNRPC: Avoid spurious wake-up during UDP connect processing

To clear out old state, the UDP connect workers unconditionally invoke
xs_close() before proceeding with a new connect.  Nowadays this causes
a spurious wake-up of the task waiting for the connect to complete.

This is a little racey, but usually harmless.  The waiting task
immediately retries the connect via a call_bind/call_connect sequence,
which usually finds the transport already in the connected state
because the connect worker has finished in the background.

To avoid a spurious wake-up, factor the xs_close() logic that resets
the underlying socket into a helper, and have the UDP connect workers
call that helper instead of xs_close().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: xprt_connect() don't abort the task if the transport isn't bound
Trond Myklebust [Wed, 11 Mar 2009 18:09:39 +0000 (14:09 -0400)]
SUNRPC: xprt_connect() don't abort the task if the transport isn't bound

If the transport isn't bound, then we should just return ENOTCONN, letting
call_connect_status() and/or call_status() deal with retrying. Currently,
we appear to abort all pending tasks with an EIO error.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Fix an Oops due to socket not set up yet...
Trond Myklebust [Wed, 11 Mar 2009 18:06:41 +0000 (14:06 -0400)]
SUNRPC: Fix an Oops due to socket not set up yet...

We can Oops in both xs_udp_send_request() and xs_tcp_send_request() if the
call to xs_sendpages() returns an error due to the socket not yet being
set up.
Deal with that situation by returning a new error: ENOTSOCK, so that we
know to avoid dereferencing transport->sock.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoBug 11061, NFS mounts dropped
Ian Dall [Wed, 11 Mar 2009 00:33:22 +0000 (20:33 -0400)]
Bug 11061, NFS mounts dropped

Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=11061

sockaddr structures can't be reliably compared using memcmp() because
there are padding bytes in the structure which can't be guaranteed to
be the same even when the sockaddr structures refer to the same
socket. Instead compare all the relevant fields. In the case of IPv6
sin6_flowinfo is not compared because it only affects QoS and
sin6_scope_id is only compared if the address is "link local" because
"link local" addresses need only be unique to a specific link.

Signed-off-by: Ian Dall <ian@beware.dropbear.id.au>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFS: Handle -ESTALE error in access()
Suresh Jayaraman [Wed, 11 Mar 2009 00:33:21 +0000 (20:33 -0400)]
NFS: Handle -ESTALE error in access()

Hi Trond,

I have been looking at a bugreport where trying to open applications on KDE
on a NFS mounted home fails temporarily. There have been multiple reports on
different kernel versions pointing to this common issue:
http://bugzilla.kernel.org/show_bug.cgi?id=12557
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/269954
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=508866.html

This issue can be reproducible consistently by doing this on a NFS mounted
home (KDE):
1. Open 2 xterm sessions
2. From one of the xterm session, do "ssh -X <remote host>"
3. "stat ~/.Xauthority" on the remote SSH session
4. Close the two xterm sessions
5. On the server do a "stat ~/.Xauthority"
6. Now on the client, try to open xterm
This will fail.

Even if the filehandle had become stale, the NFS client should invalidate
the cache/inode and should repeat LOOKUP. Looking at the packet capture when
the failure occurs shows that there were two subsequent ACCESS() calls with
the same filehandle and both fails with -ESTALE error.

I have tested the fix below. Now the client issue a LOOKUP after the
ACCESS() call fails with -ESTALE. If all this makes sense to you, can you
consider this for inclusion?

Thanks,

If the server returns an -ESTALE error due to stale filehandle in response to
an ACCESS() call, we need to invalidate the cache and inode so that LOOKUP()
can be retried. Without this change, the nfs client retries ACCESS() with the
same filehandle, fails again and could lead to temporary failure of
applications running on nfs mounted home.

Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNLM: Fix GRANT callback address comparison when IPv6 is enabled
Chuck Lever [Wed, 11 Mar 2009 00:33:20 +0000 (20:33 -0400)]
NLM: Fix GRANT callback address comparison when IPv6 is enabled

The NFS mount command may pass an AF_INET server address to lockd.  If
lockd happens to be using a PF_INET6 listener, the nlm_cmp_addr() in
nlmclnt_grant() will fail to match requests from that host because they
will all have a mapped IPv4 AF_INET6 address.

Adopt the same solution used in nfs_sockaddr_match_ipaddr() for NFSv4
callbacks: if either address is AF_INET, map it to an AF_INET6 address
before doing the comparison.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNLM: Shrink the IPv4-only version of nlm_cmp_addr()
Chuck Lever [Wed, 11 Mar 2009 00:33:19 +0000 (20:33 -0400)]
NLM: Shrink the IPv4-only version of nlm_cmp_addr()

Clean up/micro-optimatization:  Make the AF_INET-only version of
nlm_cmp_addr() smaller.  This matches the style of
nlm_privileged_requester(), and makes the AF_INET-only version of
nlm_cmp_addr() nearly the same size as it was before IPv6 support.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFSv3: Fix posix ACL code
Trond Myklebust [Wed, 11 Mar 2009 00:33:18 +0000 (20:33 -0400)]
NFSv3: Fix posix ACL code

Fix a memory leak due to allocation in the XDR layer. In cases where the
RPC call needs to be retransmitted, we end up allocating new pages without
clearing the old ones. Fix this by moving the allocation into
nfs3_proc_setacls().

Also fix an issue discovered by Kevin Rudd, whereby the amount of memory
reserved for the acls in the xdr_buf->head was miscalculated, and causing
corruption.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoNFS: Fix misparsing of nfsv4 fs_locations attribute (take 2)
Trond Myklebust [Wed, 11 Mar 2009 00:33:17 +0000 (20:33 -0400)]
NFS: Fix misparsing of nfsv4 fs_locations attribute (take 2)

The changeset ea31a4437c59219bf3ea946d58984b01a45a289c (nfs: Fix
misparsing of nfsv4 fs_locations attribute) causes the mountpath that is
calculated at the beginning of try_location() to be clobbered when we
later strncpy a non-nul terminated hostname using an incorrect buffer
length.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoSUNRPC: Tighten up the task locking rules in __rpc_execute()
Trond Myklebust [Wed, 11 Mar 2009 00:33:16 +0000 (20:33 -0400)]
SUNRPC: Tighten up the task locking rules in __rpc_execute()

We should probably not be testing any flags after we've cleared the
RPC_TASK_RUNNING flag, since rpc_make_runnable() is then free to assign the
rpc_task to another workqueue, which may then destroy it.

We can fix any races with rpc_make_runnable() by ensuring that we only
clear the RPC_TASK_RUNNING flag while holding the rpc_wait_queue->lock that
the task is supposed to be sleeping on (and then checking whether or not
the task really is sleeping).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
15 years agoi810: fix kernel crash fix when struct fb_var_screeninfo is supplied
Samuel CUELLA [Tue, 10 Mar 2009 19:56:00 +0000 (12:56 -0700)]
i810: fix kernel crash fix when struct fb_var_screeninfo is supplied

Prevent the kernel from being crashed by a divide-by-zero operation when
supplied an incorrectly filled 'struct fb_var_screeninfo' from userland.

Previously i810_main.c:1005 (i810_check_params) was using the global
'yres' symbol previously defined at i810_main.c:145 as a module parameter
value holder (i810_main.c:2174).  If i810fb is compiled-in or if this
param doesn't get a default value, this direct usage leads to a
divide-by-zero at i810_main.c:1005 (i810_check_params).  The patch simply
replace the 'yres' global, perhaps undefined symbol usage by a given
parameter structure lookup.

This problem occurs with directfb, mplayer -vo fbdev, SDL library.
It was also reported ( but non solved ) at:

http://mail.directfb.org/pipermail/directfb-dev/2008-March/004050.html

Signed-off-by: Samuel CUELLA <samuel.cuella@supinfo.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agom68knommu: m528x build fix
Steven King [Tue, 10 Mar 2009 19:55:58 +0000 (12:55 -0700)]
m68knommu: m528x build fix

There isn't any mcfqspi.h in the tree, and without it everything inside the
#ifdef CONFIG_SPI is uncompilable.

Signed-off-by: Steven King <sfking@fdwdc.com>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agom68knommu: m5206e build fix
Steven King [Tue, 10 Mar 2009 19:55:57 +0000 (12:55 -0700)]
m68knommu: m5206e build fix

Signed-off-by: Steven King <sfking@fdwdc.com>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agorcu: documentation 1Q09 update
Paul E. McKenney [Tue, 10 Mar 2009 19:55:57 +0000 (12:55 -0700)]
rcu: documentation 1Q09 update

Update the RCU documentation to call out the need for callers of
primitives like call_rcu() and synchronize_rcu() to prevent subsequent RCU
readers from hazard.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agokernel/user.c: fix a memory leak when freeing up non-init usernamespaces users
Dhaval Giani [Tue, 10 Mar 2009 19:55:56 +0000 (12:55 -0700)]
kernel/user.c: fix a memory leak when freeing up non-init usernamespaces users

We were returning early in the sysfs directory cleanup function if the
user belonged to a non init usernamespace.  Due to this a lot of the
cleanup was not done and we were left with a leak.  Fix the leak.

Reported-by: Serge Hallyn <serue@linux.vnet.ibm.com>
Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Tested-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomtd: physmap: fix NULL pointer dereference in error path
Atsushi Nemoto [Tue, 10 Mar 2009 19:55:55 +0000 (12:55 -0700)]
mtd: physmap: fix NULL pointer dereference in error path

commit e480814f138cd5d78a8efe397756ba6b6518fdb6 ("[MTD] [MAPS] physmap:
fix wrong free and del_mtd_{partition,device}") introduces a NULL pointer
dereference in physmap_flash_remove when called from the error path in
physmap_flash_probe (if map_probe failed).

Call del_mtd_{partition,device} only if info->cmtd was not NULL.

Reported-by: pHilipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agointel-agp: fix a panic with 1M of shared memory, no GTT entries
Lubomir Rintel [Tue, 10 Mar 2009 19:55:54 +0000 (12:55 -0700)]
intel-agp: fix a panic with 1M of shared memory, no GTT entries

When GTT size is equal to amount of video memory, the amount of GTT
entries is computed lower than zero, which is invalid and leads to
off-by-one error in intel_i915_configure()

Originally posted here:
http://bugzilla.kernel.org/show_bug.cgi?id=12539
http://bugzilla.redhat.com/show_bug.cgi?id=445592

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Cc: Lubomir Rintel <lkundrak@v3.sk>
Cc: Dave Airlie <airlied@linux.ie>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomtd_dataflash: fix probing of AT45DB321C chips.
Will Newton [Tue, 10 Mar 2009 19:55:53 +0000 (12:55 -0700)]
mtd_dataflash: fix probing of AT45DB321C chips.

Commit 771999b65f79264acde4b855e5d35696eca5e80c ("[MTD] DataFlash: bugfix,
binary page sizes now handled") broke support for probing AT45DB321C flash
chips.  These chips do not support the "page size" status bit, so if we
match the JEDEC id return early.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Will Newton <will.newton@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoidr: make idr_remove_all() do removal -before- free_layer()
Paul E. McKenney [Tue, 10 Mar 2009 19:55:52 +0000 (12:55 -0700)]
idr: make idr_remove_all() do removal -before- free_layer()

Fix a problem in the IDR system, where an idr_remove_all() hands a data
element to call_rcu() (via free_layer()) before making that data element
inaccessible to new readers.  This is very bad, and results in readers
still having a reference to this data element at the end of the grace
period.

Tests on large machines that concurrently map and unmap user-space memory
within the same multithreaded process result in crashes within about five
minutes.  Applying this patch increases the kernel's longevity to the
three-to-eight-hour range.

There appear to be other similar problems in idr_get_empty_slot() and
sub_remove(), but I fixed the easy one in idr_remove_all() first.  It is
therefore no surprise that failures still occur.

Located-by: Milton Miller II <miltonm@austin.ibm.com>
Tested-by: Milton Miller II <miltonm@austin.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agodevpts: remove graffiti
Alexey Dobriyan [Tue, 10 Mar 2009 19:55:51 +0000 (12:55 -0700)]
devpts: remove graffiti

Very annoying when working with containters.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agox86/agp: tighten check to update amd nb aperture
Yinghai Lu [Tue, 10 Mar 2009 19:55:50 +0000 (12:55 -0700)]
x86/agp: tighten check to update amd nb aperture

Impact: fix bug to make agp work with dri

Jeffrey reported that dri does work with 64bit, but doesn't work with
32bit it turns out NB aperture is 32M, aperture on agp is 128M

64bit is using 64M for vaidation for 64 iommu/gart 32bit is only using
32M..., and will not update the nb aperture.

So try to compare nb apterture and agp apterture before leaving not
touch nb aperture.

Reported-by: Jeffrey Trull <jetrull@sbcglobal.net>
Tested-by: Jeffrey Trull <jetrull@sbcglobal.net>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Dave Airlie <airlied@linux.ie>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoxtensa: fix compilation somewhat
Alexey Dobriyan [Tue, 10 Mar 2009 19:55:49 +0000 (12:55 -0700)]
xtensa: fix compilation somewhat

* ->put_char changes
 * HIGHMEM is bogus it seems, there is no kmap_atomic() et al
 * some includes

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Chris Zankel <zankel@tensilica.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agolm85: add VRM10 support for adt7468 chip
Darrick J. Wong [Tue, 10 Mar 2009 19:55:48 +0000 (12:55 -0700)]
lm85: add VRM10 support for adt7468 chip

The adt7468 chip supports VRM10 sensors just like the adt7463; add a
missing check for it.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agolm85: fix the version check that broke adt7468 probing
Darrick J. Wong [Tue, 10 Mar 2009 19:55:47 +0000 (12:55 -0700)]
lm85: fix the version check that broke adt7468 probing

The verstep check in the lm85 driver fails because the upper nibble of
the version register is 0x7, not 0x6, on the adt7468 chip.  Probing of
all adt7468s was broken by 69fc1feba2d5856ff74dedb6ae9d8c490210825c
("hwmon: (lm85) Rework the device detection"), and this patch fixes
that.  Also add in a missing i2c_device_id that accidentally got dropped
from the original patch.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomenu: fix embedded menu snafu
Randy Dunlap [Tue, 10 Mar 2009 19:55:46 +0000 (12:55 -0700)]
menu: fix embedded menu snafu

The COMPAT_BRK kconfig symbol does not depend on EMBEDDED, but it is in
the midst of the EMBEDDED menu symbols, so it mucks up the EMBEDDED menu.
Fix by moving it to just after all of the EMBEDDED menu symbols.  Also,
ANON_INODES has a similar problem, so move it to just above the EMBEDDED
menu items since it is used in the EMBEDDED menu.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agomm: get_nid_for_pfn() returns int
Roel Kluin [Tue, 10 Mar 2009 19:55:45 +0000 (12:55 -0700)]
mm: get_nid_for_pfn() returns int

get_nid_for_pfn() returns int

Presumably the (nid < 0) case has never happened.

We do know that it is happening on one system while creating a symlink for
a memory section so it should also happen on the same system if
unregister_mem_sect_under_nodes() were called to remove the same symlink.

The test was actually added in response to a problem with an earlier
version reported by Yasunori Goto where one or more of the leading pages
of a memory section on the 2nd node of one of his systems was
uninitialized because I believe they coincided with a memory hole.

That earlier version did not ignore uninitialized pages and determined
the nid by considering only the 1st page of each memory section.  This
caused the symlink to the 1st memory section on the 2nd node to be
incorrectly created in /sys/devices/system/node/node0 instead of
/sys/devices/system/node/node1.  The problem was fixed by adding the
test to skip over uninitialized pages.

I suspect we have not seen any reports of the non-removal
of a symlink due to the incorrect declaration of the nid
variable in unregister_mem_sect_under_nodes() because
  - systems where a memory section could have an uninitialized
    range of leading pages are probably rare.
  - memory remove is probably not done very frequently on the
    systems that are capable of demonstrating the problem.
  - lingering symlink(s) that should have been removed may
    have simply gone unnoticed.

[garyhade@us.ibm.com: wrote changelog]
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 10 Mar 2009 19:03:30 +0000 (12:03 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86 mmiotrace: fix remove_kmmio_fault_pages()

15 years agoMerge branch 'sh/for-2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal...
Linus Torvalds [Tue, 10 Mar 2009 16:31:19 +0000 (09:31 -0700)]
Merge branch 'sh/for-2.6.29' of git://git./linux/kernel/git/lethal/sh-2.6

* 'sh/for-2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  video: deferred io cleanup fix for sh_mobile_lcdcfb
  sh: Add media/soc_camera.h to board setup of Renesas AP325RXA

15 years agovideo: deferred io cleanup fix for sh_mobile_lcdcfb
Magnus Damm [Tue, 10 Mar 2009 06:08:49 +0000 (06:08 +0000)]
video: deferred io cleanup fix for sh_mobile_lcdcfb

Fix deferred io cleanup patch in the sh_mobile_lcdcfb driver.

If probe() fails early the sh_mobile_lcdc_stop() function will
be called to clean up deferred io. This patch modifies the
code to only call fb_deferred_io_cleanup() after deferred io
has been initialized.

With this patch applied we no longer hit BUG_ON() inside
fb_deferred_io_cleanup(). Triggers on a Migo-R with the
SYS QVGA panel board unmounted.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
15 years agosh: Add media/soc_camera.h to board setup of Renesas AP325RXA
Nobuhiro Iwamatsu [Fri, 6 Mar 2009 02:51:14 +0000 (02:51 +0000)]
sh: Add media/soc_camera.h to board setup of Renesas AP325RXA

Other compilation errors were revised by commit of
"sh: ap325rxa: Revert ov772x support"
(08c2f5b4d76f83213e379b12df504269d21c9e7c) but other compilation
errors are given.
We revert this commit and need to add new header(media/soc_camera.h).
This change revises new compilation error.

Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
15 years agoMerge branch 'for-linus' of git://neil.brown.name/md
Linus Torvalds [Tue, 10 Mar 2009 03:50:11 +0000 (20:50 -0700)]
Merge branch 'for-linus' of git://neil.brown.name/md

* 'for-linus' of git://neil.brown.name/md:
  md: fix deadlock when stopping arrays

15 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
Linus Torvalds [Mon, 9 Mar 2009 20:23:59 +0000 (13:23 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/davej/cpufreq

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Add p4-clockmod sysfs-ui removal to feature-removal schedule.
  Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."

15 years agocopy_process: fix CLONE_PARENT && parent_exec_id interaction
Oleg Nesterov [Mon, 2 Mar 2009 21:58:45 +0000 (22:58 +0100)]
copy_process: fix CLONE_PARENT && parent_exec_id interaction

CLONE_PARENT can fool the ->self_exec_id/parent_exec_id logic. If we
re-use the old parent, we must also re-use ->parent_exec_id to make
sure exit_notify() sees the right ->xxx_exec_id's when the CLONE_PARENT'ed
task exits.

Also, move down the "p->parent_exec_id = p->self_exec_id" thing, to place
two different cases together.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years ago[CPUFREQ] Add p4-clockmod sysfs-ui removal to feature-removal schedule.
Dave Jones [Mon, 9 Mar 2009 19:14:37 +0000 (15:14 -0400)]
[CPUFREQ] Add p4-clockmod sysfs-ui removal to feature-removal schedule.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Dave Jones <davej@redhat.com>
15 years agoRevert "[CPUFREQ] Disable sysfs ui for p4-clockmod."
Dave Jones [Mon, 9 Mar 2009 19:07:33 +0000 (15:07 -0400)]
Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."

This reverts commit e088e4c9cdb618675874becb91b2fd581ee707e6.

Removing the sysfs interface for p4-clockmod was flagged as a
regression in bug 12826.

Course of action:
 - Find out the remaining causes of overheating, and fix them
   if possible. ACPI should be doing the right thing automatically.
   If it isn't, we need to fix that.
 - mark p4-clockmod ui as deprecated
 - try again with the removal in six months.

It's not really feasible to printk about the deprecation, because
it needs to happen at all the sysfs entry points, which means adding
a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch.

Signed-off-by: Dave Jones <davej@redhat.com>
15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Mon, 9 Mar 2009 16:15:40 +0000 (09:15 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
  p54: fix race condition in memory management
  cfg80211: test before subtraction on unsigned
  iwlwifi: fix error flow in iwl*_pci_probe
  rt2x00 : more devices to rt73usb.c
  rt2x00 : more devices to rt2500usb.c
  bonding: Fix device passed into ->ndo_neigh_setup().
  vlan: Fix vlan-in-vlan crashes.
  net: Fix missing dev->neigh_setup in register_netdevice().
  tmspci: fix request_irq race
  pkt_sched: act_police: Fix a rate estimator test.
  tg3: Fix 5906 link problems
  SCTP: change sctp_ctl_sock_init() to try IPv4 if IPv6 fails
  IPv6: add "disable" module parameter support to ipv6.ko
  sungem: another error printed one too early
  aoe: error printed 1 too early
  net pcmcia: worklimit reaches -1
  net: more timeouts that reach -1
  net: fix tokenring license
  dm9601: new vendor/product IDs
  netlink: invert error code in netlink_set_err()
  ...

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
Linus Torvalds [Mon, 9 Mar 2009 16:14:17 +0000 (09:14 -0700)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  lguest: fix for CONFIG_SPARSE_IRQ=y
  lguest: fix crash 'unhandled trap 13 at <native_read_msr_safe>'

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
Linus Torvalds [Mon, 9 Mar 2009 16:13:16 +0000 (09:13 -0700)]
Merge git://git./linux/kernel/git/mason/btrfs-unstable

* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: fix spinlock assertions on UP systems

15 years agoBtrfs: fix spinlock assertions on UP systems
Chris Mason [Mon, 9 Mar 2009 15:45:38 +0000 (11:45 -0400)]
Btrfs: fix spinlock assertions on UP systems

btrfs_tree_locked was being used to make sure a given extent_buffer was
properly locked in a few places.  But, it wasn't correct for UP compiled
kernels.

This switches it to using assert_spin_locked instead, and renames it to
btrfs_assert_tree_locked to better reflect how it was really being used.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
15 years agoFix fixpoint divide exception in acct_update_integrals
Heiko Carstens [Mon, 9 Mar 2009 12:31:59 +0000 (13:31 +0100)]
Fix fixpoint divide exception in acct_update_integrals

Frans Pop reported the crash below when running an s390 kernel under Hercules:

  Kernel BUG at 000738b4  verbose debug info unavailable!
  fixpoint divide exception: 0009  #1! SMP
  Modules linked in: nfs lockd nfs_acl sunrpc ctcm fsm tape_34xx
     cu3088 tape ccwgroup tape_class ext3 jbd mbcache dm_mirror dm_log dm_snapshot
     dm_mod dasd_eckd_mod dasd_mod
  CPU: 0 Not tainted 2.6.27.19 #13
  Process awk (pid: 2069, task: 0f9ed9b8, ksp: 0f4f7d18)
  Krnl PSW : 070c1000 800738b4 (acct_update_integrals+0x4c/0x118)
             R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0
  Krnl GPRS: 00000000 000007d0 7fffffff fffff830
             00000000 ffffffff 00000002 0f9ed9b8
             00000000 00008ca0 00000000 0f9ed9b8
             0f9edda4 8007386e 0f4f7ec8 0f4f7e98
  Krnl Code: 800738aaa71807d0         lhi     %r1,2000
             800738ae8c200001         srdl    %r2,1
             800738b2: 1d21             dr      %r2,%r1
            >800738b45810d10e         l       %r1,270(%r13)
             800738b8: 1823             lr      %r2,%r3
             800738ba4130f060         la      %r3,96(%r15)
             800738be: 0de1             basr    %r14,%r1
             800738c05800f060         l       %r0,96(%r15)
  Call Trace:
  ( <000000000004fdea>! blocking_notifier_call_chain+0x1e/0x2c)
    <0000000000038502>! do_exit+0x106/0x7c0
    <0000000000038c36>! do_group_exit+0x7a/0xb4
    <0000000000038c8e>! SyS_exit_group+0x1e/0x30
    <0000000000021c28>! sysc_do_restart+0x12/0x16
    <0000000077e7e924>! 0x77e7e924

Reason for this is that cpu time accounting usually only happens from
interrupt context, but acct_update_integrals gets also called from
process context with interrupts enabled.

So in acct_update_integrals we may end up with the following scenario:

Between reading tsk->stime/tsk->utime and tsk->acct_timexpd an interrupt
happens which updates accouting values.  This causes acct_timexpd to be
greater than the former stime + utime.  The subsequent calculation of

dtime = cputime_sub(time, tsk->acct_timexpd);

will be negative and the division performed by

cputime_to_jiffies(dtime)

will generate an exception since the result won't fit into a 32 bit
register.

In order to fix this just always disable interrupts while accessing any
of the accounting values.

Reported by: Frans Pop <elendil@planet.nl>
Tested by: Frans Pop <elendil@planet.nl>
Cc: stable@kernel.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agolguest: fix for CONFIG_SPARSE_IRQ=y
Rusty Russell [Mon, 9 Mar 2009 16:06:28 +0000 (10:06 -0600)]
lguest: fix for CONFIG_SPARSE_IRQ=y

Impact: remove lots of lguest boot WARN_ON() when CONFIG_SPARSE_IRQ=y

We now need to call irq_to_desc_alloc_cpu() before
set_irq_chip_and_handler_name(), but we can't do that from init_IRQ (no
kmalloc available).

So do it as we use interrupts instead.  Also means we only alloc for
irqs we use, which was the intent of CONFIG_SPARSE_IRQ anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>
15 years agolguest: fix crash 'unhandled trap 13 at <native_read_msr_safe>'
Rusty Russell [Mon, 9 Mar 2009 16:06:22 +0000 (10:06 -0600)]
lguest: fix crash 'unhandled trap 13 at <native_read_msr_safe>'

Impact: fix lguest boot crash on modern Intel machines

The code in early_init_intel does:

if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
u64 misc_enable;

rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);

And that rdmsr faults (not allowed from non-0 PL).  We can get around
this by mugging the family ID part of the cpuid.  5 seems like a good
number.

Of course, this is a hack (how very lguest!).  We could just indicate
that we don't support MSRs, or implement lguest_rdmst.

Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Patrick McHardy <kaber@trash.net>
15 years agox86 mmiotrace: fix remove_kmmio_fault_pages()
Stuart Bennett [Sun, 8 Mar 2009 18:21:35 +0000 (20:21 +0200)]
x86 mmiotrace: fix remove_kmmio_fault_pages()

Impact: fix race+crash in mmiotrace

The list manipulation in remove_kmmio_fault_pages() was broken. If more
than one consecutive kmmio_fault_page was re-added during the grace
period between unregister_kmmio_probe() and remove_kmmio_fault_pages(),
the list manipulation failed to remove pages from the release list.

After a second grace period the pages get into rcu_free_kmmio_fault_pages()
and raise a BUG_ON() kernel crash.

The list manipulation is fixed to properly remove pages from the release
list.

This bug has been present from the very beginning of mmiotrace in the
mainline kernel. It was introduced in 0fd0e3da ("x86: mmiotrace full
patch, preview 1");

An urgent fix for Linus. Tested by Stuart (on 32-bit) and Pekka
(on amd and intel 64-bit systems, nouveau and nvidia proprietary).

Signed-off-by: Stuart Bennett <stuart@freedesktop.org>
Signed-off-by: Pekka Paalanen <pq@iki.fi>
LKML-Reference: <20090308202135.34933feb@daedalus.pq.iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
Linus Torvalds [Sun, 8 Mar 2009 17:37:57 +0000 (10:37 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/drzeus/mmc

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
  mmc: fix data timeout for SEND_EXT_CSD

15 years agoMerge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 8 Mar 2009 17:30:18 +0000 (10:30 -0700)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rcu: increment quiescent state counter in ksoftirqd()

15 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 8 Mar 2009 17:27:13 +0000 (10:27 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, pebs: correct qualifier passed to ds_write_config() from ds_request_pebs()
  x86, bts: remove bad warning
  x86: add Dell XPS710 reboot quirk
  x86, math-emu: fix init_fpu for task != current
  x86: EFI: Back efi_ioremap with init_memory_mapping instead of FIX_MAP
  x86: fix DMI on EFI

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog
Linus Torvalds [Sun, 8 Mar 2009 17:25:13 +0000 (10:25 -0700)]
Merge git://git./linux/kernel/git/wim/linux-2.6-watchdog

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  [WATCHDOG] orion5x_wdt.c: 'ORION5X_TCLK' undeclared
  [WATCHDOG] gef_wdt.c: fsl_get_sys_freq() failure not noticed
  [WATCHDOG] ks8695_wdt.c: 'CLOCK_TICK_RATE' undeclared
  [WATCHDOG] rc32434_wdt: fix sections
  [WATCHDOG] rc32434_wdt: fix watchdog driver

15 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Linus Torvalds [Sun, 8 Mar 2009 17:24:57 +0000 (10:24 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/tytso/ext4

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix ext4_free_inode() vs. ext4_claim_inode() race

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney...
Linus Torvalds [Sun, 8 Mar 2009 17:24:39 +0000 (10:24 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/cooloney/blackfin-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (28 commits)
  Blackfin arch: SPI_MMC is now mainlined MMC_SPI
  Blackfin arch: disable legacy /proc/scsi/ support by default
  Blackfin arch: remove duplicated ANOMALY_05000448 ifdef check
  Blackfin arch: add stubs for anomalies 447 and 448
  Blackfin arch: cleanup bfin_sport.h header and export it to userspace
  Blackfin arch: fix bug - gdb signull case make trunk kernel panic frequently
  Blackfin arch: remove spurious dash when dcache is off
  Blackfin arch: mark init_pda as __init as only __init funcs all it
  Blackfin arch: fix bug - On bf548-ezkit, ethernet fails to work after wakeup from "mem"
  Blackfin arch: Random read/write errors are a bad thing
  Blackfin arch: update default kernel config, select KSZ8893M driver for BF518
  Blackfin arch: Fix bug - KGDB single step into the middle of a 4 bytes instruction on bf561 after soft bp is hit
  Blackfin arch: Fix bug - make ksz8893m driver available when bfin_mac is enabled
  Blackfin arch: make sure people do not set the kernel load address too high
  Blackfin arch: fix bug - The SPORT_HYS bit is not set for BF561 0.5
  Blackfin arch: update anomaly sheets to match latest public info
  Blackfin arch: Fix BUG - kernel fails to build in pm.c when allow wakeup fromi standby by GPIO
  Blackfin arch: PM_BFIN_WAKE_GP: update help
  Blackfin arch: fix bug - kgdb fails to continue after setting breakpoint on bf561-ezkit kernel with smp patch
  Blackfin arch: Enable Write Back Cache on all Blackfin Boards
  ...

15 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
Linus Torvalds [Sun, 8 Mar 2009 17:23:05 +0000 (10:23 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/djbw/async_tx

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
  dmatest: fix use after free in dmatest_exit
  ipu_idmac: fix spinlock type
  iop-adma, mv_xor: fix mem leak on self-test setup failure
  fsldma: fix off by one in dma_halt
  I/OAT: fail self-test if callback test reaches timeout
  I/OAT: update driver version and copyright dates
  I/OAT: list usage cleanup
  I/OAT: set tcp_dma_copybreak to 256k for I/OAT ver.3
  I/OAT: cancel watchdog before dma remove
  I/OAT: fail initialization on zero channels detection
  I/OAT: do not set DCACTRL_CMPL_WRITE_ENABLE for I/OAT ver.3
  I/OAT: add verification for proper APICID_TAG_MAP setting by BIOS
  dmaengine: update kerneldoc

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
Linus Torvalds [Sun, 8 Mar 2009 17:22:22 +0000 (10:22 -0700)]
Merge git://git./linux/kernel/git/bart/ide-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ata: add CFA specific identify data words
  remove stale comment from <linux/hdreg.h>
  AT91: initialize Compact Flash on AT91SAM9263 cpu
  ide: add at91_ide driver
  ide: allow to wrap interrupt handler
  ide-iops: fix odd-length ATAPI PIO transfers
  ide: NULL noise: drivers/ide/ide-*.c
  ide: expiry() returns int, negative expiry() return values won't be noticed

15 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzi...
Linus Torvalds [Sun, 8 Mar 2009 17:22:01 +0000 (10:22 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: Don't trust current capacity values in identify words 57-58
  libata: make sure port is thawed when skipping resets
  sata_nv: fix module parameter description
  ahci: Add the Device IDs for MCP89 and remove IDs of MCP7B to/from ahci.c
  libata: don't use on-stack sense buffer
  libata: align ap->sector_buf
  libata: fix dma_unmap_sg misuse
  libata: change drive ready wait after hard reset to 5s

15 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus
Linus Torvalds [Sun, 8 Mar 2009 17:21:31 +0000 (10:21 -0700)]
Merge git://git./linux/kernel/git/pkl/squashfs-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
  Squashfs: frag_size should be signed, as it can hold an error result
  Squashfs: fix documentation typo, Cramfs filesystem limit is 256 MiB
  Squashfs: Fix oops when reading fsfuzzer corrupted filesystems

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Sun, 8 Mar 2009 17:21:10 +0000 (10:21 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  smack: fixes for unlabeled host support

15 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Sun, 8 Mar 2009 17:14:19 +0000 (10:14 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: serio - fix protocol number for TouchIT213

15 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
Linus Torvalds [Sun, 8 Mar 2009 17:13:28 +0000 (10:13 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/aegl/linux-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] fix PCI DMA flag propagation on SN (Altix) with PICs

15 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
Linus Torvalds [Sun, 8 Mar 2009 17:08:57 +0000 (10:08 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: fix missing bio back/front segment size setting in blk_recount_segments()
  loop: don't increment p->offset with (size_t) -EINVAL
  cciss: remove 30 second initial timeout on controller reset
  Fix kernel NULL pointer dereference in xen-blkfront

15 years agoMerge branch 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
Linus Torvalds [Sun, 8 Mar 2009 17:03:31 +0000 (10:03 -0700)]
Merge branch 'fix/hda' of git://git./linux/kernel/git/tiwai/sound-2.6

* 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: hda - Fix headphone-detect regression with multiple HP jacks
  ALSA: hda - Fix typos in slave controls in patch_sigmatel.c

15 years agoMIPS: compat: Implement is_compat_task.
Ralf Baechle [Thu, 5 Mar 2009 10:45:48 +0000 (11:45 +0100)]
MIPS: compat: Implement is_compat_task.

This is a build fix required after "x86-64: seccomp: fix 32/64 syscall
hole" (commit 5b1017404aea6d2e552e991b3fd814d839e9cd67).  MIPS doesn't
have the issue that was fixed for x86-64 by that patch.

This also doesn't solve the N32 issue which is that N32 seccomp processes
will be treated as non-compat processes thus only have access to N64
syscalls.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
15 years agommc: fix data timeout for SEND_EXT_CSD
Adrian Hunter [Tue, 10 Feb 2009 14:32:33 +0000 (16:32 +0200)]
mmc: fix data timeout for SEND_EXT_CSD

Commit 0d3e0460f307e84904968aad6cff97bd688583d8
"MMC: CSD and CID timeout values" inadvertently broke
the timeout for the MMC command SEND_EXT_CSD.

This patch puts it back again.

Depending on the characteristics of the controller,
this bug may prevent the use of MMC cards.

Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
15 years agoInput: serio - fix protocol number for TouchIT213
Dmitry Torokhov [Sat, 7 Mar 2009 21:39:22 +0000 (13:39 -0800)]
Input: serio - fix protocol number for TouchIT213

Protocol 0x37 has been reserved for iNexio devices and Sahara
was supposed to get 0x38.

Reported-by: Claudio Nieder <private@claudio.ch>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
15 years agop54: fix race condition in memory management
Christian Lamparter [Thu, 5 Mar 2009 23:53:59 +0000 (00:53 +0100)]
p54: fix race condition in memory management

This patch fixes a number of race conditions in the driver.
Up until now, "entry" pointer was initialized before acquiring the right lock.

Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agocfg80211: test before subtraction on unsigned
Roel Kluin [Tue, 3 Mar 2009 21:55:21 +0000 (22:55 +0100)]
cfg80211: test before subtraction on unsigned

freq_diff is unsigned, so test before subtraction

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years ago[IA64] fix PCI DMA flag propagation on SN (Altix) with PICs
Jeremy Higdon [Wed, 4 Mar 2009 20:09:46 +0000 (12:09 -0800)]
[IA64] fix PCI DMA flag propagation on SN (Altix) with PICs

We recently discovered a problem with passing of DMA attributes on SN
systems with the older PIC chips.

[akpm@linux-foundation.org: coding-style fixes]

Signed-off-by: Jeremy Higdon <jeremy@sgi.com>
Cc: <habeck@sgi.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
15 years agox86, pebs: correct qualifier passed to ds_write_config() from ds_request_pebs()
Markus Metzger [Thu, 5 Mar 2009 07:57:21 +0000 (08:57 +0100)]
x86, pebs: correct qualifier passed to ds_write_config() from ds_request_pebs()

ds_write_config() can write the BTS as well as the PEBS part of
the DS config. ds_request_pebs() passes the wrong qualifier, which
results in the wrong configuration to be written.

Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090305085721.A22550@sedona.ch.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agox86, bts: remove bad warning
Markus Metzger [Thu, 5 Mar 2009 07:49:54 +0000 (08:49 +0100)]
x86, bts: remove bad warning

In case a ptraced task is reaped (while the tracer is still attached),
ds_exit_thread() is called before ptrace_exit(). The latter will
release the bts_tracer and remove the thread's ds_ctx.
The former will WARN() if the context is not NULL.

Oleg Nesterov submitted patches that move ptrace_exit() before
exit_thread() and thus reverse the order of the above calls.

Remove the bad warning. I will add it again when Oleg's changes are in.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090305084954.A22000@sedona.ch.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
15 years agoALSA: hda - Fix headphone-detect regression with multiple HP jacks
Takashi Iwai [Fri, 6 Mar 2009 08:43:58 +0000 (09:43 +0100)]
ALSA: hda - Fix headphone-detect regression with multiple HP jacks

The recent changes over the DAC detection mechanism in patch_sigmatel.c
breaks the HP detection on the machines with multiple HP jacks.
It's basically because of the workaround to support the multi-channel
output.  Since the HP detection is more important feature, disable
the HP-swap workaroud temporarily.

Reference: Novell bnc#482052
https://bugzilla.novell.com/show_bug.cgi?id=482052

Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoALSA: hda - Fix typos in slave controls in patch_sigmatel.c
Takashi Iwai [Fri, 6 Mar 2009 08:42:07 +0000 (09:42 +0100)]
ALSA: hda - Fix typos in slave controls in patch_sigmatel.c

"Headphone Playback ..." appears twice in slave_vols[] and slave_sws[].
They should be "Headphone Playback2 ..."

Signed-off-by: Takashi Iwai <tiwai@suse.de>
15 years agoblock: fix missing bio back/front segment size setting in blk_recount_segments()
Jens Axboe [Fri, 6 Mar 2009 07:55:24 +0000 (08:55 +0100)]
block: fix missing bio back/front segment size setting in blk_recount_segments()

Commit 1e42807918d17e8c93bf14fbb74be84b141334c1 introduced a bug where we
don't get front/back segment sizes in the bio in blk_recount_segments().
Fix this by tracking the back bio as well as the front bio in
__blk_recalc_rq_segments(), this also cleans up the interface by getting
rid of the segment size pointer passing.

Tested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
15 years ago[WATCHDOG] orion5x_wdt.c: 'ORION5X_TCLK' undeclared
Wim Van Sebroeck [Wed, 25 Feb 2009 12:33:01 +0000 (13:33 +0100)]
[WATCHDOG] orion5x_wdt.c: 'ORION5X_TCLK' undeclared

orion5x_wdt no longer compiled after the changes in commit
ebe35aff883496c07248df82c8576c3b6e84bbbe ("Orion: prepare for
runtime-determined timer tick rate").

The tick rate define (ORION5X_TCLK) was removed in favor of a runtime
detection. The quick fix is to add the define in the watchdog driver.
The fix is not correct for all supported orion5x platforms, but since
the supported platforms right now are 133 Mhz and 166 Mhz, it won't
be _that_ far off. ;-) A fix that uses the runtime-determined timer
tick rate will be applied later.

Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: Kristof Provost <kristof@sigsegv.be>
Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Nicolas Pitre <nico@cam.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
15 years agoiwlwifi: fix error flow in iwl*_pci_probe
Reinette Chatre [Tue, 3 Mar 2009 19:37:04 +0000 (11:37 -0800)]
iwlwifi: fix error flow in iwl*_pci_probe

Both the agn and 3945 drivers has some problems with dealing with
errors in their probe functions. Ensure that a goto will undo only
things that was done before the goto was called.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>