Home Home > GIT Browse
AgeCommit message (Collapse)Author
2008-04-18Linux Wright
2008-04-18locks: fix possible infinite loop in fcntl(F_SETLKW) over nfsJ. Bruce Fields
upstream commit: 19e729a928172103e101ffd0829fd13e68c13f78 Miklos Szeredi found the bug: "Basically what happens is that on the server nlm_fopen() calls nfsd_open() which returns -EACCES, to which nlm_fopen() returns NLM_LCK_DENIED. "On the client this will turn into a -EAGAIN (nlm_stat_to_errno()), which in will cause fcntl_setlk() to retry forever." So, for example, opening a file on an nfs filesystem, changing permissions to forbid further access, then trying to lock the file, could result in an infinite loop. And Trond Myklebust identified the culprit, from Marc Eshel and I: 7723ec9777d9832849b76475b1a21a2872a40d20 "locks: factor out generic/filesystem switch from setlock code" That commit claimed to just be reshuffling code, but actually introduced a behavioral change by calling the lock method repeatedly as long as it returned -EAGAIN. We assumed this would be safe, since we assumed a lock of type SETLKW would only return with either success or an error other than -EAGAIN. However, nfs does can in fact return -EAGAIN in this situation, and independently of whether that behavior is correct or not, we don't actually need this change, and it seems far safer not to depend on such assumptions about the filesystem's ->lock method. Therefore, revert the problematic part of the original commit. This leaves vfs_lock_file() and its other callers unchanged, while returning fcntl_setlk and fcntl_setlk64 to their former behavior. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Tested-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18file capabilities: remove cap_task_kill()Serge Hallyn
upstream commit: aedb60a67c10a0861af179725d060765262ba0fb The original justification for cap_task_kill() was as follows: check_kill_permission() does appropriate uid equivalence checks. However with file capabilities it becomes possible for an unprivileged user to execute a file with file capabilities resulting in a more privileged task with the same uid. However now that cap_task_kill() always returns 0 (permission granted) when p->uid==current->uid, the whole hook is worthless, and only likely to create more subtle problems in the corner cases where it might still be called but return -EPERM. Those cases are basically when uids are different but euid/suid is equivalent as per the check in check_kill_permission(). One example of a still-broken application is 'at' for non-root users. This patch removes cap_task_kill(). Signed-off-by: Serge Hallyn <serge@hallyn.com> Acked-by: Andrew G. Morgan <morgan@kernel.org> Earlier-version-tested-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [chrisw@sous-sol.org: backport to] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18macb: Call phy_disconnect on removingAtsushi Nemoto
upstream commit: 84b7901f8d5a17536ef2df7fd628ab865df8fe3a Call phy_disconnect() on remove routine. Otherwise the phy timer causes a kernel crash when unloading. Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18fbdev: fix /proc/fb oops after module removalAlexey Dobriyan
upstream commit: c43f89c2084f46e3ec59ddcbc52ecf4b1e9b015a /proc/fb is not removed during rmmod. Steps to reproduce: modprobe fb rmmod fb ls /proc BUG: unable to handle kernel paging request at ffffffffa0094370 IP: [<ffffffff802b92a1>] proc_get_inode+0x101/0x130 PGD 203067 PUD 207063 PMD 17e758067 PTE 0 Oops: 0000 [1] SMP last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:05:02.0/resource CPU 1 Modules linked in: nf_conntrack_irc xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack ip_tables x_tables vfat fat usbhid ehci_hcd uhci_hcd usbcore sr_mod cdrom [last unloaded: fb] Pid: 21205, comm: ls Not tainted 2.6.25-rc8-mm2 #14 RIP: 0010:[<ffffffff802b92a1>] [<ffffffff802b92a1>] proc_get_inode+0x101/0x130 RSP: 0018:ffff81017c4bfc78 EFLAGS: 00010246 RAX: 0000000000008000 RBX: ffff8101787f5470 RCX: 0000000048011ccc RDX: ffffffffa0094320 RSI: ffff810006ad43b0 RDI: ffff81017fc2cc00 RBP: ffff81017e450300 R08: 0000000000000002 R09: ffff81017c5d1000 R10: 0000000000000000 R11: 0000000000000246 R12: ffff81016b903a28 R13: ffff81017f822020 R14: ffff81017c4bfd58 R15: ffff81017f822020 FS: 00007f08e71696f0(0000) GS:ffff81017fc06480(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffffffffa0094370 CR3: 000000017e54a000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process ls (pid: 21205, threadinfo ffff81017c4be000, task ffff81017de48770) Stack: ffff81017c5d1000 00000000ffffffea ffff81017e450300 ffffffff802bdd1e ffff81017f802258 ffff81017c4bfe48 ffff81016b903a28 ffff81017f822020 ffff81017c4bfd48 ffffffff802b9ba0 ffff81016b903a28 ffff81017f802258 Call Trace: [<ffffffff802bdd1e>] ? proc_lookup_de+0x8e/0x100 [<ffffffff802b9ba0>] ? proc_root_lookup+0x20/0x60 [<ffffffff802882a7>] ? do_lookup+0x1b7/0x210 [<ffffffff8028883d>] ? __link_path_walk+0x53d/0x7f0 [<ffffffff80295eb8>] ? mntput_no_expire+0x28/0x130 [<ffffffff80288b4a>] ? path_walk+0x5a/0xc0 [<ffffffff80288dd3>] ? do_path_lookup+0x83/0x1c0 [<ffffffff80287785>] ? getname+0xe5/0x210 [<ffffffff80289adb>] ? __user_walk_fd+0x4b/0x80 [<ffffffff8028236c>] ? vfs_lstat_fd+0x2c/0x70 [<ffffffff8028bf1e>] ? filldir+0xae/0xf0 [<ffffffff802b92e9>] ? de_put+0x9/0x50 [<ffffffff8029633d>] ? mnt_want_write+0x2d/0x80 [<ffffffff8029339f>] ? touch_atime+0x1f/0x170 [<ffffffff802b9b1d>] ? proc_root_readdir+0x7d/0xa0 [<ffffffff802825e7>] ? sys_newlstat+0x27/0x50 [<ffffffff8028bffb>] ? vfs_readdir+0x9b/0xd0 [<ffffffff8028c0fe>] ? sys_getdents+0xce/0xe0 [<ffffffff8020b39b>] ? system_call_after_swapgs+0x7b/0x80 Code: b7 83 b2 00 00 00 25 00 f0 00 00 3d 00 80 00 00 74 19 48 89 93 f0 00 00 00 48 89 df e8 39 9a fd ff 48 89 d8 48 83 c4 08 5b 5d c3 <48> 83 7a 50 00 48 c7 c0 60 16 45 80 48 c7 c2 40 17 45 80 48 0f RIP [<ffffffff802b92a1>] proc_get_inode+0x101/0x130 RSP <ffff81017c4bfc78> CR2: ffffffffa0094370 ---[ end trace c71hiarjan8ab739 ]--- Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> "Antonino A. Daplas" <adaplas@pol.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18acpi: bus: check once more for an empty list after locking itChuck Ebbert
upstream commit: f0a37e008750ead1751b7d5e89d220a260a46147 List could have become empty after the unlocked check that was made earlier, so check again inside the lock. Should fix https://bugzilla.redhat.com/show_bug.cgi?id=427765 Signed-off-by: Chuck Ebbert <cebbert@redhat.com> Cc: <stable@kernel.org> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18PARISC fix signal trampoline cache flushingKyle McMartin
upstream commit: cf39cc3b56bc4a562db6242d3069f65034ec7549 The signal trampolines were accidently flushing the kernel I$ instead of the users. Fix that up, and also add a missing user D$ flush while we're at it. Signed-off-by: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18PARISC pdc_console: fix bizarre panic on bootKyle McMartin
upstream commit ef1afd4d79f0479960ff36bb5fe6ec6eba1ebff2 commit 721fdf34167580ff98263c74cead8871d76936e6 Author: Kyle McMartin <kyle@shortfin.cabal.ca> Date: Thu Dec 6 09:32:15 2007 -0800 [PARISC] print more than one character at a time for pdc console introduced a subtle bug by accidentally removing the "static" from iodc_dbuf. This resulted in, what appeared to be, a trap without *current set to a task. Probably the result of a trap in real mode while calling firmware. Also do other misc clean ups. Since the only input from firmware is non blocking, share iodc_dbuf between input and output, and spinlock the only callers. [jejb: fixed up rejections against the stable tree] Signed-off-by: Kyle McMartin <kyle@parisc-linux.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18PARISC futex: special case cmpxchg NULL in kernel spaceKyle McMartin
upstream commit: c20a84c91048c76c1379011c96b1a5cee5c7d9a0 commit f9e77acd4060fefbb60a351cdb8d30fca27fe194 Author: Thomas Gleixner <tglx@linutronix.de> Date: Sun Feb 24 02:10:05 2008 +0000 futex: runtime enable pi and robust functionality which was backported to stable based on mainline Commit a0c1e9073ef7428a14309cba010633a6cd6719ea added code to futex.c to detect whether futex_atomic_cmpxchg_inatomic was implemented at run time: + curval = cmpxchg_futex_value_locked(NULL, 0, 0); + if (curval == -EFAULT) + futex_cmpxchg_enabled = 1; This is bogus on parisc, since page zero in kernel virtual space is the gateway page for syscall entry, and should not be read from the kernel. (That, and we really don't like the kernel faulting on its own address space...) Signed-off-by: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18pnpacpi: reduce printk severity for "pnpacpi: exceeded the max number of ..."Len Brown
upstream commit 33fd7afd66ffdc6addf1b085fe6403b6af532f8e We have been printing these messages at KERN_ERR since 2.6.24, per http://bugzilla.kernel.org/show_bug.cgi?id=9535 But KERN_ERR pops up on a console booted with "quiet" and causes users to get alarmed and file bugs about the message itself: https://bugzilla.redhat.com/show_bug.cgi?id=436589 So reduce the severity of these messages to KERN_WARNING, which is not printed by "quiet". This message will still be seen without "quiet", but a lot of messages are printed in that mode and it will be less likely to cause undue alarm. We could go all the way to KERN_DEBUG, but this is a real warning after all, so it seems prudent not to require "debug" to see it. Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18POWERPC: Fix build of modular drivers/macintosh/apm_emu.cGuido Guenther
upstream commit: 620a245978d007279bc5c7c64e15f5f63af9af98 Currently, if drivers/macintosh/apm_emu is a module and the config doesn't have CONFIG_SUSPEND we get: ERROR: "pmu_batteries" [drivers/macintosh/apm_emu.ko] undefined! ERROR: "pmu_battery_count" [drivers/macintosh/apm_emu.ko] undefined! ERROR: "pmu_power_flags" [drivers/macintosh/apm_emu.ko] undefined! on PPC32. The variables aren't wrapped in '#if defined(CONFIG_SUSPEND)' so we probably shouldn't wrap the exports either. This removes the CONFIG_SUSPEND part of the export, which fixes compilation on ppc32. Signed-off-by: Guido Guenther <agx@sigxcpu.org> Signed-off-by: Paul Mackerras <paulus@samba.org> mpagano@gentoo.org notes: The details can be found at http://bugs.gentoo.org/show_bug.cgi?id=217629. Cc: Mike Pagano <mpagano@gentoo.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18md: close a livelock window in handle_parity_checks5Dan Williams
upstream commit: bd2ab67030e9116f1e4aae1289220255412b37fd If a failure is detected after a parity check operation has been initiated, but before it completes handle_parity_checks5 will never quiesce operations on the stripe. Explicitly handle this case by "canceling" the parity check, i.e. clear the STRIPE_OP_CHECK flags and queue the stripe on the handle list again to refresh any non-uptodate blocks. Kernel versions >= 2.6.23 are susceptible. Cc: <stable@kernel.org> Cc: NeilBrown <neilb@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18signalfd: fix for incorrect SI_QUEUE user data reportingDavide Libenzi
upstream commit: 0859ab59a8a48d2a96b9d2b7100889bcb6bb5818 Michael Kerrisk found out that signalfd was not reporting back user data pushed using sigqueue: http://groups.google.com/group/linux.kernel/msg/9397cab8551e3123 The following patch makes signalfd report back the ssi_ptr and ssi_int members of the signalfd_siginfo structure. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Acked-by: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18plip: replace spin_lock_irq with spin_lock_irqsave in irq contextMikulas Patocka
upstream commit: cabce28ec0a0ae3d0ddfa4461f0e8be94ade9e46 Plip uses spin_lock_irq/spin_unlock_irq in its IRQ handler (called from parport IRQ handler), the latter enables interrupts without parport subsystem IRQ handler expecting it. The bug can be seen if you compile kernel with lock dependency checking and use plip --- it produces a warning. This patch changes it to spin_lock_irqsave/spin_lock_irqrestore, so that it doesn't enable interrupts when already disabled. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18acpi: fix "buggy BIOS check" when CPUs are hot removedAlok Kataria
upstream commit: ba62b077871a5255e271f4fdae57167651839277 Fixes a BUG in ACPI hotplugging. processor_device_array[pr->id] needs to be set to NULL when removing a CPU. Else the "buggy BIOS check" in acpi_processor_start mistakenly fires when a CPU is removed from the system and then later re-added. Signed-off-by: Alok N Kataria <akataria@vmware.com> Signed-off-by: Dan Arai <arai@vmware.com> Cc: Len Brown <lenb@kernel.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18HFS+: fix unlink of linksRoman Zippel
upstream commit: 76b0c26af2736b7e5b87e6ed7ab63901483d5736 Some time ago while attempting to handle invalid link counts, I botched the unlink of links itself, so this patch fixes this now correctly, so that only the link count of nodes that don't point to links is ignored. Thanks to Vlado Plaga <rechner@vlado-do.de> to notify me of this problem. Signed-off-by: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18DVB: tda10086: make the 22kHz tone for DISEQC a config optionHartmut Hackmann
(backported from commit ea75baf4b0f117564bd50827a49c4b14d61d24e9) Some cards need the diseqc signal modulated, while some just need the envelope to control the LNB supply. This fixes Bug 9887 Signed-off-by: Hartmut Hackmann <hartmut.hackmann@t-online.de> Acked-by: Oliver Endriss <o.endriss@gmx.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Hermann Pitton <hermann-pitton@arcor.de> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18SPARC64: Fix FPU saving in 64-bit signal handling.David S. Miller
Upstream commit: 7c3cce978e4f933ac13758ec5d2554fc8d0927d2 The calculation of the FPU reg save area pointer was wrong. Based upon an OOPS report from Tom Callaway. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18bluetooth: hci_core: defer hci_unregister_sysfs()Dave Young
upstream commit: 147e2d59833e994cc99341806a88b9e59be41391 Alon Bar-Lev reports: Feb 16 23:41:33 alon1 usb 3-1: configuration #1 chosen from 1 choice Feb 16 23:41:33 alon1 BUG: unable to handle kernel NULL pointer dereference at virtual address 00000008 Feb 16 23:41:33 alon1 printing eip: c01b2db6 *pde = 00000000 Feb 16 23:41:33 alon1 Oops: 0000 [#1] PREEMPT Feb 16 23:41:33 alon1 Modules linked in: ppp_deflate zlib_deflate zlib_inflate bsd_comp ppp_async rfcomm l2cap hci_usb vmnet(P) vmmon(P) tun radeon drm autofs4 ipv6 aes_generic crypto_algapi ieee80211_crypt_ccmp nf_nat_irc nf_nat_ftp nf_conntrack_irc nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat ipt_REJECT xt_tcpudp ipt_LOG xt_limit xt_state nf_conntrack_ipv4 nf_conntrack iptable_filter ip_tables x_tables snd_pcm_oss snd_mixer_oss snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device bluetooth ppp_generic slhc ioatdma dca cfq_iosched cpufreq_powersave cpufreq_ondemand cpufreq_conservative acpi_cpufreq freq_table uinput fan af_packet nls_cp1255 nls_iso8859_1 nls_utf8 nls_base pcmcia snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm nsc_ircc snd_timer ipw2200 thinkpad_acpi irda snd ehci_hcd yenta_socket uhci_hcd psmouse ieee80211 soundcore intel_agp hwmon rsrc_nonstatic pcspkr e1000 crc_ccitt snd_page_alloc i2c_i801 ieee80211_crypt pcmcia_core agpgart thermal bat! tery nvram rtc sr_mod ac sg firmware_class button processor cdrom unix usbcore evdev ext3 jbd ext2 mbcache loop ata_piix libata sd_mod scsi_mod Feb 16 23:41:33 alon1 Feb 16 23:41:33 alon1 Pid: 4, comm: events/0 Tainted: P (2.6.24-gentoo-r2 #1) Feb 16 23:41:33 alon1 EIP: 0060:[<c01b2db6>] EFLAGS: 00010282 CPU: 0 Feb 16 23:41:33 alon1 EIP is at sysfs_get_dentry+0x26/0x80 Feb 16 23:41:33 alon1 EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: f48a2210 Feb 16 23:41:33 alon1 ESI: f72eb900 EDI: f4803ae0 EBP: f4803ae0 ESP: f7c49efc Feb 16 23:41:33 alon1 hcid[7004]: HCI dev 0 registered Feb 16 23:41:33 alon1 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Feb 16 23:41:33 alon1 Process events/0 (pid: 4, ti=f7c48000 task=f7c3efc0 task.ti=f7c48000) Feb 16 23:41:33 alon1 Stack: f7cb6140 f4822668 f7e71e10 c01b304d ffffffff ffffffff fffffffe c030ba9c Feb 16 23:41:33 alon1 f7cb6140 f4822668 f6da6720 f7cb6140 f4822668 f6da6720 c030ba8e c01ce20b Feb 16 23:41:33 alon1 f6e9dd00 c030ba8e f6da6720 f6e9dd00 f6e9dd00 00000000 f4822600 00000000 Feb 16 23:41:33 alon1 Call Trace: Feb 16 23:41:33 alon1 [<c01b304d>] sysfs_move_dir+0x3d/0x1f0 Feb 16 23:41:33 alon1 [<c01ce20b>] kobject_move+0x9b/0x120 Feb 16 23:41:33 alon1 [<c0241711>] device_move+0x51/0x110 Feb 16 23:41:33 alon1 [<f9aaed80>] del_conn+0x0/0x70 [bluetooth] Feb 16 23:41:33 alon1 [<f9aaed99>] del_conn+0x19/0x70 [bluetooth] Feb 16 23:41:33 alon1 [<c012c1a1>] run_workqueue+0x81/0x140 Feb 16 23:41:33 alon1 [<c02c0c88>] schedule+0x168/0x2e0 Feb 16 23:41:33 alon1 [<c012fc70>] autoremove_wake_function+0x0/0x50 Feb 16 23:41:33 alon1 [<c012c9cb>] worker_thread+0x9b/0xf0 Feb 16 23:41:33 alon1 [<c012fc70>] autoremove_wake_function+0x0/0x50 Feb 16 23:41:33 alon1 [<c012c930>] worker_thread+0x0/0xf0 Feb 16 23:41:33 alon1 [<c012f962>] kthread+0x42/0x70 Feb 16 23:41:33 alon1 [<c012f920>] kthread+0x0/0x70 Feb 16 23:41:33 alon1 [<c0104c2f>] kernel_thread_helper+0x7/0x18 Feb 16 23:41:33 alon1 ======================= Feb 16 23:41:33 alon1 Code: 26 00 00 00 00 57 89 c7 a1 50 1b 3a c0 56 53 8b 70 38 85 f6 74 08 8b 0e 85 c9 74 58 ff 06 8b 56 50 39 fa 74 47 89 fb eb 02 89 c3 <8b> 43 08 39 c2 75 f7 8b 46 08 83 c0 68 e8 98 e7 10 00 8b 43 10 Feb 16 23:41:33 alon1 EIP: [<c01b2db6>] sysfs_get_dentry+0x26/0x80 SS:ESP 0068:f7c49efc Feb 16 23:41:33 alon1 ---[ end trace aae864e9592acc1d ]--- Defer hci_unregister_sysfs because hci device could be destructed while hci conn devices still there. Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Tested-by: Stefan Seyfried <seife@suse.de> Acked-by: Alon Bar-Lev <alon.barlev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> dsd@gentoo.org notes: This patch fixes http://bugs.gentoo.org/211179 Cc: Daniel Drake <dsd@gentoo.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18sis190: read the mac address from the eeprom firstFrancois Romieu
upstream commit: 563e0ae06ff18f0b280f11cf706ba0172255ce52 Reading a serie of zero from the cmos sram area do not work well with is_valid_ether_addr(). Let's read the mac address from the eeprom first as it seems more reliable. Fix for http://bugzilla.kernel.org/show_bug.cgi?id=9831 Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: Jeff Garzik <jeff@garzik.org> dsd@gentoo.org notes: This patch fixes http://bugs.gentoo.org/207706 Cc: Daniel Drake <dsd@gentoo.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18libata: assume no device is attached if both IDENTIFYs are abortedTejun Heo
upstream commit: 1ffc151fcddf524d0c76709d7e7a2af0255acb6b This is to fix bugzilla #10254. QSI cdrom attached to pata_sis as secondary master appears as phantom device for the slave. Interestingly, instead of not setting DRQ after IDENTIFY which triggers NODEV_HINT, it aborts both IDENTIFY and IDENTIFY PACKET which makes EH retry. Modify EH such that it assumes no device is attached if both flavors of IDENTIFY are aborted by the device. There really isn't much point in retrying when the device actively aborts the commands. While at it, convert NODEV detection message to ata_dev_printk() to help debugging obscure detection problems. This problem was reported by Jan Bücken. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Jan Bücken <jb.faq@gmx.de> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org> dsd@gentoo.org notes: This patch fixes http://bugs.gentoo.org/211369 Cc: Daniel Drake <dsd@gentoo.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18SPARC64: flush_ptrace_access() needs preemption disable.David S. Miller
Upstream commit: f6a843d939ade435e060d580f5c56d958464f8a5 Based upon a report by Mariusz Kozlowski. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18SPARC64: Fix __get_cpu_var in preemption-enabled area.David S. Miller
Upstream commit: 69072f6e8e4bd4799d2a54e4ff8771d0657512c1 Reported by Mariusz Kozlowski. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18SPARC64: Fix atomic backoff limit.David S. Miller
Upstream commit: 4cfea5a7dfcc2766251e50ca30271a782d5004ad 4096 will not fit into the immediate field of a compare instruction, in fact it will end up being -4096 causing the check to fail every time and thus disabling backoff. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18VLAN: Don't copy ALLMULTI/PROMISC flags from underlying devicePatrick McHardy
Upstream commit: 0ed21b321a13421e2dfeaa70a6c324e05e3e91e6 Changing these flags requires to use dev_set_allmulti/dev_set_promiscuity or dev_change_flags. Setting it directly causes two unwanted effects: - the next dev_change_flags call will notice a difference between dev->gflags and the actual flags, enable promisc/allmulti mode and incorrectly update dev->gflags - this keeps the underlying device in promisc/allmulti mode until the VLAN device is deleted [ Ported back to 2.6.24 VLAN code. -DaveM ] Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18TCP: Let skbs grow over a page on fast peersHerbert Xu
Upstream commit: 69d1506731168d6845a76a303b2c45f7c05f3f2c While testing the virtio-net driver on KVM with TSO I noticed that TSO performance with a 1500 MTU is significantly worse compared to the performance of non-TSO with a 16436 MTU. The packet dump shows that most of the packets sent are smaller than a page. Looking at the code this actually is quite obvious as it always stop extending the packet if it's the first packet yet to be sent and if it's larger than the MSS. Since each extension is bound by the page size, this means that (given a 1500 MTU) we're very unlikely to construct packets greater than a page, provided that the receiver and the path is fast enough so that packets can always be sent immediately. The fix is also quite obvious. The push calls inside the loop is just an optimisation so that we don't end up doing all the sending at the end of the loop. Therefore there is no specific reason why it has to do so at MSS boundaries. For TSO, the most natural extension of this optimisation is to do the pushing once the skb exceeds the TSO size goal. This is what the patch does and testing with KVM shows that the TSO performance with a 1500 MTU easily surpasses that of a 16436 MTU and indeed the packet sizes sent are generally larger than 16436. I don't see any obvious downsides for slower peers or connections, but it would be prudent to test this extensively to ensure that those cases don't regress. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18TCP: Fix shrinking windows with window scalingPatrick McHardy
Upstream commit: 607bfbf2d55dd1cfe5368b41c2a81a8c9ccf4723 When selecting a new window, tcp_select_window() tries not to shrink the offered window by using the maximum of the remaining offered window size and the newly calculated window size. The newly calculated window size is always a multiple of the window scaling factor, the remaining window size however might not be since it depends on rcv_wup/rcv_nxt. This means we're effectively shrinking the window when scaling it down. The dump below shows the problem (scaling factor 2^7): - Window size of 557 (71296) is advertised, up to 3111907257: IP > . ack 3111835961 win 557 <...> - New window size of 514 (65792) is advertised, up to 3111907217, 40 bytes below the last end: IP > . 3113575668:3113577116(1448) ack 3111841425 win 514 <...> The number 40 results from downscaling the remaining window: 3111907257 - 3111841425 = 65832 65832 / 2^7 = 514 65832 % 2^7 = 40 If the sender uses up the entire window before it is shrunk, this can have chaotic effects on the connection. When sending ACKs, tcp_acceptable_seq() will notice that the window has been shrunk since tcp_wnd_end() is before tp->snd_nxt, which makes it choose tcp_wnd_end() as sequence number. This will fail the receivers checks in tcp_sequence() however since it is before it's tp->rcv_wup, making it respond with a dupack. If both sides are in this condition, this leads to a constant flood of ACKs until the connection times out. Make sure the window is never shrunk by aligning the remaining window to the window scaling factor. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18NET: Fix multicast device ioctl checksPatrick McHardy
Upstream commit: 61ee6bd487b9cc160e533034eb338f2085dc7922 SIOCADDMULTI/SIOCDELMULTI check whether the driver has a set_multicast_list method to determine whether it supports multicast. Drivers implementing secondary unicast support use set_rx_mode however. Check for both dev->set_multicast_mode and dev->set_rx_mode to determine multicast capabilities. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18SCTP: Fix local_addr deletions during list traversals.Chidambar 'ilLogict' Zinnoury
Upstream commit: 22626216c46f2ec86287e75ea86dd9ac3df54265 Since the lists are circular, we need to explicitely tag the address to be deleted since we might end up freeing the list head instead. This fixes some interesting SCTP crashes. Signed-off-by: Chidambar 'ilLogict' Zinnoury <illogict@online.fr> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18sch_htb: fix "too many events" situationMartin Devera
Upstream commit: 8f3ea33a5078a09eba12bfe57424507809367756 HTB is event driven algorithm and part of its work is to apply scheduled events at proper times. It tried to defend itself from livelock by processing only limited number of events per dequeue. Because of faster computers some users already hit this hardcoded limit. This patch limits processing up to 2 jiffies (why not 1 jiffie ? because it might stop prematurely when only fraction of jiffie remains). Signed-off-by: Martin Devera <devik@cdi.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18NET: Add preemption point in qdisc_runHerbert Xu
Upstream commit: 2ba2506ca7ca62c56edaa334b0fe61eb5eab6ab0 The qdisc_run loop is currently unbounded and runs entirely in a softirq. This is bad as it may create an unbounded softirq run. This patch fixes this by calling need_resched and breaking out if necessary. It also adds a break out if the jiffies value changes since that would indicate we've been transmitting for too long which starves other softirqs. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18PPPOL2TP: Fix SMP issues in skb reorder queue handlingJames Chapman
Upstream commit: e653181dd6b3ad38ce14904351b03a5388f4b0f7 When walking a session's packet reorder queue, use skb_queue_walk_safe() since the list could be modified inside the loop. Rearrange the unlinking skbs from the reorder queue such that it is done while the queue lock is held in pppol2tp_recv_dequeue() when walking the skb list. A version of this patch was suggested by Jarek Poplawski. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18PPPOL2TP: Make locking calls softirq-safeJames Chapman
Upstream commit: cf3752e2d203bbbfc88d29e362e6938cef4339b3 Fix locking issues in the pppol2tp driver which can cause a kernel crash on SMP boxes. There were two problems:- 1. The driver was violating read_lock() and write_lock() scheduling rules because it wasn't using softirq-safe locks in softirq contexts. So we now consistently use the _bh variants of the lock functions. 2. The driver was calling sk_dst_get() in pppol2tp_xmit() which was taking sk_dst_lock in softirq context. We now call __sk_dst_get(). Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18netpoll: zap_completion_queue: adjust skb->users counterJarek Poplawski
Upstream commit: 8a455b087c9629b3ae3b521b4f1ed16672f978cc zap_completion_queue() retrieves skbs from completion_queue where they have zero skb->users counter. Before dev_kfree_skb_any() it should be non-zero yet, so it's increased now. Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18LLC: Restrict LLC sockets to rootPatrick McHardy
Upstream commit: 3480c63bdf008e9289aab94418f43b9592978fff LLC currently allows users to inject raw frames, including IP packets encapsulated in SNAP. While Linux doesn't handle IP over SNAP, other systems do. Restrict LLC sockets to root similar to packet sockets. [ Modified Patrick's patch to use CAP_NEW_RAW --DaveM ] Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18INET: inet_frag_evictor() must run with BH disabledDavid S. Miller
Part of upstream commit: e8e16b706e8406f1ab3bccab16932ebc513896d8 Based upon a lockdep trace from Dave Jones. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18SUNGEM: Fix NAPI assertion failure.David S. Miller
Upstream commit: da990a2402aeaee84837f29054c4628eb02f7493 As reported by Johannes Berg: I started getting this warning with recent kernels: [ 773.908927] ------------[ cut here ]------------ [ 773.908954] Badness at net/core/dev.c:2204 ... If we loop more than once in gem_poll(), we'll use more than the real budget in our gem_rx() calls, thus eventually trigger the caller's assertions in net_rx_action(). Subtract "work_done" from "budget" for the second arg to gem_rx() to fix the bug. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18NET: include <linux/types.h> into linux/ethtool.h for __u* typedefKirill A. Shutemov
Upstream commit: e621e69137b24fdbbe7ad28214e8d81e614c25b7 Signed-off-by: Kirill A. Shutemov <k.shutemov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18AX25 ax25_out: check skb for NULL in ax25_kick()Jarek Poplawski
Upstream commit: f47b7257c7368698eabff6fd7b340071932af640 According to some OOPS reports ax25_kick tries to clone NULL skbs sometimes. It looks like a race with ax25_clear_queues(). Probably there is no need to add more than a simple check for this yet. Another report suggested there are probably also cases where ax25 ->paclen == 0 can happen in ax25_output(); this wasn't confirmed during testing but let's leave this debugging check for some time. Reported-and-tested-by: Jann Traschewski <jann@gmx.de> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18ipmi: change device node ordering to reflect probe orderCarol Hebert
upstream commit: abd24df828f1a72971db29d1b74fefae104ea9e2 In 2.6.14 a patch was merged which switching the order of the ipmi device naming from in-order-of-discovery over to reverse-order-of-discovery. So on systems with multiple BMC interfaces, the ipmi device names are being created in reverse order relative to how they are discovered on the system (e.g. on an IBM x3950 multinode server with N nodes, the device name for the BMC in the first node is /dev/ipmiN-1 and the device name for the BMC in the last node is /dev/ipmi0, etc.). The problem is caused by the list handling routines chosen in dmi_scan.c. Using list_add() causes the multiple ipmi devices to be added to the device list using a stack-paradigm and so the ipmi driver subsequently pulls them off during initialization in LIFO order. This patch changes the dmi_save_ipmi_device() list handling paradigm to a queue, thereby allowing the ipmi driver to build the ipmi device names in the order in which they are found on the system. Signed-off-by: Carol Hebert <cah@us.ibm.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18mtd: fix broken state in CFI driver caused by FL_SHUTDOWNAlexey Korolev
upstream commit: fb6d080c6f75dfd7e23d5a3575334785aa8738eb THe CFI driver in 2.6.24 kernel is broken. Not so intensive read/write operations cause incomplete writes which lead to kernel panics in JFFS2. We investigated the issue - it is caused by bug in FL_SHUTDOWN parsing code. Sometimes chip returns -EIO as if it is in FL_SHUTDOWN state when it should wait in FL_PONT (error in order of conditions). The following patch fixes the bug in state parsing code of CFI. Also I've added comments to notify developers if they want to add new case in future. Signed-off-by: Alexey Korolev <akorolev@infradead.org> Reviewed-by: Joern Engel <joern@logfs.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18CRYPTO xcbc: Fix crash when ipsec uses xcbc-mac with big data chunkJoy Latten
upstream commit: 1edcf2e1ee2babb011cfca80ad9d202e9c491669 The kernel crashes when ipsec passes a udp packet of about 14XX bytes of data to aes-xcbc-mac. It seems the first xxxx bytes of the data are in first sg entry, and remaining xx bytes are in next sg entry. But we don't check next sg entry to see if we need to go look the page up. I noticed in hmac.c, we do a scatterwalk_sg_next(), to do this check and possible lookup, thus xcbc.c needs to use this routine too. A 15-hour run of an ipsec stress test sending streams of tcp and udp packets of various sizes, using this patch and aes-xcbc-mac completed successfully, so hopefully this fixes the problem. Signed-off-by: Joy Latten <latten@austin.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> [chrisw@sous-sol.org: backport to] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18USB: serial: ti_usb_3410_5052: Correct TUSB3410 endpoint requirements.Robert Spanton
upstream commit: 1bfd6693cd66f1e79abce62d3e8c3647e1f59a55 The changes introduced in commit 063a2da8f01806906f7d7b1a1424b9afddebc443 changed the semantics of the num_interrupt_in, num_interrupt_out, num_bulk_in and num_bulk_out entries of the usb_serial_driver struct to be the number of endpoints the device has when probed. This patch changes the ti_1port_device usb_serial_driver struct to reflect this change. The single port devices only have 1 bulk_out endpoint in their initial configuration, and so this patch changes the number of other types to NUM_DONT_CARE. The same change probably needs doing to the ti_2port_device struct, but I don't have a two port device at hand. Signed-off-by: Robert Spanton <rspanton@zepler.net> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18USB: serial: fix regression in Visor/Palm OS module for kernels >= 2.6.24Brad Sawatzky
upstream commit: d04863e9e65767feff7807c8f693ac2719dd1944 Fixes a bug/inconsistency revealed by the additional sanity checking in commit 063a2da8f01806906f7d7b1a1424b9afddebc443 introduced in the original 2.6.24 branch. The Handspring Visor / PalmOS 4 device structure defines .num_bulk_out=2 but the usb-serial probe returns num_bulk_out=3, triggering the check in the above commit and forcing a bail out when the device (a Garmin iQue in my case) attempts to connect. The patch bumps the expected number of endpoints to 3. FWIW, this patch will probably solve the following kernel bug report for Treo users (identical symptoms, different model PalmOS units): <http://bugzilla.kernel.org/show_bug.cgi?id=10118> Signed-off-by: Brad Sawatzky <brad+kernel@swatter.net> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18USB: Allow initialization of broken keyspan serial adapters.Clark Rawlins
upstream commit: 822470537d0fc1dee38a2a9c8b8c398bfbb332bb Fixes the keyspan driver after the addition of additional checking of driver requirements introduced in usb-serial.c commit 063a2da8f01806906f7d7b1a1424b9afddebc443. The initialization of the keyspan usb_serial_driver structs were not initializing the num_interrupt_out field and the additional checking was rejecting the end point so the driver wouldn't finish initializing. This commit initializes the fields to NUM_DONT_CARE. It works for the keyspan USA-49WG and doesn't break the USA-19HS which are the two keyspan devices I have to test with. Signed-off-by: Clark Rawlins <clark.rawlins@escient.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18vmcoreinfo: add the symbol "phys_base"Ken'ichi Ohmichi
upstream commit: 629c8b4cdb354518308663aff2f719e02f69ffbe Fix the problem that makedumpfile sometimes fails on x86_64 machine. This patch adds the symbol "phys_base" to a vmcoreinfo data. The vmcoreinfo data has the minimum debugging information only for dump filtering. makedumpfile (dump filtering command) gets it to distinguish unnecessary pages, and makedumpfile creates a small dumpfile. On x86_64 kernel which compiled with CONFIG_PHYSICAL_START=0x0 and CONFIG_RELOCATABLE=y, makedumpfile fails like the following: # makedumpfile -d31 /proc/vmcore dumpfile The kernel version is not supported. The created dumpfile may be incomplete. _exclude_free_page: Can't get next online node. makedumpfile Failed. # The cause is the lack of the symbol "phys_base" in a vmcoreinfo data. If the symbol "phys_base" does not exist, makedumpfile considers an x86_64 kernel as non relocatable. As the result, makedumpfile misunderstands the physical address where the kernel is loaded, and it cannot translate a kernel virtual address to physical address correctly. To fix this problem, this patch adds the symbol "phys_base" to a vmcoreinfo data. Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <stable@kernel.org> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18hwmon: (w83781d) Fix I/O resource conflict with PNPJean Delvare
upstream commit: 2961cb22ef02850d90e7a12c28a14d74e327df8d Only request I/O ports 0x295-0x296 instead of the full I/O address range. This solves a conflict with PNP resources on a few motherboards. Also request the I/O ports in two parts (4 low ports, 4 high ports) during device detection, otherwise the PNP resource makes the request (and thus the detection) fail. This fixes lm-sensors ticket #2306: http://www.lm-sensors.org/ticket/2306 Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18pci: revert SMBus unhide on HP Compaq nx6110Jean Delvare
upstream commit: a99acc832de1104afaba02d7c2576fd9b9fd6422 This reverts commit 3c0a654e390d00fef9d8faed758f5e1e8078adb5 and fixes kernel bug #10245: http://bugzilla.kernel.org/show_bug.cgi?id=10245 The HP Compaq nc6120 has the same PCI sub-device ID as the nx6110, and the SMBus is used by ACPI for thermal management on the nc6120, so Linux should not attach a native driver to it. This means that this quirk is unsafe and has to be removed. I also added a comment to help developers realize that adding new IDs to this SMBus unhiding quirk table should be done only with great care, and in particular only after checking that ACPI is not making use of the SMBus. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Tomasz Koprowski <tomek@koprowski.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18vfs: fix data leak in nobh_write_end()Dmitri Monakhov
upstream commit: 5b41e74ad1b0bf7bc51765ae74e5dc564afc3e48 Current nobh_write_end() implementation ignore partial writes(copied < len) case if page was fully mapped and simply mark page as Uptodate, which is totally wrong because area [pos+copied, pos+len) wasn't updated explicitly in previous write_begin call. It simply contains garbage from pagecache and result in data leakage. #TEST_CASE_BEGIN: ~~~~~~~~~~~~~~~~ In fact issue triggered by classical testcase open("/mnt/test", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3 ftruncate(3, 409600) = 0 writev(3, [{"a", 1}, {NULL, 4095}], 2) = 1 ##TESTCASE_SOURCE: ~~~~~~~~~~~~~~~~~ #include <stdio.h> #include <stdlib.h> #include <fcntl.h> #include <sys/uio.h> #include <sys/mman.h> #include <errno.h> int main(int argc, char **argv) { int fd, ret; void* p; struct iovec iov[2]; fd = open(argv[1], O_RDWR|O_CREAT|O_TRUNC, 0666); ftruncate(fd, 409600); iov[0].iov_base="a"; iov[0].iov_len=1; iov[1].iov_base=NULL; iov[1].iov_len=4096; ret = writev(fd, iov, sizeof(iov)/sizeof(struct iovec)); printf("writev = %d, err = %d\n", ret, errno); return 0; } ##TESTCASE RESULT: ~~~~~~~~~~~~~~~~~~ [root@ts63 ~]# mount | grep mnt2 /dev/mapper/test on /mnt2 type ext2 (rw,nobh) [root@ts63 ~]# /tmp/writev /mnt2/test writev = 1, err = 0 [root@ts63 ~]# hexdump -C /mnt2/test 00000000 61 65 62 6f 6f 74 00 00 f0 b9 b4 59 3a 00 00 00 |aeboot.....Y:...| 00000010 20 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 | .......!.......| 00000020 df df df df df df df df df df df df df df df df |................| 00000030 3a 00 00 00 2a 00 00 00 21 00 00 00 00 00 00 00 |:...*...!.......| 00000040 60 c0 8c 00 00 00 00 00 40 4a 8d 00 00 00 00 00 |`.......@J......| 00000050 00 00 00 00 00 00 00 00 41 00 00 00 00 00 00 00 |........A.......| 00000060 74 69 6d 65 20 64 64 20 69 66 3d 2f 64 65 76 2f |time dd if=/dev/| 00000070 6c 6f 6f 70 30 20 20 6f 66 3d 2f 64 65 76 2f 6e |loop0 of=/dev/n| skip.. 00000f50 00 00 00 00 00 00 00 00 31 00 00 00 00 00 00 00 |........1.......| 00000f60 6d 6b 66 73 2e 65 78 74 33 20 2f 64 65 76 2f 76 |mkfs.ext3 /dev/v| 00000f70 7a 76 67 2f 74 65 73 74 20 2d 62 34 30 39 36 00 |zvg/test -b4096.| 00000f80 a0 fe 8c 00 00 00 00 00 21 00 00 00 00 00 00 00 |........!.......| 00000f90 23 31 32 30 35 39 35 30 34 30 34 00 3a 00 00 00 |#1205950404.:...| 00000fa0 20 00 8d 00 00 00 00 00 21 00 00 00 00 00 00 00 | .......!.......| 00000fb0 d0 cf 8c 00 00 00 00 00 10 d0 8c 00 00 00 00 00 |................| 00000fc0 00 00 00 00 00 00 00 00 41 00 00 00 00 00 00 00 |........A.......| 00000fd0 6d 6f 75 6e 74 20 2f 64 65 76 2f 76 7a 76 67 2f |mount /dev/vzvg/| 00000fe0 74 65 73 74 20 20 2f 76 7a 20 2d 6f 20 64 61 74 |test /vz -o dat| 00000ff0 61 3d 77 72 69 74 65 62 61 63 6b 00 00 00 00 00 |a=writeback.....| 00001000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| As you can see file's page contains garbage from pagecache instead of zeros. #TEST_CASE_END Attached patch: - Add sanity check BUG_ON in order to prevent incorrect usage by caller, This is function invariant because page can has buffers and in no zero *fadata pointer at the same time. - Always attach buffers to page is it is partial write case. - Always switch back to generic_write_end if page has buffers. This is reasonable because if page already has buffer then generic_write_begin was called previously. Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org> Reviewed-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2008-04-18alloc_percpu() fails to allocate percpu dataEric Dumazet
upstream commit: be852795e1c8d3829ddf3cb1ce806113611fa555 Some oprofile results obtained while using tbench on a 2x2 cpu machine were very surprising. For example, loopback_xmit() function was using high number of cpu cycles to perform the statistic updates, supposed to be real cheap since they use percpu data pcpu_lstats = netdev_priv(dev); lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id()); lb_stats->packets++; /* HERE : serious contention */ lb_stats->bytes += skb->len; struct pcpu_lstats is a small structure containing two longs. It appears that on my 32bits platform, alloc_percpu(8) allocates a single cache line, instead of giving to each cpu a separate cache line. Using the following patch gave me impressive boost in various benchmarks ( 6 % in tbench) (all percpu_counters hit this bug too) Long term fix (ie >= 2.6.26) would be to let each CPU allocate their own block of memory, so that we dont need to roudup sizes to L1_CACHE_BYTES, or merging the SGI stuff of course... Note : SLUB vs SLAB is important here to *show* the improvement, since they dont have the same minimum allocation sizes (8 bytes vs 32 bytes). This could very well explain regressions some guys reported when they switched to SLUB. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>