Optimization/kernel by anvol · Pull Request #2 · boa19861105/android_442_KitKat_kernel_htc_dlxpul

anvol · 2014-09-10T17:31:02Z

Various kernel patches & hacks from mainstream

Signed-off-by: flar2 <asegaert@gmail.com>

the kernel's memcpy and memmove is very inefficient. But the glibc version is quite fast, in some cases it is 10 times faster than the kernel version. So I introduce some memory copy macros and functions of the glibc to improve the kernel version's performance. The strategy of the memory functions is: 1. Copy bytes until the destination pointer is aligned. 2. Copy words in unrolled loops. If the source and destination are not aligned in the same way, use word memory operations, but shift and merge two read words before writing. 3. Copy the few remaining bytes. Signed-off-by: Miao Xie <miaox*******> Conflicts: lib/Makefile Signed-off-by: flar2 <asegaert@gmail.com>

the performance of memcpy and memmove of the general version is very inefficient, this patch improved them. Signed-off-by: Miao Xie <miaox*******> Conflicts: lib/string.c Signed-off-by: flar2 <asegaert@gmail.com>

Optimize the current version of the shift-and-subtract (hardware) algorithm, described by John von Newmann[1] and Guy L Steele. Iterating 1,000,000 times, perf shows for the current version: Performance counter stats for './sqrt-curr' (10 runs): 27.170996 task-clock # 0.979 CPUs utilized ( +- 3.19% ) 3 context-switches # 0.103 K/sec ( +- 4.76% ) 0 cpu-migrations # 0.004 K/sec ( +-100.00% ) 104 page-faults # 0.004 M/sec ( +- 0.16% ) 64,921,199 cycles # 2.389 GHz ( +- 0.03% ) 28,967,789 stalled-cycles-frontend # 44.62% frontend cycles idle ( +- 0.18% ) <not supported> stalled-cycles-backend 104,502,623 instructions # 1.61 insns per cycle # 0.28 stalled cycles per insn ( +- 0.00% ) 34,088,368 branches # 1254.587 M/sec ( +- 0.00% ) 4,901 branch-misses # 0.01% of all branches ( +- 1.32% ) 0.027763015 seconds time elapsed ( +- 3.22% ) And for the new version: Performance counter stats for './sqrt-new' (10 runs): 0.496869 task-clock # 0.519 CPUs utilized ( +- 2.38% ) 0 context-switches # 0.000 K/sec 0 cpu-migrations # 0.403 K/sec ( +-100.00% ) 104 page-faults # 0.209 M/sec ( +- 0.15% ) 590,760 cycles # 1.189 GHz ( +- 2.35% ) 395,053 stalled-cycles-frontend # 66.87% frontend cycles idle ( +- 3.67% ) <not supported> stalled-cycles-backend 398,963 instructions # 0.68 insns per cycle # 0.99 stalled cycles per insn ( +- 0.39% ) 70,228 branches # 141.341 M/sec ( +- 0.36% ) 3,364 branch-misses # 4.79% of all branches ( +- 5.45% ) 0.000957440 seconds time elapsed ( +- 2.42% ) Furthermore, this saves space in instruction text: text data bss dec hex filename 111 0 0 111 6f lib/int_sqrt-baseline.o 89 0 0 89 59 lib/int_sqrt.o [1] http://en.wikipedia.org/wiki/First_Draft_of_a_Report_on_the_EDVAC Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Reviewed-by: Jonathan Gonzalez <jgonzlez@linets.cl> Tested-by: Jonathan Gonzalez <jgonzlez@linets.cl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: flar2 <asegaert@gmail.com>

Android goes through suspend/resume very often (every few seconds when on a busy wifi network with the screen off), and a significant portion of the energy used to go in and out of suspend is spent in the freezer. If a task has called freezer_do_not_count(), don't bother waking it up. If it happens to wake up later it will call freezer_count() and immediately enter the refrigerator. Combined with patches to convert freezable helpers to use freezer_do_not_count() and convert common sites where idle userspace tasks are blocked to use the freezable helpers, this reduces the time and energy required to suspend and resume. Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: flar2 <asegaert@gmail.com>

Freezing tasks will wake up almost every userspace task from where it is blocking and force it to run until it hits a call to try_to_sleep(), generally on the exit path from the syscall it is blocking in. On resume each task will run again, usually restarting the syscall and running until it hits the same blocking call as it was originally blocked in. Convert the existing wait_event_freezable* wrappers to use freezer_do_not_count(). Combined with a previous patch, these tasks will not run during suspend or resume unless they wake up for another reason, in which case they will run until they hit the try_to_freeze() in freezer_count(), and then continue processing the wakeup after tasks are thawed. This results in a small change in behavior, previously a race between freezing and a normal wakeup would be won by the wakeup, now the task will freeze and then handle the wakeup after thawing. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: flar2 <asegaert@gmail.com>

Date Wed, 1 May 2013 18:35:02 -0700 Avoid waking up every thread sleeping in a binder call during suspend and resume by calling a freezable blocking call. Previous patches modified the freezer to avoid sending wakeups to threads that are blocked in freezable blocking calls. This call was selected to be converted to a freezable call because it doesn't hold any locks or release any resources when interrupted that might be needed by another freezing task or a kernel driver during suspend, and is a common site where idle userspace tasks are blocked. Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: flar2 <asegaert@gmail.com>

Date Wed, 1 May 2013 18:35:03 -0700 Avoid waking up every thread sleeping in an epoll_wait call during suspend and resume by calling a freezable blocking call. Previous patches modified the freezer to avoid sending wakeups to threads that are blocked in freezable blocking calls. This call was selected to be converted to a freezable call because it doesn't hold any locks or release any resources when interrupted that might be needed by another freezing task or a kernel driver during suspend, and is a common site where idle userspace tasks are blocked. Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: flar2 <asegaert@gmail.com>

Date Wed, 1 May 2013 18:35:04 -0700 Avoid waking up every thread sleeping in a select call during suspend and resume by calling a freezable blocking call. Previous patches modified the freezer to avoid sending wakeups to threads that are blocked in freezable blocking calls. This call was selected to be converted to a freezable call because it doesn't hold any locks or release any resources when interrupted that might be needed by another freezing task or a kernel driver during suspend, and is a common site where idle userspace tasks are blocked. Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: flar2 <asegaert@gmail.com>

Date Wed, 1 May 2013 18:35:05 -0700 Avoid waking up every thread sleeping in a futex_wait call during suspend and resume by calling a freezable blocking call. Previous patches modified the freezer to avoid sending wakeups to threads that are blocked in freezable blocking calls. This call was selected to be converted to a freezable call because it doesn't hold any locks or release any resources when interrupted that might be needed by another freezing task or a kernel driver during suspend, and is a common site where idle userspace tasks are blocked. Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: flar2 <asegaert@gmail.com>

Date Wed, 1 May 2013 18:35:06 -0700 Avoid waking up every thread sleeping in a nanosleep call during suspend and resume by calling a freezable blocking call. Previous patches modified the freezer to avoid sending wakeups to threads that are blocked in freezable blocking calls. This call was selected to be converted to a freezable call because it doesn't hold any locks or release any resources when interrupted that might be needed by another freezing task or a kernel driver during suspend, and is a common site where idle userspace tasks are blocked. Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: flar2 <asegaert@gmail.com>

Date Wed, 1 May 2013 18:35:07 -0700 Avoid waking up every thread sleeping in a sigtimedwait call during suspend and resume by calling a freezable blocking call. Previous patches modified the freezer to avoid sending wakeups to threads that are blocked in freezable blocking calls. This call was selected to be converted to a freezable call because it doesn't hold any locks or release any resources when interrupted that might be needed by another freezing task or a kernel driver during suspend, and is a common site where idle userspace tasks are blocked. Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: flar2 <asegaert@gmail.com>

Date Wed, 1 May 2013 18:35:08 -0700 Avoid waking up every thread sleeping in read call on an AF_UNIX socket during suspend and resume by calling a freezable blocking call. Previous patches modified the freezer to avoid sending wakeups to threads that are blocked in freezable blocking calls. This call was selected to be converted to a freezable call because it doesn't hold any locks or release any resources when interrupted that might be needed by another freezing task or a kernel driver during suspend, and is a common site where idle userspace tasks are blocked. Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: flar2 <asegaert@gmail.com>

If we take the 2nd retry path in ext4_expand_extra_isize_ea, we potentionally return from the function without having freed these allocations. If we don't do the return, we over-write the previous allocation pointers, so we leak either way. Spotted with Coverity. [ Fixed by tytso to set is and bs to NULL after freeing these pointers, in case in the retry loop we later end up triggering an error causing a jump to cleanup, at which point we could have a double free bug. -- Ted ] Change-Id: I49b8ca41a6c6d44b563eb23306870258a3affd3b Signed-off-by: Dave Jones <davej@fedoraproject.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Cc: stable@vger.kernel.org Git-commit: 6e4ea8e33b2057b85d75175dd89b93f5e26de3bc Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git Signed-off-by: Osvaldo Banuelos <osvaldob@codeaurora.org> Signed-off-by: flar2 <asegaert@gmail.com>

One testbox of mine (Intel Nehalem, 16-way) uses MWAIT for its idle routine, which apparently can break out of its idle loop rather frequently, with high frequency. In that case NO_HZ_FULL=y kernels show high ksoftirqd overhead and constant context switching, because tick_nohz_stop_sched_tick() will, if delta_jiffies == 0, mis-identify this as a timer event - activating the TIMER_SOFTIRQ, which wakes up ksoftirqd. Fix this by treating delta_jiffies == 0 the same way we treat other short wakeups, delta_jiffies == 1.

Signed-off-by: Lens-F <tnoah12@gmail.com> Signed-off-by: flar2 <asegaert@gmail.com>

Signed-off-by: flar2 <asegaert@gmail.com>

Dave Jones trinity syscall fuzzer exposed an issue in the deadlock detection code of rtmutex: http://lkml.kernel.org/r/20140429151655.GA14277@...hat.com That underlying issue has been fixed with a patch to the rtmutex code, but the futex code must not call into rtmutex in that case because - it can detect that issue early - it avoids a different and more complex fixup for backing out If the user space variable got manipulated to 0x80000000 which means no lock holder, but the waiters bit set and an active pi_state in the kernel is found we can figure out the recursive locking issue by looking at the pi_state owner. If that is the current task, then we can safely return -EDEADLK. The check should have been added in commit 59fa62451 (futex: Handle futex_pi OWNER_DIED take over correctly) already, but I did not see the above issue caused by user space manipulation back then. Signed-off-by: Thomas Gleixner <tglx@...utronix.de> Cc: Dave Jones <davej@...hat.com> Cc: Linus Torvalds <torvalds@...ux-foundation.org> Cc: Peter Zijlstra <peterz@...radead.org> Cc: Darren Hart <darren@...art.com> Cc: Davidlohr Bueso <davidlohr@...com> Cc: Steven Rostedt <rostedt@...dmis.org> Cc: Clark Williams <williams@...hat.com> Cc: Paul McKenney <paulmck@...ux.vnet.ibm.com> Cc: Lai Jiangshan <laijs@...fujitsu.com> Cc: Roland McGrath <roland@...k.frob.com> Cc: Carlos ODonell <carlos@...hat.com> Cc: Jakub Jelinek <jakub@...hat.com> Cc: Michael Kerrisk <mtk.manpages@...il.com> Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de> Link: http://lkml.kernel.org/r/20140512201701.097349971@...utronix.de Signed-off-by: Thomas Gleixner <tglx@...utronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org> Change-Id: I245f77622443435cf36bbd157b98798c235acdb0 Signed-off-by: flar2 <asegaert@gmail.com>

…addr2 in futex_requeue(..., requeue_pi=1) If uaddr == uaddr2, then we have broken the rule of only requeueing from a non-pi futex to a pi futex with this call. If we attempt this, then dangling pointers may be left for rt_waiter resulting in an exploitable condition. This change brings futex_requeue() into line with futex_wait_requeue_pi() which performs the same check as per commit 6f7b0a2a5 (futex: Forbid uaddr == uaddr2 in futex_wait_requeue_pi()) [ tglx: Compare the resulting keys as well, as uaddrs might be different depending on the mapping ] Fixes CVE-2014-3153. Reported-by: Pinkie Pie Signed-off-by: Will Drewry <wad@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Change-Id: Iea1dfe1d877d63a374db2623dca6cea4e7a544b9 Signed-off-by: flar2 <asegaert@gmail.com>

We need to protect the atomic acquisition in the kernel against rogue user space which sets the user space futex to 0, so the kernel side acquisition succeeds while there is existing state in the kernel associated to the real owner. Verify whether the futex has waiters associated with kernel state. If it has, return -EINVAL. The state is corrupted already, so no point in cleaning it up. Subsequent calls will fail as well. Not our problem. [ tglx: Use futex_top_waiter() and explain why we do not need to try restoring the already corrupted user space state. ] Signed-off-by: Darren Hart <dvhart@linux.intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Will Drewry <wad@chromium.org> Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Change-Id: I6c19ae4a5e85fbc9c560723e672d5136c4786ab6 Signed-off-by: flar2 <asegaert@gmail.com>

If the owner died bit is set at futex_unlock_pi, we currently do not cleanup the user space futex. So the owner TID of the current owner (the unlocker) persists. That's observable inconsistant state, especially when the ownership of the pi state got transferred. Clean it up unconditionally. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Cc: Will Drewry <wad@chromium.org> Cc: Darren Hart <dvhart@linux.intel.com> Cc: stable@vger.kernel.org Change-Id: Ic84473a62bdf3e5702f46e04b266afa38e9ec40e Signed-off-by: flar2 <asegaert@gmail.com>

The current implementation of lookup_pi_state has ambigous handling of the TID value 0 in the user space futex. We can get into the kernel even if the TID value is 0, because either there is a stale waiters bit or the owner died bit is set or we are called from the requeue_pi path or from user space just for fun. The current code avoids an explicit sanity check for pid = 0 in case that kernel internal state (waiters) are found for the user space address. This can lead to state leakage and worse under some circumstances. Handle the cases explicit: Waiter | pi_state | pi->owner | uTID | uODIED | ? [1] NULL | --- | --- | 0 | 0/1 | Valid [2] NULL | --- | --- | >0 | 0/1 | Valid [3] Found | NULL | -- | Any | 0/1 | Invalid [4] Found | Found | NULL | 0 | 1 | Valid [5] Found | Found | NULL | >0 | 1 | Invalid [6] Found | Found | task | 0 | 1 | Valid [7] Found | Found | NULL | Any | 0 | Invalid [8] Found | Found | task | ==taskTID | 0/1 | Valid [9] Found | Found | task | 0 | 0 | Invalid [10] Found | Found | task | !=taskTID | 0/1 | Invalid [1] Indicates that the kernel can acquire the futex atomically. We came came here due to a stale FUTEX_WAITERS/FUTEX_OWNER_DIED bit. [2] Valid, if TID does not belong to a kernel thread. If no matching thread is found then it indicates that the owner TID has died. [3] Invalid. The waiter is queued on a non PI futex [4] Valid state after exit_robust_list(), which sets the user space value to FUTEX_WAITERS | FUTEX_OWNER_DIED. [5] The user space value got manipulated between exit_robust_list() and exit_pi_state_list() [6] Valid state after exit_pi_state_list() which sets the new owner in the pi_state but cannot access the user space value. [7] pi_state->owner can only be NULL when the OWNER_DIED bit is set. [8] Owner and user space value match [9] There is no transient state which sets the user space TID to 0 except exit_robust_list(), but this is indicated by the FUTEX_OWNER_DIED bit. See [4] [10] There is no transient state which leaves owner and user space TID out of sync. Backport to 3.13 conflicts: kernel/futex.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Johansen <john.johansen@canonical.com> Cc: Kees Cook <keescook@chromium.org> Cc: Will Drewry <wad@chromium.org> Cc: Darren Hart <dvhart@linux.intel.com> Cc: stable@vger.kernel.org Change-Id: I967e3bb07b239941ea51af6b09d61d7b08138f40 Signed-off-by: flar2 <asegaert@gmail.com>

anvol and others added 30 commits July 10, 2014 11:27

Input: Send events one packet at a time

2871e23

Add /dev/frandom support

d8a1a21

Signed-off-by: flar2 <asegaert@gmail.com>

lib/string: use glibc version

0ccff0c

the performance of memcpy and memmove of the general version is very inefficient, this patch improved them. Signed-off-by: Miao Xie <miaox*******> Conflicts: lib/string.c Signed-off-by: flar2 <asegaert@gmail.com>

freezer: shorten freezer sleep time using exponential backoff

0b310f9

freezer: add new freezable helpers using freezer_do_not_count

f2d6027

freezer: add some missing functions

244781d

binfmt_elf.c: use get_random_int() to fix entropy depleting

80901af

AIO: Don't plug the I/O queue in do_io_submit()

2b4298a

AIO: merge bug fixed

0fa8779

Makefile: Linaro GCC 4.9

9c8b3ec

Linaro compile bugfixes

748e7cd

fix GCC 4.9 compilation errors

7f8bd0b

Linaro 4.9 compilation errors fix

6be1007

GPU: add optimization flags to GPU drivers

44c565c

Signed-off-by: Lens-F <tnoah12@gmail.com> Signed-off-by: flar2 <asegaert@gmail.com>

ARM CPU Topology and cpu_power driver

4aeb7ab

arm optimized crypto algorithms

36f06e2

Signed-off-by: flar2 <asegaert@gmail.com>

fix for app-mounted directories (thanks @mkasick)

123722e

flar2 and others added 13 commits July 13, 2014 18:03

fix pocket detection

535b07a

Signed-off-by: flar2 <asegaert@gmail.com>

allow moc-crypto modules to load

c64c36c

softirq: reduce latencies

9919ca3

sync: don't block the flusher thread waiting on IO

414c34c

reduce wlan_htc wakelock a little

1bde45b

Signed-off-by: flar2 <asegaert@gmail.com>

reduce wlan_ctrl_wake and wlan_rx_wake wakelocks

c186217

Signed-off-by: flar2 <asegaert@gmail.com>

Makefile adjustments

52dfaff

Signed-off-by: flar2 <asegaert@gmail.com>

futex: Prevent attaching to kernel threads

b4d695c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimization/kernel#2

Optimization/kernel#2
anvol wants to merge 43 commits into
boa19861105:masterfrom
anvol:optimization/kernel

anvol commented Sep 10, 2014

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

Conversation

anvol commented Sep 10, 2014

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants