Add ARM NEON intrinsic optimizations for SincResampler.
On an Exynos board these yielded a ~2.3x speedup:
Benchmarking 50000000 iterations:
Convolve_C took 5682.71ms.
Convolve_NEON (unaligned) took 2451.18ms; which is 2.32x faster than Convolve_C.
Convolve_NEON (aligned) took 2397.01ms; which is 2.37x faster than Convolve_C and 1.02x faster than Convolve_NEON (unaligned).
BUG=none
TEST=try bot, fischman.
Review URL: https://codereview.chromium.org/10960023
git-svn-id: svn://svn.chromium.org/chrome/trunk/src@158870 0039d316-1c4b-4281-b951-d872f2087c98
4 files changed