Bug#863672: performance critical libyuv built with Os

Julian Taylor jtaylor.debian at googlemail.com
Mon May 29 21:14:38 UTC 2017


Package: firefox
Version:  53.0.is.52.0.2-1
Severity: normal


libyuv, which is a performance-critical library for firefox, is built
with -Os, which is terrible for its performance.
This applies in particular to row_common.cc, which contains the generic
parts of the color transformation code:

See:
https://buildd.debian.org/status/fetch.php?pkg=firefox&arch=amd64&ver=53.0.is.52.0.2-1&stamp=1492644908&raw=0

/usr/bin/g++ -std=gnu++11 -o row_common.o -c  ...   -fPIC
-DMOZILLA_CLIENT -include
/<<PKGBUILDDIR>>/build-browser/mozilla-config.h -MD -MP -MF
.deps/row_common.o.pp -Wdate-time -D_FORTIFY_SOURCE=2 -Wall
-Wc++11-compat -Wempty-body -Wignored-qualifiers -Woverloaded-virtual
-Wpointer-arith -Wsign-compare -Wtype-limits -Wunreachable-code
-Wwrite-strings -Wno-invalid-offsetof -Wc++14-compat
-Wno-error=maybe-uninitialized -Wno-error=deprecated-declarations
-Wno-error=array-bounds -fno-lifetime-dse -fstack-protector-strong
-Wformat -Werror=format-security -fno-schedule-insns2 -fno-lifetime-dse
-fno-delete-null-pointer-checks -fno-exceptions -fno-strict-aliasing
-fno-rtti -ffunction-sections -fdata-sections -fno-exceptions
-fno-math-errno -pthread -pipe  -g -freorder-blocks -Os
-fomit-frame-pointer
/<<PKGBUILDDIR>>/media/libyuv/source/row_common.cc


The problematic part is the YuvPixel function, which is called in loops
and in turn calls tiny clamp functions.
-Os heavily restricts inlining, so these calls end up as real function
calls, which causes massive overhead.
This is the top of the CPU profile on sites which e.g. display videos:
  17.25%  libxul.so                   [.] YuvPixel
   6.58%  libxul.so                   [.] Clamp
   6.46%  libxul.so                   [.] clamp255
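
To illustrate the call pattern, here is a minimal sketch. It is not the
actual libyuv source: the function bodies, the clamp0 helper, the names
YuvPixelSketch and YuvRowToBgrSketch, and the BT.601-style fixed-point
coefficients are all assumptions for illustration. The point is only that
each pixel goes through several tiny helpers, so every helper that is not
inlined costs a call per color channel per pixel.

```cpp
#include <cstddef>
#include <cstdint>

// Tiny helpers of the kind described above (hypothetical bodies).
static inline int32_t clamp0(int32_t v) { return v < 0 ? 0 : v; }
static inline int32_t clamp255(int32_t v) { return v > 255 ? 255 : v; }
static inline uint8_t Clamp(int32_t v) {
  return static_cast<uint8_t>(clamp255(clamp0(v)));
}

// Illustrative per-pixel conversion using common BT.601 integer
// coefficients scaled by 256 (an assumption, not libyuv's constants).
static inline void YuvPixelSketch(uint8_t y, uint8_t u, uint8_t v,
                                  uint8_t* b, uint8_t* g, uint8_t* r) {
  const int32_t c = (static_cast<int32_t>(y) - 16) * 298;
  const int32_t d = static_cast<int32_t>(u) - 128;
  const int32_t e = static_cast<int32_t>(v) - 128;
  // Three Clamp calls per pixel; each one calls two more helpers.
  *b = Clamp((c + 516 * d + 128) / 256);
  *g = Clamp((c - 100 * d - 208 * e + 128) / 256);
  *r = Clamp((c + 409 * e + 128) / 256);
}

// Generic row loop: one YuvPixelSketch call per pixel, writing B, G, R.
void YuvRowToBgrSketch(const uint8_t* y, const uint8_t* u, const uint8_t* v,
                       uint8_t* bgr, std::size_t width) {
  for (std::size_t x = 0; x < width; ++x) {
    YuvPixelSketch(y[x], u[x], v[x],
                   &bgr[3 * x], &bgr[3 * x + 1], &bgr[3 * x + 2]);
  }
}
```

At -O2 or -O3, gcc would normally be expected to inline all of these
helpers into the row loop. Under -Os they can remain separate symbols, as
the profile above shows. gcc's __attribute__((always_inline)) is one way
to force inlining regardless of the optimization level, though whether
that would be appropriate for libyuv is a separate question.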

The problem is not as bad as it looks, as this generic code is only
executed on machines that do not have SSSE3, AVX2 or NEON (see
convert_argb.cc).
But there are still plenty of useful CPUs that do not have these
instruction sets and are crippled by the compiler flags used.

Would it be possible to compile this library with -O3, so that the
compiler can vectorize it with the best available generic instruction
set (e.g. SSE2 on x64)?

cheers,
Julian Taylor
