[Bug 212812] www/chromium: tabs "hang" 10% of the time

bugzilla-noreply at freebsd.org
Thu Nov 2 15:13:38 UTC 2017


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212812

--- Comment #33 from Vince Mulhollon <vince.mulhollon at springcitysolutions.com> ---
The problem seems to have gotten worse over the last few months and the last
few upgrades, to the point that I recently had to switch to Firefox now that
the tab rendering success rate is well below 10%.

With no change in the rate of tab rendering failure, I have tried the following
from the bug report:

Disable V8 caching in chrome://flags (which also takes 10-20 attempts to
render)

Wipe .config/chromium (multiple times... first time I did that I wiped over a
gig of junk)

Disable hardware acceleration in chrome://settings ("Use hardware acceleration
when available")

Multiple machines: I have three desktops using NFS home / LDAP etc., so it's
trivial to try chromium across multiple machines, all with identical problems.

I had various older versions of the proprietary nvidia driver on my machines,
and I've upgraded them multiple times since the problem began.  Per nvidia-smi,
as of today I'm running the latest, "Driver Version: 384.59", and there is no
change in tab hanging.

Because I have multiple different machines on the desk, I've experienced the
same failure on six nvidia cards of various ages and models: two older (circa
2014) geforce gtx 730 connected via analog VGA at 1280x1024, one ancient (circa
2011) geforce gtx 560ti on a DVI connection at 1600x1200, and three new (circa
2017) geforce gtx 1050ti on a displayport connection at 2560x1440, 144 Hz.  I
have no hardware problems: CPU/GPU intensive workloads such as firefox or
minecraft run for hours without any crashes.  The bug I'm experiencing is
solely related to chromium tab opening failing 90%+ of the time; there is never
a graphics (or other subsystem) hang or kernel crash.  I can and do run
onshape.com for hours in
firefox, which is an online html5 professional CAD program which would seem to
torture test 3d graphics rendering and hardware acceleration in a html5
browser.  I have an onshape CAD drawing open in another tab in firefox while
I'm typing this; flawless operation, at least in firefox.  Whatever is wrong,
it probably doesn't relate to graphics hardware or hardware acceleration or
graphics drivers given that chromium hangs the same way and same rate of
failure regardless of whatever hardware I throw at it; admittedly I don't have
access to any non-nvidia graphics hardware.

I've tried a couple of combinations of the above: an old card with hardware
acceleration and caching disabled but the latest driver; a wiped config with no
caching but hardware acceleration enabled.  Sorry, but I didn't record all my
mixtures of experiments.  No change in failure rate under any condition.

One of the machines has front mounted drive trays; stick in a drive tray
holding a boot SSD with devuan (basically, a debian distro) linux installed and
everything works with every nvidia hardware card.  Ditto win10 and win7.  I
don't have drive trays on the other two freebsd machines.

I find it fascinating that the failure rate is not 100% or 0% but roughly 90%,
and that for a given version of chromium the long-run failure rate is constant
regardless of hardware or anything else I try.  A sparse and graphically bare
intranet page of a couple K of pure html and CSS and no JS fails as often as
some social networking site with 5 megs of spyware and ad links and JS.  I
opened and closed a gmail tab on another machine on my desk in chromium 13
times before it worked, whereas the intranet took 8 tries, but I've seen those
numbers reverse before.  The standard deviation is very high; sometimes
(although VERY rarely) tabs render in as few as three attempts.  It's been a
long time since a chromium tab rendered on the first try; maybe a year now.

All three machines are running SSDs.  Two have 1/4 TB raid1 mirror arrays with
zfs; the third, the drive tray machine, has a 120 GB SSD for freebsd.  Two
machines have 16 GB of RAM; one has 8 GB.  I'm playing with one machine as I'm
typing this, trying to open a gmail window, and top reports 2925M free with 0
swap use; whatever's wrong, it's not because it's starved for memory or because
cpu load is too high.  All three motherboards are older AMD64.

I work in a secured "armed guard" type of environment, so once I got a tab to
gmail to work, I was able to leave the tab open and running on a physically
secured machine; once a tab initially renders, if it doesn't hang when opened
of course, performance is excellent and there are no weird crashes even after
days of continuous use of the same tab.  No slowdowns, no bugs, no weird
rendering, no kernel crashes, no weird syslog lines, no memory leaks (none that
hit within a week or two anyway).  Either it fails to begin to render or it
works and that one tab will continue to work for at least a week of continuous
use.

I have access to an extremely large vmware cluster, so this morning I spun up a
freebsd host on the dev vlan and installed chromium 61.0.3163.100_1.  I can
connect to the vmware image using rdesktop, and it's slow, but tabs work 100% of
the time.  That is interesting because it's the same OS with the same
ansible-enforced package configuration and installation, yet on bare hardware
the chromium render hangs, while on virtualized hardware it works perfectly
(well, slowly, as you'd expect).  Neither ansible nor the OS "know or care"
that one machine is bare metal hardware and the other machine is a virtual
image; from an ansible configuration standpoint the only difference between the
three bare metal installs on my desk and the virtual image on the cluster is IP
addrs and hostname.  Obviously the hardware is different; the cluster is intel
hardware, and the cluster image devices are virtual, not bare metal and
silicon.

dmesg on the identical ansible-configured virtualized image reports "vga0:
<Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0".
/var/log/Xorg.0.log reports it's using the vmware x11 driver: "[    11.742] (--)
vmware(0): VMware SVGA regs at (0x1070, 0x1071)".  I believe there is a way to
pass GPU access through to a vmware image, and I'm obviously not doing that.

The fact that chromium renders tabs fine when an identical image is virtualized
and run without the nvidia driver, connected via an rdesktop session, would seem
to point to the nvidia driver's GPU acceleration as the problem.  Yet the nvidia
driver operates perfectly with all other software on the machine, including
extremely graphically intensive software, and the only symptom of any sort is
that newly created tabs fail to render almost all the time, and only in
chromium.  That seems to point to a problem with how chromium talks to the
nvidia driver WRT some kind of hardware acceleration that cannot be disabled
from options in the browser config.  It's also odd that the driver fails only
90% of the time, and that the failure rate appears unrelated to system
workload, complexity/size of the page to be rendered, or physical nvidia card
hardware model.

I wonder whether, if I opened 100000 tabs using some kind of GUI automation
software that might not even exist, the failure rate would converge to 15/16.
Perhaps when opening a tab, some random stack address or something has to line
up precisely on a 16 byte address boundary or it locks up the thread.  I
wouldn't even know where to begin to look, but I do have a gut level feeling
that the failure rate, if it could be measured, is currently exactly 15/16, and
somewhere a 128 bit long "something" is only being stored correctly 1/16th of
the time.
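
As a rough sanity check on the 15/16 guess, here is a minimal Python sketch
that treats each tab open as an independent attempt with a fixed success
probability (a geometric model, which is an assumption and not something
measured here).  The function names are made up for illustration; the only data
used are the two attempt counts reported above (13 tries for gmail, 8 for the
intranet page).

```python
from fractions import Fraction
from math import sqrt


def mle_success_probability(attempts_until_success):
    """Maximum-likelihood per-attempt success probability, geometric model.

    Each entry is the number of attempts up to and including the first
    tab that rendered.
    """
    return Fraction(len(attempts_until_success), sum(attempts_until_success))


def prob_success_within(k, p_success):
    """Probability that the first successful render comes within k attempts."""
    return 1 - (1 - p_success) ** k


def expected_attempts(p_success):
    """Mean number of attempts until the first successful render."""
    return 1 / p_success


def standard_error(p, n):
    """Approximate standard error of a rate estimated from n independent opens."""
    return sqrt(float(p) * (1 - float(p)) / n)


hypothesis = Fraction(1, 16)   # success 1/16 of the time, i.e. failure 15/16
observed = [13, 8]             # gmail and intranet attempt counts from above

p_hat = mle_success_probability(observed)
print("estimated failure rate:", float(1 - p_hat))
print("expected attempts under the 15/16 guess:", expected_attempts(hypothesis))
print("chance of rendering within 3 tries:", float(prob_success_within(3, hypothesis)))
print("standard error after 100000 opens:", standard_error(Fraction(15, 16), 100000))
```

Two observations are far too few to tell a 15/16 failure rate from a plain 90%;
the standard error line only gives a feel for how many automated tab opens it
would take to separate the two.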

If anyone has any ideas or suggestions for experiments, please advise.  I'm
outta ideas, and firefox works great, and the only reason I care anymore is my
chromebook uses chrome so it would be nice to sync my desktop and my chromebook
bookmarks, whatevs...

Have a pleasant day and thanks for your efforts thus far and in the future and
good luck with this tricky bug!
