VMware – Possible Clues to Failure?

OPINION/SPECULATION:

After more tests, I am still encountering the odd behaviour of VMware (14.1.2 with vmmon patch) and Kernel 4.18-rc1, but did notice one possible clue..

Looking at the # systemctl status vmware.service output, it includes the following:

Jun 18 01:02:03 rgtest vmware[5663]: Blocking file system[FAILED]

I think this should not be there, as VMBlock has been obsolete for a long time now, and the functionality has been taken over by kernel code, including FUSE (Filesystem in Userspace).. And it turns out that there have been some changes to FUSE in 4.18-rc1: http://lkml.iu.edu/hypermail/linux/kernel/1806.0/01405.html

So.. it may be that changes to FUSE have caused VMware – including userland functions – to be ‘confused’?
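
As a first step in that investigation, it may be worth confirming that the running kernel has FUSE registered at all. A minimal sketch, assuming the usual /proc/filesystems layout; the function name fuse_registered is made up for illustration, and this says nothing about whether VMware's own use of FUSE is working:

```shell
#!/bin/sh
# fuse_registered [FILE]
# Succeeds if a filesystem type named exactly "fuse" is listed in FILE.
# FILE defaults to /proc/filesystems; the optional argument exists only
# so the check can be tried against a saved copy.
fuse_registered() {
    awk '$NF == "fuse" { found = 1 } END { exit !found }' "${1:-/proc/filesystems}"
}
```

Usage would be along the lines of: fuse_registered && echo "FUSE is registered". If FUSE is built as a module and not yet loaded, it will not be listed, so a negative result is only a hint.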

More investigation needed, of course..

Robert Gadsdon.   June 18, 2018.

VMware – Odd Behaviour with Kernel 4.18-rc1, and a Workaround?

VMware 14.1.2 worked OK with Kernel 4.17, on my Fedora 28 test system, but after updating to 4.18-rc1 the following occurred:

Modules (vmmon/vmnet) compiled OK, but – after further testing – appeared to have a runtime problem..    I applied the vmmon patches from Michal Kubeček, at https://github.com/mkubecek/vmware-host-modules/commit/3f2a6c720f68 and this also compiled cleanly.

When I ran # vmware-modconfig …. I got the following ‘error’:

# vmware-modconfig --console --install-all
..........................
Received option outside of allowed bounds. Option was -1
........................
Must use a valid mode. Use one of:
............................

So, I tried a manual compile/install of vmmon and vmnet, into /lib/modules/4.18.0-rc1/misc (after creating the misc sub-directory), and then ran # depmod -a, # modprobe vmmon and # modprobe vmnet..
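
The manual steps can be sketched roughly as below. This is an illustration only, and it assumes a directory (first argument) that already holds patched vmmon-only/ and vmnet-only/ source trees, each producing a .ko file in its own directory. The function names and the DRY_RUN switch are hypothetical, not anything VMware provides:

```shell
#!/bin/sh
# run CMD...: execute the command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# install_vmware_modules SRC_DIR [KERNEL_VERSION]
# ASSUMPTION: SRC_DIR contains vmmon-only/ and vmnet-only/ (already patched).
install_vmware_modules() {
    src=$1
    kver=${2:-$(uname -r)}
    dest=/lib/modules/$kver/misc

    run mkdir -p "$dest"                      # create the misc sub-directory
    for mod in vmmon vmnet; do
        run make -C "$src/$mod-only"          # build the module
        run cp "$src/$mod-only/$mod.ko" "$dest/"
    done
    run depmod -a "$kver"                     # refresh module dependencies
    for mod in vmmon vmnet; do
        run modprobe "$mod"                   # load the freshly built modules
    done
}
```

Running it first as DRY_RUN=1 install_vmware_modules /path/to/sources 4.18.0-rc1 prints the commands without touching the system; the real run needs root.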

But then:

# service vmware start
Starting vmware (via systemctl): Job for vmware.service failed because the control process exited with error code.
See "systemctl status vmware.service" and "journalctl -xe" for details.
                                               [FAILED]

and:

# systemctl status vmware.service
● vmware.service - SYSV: This service starts and stops VMware services
Loaded: loaded (/etc/rc.d/init.d/vmware; generated)
Active: failed (Result: exit-code) since Mon 2018-06-18 01:02:04 PDT; 3min 7s ago
Docs: man:systemd-sysv-generator(8)
Process: 5663 ExecStart=/etc/rc.d/init.d/vmware start (code=exited, status=1/FAILURE)
CGroup: /system.slice/vmware.service
├─3879 /usr/bin/vmnet-bridge -s 6 -d /var/run/vmnet-bridge-0.pid -n 0
├─3933 /usr/bin/vmnet-netifup -s 6 -d /var/run/vmnet-netifup-vmnet1.pid /dev/vmnet1 vmnet1
├─3953 /usr/bin/vmnet-dhcpd -s 6 -cf /etc/vmware/vmnet1/dhcpd/dhcpd.conf -lf /etc/vmware/vmnet1/dhcpd/dhcpd.leases -pf /var/>
├─3964 /usr/bin/vmnet-natd -s 6 -m /etc/vmware/vmnet8/nat.mac -c /etc/vmware/vmnet8/nat/nat.conf
├─3969 /usr/bin/vmnet-netifup -s 6 -d /var/run/vmnet-netifup-vmnet8.pid /dev/vmnet8 vmnet8
├─3985 /usr/bin/vmnet-dhcpd -s 6 -cf /etc/vmware/vmnet8/dhcpd/dhcpd.conf -lf /etc/vmware/vmnet8/dhcpd/dhcpd.leases -pf /var/>
└─4025 /usr/sbin/vmware-authdlauncher

Jun 18 01:02:03 rgtest vmware[5663]: Blocking file system[FAILED]
Jun 18 01:02:04 rgtest vmware[5663]: Virtual ethernet[ OK ]
Jun 18 01:02:04 rgtest vmware[5663]: VMware Authentication Daemon[ OK ]
Jun 18 01:02:04 rgtest systemd[1]: vmware.service: Control process exited, code=exited status=1
Jun 18 01:02:04 rgtest systemd[1]: vmware.service: Failed with result 'exit-code'.
Jun 18 01:02:04 rgtest systemd[1]: Failed to start SYSV: This service starts and stops VMware services.
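
To pick out which init-script step actually failed from output like the above, a small filter can help. This is a sketch that assumes the log lines keep the "vmware[PID]: Step name[FAILED]" shape shown here; the function name failed_steps is made up:

```shell
#!/bin/sh
# failed_steps: read log lines on stdin and print the name of every
# step that the vmware init script marked as [FAILED].
failed_steps() {
    sed -n 's/^.*vmware\[[0-9]*]: \(.*\)\[FAILED].*$/\1/p'
}
```

For example: journalctl -u vmware.service -b --no-pager | failed_steps. With the log above this should print only "Blocking file system".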

The userland # vmware command did nothing, and just returned to a command prompt..

After more testing, I found that using the # /usr/lib/vmware/bin/vmware start command actually worked, and resulted in the normal VMware graphical window appearing, and guest o/s (WinXP, and Fedora) started OK, and guest networking seemed to work correctly..

So..  It would appear that the various error messages are somewhat confused, and VMware does actually work, although not in the normal way..

Robert Gadsdon.   June 18, 2018

Kernel – 4.18-rc1 released – Breaks NVIDIA, Fix Available.. Not OK with VMware

Kernel 4.18-rc1 has been released – earlier than expected, but Linus is in Japan, where it was already Sunday..

Brief details are here:  http://lkml.iu.edu/hypermail/linux/kernel/1806.2/00125.html

Tested with VMware 14.1.2, and vmmon/vmnet compile OK, but NVIDIA fails:

 ...............................
CC [M] /home/rgadsdon/kernel/NVIDIA-Linux-x86_64-390.67/kernel/nvidia-drm/nvidia-drm-drv.o
In file included from /usr/src/linux-4.18-rc1/include/drm/drmP.h:82,
from /home/rgadsdon/kernel/NVIDIA-Linux-x86_64-390.67/kernel/nvidia-drm/nvidia-drm-priv.h:30,
from /home/rgadsdon/kernel/NVIDIA-Linux-x86_64-390.67/kernel/nvidia-drm/nvidia-drm-drv.c:25:
/home/rgadsdon/kernel/NVIDIA-Linux-x86_64-390.67/kernel/nvidia-drm/nvidia-drm-drv.c:637:23: error: ‘DRM_CONTROL_ALLOW’ undeclared here (not in a function); did you mean ‘DRM_RENDER_ALLOW’?
DRM_CONTROL_ALLOW|DRM_UNLOCKED),
^~~~~~~~~~~~~~~~~
/usr/src/linux-4.18-rc1/include/drm/drm_ioctl.h:162:12: note: in definition of macro ‘DRM_IOCTL_DEF_DRV’
.flags = _flags, \
^~~~~~
make[3]: *** [/usr/src/linux-4.18-rc1/scripts/Makefile.build:318: /home/rgadsdon/kernel/NVIDIA-Linux-x86_64-390.67/kernel/nvidia-drm/nvidia-drm-drv.o] Error 1
........................

Thanks to HERB, there is a fix for this, at http://mom.hlmjr.com/2018/06/11/nvidia-drivers-390-67-vs-kernel-4-17/

This patch is for 390.67 specifically, and would need modifying for other versions..     I have tested the patch, and it applies cleanly, and 390.67 compiles OK with 4.18-rc1..
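
The compile error suggests that DRM_CONTROL_ALLOW is no longer declared in the 4.18-rc1 headers. A quick way to check a given kernel source tree before building the NVIDIA module is sketched below. The function name header_has_symbol is hypothetical, and the default header path is simply the one named in the error output above:

```shell
#!/bin/sh
# header_has_symbol SRC_DIR SYMBOL [HEADER]
# Succeeds if SYMBOL appears anywhere in HEADER under SRC_DIR.
# HEADER defaults to include/drm/drm_ioctl.h (taken from the build log).
header_has_symbol() {
    src=$1
    sym=$2
    hdr=${3:-include/drm/drm_ioctl.h}
    grep -qF -- "$sym" "$src/$hdr" 2>/dev/null
}
```

For example: header_has_symbol /usr/src/linux-4.18-rc1 DRM_CONTROL_ALLOW || echo "symbol missing, NVIDIA patch probably needed". A plain text match like this can also hit comments, so treat the result as a hint rather than proof.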

UPDATE:  After further testing…  VMware compiles OK, but runtime fails, and even vmware-modconfig does not work:

# vmware-modconfig --console --install-all
[AppLoader] GLib does not have GSettings support.   <--this is an existing warning (non-fatal) 
Received option outside of allowed bounds. Option was -1
Must use a valid mode. Use one of: ......

See comment below, from Michal Kubeček, with more info..    More testing is needed, and the results will be in a new article..

Robert Gadsdon.   June 17, 2018.

Kernel – 4.17 Released – OK with Latest VMware and NVIDIA..

Kernel 4.17 is out, and details of changes from -rc7 are here: http://lkml.iu.edu/hypermail/linux/kernel/1806.0/01332.html

No real surprises, so far, and it is OK with VMware 14.1.2, and NVIDIA 396.24.

Apparently the major version bump to Linux 5.0 might occur at around the Kernel 4.20 mark….

Robert Gadsdon   June 3, 2018.

Fedora – F28 Annoyances, and Workarounds..

Running Fedora 28 on the test system, I have encountered the following issues with various applications, and found substitute rpm workarounds.. Usually, these can be installed by downloading them, and then using # rpm -Uvh......... --force..

Grub-customizer still crashes, but the F27 version works ( grub-customizer-5.0.6-6.fc27.x86_64 )

Wine crashes with some windows apps, and the workaround is – again – to download/install the F27 versions of all wine rpms – I am currently using wine-3.9-1.fc27.

HandBrake (from rpmfusion) fails with pixellated images in some cases, and hangs in others, but the sourceforge F28 versions work correctly ( currently handbrake-cli-1.1.0-2.gitb463d33.fc28.x86_64 and handbrake-gui-1.1.0-2.gitb463d33.fc28.x86_64 ). I had tried installing the F27 version, but this involved several ‘required library version’ mismatches and conflicts…

As I mentioned in a previous article, even (re)creating from source does not work, including compiling from the original source tree..   It would appear that there are still issues with GCC 8, included with this Fedora release…

Hopefully all this will soon be fixed…

Robert Gadsdon.  June 1, 2018.

VMware – 14.1.2 Released – Fixes Annoying Bug..

VMware 14.1.2 has been released, and details are here:  https://docs.vmware.com/en/VMware-Workstation-Pro/14/rn/workstation-1412-release-notes.html

This version fixes the annoying ‘window resizing’ bug, which caused the guest to suddenly flip from fullscreen to windowed, and then attempt to resize the window.

As with 14.1.1, this version is OK with the latest Linux kernels (tested with 4.16.10 and 4.17-rc6).

Robert Gadsdon.   May 22, 2018.

Kernel – 4.17-rc6 Out – Includes GCC8 Compile Fixes..

Kernel 4.17-rc6 has been released, and details of changes since -rc5 are here:  http://lkml.iu.edu/hypermail/linux/kernel/1805.2/03965.html

This version is still OK with the latest VMware (14.1.1) and NVIDIA (396.24) drivers, but also includes fixes for repeated GCC8 / objtool warnings at compile time:

……………….
Josh Poimboeuf (5):
……….
objtool: Support GCC 8’s cold subfunctions
objtool: Support GCC 8 switch tables
…………….

I have confirmed these with GCC 8.1.1 (Fedora 28)..

Robert Gadsdon.  May 20, 2018.

Kernel – GCC8 / ‘Objtool Warnings’ Patches..

There are now patches available to deal with the host of ‘objtool‘ warnings when compiling the kernel with GCC8:

...................................
drivers/video/fbdev/core/fbmem.o: warning: objtool: fb_set_var()+0x209: sibling call from callable instruction with modified stack frame
drivers/video/fbdev/core/fbmem.o: warning: objtool: do_remove_conflicting_framebuffers()+0xa6: sibling call from callable instruction with modified stack frame
drivers/video/fbdev/core/fbmem.o: warning: objtool: register_framebuffer()+0x14b: sibling call from callable instruction with modified stack frame
................... etc....

More details in this thread:  http://lkml.iu.edu/hypermail/linux/kernel/1805.1/02193.html

I have applied the 3 patches, and with Kernel 4.17 (4.17-rc5) the compile is now – relatively – free from ‘warnings’, apart from a few after executing # make xconfig..    The patches also apply cleanly to 4.16 (4.16.8) and do also remove the objtool-related warnings, but still leave a number of ‘syscall‘ and other warnings, which do not occur with 4.17..:

..................
./include/linux/compat.h:52:18: warning: ‘compat_sys_x86_clone’ alias between functions of incompatible types ‘long int(long unsigned int, long unsigned int, int *, long unsigned int, int *)’ and ‘long int(long int, long int, long int, long int, long int)’ [-Wattribute-alias]
 asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))\
 ^~~~~~~~~~
./include/linux/compat.h:47:2: note: in expansion of macro ‘COMPAT_SYSCALL_DEFINEx’
 COMPAT_SYSCALL_DEFINEx(5, _##name, __VA_ARGS__)
 ^~~~~~~~~~~~~~~~~~~~~~
arch/x86/ia32/sys_ia32.c:240:1: note: in expansion of macro ‘COMPAT_SYSCALL_DEFINE5’
 COMPAT_SYSCALL_DEFINE5(x86_clone, unsigned long, clone_flags,
 ^~~~~~~~~~~~~~~~~~~~~~
./include/linux/compat.h:56:18: note: aliased declaration here
 asmlinkage long compat_SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__))\
 ^~~~~~~~~~
./include/linux/compat.h:47:2: note: in expansion of macro ‘COMPAT_SYSCALL_DEFINEx’
 COMPAT_SYSCALL_DEFINEx(5, _##name, __VA_ARGS__)
 ^~~~~~~~~~~~~~~~~~~~~~
arch/x86/ia32/sys_ia32.c:240:1: note: in expansion of macro ‘COMPAT_SYSCALL_DEFINE5’
 COMPAT_SYSCALL_DEFINE5(x86_clone, unsigned long, clone_flags,
..............  (repeated) ...............

So – at least with 4.17 Kernels, the GCC8 warnings have been dealt with..   These patches are still work-in-progress, but hopefully will make it to mainline soon..

Robert Gadsdon.  May 14, 2018.

GCC – Update to 8.1.1 – Still Gives Kernel Compile Warnings..

Fedora 28 has GCC 8.1.1 available (currently in ‘updates-testing‘ repo), and I tested this with Kernel 4.17-rc4, to see if there was any improvement in the horde of warnings generated with 8.0.1.

In short, there is no real difference:

....................................
 CC security/keys/request_key.o
arch/x86/kvm/lapic.o: warning: objtool: kvm_lapic_reg_read()+0x126: sibling call from callable instruction with modified stack frame
arch/x86/kvm/lapic.o: warning: objtool: limit_periodic_timer_frequency.part.16()+0x3b: sibling call from callable instruction with modified stack frame
arch/x86/kvm/lapic.o: warning: objtool: kvm_irq_delivery_to_apic_fast()+0x373: sibling call from callable instruction with modified stack frame
arch/x86/kvm/lapic.o: warning: objtool: kvm_intr_is_single_vcpu_fast()+0x286: sibling call from callable instruction with modified stack frame
arch/x86/kvm/lapic.o: warning: objtool: kvm_lapic_set_base()+0xf0: sibling call from callable instruction with modified stack frame
arch/x86/kvm/lapic.o: warning: objtool: kvm_create_lapic()+0x39: sibling call from callable instruction with modified stack frame
arch/x86/kvm/lapic.o: warning: objtool: kvm_lapic_set_base.cold.24()+0x16: sibling call from callable instruction with modified stack frame
..................... etc, etc....................

There were patches proposed for the kernel, for GCC 8, but these would not appear to have made it into mainline, yet.. There may be some debate over whether the fix should be in GCC, or in kernel code… Older versions of the kernel (tested with 4.9.98) compile more ‘cleanly’..
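
To compare how ‘clean’ two builds are, the objtool warnings in a saved build log can be counted per object file. A rough sketch, assuming the warning lines look like the ones quoted above; the function name is made up:

```shell
#!/bin/sh
# objtool_warnings_per_object: read a kernel build log on stdin and print
# "COUNT OBJECT" for each object file with objtool warnings, highest first.
objtool_warnings_per_object() {
    grep 'warning: objtool:' |
        cut -d: -f1 |            # keep only the object file path
        sort | uniq -c |         # count warnings per object
        sort -rn |               # most warnings first
        awk '{ print $1, $2 }'   # normalise the spacing from uniq -c
}
```

For example, after saving the output with # make 2>&1 | tee build.log, run objtool_warnings_per_object < build.log for each kernel/compiler combination and compare the results.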

Robert Gadsdon.    May 7th, 2018.

Fedora – F28 GCC 8 Now OK with Kernel 4.16.7, and fix for Grub-Customizer..

Tried Kernel 4.16.7 on Fedora 28 (GCC 8.0.1), and this now compiles successfully, although still with a lot of ‘warnings’..

I use grub-customizer with each new kernel, and the F28 version of this fails:

# grub-customizer
Error creating proxy: The connection is closed (g-io-error-quark, 18)
 *** initializing (w/o specified bootloader type)…
 * reading partition info…
/usr/include/c++/8/bits/basic_string.h:1048: std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::reference std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::operator[](std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::size_type) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>; std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::reference = char&; std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::size_type = long unsigned int]: Assertion '__pos <= size()' failed.
Aborted (core dumped)

The workaround is to install the F27 version:

rpm -Uvh grub-customizer-5.0.6-6.fc27.x86_64.rpm --force

– and this works OK, again..
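
Before forcing an older package over a newer one, it can be handy to confirm which Fedora release a downloaded rpm was built for. A small sketch that only looks at the file name, assuming the usual name-version-release.fcNN.arch.rpm pattern; the function name rpm_dist_tag is made up:

```shell
#!/bin/sh
# rpm_dist_tag NAME: print the Fedora dist tag (e.g. fc27) found in an
# rpm file name or package name; prints nothing if there is none.
rpm_dist_tag() {
    printf '%s\n' "$1" |
        sed -n 's/.*\.\(fc[0-9][0-9]*\)\(\..*\)\{0,1\}$/\1/p'
}
```

For example, rpm_dist_tag grub-customizer-5.0.6-6.fc27.x86_64.rpm should print fc27. This only inspects the name, not the package contents.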

Robert Gadsdon.   May 2, 2018.