VMware – Possible Clues to Failure?
OPINION/SPECULATION:
After more tests, I am still encountering the odd behaviour of VMware (14.1.2 with the vmmon patch) on kernel 4.18-rc1, but I did notice one possible clue..
The output of # systemctl status vmware.service includes the following:
Jun 18 01:02:03 rgtest vmware[5663]: Blocking file system[FAILED]
I think this should not be there, as VMBlock has been obsolete for a long time now, and its functionality has been taken over by kernel code, including FUSE (Filesystem in Userspace).. And… it turns out that there have been some changes to FUSE in 4.18-rc1: http://lkml.iu.edu/hypermail/linux/kernel/1806.0/01405.html
So.. it may be that changes to FUSE have caused VMware – including userland functions – to be ‘confused’?
More investigation needed, of course..
Robert Gadsdon. June 18, 2018.
Good catch, but I’m not sure it’s related to the FUSE changes. Starting the “blocking filesystem” leads to executing /usr/lib/vmware/bin/vmware-vmblock-fuse with the arguments “-o subtype=vmware-vmblock,default_permissions,allow_other /var/run/vmblock-fuse”. But vmware-vmblock-fuse actually uses a wrapper named appLoader, which first loads some dynamic libraries and then executes itself again with the same parameters (I’m not sure what the purpose of this exercise is, as execve() drops everything; maybe it sets some environment variables). On that second execution, though, the last parameter, “/var/run/vmblock-fuse”, is missing for some reason, which leads to the error message “fuse: missing mountpoint parameter”.
However, this “blocking filesystem” is AFAIK only used for directory sharing, which is something I don’t use. I rather suspect that the other strange problems are also symptoms of the same issue: appLoader somehow losing its last argument.
This looks promising: with 4.18-rc1, reading /proc/self/cmdline no longer returns the trailing null byte that is there with older kernels.
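To illustrate how a missing trailing null byte could cost a program its last argument, here is a minimal Python sketch. It is only a guess at the failure mode: appLoader is closed source, so the "naive" parser below is a hypothetical stand-in that counts only NUL-terminated strings, and the function names are invented for this example.

```python
from typing import List


def split_cmdline_naive(raw: bytes) -> List[bytes]:
    """Hypothetical parser: keeps only arguments followed by a NUL byte."""
    args = []
    start = 0
    while True:
        end = raw.find(b"\0", start)
        if end == -1:
            break  # an unterminated tail is silently dropped
        args.append(raw[start:end])
        start = end + 1
    return args


def split_cmdline_tolerant(raw: bytes) -> List[bytes]:
    """Parser that accepts the buffer with or without a trailing NUL."""
    if raw.endswith(b"\0"):
        raw = raw[:-1]
    return raw.split(b"\0") if raw else []


def read_own_cmdline() -> bytes:
    """Read this process's raw command line (Linux only)."""
    with open("/proc/self/cmdline", "rb") as f:
        return f.read()


# Simulated /proc/self/cmdline contents: with the trailing NUL (older
# kernels) and without it (as observed with 4.18-rc1).
OLD_STYLE = (b"vmware-vmblock-fuse\0-o\0"
             b"subtype=vmware-vmblock,default_permissions,allow_other\0"
             b"/var/run/vmblock-fuse\0")
NEW_STYLE = OLD_STYLE[:-1]
```

With the old-style buffer both parsers return four arguments. With the new-style buffer the naive one returns only three and the mountpoint is gone, which would match the “fuse: missing mountpoint parameter” symptom. Calling read_own_cmdline() and checking whether the result ends with b"\0" is a quick way to see which behaviour a given kernel has.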
This (kernel) patch should fix the issue with /proc/$pid/cmdline:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1715503.html
I’m sorry, this should be the correct version:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1715506.html
Thanks! I’ve applied the V2 version of the patch, and VMware behaves correctly, now..
Hopefully this patch will pass through the LKML ‘review process’ without too much hassle..
RG.