summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-01-18pci/shpc: Move function to generic header fileYuval Shaia
This function should be declared in generic header file so we can utilize it. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18virtio: improve virtio devices initialization timeGal Hammer
The loading time of a VM is quite significant when its virtio devices use a large amount of virt-queues (e.g. a virtio-serial device with max_ports=511). Most of the time is spend in the creation of all the required event notifiers (ioeventfd and memory regions). This patch pack all the changes to the memory regions in a single memory transaction. Reported-by: Sitong Liu <siliu@redhat.com> Reported-by: Xiaoling Gao <xiagao@redhat.com> Signed-off-by: Gal Hammer <ghammer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18virtio: postpone the execution of event_notifier_cleanup functionGal Hammer
Use the EventNotifier's cleanup callback function to execute the event_notifier_cleanup function after kvm unregistered the eventfd. This change supports running the virtio_bus_set_host_notifier function inside a memory region transaction. Otherwise, a closed fd is sent to kvm, which results in a failure. Signed-off-by: Gal Hammer <ghammer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18qemu: add a cleanup callback function to EventNotifierGal Hammer
Adding a cleanup callback function to the EventNotifier struct which allows users to execute event_notifier_cleanup in a different context. Signed-off-by: Gal Hammer <ghammer@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18contrib/vhost-user-blk: introduce a vhost-user-blk sample applicationChangpeng Liu
This commit introduces a vhost-user-blk backend device, it uses UNIX domain socket to communicate with QEMU. The vhost-user-blk sample application should be used with QEMU vhost-user-blk-pci device. To use it, complie with: make vhost-user-blk and start like this: vhost-user-blk -b /dev/sdb -s /path/vhost.socket Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18contrib/libvhost-user: enable virtio config space messagesChangpeng Liu
Enable VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages in libvhost-user library, users can implement their own I/O target based on the library. This enable the virtio config space delivered between QEMU host device and the I/O target. Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18vhost-user-blk: introduce a new vhost-user-blk host deviceChangpeng Liu
This commit introduces a new vhost-user device for block, it uses a chardev to connect with the backend, same with Qemu virito-blk device, Guest OS still uses the virtio-blk frontend driver. To use it, start QEMU with command line like this: qemu-system-x86_64 \ -chardev socket,id=char0,path=/path/vhost.socket \ -device vhost-user-blk-pci,chardev=char0,num-queues=2, \ bootindex=2... \ Users can use different parameters for `num-queues` and `bootindex`. Different with exist Qemu virtio-blk host device, it makes more easy for users to implement their own I/O processing logic, such as all user space I/O stack against hardware block device. It uses the new vhost messages(VHOST_USER_GET_CONFIG) to get block virtio config information from backend process. Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18vhost-user: add new vhost user messages to support virtio config spaceChangpeng Liu
Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages which can be used for live migration of vhost user devices, also vhost user devices can benefit from the messages to get/set virtio config space from/to the I/O target. For the purpose to support virtio config space change, VHOST_USER_SLAVE_CONFIG_CHANGE_MSG message is added as the event notifier in case virtio config space change in the slave I/O target. Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18MAINTAINERS: Add myself as maintainer to X86 machinesMarcel Apfelbaum
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2018-01-17' into ↵Peter Maydell
staging pull-nbd-2018-01-17 - Vladimir Sementsov-Ogievskiy/Eric Blake: 0/6 NBD server refactoring # gpg: Signature made Thu 18 Jan 2018 02:21:55 GMT # gpg: using RSA key 0xA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" # gpg: aka "[jpeg image of size 6874]" # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2018-01-17: nbd/server: structurize option reply sending nbd/server: Add helper functions for parsing option payload nbd/server: Add va_list form of nbd_negotiate_send_rep_err() nbd/server: Better error for NBD_OPT_EXPORT_NAME failure nbd/server: refactor negotiation functions parameters nbd/server: Hoist nbd_reject_length() earlier Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-01-18Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into ↵Peter Maydell
staging x86 queue, 2018-01-17 Highlight: new CPU models that expose CPU features that guests can use to mitigate CVE-2017-5715 (Spectre variant #2). # gpg: Signature made Thu 18 Jan 2018 02:00:03 GMT # gpg: using RSA key 0x2807936F984DC5A6 # gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" # Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6 * remotes/ehabkost/tags/x86-pull-request: i386: Add EPYC-IBPB CPU model i386: Add new -IBRS versions of Intel CPU models i386: Add FEAT_8000_0008_EBX CPUID feature word i386: Add spec-ctrl CPUID bit i386: Add support for SPEC_CTRL MSR i386: Change X86CPUDefinition::model_id to const char* target/i386: add clflushopt to "Skylake-Server" cpu model pc: add 2.12 machine types Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-01-18Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.12-20180117' ↵Peter Maydell
into staging ppc patch queue 2017-01-17 Another pull request for ppc related patches. The most interesting thing here is the new capabilities framework for the pseries machine type. This gives us better handling of several existing incompatibilities between TCG, PR and HV KVM, as well as new ones that arise with POWER9. Further, it will allow reasonable handling of the advertisement of features necessary to mitigate the recent CVEs (Spectre and Meltdown). In addition there's: * Improvide handling of different "vsmt" modes * Significant enhancements to the "pnv" machine type * Assorted other bugfixes # gpg: Signature made Wed 17 Jan 2018 02:21:50 GMT # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.12-20180117: (22 commits) target-ppc: Fix booke206 tlbwe TLB instruction target/ppc: add support for POWER9 HILE ppc/pnv: change initrd address ppc/pnv: fix XSCOM core addressing on POWER9 ppc/pnv: introduce pnv*_is_power9() helpers ppc/pnv: change core mask for POWER9 ppc/pnv: use POWER9 DD2 processor tests/boot-serial-test: fix powernv support ppc/pnv: Update skiboot firmware image spapr: Adjust default VSMT value for better migration compatibility spapr: Allow some cases where we can't set VSMT mode in the kernel target/ppc: Clarify compat mode max_threads value ppc: Change Power9 compat table to support at most 8 threads/core spapr: Remove unnecessary 'options' field from sPAPRCapabilityInfo hw/ppc/spapr_caps: Rework spapr_caps to use uint8 internal representation spapr: Handle Decimal Floating Point (DFP) as an optional capability spapr: Handle VMX/VSX presence as an spapr capability flag target/ppc: Clean up probing of VMX, VSX and DFP availability on KVM spapr: Validate capabilities on migration spapr: Treat Hardware Transactional Memory (HTM) as an optional capability ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-01-18cocoa.m: Fix scroll wheel supportJohn Arbuckle
When using a mouse's scroll wheel in a guest with the cocoa front-end, the mouse pointer moves up and down instead of scrolling the window. This patch fixes this problem. Signed-off-by: John Arbuckle <programmingkidx@gmail.com> Message-id: 20180108180707.7976-1-programmingkidx@gmail.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-01-17nbd/server: structurize option reply sendingVladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20171122101958.17065-6-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2018-01-17nbd/server: Add helper functions for parsing option payloadEric Blake
Rather than making every callsite perform length sanity checks and error reporting, add the helper functions nbd_opt_read() and nbd_opt_drop() that use the length stored in the client struct; also add an assertion that optlen is 0 before any option (ie. any previous option was fully handled), complementing the assertion added in an earlier patch that optlen is 0 after all negotiation completes. Note that the call in nbd_negotiate_handle_export_name() does not use the new helper (in part because the server cannot reply to NBD_OPT_EXPORT_NAME - it either succeeds or the connection drops). Based on patches by Vladimir Sementsov-Ogievskiy. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20180110230825.18321-6-eblake@redhat.com>
2018-01-17nbd/server: Add va_list form of nbd_negotiate_send_rep_err()Eric Blake
This will be useful for the next patch. Based on a patch by Vladimir Sementsov-Ogievskiy Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20180110230825.18321-5-eblake@redhat.com>
2018-01-17nbd/server: Better error for NBD_OPT_EXPORT_NAME failureEric Blake
When a client abruptly disconnects before we've finished reading the name sent with NBD_OPT_EXPORT_NAME, we are better off logging the failure as EIO (we can't communicate with the client), rather than EINVAL (the client sent bogus data). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20180110230825.18321-4-eblake@redhat.com>
2018-01-17nbd/server: refactor negotiation functions parametersVladimir Sementsov-Ogievskiy
Instead of passing currently negotiating option and its length to many of negotiation functions let's just store them on NBDClient struct to be state-variables of negotiation phase. This unifies semantics of negotiation functions and allows tracking changes of remaining option length in future patches. Asssert that optlen is back to 0 after negotiation (including old-style connections which don't negotiate), although we need more patches before we can assert optlen is 0 between options during negotiation. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20171122101958.17065-2-vsementsov@virtuozzo.com> [eblake: rebase, commit message tweak, assert !optlen after negotiation completes] Signed-off-by: Eric Blake <eblake@redhat.com>
2018-01-17nbd/server: Hoist nbd_reject_length() earlierEric Blake
No semantic change, but will make it easier for an upcoming patch to refactor code without having to add forward declarations. Fix a poor comment while at it. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20180110230825.18321-2-eblake@redhat.com>
2018-01-17i386: Add EPYC-IBPB CPU modelEduardo Habkost
EPYC-IBPB is a copy of the EPYC CPU model with just CPUID_8000_0008_EBX_IBPB added. Cc: Jiri Denemark <jdenemar@redhat.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-7-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17i386: Add new -IBRS versions of Intel CPU modelsEduardo Habkost
The new MSR IA32_SPEC_CTRL MSR was introduced by a recent Intel microcode updated and can be used by OSes to mitigate CVE-2017-5715. Unfortunately we can't change the existing CPU models without breaking existing setups, so users need to explicitly update their VM configuration to use the new *-IBRS CPU model if they want to expose IBRS to guests. The new CPU models are simple copies of the existing CPU models, with just CPUID_7_0_EDX_SPEC_CTRL added and model_id updated. Cc: Jiri Denemark <jdenemar@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-6-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17i386: Add FEAT_8000_0008_EBX CPUID feature wordEduardo Habkost
Add the new feature word and the "ibpb" feature flag. Based on a patch by Paolo Bonzini. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-5-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17i386: Add spec-ctrl CPUID bitEduardo Habkost
Add the feature name and a CPUID_7_0_EDX_SPEC_CTRL macro. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-4-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17i386: Add support for SPEC_CTRL MSRPaolo Bonzini
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-3-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17i386: Change X86CPUDefinition::model_id to const char*Eduardo Habkost
It is valid to have a 48-character model ID on CPUID, however the definition of X86CPUDefinition::model_id is char[48], which can make the compiler drop the null terminator from the string. If a CPU model happens to have 48 bytes on model_id, "-cpu help" will print garbage and the object_property_set_str() call at x86_cpu_load_def() will read data outside the model_id array. We could increase the array size to 49, but this would mean the compiler would not issue a warning if a 49-char string is used by mistake for model_id. To make things simpler, simply change model_id to be const char*, and validate the string length using an assert() on x86_register_cpudef_type(). Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20180109154519.25634-2-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17target/i386: add clflushopt to "Skylake-Server" cpu modelHaozhong Zhang
CPUID_7_0_EBX_CLFLUSHOPT is missed in current "Skylake-Server" cpu model. Add it to "Skylake-Server" cpu model on pc-i440fx-2.12 and pc-q35-2.12. Keep it disabled in "Skylake-Server" cpu model on older machine types. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Message-Id: <20171219033730.12748-3-haozhong.zhang@intel.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17pc: add 2.12 machine typesHaozhong Zhang
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Message-Id: <20171219033730.12748-2-haozhong.zhang@intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-01-17target-ppc: Fix booke206 tlbwe TLB instructionLuc MICHEL
When overwritting a valid TLB entry with a new one, the previous page were not flushed in QEMU TLB, leading to incoherent mapping. This commit fixes this. Signed-off-by: Luc MICHEL <luc.michel@git.antfield.fr> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17target/ppc: add support for POWER9 HILECédric Le Goater
Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17ppc/pnv: change initrd addressCédric Le Goater
When skiboot starts, it first clears the CPU structs for all possible CPUs on a system : for (i = 0; i <= cpu_max_pir; i++) memset(&cpu_stacks[i].cpu, 0, sizeof(struct cpu_thread)); On POWER9, cpu_max_pir is quite big, 0x7fff, and the skiboot cpu_stacks array overlaps with the memory region in which QEMU maps the initramfs file. Move it upwards in memory to keep it safe. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17ppc/pnv: fix XSCOM core addressing on POWER9Cédric Le Goater
The XSCOM base address of the core chiplet was wrongly calculated. Use the OPAL macros to fix that and do a couple of renames. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17ppc/pnv: introduce pnv*_is_power9() helpersCédric Le Goater
These are useful when instantiating device models which are shared between the POWER8 and the POWER9 processor families. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17ppc/pnv: change core mask for POWER9Cédric Le Goater
When addressed by XSCOM, the first core has the 0x20 chiplet ID but the CPU PIR can start at 0x0. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17ppc/pnv: use POWER9 DD2 processorCédric Le Goater
commit 1ed9c8af501f ("target/ppc: Add POWER9 DD2.0 model information") deprecated the POWER9 model v1.0. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17tests/boot-serial-test: fix powernv supportCédric Le Goater
Recent commit introduced the firmware image skiboot 5.9 which has a different first line ouput. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17ppc/pnv: Update skiboot firmware imageCédric Le Goater
This is skiboot 5.9 (commit e0ee24c2). It brings improved POWER9 support among many other things. Built from submodule. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17spapr: Adjust default VSMT value for better migration compatibilityDavid Gibson
fa98fbfc "PC: KVM: Support machine option to set VSMT mode" introduced the "vsmt" parameter for the pseries machine type, which controls the spacing of the vcpu ids of thread 0 for each virtual core. This was done to bring some consistency and stability to how that was done, while still allowing backwards compatibility for migration and otherwise. The default value we used for vsmt was set to the max of the host's advertised default number of threads and the number of vthreads per vcore in the guest. This was done to continue running without extra parameters on older KVM versions which don't allow the VSMT value to be changed. Unfortunately, even that smaller than before leakage of host configuration into guest visible configuration still breaks things. Specifically a guest with 4 (or less) vthread/vcore will get a different vsmt value when running on a POWER8 (vsmt==8) and POWER9 (vsmt==4) host. That means the vcpu ids don't line up so you can't migrate between them, though you should be able to. Long term we really want to make vsmt == smp_threads for sufficiently new machine types. However, that means that qemu will then require a sufficiently recent KVM (one which supports changing VSMT) - that's still not widely enough deployed to be really comfortable to do. In the meantime we need some default that will work as often as possible. This patch changes that default to 8 in all circumstances. This does change guest visible behaviour (including for existing machine versions) for many cases - just not the most common/important case. Following is case by case justification for why this is still the least worst option. Note that any of the old behaviours can still be duplicated after this patch, it's just that it requires manual intervention by setting the vsmt property on the command line. KVM HV on POWER8 host: This is the overwhelmingly common case in production setups, and is unchanged by design. POWER8 hosts will advertise a default VSMT mode of 8, and > 8 vthreads/vcore isn't permitted KVM HV on POWER7 host: Will break, but POWER7s allowing KVM were never released to the public. KVM HV on POWER9 host: Not yet released to the public, breaking this now will reduce other breakage later. KVM HV on PowerPC 970: Will theoretically break it, but it was barely supported to begin with and already required various user visible hacks to work. Also so old that I just don't care. TCG: This is the nastiest one; it means migration of TCG guests (without manual vsmt setting) will break. Since TCG is rarely used in production I think this is worth it for the other benefits. It does also remove one more barrier to TCG<->KVM migration which could be interesting for debugging applications. KVM PR: As with TCG, this will break migration of existing configurations, without adding extra manual vsmt options. As with TCG, it is rare in production so I think the benefits outweigh breakages. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17spapr: Allow some cases where we can't set VSMT mode in the kernelDavid Gibson
At present if we require a vsmt mode that's not equal to the kernel's default, and the kernel doesn't let us change it (e.g. because it's an old kernel without support) then we always fail. But in fact we can cope with the kernel having a different vsmt as long as a) it's >= the actual number of vthreads/vcore (so that guest threads that are supposed to be on the same core act like it) b) it's a submultiple of the requested vsmt mode (so that guest threads spaced by the vsmt value will act like they're on different cores) Allowing this case gives us a bit more freedom to adjust the vsmt behaviour without breaking existing cases. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17target/ppc: Clarify compat mode max_threads valueDavid Gibson
We recently had some discussions that were sidetracked for a while, because nearly everyone misapprehended the purpose of the 'max_threads' field in the compatiblity modes table. It's all about guest expectations, not host expectations or support (that's handled elsewhere). In an attempt to avoid a repeat of that confusion, rename the field to 'max_vthreads' and add an explanatory comment. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
2018-01-17ppc: Change Power9 compat table to support at most 8 threads/coreJose Ricardo Ziviani
Increases the max smt mode to 8 for Power9. That's because KVM supports smt emulation in this platform so QEMU should allow users to use it as well. Today if we try to pass -smp ...,threads=8, QEMU will silently truncate it to smt4 mode and may cause a crash if we try to perform a cpu hotplug. Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com> [dwg: Added an explanatory comment] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17spapr: Remove unnecessary 'options' field from sPAPRCapabilityInfoDavid Gibson
The options field here is intended to list the available values for the capability. It's not used yet, because the existing capabilities are boolean. We're going to add capabilities that aren't, but in that case the info on the possible values can be folded into the .description field. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17hw/ppc/spapr_caps: Rework spapr_caps to use uint8 internal representationSuraj Jitindar Singh
Currently spapr_caps are tied to boolean values (on or off). This patch reworks the caps so that they can have any uint8 value. This allows more capabilities with various values to be represented in the same way internally. Capabilities are numbered in ascending order. The internal representation of capability values is an array of uint8s in the sPAPRMachineState, indexed by capability number. Capabilities can have their own name, description, options, getter and setter functions, type and allow functions. They also each have their own section in the migration stream. Capabilities are only migrated if they were explictly set on the command line, with the assumption that otherwise the default will match. On migration we ensure that the capability value on the destination is greater than or equal to the capability value from the source. So long at this remains the case then the migration is considered compatible and allowed to continue. This patch implements generic getter and setter functions for boolean capabilities. It also converts the existings cap-htm, cap-vsx and cap-dfp capabilities to this new format. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-17spapr: Handle Decimal Floating Point (DFP) as an optional capabilityDavid Gibson
Decimal Floating Point has been available on POWER7 and later (server) cpus. However, it can be disabled on the hypervisor, meaning that it's not available to guests. We currently handle this by conditionally advertising DFP support in the device tree depending on whether the guest CPU model supports it - which can also depend on what's allowed in the host for -cpu host. That can lead to confusion on migration, since host properties are silently affecting guest visible properties. This patch handles it by treating it as an optional capability for the pseries machine type. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17spapr: Handle VMX/VSX presence as an spapr capability flagDavid Gibson
We currently have some conditionals in the spapr device tree code to decide whether or not to advertise the availability of the VMX (aka Altivec) and VSX vector extensions to the guest, based on whether the guest cpu has those features. This can lead to confusion and subtle failures on migration, since it makes a guest visible change based only on host capabilities. We now have a better mechanism for this, in spapr capabilities flags, which explicitly depend on user options rather than host capabilities. Rework the advertisement of VSX and VMX based on a new VSX capability. We no longer bother with a conditional for VMX support, because every CPU that's ever been supported by the pseries machine type supports VMX. NOTE: Some userspace distributions (e.g. RHEL7.4) already rely on availability of VSX in libc, so using cap-vsx=off may lead to a fatal SIGILL in init. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17target/ppc: Clean up probing of VMX, VSX and DFP availability on KVMDavid Gibson
When constructing the "host" cpu class we modify whether the VMX and VSX vector extensions and DFP (Decimal Floating Point) are available based on whether KVM can support those instructions. This can depend on policy in the host kernel as well as on the actual host cpu capabilities. However, the way we probe for this is not very nice: we explicitly check the host's device tree. That works in practice, but it's not really correct, since the device tree is a property of the host kernel's platform which we don't really know about. We get away with it because the only modern POWER platforms happen to encode VMX, VSX and DFP availability in the device tree in the same way. Arguably we should have an explicit KVM capability for this, but we haven't needed one so far. Barring specific KVM policies which don't yet exist, each of these instruction classes will be available in the guest if and only if they're available in the qemu userspace process. We can determine that from the ELF AUX vector we're supplied with. Once reworked like this, there are no more callers for kvmppc_get_vmx() and kvmppc_get_dfp() so remove them. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17spapr: Validate capabilities on migrationDavid Gibson
Now that the "pseries" machine type implements optional capabilities (well, one so far) there's the possibility of having different capabilities available at either end of a migration. Although arguably a user error, it would be nice to catch this situation and fail as gracefully as we can. This adds code to migrate the capabilities flags. These aren't pulled directly into the destination's configuration since what the user has specified on the destination command line should take precedence. However, they are checked against the destination capabilities. If the source was using a capability which is absent on the destination, we fail the migration, since that could easily cause a guest crash or other bad behaviour. If the source lacked a capability which is present on the destination we warn, but allow the migration to proceed. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17spapr: Treat Hardware Transactional Memory (HTM) as an optional capabilityDavid Gibson
This adds an spapr capability bit for Hardware Transactional Memory. It is enabled by default for pseries-2.11 and earlier machine types. with POWER8 or later CPUs (as it must be, since earlier qemu versions would implicitly allow it). However it is disabled by default for the latest pseries-2.12 machine type. This means that with the latest machine type, HTM will not be available, regardless of CPU, unless it is explicitly enabled on the command line. That change is made on the basis that: * This way running with -M pseries,accel=tcg will start with whatever cpu and will provide the same guest visible model as with accel=kvm. - More specifically, this means existing make check tests don't have to be modified to use cap-htm=off in order to run with TCG * We hope to add a new "HTM without suspend" feature in the not too distant future which could work on both POWER8 and POWER9 cpus, and could be enabled by default. * Best guesses suggest that future POWER cpus may well only support the HTM-without-suspend model, not the (frankly, horribly overcomplicated) POWER8 style HTM with suspend. * Anecdotal evidence suggests problems with HTM being enabled when it wasn't wanted are more common than being missing when it was. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17spapr: Capabilities infrastructureDavid Gibson
Because PAPR is a paravirtual environment access to certain CPU (or other) facilities can be blocked by the hypervisor. PAPR provides ways to advertise in the device tree whether or not those features are available to the guest. In some places we automatically determine whether to make a feature available based on whether our host can support it, in most cases this is based on limitations in the available KVM implementation. Although we correctly advertise this to the guest, it means that host factors might make changes to the guest visible environment which is bad: as well as generaly reducing reproducibility, it means that a migration between different host environments can easily go bad. We've mostly gotten away with it because the environments considered mature enough to be well supported (basically, KVM on POWER8) have had consistent feature availability. But, it's still not right and some limitations on POWER9 is going to make it more of an issue in future. This introduces an infrastructure for defining "sPAPR capabilities". These are set by default based on the machine version, masked by the capabilities of the chosen cpu, but can be overriden with machine properties. The intention is at reset time we verify that the requested capabilities can be supported on the host (considering TCG, KVM and/or host cpu limitations). If not we simply fail, rather than silently modifying the advertised featureset to the guest. This does mean that certain configurations that "worked" may now fail, but such configurations were already more subtly broken. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>
2018-01-17target/ppc: Yet another fix for KVM-HV HPTE accessorsAlexey Kardashevskiy
As stated in the 1ad9f0a464fe commit log, the returned entries are not a whole PTEG. It was not a problem before 1ad9f0a464fe as it would read a single record assuming it contains a whole PTEG but now the code tries reading the entire PTEG and "if ((n - i) < invalid)" produces negative values which then are converted to size_t for memset() and that throws seg fault. This fixes the math. While here, fix the last @i increment as well. Fixes: 1ad9f0a464fe "target/ppc: Fix KVM-HV HPTE accessors" Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-01-16Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20180116' into stagingPeter Maydell
Queued TCG patches # gpg: Signature made Tue 16 Jan 2018 16:24:50 GMT # gpg: using RSA key 0x64DF38E8AF7E215F # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-tcg-20180116: tcg/ppc: Allow a 32-bit offset to the constant pool tcg/ppc: Support tlb offsets larger than 64k tcg/arm: Support tlb offsets larger than 64k tcg/arm: Fix double-word comparisons Signed-off-by: Peter Maydell <peter.maydell@linaro.org>