Age | Commit message | Author
2015-12-18 | blockdev: Set 'format' indicates non-empty drive | Kevin Wolf
Creating an empty drive while specifying 'format' doesn't make sense. The specified format driver would simply be ignored. Make a set 'format' option an indication that a non-empty drive should be created. This makes 'format' consistent with 'driver' and allows using it with a block driver that doesn't need any other options (like null-co/null-aio). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 | block: Introduce bs->explicit_options | Kevin Wolf
bs->options doesn't only contain options that the user explicitly requested, but also options that were derived from flags, the filename or inherited from the parent node. For reopen, it is important to know the difference because reopening the parent can change inherited values in child nodes, but it shouldn't change any options that were explicitly specified for the child. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
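The distinction described above can be illustrated with a small model. This is only a sketch in Python, not QEMU's C implementation: the function name `effective_options` and the option values are hypothetical, and plain dicts stand in for the QDicts that the block layer uses.

```python
def effective_options(explicit, inherited):
    """Combine a child's options: explicit values always win over inherited ones."""
    merged = dict(inherited)   # start from what the parent passes down
    merged.update(explicit)    # explicitly requested options override it
    return merged


# Hypothetical child node: "cache.direct" was requested explicitly by the
# user, while "read-only" is only inherited from the parent.
child_explicit = {"cache.direct": "on"}

parent_before = {"cache.direct": "off", "read-only": "on"}
parent_after = {"cache.direct": "off", "read-only": "off"}  # parent reopened

before = effective_options(child_explicit, parent_before)
after = effective_options(child_explicit, parent_after)
```

Reopening the parent changes the inherited `read-only` value seen by the child, but the explicitly set `cache.direct` stays as requested. Keeping the explicit options in a separate dict is what makes this recomputation possible.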
2015-12-18 | block: Split out parse_json_protocol() | Kevin Wolf
The next patch distinguishes options that were explicitly set and options that were derived. bdrv_fill_option() added options of both types: Options given by json: syntax should be counted as explicit, but the rest is derived. In preparation for the distinction, move the json: parsing to a separate function. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 | block: Add infrastructure for option inheritance | Kevin Wolf
Options are not actually inherited from the parent node yet, but this commit lays the groundwork for doing so. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 | block: reopen: Document option precedence and refactor accordingly | Kevin Wolf
The interesting part of reopening an image is from which sources the effective options should be taken, i.e. which options take precedence over which other options. This patch documents the precedence that will be implemented in the following patches. It also refactors bdrv_reopen_queue(), so that the top-level reopened node is handled the same way as children are. Option/flag inheritance from the parent becomes just one item in the list and is done at the beginning of the function, similar to how the other items are/will be handled. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 | block: Allow specifying child options in reopen | Kevin Wolf
If the child was defined in the same context (-drive argument or blockdev-add QMP command) as its parent, a reopen of the parent should work the same and allow changing options of the child. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
2015-12-18 | block: Keep "driver" in bs->options | Kevin Wolf
Instead of passing a separate drv argument to bdrv_open_common(), just make sure that a "driver" option is set in the QDict. This also means that a "driver" entry is consistently present in bs->options now. This is another step towards keeping all options in the QDict (which is the representation of the blockdev-add QMP command). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 | block: Pass driver-specific options to .bdrv_refresh_filename() | Kevin Wolf
In order to decide whether a blkdebug: filename can be produced or a json: one is necessary, blkdebug checked whether bs->options had more options than just "config", "x-image" or "image" (the latter including nested options). That doesn't work well when generic block layer options are present. This patch passes an option QDict to the driver that contains only driver-specific options, i.e. the options for the general block layer as well as child nodes are already filtered out. Works much better this way. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
2015-12-18 | block: Exclude nested options only for children in append_open_options() | Kevin Wolf
Some drivers have nested options (e.g. blkdebug rule arrays), which don't belong to a child node and shouldn't be removed. Don't remove all options with "." in their name, but check for the complete prefixes of actually existing child nodes. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
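The prefix check described above can be modelled in a few lines. This is a Python sketch rather than the C code in append_open_options(): the helper name `without_child_options` and the sample option keys are assumptions chosen for illustration.

```python
def without_child_options(options, child_names):
    """Drop only options that belong to an existing child node.

    An option belongs to a child when its key starts with "<child>.".
    Other dotted keys (for example driver-specific rule arrays) are kept.
    """
    prefixes = tuple(name + "." for name in child_names)
    return {
        key: value
        for key, value in options.items()
        if not key.startswith(prefixes)
    }


# Hypothetical flattened options of a blkdebug-like node with one child
# called "image"; the "inject-error.*" keys are nested driver options.
options = {
    "inject-error.0.event": "read_aio",
    "inject-error.0.errno": "5",
    "image.filename": "/tmp/test.qcow2",
    "config": "/tmp/blkdebug.conf",
}

kept = without_child_options(options, ["image"])
```

A filter that removed every key containing a "." would also have discarded the two `inject-error` entries, which is the behaviour the commit moves away from.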
2015-12-18 | block: Consider all block layer options in append_open_options | Kevin Wolf
The code already special-cased "node-name", which is currently the only option passed in the QDict that isn't driver-specific. Generalise the code to take all general block layer options into consideration. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
2015-12-18 | block: Allow references for backing files | Kevin Wolf
For bs->file, using references to existing BDSes has been possible for a while already. This patch enables the same for bs->backing. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 | mirror: Error out when a BDS would get two BBs | Kevin Wolf
bdrv_replace_in_backing_chain() asserts that the old and new BlockDriverState don't both have a BlockBackend attached to them, because both would have to end up pointing to the new BDS and we don't support more than one BB per BDS yet. Before we can safely allow references to existing nodes as backing files, we need to make sure that even if a backing file has a BB on it, this doesn't crash qemu. There are probably also some cases with the 'replaces' option set where drive-mirror could fail this assertion today. They are fixed with this error check as well. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 | block: Fix reopen with semantically overlapping options | Kevin Wolf
This fixes bdrv_reopen() calls like the following one:

    qemu-io -c 'open -o overlap-check.template=all /tmp/test.qcow2' \
            -c 'reopen -o overlap-check=none'

The approach taken so far would result in an options QDict that has both "overlap-check.template=all" and "overlap-check=none", which obviously conflicts. In this case, the old option should be overridden by the newly specified option.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2015-12-18 | qcow2: Add .bdrv_join_options callback | Kevin Wolf
qcow2 accepts a few driver-specific options that overlap semantically (e.g. "overlap-check" is an alias of "overlap-check.template", and any missing cache size option is derived from the given ones). When bdrv_reopen() merges the set of updated options with left out options that should be kept at their old value, we need to consider this and filter out any duplicates (which would generally cause errors because new and old value would contradict each other). This patch adds a .bdrv_join_options callback to BlockDriver and implements it for qcow2. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>
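The merging step described above can be sketched as follows. This is a simplified Python model, not the qcow2 callback itself: `ALIASES`, `canonical` and `join_options` are hypothetical names, only the alias mentioned in the commit message is modelled, and the derivation of cache sizes is left out.

```python
# Assumption for illustration: a single alias, as named in the commit message.
ALIASES = {"overlap-check": "overlap-check.template"}


def canonical(key):
    """Map an alias to the option it stands for; other keys are unchanged."""
    return ALIASES.get(key, key)


def join_options(old, new):
    """Keep old options unless a new option covers the same setting."""
    overridden = {canonical(key) for key in new}
    merged = {
        key: value
        for key, value in old.items()
        if canonical(key) not in overridden   # drop semantic duplicates
    }
    merged.update(new)
    return merged


old = {"overlap-check.template": "all", "lazy-refcounts": "on"}
new = {"overlap-check": "none"}
joined = join_options(old, new)
```

Comparing canonical names rather than literal keys is what prevents the result from containing both `overlap-check.template=all` and `overlap-check=none`.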
2015-12-18 | iotests: 124: don't reopen qcow2 | John Snow
Don't create two interfaces to the same drive in the recently moved failure test. Signed-off-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18 | iotests: 124: move incremental failure test | John Snow
Code motion only, in preparation for adjusting the setUp procedure for this test. Signed-off-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18 | iotests: 124: Split into two test classes | John Snow
Split it into an abstract test class and an implementation class. The split is primarily to facilitate more flexible setUp variations for other kinds of tests without having to rewrite or shuffle around all of these helpers. See the following two patches for more of the "why." Signed-off-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18 | Merge remote-tracking branch 'remotes/berrange/tags/pull-io-channel-base-2015-12-18-1' into staging | Peter Maydell
Merge I/O channels base classes

# gpg: Signature made Fri 18 Dec 2015 12:18:38 GMT using RSA key ID 15104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"

* remotes/berrange/tags/pull-io-channel-base-2015-12-18-1:
  io: add QIOChannelBuffer class
  io: add QIOChannelCommand class
  io: add QIOChannelWebsock class
  io: add QIOChannelTLS class
  io: add QIOChannelFile class
  io: add QIOChannelSocket class
  io: add QIOTask class for async operations
  io: add helper module for creating watches on FDs
  io: add abstract QIOChannel classes

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-12-18 | io: add QIOChannelBuffer class | Daniel P. Berrange
Add a QIOChannel subclass that is capable of performing I/O to/from a memory buffer. This implementation does not attempt to support concurrent readers & writers. It is designed for serialized access whereby a single thread at a time may write data, seek and then read data back out. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 | io: add QIOChannelCommand class | Daniel P. Berrange
Add a QIOChannel subclass that is capable of performing I/O to/from a separate process, via a pair of pipes. The command can be used for unidirectional or bi-directional I/O. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 | io: add QIOChannelWebsock class | Daniel P. Berrange
Add a QIOChannel subclass that can run the websocket protocol over the top of another QIOChannel instance. This initial implementation is only capable of acting as a websockets server. There is no support for acting as a websockets client yet. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 | io: add QIOChannelTLS class | Daniel P. Berrange
Add a QIOChannel subclass that can run the TLS protocol over the top of another QIOChannel instance. The object provides a simplified API to perform the handshake when starting the TLS session. The layering of TLS over the underlying channel does not have to be set up immediately. It is possible to take an existing QIOChannel that has done some handshake and then swap in the QIOChannelTLS layer. This allows for use with protocols which start TLS right away, and those which start plain text and then negotiate TLS. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 | io: add QIOChannelFile class | Daniel P. Berrange
Add a QIOChannel subclass that is capable of operating on things that are files, such as plain files, pipes, character/block devices, but notably not sockets. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 | io: add QIOChannelSocket class | Daniel P. Berrange
Implement a QIOChannel subclass that supports sockets I/O. The implementation is able to manage a single socket file descriptor, whether a TCP/UNIX listener, TCP/UNIX connection, or a UDP datagram. It provides APIs which can listen and connect either asynchronously or synchronously. Since there is no asynchronous DNS lookup API available, it uses the QIOTask helper for spawning a background thread to ensure non-blocking operation. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 | io: add QIOTask class for async operations | Daniel P. Berrange
A number of I/O operations need to be performed asynchronously to avoid blocking the main loop. The caller of such APIs needs to provide a callback to be invoked on completion/error and needs access to the error, if any. The small QIOTask provides a simple framework for dealing with such probes. The API docs inline provide an outline of how this is to be used. Some functions don't have the ability to run asynchronously (e.g. getaddrinfo always blocks), so to facilitate their use, the task class provides a mechanism to run a blocking function in a thread, while triggering the completion callback in the main event loop thread. This easily allows any synchronous function to be made asynchronous, albeit at the cost of spawning a thread. In this series, the QIOTask class will be used for things like the TLS handshake, the websockets handshake and TCP connect() progress. The concept of QIOTask is inspired by the GAsyncResult interface / GTask class in the GIO libraries. The min version requirements on glib don't allow those to be used from QEMU, so QIOTask provides a facsimile which can be easily switched to GTask in the future if the min version is increased. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
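The "run a blocking function in a thread, complete in the main loop" pattern described above can be sketched in Python. This is not the QIOTask API: `run_in_thread` and `dispatch_one` are hypothetical names, and a `queue.Queue` drained by the calling thread stands in for the main event loop.

```python
import queue
import threading


def run_in_thread(worker, on_done, completions):
    """Run the blocking `worker` in a helper thread.

    The outcome is not delivered from the helper thread; it is queued so
    that the loop thread can invoke `on_done` itself.
    """
    def runner():
        try:
            outcome = (worker(), None)
        except Exception as exc:  # report the error through the callback
            outcome = (None, exc)
        completions.put((on_done, outcome))

    thread = threading.Thread(target=runner)
    thread.start()
    return thread


def dispatch_one(completions):
    """Stand-in for one main-loop iteration: run one queued callback."""
    on_done, (result, error) = completions.get()
    on_done(result, error)


completions = queue.Queue()
loop_thread = threading.get_ident()
seen = []


def on_done(result, error):
    # Record what the callback received and whether it ran in the loop thread.
    seen.append((result, error, threading.get_ident() == loop_thread))


worker_thread = run_in_thread(lambda: sum(range(10)), on_done, completions)
dispatch_one(completions)
worker_thread.join()
```

The callback receives both a result and an error slot, mirroring the requirement that callers get access to the error, if any. The cost noted in the commit message is visible here as well: every call spawns a thread.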
2015-12-18 | io: add helper module for creating watches on FDs | Daniel P. Berrange
A number of the channel implementations will require the ability to create watches on file descriptors / sockets. To avoid duplicating this code in each channel, provide a helper API for dealing with file descriptor watches. There are two watch implementations provided. The first is useful for bi-directional file descriptors such as sockets, regular files, character devices, etc. The second works with a pair of unidirectional file descriptors such as pipes. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 | io: add abstract QIOChannel classes | Daniel P. Berrange
Start the new generic I/O channel framework by defining a QIOChannel abstract base class. This is designed to feel similar to GLib's GIOChannel, but with the addition of support for using iovecs, qemu error reporting, file descriptor passing, coroutine integration and use of the QOM framework for easier sub-classing. The intention is that almost anywhere in QEMU that deals with sockets will use this new I/O infrastructure, so that it becomes trivial to then layer in support for TLS encryption. This will at least include the VNC server, char device backend and migration code. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-17 | Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging | Peter Maydell
* KVM: synic support, split irqchip support
* memory: cleanups, optimizations, ioeventfd emulation
* SCSI: small fixes, vmw_pvscsi compatibility improvements
* qemu_log cleanups
* Coverity model improvements

# gpg: Signature made Thu 17 Dec 2015 16:35:21 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (45 commits)
  coverity: Model g_memdup()
  coverity: Model g_poll()
  scsi: always call notifier on async cancellation
  scsi: use scsi_req_cancel_async when purging requests
  target-i386: kvm: clear unusable segments' flags in migration
  rcu: optimize rcu_read_lock
  memory: try to inline constant-length reads
  memory: inline a few small accessors
  memory: extract first iteration of address_space_read and address_space_write
  memory: split address_space_read and address_space_write
  memory: avoid unnecessary object_ref/unref
  memory: reorder MemoryRegion fields
  exec: make qemu_ram_ptr_length more similar to qemu_get_ram_ptr
  exec: always call qemu_get_ram_ptr within rcu_read_lock
  linux-user: convert DEBUG_SIGNAL logging to tracepoints
  linux-user: avoid "naked" qemu_log
  user: introduce "-d page"
  xtensa: avoid "naked" qemu_log
  tricore: avoid "naked" qemu_log
  ppc: cleanup logging
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-12-17 | coverity: Model g_memdup() | Markus Armbruster
We model all the non-deprecated memory allocation functions from https://developer.gnome.org/glib/stable/glib-Memory-Allocation.html except for g_memdup(), g_clear_pointer(), g_steal_pointer(). We don't use the latter two. Model the former. Coverity now reports an OVERRUN vl.c:2317: alloc_strlen: Allocating insufficient memory for the terminating null of the string. Correct, but we omit the terminating null intentionally there. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1448901152-11716-1-git-send-email-armbru@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | coverity: Model g_poll() | Markus Armbruster
In my testing, Coverity reported two more CHECKED_RETURN: * qemu-char.c:1248: fixed in commit c1f2448: "qemu-char: retry g_poll on EINTR". * migration/qemu-file-unix.c:75: harmless, cleaned up in commit 4e39f57 "migration: Clean up use of g_poll() in socket_writev_buffer()". Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1450336833-27710-1-git-send-email-armbru@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | scsi: always call notifier on async cancellation | Paolo Bonzini
This was found by code inspection. If the request is cancelled twice, the notifier is never called on the second cancellation request, and hence for example a TMF might never finish. All the calls in scsi_req_cancel_async are idempotent, so the change is safe. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1450290827-30508-2-git-send-email-pbonzini@redhat.com>
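The behaviour this fix aims for can be modelled roughly as follows. This Python sketch is an assumption-laden illustration, not the code of scsi_req_cancel_async: the `Request` class, its fields and the `complete` method are hypothetical.

```python
class Request:
    """Toy request whose cancellation finishes asynchronously."""

    def __init__(self):
        self.cancel_requested = False
        self.notifiers = []

    def cancel_async(self, notifier=None):
        # Register the notifier on every call, including repeated ones, so
        # a second canceller is still told when the cancellation finishes.
        if notifier is not None:
            self.notifiers.append(notifier)
        # Setting the flag again is harmless, so the call is idempotent.
        self.cancel_requested = True

    def complete(self):
        """Called once the cancellation has actually finished."""
        pending, self.notifiers = self.notifiers, []
        for notify in pending:
            notify()


calls = []
req = Request()
req.cancel_async(lambda: calls.append("first"))
req.cancel_async(lambda: calls.append("second"))  # second cancellation
req.complete()
```

In a model where the second `cancel_async` returned early without registering its notifier, `"second"` would never be recorded, which corresponds to the task management function that might never finish.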
2015-12-17 | scsi: use scsi_req_cancel_async when purging requests | Paolo Bonzini
This avoids calls to aio_poll without having acquired the context first. Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1450290827-30508-1-git-send-email-pbonzini@redhat.com>
2015-12-17 | target-i386: kvm: clear unusable segments' flags in migration | Michael Chapman
This commit fixes migration of a QEMU/KVM guest from kernel >= v3.9 to kernel <= v3.7 (e.g. from RHEL 7 to RHEL 6). Without this commit a guest migrated across these kernel versions fails to resume on the target host as its segment descriptors are invalid.

Two separate kernel commits combined together to result in this bug:

    commit f0495f9b9992f80f82b14306946444b287193390
    Author: Avi Kivity <avi@redhat.com>
    Date: Thu Jun 7 17:06:10 2012 +0300

        KVM: VMX: Relax check on unusable segment

        Some userspace (e.g. QEMU 1.1) munge the d and g bits of segment descriptors, causing us not to recognize them as unusable segments with emulate_invalid_guest_state=1. Relax the check by testing for segment not present (a non-present segment cannot be usable).

        Signed-off-by: Avi Kivity <avi@redhat.com>

    commit 25391454e73e3156202264eb3c473825afe4bc94
    Author: Gleb Natapov <gleb@redhat.com>
    Date: Mon Jan 21 15:36:46 2013 +0200

        KVM: VMX: don't clobber segment AR of unusable segments.

        Usability is returned in unusable field, so not need to clobber entire AR. Callers have to know how to deal with unusable segments already since if emulate_invalid_guest_state=true AR is not zeroed.

        Signed-off-by: Gleb Natapov <gleb@redhat.com>
        Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

The first commit changed the KVM_SET_SREGS ioctl so that it did not treat segment flags == 0 as an unusable segment, instead only looking at the "present" flag.

The second commit changed KVM_GET_SREGS so that it did not clear the flags of an unusable segment.

Since QEMU does not itself maintain the "unusable" flag across a migration, the end result is that unusable segments read from a kernel with these commits and loaded into a kernel without these commits are not properly recognised as being unusable.

This commit updates both get_seg and set_seg so that the problem is avoided even when migrating to or migrating from a QEMU without this commit. In get_seg, we clear the segment flags if the segment is marked unusable. In set_seg, we mark the segment unusable if the segment's "present" flag is not set.

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Message-Id: <1449464047-17467-1-git-send-email-mike@very.puzzling.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
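The two rules in the last paragraph can be written down as a small model. This is a Python sketch, not the C code in QEMU's KVM support: the `KvmSegment` type, the function signatures and the bit position of `PRESENT` are assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical bit position for the "present" flag; the real layout of the
# segment flags is not taken from the source.
PRESENT = 1 << 7


@dataclass
class KvmSegment:
    """Simplified view of a segment as exchanged with the kernel."""
    flags: int
    unusable: bool


def get_seg(kvm_seg):
    """Kernel to QEMU: an unusable segment is stored with flags cleared."""
    return 0 if kvm_seg.unusable else kvm_seg.flags


def set_seg(qemu_flags):
    """QEMU to kernel: a segment without the present flag is unusable."""
    return KvmSegment(flags=qemu_flags, unusable=not (qemu_flags & PRESENT))


# A newer kernel may report stale flags on an unusable segment.
stale = KvmSegment(flags=PRESENT | 0x13, unusable=True)
saved_flags = get_seg(stale)        # what travels in the migration stream
restored = set_seg(saved_flags)     # what the destination hands to its kernel
```

Because the flags of an unusable segment are cleared on the way out, a destination that treats flags == 0 as unusable sees the expected value, and a destination running the updated set_seg also marks the segment unusable explicitly.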
2015-12-17 | rcu: optimize rcu_read_lock | Paolo Bonzini
rcu_read_lock cannot change rcu_gp_ongoing from true to false (the previous value of p_rcu_reader->ctr is zero), hence there is no need to check p_rcu_reader->waiting and wake up a concurrent synchronize_rcu. While at it mark the wakeup as unlikely in rcu_read_unlock. Reviewed-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1450265542-4323-1-git-send-email-pbonzini@redhat.com>
2015-12-17 | memory: try to inline constant-length reads | Paolo Bonzini
memcpy can take a large amount of time for small reads and writes. Handle the common case of reading s/g descriptors from memory (there is no corresponding "write" case that is as common, because writes often use address_space_st* functions) by inlining the relevant parts of address_space_read into the caller. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | memory: inline a few small accessors | Paolo Bonzini
These are used in the address_space_* fast paths. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | memory: extract first iteration of address_space_read and address_space_write | Paolo Bonzini
We want to inline the case where there is only one iteration, because then the compiler can also inline the memcpy. As a start, extract everything after the first address_space_translate call. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | memory: split address_space_read and address_space_write | Paolo Bonzini
Rather than dispatching on is_write for every iteration, make address_space_rw call one of the two functions. The amount of duplicate logic is pretty small, and memory_access_is_direct can be tweaked so that it inlines better in the callers. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | memory: avoid unnecessary object_ref/unref | Paolo Bonzini
For the common case of DMA into non-hotplugged RAM, it is unnecessary but expensive to do object_ref/unref. Add back an owner field to MemoryRegion, so that these memory regions can skip the reference counting. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | memory: reorder MemoryRegion fields | Paolo Bonzini
Order fields so that all fields accessed during a RAM read/write fit in the same cache line. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | exec: make qemu_ram_ptr_length more similar to qemu_get_ram_ptr | Paolo Bonzini
Notably, use qemu_get_ram_block to enjoy the MRU optimization. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | exec: always call qemu_get_ram_ptr within rcu_read_lock | Paolo Bonzini
Simplify the code and document the assumption. The only caller that is not within rcu_read_lock is memory_region_get_ram_ptr. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | linux-user: convert DEBUG_SIGNAL logging to tracepoints | Paolo Bonzini
"Unimplemented" messages go to stderr, everything else goes to tracepoints. Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | linux-user: avoid "naked" qemu_log | Paolo Bonzini
Ensure that all log writes are protected by qemu_loglevel_mask or, in serious cases, go to both the log and stderr. Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | user: introduce "-d page" | Paolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | xtensa: avoid "naked" qemu_log | Paolo Bonzini
Cc: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | tricore: avoid "naked" qemu_log | Paolo Bonzini
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | ppc: cleanup logging | Paolo Bonzini
Avoid "naked" qemu_log, bring documentation for DEBUG #defines up to date. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | s390x: avoid "naked" qemu_log | Paolo Bonzini
Convert to debug-only qemu_log. Cc: Alexander Graf <agraf@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 | microblaze: avoid "naked" qemu_log | Paolo Bonzini
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>