qemu - QEMU is a generic and open source machine & userspace emulator and virtualizer.

Age	Commit message (Collapse)	Author
2017-05-09	sockets: Limit SocketAddressLegacy to external interfaces	Markus Armbruster
	SocketAddressLegacy is a simple union, and simple unions are awkward: they have their variant members wrapped in a "data" object on the wire, and require additional indirections in C. SocketAddress is the equivalent flat union. Convert all users of SocketAddressLegacy to SocketAddress, except for existing external interfaces. See also commit fce5d53..9445673 and 85a82e8..c5f1ae3. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1493192202-3184-7-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [Minor editing accident fixed, commit message and a comment tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09	sockets: Rename SocketAddressFlat to SocketAddress	Markus Armbruster
	Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1493192202-3184-6-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2017-05-09	sockets: Rename SocketAddress to SocketAddressLegacy	Markus Armbruster
	The next commit will rename SocketAddressFlat to SocketAddress, and the commit after that will replace most uses of SocketAddressLegacy by SocketAddress, replacing most of this commit's renames right back. Note that checkpatch emits a few "line over 80 characters" warnings. The long lines are all temporary; the SocketAddressLegacy replacement will shorten them again. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1493192202-3184-5-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09	sockets: Prepare inet_parse() for flattened SocketAddress	Markus Armbruster
	I'm going to flatten SocketAddress: rename SocketAddress to SocketAddressLegacy, SocketAddressFlat to SocketAddress, eliminate SocketAddressLegacy except in external interfaces. inet_parse() returns a newly allocated InetSocketAddress. Lift the allocation from inet_parse() into its caller socket_parse() to prepare for flattening SocketAddress. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1493192202-3184-3-git-send-email-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [Straightforward rebase]
2017-05-09	qobject: Use simpler QDict/QList scalar insertion macros	Eric Blake
	We now have macros in place to make it less verbose to add a scalar to QDict and QList, so use them. Patch created mechanically via: spatch --sp-file scripts/coccinelle/qobject.cocci \ --macro-file scripts/cocci-macro-file.h --dir . --in-place then touched up manually to fix a couple of '?:' back to original spacing, as well as avoiding a long line in monitor.c. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20170427215821.19397-7-eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-08	qobject: Drop useless QObject casts	Eric Blake
	We have macros in place to make it less verbose to add a subtype of QObject to both QDict and QList. While we have made cleanups like this in the past (see commit fcfcd8ffc, for example), having it be automated by Coccinelle makes it easier to maintain. Patch created mechanically via: spatch --sp-file scripts/coccinelle/qobject.cocci \ --macro-file scripts/cocci-macro-file.h --dir . --in-place then I verified that no manual touchups were required. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20170427215821.19397-5-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-28	qcow2: Allow discard of final unaligned cluster	Eric Blake
	As mentioned in commit 0c1bd46, we ignored requests to discard the trailing cluster of an unaligned image. While discard is an advisory operation from the guest standpoint, (and we are therefore free to ignore any request), our qcow2 implementation exploits the fact that a discarded cluster reads back as 0. As long as we discard on cluster boundaries, we are fine; but that means we could observe non-zero data leaked at the tail of an unaligned image. Enhance iotest 66 to cover this case, and fix the implementation to honor a discard request on the final partial cluster. Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 20170407013709.18440-1-eblake@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28	block: Add .bdrv_truncate() error messages	Max Reitz
	Add missing error messages for the block driver implementations of .bdrv_truncate(); drop the generic one from block.c's bdrv_truncate(). Since one of these changes touches a mis-indented block in block/file-posix.c, this patch fixes that coding style issue along the way. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170328205129.15138-5-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28	block: Add errp to BD.bdrv_truncate()	Max Reitz
	Add an Error parameter to the block drivers' bdrv_truncate() interface. If a block driver does not set this in case of an error, the generic bdrv_truncate() implementation will do so. Where it is obvious, this patch also makes some block drivers set this value. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170328205129.15138-4-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28	block: Add errp to b{lk,drv}_truncate()	Max Reitz
	For one thing, this allows us to drop the error message generation from qemu-img.c and blockdev.c and instead have it unified in bdrv_truncate(). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170328205129.15138-3-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28	block/vhdx: Make vhdx_create() always set errp	Max Reitz
	This patch makes vhdx_create() always set errp in case of an error. It also adds errp parameters to vhdx_create_bat() and vhdx_create_new_region_table() so we can pass on the error object generated by blk_truncate() as of a future commit. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20170328205129.15138-2-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-27	block: fix alignment calculations in bdrv_co_do_zero_pwritev	Denis V. Lunev
	tail_padding_bytes is calculated wrong. F.e. for offset = 0 bytes = 2048 align = 512 we will have tail_padding_bytes = 512 which is definitely wrong. The patch fixes that arithmetics. Fortunately this problem is harmless, we will have 1 extra allocation and free thus there is no need to put this into stable. The problem is here from the very beginning. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27	block: Do not unref bs->file on error in BD's open	Max Reitz
	The block layer takes care of removing the bs->file child if the block driver's bdrv_open()/bdrv_file_open() implementation fails. The block driver therefore does not need to do so, and indeed should not unless it sets bs->file to NULL afterwards -- because if this is not done, the bdrv_unref_child() in bdrv_open_inherit() will dereference the freed memory block at bs->file afterwards, which is not good. We can now decide whether to add a "bs->file = NULL;" after each of the offending bdrv_unref_child() invocations, or just drop them altogether. The latter is simpler, so let's do that. Cc: qemu-stable <qemu-stable@nongnu.org> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27	block: Remove NULL check in bdrv_co_flush	Fam Zheng
	Reported by Coverity. We already use bs in bdrv_inc_in_flight before checking for NULL. It is unnecessary as all callers pass non-NULL bs, so drop it. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27	Revert "block/io: Comment out permission assertions"	Max Reitz
	This reverts commit e3e0003a8f6570aba1421ef99a0b383a43371a74. This commit was necessary for the 2.9 release because we were unable to fix the underlying issue(s) in time. However, we will be for 2.10. Signed-off-by: Max Reitz <mreitz@redhat.com> Acked-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27	file-win32: Remove unnecessary include	Kevin Wolf
	Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27	file-posix: Remove unnecessary includes	Kevin Wolf
	Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27	block: Constify data passed by pointer to blk_name	Krzysztof Kozlowski
	blk_name() is not modifying data passed to it through pointer and it returns also a pointer to const so the argument can be made const for code safeness. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-24	block/rbd: Add support for reopen()	Jeff Cody
	This adds support for reopen in rbd, for changing between r/w and r/o. Note, that this is only a flag change, but we will block a change from r/o to r/w if we are using an RBD internal snapshot. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: d4e87539167ec6527d44c97b164eabcccf96e4f3.1491597120.git.jcody@redhat.com
2017-04-24	block/rbd - update variable names to more apt names	Jeff Cody
	Update 'clientname' to be 'user', which tracks better with both the QAPI and rados variable naming. Update 'name' to be 'image_name', as it indicates the rbd image. Naming it 'image' would have been ideal, but we are using that for the rados_image_t value returned by rbd_open(). Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: b7ec1fb2e1cf36f9b6911631447a5b0422590b7d.1491597120.git.jcody@redhat.com
2017-04-24	block: do not set BDS read_only if copy_on_read enabled	Jeff Cody
	A few block drivers will set the BDS read_only flag from their .bdrv_open() function. This means the bs->read_only flag could be set after we enable copy_on_read, as the BDRV_O_COPY_ON_READ flag check occurs prior to the call to bdrv->bdrv_open(). This adds an error return to bdrv_set_read_only(), and an error will be return if we try to set the BDS to read_only while copy_on_read is enabled. This patch also changes the behavior of vvfat. Before, vvfat could override the drive 'readonly' flag with its own, internal 'rw' flag. For instance, this -drive parameter would result in a writable image: "-drive format=vvfat,dir=/tmp/vvfat,rw,if=virtio,readonly=on" This is not correct. Now, attempting to use the above -drive parameter will result in an error (i.e., 'rw' is incompatible with 'readonly=on'). Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 0c5b4c1cc2c651471b131f21376dfd5ea24d2196.1491597120.git.jcody@redhat.com
2017-04-24	block: add bdrv_set_read_only() helper function	Jeff Cody
	We have a helper wrapper for checking for the BDS read_only flag, add a helper wrapper to set the read_only flag as well. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 9b18972d05f5fa2ac16c014f0af98d680553048d.1491597120.git.jcody@redhat.com
2017-04-24	block/vxhs.c: Add support for a new block device type called "vxhs"	Ashish Mittal
	Source code for the qnio library that this code loads can be downloaded from: https://github.com/VeritasHyperScale/libqnio.git Sample command line using JSON syntax: ./x86_64-softmmu/qemu-system-x86_64 -name instance-00000008 -S -vnc 0.0.0.0:0 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on 'json:{"driver":"vxhs","vdisk-id":"c3e9095a-a5ee-4dce-afeb-2a59fb387410", "server":{"host":"172.172.17.4","port":"9999"}}' Sample command line using URI syntax: qemu-img convert -f raw -O raw -n /var/lib/nova/instances/_base/0c5eacd5ebea5ed914b6a3e7b18f1ce734c386ad vxhs://192.168.0.1:9999/c6718f6b-0401-441d-a8c3-1f0064d75ee0 Sample command line using TLS credentials (run in secure mode): ./qemu-io --object tls-creds-x509,id=tls0,dir=/etc/pki/qemu/vxhs,endpoint=client -c 'read -v 66000 2.5k' 'json:{"server.host": "127.0.0.1", "server.port": "9999", "vdisk-id": "/test.raw", "driver": "vxhs", "tls-creds":"tls0"}' [Jeff: Modified trace-events with the correct string formatting] Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Message-id: 1491277689-24949-2-git-send-email-Ashish.Mittal@veritas.com
2017-04-24	nfs: Make errp the last parameter of nfs_client_open	Fam Zheng
	Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170421122710.15373-10-famz@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24	block: Make errp the last parameter of commit_active_start	Fam Zheng
	Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170421122710.15373-9-famz@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24	mirror: Make errp the last parameter of mirror_start_job	Fam Zheng
	Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170421122710.15373-8-famz@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24	crypto: Make errp the last parameter of functions	Fam Zheng
	Move opaque to 2nd instead of the 2nd to last, so that compilers help check with the conversion. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170421122710.15373-7-famz@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [Commit message typo corrected] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24	socket: Make errp the last parameter of inet_connect_saddr	Fam Zheng
	Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170421122710.15373-3-famz@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24	socket: Make errp the last parameter of socket_connect	Fam Zheng
	Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170421122710.15373-2-famz@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-18	block: Walk bs->children carefully in bdrv_drain_recurse	Fam Zheng
	The recursive bdrv_drain_recurse may run a block job completion BH that drops nodes. The coming changes will make that more likely and use-after-free would happen without this patch Stash the bs pointer and use bdrv_ref/bdrv_unref in addition to QLIST_FOREACH_SAFE to prevent such a case from happening. Since bdrv_unref accesses global state that is not protected by the AioContext lock, we cannot use bdrv_ref/bdrv_unref unconditionally. Fortunately the protection is not needed in IOThread because only main loop can modify a graph with the AioContext lock held. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170418143044.12187-2-famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Tested-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2017-04-11	block/io: Comment out permission assertions	Max Reitz
	In case of block migration, there may be writes to BlockBackends that do not have the write permission taken. Before this issue is fixed (which is not going to happen in 2.9), we therefore cannot assert that this is the case. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20170411145050.31290-1-mreitz@redhat.com Tested-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11	sheepdog: Fix crash in co_read_response()	Kevin Wolf
	This fixes a regression introduced in commit 9d456654. aio_co_wake() can only be used to reenter a coroutine that was already previously entered, otherwise co->ctx is uninitialised and we access garbage. Using it immediately after qemu_coroutine_create() like in co_read_response() is wrong and causes segfaults. Replace the call with aio_co_enter(), which gets an explicit AioContext parameter and works even for new coroutines. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Kashyap Chamarthy <kchamart@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1491919733-21065-1-git-send-email-kwolf@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11	iscsi: Fix iscsi_create	Fam Zheng
	Since d5895fcb (iscsi: Split URL into individual options), creating qcow2 image on an iscsi LUN fails: qemu-img create -f qcow2 iscsi://$SERVER/$IQN/0 1G qemu-img: iscsi://$SERVER/$IQN/0: Could not create image: Invalid argument The problem is iscsi_open now expects that transport_name, portal and target are already parsed into structured options by iscsi_parse_filename, but it is not called in iscsi_create. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 20170410075451.21329-1-famz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> [mreitz: Dropped now superfluous qdict_put(bs_options, "filename", ...)] Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11	throttle: Remove block from group on hot-unplug	Eric Blake
	When a block device that is part of a throttle group is hot-unplugged, we forgot to remove it from the throttle group. This leaves stale memory around, and causes an easily reproducible crash: $ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \ -device virtio-scsi-pci,bus=pci.0 -drive \ id=drive_image2,if=none,format=raw,file=file2,bps=512000,iops=100,group=foo \ -device scsi-hd,id=image2,drive=drive_image2 -drive \ id=drive_image3,if=none,format=raw,file=file3,bps=512000,iops=100,group=foo \ -device scsi-hd,id=image3,drive=drive_image3 {'execute':'qmp_capabilities'} {'execute':'device_del','arguments':{'id':'image3'}} {'execute':'system_reset'} Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1428810 Suggested-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 20170406190847.29347-1-eblake@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11	block: pass the right options for BlockDriver.bdrv_open()	Dong Jia Shi
	raw_open() expects the caller always passing in the right actual @options parameter. But when trying to applying snapshot on a RBD image, bdrv_snapshot_goto() calls raw_open() (by calling the bdrv_open callback on the BlockDriver) with a NULL @options, and that will result in a Segmentation fault. For the other non-raw format drivers, it also makes sense to passing in the actual options, althought they don't trigger the problem so far. Let's prepare a @options by adding the "file" key-value pair to a copy of the actual options that were given for the node (i.e. bs->options), and pass it to the callback. BlockDriver.bdrv_open() expects bs->file to be NULL and just overwrites it with the result from bdrv_open_child(). That means we should actually make sure it's NULL because otherwise the child BDS will have a reference count that is 1 too high. So we unconditionally invoke bdrv_unref_child() before calling BlockDriver.bdrv_open(), and we wrap everything in bdrv_ref()/bdrv_unref() so the BDS isn't deleted in the meantime. Suggested-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Message-id: 20170405091909.36357-2-bjsdjshi@linux.vnet.ibm.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11	sheepdog: Use bdrv_coroutine_enter before BDRV_POLL_WHILE	Fam Zheng
	When called from main thread, the coroutine should run in the context of bs. Use bdrv_coroutine_enter to ensure that. Signed-off-by: Fam Zheng <famz@redhat.com>
2017-04-11	block: Fix bdrv_co_flush early return	Fam Zheng
	bdrv_inc_in_flight and bdrv_dec_in_flight are mandatory for BDRV_POLL_WHILE to work, even for the shortcut case where flush is unnecessary. Move the if block to below bdrv_dec_in_flight, and BTW fix the variable declaration position. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-11	block: Use bdrv_coroutine_enter to start I/O coroutines	Fam Zheng
	BDRV_POLL_WHILE waits for the started I/O by releasing bs's ctx then polling the main context, which relies on the yielded coroutine continuing on bs->ctx before notifying qemu_aio_context with bdrv_wakeup(). Thus, using qemu_coroutine_enter to start I/O is wrong because if the coroutine is entered from main loop, co->ctx will be qemu_aio_context, as a result of the "release, poll, acquire" loop of BDRV_POLL_WHILE, race conditions happen when both main thread and the iothread access the same BDS: main loop iothread ----------------------------------------------------------------------- blockdev_snapshot aio_context_acquire(bs->ctx) virtio_scsi_data_plane_handle_cmd bdrv_drained_begin(bs->ctx) bdrv_flush(bs) bdrv_co_flush(bs) aio_context_acquire(bs->ctx).enter ... qemu_coroutine_yield(co) BDRV_POLL_WHILE() aio_context_release(bs->ctx) aio_context_acquire(bs->ctx).return ... aio_co_wake(co) aio_poll(qemu_aio_context) ... co_schedule_bh_cb() ... qemu_coroutine_enter(co) ... /* (A) bdrv_co_flush(bs) /* (B) I/O on bs / continues... / aio_context_release(bs->ctx) aio_context_acquire(bs->ctx) Note that in above case, bdrv_drained_begin() doesn't do the "release, poll, acquire" in BDRV_POLL_WHILE, because bs->in_flight == 0. Fix this by using bdrv_coroutine_enter and enter coroutine in the right context. iotests 109 output is updated because the coroutine reenter flow during mirror job complete is different (now through co_queue_wakeup, instead of the unconditional qemu_coroutine_switch before), making the end job len different. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11	block: Make bdrv_parent_drained_begin/end public	Fam Zheng
	Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07	mirror: Fix aio context of mirror_top_bs	Fam Zheng
	It should be moved to the same context as source, before inserting to the graph. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07	block: Don't check permissions for copy on read	Kevin Wolf
	The assertion is currently failing. We can't require callers to have write permissions when all they are doing is a read, so comment it out. Add a FIXME comment in the code so that the check is re-enabled when copy on read is refactored into its own filter driver. Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2017-04-07	block/mirror: Fix use-after-free	Max Reitz
	If @bs does not have any parents, the only reference to @mirror_top_bs will be held by the BlockJob object after the bdrv_unref() following block_job_create(). However, if block_job_create() fails, this reference will not exist and @mirror_top_bs will have been deleted when we goto fail. The issue comes back at all later entries to the fail label: We delete the BlockJob object before rolling back our changes to the node graph. This means that we will delete @mirror_top_bs in the process. All in all, whenever @bs does not have any parents and we go down the fail path we will dereference @mirror_top_bs after it has been deleted. Fix this by invoking bdrv_unref() only when block_job_create() was successful and by bdrv_ref()'ing @mirror_top_bs in the fail path before deleting the BlockJob object. Finally, bdrv_unref() it at the end of the fail path after we actually no longer need it. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07	commit: Set commit_top_bs->total_sectors	Kevin Wolf
	Like in the mirror filter driver, we also need to set the image size for the commit filter driver. This is less likely to be a problem in practice than for the mirror because we're not at the active layer here, but attaching new parents to a node in the middle of the chain is possible, so the size needs to be correct anyway. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2017-04-07	commit: Set commit_top_bs->aio_context	Kevin Wolf
	The filter driver that is inserted by the commit job needs to use the same AioContext as its parent and child nodes. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com>
2017-04-07	block: Ignore guest dev permissions during incoming migration	Kevin Wolf
	Usually guest devices don't like other writers to the same image, so they use blk_set_perm() to prevent this from happening. In the migration phase before the VM is actually running, though, they don't have a problem with writes to the image. On the other hand, storage migration needs to be able to write to the image in this phase, so the restrictive blk_set_perm() call of qdev devices breaks it. This patch flags all BlockBackends with a qdev device as blk->disable_perm during incoming migration, which means that the requested permissions are stored in the BlockBackend, but not actually applied to its root node yet. Once migration has finished and the VM should be resumed, the permissions are applied. If they cannot be applied (e.g. because the NBD server used for block migration hasn't been shut down), resuming the VM fails. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
2017-04-04	Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging	Peter Maydell
	* MemoryRegionCache revert * glib optimization workaround * fix "info lapic" segfault on isapc * fix QIOChannel memory leak # gpg: Signature made Mon 03 Apr 2017 18:17:00 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: main-loop: Acquire main_context lock around os_host_main_loop_wait. exec: revert MemoryRegionCache nbd: fix memory leak on socket_connect failed ipmi: Fix macro issues target-i386: fix "info lapic" segfault on isapc iscsi: drop unused IscsiAIOCB.qiov field Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-03	block/parallels: Avoid overflows	Max Reitz
	Change the types of variables in allocate_clusters() to int64_t so we do not have to worry about potential overflows. Add an assertion that our accesses to s->bat[] do not result in a buffer overflow and that the implicit conversion performed when invoking bat_entry_off() does not result in an integer overflow. Coverity-id: 1307776 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170331170512.10381-1-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03	qcow2: Discard unaligned tail when wiping image	Eric Blake
	There is a subtle difference between the fast (qcow2v3 with no extra data) and slow path (qcow2v2 format [aka 0.10], or when a snapshot is present) of qcow2_make_empty(). The slow path fails to discard the final (partial) cluster of an unaligned image. The problem stems from the fact that qcow2_discard_clusters() was silently ignoring sub-cluster head and tail on unaligned requests. A quick audit of all callers shows that qcow2_snapshot_create() has always passed a cluster-aligned request since the call was added in commit 1ebf561; qcow2_co_pdiscard() has passed a cluster-aligned request since commit ecdbead taught the block layer about preferred discard alignment; and qcow2_make_empty() was fixed to pass an aligned start (but not necessarily end) in commit a3e1505. Asserting that the start is always aligned also points out that we now have a dead check: rounding the end offset down can never result in a value less than the aligned start offset (the check was rendered dead with commit ecdbead). Meanwhile, we do not want to round the end cluster down in the one case of the end offset matching the (unaligned) file size - that final partial cluster should still be discarded. With those fixes in place, the fast and slow paths are back in sync at discarding an entire image; the next patch will update qemu-iotests to ensure we don't regress. Note that bdrv_co_pdiscard ignores ALL partial cluster requests, including the partial cluster at the end of an image; it can be argued that the partial cluster at the end should be special-cased so that a guest issuing discard requests at proper alignments everywhere else can likewise empty the entire image. But that optimization is left for another day. Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 20170331185356.2479-3-eblake@redhat.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03	sheepdog: Fix blockdev-add	Markus Armbruster
	Commit 831acdc "sheepdog: Implement bdrv_parse_filename()" and commit d282f34 "sheepdog: Support blockdev-add" have different ideas on how the QemuOpts parameters for the server address are named. Fix that. While there, rename BlockdevOptionsSheepdog member addr to server, for consistency with BlockdevOptionsSsh, BlockdevOptionsGluster, BlockdevOptionsNbd. Commit 831acdc's example becomes --drive driver=sheepdog,server.type=inet,server.host=fido,server.port=7000,vdi=dolly instead of --drive driver=sheepdog,host=fido,vdi=dolly Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Kashyap Chamarthy <kchamart@redhat.com> Message-id: 1490895797-29094-10-git-send-email-armbru@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03	nbd: Tidy up blockdev-add interface	Markus Armbruster
	SocketAddress is a simple union, and simple unions are awkward: they have their variant members wrapped in a "data" object on the wire, and require additional indirections in C. I intend to limit its use to existing external interfaces, and convert all internal interfaces to SocketAddressFlat. BlockdevOptionsNbd is an external interface using SocketAddress. We already use SocketAddressFlat elsewhere in blockdev-add. Replace it by SocketAddressFlat while we can (it's new in 2.9) for simplicity and consistency. For example, { "execute": "blockdev-add", "arguments": { "node-name": "foo", "driver": "nbd", "server": { "type": "inet", "data": { "host": "localhost", "port": "12345" } } } } becomes { "execute": "blockdev-add", "arguments": { "node-name": "foo", "driver": "nbd", "server": { "type": "inet", "host": "localhost", "port": "12345" } } } Since the internal interfaces still take SocketAddress, this requires conversion function socket_address_crumple(). It'll go away when I update the interfaces. Unfortunately, SocketAddress is also visible in -drive since 2.8: -drive if=none,driver=nbd,server.type=inet,server.data.host=127.0.0.1,server.data.port=12345 Nobody should be using it, as it's fairly new and has never been documented, so adding still more compatibility gunk to keep it working isn't worth the trouble. You now have to use -drive if=none,driver=nbd,server.type=inet,server.host=127.0.0.1,server.port=12345 Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-id: 1490895797-29094-9-git-send-email-armbru@redhat.com [mreitz: Change iotest 147 accordingly] Because of this interface change, iotest 147 has to be adapted. Unfortunately, we cannot just flatten all of the addresses because nbd-server-start still takes a plain SocketAddress. Therefore, we need both and this is most easily achieved by writing the SocketAddress into the code and flattening it where necessary. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20170330221243.17333-1-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>