path: root/block/Makefile.objs
2015-07-08  block: convert quorum blockdrv to use crypto APIs  (Daniel P. Berrange)
Get rid of direct use of gnutls APIs in the quorum blockdrv in favour of using the crypto APIs. This avoids the need to do conditional compilation of the quorum driver. It can simply report an error at file open time instead if the required hash algorithm isn't supported by QEMU.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1435770638-25715-8-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2015-06-12  throttle: Add throttle group infrastructure  (Alberto Garcia)
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 2fdb4de17210b733a13eb472c33cd08b45f8fd21.1433779731.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

2015-04-28  block: move I/O request processing to block/io.c  (Stefan Hajnoczi)
The block.c file has grown to over 6000 lines. It is time to split this file so there are fewer conflicts and the code is easier to maintain.

Extract I/O request processing code:
 * Read
 * Write
 * Zero writes and making the image empty
 * Flush
 * Discard
 * ioctl
 * Tracked requests and queuing
 * Throttling and copy-on-read
 * Block status and allocated functions
 * Refreshing block limits
 * Reading/writing vmstate
 * qemu_blockalign() and friends

The patch simply moves code from block.c into block/io.c.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

2015-04-28  block/dmg: make it modular  (Michael Tokarev)
dmg can optionally utilize libbz2, make it modular.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

2015-02-06  block/dmg: support bzip2 block entry types  (Peter Wu)
This patch adds support for bzip2-compressed block entries as introduced with OS X 10.4 (source: https://en.wikipedia.org/wiki/Apple_Disk_Image).

It was tested against a 5.2G "OS X Yosemite" installation image which stores the BLXX block in the XML property list (instead of resource forks) and has over 5k chunks.

New configure entries are added (--enable-bzip2 / --disable-bzip2) to control inclusion of bzip2 functionality (which requires linking against libbz2). The help message suggests that this option is needed for DMG files, but the tests are generic enough that other parts of QEMU can use bzip2 if needed.

The identifiers are based on http://newosxbook.com/DMG.html.

The decompression routines are based on the zlib case, but as there is no way to reset the decompression state (unlike zlib), memory is allocated and deallocated for every decompression. This should not be problematic as the decompression takes most of the time and as blocks are typically about/over 1 MiB in size, only one allocation is done every 2000 sectors.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: John Snow <jsnow@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1420566495-13284-12-git-send-email-peter@lekensteyn.nl
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

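A rough usage sketch (the file names here are illustrative, not from the patch): build with bzip2 support, then convert a bzip2-compressed DMG to a raw image with qemu-img:

  ./configure --enable-bzip2 && make
  # -f dmg forces the input format; the output is a plain raw image
  qemu-img convert -f dmg -O raw OSX-Yosemite-Install.dmg yosemite.raw
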
2015-02-06  block: add event when disk usage exceeds threshold  (Francesco Romani)
Managing applications, like oVirt (http://www.ovirt.org), make extensive use of thin-provisioned disk images. To let the guest run smoothly and not be unnecessarily paused, oVirt sets a disk usage threshold (so-called 'high water mark') based on the occupation of the device, and automatically extends the image once the threshold is reached or exceeded.

In order to detect the crossing of the threshold, oVirt has no choice but to aggressively poll the QEMU monitor using the query-blockstats command. This leads to unnecessary system load, and is made even worse at scale: deployments with hundreds of VMs are no longer rare.

To fix this, this patch adds:
 * A new monitor command `block-set-write-threshold', to set a mark for a given block device.
 * A new event `BLOCK_WRITE_THRESHOLD', to report if a block device usage exceeds the threshold.
 * A new `write_threshold' field in the `BlockDeviceInfo' structure, to report the configured threshold.

This will allow the managing application to use smarter and more efficient monitoring, greatly reducing the need for polling.

[Updated qemu-iotests 067 output to add the new 'write_threshold' property. --Stefan]
[Changed g_assert_false() to !g_assert() to fix the build on older glib versions. --Kevin]

Signed-off-by: Francesco Romani <fromani@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1421068273-692-1-git-send-email-fromani@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

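As a sketch of how a management application might use this instead of polling (the socket path, node name and 50 GiB threshold below are illustrative, not from the patch): start the guest with a QMP socket and a named node, set the threshold once, then wait for the event:

  qemu-system-x86_64 -qmp unix:/tmp/qmp.sock,server,nowait \
      -drive file=/images/guest.qcow2,format=qcow2,if=virtio,node-name=disk0
  # on the QMP socket, after negotiating capabilities, send something like:
  #   { "execute": "block-set-write-threshold",
  #     "arguments": { "node-name": "disk0", "write-threshold": 53687091200 } }
  # then listen for the BLOCK_WRITE_THRESHOLD event rather than issuing
  # query-blockstats in a loop.
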
2014-11-03  qemu-img: Implement commit like QMP  (Max Reitz)
qemu-img should use QMP commands whenever possible in order to ensure feature completeness of both online and offline image operations. As qemu-img itself has no access to QMP (since this would basically require just about everything being linked into qemu-img), imitate QMP's implementation of block-commit by using commit_active_start() and then waiting for the block job to finish.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1414159063-25977-9-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

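For reference, the offline workflow with qemu-img looks roughly like this (image names are illustrative):

  # create an overlay on top of a base image, work in the overlay, then merge it back
  qemu-img create -f qcow2 -o backing_file=base.qcow2,backing_fmt=qcow2 overlay.qcow2
  qemu-img commit -f qcow2 overlay.qcow2
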
2014-10-20  block: New BlockBackend  (Markus Armbruster)
A block device consists of a frontend device model and a backend.

A block backend has a tree of block drivers doing the actual work. The tree is managed by the block layer.

We currently use a single abstraction BlockDriverState both for tree nodes and the backend as a whole. Drawbacks:

 * Its API includes both stuff that makes sense only at the block backend level (root of the tree) and stuff that's only for use within the block layer. This makes the API bigger and more complex than necessary. Moreover, it's not obvious which interfaces are meant for device models, and which really aren't.

 * Since device models keep a reference to their backend, the backend object can't just be destroyed. But for media change, we need to replace the tree. Our solution is to make the BlockDriverState generic, with actual driver state in a separate object, pointed to by member opaque. That lets us replace the tree by deinitializing and reinitializing its root. This special need of the root makes the data structure awkward everywhere in the tree.

The general plan is to separate the APIs into "block backend", for use by device models, monitor and whatever other code dealing with block backends, and "block driver", for use by the block layer and whatever other code (if any) dealing with trees and tree nodes.

Code dealing with block backends, device models in particular, should become completely oblivious of BlockDriverState. This should let us clean up both APIs, and the tree data structures.

This commit is a first step. It creates a minimal "block backend" API: type BlockBackend and functions to create, destroy and find them.

BlockBackend objects are created and destroyed exactly when root BlockDriverState objects are created and destroyed. "Root" in the sense of "in bdrv_states". They're not yet used for anything; that'll come shortly.

A root BlockDriverState is created with bdrv_new_root(), so where to create a BlockBackend is obvious. Where these roots get destroyed isn't always as obvious.

It is obvious in qemu-img.c, qemu-io.c and qemu-nbd.c, and in error paths of blockdev_init(), blk_connect(). That leaves destruction of objects successfully created by blockdev_init() and blk_connect().

blockdev_init() is used only by drive_new() and qmp_blockdev_add(). Objects created by the latter are currently indestructible (see commit 48f364d "blockdev: Refuse to drive_del something added with blockdev-add" and commit 2d246f0 "blockdev: Introduce DriveInfo.enable_auto_del"). Objects created by the former get destroyed by drive_del().

Objects created by blk_connect() get destroyed by blk_disconnect().

BlockBackend is reference-counted. Its reference count never exceeds one so far, but that's going to change. In drive_del(), the BB's reference count is surely one now. The BDS's reference count is greater than one when something else is holding a reference, such as a block job. In this case, the BB is destroyed right away, but the BDS lives on until all extra references get dropped.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

2014-09-22  block: delete cow block driver  (Stefan Hajnoczi)
This patch removes support for the cow file format.

Normally we do not break backwards compatibility but in this case there is no impact and it is the most logical option. Extraordinary claims require extraordinary evidence so I will show why removing the cow block driver is the right thing to do.

The cow file format is the disk image format for Usermode Linux, a way of running a Linux system in userspace. The performance of UML was never great and it was hacky, but it enjoyed some popularity before hardware virtualization support became mainstream.

QEMU's block/cow.c is supposed to read this image file format. Unfortunately the file format was underspecified:

 1. Earlier Linux versions used the MAXPATHLEN constant for the backing filename field. The value of MAXPATHLEN can change, so Linux switched to a 4096 literal but QEMU has a 1024 literal.

 2. Padding was not used on the header struct (both in the Linux kernel and in QEMU) so the struct layout varied across architectures. In particular, i386 and x86_64 were different due to int64_t alignment differences. Linux now uses __attribute__((packed)), QEMU does not.

Therefore:

 1. QEMU cow images do not conform to the Linux cow image file format.
 2. cow images cannot be shared between different host architectures.

This means QEMU cow images are useless and QEMU has not had bug reports from users actually hitting these issues. Let's get rid of this thing, it serves no purpose and no one will be affected.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1410877464-20481-1-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

2014-09-22  block: Introduce "null" drivers  (Fam Zheng)
This is an analogue to Linux null_blk. It can be used for testing or benchmarking block device emulation and general block layer functionalities such as coroutines and throttling, where disk IO is not necessary or wanted.

Use null-aio:// for AIO version, and null-co:// for coroutine version.

[Resolved conflict with Fam's async bdrv_aio_cancel() series:
 1. Drop .bdrv_aio_cancel() since it is now done by block.c
 2. Rename qemu_aio_release() to qemu_aio_unref()
--Stefan]

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1410415798-20673-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

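A quick sketch of exercising the new drivers with qemu-io (no real disk I/O happens: reads return unspecified data and writes are discarded; the explicit raw format skips probing):

  qemu-io -f raw -c 'write 0 1M' -c 'read 0 1M' null-co://
  qemu-io -f raw -c 'aio_write 0 1M' -c 'aio_read 0 1M' null-aio://
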
2014-09-10  block: Extract the block accounting code  (Benoît Canet)
The plan is to add new accounting metrics (latency, invalid requests, failed requests, queue depth) and block.c is overpopulated, so it will be better to work in a separate module.

Moreover the long term plan is to have statistics in each of the BDSs of the graph for metrology purposes; this means that the device model statistics must move from the topmost BDS to the device model. So we need to decouple the statistics code from BlockDriverState. This is another argument for extracting the code into a separate module.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Benoit Canet <benoit@irqsave.net>
CC: Fam Zheng <famz@redhat.com>
CC: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

2014-08-29  aio-win32: add support for sockets  (Paolo Bonzini)
Uses the same select/WSAEventSelect scheme as main-loop.c. WSAEventSelect() is edge-triggered, so it cannot be used directly, but it is still used as a way to exit from a blocking g_poll(). Before g_poll() is called, we poll sockets with a non-blocking select() to achieve the level-triggered semantics we require: if a socket is ready, the g_poll() is made non-blocking too.

Based on a patch from Or Goshen.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

2014-08-15  block: Support Archipelago as a QEMU block backend  (Chrysostomos Nanakos)
A VM image on an Archipelago volume is specified like this:

  file.driver=archipelago,file.volume=<volumename>[,file.mport=<mapperd_port>[,file.vport=<vlmcd_port>][,file.segment=<segment_name>]]

'archipelago' is the protocol.

'mport' is the port number on which mapperd is listening. This is optional and if not specified, QEMU will make Archipelago use the default port.

'vport' is the port number on which vlmcd is listening. This is optional and if not specified, QEMU will make Archipelago use the default port.

'segment' is the name of the shared memory segment the Archipelago stack is using. This is optional and if not specified, QEMU will make Archipelago use the default value, 'archipelago'.

Examples:
  file.driver=archipelago,file.volume=my_vm_volume
  file.driver=archipelago,file.volume=my_vm_volume,file.mport=123
  file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,file.vport=1234
  file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,file.vport=1234,file.segment=my_segment

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

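Wired into a full command line, this might look roughly like the following (the volume name and remaining options are illustrative, not from the patch):

  qemu-system-x86_64 -m 1024 -drive format=raw,if=virtio,file.driver=archipelago,file.volume=my_vm_volume
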
2014-02-25  Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging  (Peter Maydell)
Block patches

# gpg: Signature made Fri 21 Feb 2014 21:42:24 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (54 commits)
  iotests: Mixed quorum child device specifications
  quorum: Simplify quorum_open()
  quorum: Add unit test.
  quorum: Add quorum_open() and quorum_close().
  quorum: Implement recursive .bdrv_recurse_is_first_non_filter in quorum.
  quorum: Add quorum_co_flush().
  quorum: Add quorum_invalidate_cache().
  quorum: Add quorum_getlength().
  quorum: Add quorum mechanism.
  quorum: Add quorum_aio_readv.
  blkverify: Extract qemu_iovec_clone() and qemu_iovec_compare() from blkverify.
  quorum: Add quorum_aio_writev and its dependencies.
  quorum: Create BDRVQuorumState and BlkDriver and do init.
  quorum: Create quorum.c, add QuorumChildRequest and QuorumAIOCB.
  check-qdict: Test termination of qdict_array_split()
  check-qdict: Adjust test for qdict_array_split()
  qdict: Extract non-QDicts in qdict_array_split()
  qemu-config: Sections must consist of keys
  qemu-iotests: Check qemu-img command line parsing
  qemu-img: Allow -o help with incomplete argument list
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

2014-02-21  quorum: Add quorum mechanism.  (Benoît Canet)
This patchset enables the core of the quorum mechanism. The num_children reads are compared to get the majority version, and if this version exists more than threshold times the guest won't see the error at all.

If a block is corrupted, if an error occurs during an IO, or if the quorum cannot be established, QMP events are used to report to the management layer.

Use gnutls's SHA-256 to compare versions.

--enable-quorum must be used to enable the feature.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

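A sketch of how a quorum disk might be configured from the command line, assuming a QEMU built with --enable-quorum (image names and the child count are illustrative):

  qemu-system-x86_64 -drive if=virtio,driver=quorum,vote-threshold=2,children.0.driver=raw,children.0.file.filename=disk-a.raw,children.1.driver=raw,children.1.file.filename=disk-b.raw,children.2.driver=raw,children.2.file.filename=disk-c.raw
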
2014-02-21  quorum: Create quorum.c, add QuorumChildRequest and QuorumAIOCB.  (Benoît Canet)
Quorum is a block filter mirroring writes to num_children children. For reads, quorum reads each child and does a vote. If more than vote_threshold versions are identical, the quorum is reached and this winning version is returned to the guest. So quorum prevents bit corruption.

For high availability purposes, minority errors are reported via QMP but the guest does not see them.

This patch creates the driver C source file and introduces the structures that will be used in asynchronous reads and writes.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

2014-02-20  block: use per-object cflags and libs  (Fam Zheng)
No longer add flags and libs for them to global variables; instead create config-host.mak variables like FOO_CFLAGS and FOO_LIBS, which are used as per-object cflags and libs.

This removes unwanted dependencies from libcacard.

Signed-off-by: Fam Zheng <famz@redhat.com>
[Split from Fam's patch to enable modules. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2014-02-09  block: add native support for NFS  (Peter Lieven)
This patch adds native support for accessing images on NFS shares without the requirement to actually mount the entire NFS share on the host.

NFS images can simply be specified by a URL of the form:

  nfs://<host>/<export>/<filename>[?param=value[&param2=value2[&...]]]

For example:

  qemu-img create -f qcow2 nfs://10.0.0.1/qemu-images/test.qcow2

You need LibNFS from Ronnie Sahlberg, available at git://github.com/sahlberg/libnfs.git, for this to work. During configure it is automatically probed for libnfs and support is enabled on-the-fly. You can forbid or enforce libnfs support with --disable-libnfs or --enable-libnfs respectively.

Due to NFS restrictions you might need to execute your binaries as root, allow them to open privileged ports (<1024) or specify the insecure option on the NFS server.

For additional information on ROOT vs. non-ROOT operation and URL format + parameters see:

  https://raw.github.com/sahlberg/libnfs/master/README

The uid, gid and tcp-syncnt URL parameters are supported by QEMU. LibNFS currently supports NFS version 3 only.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

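A sketch of booting a guest straight off such an image, including the URL parameters mentioned above (the address, export path and uid/gid values are illustrative):

  ./configure --enable-libnfs && make
  qemu-system-x86_64 -drive file=nfs://10.0.0.1/qemu-images/test.qcow2,format=qcow2,if=virtio
  # URL parameters go into the URL itself; quote the option so '&' is not taken by the shell
  qemu-system-x86_64 -drive "file=nfs://10.0.0.1/qemu-images/test.qcow2?uid=1000&gid=1000,format=qcow2,if=virtio"
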
2013-12-16  Split nbd block client code  (Marc-André Lureau)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

2013-11-07  block: vhdx - log parsing, replay, and flush support  (Jeff Cody)
This adds support for VHDX v0 logs, as specified in Microsoft's VHDX Specification Format v1.00: https://www.microsoft.com/en-us/download/details.aspx?id=34750

The following support is added:
 * Log parsing and validation - validate that an existing log is correct.
 * Log search - search through an existing log, to find any valid sequence of entries.
 * Log replay and flush - replay an existing log, and flush/clear the log when complete.

The VHDX log is a circular buffer, with elements (sectors) of 4KB. A log entry is a variable-length number of sectors, comprised of a header and 'descriptors' that describe each sector.

A log may contain multiple entries, known as a log sequence. In a log sequence, each log entry immediately follows the previous entry, with an incrementing sequence number. There can only ever be one active and valid sequence in the log.

Each log entry must match the file log GUID in order to be valid (along with other criteria). Once we have flushed all valid log entries, we mark the file log GUID as zero, which indicates a buffer with no valid entries.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

2013-11-07  block: vhdx - break endian translation functions out  (Jeff Cody)
This moves the endian translation functions out from the vhdx.c source, into a separate source file. In addition to the previously defined endian functions, new endian translation functions for log support are added as well.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

2013-11-07  block: vhdx - add header update capability.  (Jeff Cody)
This adds the ability to update the headers in a VHDX image, including generating a new MS-compatible GUID.

As VHDX depends on uuid.h, VHDX is now a configurable build option. If VHDX support is enabled, that will also enable uuid. The default is to have VHDX enabled.

To enable/disable VHDX: --enable-vhdx, --disable-vhdx

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

2013-08-30  switch raw block driver from "raw.o" to "raw_bsd.o"  (Laszlo Ersek)
"Incoming" function prototypes and "outgoing" function calls must match reality. Implemented using the "struct BlockDriver" definition in "include/block/block_int.h", and gcc errors & warnings.

v1->v2:

On 08/20/13 09:51, Kevin Wolf wrote:
> Am 18.08.2013 um 16:29 hat Paolo Bonzini geschrieben:
>> Il 16/08/2013 16:15, Laszlo Ersek ha scritto:
>>> +static int raw_reopen_prepare(BDRVReopenState *reopen_state,
>>> +                              BlockReopenQueue *queue, Error **errp)
>>> {
>>> -    return bdrv_reopen_prepare(bs->file);
>>> +    BDRVReopenState tmp = *reopen_state;
>>> +
>>> +    tmp.bs = tmp.bs->file;
>>> +    return bdrv_reopen_prepare(&tmp, queue, errp);
>>> }
>>
>> This should just return zero, my fault.
>
> Which is because bdrv_reopen_queue() already queues bs->file for reopen.
> The simple return 0; implementation is shared by all other format drivers
> that support reopening images.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

2013-06-28  block: add basic backup support to block driver  (Dietmar Maurer)
backup_start() creates a block job that copies a point-in-time snapshot of a block device to a target block device.

We call backup_do_cow() for each write during backup. That function reads the original data from the block device before it gets overwritten. The data is then written to the target device.

Currently backup cluster size is hardcoded to 65536 bytes.

[I made a number of changes to Dietmar's original patch and folded them in to make code review easy. Here is the full list:
 * Drop BackupDumpFunc interface in favor of a target block device
 * Detect zero clusters with buffer_is_zero() and use bdrv_co_write_zeroes()
 * Use 0 delay instead of 1us, like other block jobs
 * Unify creation/start functions into backup_start()
 * Simplify cleanup, free bitmap in backup_run() instead of cb function
 * Use HBitmap to avoid duplicating bitmap code
 * Use bdrv_getlength() instead of accessing ->total_sectors directly
 * Delete the backup.h header file, it is no longer necessary
 * Move ./backup.c to block/backup.c
 * Remove #ifdefed out code
 * Coding style and whitespace cleanups
 * Use bdrv_add_before_write_notifier() instead of blockjob-specific hooks
 * Keep our own in-flight CowRequest list instead of using block.c tracked requests. This means a little code duplication but is much simpler than trying to share the tracked requests list and use the backup block size.
 * Add on_source_error and on_target_error error handling.
 * Use trace events instead of DPRINTF()
-- stefanha]

Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

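A sketch of how such a job is typically started from QMP via the drive-backup command (the drive id, socket path and target path are illustrative):

  qemu-system-x86_64 -qmp unix:/tmp/qmp.sock,server,nowait \
      -drive file=/images/guest.qcow2,format=qcow2,if=virtio,id=drive0
  # then, on the QMP socket after capability negotiation, something like:
  #   { "execute": "drive-backup",
  #     "arguments": { "device": "drive0", "sync": "full",
  #                    "format": "qcow2", "target": "/backup/guest-backup.qcow2" } }
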
2013-06-04  block: move qmp and info dump related code to block/qapi.c  (Wenchao Xia)
This patch is a pure code move patch, except for the following modifications:
 1. get_human_readable_size() is changed to a static function.
 2. dump_human_image_info() is renamed to bdrv_image_info_dump().
 3. In qmp_query_block() and qmp_query_blockstats(), use bdrv_next(bs) instead of directly traversing the global array 'bdrv_states'.
 4. collect_snapshots() and collect_image_info() are renamed; the unused parameter *fmt in collect_image_info() is removed.
 5. Code style fixes.

To avoid conflicts and give a better hint, the macro in the header file is BLOCK_QAPI_H instead of QAPI_H. Now block.h and snapshot.h are at the same level in the include path; block_int.h and qapi.h will both include them.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

2013-06-04  block: move snapshot code in block.c to block/snapshot.c  (Wenchao Xia)
All snapshot related code, except bdrv_snapshot_dump() and bdrv_is_snapshot(), is moved to block/snapshot.c. bdrv_snapshot_dump() will be moved to another file later. bdrv_is_snapshot() is not related to internal snapshots.

It also fixes small code style errors reported by the check script.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

2013-05-03  block: initial VHDX driver support framework - supports open and probe  (Jeff Cody)
This is the initial block driver framework for VHDX image support (i.e. Hyper-V image file formats), that supports opening VHDX files and parsing the headers.

This commit does not yet enable:
 - reading
 - writing
 - updating the header
 - differencing files (images with parents)
 - log replay / dirty logs (only clean images)

This is based on Microsoft's VHDX specification: "VHDX Format Specification v0.95", published 4/12/2012: https://www.microsoft.com/en-us/download/details.aspx?id=29681

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

2013-04-15  block: Add support for Secure Shell (ssh) block device.  (Richard W.M. Jones)
  qemu-system-x86_64 -drive file=ssh://hostname/some/image

QEMU will ssh into 'hostname' and open '/some/image', which is made available as a standard block device.

You can specify a username (ssh://user@host/...) and/or a port number (ssh://host:port/...). You can also use an alternate syntax using properties (file.user, file.host, file.port, file.path).

Current limitations:
 - Authentication must be done without passwords or passphrases, using ssh-agent. Other authentication methods are not supported.
 - Uses a single connection, instead of concurrent AIO with multiple SSH connections.

This is implemented using libssh2 on the client side. The server just requires a regular ssh daemon with sftp-server support. Most ssh daemons on Unix/Linux systems will work out of the box.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

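Both spellings, URL and properties, side by side as a sketch (host, user and path are illustrative):

  qemu-system-x86_64 -drive file=ssh://user@hostname:22/some/image,if=virtio
  qemu-system-x86_64 -drive file.driver=ssh,file.user=user,file.host=hostname,file.port=22,file.path=/some/image,if=virtio
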
2012-12-19  build: move rules from Makefile to */Makefile.objs  (Paolo Bonzini)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-31  Merge remote-tracking branch 'origin/master' into threadpool  (Paolo Bonzini)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-31  raw-win32: implement native asynchronous I/O  (Paolo Bonzini)
With the new support for EventNotifiers in the AIO event loop, we can hook a completion port to every opened file and use asynchronous I/O on them. Wine's support is extremely inefficient, also because it really does the I/O synchronously on regular files. (!) But it works, and it is good to keep the Win32 and POSIX ports as similar as possible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-31  raw-posix: move linux-aio.c to block/  (Paolo Bonzini)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-30  aio: add Win32 implementation  (Paolo Bonzini)
The Win32 implementation will only accept EventNotifiers, thus a few drivers are disabled under Windows. EventNotifiers are a good match for the GSource implementation, too, because the Win32 port of glib allows placing their HANDLEs in a GPollFD.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2012-10-24  mirror: introduce mirror job  (Paolo Bonzini)
This patch adds the implementation of a new job that mirrors a disk to a new image while letting the guest continue using the old image. The target is treated as a "black box" and data is copied from the source to the target in the background. This can be used for several purposes, including storage migration, continuous replication, and observation of the guest I/O in an external program. It is also a first step in replacing the inefficient block migration code that is part of QEMU.

The job is possibly never-ending, but it is logically structured into two phases:
 1) copy all data as fast as possible until the target first gets in sync with the source;
 2) keep target in sync and ensure that reopening to the target gets a correct (full) copy of the source data.

The second phase is indicated by the progress in "info block-jobs" reporting the current offset to be equal to the length of the file. When the job is cancelled in the second phase, QEMU will run the job until the source is clean and quiescent, then it will report successful completion of the job. In other words, the BLOCK_JOB_CANCELLED event means that the target may _not_ be consistent with a past state of the source; the BLOCK_JOB_COMPLETED event means that the target is consistent with a past state of the source. (Note that it could already happen that management lost the race against QEMU and got a completion event instead of cancellation.)

It is not yet possible to complete the job and switch over to the target disk. The next patches will fix this and add many refinements to the basic idea introduced here. These include improved error management, some tunable knobs and performance optimizations.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

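A sketch of starting such a job from QMP, where it is exposed as drive-mirror (the drive id, socket path and target path are illustrative):

  qemu-system-x86_64 -qmp unix:/tmp/qmp.sock,server,nowait \
      -drive file=/images/guest.qcow2,format=qcow2,if=virtio,id=drive0
  # on the QMP socket, after capability negotiation, something like:
  #   { "execute": "drive-mirror",
  #     "arguments": { "device": "drive0", "target": "/images/guest-copy.qcow2",
  #                    "sync": "full", "format": "qcow2" } }
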
2012-09-28  block: move job APIs to separate files  (Paolo Bonzini)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

2012-09-28  block: add live block commit functionality  (Jeff Cody)
This adds the live commit coroutine. This iteration focuses on the commit only below the active layer, and not the active layer itself.

The behaviour is similar to block streaming; the sectors are walked through, and anything that exists above 'base' is committed back down into base. At the end, intermediate images are deleted, and the chain stitched together. Images are restored to their original open flags upon completion.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

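A sketch of kicking off a live commit of an intermediate image over QMP, where this functionality is exposed as block-commit (file names, drive id and socket path are illustrative):

  # backing chain on disk: base.qcow2 <- mid.qcow2 <- top.qcow2 (the active layer)
  qemu-system-x86_64 -qmp unix:/tmp/qmp.sock,server,nowait \
      -drive file=/images/top.qcow2,format=qcow2,if=virtio,id=drive0
  # over the QMP socket, after capability negotiation, something like:
  #   { "execute": "block-commit",
  #     "arguments": { "device": "drive0",
  #                    "top": "/images/mid.qcow2", "base": "/images/base.qcow2" } }
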
2012-09-28  block: Support GlusterFS as a QEMU block backend.  (Bharata B Rao)
This patch adds gluster as a new block backend in QEMU. This gives QEMU the ability to boot VM images from gluster volumes. It is already possible to boot from VM images on gluster volumes using a FUSE mount, but this patchset provides the ability to boot VM images from gluster volumes by bypassing the FUSE layer in gluster. This is made possible by using libgfapi routines to perform IO on gluster volumes directly.

A VM image on a gluster volume is specified like this:

  file=gluster[+transport]://[server[:port]]/volname/image[?socket=...]

'gluster' is the protocol.

'transport' specifies the transport type used to connect to the gluster management daemon (glusterd). Valid transport types are tcp, unix and rdma. If a transport type isn't specified, then tcp type is assumed.

'server' specifies the server where the volume file specification for the given volume resides. This can be either a hostname, an ipv4 address or an ipv6 address. An ipv6 address needs to be within square brackets [ ]. If the transport type is 'unix', then the 'server' field should not be specified. The 'socket' field needs to be populated with the path to the unix domain socket.

'port' is the port number on which glusterd is listening. This is optional and if not specified, QEMU will send 0 which will make gluster use the default port. If the transport type is unix, then 'port' should not be specified.

'volname' is the name of the gluster volume which contains the VM image.

'image' is the path to the actual VM image that resides on the gluster volume.

Examples:
  file=gluster://1.2.3.4/testvol/a.img
  file=gluster+tcp://1.2.3.4/testvol/a.img
  file=gluster+tcp://1.2.3.4:24007/testvol/dir/a.img
  file=gluster+tcp://[1:2:3:4:5:6:7:8]/testvol/dir/a.img
  file=gluster+tcp://[1:2:3:4:5:6:7:8]:24007/testvol/dir/a.img
  file=gluster+tcp://server.domain.com:24007/testvol/dir/a.img
  file=gluster+unix:///testvol/dir/a.img?socket=/tmp/glusterd.socket
  file=gluster+rdma://1.2.3.4:24007/testvol/a.img

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

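For example, creating an image on the volume and booting from it could look roughly like this (address and volume name are illustrative, matching the forms above):

  qemu-img create -f qcow2 gluster://1.2.3.4/testvol/a.qcow2 10G
  qemu-system-x86_64 -drive file=gluster://1.2.3.4/testvol/a.qcow2,format=qcow2,if=virtio
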
2012-06-07  build: move block/ objects to nested Makefile.objs  (Paolo Bonzini)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>