summaryrefslogtreecommitdiff
path: root/migration/migration.c
AgeCommit message (Collapse)Author
2016-05-26migration: add support for encrypting data with TLSDaniel P. Berrange
This extends the migration_set_incoming_channel and migration_set_outgoing_channel methods so that they will automatically wrap the QIOChannel in a QIOChannelTLS instance if TLS credentials are configured in the migration parameters. This allows TLS to work for tcp, unix, fd and exec migration protocols. It does not (currently) work for RDMA since it does not use these APIs, but it is unlikely that TLS would be desired with RDMA anyway since it would degrade the performance to that seen with TCP defeating the purpose of using RDMA. On the target host, QEMU would be launched with a set of TLS credentials for a server endpoint $ qemu-system-x86_64 -monitor stdio -incoming defer \ -object tls-creds-x509,dir=/home/berrange/security/qemutls,endpoint=server,id=tls0 \ ...other args... To enable incoming TLS migration 2 monitor commands are then used (qemu) migrate_set_str_parameter tls-creds tls0 (qemu) migrate_incoming tcp:myhostname:9000 On the source host, QEMU is launched in a similar manner but using client endpoint credentials $ qemu-system-x86_64 -monitor stdio \ -object tls-creds-x509,dir=/home/berrange/security/qemutls,endpoint=client,id=tls0 \ ...other args... To enable outgoing TLS migration 2 monitor commands are then used (qemu) migrate_set_str_parameter tls-creds tls0 (qemu) migrate tcp:otherhostname:9000 Thanks to earlier improvements to error reporting, TLS errors can be seen 'info migrate' when doing a detached migration. For example: (qemu) info migrate capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off x-postcopy-ram: off Migration status: failed total time: 0 milliseconds error description: TLS handshake failed: The TLS connection was non-properly terminated. Or (qemu) info migrate capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off x-postcopy-ram: off Migration status: failed total time: 0 milliseconds error description: Certificate does not match the hostname localhost Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1461751518-12128-27-git-send-email-berrange@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26migration: define 'tls-creds' and 'tls-hostname' migration parametersDaniel P. Berrange
Define two new migration parameters to be used with TLS encryption. The 'tls-creds' parameter provides the ID of an instance of the 'tls-creds' object type, or rather a subclass such as 'tls-creds-x509'. Providing these credentials will enable use of TLS on the migration data stream. If using x509 certificates, together with a migration URI that does not include a hostname, the 'tls-hostname' parameter provides the hostname to use when verifying the server's x509 certificate. This allows TLS to be used in combination with fd: and exec: protocols where a TCP connection is established by a 3rd party outside of QEMU. NB, this requires changing the migrate_set_parameter method in the HMP to accept a 's' (string) value instead of 'i' (integer). This is backwards compatible, because the parsing of strings allows the quotes to be optional, thus any integer is also a valid string. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1461751518-12128-26-git-send-email-berrange@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26migration: don't use an array for storing migrate parametersDaniel P. Berrange
The MigrateState struct uses an array for storing migration parameters. This presumes that all future parameters will be integers too, which is not going to be the case. There is no functional reason why an array is used, if anything it makes the code less clear. The QAPI schema already defines a struct - MigrationParameters - capable of storing all the individual parameters, so just use that instead of an array. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1461751518-12128-25-git-send-email-berrange@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26migration: convert exec socket protocol to use QIOChannelDaniel P. Berrange
Convert the exec socket migration protocol driver to use QIOChannel and QEMUFileChannel, instead of the stdio popen APIs. It can be unconditionally built because the QIOChannelCommand class can report suitable error messages on platforms which can't fork processes. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1461751518-12128-17-git-send-email-berrange@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26migration: convert fd socket protocol to use QIOChannelDaniel P. Berrange
Convert the fd socket migration protocol driver to use QIOChannel and QEMUFileChannel, instead of plain sockets APIs. It can be unconditionally built because the QIOChannel APIs it uses will take care to report suitable error messages if needed. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1461751518-12128-16-git-send-email-berrange@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26migration: convert unix socket protocol to use QIOChannelDaniel P. Berrange
Convert the unix socket migration protocol driver to use QIOChannel and QEMUFileChannel, instead of plain sockets APIs. It can be unconditionally built, since the socket impl of QIOChannel will report a suitable error on platforms where UNIX sockets are unavailable. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1461751518-12128-13-git-send-email-berrange@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26migration: convert post-copy to use QIOChannelBufferDaniel P. Berrange
The post-copy code does some I/O to/from an intermediate in-memory buffer rather than direct to the underlying I/O channel. Switch this code to use QIOChannelBuffer instead of QEMUSizedBuffer. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <1461751518-12128-12-git-send-email-berrange@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26migration: add reporting of errors for outgoing migrationDaniel P. Berrange
Currently if an application initiates an outgoing migration, it may or may not, get an error reported back on failure. If the error occurs synchronously to the 'migrate' command execution, the client app will see the error message. This is the case for DNS lookup failures. If the error occurs asynchronously to the monitor command though, the error will be thrown away and the client left guessing about what went wrong. This is the case for failure to connect to the TCP server (eg due to wrong port, or firewall rules, or other similar errors). In the future we'll be adding more scope for errors to happen asynchronously with the TLS protocol handshake. TLS errors are hard to diagnose even when they are well reported, so discarding errors entirely will make it impossible to debug TLS connection problems. Management apps which do migration are already using 'query-migrate' / 'info migrate' to check up on progress of background migration operations and to see their end status. This is a fine place to also include the error message when things go wrong. This patch thus adds an 'error-desc' field to the MigrationInfo struct, which will be populated when the 'status' is set to 'failed': (qemu) migrate -d tcp:localhost:9001 (qemu) info migrate capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off x-postcopy-ram: off Migration status: failed (Error connecting to socket: Connection refused) total time: 0 milliseconds In the HMP, when doing non-detached migration, it is also possible to display this error message directly to the app. (qemu) migrate tcp:localhost:9001 Error connecting to socket: Connection refused Or with QMP { "execute": "query-migrate", "arguments": {} } { "return": { "status": "failed", "error-desc": "address resolution failed for myhost:9000: No address associated with hostname" } } Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <1461751518-12128-11-git-send-email-berrange@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26migration: add helpers for creating QEMUFile from a QIOChannelDaniel P. Berrange
Currently creating a QEMUFile instance from a QIOChannel is quite simple only requiring a single call to qemu_fopen_channel_input or qemu_fopen_channel_output depending on the end of migration connection. When QEMU gains TLS support, however, there will need to be a TLS negotiation done inbetween creation of the QIOChannel and creation of the final QEMUFile. Introduce some helper methods that will encapsulate this logic, isolating the migration protocol drivers from knowledge about TLS. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Acked-by: Juan Quintela <quintela@redhat.com> Message-Id: <1461751518-12128-10-git-send-email-berrange@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26migration: force QEMUFile to blocking mode for outgoing migrationDaniel P. Berrange
Instead of relying on the default QEMUFile I/O blocking flag state, explicitly turn on blocking I/O for outgoing migration since it takes place in a background thread. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <1461751518-12128-8-git-send-email-berrange@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26migration: introduce set_blocking function in QEMUFileOpsDaniel P. Berrange
Remove the assumption that every QEMUFile implementation has a file descriptor available by introducing a new function in QEMUFileOps to change the blocking state of a QEMUFile. If not set, it will fallback to the original code using the get_fd method. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <1461751518-12128-7-git-send-email-berrange@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-23migration: regain control of images when migration fails to completeGreg Kurz
We currently have an error path during migration that can cause the source QEMU to abort: migration_thread() migration_completion() runstate_is_running() ----------------> true if guest is running bdrv_inactivate_all() ----------------> inactivate images qemu_savevm_state_complete_precopy() ... qemu_fflush() socket_writev_buffer() --------> error because destination fails qemu_fflush() -------------------> set error on migration stream migration_completion() -----------------> set migrate state to FAILED migration_thread() -----------------------> break migration loop vm_start() -----------------------------> restart guest with inactive images and you get: qemu-system-ppc64: socket_writev_buffer: Got err=104 for (32768/18446744073709551615) qemu-system-ppc64: /home/greg/Work/qemu/qemu-master/block/io.c:1342:bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed. Aborted (core dumped) If we try postcopy with a similar scenario, we also get the writev error message but QEMU leaves the guest paused because entered_postcopy is true. We could possibly do the same with precopy and leave the guest paused. But since the historical default for migration errors is to restart the source, this patch adds a call to bdrv_invalidate_cache_all() instead. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Message-Id: <146357896785.6003.11983081732454362715.stgit@bahia.huguette.org> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-23savevm: fail if migration blockers are presentGreg Kurz
QEMU has currently two ways to prevent migration to occur: - migration blocker when it depends on runtime state - VMStateDescription.unmigratable when migration is not supported at all This patch gathers all the logic into a single function to be called from both the savevm and the migrate paths. This fixes a bug with 9p, at least, where savevm would succeed and the following would happen in the guest after loadvm: $ ls /host ls: cannot access /host: Protocol error With this patch: (qemu) savevm foo Migration is disabled when VirtFS export path '/' is mounted in the guest using mount_tag 'host' Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <146239057139.11271.9011797645454781543.stgit@bahia.huguette.org> [Update subject according to Paolo's suggestion - Amit] Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-23migration: Promote improved autoconverge commands out of experimental stateJason J. Herne
The new autoconverge throttling commands have been tested for a release now. It is time to move them out of the experimental state. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Message-Id: <1461262038-8197-1-git-send-email-jjherne@linux.vnet.ibm.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-18Fix some typos found by codespellStefan Weil
Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-03-22util: move declarations out of qemu-common.hVeronia Bahaa
Move declarations out of qemu-common.h for functions declared in utils/ files: e.g. include/qemu/path.h for utils/path.c. Move inline functions out of qemu-common.h and into new files (e.g. include/qemu/bcd.h) Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-11postcopy: Remove the x-Dr. David Alan Gilbert
Postcopy seems to have survived a cycle with only a few fixes, and Jiri has the current libvirt wired up and working ( https://www.redhat.com/archives/libvir-list/2016-March/msg00080.html ) so remove the experimental tag. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <1457690016-9070-3-git-send-email-dgilbert@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-11migration: fix warning for source_return_path_threadPeter Xu
max_len is not necessary, while it brings a warning during compilation when specify "-Wstack-usage=1000000". Replacing using sizeof(). Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1457503932-31763-1-git-send-email-peterx@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-08Postcopy: Fix sync count in info migrateDr. David Alan Gilbert
I'd missed the sync count off in the postcopy case. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Message-id: 1456394631-18010-1-git-send-email-dgilbert@redhat.com Message-Id: <1456394631-18010-1-git-send-email-dgilbert@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-26migration (ordinary): move bdrv_invalidate_cache_all of of coroutine contextDenis V. Lunev
There is a possibility to hit an assert in qcow2_get_specific_info that s->qcow_version is undefined. This happens when VM in starting from suspended state, i.e. it processes incoming migration, and in the same time 'info block' is called. The problem is that qcow2_invalidate_cache() closes the image and memset()s BDRVQcowState in the middle. The patch moves processing of bdrv_invalidate_cache_all out of coroutine context for standard migration to avoid that. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Fam Zheng <famz@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Juan Quintela <quintela@redhat.com> CC: Amit Shah <amit.shah@redhat.com> Message-Id: <1456304019-10507-2-git-send-email-den@openvz.org> [Amit: Fix a use-after-free bug] Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-23Postcopy+spice: Pass spice migration data earlierDr. David Alan Gilbert
Spice hooks the migration status changes to figure out when to transmit information to the new spice server; but the migration status in postcopy doesn't quite fit - the destination starts running before the end of the source migration. It's not a case of hanging off the migration status change to postcopy-active either, since that happens before we stop the guest CPU. Fix it by sending a notify just after sending the device state, and adding a flag that can be tested by the notify receiver. Symptom: spice handover doesn't work with the error: red_worker.c:11540:display_channel_wait_for_migrate_data: timeout Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Message-id: 1456161452-25318-1-git-send-email-dgilbert@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-11rdma: remove check on time_spent when calculating mbsWei Yang
Within the if statement, time_spent is assured to be non-zero. This patch just removes the check on time_spent when calculating mbs. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-05migration: remove useless code.Liang Li
Since 's->state' will be set in migrate_init(), there is no need to set it before calling migrate_init(). The code and the related comments can be removed. Signed-off-by: Liang Li <liang.z.li@intel.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Message-Id: <1453875065-24326-1-git-send-email-liang.z.li@intel.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05migration: rename 'file' in MigrationState to 'to_dst_file'zhanghailiang
Rename the 'file' member of MigrationState to 'to_dst_file' to be consistent with to_src_file, from_src_file and from_dst_file. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Message-Id: <1452829066-9764-3-git-send-email-zhang.zhanghailiang@huawei.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-01-29migration: Clean up includesPeter Maydell
Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1453832250-766-2-git-send-email-peter.maydell@linaro.org
2016-01-20block: Inactivate BDS when migration completesKevin Wolf
So far, live migration with shared storage meant that the image is in a not-really-ready don't-touch-me state on the destination while the source is still actively using it, but after completing the migration, the image was fully opened on both sides. This is bad. This patch adds a block driver callback to inactivate images on the source before completing the migration. Inactivation means that it goes to a state as if it was just live migrated to the qemu instance on the source (i.e. BDRV_O_INACTIVE is set). You're then supposed to continue either on the source or on the destination, which takes ownership of the image. A typical migration looks like this now with respect to disk images: 1. Destination qemu is started, the image is opened with BDRV_O_INACTIVE. The image is fully opened on the source. 2. Migration is about to complete. The source flushes the image and inactivates it. Now both sides have the image opened with BDRV_O_INACTIVE and are expecting the other side to still modify it. 3. One side (the destination on success) continues and calls bdrv_invalidate_all() in order to take ownership of the image again. This removes BDRV_O_INACTIVE on the resuming side; the flag remains set on the other side. This ensures that the same image isn't written to by both instances (unless both are resumed, but then you get what you deserve). This is important because .bdrv_close for non-BDRV_O_INACTIVE images could write to the image file, which is definitely forbidden while another host is using the image. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com>
2016-01-13migration: Add state records for migration incomingzhanghailiang
For migration destination, we also need to know its state, we will use it in COLO. Here we add a new member 'state' for MigrationIncomingState, and also use migrate_set_state() to modify its value. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> dgilbert: Fixed early free of MigraitonIncomingState Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Message-Id: <1450266458-3178-3-git-send-email-dgilbert@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-01-13migration: Export migrate_set_state()zhanghailiang
Change the first parameter of migrate_set_state(), and export it. We will use it in a later patch to update incoming state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Updated comment as per Juan's review Message-Id: <1450266458-3178-2-git-send-email-dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-12-17qapi: Don't let implicit enum MAX member collideEric Blake
Now that we guarantee the user doesn't have any enum values beginning with a single underscore, we can use that for our own purposes. Renaming ENUM_MAX to ENUM__MAX makes it obvious that the sentinel is generated. This patch was mostly generated by applying a temporary patch: |diff --git a/scripts/qapi.py b/scripts/qapi.py |index e6d014b..b862ec9 100644 |--- a/scripts/qapi.py |+++ b/scripts/qapi.py |@@ -1570,6 +1570,7 @@ const char *const %(c_name)s_lookup[] = { | max_index = c_enum_const(name, 'MAX', prefix) | ret += mcgen(''' | [%(max_index)s] = NULL, |+// %(max_index)s | }; | ''', | max_index=max_index) then running: $ cat qapi-{types,event}.c tests/test-qapi-types.c | sed -n 's,^// \(.*\)MAX,s|\1MAX|\1_MAX|g,p' > list $ git grep -l _MAX | xargs sed -i -f list The only things not generated are the changes in scripts/qapi.py. Rejecting enum members named 'MAX' is now useless, and will be dropped in the next patch. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1447836791-369-23-git-send-email-eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> [Rebased to current master, commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-12-03migration: do floating-point divisionPaolo Bonzini
Dividing integer expressions transferred_bytes and time_spent, and then converting the integer quotient to type double. Any remainder, or fractional part of the quotient, is ignored. Fix this. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-19Unneeded NULL checkDr. David Alan Gilbert
The check is unneccesary, we read the value at the start of the thread, use it, and never change it. The value is checked to be non-NULL before thread creation. Spotted by coverity, CID 1339211 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-19migration: Dead assignment of current_timeDr. David Alan Gilbert
I set current_time before the postcopy test but never use it; (I think this was from the original version where it was time based). Spotted by coverity, CID 1339208 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-12migration_init: Fix lock initialisation/make it explicitDr. David Alan Gilbert
Peter reported a lock error on MacOS after my a82d593b patch. migrate_get_current does one-time initialisation of a bunch of variables. migrate_init does reinitialisation even on a 2nd migrate after a cancel. The problem here was that I'd initialised the mutex in migrate_get_current, and the memset in migrate_init corrupted it. Remove the memset and replace it by explicit initialisation of fields that need initialising; this also turns out to be simpler than the old code that had to preserve some fields. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixes: a82d593b Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-12migrate-start-postcopy: Improve textDr. David Alan Gilbert
Improve the text in both the qapi-schema and hmp help to point out you need to set the postcopy-ram capability prior to issuing migrate-start-postcopy. Also fix the text of the migrate_start_postcopy error that deals with capabilities. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-12Finish non-postcopiable iterative devices before packageDr. David Alan Gilbert
Where we have iterable, but non-postcopiable devices (e.g. htab or block migration), complete them before forming the 'package' but with the CPUs stopped. This stops them filling up the package. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10End of migration for postcopyDr. David Alan Gilbert
Tweak the end of migration cleanup; we don't want to close stuff down at the end of the main stream, since the postcopy is still sending pages on the other thread. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Start up a postcopy/listener thread ready for incoming page dataDr. David Alan Gilbert
The loading of a device state (during postcopy) may access guest memory that's still on the source machine and thus might need a page fill; split off a separate thread that handles the incoming page data so that the original incoming migration code can finish off the device data. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Don't iterate on precopy-only devices during postcopyDr. David Alan Gilbert
During the postcopy phase we must not call the iterate method on precopy-only devices, since they may have done some cleanup during the _complete call at the end of the precopy phase. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Page request: Process incoming page requestDr. David Alan Gilbert
On receiving MIG_RPCOMM_REQ_PAGES look up the address and queue the page. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Page request: Add MIG_RP_MSG_REQ_PAGES reverse commandDr. David Alan Gilbert
Add MIG_RP_MSG_REQ_PAGES command on Return path for the postcopy destination to request a page from the source. Two versions exist: MIG_RP_MSG_REQ_PAGES_ID that includes a RAMBlock name and start/len MIG_RP_MSG_REQ_PAGES that just has start/len for use with the same RAMBlock as a previous MIG_RP_MSG_REQ_PAGES_ID Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Postcopy: End of iterationDr. David Alan Gilbert
The end of migration in postcopy is a bit different since some of the things normally done at the end of migration have already been done on the transition to postcopy. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Postcopy: Postcopy startup in migration threadDr. David Alan Gilbert
Rework the migration thread to setup and start postcopy. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10migration_completion: Take current stateDr. David Alan Gilbert
Soon we'll be in either ACTIVE or POSTCOPY_ACTIVE when we complete migration, and we need to know which we expect to be in to change state safely. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10MIGRATION_STATUS_POSTCOPY_ACTIVE: Add new migration stateDr. David Alan Gilbert
'MIGRATION_STATUS_POSTCOPY_ACTIVE' is entered after migrate_start_postcopy 'migration_in_postcopy' is provided for other sections to know if they're in postcopy. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10migration_completion: Take current stateDr. David Alan Gilbert
Soon we'll be in either ACTIVE or POSTCOPY_ACTIVE when we complete migration, and we need to know which we expect to be in to change state safely. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10migrate_start_postcopy: Command to trigger transition to postcopyDr. David Alan Gilbert
Once postcopy is enabled (with migrate_set_capability), the migration will still start on precopy mode. To cause a transition into postcopy the: migrate_start_postcopy command must be issued. Postcopy will start sometime after this (when it's next checked in the migration loop). Issuing the command before migration has started will error, and issuing after it has finished is ignored. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Modify save_live_pending for postcopyDr. David Alan Gilbert
Modify save_live_pending to return separate postcopiable and non-postcopiable counts. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Add wrappers and handlers for sending/receiving the postcopy-ram migration ↵Dr. David Alan Gilbert
messages. The state of the postcopy process is managed via a series of messages; * Add wrappers and handlers for sending/receiving these messages * Add state variable that track the current state of postcopy Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Add migration-capability boolean for postcopy-ram.Dr. David Alan Gilbert
The 'postcopy ram' capability allows postcopy migration of RAM; note that the migration starts off in precopy mode until postcopy mode is triggered (see the migrate_start_postcopy patch later in the series). Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10Rework loadvm path for subloopsDr. David Alan Gilbert
Postcopy needs to have two migration streams loading concurrently; one from memory (with the device state) and the other from the fd with the memory transactions. Split the core of qemu_loadvm_state out so we can use it for both. Allow the inner loadvm loop to quit and cause the parent loops to exit as well. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>