path: root/util
Age | Commit message | Author
2017-10-24 | osdep: introduce qemu_mprotect_rwx/none | Emilio G. Cota
Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2017-10-20 | oslib-posix: Fix compiler warning and some data types | Stefan Weil
gcc warning: /qemu/util/oslib-posix.c:304:11: error: variable ‘addr’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered] Fix also some related data types: numpages, hpagesize are used as pointer offset. Always use size_t for them and also for the derived numpages_per_thread and size_per_thread. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-id: 20171016202912.1117-1-sw@weilnetz.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-10-16 | sockets: Handle race condition between binds to the same port | Knut Omang
If an offset of ports is specified to the inet_listen_saddr() function, and two or more processes try to bind from these ports at the same time, occasionally more than one process may be able to bind to the same port. The condition is detected by listen() but too late to avoid a failure. This function is called by socket_listen() and used by all socket listening code in QEMU, so all cases where any form of dynamic port selection is used should be subject to this issue. Add code to close and re-establish the socket when this condition is observed, hiding the race condition from the user. Also clean up some issues with error handling to allow more accurate reporting of the cause of an error. This has been developed and tested by means of the test-listen unit test in the previous commit. Enable the test for make check now that it passes. Reviewed-by: Bhavesh Davda <bhavesh.davda@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Girish Moodalbail <girish.moodalbail@oracle.com> Signed-off-by: Knut Omang <knut.omang@oracle.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
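The close-and-retry idea described in this commit message can be sketched roughly as follows. This is not the QEMU code: `listen_on_port_range()`, the retry limit, and the loopback-only IPv4 address are assumptions made for illustration. The point is only that an `EADDRINUSE` from `listen()` is treated as a lost race and handled by discarding the socket and starting over.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listen on the first usable port in
 * [port_min, port_max]. Returns a listening fd, or -1 with errno set. */
int listen_on_port_range(int port_min, int port_max)
{
    enum { MAX_RETRIES = 3 };   /* assumed retry limit per port */

    for (int port = port_min; port <= port_max; port++) {
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            int fd = socket(AF_INET, SOCK_STREAM, 0);
            if (fd < 0) {
                return -1;
            }
            int on = 1;
            setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

            struct sockaddr_in addr;
            memset(&addr, 0, sizeof(addr));
            addr.sin_family = AF_INET;
            addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
            addr.sin_port = htons((unsigned short)port);

            if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
                int saved = errno;
                close(fd);
                if (saved == EADDRINUSE) {
                    break;          /* port taken: move on to the next one */
                }
                errno = saved;
                return -1;
            }
            if (listen(fd, 1) == 0) {
                return fd;          /* bound and listening */
            }
            int saved = errno;
            close(fd);
            if (saved != EADDRINUSE) {
                errno = saved;
                return -1;
            }
            /* bind() succeeded but listen() lost the race: retry the same
             * port with a fresh socket instead of failing. */
        }
    }
    errno = EADDRINUSE;
    return -1;
}
```

A caller would pass the lowest and highest acceptable port and close the returned descriptor when done.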
2017-10-16 | sockets: factor out create_fast_reuse_socket | Knut Omang
Another refactoring step to prepare for fixing the problem exposed with the test-listen test in the previous commit Signed-off-by: Knut Omang <knut.omang@oracle.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-10-16 | sockets: factor out a new try_bind() function | Knut Omang
A refactoring step to prepare for the problem exposed by the test-listen test in the previous commit. Simplify and reorganize the IPv6 specific extra measures and move it out of the for loop to increase code readability. No semantic changes. Signed-off-by: Knut Omang <knut.omang@oracle.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-10-10 | util: move qemu_real_host_page_size/mask to osdep.h | Emilio G. Cota
These only depend on the host and therefore belong in the common osdep, not in a target-dependent object. While at it, query the host during an init constructor, which guarantees the page size will be well-defined throughout the execution of the program. Suggested-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
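Querying the host once from an init constructor can look roughly like the sketch below. The variable and function names are hypothetical stand-ins, not the identifiers used in QEMU, and the sketch assumes a GCC/Clang-style `constructor` attribute plus a POSIX `sysconf()`.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical globals standing in for the host page size and mask. */
static uintptr_t example_host_page_size;
static intptr_t example_host_page_mask;

/* Runs before main(), so the values are defined for the whole
 * lifetime of the program. */
static void __attribute__((constructor)) example_init_host_page_size(void)
{
    example_host_page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    example_host_page_mask = -(intptr_t)example_host_page_size;
}

/* Round an address down to the start of its host page. */
uintptr_t example_page_align_down(uintptr_t addr)
{
    return addr & (uintptr_t)example_host_page_mask;
}

uintptr_t example_get_host_page_size(void)
{
    return example_host_page_size;
}
```

Because the constructor runs before any other code reads the values, callers need no lazy-initialization check.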
2017-10-09 | config: qemu_config_parse() return number of config groups | Eduardo Habkost
Change qemu_config_parse() to return the number of config groups in success and -EINVAL on error. This will allow callers of qemu_config_parse() to check if something was really loaded from the config file. All existing callers of qemu_config_parse() and qemu_read_config_file() only check if the return value was negative, so the change shouldn't affect them. Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20171004025043.3788-2-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
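The return convention described here (a group count on success, `-EINVAL` on error) can be illustrated with a toy parser. `count_config_groups()` is a hypothetical function written for this example and shares nothing with the real parser except the convention.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical parser: counts "[group]" header lines in INI-like text.
 * Returns the number of groups found, or -EINVAL on a malformed header. */
int count_config_groups(const char *text)
{
    int groups = 0;
    const char *line = text;

    while (*line) {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        if (len > 0 && line[0] == '[') {
            if (len < 3 || line[len - 1] != ']') {
                return -EINVAL;     /* unterminated or empty header */
            }
            groups++;
        }
        line += len + (end ? 1 : 0);
    }
    return groups;
}
```

Callers that only test for a negative value keep working, while new callers can treat a return of 0 as "the file parsed, but nothing was loaded".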
2017-10-06 | hbitmap: Rename serialization_granularity to serialization_align | Eric Blake
The only client of hbitmap_serialization_granularity() is dirty-bitmap's bdrv_dirty_bitmap_serialization_align(). Keeping the two names consistent is worthwhile, and the shorter name is more representative of what the function returns (the required alignment to be used for start/count of other serialization functions, where violating the alignment causes assertion failures). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-10-03 | aio: fix assert when remove poll during destroy | Stefan Hajnoczi
After iothread is enabled internally inside QEMU with GMainContext, we may encounter this warning when destroying the iothread: (qemu-system-x86_64:19925): GLib-CRITICAL **: g_source_remove_poll: assertion '!SOURCE_DESTROYED (source)' failed The problem is that g_source_remove_poll() does not allow removing a source from the array if the source is detached from its owner context. (peterx: which IMHO does not make much sense) Fix it on the QEMU side by not calling g_source_remove_poll() if we know the object is being destroyed; we won't leak anything, since the array will soon be gone cleanly even with that fd. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-id: 20170928025958.1420-6-peterx@redhat.com [peterx: write the commit message] Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-09-27 | Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging | Peter Maydell
Block layer patches # gpg: Signature made Tue 26 Sep 2017 14:52:32 BST # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (24 commits) block/qcow2-bitmap: fix use of uninitialized pointer qemu-iotests: add shrinking image test qcow2: add shrink image support qcow2: add qcow2_cache_discard qemu-img: add --shrink flag for resize iotests: fix 181: enable postcopy-ram capability on target qemu-iotests: Test change-backing-file command block: Fix permissions after bdrv_reopen() block: reopen: Queue children after their parents block: Base permissions on rw state after reopen block: Add reopen queue to bdrv_check_perm() block: Add reopen_queue to bdrv_child_perm() qemu-io: Drop write permissions before read-only reopen block: Clean up some bad code in the vvfat driver block/throttle-groups.c: allocate RestartData on the heap throttle: Assert that bkt->max is valid in throttle_compute_wait() iotests: Print full path of bad output if mismatch iotests: use virtio aliases for 067 iotests: use -ccw on s390x for 051 iotests: use -ccw on s390x for 040, 139, and 182 ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-09-26 | throttle: Assert that bkt->max is valid in throttle_compute_wait() | Alberto Garcia
If bkt->max == 0 and bkt->burst_length > 1 then we could have a division by 0 in throttle_do_compute_wait(). That configuration is however not permitted and is already detected by throttle_is_valid(), but let's assert it in throttle_compute_wait() to make it explicit. Found by Coverity (CID: 1381016). Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-26 | util/qemu-thread-posix.c: Replace OS ifdefs with CONFIG_HAVE_SEM_TIMEDWAIT | Peter Maydell
In qemu-thread-posix.c we have two implementations of the various qemu_sem_* functions, one of which uses native POSIX sem_* and the other of which emulates them with pthread conditions. This is necessary because not all our host OSes support sem_timedwait(). Instead of a hard-coded list of OSes which don't implement sem_timedwait(), which gets out of date, make configure test for the presence of the function and set a new CONFIG_HAVE_SEM_TIMEDWAIT appropriately. In particular, newer NetBSDs have sem_timedwait(), so this commit will switch them over to using it. OSX still does not have an implementation. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Kamil Rytarowski <n54@gmx.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-09-22 | bitmap: provide to_le/from_le helpers | Peter Xu
Provide helpers to convert bitmaps to little endian format. They can be used when we want to send a bitmap over the network to other hosts. One thing to mention is that these helpers only solve the problem of endianness; they do not solve the problem of different word sizes on machines (bitmaps managing the same number of bits may have different sizes when malloced). So we need to take care of the size alignment issue in the callers for now. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
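One way to picture the endianness conversion is to write each bitmap word out byte by byte, least significant byte first. The two functions below are hypothetical illustrations, not the helpers added by this commit, and like those helpers they leave the host word size visible in the output length.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store each word least-significant byte first.
 * dst must hold nwords * sizeof(unsigned long) bytes. */
void words_to_le_bytes(const unsigned long *src, size_t nwords, uint8_t *dst)
{
    for (size_t i = 0; i < nwords; i++) {
        for (size_t b = 0; b < sizeof(unsigned long); b++) {
            dst[i * sizeof(unsigned long) + b] = (uint8_t)(src[i] >> (8 * b));
        }
    }
}

/* Inverse of words_to_le_bytes(): rebuild host-order words. */
void le_bytes_to_words(const uint8_t *src, size_t nwords, unsigned long *dst)
{
    for (size_t i = 0; i < nwords; i++) {
        unsigned long w = 0;
        for (size_t b = 0; b < sizeof(unsigned long); b++) {
            w |= (unsigned long)src[i * sizeof(unsigned long) + b] << (8 * b);
        }
        dst[i] = w;
    }
}
```

Because the byte order is fixed by the shifts rather than by the host's memory layout, the same code works on big-endian and little-endian machines.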
2017-09-22 | bitmap: introduce bitmap_count_one() | Peter Xu
Count how many bits set in the bitmap. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
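Counting set bits in a word-based bitmap mostly comes down to handling the final, partially used word. A minimal sketch follows; `count_set_bits()` is a hypothetical name and a portable bit-clearing loop stands in for whatever popcount primitive the real helper uses.

```c
#include <limits.h>
#include <stddef.h>

#define EXAMPLE_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Count set bits in one word by clearing the lowest set bit each pass. */
static unsigned int popcount_word(unsigned long w)
{
    unsigned int n = 0;
    while (w) {
        w &= w - 1;
        n++;
    }
    return n;
}

/* Hypothetical helper: count set bits among the first nbits of map. */
size_t count_set_bits(const unsigned long *map, size_t nbits)
{
    size_t full = nbits / EXAMPLE_BITS_PER_WORD;
    size_t rest = nbits % EXAMPLE_BITS_PER_WORD;
    size_t total = 0;

    for (size_t i = 0; i < full; i++) {
        total += popcount_word(map[i]);
    }
    if (rest) {
        /* Mask off bits beyond nbits in the final partial word. */
        total += popcount_word(map[full] & ((1UL << rest) - 1));
    }
    return total;
}
```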
2017-09-22 | bitmap: remove BITOP_WORD() | Peter Xu
We have BIT_WORD(). It's the same. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-09-19 | Convert remaining single line fprintf() to warn_report() | Alistair Francis
Convert any remaining uses of fprintf(stderr, "warning:"... to use warn_report() instead. This helps standardise on a single method of printing warnings to the user. All of the warnings were changed using this command: find ./* -type f -exec sed -i 's|fprintf(.*".*warning[,:] |warn_report("|Ig' {} + The #include lines and changes to the test Makefile were manually updated to allow the code to compile. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Message-Id: <2c94ac3bb116cc6b8ebbcd66a254920a69665515.1503077821.git.alistair.francis@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 | Convert multi-line fprintf() to warn_report() | Alistair Francis
Convert all the multi-line uses of fprintf(stderr, "warning:"..."\n"... to use warn_report() instead. This helps standardise on a single method of printing warnings to the user. All of the warnings were changed using these commands:
  find ./* -type f -exec sed -i 'N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' {} +
  find ./* -type f -exec sed -i 'N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' {} +
  find ./* -type f -exec sed -i 'N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' {} +
  find ./* -type f -exec sed -i 'N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' {} +
  find ./* -type f -exec sed -i 'N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' {} +
  find ./* -type f -exec sed -i 'N;N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' {} +
  find ./* -type f -exec sed -i 'N;N;N;N;N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' {} +
Indentation fixed up manually afterwards. Some of the lines were manually edited to reduce the line length to below 80 characters. Some of the lines with newlines in the middle of the string were also manually edited to avoid checkpatch errors. The #include lines were manually updated to allow the code to compile. Several of the warning messages can be improved after this patch; to keep this patch mechanical, this has been moved into a later patch. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Yongbok Kim <yongbok.kim@imgtec.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Alexander Graf <agraf@suse.de> Cc: Jason Wang <jasowang@redhat.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <5def63849ca8f551630c6f2b45bcb1c482f765a6.1505158760.git.alistair.francis@xilinx.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 | scsi: move non-emulation specific code to scsi/ | Paolo Bonzini
util/scsi.c includes some SCSI code that is shared by block/iscsi.c and hw/scsi, but the introduction of the persistent reservation helper will add many more instances of this. There is also include/block/scsi.h, which actually is not part of the core block layer. The persistent reservation manager will also need a home. A scsi/ directory provides one for both the aforementioned shared code and the PR manager code. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 | scsi: Introduce scsi_sense_buf_to_errno | Fam Zheng
This recognizes the "fixed" and "descriptor" format sense data, extracts the sense key/asc/ascq fields then converts them to an errno. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170821141008.19383-4-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
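The two sense data layouts differ in where the key, ASC and ASCQ fields live. The sketch below extracts them using the standard SCSI byte offsets (response codes 0x70/0x71 for fixed format, 0x72/0x73 for descriptor format). The function names and the errno mapping are illustrative assumptions, not the table used by scsi_sense_buf_to_errno.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three fields of interest. */
typedef struct {
    uint8_t key;
    uint8_t asc;
    uint8_t ascq;
} SenseFields;

/* Returns 0 on success, -1 if the format is unknown or the buffer is
 * too short to contain the fields. */
int parse_sense(const uint8_t *buf, size_t len, SenseFields *out)
{
    if (len < 1) {
        return -1;
    }
    switch (buf[0] & 0x7f) {
    case 0x70:
    case 0x71:                      /* fixed format */
        if (len < 14) {
            return -1;
        }
        out->key = buf[2] & 0x0f;
        out->asc = buf[12];
        out->ascq = buf[13];
        return 0;
    case 0x72:
    case 0x73:                      /* descriptor format */
        if (len < 4) {
            return -1;
        }
        out->key = buf[1] & 0x0f;
        out->asc = buf[2];
        out->ascq = buf[3];
        return 0;
    default:
        return -1;
    }
}

/* Illustrative mapping only; the real mapping is more detailed. */
int sense_to_errno_example(const SenseFields *s)
{
    switch (s->key) {
    case 0x00:                      /* NO SENSE */
    case 0x01:                      /* RECOVERED ERROR */
        return 0;
    case 0x02:                      /* NOT READY */
        return EBUSY;
    case 0x05:                      /* ILLEGAL REQUEST */
        return EINVAL;
    case 0x07:                      /* DATA PROTECT */
        return EACCES;
    default:
        return EIO;
    }
}
```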
2017-09-19 | scsi: Improve scsi_sense_to_errno | Fam Zheng
Tweak the errno mapping to return more accurate/appropriate values. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170821141008.19383-3-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 | scsi: Refactor scsi sense interpreting code | Fam Zheng
So that it can be reused outside of iscsi.c. Also update MAINTAINERS to include the new files in SCSI section. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170821141008.19383-2-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-07 | configure: Drop AIX host support | Peter Maydell
Nobody has mentioned AIX host support on the mailing list for years, and we have no test systems for it so it is most likely broken. We've advertised in configure for two releases now that we plan to drop support for this host OS, and have had no complaints. Drop the AIX host support code. We can also drop the now-unused AIX version of sys_cache_info(). Note that the _CALL_AIX define used in the PPC tcg backend is also used for Linux PPC64, and so that code should not be removed. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1504545540-8002-1-git-send-email-peter.maydell@linaro.org
2017-09-07 | Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging | Peter Maydell
Block layer patches # gpg: Signature made Wed 06 Sep 2017 14:44:41 BST # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: qcow2: move qcow2_store_persistent_dirty_bitmaps() before cache flushing qemu-iotests: add 184 for throttle filter driver block: add throttle block filter driver block: convert ThrottleGroup to object with QOM block: tidy ThrottleGroupMember initializations block: add aio_context field in ThrottleGroupMember block: move ThrottleGroup membership to ThrottleGroupMember block: document semantics of bdrv_co_preadv|pwritev qcow: Check failure of bdrv_getlength() and bdrv_truncate() qcow: Change signature of get_cluster_offset() block: add default implementations for bdrv_co_get_block_status() block: remove bdrv_truncate callback in blkdebug block: remove unused bdrv_media_changed block: pass bdrv_* methods to bs->file by default in block filters Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-09-05 | block: convert ThrottleGroup to object with QOM | Manos Pitsidianakis
ThrottleGroup is converted to an object. This will allow the future throttle block filter drive easy creation and configuration of throttle groups in QMP and cli. A new QAPI struct, ThrottleLimits, is introduced to provide a shared struct for all throttle configuration needs in QMP. ThrottleGroups can be created via CLI as -object throttle-group,id=foo,x-iops-total=100,x-.. where x-* are individual limit properties. Since we can't add non-scalar properties in -object this interface must be used instead. However, setting these properties must be disabled after initialization because certain combinations of limits are forbidden and thus configuration changes should be done in one transaction. The individual properties will go away when support for non-scalar values in CLI is implemented and thus are marked as experimental. ThrottleGroup also has a `limits` property that uses the ThrottleLimits struct. It can be used to create ThrottleGroups or set the configuration in existing groups as follows: { "execute": "object-add", "arguments": { "qom-type": "throttle-group", "id": "foo", "props" : { "limits": { "iops-total": 100 } } } } { "execute" : "qom-set", "arguments" : { "path" : "foo", "property" : "limits", "value" : { "iops-total" : 99 } } } This also means a group's configuration can be fetched with qom-get. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-09-05 | Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-20170905-2' into staging | Peter Maydell
Merge QEMU I/O 2017/09/05 v2 # gpg: Signature made Tue 05 Sep 2017 13:22:36 BST # gpg: using RSA key 0xBE86EBB415104FDF # gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" # gpg: aka "Daniel P. Berrange <berrange@redhat.com>" # Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF * remotes/berrange/tags/pull-qio-20170905-2: io: fix check for handshake completion in TLS test io: add new qio_channel_{readv, writev, read, write}_all functions io: fix typo in docs comment for qio_channel_read util: remove the obsolete non-blocking connect io: fix temp directory used by test-io-channel-tls test Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-09-05 | util: remove the obsolete non-blocking connect | Cao jin
The non-blocking connect mechanism is obsolete, and it doesn't work well for inet connections, because it calls getaddrinfo first and getaddrinfo blocks on DNS lookups. Since commits e65c67e4 & d984464e, the non-blocking connect of migration goes through QIOChannel in a different manner (using a thread), and nobody uses this old non-blocking connect anymore. Any newly written code which needs a non-blocking connect should use the QIOChannel code, so we can drop NonBlockingConnectHandler as a concept entirely. Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-09-04 | qapi: Generate FOO_str() macro for QAPI enum FOO | Markus Armbruster
The next commit will put it to use. May look pointless now, but we're going to change the FOO_lookup's type, and then it'll help. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1503564371-26090-13-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-08-30 | oslib-posix: Print errors before aborting on qemu_alloc_stack() | Eduardo Habkost
If QEMU is running on a system that's out of memory and mmap() fails, QEMU aborts with no error message at all, making it hard to debug the reason for the failure. Add perror() calls that will print error information before aborting. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20170829212053.6003-1-ehabkost@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-29 | throttle: Make burst_length 64bit and add range checks | Alberto Garcia
LeakyBucket.burst_length is defined as an unsigned integer but the code never checks for overflows and it only makes sure that the value is not 0. In practice this means that the user can set something like throttling.iops-total-max-length=4294967300 despite being larger than UINT_MAX and the final value after casting to unsigned int will be 4. This patch changes the data type to uint64_t. This does not increase the storage size of LeakyBucket, and allows us to assign the value directly from qemu_opt_get_number() or BlockIOThrottle and then do the checks directly in throttle_is_valid(). The value of burst_length does not have a specific upper limit, but since the bucket size is defined by max * burst_length we have to prevent overflows. Instead of going for UINT64_MAX or something similar this patch reuses THROTTLE_VALUE_MAX, which allows I/O bursts of 1 GiB/s for 10 days in a row. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 1b2e3049803f71cafb2e1fa1be4fb47147a0d398.1503580370.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
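The truncation described above is easy to reproduce: 4294967300 is 2^32 + 4, so storing it in a 32-bit unsigned int leaves 4. The sketch below shows that, along with an overflow-safe check of max * burst_length against an upper bound. `EXAMPLE_VALUE_MAX` is a placeholder chosen for this example, not the value of THROTTLE_VALUE_MAX, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder upper bound, assumed for illustration only. */
#define EXAMPLE_VALUE_MAX UINT64_C(1000000000000000)

/* What happened before the patch: the value was narrowed silently.
 * Assumes unsigned int is 32 bits wide. */
unsigned int old_style_store(uint64_t requested)
{
    return (unsigned int)requested;
}

/* Range check on the 64-bit value. Dividing the bound by max avoids
 * overflowing the product max * burst_length. */
bool burst_config_is_valid(uint64_t max, uint64_t burst_length)
{
    if (burst_length == 0) {
        return false;
    }
    if (max != 0 && burst_length > EXAMPLE_VALUE_MAX / max) {
        return false;
    }
    return true;
}
```

Keeping the field at 64 bits lets the validity check see the number the user actually typed rather than its truncated remainder.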
2017-08-29 | throttle: Make LeakyBucket.avg and LeakyBucket.max integer types | Alberto Garcia
Both the throttling limits set with the throttling.iops-* and throttling.bps-* options and their QMP equivalents defined in the BlockIOThrottle struct are integer values. Those limits are also reported in the BlockDeviceInfo struct and they are integers there as well. Therefore there's no reason to store them internally as double and do the conversion everytime we're setting or querying them, so this patch uses uint64_t for those types. Let's also use an unsigned type because we don't allow negative values anyway. LeakyBucket.level and LeakyBucket.burst_level do however remain double because their value changes depending on the fraction of time elapsed since the previous I/O operation. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: f29b840422767b5be2c41c2dfdbbbf6c5f8fedf8.1503580370.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-29 | throttle: Remove throttle_fix_bucket() / throttle_unfix_bucket() | Alberto Garcia
The throttling code can change internally the value of bkt->max if it hasn't been set by the user. The problem with this is that if we want to retrieve the original value we have to undo this change first. This is ugly and unnecessary: this patch removes the throttle_fix_bucket() and throttle_unfix_bucket() functions completely and moves the logic to throttle_compute_wait(). Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Message-id: 5b0b9e1ac6eb208d709eddc7b09e7669a523bff3.1503580370.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-29 | throttle: Make throttle_is_valid() a bit less verbose | Alberto Garcia
Use a pointer to the bucket instead of repeating cfg->buckets[i] all the time. This makes the code more concise and will help us expand the checks later and save a few line breaks. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 763ffc40a26b17d54cf93f5a999e4656049fcf0c.1503580370.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-29 | throttle: Update the throttle_fix_bucket() documentation | Alberto Garcia
The way the throttling algorithm works is that requests start being throttled once the bucket level exceeds the burst limit. When we get there the bucket leaks at the level set by the user (bkt->avg), and that leak rate is what prevents guest I/O from exceeding the desired limit. If we don't allow bursts (i.e. bkt->max == 0) then we can start throttling requests immediately. The problem with keeping the threshold at 0 is that it only allows one request at a time, and as soon as there's a bit of I/O from the guest every other request will be throttled and performance will suffer considerably. That can even make the guest unable to reach the throttle limit if that limit is high enough, and that happens regardless of the block scheduler used by the guest. Increasing that threshold gives flexibility to the guest, allowing it to perform short bursts of I/O before being throttled. Increasing the threshold too much does not make a difference in the long run (because it's the leak rate what defines the actual throughput) but it does allow the guest to perform longer initial bursts and exceed the throttle limit for a short while. A burst value of bkt->avg / 10 allows the guest to perform 100ms' worth of I/O at the target rate without being throttled. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 31aae6645f0d1fbf3860fb2b528b757236f0c0a7.1503580370.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
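The behaviour described in this message can be modelled with a deliberately simplified leaky bucket. The `Bucket` struct and function names below are hypothetical, burst_length is ignored, and the avg / 10 default threshold is taken from the text above; none of this is the QEMU implementation.

```c
#include <stdint.h>

/* Simplified model: avg is the leak rate in units per second, max is an
 * optional user-set burst threshold, level is the current fill. */
typedef struct {
    uint64_t avg;
    uint64_t max;
    double level;
} Bucket;

/* Leak the bucket for delta_ns nanoseconds of elapsed time. */
void bucket_leak(Bucket *b, int64_t delta_ns)
{
    double leak = (double)b->avg * (double)delta_ns / 1e9;
    b->level = b->level > leak ? b->level - leak : 0.0;
}

/* Threshold above which requests are throttled: the user's max if set,
 * otherwise avg / 10, i.e. 100ms worth of I/O at the target rate. */
double bucket_threshold(const Bucket *b)
{
    return b->max ? (double)b->max : (double)b->avg / 10.0;
}

/* Nanoseconds to wait until the level leaks back down to the threshold. */
int64_t bucket_wait_ns(const Bucket *b)
{
    if (b->avg == 0) {
        return 0;                   /* no limit configured */
    }
    double extra = b->level - bucket_threshold(b);
    if (extra <= 0) {
        return 0;
    }
    return (int64_t)(extra * 1e9 / (double)b->avg);
}
```

With avg = 100 and no max, the threshold is 10 units: a level of 20 means a 100 ms wait, and after 100 ms of leaking the level is back at 10 and requests proceed.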
2017-08-11 | osdep: Add runtime OFD lock detection | Fam Zheng
Build time check of OFD lock is not sufficient and can cause image open errors when the runtime environment doesn't support it. Add a helper function to probe it at runtime, additionally. Also provide a qemu_has_ofd_lock() for callers to check the status. Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
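A runtime probe of this kind can be sketched as below: even when the headers define `F_OFD_GETLK` at build time, the running kernel may reject it, so the command is tried once against a harmless file. `probe_ofd_lock()` and the use of /dev/null are assumptions for illustration, and the sketch targets Linux with glibc.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical probe: returns true if the running kernel accepts
 * open file description (OFD) lock commands. */
bool probe_ofd_lock(void)
{
#ifdef F_OFD_GETLK
    int fd = open("/dev/null", O_RDWR);
    if (fd < 0) {
        return false;
    }
    /* l_pid must be zero for OFD commands; the designated initializer
     * zeroes every field not listed. */
    struct flock fl = {
        .l_type = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0,
    };
    int ret = fcntl(fd, F_OFD_GETLK, &fl);
    close(fd);
    return ret == 0;
#else
    /* The headers do not know about OFD locks at all. */
    return false;
#endif
}
```

A caller would typically cache the result and fall back to classic POSIX locks when the probe returns false.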
2017-08-08 | Revert "rcu: do not create thread in pthread_atfork callback" | Paolo Bonzini
This reverts commit a59629fcc6f603e19b516dc08f75334e5c480bd0. This is not needed anymore because the IOThread mutex is not "magic" anymore (need not kick the CPU thread) and also because fork callbacks are only enabled at the very beginning of QEMU's execution. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-08-08 | rcu: completely disable pthread_atfork callbacks as soon as possible | Paolo Bonzini
Because of -daemonize, system mode QEMU sometimes needs to fork() and keep RCU enabled in the child. However, there is a possible deadlock with synchronize_rcu: - the CPU thread is inside a RCU critical section and wants to take the BQL in order to do MMIO - the monitor thread, which is owning the BQL, calls rcu_init_lock which tries to take the rcu_sync_lock - the call_rcu thread has taken rcu_sync_lock in synchronize_rcu, but synchronize_rcu needs the CPU thread to end the critical section before returning. This cannot happen for user-mode emulation, because it does not have a BQL. To fix it, assume that system mode QEMU only forks in preparation for exec (except when daemonizing) and disable pthread_atfork as soon as the double fork has happened. Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-31 | docs: fix broken paths to docs/devel/tracing.txt | Philippe Mathieu-Daudé
With the move of some docs/ to docs/devel/ on ac06724a71, no references were updated. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-07-24 | Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-07-24' into staging | Peter Maydell
Error reporting patches for 2017-07-24 # gpg: Signature made Mon 24 Jul 2017 13:17:49 BST # gpg: using RSA key 0x3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-error-2017-07-24: error: Revert unwanted change of warning messages Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-24 | error: Revert unwanted change of warning messages | Markus Armbruster
Commit 97f4030 changed warning messages from timestamp-if-enabled progname ":" location "warning: " message to "warning: " timestamp-if-enabled progname ":" location message This regressed qemu-iotests 051. Put "warning: " right back where it was, along with "info: ". Reported-by: Kevin Wolf <kwolf@redhat.com> Cc: Alistair Francis <alistair.francis@xilinx.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1500449614-16811-1-git-send-email-armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@xilinx.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-07-24 | util: Introduce include/qemu/cpuid.h | Richard Henderson
Clang 3.9 passes the CONFIG_AVX2_OPT configure test. However, the supplied <cpuid.h> does not contain the bit_AVX2 define that we use when detecting whether the routine can be enabled. Introduce a qemu-specific header that uses the compiler's definition of __cpuid et al, but supplies any missing bit_* definitions needed. This avoids introducing any extra ifdefs to util/bufferiszero.c, and allows quite a few to be removed from tcg/i386/tcg-target.inc.c. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20170719044018.18063-1-rth@twiddle.net Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-21 | util/oslib-posix.c: Avoid warning on NetBSD | Peter Maydell
On NetBSD the compiler warns: util/oslib-posix.c: In function 'sigaction_invoke': util/oslib-posix.c:589:5: warning: missing braces around initializer [-Wmissing-braces] siginfo_t si = { 0 }; ^ util/oslib-posix.c:589:5: warning: (near initialization for 'si.si_pad') [-Wmissing-braces] because on this platform siginfo_t is defined as typedef union siginfo { char si_pad[128]; /* Total size; for future expansion */ struct _ksiginfo _info; } siginfo_t; Avoid this warning by initializing the struct with {} instead; this is a GCC extension but we use it all over the codebase already. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1500568341-8389-1-git-send-email-peter.maydell@linaro.org
2017-07-19 | util/cacheinfo: Add missing include for ppc linux | Philippe Mathieu-Daudé
This include was forgotten when splitting cacheinfo.c out of tcg/ppc/tcg-target.inc.c (see commit b255b2c8). For a Centos7 host, the include path <signal.h> <bits/sigcontext.h> <asm/sigcontext.h> <asm/elf.h> <asm/auxvec.h> implicitly pulls in the desired AT_* defines. Not so for Debian Jessie. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20170711015524.22936-1-f4bug@amsat.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-07-18 | block: remove timer canceling in throttle_config() | Manos Pitsidianakis
throttle_config() cancels the timers of the calling BlockBackend. This doesn't make sense because other BlockBackends in the group remain untouched. There's no need to cancel the timers in the one specific BlockBackend so let's not do that. Throttled requests will run as scheduled and future requests will follow the new configuration. This also allows a throttle group's configuration to be changed even when it has no members. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-18block: add clock_type field to ThrottleGroupManos Pitsidianakis
Clock type in throttling is currently inferred by the ThrottleTimer's clock type even though it is a per-ThrottleGroup property; it doesn't make sense to have different clock types in the same group. Moving this to a field in ThrottleGroup can simplify some of the throttle functions. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-18Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into stagingPeter Maydell
# gpg: Signature made Mon 17 Jul 2017 16:40:18 BST
# gpg: using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  block: fix shadowed variable in bdrv_co_pdiscard
  util/aio-win32: Only select on what we are actually waiting for

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-17util/aio-win32: Only select on what we are actually waiting forAlistair Francis
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 9307b70e9876c4e9e3c4478524a32a23a3d5dd05.1499368180.git.alistair.francis@xilinx.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-07-17coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgradePaolo Bonzini
These functions are more efficient in the presence of contention. qemu_co_rwlock_downgrade also guarantees not to block, which may be useful in some algorithms too. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-3-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
2017-07-14sockets: ensure we don't accept IPv4 clients when IPv4 is disabledDaniel P. Berrange
Currently if you disable listening on IPv4 addresses, via the CLI flag ipv4=off, we still mistakenly accept IPv4 clients via the IPv6 listener socket due to IPV6_V6ONLY flag being unset. We must ensure IPV6_V6ONLY is always set if ipv4=off.

This fixes the following scenarios:

  -incoming tcp::9000,ipv6=on
  -incoming tcp:[::]:9000,ipv6=on
  -chardev socket,id=cdev0,host=,port=9000,server,nowait,ipv4=off
  -chardev socket,id=cdev0,host=,port=9000,server,nowait,ipv6=on
  -chardev socket,id=cdev0,host=::,port=9000,server,nowait,ipv4=off
  -chardev socket,id=cdev0,host=::,port=9000,server,nowait,ipv6=on

which all mistakenly accepted IPv4 clients.

Acked-by: Gerd Hoffmann <kraxel@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
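For readers unfamiliar with the flag, the following is a minimal sketch of how IPV6_V6ONLY is set on a socket using the standard sockets API. It is not the code from this commit; `open_ipv6_socket` and `get_v6only` are hypothetical helper names. With the option set to 1, an IPv6 socket bound to "::" no longer accepts IPv4 clients through IPv4-mapped addresses.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an IPv6 TCP socket and set IPV6_V6ONLY
 * to v6only (0 or 1). Returns the descriptor, or -1 on failure
 * (for example when the host has no IPv6 support).
 */
int open_ipv6_socket(int v6only)
{
    int fd = socket(AF_INET6, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY,
                   &v6only, sizeof(v6only)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: read back IPV6_V6ONLY, or -1 on error. */
int get_v6only(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);
    if (getsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &val, &len) < 0) {
        return -1;
    }
    return val;
}
```

The option has to be set before bind(); the system default when it is left untouched varies between platforms, which is presumably why the commit sets it explicitly.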
2017-07-14sockets: don't block IPv4 clients when listening on "::"Daniel P. Berrange
When inet_parse() parses the hostname, it is forcing the has_ipv6 && ipv6 flags if the address contains a ":". This means that if the user had set the ipv4=on flag, to try to restrict the listener to just ipv4, an error would not have been raised. eg

  -incoming tcp:[::]:9000,ipv4

should have raised an error because listening for IPv4 on "::" is a nonsensical combination. With this removed, we now call getaddrinfo() on "::" passing PF_INET and so getaddrinfo reports an error about the hostname being incompatible with the requested protocol:

  qemu-system-x86_64: -incoming tcp:[::]:9000,ipv4: address resolution failed for :::9000: Address family for hostname not supported

Likewise it is explicitly setting the has_ipv4 & ipv4 flags when the address contains only digits + '.'. This has no ill-effect, but also has no benefit, so is removed.

Acked-by: Gerd Hoffmann <kraxel@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-07-14sockets: ensure we can bind to both ipv4 & ipv6 separatelyDaniel P. Berrange
When binding to an IPv6 socket we currently force the IPV6_V6ONLY flag to off. This means that the IPv6 socket will accept both IPv4 & IPv6 sockets when QEMU is launched with something like -vnc :::1 While this is good for that case, it is bad for other cases. For example if an empty hostname is given, getaddrinfo resolves it to 2 addresses 0.0.0.0 and ::, in that order. We will thus bind to 0.0.0.0 first, and then fail to bind to :: on the same port. The same problem can happen if any other hostname lookup causes the IPv4 address to be reported before the IPv6 address. When we get an IPv6 bind failure, we should re-try the same port, but with IPV6_V6ONLY turned on again, to avoid clash with any IPv4 listener. This ensures that -vnc :1 will bind successfully to both 0.0.0.0 and ::, and also avoid -vnc :1,to=2 from mistakenly using a 2nd port for the :: listener. This is a regression due to commit 396f935 "ui: add ability to specify multiple VNC listen addresses". Acked-by: Gerd Hoffmann <kraxel@gmail.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
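The retry described in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: `bind_ipv6_any` and its `used_v6only` out-parameter are hypothetical, and the sketch only handles the wildcard address "::". It first tries a dual-stack bind (IPV6_V6ONLY off) and, if that fails with EADDRINUSE, retries the same port with IPV6_V6ONLY on so it cannot clash with an existing IPv4 listener.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind an IPv6 TCP socket to [::]:port.
 * On success returns the descriptor and stores in *used_v6only the
 * IPV6_V6ONLY value that was in effect for the successful bind.
 * Returns -1 on failure.
 */
int bind_ipv6_any(uint16_t port, int *used_v6only)
{
    struct sockaddr_in6 addr;
    int v6only = 0;                      /* first attempt: dual-stack */
    int fd = socket(AF_INET6, SOCK_STREAM, 0);

    if (fd < 0) {
        return -1;
    }
    memset(&addr, 0, sizeof(addr));
    addr.sin6_family = AF_INET6;
    addr.sin6_port = htons(port);
    addr.sin6_addr = in6addr_any;

    for (;;) {
        if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY,
                       &v6only, sizeof(v6only)) < 0) {
            break;
        }
        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
            *used_v6only = v6only;
            return fd;
        }
        if (errno != EADDRINUSE || v6only) {
            break;                       /* unrelated error, or retry failed */
        }
        v6only = 1;                      /* retry same port, IPv6 only */
    }
    close(fd);
    return -1;
}
```

Retrying on the same port, rather than moving to the next one, is what keeps a port range such as `-vnc :1,to=2` from using a second port for the "::" listener.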