Age | Commit message (Collapse) | Author |
|
The INDEX_op_call case has just been obsoleted; the mov and movi
cases have not been reachable for years. Attempt to document this
both in each tcg_out_op switch, and via TCG_OPF_NOT_PRESENT.
Because of the TCG_OPF_NOT_PRESENT change, this must be done for
all targets in a single commit.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Using a 16-byte aligned structure achieves best results, both for code
cleanliness and compiled code size. However, this means that we can't
use the trick of encoding the slot number into the low 2 bits.
Thankfully, we only ever use slot2, so make that explicit in the names
of the relocation functions, and drop the code for other slots.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Pull tcg 2014-04-22
# gpg: Signature made Tue 22 Apr 2014 22:00:04 BST using RSA key ID 4DD0279B
# gpg: Can't check signature: public key not found
* remotes/rth/tags/tcg-next-20140422:
tcg: Use HOST_WORDS_BIGENDIAN
tcg: Fix fallback from muls2_i64 to mulu2_i64
tcg: Use tcg_gen_mulu2_i32 in tcg_gen_muls2_i32
tcg: Relax requirement for mulu2_i32 on 32-bit hosts
tcg-s390: Remove W constraint
tcg-sparc: Use the type parameter to tcg_target_const_match
tcg-ppc64: Use the type parameter to tcg_target_const_match
tcg-aarch64: Remove w constraint
tcg: Add TCGType parameter to tcg_target_const_match
tcg: Fix out of range shift in deposit optimizations
tci: Mask shift counts to avoid undefined behavior
tcg: Mask shift quantities while folding
tcg: Use "unspecified behavior" for shifts
tcg: Fix warning (1 bit signed bitfield entry) and replace int by bool
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
|
Most 64-bit targets need to be able to ignore the high bits
of a TCG_TYPE_I32 value.
Suggested-by: Stuart Brady <sdb@zubnet.me.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Still inline, but updated to the new routines. Always use the LE
helpers, reusing the bswap between the fast and slot paths.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
The only differences were in the bswap insns emitted.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Saving at least two cycles per store, and cleaning up the code.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
This sequencing requires 5 stop bits instead of 6, and has room left
over to pre-load the tlb addend, and bswap data prior to being stored.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Saves one bundle for the common case of exit_tb 0.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Being able to "extend" from 64-bits (with a mov) simplifies
a few places where the conditional breaks the train of thought.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
We can and/or/xor/andcm small constants, saving one cycle.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
We can subtract from more small constants that just 0 with one insn,
and we can add the negative for most small constants.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Avoids a wasted cycle loading up small constants.
Simplify the code assuming the tcg optimizer is going to work
and don't expect the first operand of the add to be constant.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
When performing an operation with two input registers, we'd leave
the stop bit (and thus an extra cycle) that's only needed when one
or the other input is a constant.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Since the move away from the global areg0, we're no longer globally
reserving areg0. Which means our use of R7 clobbers a call-saved
register. Shift areg0 into the windowed registers. Indeed, choose
the incoming parameter register that it comes to us by.
This requires moving the register holding the return address elsewhere.
Choose R33 for tidiness.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
There was a misconception that a stop bit is required between a compare
and the branch that uses the predicate set by the compare. This lead to
the usage of an extra bundle in which to perform the compare. The extra
bundle left room for constants to be loaded for use with the compare insn.
If we pack the compare and the branch together in the same bundle, then
there's no longer any room for non-zero constants. At which point we
can eliminate half the function by not handling them.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Using only indirect calls results in 3 bundles (one to load the
descriptor address), and 4 stop bits. By looking through the
descriptor to the constants, we can perform the call with 2
bundles and only 1 stop bit.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
There's no need to go through the full opcode-to-insn function call
to generate nops. This makes the source a bit more readable.
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
This is a no-op backend data implementation, for those targets that
are not currently using the load/store optimization path.
This is prepatory to always requiring these functions in all backends.
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
The _cmmu helpers can be moved to exec-all.h. The helpers that are
used from TCG will shortly need access to tcg_target_long so move
their declarations into tcg.h.
This requires minor include adjustments to all TCG backends.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
|
|
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Note that in the general reg=reg,reg case we're restricted
to 16-bit insertions. This makes it easy to allow "any"
constant as input, as post-truncation it will fit into the
constant load insn for which we have room in the bundle.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
It is possible to slightly optimize the TLB access code, by replacing
the movi + and instructions by a deposit instruction.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
Remove suboptimal register shifting in qemu_ld/st ops, introduced at the
CONFIG_TCG_PASS_AREG0 time.
As mem_idx is now loaded in register R58/R59 for the slow path, we have
to make sure to do it last, to not add additional register constraints.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
Implement movcond_i32/64 on ia64 hosts. It is not possible to have
immediate compare arguments without adding a new bundle, but it is
possible to have 22-bit immediate value arguments.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
Use stack instead of temp_buf array in CPUState for TCG temps.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
The TCG jmp operation doesn't really make sense in the QEMU context, it
is unused, it is not implemented by some targets, and it is wrongly
implemented by some others.
This patch simply removes it.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Blue Swirl <blauwirbel@gmail.com>
Acked-by: Stefan Weil<sw@weilnetz.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
The TCG targets no longer need individual implementations.
Since commit 6a18ae2d2947532d5c26439548afa0481c4529f9,
'flags' is no longer used in tcg_target_get_call_iarg_regs_count.
The remaining tcg_target_get_call_iarg_regs_count is trivial and only
called once. Therefore the patch eliminates it completely.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
Now that CONFIG_TCG_PASS_AREG0 is enabled for all targets,
remove dead code and support for !CONFIG_TCG_PASS_AREG0 case.
Remove dyngen-exec.h and all references to it. Although included by
hw/spapr_hcall.c, it does not seem to use it.
Remove unused HELPER_CFLAGS.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
|
|
Store slow path has been broken in e141ab52d:
- the arguments are shifted before the last one (mem_index) is written.
- the shift is done for both slow and fast paths.
Fix that. Also optimize a bit by bundling the move together. This still
can be optimized, but it's better to wait for a decision to be taken on
the arguments order.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
Prologue and epilogue code has been broken in cea5f9a28.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
Optionally, make memory access helpers take a parameter for CPUState
instead of relying on global env.
On most targets, perform simple moves to reorder registers. On i386,
switch from regparm(3) calling convention to standard stack-based
version.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Scripted conversion:
for file in *.[hc] hw/*.[hc] hw/kvm/*.[hc] linux-user/*.[hc] linux-user/m68k/*.[hc] bsd-user/*.[hc] darwin-user/*.[hc] tcg/*/*.[hc] target-*/cpu.h; do
sed -i "s/CPUState/CPUArchState/g" $file
done
All occurrences of CPUArchState are expected to be replaced by QOM CPUState,
once all targets are QOM'ified and common fields have been extracted.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
|
|
Including tcg_out_ld, tcg_out_st, tcg_out_mov, tcg_out_movi.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
|
|
Remove the unused function tcg_out_addi() from the ia64 TCG backend;
this brings it into line with other backends.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
The second register is never used for ia64 hosts.
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Delegate TCG temp_buf setup to targets, so that they can use a stack
frame later instead.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Make functions take a parameter for CPUState instead of relying
on global env. Pass CPUState pointer to TCG prologue, which moves
it to AREG0.
Thanks to Peter Maydell and Laurent Desnogues for the ARM prologue
change.
Revert the hacks to avoid AREG0 use on Sparc hosts.
Move cpu_has_work() and cpu_pc_from_tb() from exec.h to cpu.h.
Compile the file without HELPER_CFLAGS.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
|
Add a comment about cache coherency and retranslation, so that people
developping new targets based on existing ones are warned of the issue.
Acked-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|
|
Spotted by Richard Henderson.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
|