can_duplicate_and_interleave_p checks whether we know a way of
building a particular VLA SLP invariant. g:60034ecf25597bd515f
skipped that test for booleans, to support MASK_LEN_GATHER_LOAD
calls with a dummy all-ones mask. But there's nothing fundamentally
different about VLA masks vs VLA data vectors. If we have a VLA mask
that isn't all-ones, we need some way of loading it. This ultimately
led to the ICE in the PR.
This patch fixes it by applying can_duplicate_and_interleave_p
to masks, while also adding a special path for uniform vectors
(of all kinds) to support the MASK_LEN_GATHER_LOAD usage. This
also fixes an XFAIL in pr36648.cc for SVE.
The patch is mostly Richard's. My only changes were to skip
redundant conversions and to use gimple_build_vector_from_val
for all eligible vectors.
2023-11-27 Richard Biener <rguenther@suse.de>
Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/112661
* tree-vect-slp.cc (vect_get_and_check_slp_defs): Defer duplicate-and-
interleave test to...
(vect_build_slp_tree_2): ...here, once we have all the operands.
Skip the test for uniform vectors.
(vect_create_constant_vectors): Detect uniform vectors. Avoid
redundant conversions in that case. Use gimple_build_vector_from_val
to build the vector.
gcc/testsuite/
* g++.dg/vect/pr36648.cc: Remove XFAIL for VLA load-lanes.
excl_hash_traits can be defined more simply by reusing existing traits.
gcc/
* attribs.cc (excl_hash_traits): Delete.
(test_attribute_exclusions): Use pair_hash and nofree_string_hash
instead.
We don't support it and it doesn't happen without vector extensions, so
just remove the unhandled case.
Fixes gcc.dg/pr78575.c failure.
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_vectorize_vec_perm_const): Disallow TImode.
One builtin type slipped through the cracks of the last commits.
gcc/ChangeLog:
* config/s390/s390-builtin-types.def (BT_FN_UV8HI_UV8HI_UINT):
Add missing builtin type.
Commit 248df13b96 restricted the rotate
count to immediates. Although the documentation of vec_rli (Vector
Element Rotate Left Immediate) can be read as if it were restricted to
immediates, this is not the case. Thus, revert this commit.
In order to finally allow register operands, the rotate count must be of
type unsigned char since the expander expects it to be of mode QI. The
previously used type unsigned integer worked out for immediates since
those are of VOID mode anyway.
gcc/ChangeLog:
* config/s390/s390-builtin-types.def: Remove types.
* config/s390/s390-builtins.def (O_U64): Remove 64-bit literal support.
Don't restrict s390_vec_rli and s390_verll[bhfg] to immediates.
* config/s390/s390.cc (s390_const_operand_ok): Remove 64-bit
literal support.
We lack a match.pd pattern recognizing ptr + o ==/!= ptr + o'.
The following extends handling we have for integral types to
pointers.
PR tree-optimization/112706
* match.pd (ptr + o ==/!=/- ptr + o'): New patterns.
* gcc.dg/tree-ssa/pr112706.c: New testcase.
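As a rough illustration of the kind of source the new patterns are aimed at,
here is a minimal sketch. The function names are hypothetical and are not
taken from pr112706.c. Both operands share the same base pointer, so the
result depends only on the two offsets:

```c
#include <stddef.h>

/* Hypothetical example: p + a == p + b has the same value as a == b,
   because both sides start from the same base pointer.  */
int same_element(const char *p, ptrdiff_t a, ptrdiff_t b)
{
    return p + a == p + b;
}

/* Likewise, the difference of two such pointers is just b - a.  */
ptrdiff_t element_distance(const char *p, ptrdiff_t a, ptrdiff_t b)
{
    return (p + b) - (p + a);
}
```

The offsets must stay within the pointed-to object for the pointer
arithmetic to be well defined.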
Currently for an unsigned 16-bit comparison between memory and an
immediate where the high bit is set, a clc is emitted. This is because
the constant is created for mode HI and therefore sign extended. This
means constraint D does not hold anymore. Since the mode already
restricts the immediate to 16 bit, it is enough to make use of
constraint n and chop off the high bits in the output template.
gcc/ChangeLog:
* config/s390/s390.md (*cmphi_ccu): For immediate operand 1 make
use of constraint n instead of D and chop off high bits in the
output template.
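The sign-extension effect described above can be modelled in plain C. This
is only a sketch of the arithmetic, not of the s390 backend; the helper
names are hypothetical:

```c
#include <stdint.h>

/* Model a 16-bit constant that is kept sign-extended, as happens for a
   constant created for mode HI.  */
int64_t sign_extend_hi(uint16_t imm)
{
    return imm >= 0x8000 ? (int64_t)imm - 0x10000 : (int64_t)imm;
}

/* Keep only the low 16 bits, which recovers the original bit pattern.  */
uint16_t chop_high_bits(int64_t value)
{
    return (uint16_t)(value & 0xffff);
}
```

An immediate with the high bit set, such as 0x8000, becomes negative after
sign extension, and masking with 0xffff restores the unsigned 16-bit value.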
As reported in the PR, mipsisa64r2-sde-elf doesn't build because HEAP_TRAMPOLINES_INIT
macro isn't defined anywhere.
It is normally defined by
# Figure out if we need to enable heap trampolines by default
case ${target} in
*-*-darwin2*)
# Currently, we do this for macOS 11 and above.
tm_defines="$tm_defines HEAP_TRAMPOLINES_INIT=1"
;;
*)
tm_defines="$tm_defines HEAP_TRAMPOLINES_INIT=0"
;;
esac
in config.gcc, but mips*-sde-elf* is the only target which overwrites
tm_defines shell variable rather than just appending to it (or in one case
prepending), all other targets append something to it, including other
mips* triplets.
I believe (just from looking at config.gcc) that the difference is that
LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3 LIBC_MUSL=4 HEAP_TRAMPOLINES_INIT=0
isn't defined without the patch and is with the patch.
I think defining those first 4 shouldn't cause any harm and defining the
last one is required for it to actually build at all.
2023-11-27 Jakub Jelinek <jakub@redhat.com>
PR target/112300
* config.gcc (mips*-sde-elf*): Append to tm_defines rather than
overwriting them.
While reviewing the gather/scatter code, I noticed that gather_scatter_valid_offset_mode_p looks odd.
gather_scatter_valid_offset_mode_p is supposed to block vluxei64/vsuxei64 on RV32 systems.
However, it fails to do that because it is passed the data mode instead of the index mode:
riscv_vector::gather_scatter_valid_offset_mode_p (<RATIO2:MODE>mode)
It should be RATIO2I instead of RATIO2.
We already have iterators like the following, which block this situation:
(define_mode_iterator RATIO8I [
RVVM1QI
RVVM2HI
RVVM4SI
(RVVM8DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
])
Here TARGET_64BIT blocks the EEW64 index mode on RV32 systems.
So, gather_scatter_valid_offset_mode_p is no longer needed.
After removing it, I found that, because of the incorrect gather_scatter_valid_offset_mode_p,
we previously failed to vectorize the following case on RV32:
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[i] += src[indices[i]]; \
}
T (int64_t, 8)
TEST_ALL (TEST_LOOP)
https://godbolt.org/z/T3ara3fM3
Compiler Explorer shows that GCC failed to vectorize it while Clang does vectorize it.
So adapt the tests, changing the expected number of vectorized cases from 8 to 11.
This confirms that we now have the same behavior as Clang.
Tested on zvl128/zvl256/zvl512/zvl1024 with no regressions.
Note this is not an optimization patch; it is a bug fix.
gcc/ChangeLog:
* config/riscv/autovec.md
(mask_len_gather_load<RATIO1:mode><RATIO1:mode>):
Remove gather_scatter_valid_offset_mode_p.
(mask_len_gather_load<mode><mode>): Ditto.
(mask_len_scatter_store<RATIO1:mode><RATIO1:mode>): Ditto.
(mask_len_scatter_store<mode><mode>): Ditto.
* config/riscv/predicates.md (const_1_or_8_operand): New predicate.
(vector_gs_scale_operand_64): Remove.
* config/riscv/riscv-protos.h (gather_scatter_valid_offset_mode_p): Remove.
* config/riscv/riscv-v.cc (expand_gather_scatter): Refine code.
(gather_scatter_valid_offset_mode_p): Remove.
* config/riscv/vector-iterators.md: Fix iterator bugs.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-1.c: Adapt test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-9.c: Ditto.
Along with RV32E, RV64E is ratified. Though the ILP32E and LP64E ABIs are
still draft, it's worth supporting them.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_ext_version_table): Set version to ratified 2.0.
(riscv_subset_list::parse_std_ext): Allow RV64E.
* config.gcc: Parse base ISA 'rv64e' and ABI 'lp64e'.
* config/riscv/arch-canonicalize: Parse base ISA 'rv64e'.
* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):
Define different macro per XLEN. Add handling for ABI_LP64E.
* config/riscv/riscv-d.cc (riscv_d_handle_target_float_abi):
Add handling for ABI_LP64E.
* config/riscv/riscv-opts.h (enum riscv_abi_type): Add ABI_LP64E.
* config/riscv/riscv.cc (riscv_option_override): Enhance error
handling to support RV64E and LP64E.
(riscv_conditional_register_usage): Change "RV32E" in a comment
to "RV32E/RV64E".
* config/riscv/riscv.h
(UNITS_PER_FP_ARG): Add handling for ABI_LP64E.
(STACK_BOUNDARY): Ditto.
(ABI_STACK_BOUNDARY): Ditto.
(MAX_ARGS_IN_REGISTERS): Ditto.
(ABI_SPEC): Add support for "lp64e".
* config/riscv/riscv.opt: Parse -mabi=lp64e as ABI_LP64E.
* doc/invoke.texi: Add documentation of the LP64E ABI.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/predef-1.c: Test for __riscv_64e.
* gcc.target/riscv/predef-2.c: Ditto.
* gcc.target/riscv/predef-3.c: Ditto.
* gcc.target/riscv/predef-4.c: Ditto.
* gcc.target/riscv/predef-5.c: Ditto.
* gcc.target/riscv/predef-6.c: Ditto.
* gcc.target/riscv/predef-7.c: Ditto.
* gcc.target/riscv/predef-8.c: Ditto.
* gcc.target/riscv/predef-9.c: New test for RV64E and LP64E,
based on predef-7.c.
For the following immediate load operation in gcc/testsuite/gcc.target/loongarch/imm-load1.c:
long long r = 0x0101010101010101;
Before this patch:
lu12i.w $r15,16842752>>12
ori $r15,$r15,257
lu32i.d $r15,0x1010100000000>>32
lu52i.d $r15,$r15,0x100000000000000>>52
After this patch:
lu12i.w $r15,16842752>>12
ori $r15,$r15,257
bstrins.d $r15,$r15,63,32
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(enum loongarch_load_imm_method): Add new method.
(loongarch_build_integer): Add relevant implementations for
new method.
(loongarch_move_integer): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/imm-load1.c: Change old check.
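The three-instruction sequence can be checked with a small C model. This is
a sketch that assumes the usual semantics of lu12i.w, ori and bstrins.d;
the helper names are hypothetical and do not come from loongarch.cc:

```c
#include <stdint.h>

/* lu12i.w: place a 20-bit immediate in bits 31..12 and sign-extend
   the 32-bit result to 64 bits.  */
static uint64_t lu12i_w(uint32_t imm20)
{
    uint64_t v = (uint64_t)(imm20 & 0xfffffu) << 12;
    if (v & 0x80000000u)
        v |= 0xffffffff00000000u;
    return v;
}

/* ori: OR in a zero-extended 12-bit immediate.  */
static uint64_t ori(uint64_t rj, uint32_t imm12)
{
    return rj | (imm12 & 0xfffu);
}

/* bstrins.d: replace bits msb..lsb of rd with the low bits of rj.  */
static uint64_t bstrins_d(uint64_t rd, uint64_t rj, unsigned msb, unsigned lsb)
{
    unsigned width = msb - lsb + 1;
    uint64_t mask = width == 64 ? ~(uint64_t)0 : (((uint64_t)1 << width) - 1);
    return (rd & ~(mask << lsb)) | ((rj & mask) << lsb);
}

/* Model of the "after this patch" sequence shown above.  */
uint64_t build_repeated_pattern(void)
{
    uint64_t r15 = lu12i_w(16842752u >> 12);
    r15 = ori(r15, 257);
    r15 = bstrins_d(r15, r15, 63, 32);  /* copy the low word into the high word */
    return r15;
}
```

The low 32 bits are built first and then inserted into bits 63..32, which
is why a constant whose two halves are identical needs one instruction fewer.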
The xfail for "*-*-*" here, set in r14-4089-gd45ddc2c04e471
"tree-optimization/111294 - backwards threader PHI costing"
was somewhat too general and made this test XPASS for a
number of targets. The common factor for those targets is
that they either explicitly or by default define
LOGICAL_OP_NON_SHORT_CIRCUIT as 0 (see fold-const.cc).
Instead of changing *-*-* to a seemingly random set of
xfailed targets or inventing a new testsuite
effective-target predicate for logical-op-short-circuited
targets or the opposite, let's just force a setting that
removes the need for the xfail for all targets, by
overriding with --param=logical-op-non-short-circuit=0.
* gcc.dg/uninit-pred-9_b.c: Remove xfail for line 20. Pass
--param=logical-op-non-short-circuit=0. Comment why.
In a recent all-target test-round investigating XPASSes for
this file, I noticed this line XPASSing for MMIX. From the
commit history it's obvious it was left out from related
target-xfail tweaks, now the last target xfailing a bogus
warning for this line.
* gcc.dg/uninit-pred-9_b.c: Remove xfail for MMIX from line 23.
gcc/fortran/ChangeLog:
PR fortran/111880
* resolve.cc (resolve_common_vars): Do not call gfc_add_in_common
for symbols that are USE associated or used in a submodule.
gcc/testsuite/ChangeLog:
PR fortran/111880
* gfortran.dg/pr111880.f90: New test.
Avoid using 'network sort' (a misnomer) in sort.cc, the correct term is
'sorting networks'.
gcc/ChangeLog:
* sort.cc: Use 'sorting networks' in comments.
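For readers unfamiliar with the term, a sorting network is a fixed sequence
of compare-exchange steps that does not depend on the input data. The
following is a generic four-element example, not the code in sort.cc; the
function names are hypothetical:

```c
/* Swap so that *a <= *b holds afterwards.  */
static void compare_exchange(int *a, int *b)
{
    if (*b < *a) {
        int t = *a;
        *a = *b;
        *b = t;
    }
}

/* Five-comparator sorting network for four elements.  */
void sort4_network(int v[4])
{
    compare_exchange(&v[0], &v[1]);
    compare_exchange(&v[2], &v[3]);
    compare_exchange(&v[0], &v[2]);
    compare_exchange(&v[1], &v[3]);
    compare_exchange(&v[1], &v[2]);
}
```

The same five steps run in the same order for every input, which is what
distinguishes a sorting network from a comparison sort with data-dependent
control flow.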
2023-11-26 John David Anglin <danglin@gcc.gnu.org>
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/fd-glibc-datagram-client.c: Skip on hppa*-*-hpux*.
* gcc.dg/analyzer/fd-glibc-datagram-socket.c: Likewise.
2023-11-23 John David Anglin <danglin@gcc.gnu.org>
gcc/testsuite/ChangeLog:
* g++.dg/modules/bad-mapper-1.C: Add hppa*-*-hpux* to dg-error
"this-will-not-work" targets.
The new test at gcc.target/i386/cf_check-6.c fails on darwin with:
Excess errors:
cc1: warning: '-fhardened' not supported for this target
gcc/testsuite/ChangeLog:
* gcc.target/i386/cf_check-6.c: Only run on Linux.
The new test at gcc.target/i386/pr112686.c fails on darwin with:
Excess errors:
cc1: error: '-fsplit-stack' currently only supported on GNU/Linux
cc1: error: '-fsplit-stack' is not supported by this compiler configuration
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr112686.c: Add a requirement for split_stack.
After re-checking the RVV ISA, I found that AVL propagation must be disallowed
not only for vrgather but also for slidedown instructions.
Committed.
PR target/112599
gcc/ChangeLog:
* config/riscv/riscv-avlprop.cc (avl_can_be_propagated_p): Add slidedown.
(vlmax_ta_p): Ditto.
(pass_avlprop::get_vlmax_ta_preferred_avl): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vf_avl-1.c: Adapt test.
* gcc.target/riscv/rvv/autovec/pr112599-3.c: New test.
r14-5628-g53ba8d669550d3 added noipa to f1 but `-fno-ipa-vrp` should have been used
instead. The testcase tests the clone of f1, so turning off
IPA VRP is the correct approach here rather than turning off IPA on the function.
gcc/testsuite/ChangeLog:
PR testsuite/112691
* gcc.dg/vla-1.c: Add -fno-ipa-vrp.
Remove noipa from f1.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Just like the patch against gcc.target/aarch64/movk.c, the issue here
is that the two functions, foo32 and foo64, need to be marked as noipa so that
IPA-VRP cannot propagate the return value.
gcc/testsuite/ChangeLog:
PR testsuite/112688
* gcc.target/aarch64/simd/vmulx.x (foo32): Mark as noipa rather
than noinline.
(foo4): Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Since contracts-tmpl-spec2.C is just testing contracts, I thought it would be better
to just add `-fsigned-char` to the options rather than change the testcase to support
both cases.
Committed after testing on aarch64-linux-gnu.
gcc/testsuite/ChangeLog:
PR testsuite/108321
* g++.dg/contracts/contracts-tmpl-spec2.C: Add -fsigned-char
to options.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
The problem here is that dummy_number_generator returns a constant, which IPA VRP
is now able to propagate, so we need to mark the function as noipa to stop that.
gcc/testsuite/ChangeLog:
PR testsuite/112688
* gcc.target/aarch64/movk.c: Add noipa on dummy_number_generator
and remove -fno-inline option.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
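A minimal sketch of the idea, assuming a helper shaped like the one named
above. The constant and the caller are hypothetical and are not copied from
movk.c:

```c
/* Hypothetical stand-in for the helper in the testcase.  The noipa
   attribute keeps interprocedural passes from using the known return
   value in callers.  */
__attribute__((noipa))
long long dummy_number_generator(void)
{
    return 1024;
}

/* Hypothetical caller: without noipa, the whole expression could be
   folded to a constant and the instruction under test might vanish.  */
long long use_number(void)
{
    return dummy_number_generator() ^ 0x1234;
}
```

The attribute changes only what the optimizers may assume, so the function
still returns the same value at run time.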
FreeBSD 6 and 7 have been end of life for years as have been GCC 4.x
releases, so no point in detailing specifics of changes around those.
gcc:
PR target/69374
* doc/install.texi (Specific) <*-*-freebsd*>: Remove older
contents referencing GCC 4.x.
The following testcase is miscompiled in GCC 14 because the
*jcc_bt<mode>_mask and *jcc_bt<SWI48:mode>_mask_1 patterns have just
one argument in (match_operator 0 "bt_comparison_operator" [...])
but as bt_comparison_operator is eq,ne, we need two.
The md readers don't warn about it, after all, some checks can
be done in the predicate rather than specified explicitly, and the
behavior is that anything is accepted as the second argument.
I went through all other i386.md match_operator uses and all others
looked right (extract_operator using 3 operands, all others 2).
I think we'll want to fix this at different spots in older releases
because I think the bug was introduced already in 2008, though most
likely just latent.
2023-11-25 Jakub Jelinek <jakub@redhat.com>
PR target/111408
* config/i386/i386.md (*jcc_bt<mode>_mask,
*jcc_bt<SWI48:mode>_mask_1): Add (const_int 0) as expected
second operand of bt_comparison_operator.
* gcc.c-torture/execute/pr111408.c: New test.
The aarch64_simd_stp<mode> pattern uses w constraint in one alternative and
r in another, but for the latter incorrectly uses <vw> iterator in %<vw>1 which
expands to %d1 for V2DF and %s1 for V2SF and V4SF (this one not relevant to
the pattern) and %w1 for others, so it ICEs if the alternative is selected
during final. Compared to this, <vwcore> macro has the same values for all
modes but uses w for V2DF and V2SF.
2023-11-24 Andrew Pinski <pinskia@gmail.com>
Jakub Jelinek <jakub@redhat.com>
PR target/109977
* config/aarch64/aarch64-simd.md (aarch64_simd_stp<mode>): Use <vwcore>
rather than %<vw> for alternative with r constraint on input operand.
* gcc.dg/pr109977.c: New test.
Currently only functions are directly checked for validity when
exporting via a using-declaration. This patch also checks exporting
non-external names of variables, types, and enumerators. This also
prevents ICEs with `export using enum` for internal-linkage enums.
While we're at it this patch also improves the error messages for these
cases to provide more context about what went wrong.
gcc/cp/ChangeLog:
* name-lookup.cc (check_can_export_using_decl): New.
(do_nonmember_using_decl): Use above to check if names can be
exported.
gcc/testsuite/ChangeLog:
* g++.dg/modules/using-10.C: New test.
* g++.dg/modules/using-enum-2.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
A typedef doesn't create a new entity, and thus should be allowed to be
exported even if it has been previously declared un-exported. See the
example in [module.interface] p6:
export module M;
struct S { int n; };
typedef S S;
export typedef S S; // OK, does not redeclare an entity
PR c++/102341
gcc/cp/ChangeLog:
* decl.cc (duplicate_decls): Allow exporting a redeclaration of
a typedef.
gcc/testsuite/ChangeLog:
* g++.dg/modules/export-1.C: Adjust test.
* g++.dg/modules/export-2_a.C: New test.
* g++.dg/modules/export-2_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Since r14-2893, the frontend parser object needs to exist when running in
preprocess-only mode, because pragma_lex() is now called in that mode and
needs to make use of it. This is handled by calling c_init_preprocess() at
startup. If -fpch-preprocess is in effect (commonly, because of
-save-temps), a PCH file may be loaded during preprocessing, in which
case the parser will be destroyed, causing the issue noted in the
PR. Resolve it by reinitializing the frontend parser after loading the PCH.
gcc/c-family/ChangeLog:
PR pch/112319
* c-ppoutput.cc (cb_read_pch): Reinitialize the frontend parser
after loading a PCH.
gcc/testsuite/ChangeLog:
PR pch/112319
* g++.dg/pch/pr112319.C: New test.
* g++.dg/pch/pr112319.Hs: New test.
* gcc.dg/pch/pr112319.c: New test.
* gcc.dg/pch/pr112319.hs: New test.
The 64-bit Linux target has a relocation issue and can't use 14-bit offsets.
2023-11-22 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa.cc (pa_emit_move_sequence): Use INT14_OK_STRICT
in a couple of places.
__relocate_a_1 is used to copy data after vector resizing. This can be done by memcpy
rather than memmove.
libstdc++-v3/ChangeLog:
PR middle-end/109849
* include/bits/stl_uninitialized.h (__relocate_a_1): Use memcpy instead
of memmove.
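The reasoning can be sketched with a plain C analogue. This is not the
libstdc++ code; the helper name is hypothetical, and it assumes that
new_capacity is at least count:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: move `count` ints from old_data into freshly
   allocated storage with room for `new_capacity` elements.  */
int *grow_storage(int *old_data, size_t count, size_t new_capacity)
{
    int *new_data = malloc(new_capacity * sizeof *new_data);
    if (new_data == NULL)
        return NULL;
    /* The two buffers are distinct allocations and cannot overlap, so
       memcpy's no-overlap requirement is satisfied.  */
    if (count != 0)
        memcpy(new_data, old_data, count * sizeof *new_data);
    free(old_data);
    return new_data;
}
```

memmove is only needed when source and destination may overlap, which
cannot happen when the destination was just allocated.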
PR109849 shows that a loop that heavily pushes and pops from a stack
implemented by a C++ std::vector results in slow code, mainly because the
vector structure is not split by SRA and so we end up with many loads
and stores into it. This is because it is passed by reference
to (re)allocation methods and so needs to live in memory, even though
it does not escape from them and so we could SRA it if we
re-constructed it before the call and then separated it to distinct
replacements afterwards.
This patch does exactly that, first relaxing the selection of
candidates to also include those which are addressable but do not
escape and then adding code to deal with the calls. The
micro-benchmark that is also the (scan-dump) testcase in this patch
runs twice as fast with it than with current trunk. Honza measured
its effect on the libjxl benchmark and it almost closes the
performance gap between Clang and GCC while not requiring excessive
inlining and thus code growth.
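The access pattern in question can be sketched in C as follows. This is a
hypothetical reduction, not one of the testcases added by the patch, and it
makes no claim about what any particular GCC version does with it:

```c
#include <stdlib.h>

struct int_stack {
    int *data;
    size_t size;
    size_t capacity;
};

/* Takes the stack by reference, but the pointer does not escape.  */
static int stack_grow(struct int_stack *s)
{
    size_t new_cap = s->capacity ? s->capacity * 2 : 4;
    int *p = realloc(s->data, new_cap * sizeof *p);
    if (p == NULL)
        return 0;
    s->data = p;
    s->capacity = new_cap;
    return 1;
}

/* Pushes 0..n-1 and pops along the way; every pushed value is popped
   exactly once, so the result is the sum of 0..n-1 unless allocation
   fails.  */
long push_pop_sum(int n)
{
    struct int_stack s = { NULL, 0, 0 };
    long sum = 0;
    for (int i = 0; i < n; i++) {
        if (s.size == s.capacity && !stack_grow(&s))
            break;
        s.data[s.size++] = i;
        if (i % 3 == 2)
            sum += s.data[--s.size];
    }
    while (s.size > 0)
        sum += s.data[--s.size];
    free(s.data);
    return sum;
}
```

Here `s` has its address taken only for the call to stack_grow, which is
the situation where keeping the fields in scalar replacements around the
call would avoid repeated loads and stores in the loop.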
The patch disallows creation of replacements for such aggregates which
are also accessed with a precision smaller than their size because I
have observed that this led to excessive zero-extending of data
leading to slow-downs of perlbench (on some CPUs). Apart from this
case I have not noticed any regressions, at least not so far.
Gimple call argument flags can tell if an argument is unused (and then
we do not need to generate any statements for it) or if it is not
written to and then we do not need to generate statements loading
replacements from the original aggregate after the call statement.
Unfortunately, we cannot symmetrically use flags saying that an
aggregate is not read to avoid re-constructing the aggregate before the
call, because the flags don't tell which parts of the aggregate were not
written to, so we load all replacements, and so all need to have the
correct value before the call.
This version of the patch also takes care to avoid attempts to modify
abnormal edges, something which was missing in the previous version.
gcc/ChangeLog:
2023-11-23 Martin Jambor <mjambor@suse.cz>
PR middle-end/109849
* tree-sra.cc (passed_by_ref_in_call): New.
(sra_initialize): Allocate passed_by_ref_in_call.
(sra_deinitialize): Free passed_by_ref_in_call.
(create_access): Add decl pool candidates only if they are not
already candidates.
(build_access_from_expr_1): Bail out on ADDR_EXPRs.
(build_access_from_call_arg): New function.
(asm_visit_addr): Rename to scan_visit_addr, change the
disqualification dump message.
(scan_function): Check taken addresses for all non-call statements,
including phi nodes. Process all call arguments, including the static
chain, with build_access_from_call_arg.
(maybe_add_sra_candidate): Relax need_to_live_in_memory check to allow
non-escaped local variables.
(sort_and_splice_var_accesses): Disallow smaller-than-precision
replacements for aggregates passed by reference to functions.
(sra_modify_expr): Use a separate stmt iterator for adding statements
before the processed statement and after it.
(enum out_edge_check): New type.
(abnormal_edge_after_stmt_p): New function.
(sra_modify_call_arg): New function.
(sra_modify_assign): Adjust calls to sra_modify_expr.
(sra_modify_function_body): Likewise, use sra_modify_call_arg to
process call arguments, including the static chain.
gcc/testsuite/ChangeLog:
2023-11-23 Martin Jambor <mjambor@suse.cz>
PR middle-end/109849
* g++.dg/tree-ssa/pr109849.C: New test.
* g++.dg/tree-ssa/sra-eh-1.C: Likewise.
* gcc.dg/tree-ssa/pr109849.c: Likewise.
* gcc.dg/tree-ssa/sra-longjmp-1.c: Likewise.
* gfortran.dg/pr43984.f90: Added -fno-tree-sra to dg-options.
For -mcmodel=large, we have to load function address to a register.
PR target/112686
gcc/ChangeLog:
* config/i386/i386.cc (ix86_expand_split_stack_prologue): Load
function address to a register for ix86_cmodel == CM_LARGE.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr112686.c: New test.
The new warning has two purposes: First, it makes it clearer to the
user that it is about OpenMP and, secondly and more importantly,
it permits the use of -Wno-openmp.
The newly added -Wopenmp is enabled by default and replaces the
'0' (always warning) in several OpenMP-related warning calls.
For code shared with OpenACC, it only uses OPT_Wopenmp for
'flag_openmp | flag_openmp_simd'.
gcc/c-family/ChangeLog:
* c.opt (Wopenmp): Add, enable by default.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_num_threads,
c_parser_omp_clause_num_tasks, c_parser_omp_clause_grainsize,
c_parser_omp_clause_priority, c_parser_omp_clause_schedule,
c_parser_omp_clause_num_teams, c_parser_omp_clause_thread_limit,
c_parser_omp_clause_dist_schedule, c_parser_omp_depobj,
c_parser_omp_scan_loop_body, c_parser_omp_assumption_clauses):
Add OPT_Wopenmp to warning_at.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_dist_schedule,
cp_parser_omp_scan_loop_body, cp_parser_omp_assumption_clauses,
cp_parser_omp_depobj): Add OPT_Wopenmp to warning_at.
* semantics.cc (finish_omp_clauses): Likewise.
gcc/ChangeLog:
* doc/invoke.texi (-Wopenmp): Add.
* gimplify.cc (gimplify_omp_for): Add OPT_Wopenmp to warning_at.
* omp-expand.cc (expand_omp_ordered_sink): Likewise.
* omp-general.cc (omp_check_context_selector): Likewise.
* omp-low.cc (scan_omp_for, check_omp_nesting_restrictions,
lower_omp_ordered_clauses): Likewise.
* omp-simd-clone.cc (simd_clone_clauses_extract): Likewise.
gcc/fortran/ChangeLog:
* lang.opt (Wopenmp): Add, enabled by default and documented in C.
* openmp.cc (gfc_match_omp_declare_target, resolve_positive_int_expr,
resolve_nonnegative_int_expr, resolve_omp_clauses,
gfc_resolve_omp_do_blocks): Use OPT_Wopenmp with gfc_warning{,_now}.