Commit Graph

206057 Commits

Author SHA1 Message Date
Ian Lance Taylor 8b2e510ca3 libbacktrace: call GetModuleFileNameA on Windows
Patch from Björn Schäpers.

	* fileline.c: Include <windows.h> if available.
	(windows_get_executable_path): New static function.
	(fileline_initialize): Call windows_get_executable_path.
	* configure.ac: Check for windows.h.
	* configure: Regenerate.
	* config.h.in: Regenerate.
2023-11-29 14:04:36 -08:00
Patrick Palka 0b242afffd c++: fix testcase [PR112765]
PR c++/112765

gcc/testsuite/ChangeLog:

	* g++.dg/warn/Wparentheses-33.C: Compile with -Wparentheses.
2023-11-29 16:47:12 -05:00
Patrick Palka 220fe41fd4 c++: bogus -Wparentheses warning [PR112765]
We need to consistently look through implicit INDIRECT_REF when
setting/checking for -Wparentheses warning suppression.  In passing
use the recently introduced STRIP_REFERENCE_REF helper some more.

	PR c++/112765

gcc/cp/ChangeLog:

	* pt.cc (tsubst_expr) <case MODOP_EXPR>: Look through implicit
	INDIRECT_REF when propagating -Wparentheses warning suppression.
	* semantics.cc (maybe_warn_unparenthesized_assignment): Replace
	REFERENCE_REF_P handling with STRIP_REFERENCE_REF.
	(finish_parenthesized_expr): Likewise.

gcc/testsuite/ChangeLog:

	* g++.dg/warn/Wparentheses-33.C: New test.
2023-11-29 16:42:39 -05:00
David Faust 72e212c029 bpf: change ASM_COMMENT_START to '#'
The BPF "pseudo-C" assembly dialect uses semi-colon (;) to separate
statements, not to begin line comments. The GNU assembler was recently
changed accordingly:

  https://sourceware.org/pipermail/binutils/2023-November/130867.html

This patch adapts the BPF backend in GCC accordingly, to use a hash (#)
instead of semi-colon (;) for ASM_COMMENT_START. This is supported
already in clang.

gcc/
	* config/bpf/bpf.h (ASM_COMMENT_START): Change from ';' to '#'.

gcc/testsuite/
	* gcc.target/bpf/core-builtin-enumvalue-opt.c: Change dg-final
	scans to not assume a specific comment character.
	* gcc.target/bpf/core-builtin-enumvalue.c: Likewise.
	* gcc.target/bpf/core-builtin-type-based.c: Likewise.
	* gcc.target/bpf/core-builtin-type-id.c: Likewise.
2023-11-29 11:50:00 -08:00
Jakub Jelinek 259bb7a45a rs6000: Fix up c-c++-common/builtin-classify-type-1.c failure [PR112725]
The rs6000 backend (and s390 one as well) diagnoses passing vector types
to unprototyped functions, which breaks the builtin-classify-type-1.c test.
The builtin isn't really unprototyped, it is just type-generic and accepting
vector types is just fine there, all it does is categorize the vector type.
The following patch makes sure we don't diagnose it for this builtin.
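
As a minimal sketch of why the builtin is type-generic rather than
unprototyped: it accepts an argument of any type and only reports a
type-class code.  The wrapper names below are made up for illustration,
and the exact code returned for vector types depends on the compiler
version, so none is assumed here.

```c
/* __builtin_classify_type is a GCC builtin; it takes an argument of
   any type and returns an integer type-class code.  The wrappers are
   hypothetical and exist only to show calls with different types.  */
typedef int v4si __attribute__ ((vector_size (16)));

int
class_of_int (void)
{
  return __builtin_classify_type (0);
}

int
class_of_double (void)
{
  return __builtin_classify_type (0.0);
}

int
class_of_vector (void)
{
  v4si v = { 0, 0, 0, 0 };
  /* The vector is only categorized, never actually passed anywhere.  */
  return __builtin_classify_type (v);
}
```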

2023-11-29  Jakub Jelinek  <jakub@redhat.com>

	PR target/112725
	* config/rs6000/rs6000.cc (invalid_arg_for_unprototyped_fn): Return
	NULL for __builtin_classify_type calls with vector arguments.
2023-11-29 19:19:07 +01:00
Andrew MacLeod 634cf26c94 Check operands before invoking fold_range.
Call check_operands_p before fold_range to make sure it is a valid operation.

	PR tree-optimization/111922
	gcc/
	* ipa-cp.cc (ipa_vr_operation_and_type_effects): Check the
	operands are valid before calling fold_range.

	gcc/testsuite/
	* gcc.dg/pr111922.c: New.
2023-11-29 11:43:53 -05:00
Andrew MacLeod ea19de921b Add operand_check_p to range-ops.
Add an optional method to verify operands are compatible, and check
the operands before all range operations.
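
The check-before-fold idea can be sketched as follows.  This is not
GCC's range-op code: struct simple_range, operand_check and fold_plus
are hypothetical, and "compatible" is assumed here to mean "same
precision".

```c
#include <stdbool.h>

/* Hypothetical, much simplified range: a closed interval tagged with
   the bit precision of its type.  */
struct simple_range
{
  unsigned precision;
  long lo;
  long hi;
};

/* Assumption for this sketch: operands are compatible when they
   share a precision.  */
bool
operand_check (const struct simple_range *a, const struct simple_range *b)
{
  return a->precision == b->precision;
}

/* Fold A + B into *RESULT, but only after verifying the operands.
   Returns false and leaves *RESULT untouched on a mismatch.
   Overflow and wrapping are ignored in this sketch.  */
bool
fold_plus (struct simple_range *result,
           const struct simple_range *a, const struct simple_range *b)
{
  if (!operand_check (a, b))
    return false;
  result->precision = a->precision;
  result->lo = a->lo + b->lo;
  result->hi = a->hi + b->hi;
  return true;
}
```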

	* range-op-mixed.h (operator_equal::operand_check_p): New.
	(operator_not_equal::operand_check_p): New.
	(operator_lt::operand_check_p): New.
	(operator_le::operand_check_p): New.
	(operator_gt::operand_check_p): New.
	(operator_ge::operand_check_p): New.
	(operator_plus::operand_check_p): New.
	(operator_abs::operand_check_p): New.
	(operator_minus::operand_check_p): New.
	(operator_negate::operand_check_p): New.
	(operator_mult::operand_check_p): New.
	(operator_bitwise_not::operand_check_p): New.
	(operator_bitwise_xor::operand_check_p): New.
	(operator_bitwise_and::operand_check_p): New.
	(operator_bitwise_or::operand_check_p): New.
	(operator_min::operand_check_p): New.
	(operator_max::operand_check_p): New.
	* range-op.cc (range_op_handler::fold_range): Check operand
	parameter types.
	(range_op_handler::op1_range): Ditto.
	(range_op_handler::op2_range): Ditto.
	(range_op_handler::operand_check_p): New.
	(range_operator::operand_check_p): New.
	(operator_lshift::operand_check_p): New.
	(operator_rshift::operand_check_p): New.
	(operator_logical_and::operand_check_p): New.
	(operator_logical_or::operand_check_p): New.
	(operator_logical_not::operand_check_p): New.
	* range-op.h (range_operator::operand_check_p): New.
	(range_op_handler::operand_check_p): New.
2023-11-29 11:43:53 -05:00
Martin Jambor 302461ad9a tree-sra: Avoid returns of references to SRA candidates
The enhancement to address PR 109849 contained an important thinko:
any reference that is passed to a function and does not escape must
also not happen to be aliased by the return value of the function.
This quickly surfaced as bugs PR 112711 and PR 112721.
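
A minimal sketch of the problematic shape, with made-up names (this is
not one of the PR testcases): the callee does not let its argument
escape, yet returns it, so the result aliases the aggregate.

```c
/* Hypothetical aggregate standing in for an SRA candidate.  */
struct pair
{
  int a;
  int b;
};

/* The pointer does not escape anywhere else, but it is handed
   straight back to the caller.  */
struct pair *
pass_through (struct pair *p)
{
  return p;
}

int
caller (void)
{
  struct pair candidate = { 1, 2 };
  struct pair *alias = pass_through (&candidate);
  /* This store goes through the returned pointer, so the later read
     of CANDIDATE must still observe it.  */
  alias->a = 42;
  return candidate.a;
}
```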

Just as IPA-modref does a good enough job to allow us to rely on the
escaped set of variables, it seems to be doing well also at updating
the EAF_NOT_RETURNED_DIRECTLY call argument flag, which happens to address
exactly the situation we need to avoid.  Of course, if a call
statement ignores any returned value, we also do not need to check the
flag.

Hopefully this does not pessimize things too much.  I have verified
that the PR 109849 testcase remains quick, and so should the
benchmark it is derived from.

gcc/ChangeLog:

2023-11-27  Martin Jambor  <mjambor@suse.cz>

	PR tree-optimization/112711
	PR tree-optimization/112721
	* tree-sra.cc (build_access_from_call_arg): New parameter
	CAN_BE_RETURNED, disqualify any candidate passed by reference if it is
	true.  Adjust leading comment.
	(scan_function): Pass appropriate value to CAN_BE_RETURNED of
	build_access_from_call_arg.

gcc/testsuite/ChangeLog:

2023-11-29  Martin Jambor  <mjambor@suse.cz>

	PR tree-optimization/112711
	PR tree-optimization/112721
	* g++.dg/tree-ssa/pr112711.C: New test.
	* gcc.dg/tree-ssa/pr112721.c: Likewise.
2023-11-29 16:25:26 +01:00
Thomas Schwinge 4c909c6ee3 In 'libgomp.c/target-simd-clone-{1,2,3}.c', restrict 'scan-offload-ipa-dump's to 'only_for_offload_target amdgcn-amdhsa'
This gets rid of UNRESOLVEDs if nvptx offloading compilation is enabled in
addition to GCN:

     PASS: libgomp.c/target-simd-clone-1.c (test for excess errors)
     PASS: libgomp.c/target-simd-clone-1.c scan-amdgcn-amdhsa-offload-ipa-dump simdclone "Generated local clone _ZGV.*N.*_addit"
    -UNRESOLVED: libgomp.c/target-simd-clone-1.c scan-nvptx-none-offload-ipa-dump simdclone "Generated local clone _ZGV.*N.*_addit"
     PASS: libgomp.c/target-simd-clone-1.c scan-amdgcn-amdhsa-offload-ipa-dump simdclone "Generated local clone _ZGV.*M.*_addit"
    -UNRESOLVED: libgomp.c/target-simd-clone-1.c scan-nvptx-none-offload-ipa-dump simdclone "Generated local clone _ZGV.*M.*_addit"
     PASS: libgomp.c/target-simd-clone-2.c (test for excess errors)
     PASS: libgomp.c/target-simd-clone-2.c scan-amdgcn-amdhsa-offload-ipa-dump-not simdclone "Generated .* clone"
    -UNRESOLVED: libgomp.c/target-simd-clone-2.c scan-nvptx-none-offload-ipa-dump-not simdclone "Generated .* clone"
     PASS: libgomp.c/target-simd-clone-3.c (test for excess errors)
     PASS: libgomp.c/target-simd-clone-3.c scan-amdgcn-amdhsa-offload-ipa-dump simdclone "device doesn't match"
    -UNRESOLVED: libgomp.c/target-simd-clone-3.c scan-nvptx-none-offload-ipa-dump simdclone "device doesn't match"
     PASS: libgomp.c/target-simd-clone-3.c scan-amdgcn-amdhsa-offload-ipa-dump-not simdclone "Generated .* clone"
    -UNRESOLVED: libgomp.c/target-simd-clone-3.c scan-nvptx-none-offload-ipa-dump-not simdclone "Generated .* clone"

Minor fix-up for commit 309e2d95e3
'OpenMP: Generate SIMD clones for functions with "declare target"'.

	libgomp/
	* testsuite/libgomp.c/target-simd-clone-1.c: Restrict
	'scan-offload-ipa-dump's to
	'only_for_offload_target amdgcn-amdhsa'.
	* testsuite/libgomp.c/target-simd-clone-2.c: Likewise.
	* testsuite/libgomp.c/target-simd-clone-3.c: Likewise.
2023-11-29 15:10:01 +01:00
Thomas Schwinge 27c79b91f6 testsuite: Add 'only_for_offload_target' wrapper for 'scan-offload-tree-dump' etc.
This allows restricting scans to one specific offload target only.

	gcc/
	* doc/sourcebuild.texi (Final Actions): Document
	'only_for_offload_target' wrapper.
	gcc/testsuite/
	* lib/scanoffload.exp (only_for_offload_target): New 'proc'.
2023-11-29 15:10:01 +01:00
Rainer Orth 8ee480441e testsuite, i386: Only check for cfi directives if supported [PR112729]
gcc.target/i386/apx-interrupt-1.c and two more tests FAIL on Solaris/x86
with the native assembler.  Like Darwin as, it doesn't support cfi
directives.  Instead of adding more and more targets in every affected
test, this patch introduces a cfi effective-target keyword to check for
the prerequisite.

Tested on i386-pc-solaris2.11 (as and gas), x86_64-pc-linux-gnu, and
x86_64-apple-darwin23.1.0.

2023-11-24  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc/testsuite:
	PR testsuite/112729
	* lib/target-supports.exp (check_effective_target_cfi): New proc.
	* gcc.target/i386/apx-interrupt-1.c: Require cfi instead of
	skipping on *-*-darwin*.
	* gcc.target/i386/apx-push2pop2_force_drap-1.c: Likewise.
	* gcc.target/i386/apx-push2pop2-1.c: Likewise.

	gcc:
	PR testsuite/112729
	* doc/sourcebuild.texi (Effective-Target Keywords, Environment
	attributes): Document cfi.
2023-11-29 14:52:04 +01:00
Thomas Schwinge 58baac57d6 Fix 'g++.dg/cpp26/static_assert1.C' for '-fno-exceptions' configurations
This test case, added in recent commit 6ce952188a
"c++: Implement C++26 P2741R3 - user-generated static_assert messages [PR110348]",
expectedly runs into 'UNSUPPORTED: [...]: exception handling disabled', but
along the way also FAILs a few tests:

    UNSUPPORTED: g++.dg/cpp26/static_assert1.C  -std=gnu++98
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++11  (test for warnings, line 6)
    [...]
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++11  (test for warnings, line 51)
    FAIL: g++.dg/cpp26/static_assert1.C  -std=gnu++11  (test for errors, line 52)
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++11  (test for warnings, line 56)
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++11  (test for warnings, line 57)
    FAIL: g++.dg/cpp26/static_assert1.C  -std=gnu++11  at line 58 (test for errors, line 57)
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++11  (test for warnings, line 59)
    [...]
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++11  at line 308 (test for errors, line 307)
    UNSUPPORTED: g++.dg/cpp26/static_assert1.C  -std=gnu++11: exception handling disabled
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++14  (test for warnings, line 6)
    [...]
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++14  (test for warnings, line 51)
    FAIL: g++.dg/cpp26/static_assert1.C  -std=gnu++14  (test for errors, line 52)
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++14  (test for warnings, line 56)
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++14  (test for warnings, line 57)
    FAIL: g++.dg/cpp26/static_assert1.C  -std=gnu++14  at line 58 (test for errors, line 57)
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++14  (test for warnings, line 59)
    [...]
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++14  at line 257 (test for errors, line 256)
    FAIL: g++.dg/cpp26/static_assert1.C  -std=gnu++14  (test for errors, line 261)
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++14  (test for warnings, line 262)
    [...]
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++14  at line 308 (test for errors, line 307)
    UNSUPPORTED: g++.dg/cpp26/static_assert1.C  -std=gnu++14: exception handling disabled
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++20  (test for warnings, line 6)
    [...]
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++20  (test for warnings, line 51)
    FAIL: g++.dg/cpp26/static_assert1.C  -std=gnu++20  (test for errors, line 52)
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++20  (test for warnings, line 56)
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++20  (test for warnings, line 57)
    FAIL: g++.dg/cpp26/static_assert1.C  -std=gnu++20  at line 58 (test for errors, line 57)
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++20  (test for warnings, line 59)
    [...]
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++20  at line 257 (test for errors, line 256)
    FAIL: g++.dg/cpp26/static_assert1.C  -std=gnu++20  (test for errors, line 261)
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++20  (test for warnings, line 262)
    [...]
    PASS: g++.dg/cpp26/static_assert1.C  -std=gnu++20  at line 308 (test for errors, line 307)
    UNSUPPORTED: g++.dg/cpp26/static_assert1.C  -std=gnu++20: exception handling disabled

Use an explicit '-fexceptions' to turn this front end test case all-PASS.

	gcc/testsuite/
	* g++.dg/cpp26/static_assert1.C: Fix for '-fno-exceptions'
	configurations.
2023-11-29 14:12:56 +01:00
Thomas Schwinge 762b428815 Fix '23_containers/span/at.cc' for '-fno-exceptions' configurations
Added in recent commit 1fa85dcf65
"libstdc++: Add std::span::at for C++26 (P2821R5)", the test case already
does use '#if __cpp_exceptions', but failed to correspondingly guard the
'dg-warning' directives, resulting in:

    FAIL: 23_containers/span/at.cc  -std=gnu++26  (test for warnings, line 15)
    FAIL: 23_containers/span/at.cc  -std=gnu++26  (test for warnings, line 26)
    PASS: 23_containers/span/at.cc  -std=gnu++26 (test for excess errors)
    PASS: 23_containers/span/at.cc  -std=gnu++26 execution test

	libstdc++-v3/
	* testsuite/23_containers/span/at.cc: Fix for '-fno-exceptions'
	configurations.
2023-11-29 14:12:56 +01:00
Thomas Schwinge 11ee1fb3e3 Adjust 'g++.dg/ext/has-feature.C' for default-'-fno-exceptions', '-fno-rtti' configurations
..., where you currently get:

    FAIL: g++.dg/ext/has-feature.C  -std=gnu++98 (test for excess errors)
    [...]

Minor fix-up for recent commit 06280a906c
"c-family: Implement __has_feature and __has_extension [PR60512]".

	gcc/testsuite/
	* g++.dg/ext/has-feature.C: Adjust for default-'-fno-exceptions',
	'-fno-rtti' configurations.
2023-11-29 14:12:56 +01:00
Richard Biener b09b879e4e middle-end/110237 - wrong MEM_ATTRs for partial loads/stores
The following addresses a miscompilation by RTL scheduling related
to the representation of masked stores.  For that we have

(insn 38 35 39 3 (set (mem:V16SI (plus:DI (reg:DI 40 r12 [orig:90 _22 ] [90])
                (const:DI (plus:DI (symbol_ref:DI ("b") [flags 0x2]  <var_decl 0x7ffff6e28d80 b>)
                        (const_int -4 [0xfffffffffffffffc])))) [1 MEM <vector(16) int> [(int *)vectp_b.12_28]+0 S64 A32])
        (vec_merge:V16SI (reg:V16SI 20 xmm0 [118])
            (mem:V16SI (plus:DI (reg:DI 40 r12 [orig:90 _22 ] [90])
                    (const:DI (plus:DI (symbol_ref:DI ("b") [flags 0x2]  <var_decl 0x7ffff6e28d80 b>)
                            (const_int -4 [0xfffffffffffffffc])))) [1 MEM <vector(16) int> [(int *)vectp_b.12_28]+0 S64 A32])

and specifically the memory attributes

  [1 MEM <vector(16) int> [(int *)vectp_b.12_28]+0 S64 A32]

are problematic.  They tell us the instruction stores and reads a full
vector, which it of course does not.  There isn't any good MEM_EXPR
we can use here (we lack a way to just specify a pointer and restrict
info for example), and since the MEMs have a vector mode it's
difficult in general as passes do not need to look at the memory
attributes at all.
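
To illustrate why a full-vector memory attribute overstates the
access, here is a scalar model of a masked store.  The function name
and the byte-per-lane mask layout are assumptions made for the sketch;
they do not correspond to GCC internals.

```c
#include <stddef.h>

/* Scalar model of a masked (partial) vector store: only lanes whose
   mask entry is nonzero are written, so fewer bytes are touched than
   the vector mode's size suggests.  */
void
masked_store (int *dst, const int *src, const unsigned char *mask,
              size_t lanes)
{
  for (size_t i = 0; i < lanes; i++)
    if (mask[i])
      dst[i] = src[i];
}
```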

The easiest way to avoid running into the alias analysis problem is
to scrap the MEM_EXPR when we expand the internal functions for
partial loads/stores.  That avoids the disambiguation we run into,
which realizes that we store to an object smaller than the size of
the mode we appear to store.

After the patch we see just

  [1  S64 A32]

so we preserve the alias set, the alignment and the size (the size
is redundant if the MEM isn't BLKmode).  That's still not good
in case the RTL alias oracle would implement the same
disambiguation but it fends off the gimple one.

This fixes gcc.dg/torture/pr58955-2.c when built with AVX512
and --param=vect-partial-vector-usage=1.

	PR middle-end/110237
	* internal-fn.cc (expand_partial_load_optab_fn): Clear
	MEM_EXPR and MEM_OFFSET.
	(expand_partial_store_optab_fn): Likewise.
2023-11-29 13:26:31 +01:00
Jakub Jelinek 5c95bf945c fold-const: Fix up multiple_of_p [PR112733]
We ICE on the following testcase when wi::multiple_of_p is called on
widest_int 1 and -128 with UNSIGNED.  I still need to work on the
actual wide-int.cc issue, the latest patch attached to the PR regressed
bitint-{38,39}.c, so will need to debug that, but there is a clear bug
on the fold-const.cc side as well - widest_int is a signed representation
by definition, using UNSIGNED with it certainly doesn't match what was
intended, because -128 as the second operand effectively means unsigned
131072 bit 0xfffff............ffff80 integer, not the signed char -128
that appeared in the source.

In the INTEGER_CST case a few lines above this we already use
    case INTEGER_CST:
      if (TREE_CODE (bottom) != INTEGER_CST || integer_zerop (bottom))
        return false;
      return wi::multiple_of_p (wi::to_widest (top), wi::to_widest (bottom),
                                SIGNED);
so I think using SIGNED with widest_int is best there (compared to the
other choices in the PR).
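
To see why the signedness argument matters, here is a minimal sketch
that uses 64-bit integers in place of widest_int.  The helper names
are hypothetical and are not the wi:: API; they only mimic reading the
same bit pattern as signed or as unsigned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a SIGNED multiple-of query.  */
bool
multiple_of_signed (int64_t top, int64_t bottom)
{
  if (bottom == 0)
    return false;
  if (bottom == -1)
    return true;        /* Also avoids the INT64_MIN % -1 overflow.  */
  return top % bottom == 0;
}

/* The same query with both bit patterns reinterpreted as unsigned,
   so -128 becomes 0xffffffffffffff80.  */
bool
multiple_of_unsigned (int64_t top, int64_t bottom)
{
  uint64_t utop = (uint64_t) top;
  uint64_t ubottom = (uint64_t) bottom;
  if (ubottom == 0)
    return false;
  return utop % ubottom == 0;
}
```

With these helpers, 256 is a multiple of -128 under the signed reading
but not under the unsigned one, because the divisor is then a huge
positive number.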

2023-11-29  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/112733
	* fold-const.cc (multiple_of_p): Pass SIGNED rather than
	UNSIGNED for wi::multiple_of_p on widest_int arguments.

	* gcc.dg/pr112733.c: New test.
2023-11-29 12:26:50 +01:00
Iain Sandoe d65eb8a6bb testsuite, x86: Handle a broken assembler
Earlier assembler support for complex fp16 on x86_64 Darwin is broken.
This adds an additional test to the existing target-supports that fails
for the broken assemblers but works for the newer, fixed, ones.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp: Test an asm line that fails on broken
	Darwin assembler versions.
2023-11-29 11:26:27 +01:00
Rainer Orth f82b6ddb72 testsuite: Adjust g++.dg/opt/devirt2.C on SPARC
Since 20231124, g++.dg/opt/devirt2.C began to FAIL on 32 and 64-bit
Solaris/SPARC:

FAIL: g++.dg/opt/devirt2.C  -std=gnu++14  scan-assembler-times (jmp|call)[^\\n]*xyzzy 4
FAIL: g++.dg/opt/devirt2.C  -std=gnu++17  scan-assembler-times (jmp|call)[^\\n]*xyzzy 4
FAIL: g++.dg/opt/devirt2.C  -std=gnu++20  scan-assembler-times (jmp|call)[^\\n]*xyzzy 4
FAIL: g++.dg/opt/devirt2.C  -std=gnu++98  scan-assembler-times (jmp|call)[^\\n]*xyzzy 4

This is no doubt due to

commit ba0869323e
Author: Maciej W. Rozycki <macro@embecosm.com>
Date:   Thu Nov 23 16:13:59 2023 +0000

    testsuite: Fix subexpressions with `scan-assembler-times'

which fixes exactly the double-counting the test relied on/worked around
on sparc.  Fixed by adjusting the count.

Tested on sparc-sun-solaris2.11.

2023-11-28  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc/testsuite:
	* g++.dg/opt/devirt2.C: Adjust scan-assembler-count on sparc for
	removal of -inline from regexp.  Update comment.
2023-11-29 10:54:22 +01:00
Juzhe-Zhong bdad036da3 RISC-V: Support highpart register overlap for vwcvt
Since Richard recently added support for register filters, we are able to support highpart register
overlap for widening RVV instructions.

This patch supports it for vwcvt intrinsics.

I use real application code as the basis for the vwcvt example:
https://github.com/riscv/riscv-v-spec/issues/929
https://godbolt.org/z/xoeGnzd8q

This is real application code that uses LMUL = 8 with unrolling to gain optimal
performance for a specific library.

As the codegen shows, GCC already generates optimal code for it, since we support register
lowpart overlap for narrowing instructions (dest EEW < source EEW).

With this patch, we start to support highpart register overlap for widening instructions (dest EEW > source EEW).

The intrinsic code above, adapted to vwcvt:

https://godbolt.org/z/1TMPE5Wfr

size_t
foo (char const *buf, size_t len)
{
  size_t sum = 0;
  size_t vl = __riscv_vsetvlmax_e8m8 ();
  size_t step = vl * 4;
  const char *it = buf, *end = buf + len;
  for (; it + step <= end;)
    {
      vint8m4_t v0 = __riscv_vle8_v_i8m4 ((void *) it, vl);
      it += vl;
      vint8m4_t v1 = __riscv_vle8_v_i8m4 ((void *) it, vl);
      it += vl;
      vint8m4_t v2 = __riscv_vle8_v_i8m4 ((void *) it, vl);
      it += vl;
      vint8m4_t v3 = __riscv_vle8_v_i8m4 ((void *) it, vl);
      it += vl;

      asm volatile("nop" ::: "memory");
      vint16m8_t vw0 = __riscv_vwcvt_x_x_v_i16m8 (v0, vl);
      vint16m8_t vw1 = __riscv_vwcvt_x_x_v_i16m8 (v1, vl);
      vint16m8_t vw2 = __riscv_vwcvt_x_x_v_i16m8 (v2, vl);
      vint16m8_t vw3 = __riscv_vwcvt_x_x_v_i16m8 (v3, vl);

      asm volatile("nop" ::: "memory");
      size_t sum0 = __riscv_vmv_x_s_i16m8_i16 (vw0);
      size_t sum1 = __riscv_vmv_x_s_i16m8_i16 (vw1);
      size_t sum2 = __riscv_vmv_x_s_i16m8_i16 (vw2);
      size_t sum3 = __riscv_vmv_x_s_i16m8_i16 (vw3);

      sum += sumation (sum0, sum1, sum2, sum3);
    }
  return sum;
}

Before this patch:

...
csrr    t0,vlenb
...
        vwcvt.x.x.v     v16,v8
        vwcvt.x.x.v     v8,v28
        vs8r.v  v16,0(sp)               ---> spill
        vwcvt.x.x.v     v16,v24
        vwcvt.x.x.v     v24,v4
        nop
        vsetvli zero,zero,e16,m8,ta,ma
        vmv.x.s a2,v16
        vl8re16.v       v16,0(sp)      --->  reload
...
csrr    t0,vlenb
...

Note the heavy spilling and reloading inside the loop body.

After this patch:

...
	vwcvt.x.x.v	v8,v12
	vwcvt.x.x.v	v16,v20
	vwcvt.x.x.v	v24,v28
	vwcvt.x.x.v	v0,v4
...

Optimal codegen after this patch.
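
The register pairs in the output after the patch all have the same
shape: the m4 source group is the highest-numbered half of the m8
destination group (v12 within v8-v15, and so on).  A minimal sketch of
that relationship, using a hypothetical helper that is not part of the
RISC-V backend:

```c
#include <stdbool.h>

/* Hypothetical helper: does the source group starting at SRC sit in
   the highest-numbered part of the destination group starting at
   DEST?  Register numbers are plain integers (v8 -> 8); SRC_REGS and
   DEST_REGS are the group sizes in registers (m4 -> 4, m8 -> 8).  */
bool
highpart_overlap_p (int dest, int dest_regs, int src, int src_regs)
{
  return src_regs >= 1
         && src_regs < dest_regs
         && src == dest + dest_regs - src_regs;
}
```

For example, highpart_overlap_p (8, 8, 12, 4) holds for
"vwcvt.x.x.v v8,v12", while the pair v16,v8 from the output before
the patch does not overlap at all.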

Tested on zvl128b with no regressions.

I am going to test zve64d/zvl256b/zvl512b/zvl1024b.

Ok for trunk if there are no regressions in the testing above?

Co-authored-by: kito-cheng <kito.cheng@sifive.com>
Co-authored-by: kito-cheng <kito.cheng@gmail.com>

	PR target/112431

gcc/ChangeLog:

	* config/riscv/constraints.md (TARGET_VECTOR ? V_REGS : NO_REGS): New register filters.
	* config/riscv/riscv.md (no,W21,W42,W84,W41,W81,W82): Ditto.
	(no,yes): Ditto.
	* config/riscv/vector.md: Support highpart register overlap for vwcvt.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr112431-1.c: New test.
	* gcc.target/riscv/rvv/base/pr112431-2.c: New test.
	* gcc.target/riscv/rvv/base/pr112431-3.c: New test.
2023-11-29 17:36:15 +08:00
Rainer Orth 77f713a64a testsuite: Handle double-quoted LTO section names [PR112728]
The gcc.dg/scantest-lto.c test FAILs on Solaris/SPARC with the native as:

FAIL: gcc.dg/scantest-lto.c scan-assembler-not ascii
FAIL: gcc.dg/scantest-lto.c scan-assembler-times ascii 0

It requires double-quoting the section name which scanasm.exp doesn't
allow for.

This patch fixes that.

Tested on sparc-sun-solaris2.11 (as and gas) and i386-pc-solaris2.11 (as
and gas).

2023-11-23  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc/testsuite:
	PR testsuite/112728
	* lib/scanasm.exp (dg-scan): Allow for double-quoted LTO section names.
	(scan-assembler-times): Likewise.
	(scan-assembler-dem-not): Likewise.
2023-11-29 10:29:50 +01:00
xuli 73a63efcda RISC-V: Add explicit braces to eliminate warning.
../.././gcc/gcc/config/riscv/riscv.cc: In function ‘void riscv_option_override()’:
../.././gcc/gcc/config/riscv/riscv.cc:8673:6: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wdangling-else]
   if (TARGET_RVE)
      ^
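
For illustration only, the warning concerns code of the shape shown in
the comment below.  The functions are hypothetical and are not the
code from riscv.cc; they show the two readings that explicit braces
can select.

```c
/* Without braces, as in

     if (outer)
       if (inner)
         r = 1;
       else
         r = 2;

   the else binds to the nearest if (the inner one), whatever the
   indentation suggests.  Braces make either reading explicit.  */

/* The else belongs to the inner if.  */
int
else_binds_inner (int outer, int inner)
{
  int r = 0;
  if (outer)
    {
      if (inner)
        r = 1;
      else
        r = 2;
    }
  return r;
}

/* The else belongs to the outer if.  */
int
else_binds_outer (int outer, int inner)
{
  int r = 0;
  if (outer)
    {
      if (inner)
        r = 1;
    }
  else
    r = 2;
  return r;
}
```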

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_option_override): Eliminate warning.
2023-11-29 08:48:15 +00:00
Jose E. Marchesi 86903dd94e testsuite: move gcc.c-torture/compile/libcall-2.c to gcc.target/i386/libcall-1.c
This patch relocates a test that is really x86 specific, and changes
it to use check_effective_target_int128.

gcc/testsuite/ChangeLog

	* gcc.c-torture/compile/libcall-2.c: Remove.
	* gcc.target/i386/libcall-1.c: Moved from
	gcc.c-torture/compile/libcall-2.c and adapted to use
	effective-target for int128_t.
2023-11-29 09:31:23 +01:00
Jakub Jelinek 77f0e4a02d c++: Fix a compile time memory leak in finish_static_assert
On Tue, Nov 28, 2023 at 11:31:48AM -0500, Jason Merrill wrote:
> Jonathan pointed out elsewhere that this gets leaked if error return
> prevents us from getting to the XDELETEVEC.

As there is a single error return in which it can leak, I've just added
a XDELETEVEC (buf); statement to that path rather than introducing some
RAII solution.
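
The pattern, sketched with plain malloc/free instead of GCC's
XNEWVEC/XDELETEVEC and with a made-up helper (this is not
finish_static_assert):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies LEN bytes of MSG into a temporary
   buffer, rejects embedded NULs and overlong messages, and returns
   the length written to OUT, or -1 on error.  */
int
copy_message (const char *msg, size_t len, char *out, size_t out_size)
{
  char *buf = (char *) malloc (len + 1);
  if (buf == NULL)
    return -1;
  memcpy (buf, msg, len);
  buf[len] = '\0';

  if (strlen (buf) != len || len >= out_size)
    {
      /* The single error return: free BUF here, otherwise it leaks.  */
      free (buf);
      return -1;
    }

  memcpy (out, buf, len + 1);
  free (buf);
  return (int) len;
}
```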

2023-11-29  Jakub Jelinek  <jakub@redhat.com>

	* semantics.cc (finish_static_assert): Free buf on error return.
2023-11-29 09:19:02 +01:00
Jakub Jelinek 9582538cf0 fold-mem-offsets: Fix powerpc64le-linux profiledbootstrap [PR111601]
The introduction of the fold-mem-offsets pass breaks profiledbootstrap
on powerpc64le-linux.
From what I can see, the pass works one basic block at a time and
will punt on any non-DEBUG_INSN uses outside of the current block
(I believe because of the
          /* This use affects instructions outside of CAN_FOLD_INSNS.  */
          if (!bitmap_bit_p (&can_fold_insns, INSN_UID (use)))
            return 0;
test, with can_fold_insns only set in do_analysis (when processing insns in
the current bb, cleared at the end) or for results of get_single_def_in_bb
(which are checked to be in the same bb)).
But, while get_single_def_in_bb checks for
  if (DF_INSN_LUID (def) > DF_INSN_LUID (insn))
    return NULL;
get_uses has no corresponding check.
The basic block in the PR in question has:
...
(insn 212 210 215 25 (set (mem/f:DI (reg/v/f:DI 10 10 [orig:152 last_viable ] [152]) [2 *last_viable_336+0 S8 A64])
        (reg/f:DI 9 9 [orig:155 _342 ] [155])) "pr111601.ii":50:17 683 {*movdi_internal64}
     (expr_list:REG_DEAD (reg/v/f:DI 10 10 [orig:152 last_viable ] [152])
        (nil)))
(insn 215 212 484 25 (set (reg:DI 5 5 [226])
        (const_int 0 [0])) "pr111601.ii":52:12 683 {*movdi_internal64}
     (expr_list:REG_EQUIV (const_int 0 [0])
        (nil)))
(insn 484 215 218 25 (set (reg/v/f:DI 10 10 [orig:152 last_viable ] [152])
        (reg/f:DI 9 9 [orig:155 _342 ] [155])) "pr111601.ii":52:12 683 {*movdi_internal64}
     (nil))
...
(insn 564 214 216 25 (set (reg/v/f:DI 10 10 [orig:152 last_viable ] [152])
        (plus:DI (reg/v/f:DI 10 10 [orig:152 last_viable ] [152])
            (const_int 96 [0x60]))) "pr111601.ii":52:12 66 {*adddi3}
     (nil))
(insn 216 564 219 25 (set (mem/f:DI (reg/v/f:DI 10 10 [orig:152 last_viable ] [152]) [2 _343->next+0 S8 A64])
        (reg:DI 5 5 [226])) "pr111601.ii":52:12 683 {*movdi_internal64}
     (expr_list:REG_DEAD (reg:DI 5 5 [226])
        (nil)))
...
and when asking for all uses of %r10 from def 564, it will see uses
in 216 and 212; the former is after the += 96 addition and gets changed
to load from %r10+96 with the addition being dropped, but there is
the other store which is a use across the backedge and when reached
from other edges certainly doesn't have the + 96 addition anywhere,
so the pass doesn't actually change that location.

This patch adds checks from get_single_def_in_bb to get_uses as well,
in particular check that the (regular non-debug) use only appears in the
same basic block as the definition and that it doesn't appear before it (i.e.
use across backedge).

2023-11-29  Jakub Jelinek  <jakub@redhat.com>

	PR bootstrap/111601
	* fold-mem-offsets.cc (get_uses): Ignore DEBUG_INSN uses.  Otherwise,
	punt if use is in a different basic block from INSN or appears before
	INSN in the same basic block.  Formatting fixes.
	(get_single_def_in_bb): Formatting fixes.
	(fold_offsets_1, pass_fold_mem_offsets::execute): Comment formatting
	fixes.

	* g++.dg/opt/pr111601.C: New test.
2023-11-29 09:14:03 +01:00
Xi Ruoyao 3f9eb37fb7 LoongArch: Use LSX for scalar FP rounding with explicit rounding mode
In the LoongArch FP base ISA there is only the frint.{s/d} instruction, which
reads the global rounding mode.  Utilize LSX for explicit rounding modes
even if the operand is scalar.  This may look like a waste of CPU power, but
it is still much faster than calling the library function.

gcc/ChangeLog:

	* config/loongarch/simd.md (LSX_SCALAR_FRINT): New int iterator.
	(VLSX_FOR_FMODE): New mode attribute.
	(<simd_for_scalar_frint_pattern><mode>2): New expander,
	expanding to vreplvei.{w/d} + frint{rp/rz/rm/rne}.{s.d}.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/vect-frint-scalar.c: New test.
	* gcc.target/loongarch/vect-frint-scalar-no-inexact.c: New test.
2023-11-29 15:07:09 +08:00
Xi Ruoyao 3c81a587ad LoongArch: Remove lrint_allow_inexact
No functional change, just a cleanup.

gcc/ChangeLog:

	* config/loongarch/loongarch.md (lrint_allow_inexact): Remove.
	(<lrint_pattern><ANYF:mode><ANYFI:mode>2): Check if <LRINT>
	== UNSPEC_FTINT instead of <lrint_allow_inexact>.
2023-11-29 15:07:09 +08:00
Xi Ruoyao 77f662a831 LoongArch: Use standard pattern name and RTX code for LSX/LASX rotate shift
Remove unnecessary UNSPECs and make the [x]vrotr[i] instructions useful
with GNU vectors and auto vectorization.

gcc/ChangeLog:

	* config/loongarch/lsx.md (bitimm): Move to ...
	(UNSPEC_LSX_VROTR): Remove.
	(lsx_vrotr_<lsxfmt>): Remove.
	(lsx_vrotri_<lsxfmt>): Remove.
	* config/loongarch/lasx.md (UNSPEC_LASX_XVROTR): Remove.
	(lsx_vrotr_<lsxfmt>): Remove.
	(lsx_vrotri_<lsxfmt>): Remove.
	* config/loongarch/simd.md (bitimm): ... here.  Expand it to
	cover LASX modes.
	(vrotr<mode>3): New define_insn.
	(vrotri<mode>3): New define_insn.
	* config/loongarch/loongarch-builtins.cc:
	(CODE_FOR_lsx_vrotr_b): Use standard pattern name.
	(CODE_FOR_lsx_vrotr_h): Likewise.
	(CODE_FOR_lsx_vrotr_w): Likewise.
	(CODE_FOR_lsx_vrotr_d): Likewise.
	(CODE_FOR_lasx_xvrotr_b): Likewise.
	(CODE_FOR_lasx_xvrotr_h): Likewise.
	(CODE_FOR_lasx_xvrotr_w): Likewise.
	(CODE_FOR_lasx_xvrotr_d): Likewise.
	(CODE_FOR_lsx_vrotri_b): Define to standard pattern name.
	(CODE_FOR_lsx_vrotri_h): Likewise.
	(CODE_FOR_lsx_vrotri_w): Likewise.
	(CODE_FOR_lsx_vrotri_d): Likewise.
	(CODE_FOR_lasx_xvrotri_b): Likewise.
	(CODE_FOR_lasx_xvrotri_h): Likewise.
	(CODE_FOR_lasx_xvrotri_w): Likewise.
	(CODE_FOR_lasx_xvrotri_d): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/vect-rotr.c: New test.
2023-11-29 15:07:09 +08:00
Xi Ruoyao cbbc3eeb07 LoongArch: Use standard pattern name and RTX code for LSX/LASX muh instructions
Remove unnecessary UNSPECs and make the muh instructions useful with
GNU vectors or auto vectorization.

gcc/ChangeLog:

	* config/loongarch/simd.md (muh): New code attribute mapping
	any_extend to smul_highpart or umul_highpart.
	(<su>mul<mode>3_highpart): New define_insn.
	* config/loongarch/lsx.md (UNSPEC_LSX_VMUH_S): Remove.
	(UNSPEC_LSX_VMUH_U): Remove.
	(lsx_vmuh_s_<lsxfmt>): Remove.
	(lsx_vmuh_u_<lsxfmt>): Remove.
	* config/loongarch/lasx.md (UNSPEC_LASX_XVMUH_S): Remove.
	(UNSPEC_LASX_XVMUH_U): Remove.
	(lasx_xvmuh_s_<lasxfmt>): Remove.
	(lasx_xvmuh_u_<lasxfmt>): Remove.
	* config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vmuh_b):
	Redefine to standard pattern name.
	(CODE_FOR_lsx_vmuh_h): Likewise.
	(CODE_FOR_lsx_vmuh_w): Likewise.
	(CODE_FOR_lsx_vmuh_d): Likewise.
	(CODE_FOR_lsx_vmuh_bu): Likewise.
	(CODE_FOR_lsx_vmuh_hu): Likewise.
	(CODE_FOR_lsx_vmuh_wu): Likewise.
	(CODE_FOR_lsx_vmuh_du): Likewise.
	(CODE_FOR_lasx_xvmuh_b): Likewise.
	(CODE_FOR_lasx_xvmuh_h): Likewise.
	(CODE_FOR_lasx_xvmuh_w): Likewise.
	(CODE_FOR_lasx_xvmuh_d): Likewise.
	(CODE_FOR_lasx_xvmuh_bu): Likewise.
	(CODE_FOR_lasx_xvmuh_hu): Likewise.
	(CODE_FOR_lasx_xvmuh_wu): Likewise.
	(CODE_FOR_lasx_xvmuh_du): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/vect-muh.c: New test.
2023-11-29 15:07:08 +08:00
Xi Ruoyao 530348c418 LoongArch: Fix usage of LSX and LASX frint/ftint instructions [PR112578]
The usage of LSX and LASX frint/ftint instructions had some problems:

1. These instructions raise FE_INEXACT, which is not allowed with
   -fno-fp-int-builtin-inexact for most C2x section F.10.6 functions
   (the only exceptions are rint, lrint, and llrint).
2. The "frint" instruction without explicit rounding mode is used for
   roundM2.  This is incorrect because roundM2 is defined as "rounding
   operand 1 to the *nearest* integer, rounding away from zero in the
   event of a tie".  We actually don't have such an instruction.  Our
   frintrne instruction is roundevenM2 (unfortunately, this is not
   documented).
3. These define_insn's are written in a way that is not easy to hack on.

So I removed these instructions and created a "simd.md" file, then added
them and the corresponding expanders there.  The advantage of the
simd.md file is we don't need to duplicate the RTL template twice (in
lsx.md and lasx.md).
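
To make the tie-breaking difference in item 2 concrete, here is a small
C sketch.  The two helpers are hypothetical and hand-written for this
illustration (they are not GCC or libm code); they assume a finite
input that fits in long long, and they model neither the sign of zero
nor FE_INEXACT.

```c
/* Ties away from zero, the behaviour required of round()/roundM2.  */
double round_ties_away(double x)
{
    double t = (double)(long long)x;   /* truncate toward zero */
    double frac = x - t;               /* fractional part, same sign as x */
    if (frac >= 0.5)
        return t + 1.0;
    if (frac <= -0.5)
        return t - 1.0;
    return t;
}

/* Ties to even, the behaviour of roundeven()/roundevenM2.  */
double round_ties_even(double x)
{
    long long ti = (long long)x;       /* truncate toward zero */
    double t = (double)ti;
    double frac = x - t;
    if (frac > 0.5 || (frac == 0.5 && (ti & 1)))
        return t + 1.0;
    if (frac < -0.5 || (frac == -0.5 && (ti & 1)))
        return t - 1.0;
    return t;
}
```

The two functions agree everywhere except on exact ties such as 2.5,
which is exactly the case where using a round-to-nearest-even
instruction for roundM2 would give the wrong result.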

gcc/ChangeLog:

	PR target/112578
	* config/loongarch/lsx.md (UNSPEC_LSX_VFTINT_S,
	UNSPEC_LSX_VFTINTRNE, UNSPEC_LSX_VFTINTRP,
	UNSPEC_LSX_VFTINTRM, UNSPEC_LSX_VFRINTRNE_S,
	UNSPEC_LSX_VFRINTRNE_D, UNSPEC_LSX_VFRINTRZ_S,
	UNSPEC_LSX_VFRINTRZ_D, UNSPEC_LSX_VFRINTRP_S,
	UNSPEC_LSX_VFRINTRP_D, UNSPEC_LSX_VFRINTRM_S,
	UNSPEC_LSX_VFRINTRM_D): Remove.
	(ILSX, FLSX): Move into ...
	(VIMODE): Move into ...
	(FRINT_S, FRINT_D): Remove.
	(frint_pattern_s, frint_pattern_d, frint_suffix): Remove.
	(lsx_vfrint_<flsxfmt>, lsx_vftint_s_<ilsxfmt>_<flsxfmt>,
	lsx_vftintrne_w_s, lsx_vftintrne_l_d, lsx_vftintrp_w_s,
	lsx_vftintrp_l_d, lsx_vftintrm_w_s, lsx_vftintrm_l_d,
	lsx_vfrintrne_s, lsx_vfrintrne_d, lsx_vfrintrz_s,
	lsx_vfrintrz_d, lsx_vfrintrp_s, lsx_vfrintrp_d,
	lsx_vfrintrm_s, lsx_vfrintrm_d,
	<FRINT_S:frint_pattern_s>v4sf2,
	<FRINT_D:frint_pattern_d>v2df2, round<mode>2,
	fix_trunc<mode>2): Remove.
	* config/loongarch/lasx.md: Likewise.
	* config/loongarch/simd.md: New file.
	(ILSX, ILASX, FLSX, FLASX, VIMODE): ... here.
	(IVEC, FVEC): New mode iterators.
	(VIMODE): ... here.  Extend it to work for all LSX/LASX vector
	modes.
	(x, wu, simd_isa, WVEC, vimode, simdfmt, simdifmt_for_f,
	elebits): New mode attributes.
	(UNSPEC_SIMD_FRINTRP, UNSPEC_SIMD_FRINTRZ, UNSPEC_SIMD_FRINT,
	UNSPEC_SIMD_FRINTRM, UNSPEC_SIMD_FRINTRNE): New unspecs.
	(SIMD_FRINT): New int iterator.
	(simd_frint_rounding, simd_frint_pattern): New int attributes.
	(<simd_isa>_<x>vfrint<simd_frint_rounding>_<simdfmt>): New
	define_insn template for frint instructions.
	(<simd_isa>_<x>vftint<simd_frint_rounding>_<simdifmt_for_f>_<simdfmt>):
	Likewise, but for ftint instructions.
	(<simd_frint_pattern><mode>2): New define_expand with
	flag_fp_int_builtin_inexact checked.
	(l<simd_frint_pattern><mode><vimode>2): Likewise.
	(ftrunc<mode>2): New define_expand.  It does not require
	flag_fp_int_builtin_inexact.
	(fix_trunc<mode><vimode>2): New define_insn_and_split.  It does
	not require flag_fp_int_builtin_inexact.
	(include): Add lsx.md and lasx.md.
	* config/loongarch/loongarch.md (include): Include simd.md,
	instead of including lsx.md and lasx.md directly.
	* config/loongarch/loongarch-builtins.cc
	(CODE_FOR_lsx_vftint_w_s, CODE_FOR_lsx_vftint_l_d,
	CODE_FOR_lasx_xvftint_w_s, CODE_FOR_lasx_xvftint_l_d):
	Remove.

gcc/testsuite/ChangeLog:

	PR target/112578
	* gcc.target/loongarch/vect-frint.c: New test.
	* gcc.target/loongarch/vect-frint-no-inexact.c: New test.
	* gcc.target/loongarch/vect-ftint.c: New test.
	* gcc.target/loongarch/vect-ftint-no-inexact.c: New test.
2023-11-29 15:07:08 +08:00
Alexandre Oliva 862867eab7 Introduce hardbool attribute for C
This patch introduces hardened booleans in C.  The hardbool attribute,
when attached to an integral type, turns it into an enumerate type
with boolean semantics, using the named or implied constants as
representations for false and true.

Expressions of such types decay to _Bool, trapping if the value is
neither true nor false, and _Bool can convert implicitly back to them.
Other conversions go through _Bool first.
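
The semantics described above can be emulated in plain C, as a rough
sketch.  This does not use the attribute itself; the type name, the
helper names, and the representation values 0x5a and 0xa5 are
assumptions made for this illustration, and abort() stands in for the
trap.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical representation values for false and true.  */
enum { HB_FALSE = 0x5a, HB_TRUE = 0xa5 };

typedef unsigned char hb_repr;

/* _Bool converts implicitly to the hardened type: pick the representation.  */
hb_repr hb_from_bool(bool b)
{
    return b ? HB_TRUE : HB_FALSE;
}

/* The hardened value decays to _Bool, trapping on anything that is
   neither the true nor the false representation.  */
bool hb_to_bool(hb_repr r)
{
    if (r == HB_TRUE)
        return true;
    if (r == HB_FALSE)
        return false;
    abort();
}
```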


for  gcc/c-family/ChangeLog

	* c-attribs.cc (c_common_attribute_table): Add hardbool.
	(handle_hardbool_attribute): New.
	(type_valid_for_vector_size): Reject hardbool.
	* c-common.cc (convert_and_check): Skip warnings for convert
	and check for hardbool.
	(c_hardbool_type_attr_1): New.
	* c-common.h (c_hardbool_type_attr): New.

for  gcc/c/ChangeLog

	* c-typeck.cc (convert_lvalue_to_rvalue): Decay hardbools.
	* c-convert.cc (convert): Convert to hardbool through
	truthvalue.
	* c-decl.cc (check_bitfield_type_and_width): Skip enumeral
	truncation warnings for hardbool.
	(finish_struct): Propagate hardbool attribute to bitfield
	types.
	(digest_init): Convert to hardbool.

for  gcc/ChangeLog

	* doc/extend.texi (hardbool): New type attribute.
	* doc/invoke.texi (-ftrivial-auto-var-init): Document
	representation vs values.

for  gcc/testsuite/ChangeLog

	* gcc.dg/hardbool-err.c: New.
	* gcc.dg/hardbool-trap.c: New.
	* gcc.dg/torture/hardbool.c: New.
	* gcc.dg/torture/hardbool-s.c: New.
	* gcc.dg/torture/hardbool-us.c: New.
	* gcc.dg/torture/hardbool-i.c: New.
	* gcc.dg/torture/hardbool-ul.c: New.
	* gcc.dg/torture/hardbool-ll.c: New.
	* gcc.dg/torture/hardbool-5a.c: New.
	* gcc.dg/torture/hardbool-s-5a.c: New.
	* gcc.dg/torture/hardbool-us-5a.c: New.
	* gcc.dg/torture/hardbool-i-5a.c: New.
	* gcc.dg/torture/hardbool-ul-5a.c: New.
	* gcc.dg/torture/hardbool-ll-5a.c: New.
2023-11-29 04:00:45 -03:00
Alexandre Oliva 0d24289d12 call maybe_return_this in build_clone
__dt_base doesn't get its body from a maybe_return_this caller, it's
rather cloned with the full body within build_clone, and then it's
left alone, without going through finish_function_body or
build_delete_destructor_body, that call maybe_return_this.

Now, this is correct as far as the generated code is concerned, since
the cloned body of a cdtor that returns this is also a cdtor body that
returns this.  The problem is that the decl for THIS is also cloned,
and it doesn't get the warning suppression introduced by
maybe_return_this, so Wuse-after-free3.C fails with an excess warning
at the closing brace of the dtor body.

I've split out the warning suppression from maybe_return_this, and
arranged to call that bit from the relevant build_clone case.
Unfortunately, because the warning is silenced for all uses of the
THIS decl, rather than only for the ABI-mandated return stmt, this
also silences the very warning that the testcase checks for.

I'm not revamping the warning suppression approach to overcome this,
so I'm xfailing the expected warning on ARM EABI, hoping that's the
only target with cdtor_return_this, and leaving it at that.


for  gcc/cp/ChangeLog

	* decl.cc (maybe_prepare_return_this): Split out of...
	(maybe_return_this): ... this.
	* cp-tree.h (maybe_prepare_return_this): Declare.
	* class.cc (build_clone): Call it.

for  gcc/testsuite/ChangeLog

	* g++.dg/warn/Wuse-after-free3.C: xfail on arm_eabi.
2023-11-29 04:00:35 -03:00
Alexandre Oliva 71804526d3 c++: for contracts, cdtors never return this
When targetm.cxx.cdtor_return_this() holds, cdtors have a
non-VOID_TYPE_P result, but IMHO this ABI implementation detail
shouldn't leak to the abstract language conceptual framework, in which
cdtors don't have return values.  For contracts, specifically those
that establish postconditions on results, such a leakage is present,
and the present patch puts an end to it: with it, cdtors get an error
for result postconditions regardless of the ABI.  This fixes
g++.dg/contracts/contracts-ctor-dtor2.C on arm-eabi.


for  gcc/cp/ChangeLog

	* contracts.cc (check_postcondition_result): Cope with
	cdtor_return_this.
2023-11-29 04:00:28 -03:00
Alexandre Oliva 1ff6d9f742 Introduce -finline-stringops
try_store_by_multiple_pieces was added not long ago, enabling
variable-sized memset to be expanded inline when the worst-case
in-range constant length would be, using conditional blocks with powers
of two to cover all possibilities of length and alignment.

This patch introduces -finline-stringops[=fn] to request expansions to
start with a loop, so as to still take advantage of known alignment
even with long lengths, but without necessarily adding store blocks
for every power of two.

This makes it possible for the supported stringops (memset, memcpy,
memmove, memcmp) to be expanded, even if storing a single byte per
iteration.  Surely efficient implementations can run faster, with a
pre-loop to increase alignment, but that would likely be excessive for
inline expansions.

Still, in some cases, such as in freestanding environments, users
prefer to inline such stringops, especially those that the compiler
may introduce itself, even if the expansion is not as performant as a
highly optimized C library implementation could be, to avoid
depending on a C runtime library.
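
As a conceptual sketch only, the simplest form such an expansion could
take is a loop that stores one byte per iteration and calls no library
function.  The function name is hypothetical, and this is not the code
GCC actually emits.

```c
#include <stddef.h>

/* Hypothetical stand-in for an inline memset expansion:
   one byte stored per iteration, no call into a C library.  */
void *inline_memset_sketch(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    while (n--)
        *p++ = (unsigned char)c;
    return dst;
}
```

A tuned library routine would typically align the pointer first and
then use wider stores; the loop above trades that speed for
independence from the runtime.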


for  gcc/ChangeLog

	* expr.cc (emit_block_move_hints): Take ctz of len.  Obey
	-finline-stringops.  Use oriented or sized loop.
	(emit_block_move): Take ctz of len, and pass it on.
	(emit_block_move_via_sized_loop): New.
	(emit_block_move_via_oriented_loop): New.
	(emit_block_move_via_loop): Take incr.  Move an incr-sized
	block per iteration.
	(emit_block_cmp_via_cmpmem): Take ctz of len.  Obey
	-finline-stringops.
	(emit_block_cmp_via_loop): New.
	* expr.h (emit_block_move): Add ctz of len defaulting to zero.
	(emit_block_move_hints): Likewise.
	(emit_block_cmp_hints): Likewise.
	* builtins.cc (expand_builtin_memory_copy_args): Pass ctz of
	len to emit_block_move_hints.
	(try_store_by_multiple_pieces): Support starting with a loop.
	(expand_builtin_memcmp): Pass ctz of len to
	emit_block_cmp_hints.
	(expand_builtin): Allow inline expansion of memset, memcpy,
	memmove and memcmp if requested.
	* common.opt (finline-stringops): New.
	(ilsop_fn): New enum.
	* flag-types.h (enum ilsop_fn): New.
	* doc/invoke.texi (-finline-stringops): Add.

for  gcc/testsuite/ChangeLog

	* gcc.dg/torture/inline-mem-cmp-1.c: New.
	* gcc.dg/torture/inline-mem-cpy-1.c: New.
	* gcc.dg/torture/inline-mem-cpy-cmp-1.c: New.
	* gcc.dg/torture/inline-mem-move-1.c: New.
	* gcc.dg/torture/inline-mem-set-1.c: New.
2023-11-29 04:00:24 -03:00
Pan Li 25a51e98fd RISC-V: Bugfix for ICE in block move when zve32f
exact_div requires the dividend to be an exact multiple of the divisor.
Unfortunately, this condition can be violated with zve32f in
some cases.  For example,

potential_ew is 8
BYTES_PER_RISCV_VECTOR * lmul1 is [4, 4]

This patch ensures that the precondition of exact_div is met
when get_vec_mode is used.
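
A minimal C sketch of such a precondition check, assuming the division
in question is the "[4, 4]" value divided by 8.  The struct and
function names are hypothetical and only mimic a two-coefficient value
c0 + c1 * x; they are not GCC's poly_int API.

```c
#include <stdbool.h>

/* Hypothetical two-coefficient value c0 + c1 * x, as in "[4, 4]".  */
struct poly2 {
    long c0;
    long c1;
};

/* True when every coefficient is an exact multiple of d.  */
bool is_exact_multiple(struct poly2 v, long d)
{
    return d != 0 && v.c0 % d == 0 && v.c1 % d == 0;
}

/* Divide only when the precondition holds; otherwise report failure
   instead of asserting.  */
bool try_exact_div(struct poly2 v, long d, struct poly2 *out)
{
    if (!is_exact_multiple(v, d))
        return false;
    out->c0 = v.c0 / d;
    out->c1 = v.c1 / d;
    return true;
}
```

With the values from the example, try_exact_div on {4, 4} and 8 fails,
so a caller would fall back to another strategy rather than hit the
assertion.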

	PR target/112743

gcc/ChangeLog:

	* config/riscv/riscv-string.cc (expand_block_move): Add
	precondition check for exact_div.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr112743-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
2023-11-29 14:50:29 +08:00
Jose E. Marchesi 4ed0740c6e testsuite: fix gcc.c-torture/compile/libcall-2.c in -m32
This test relies on having __int128 in x86_64 targets, which is only
available in -m64.
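
For reference, one way source code can guard its own use of __int128
is the predefined macro __SIZEOF_INT128__.  The function name below is
hypothetical, and this is only a sketch of such a guard, not what the
testsuite change does.

```c
/* Returns 1 when the compiler provides a 16-byte __int128 for this
   target, and 0 otherwise (for example x86 with -m32).  */
int have_int128(void)
{
#ifdef __SIZEOF_INT128__
    return sizeof(__int128) == 16;
#else
    return 0;
#endif
}
```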

gcc/testsuite/ChangeLog

	* gcc.c-torture/compile/libcall-2.c: Skip test in -m32.
2023-11-29 07:44:59 +01:00
Hongyu Wang 99fa0bfd63 [i386] Fix push2pop2 test fail on non-linux target [PR112729]
On Linux x86-64, -fomit-frame-pointer is enabled by default, so the
CFI scans in the push2pop2 tests are based on it.  On other targets
that use -fno-omit-frame-pointer, the CFI scans will be wrong because
the frame pointer is pushed first.  Add -fomit-frame-pointer to the
tests that rely on CFI scans.

gcc/testsuite/ChangeLog:

	PR target/112729
	* gcc.target/i386/apx-interrupt-1.c: Add -fomit-frame-pointer.
	* gcc.target/i386/apx-push2pop2-1.c: Likewise.
	* gcc.target/i386/apx-push2pop2_force_drap-1.c: Likewise.
2023-11-29 08:49:03 +08:00
GCC Administrator 6c85b8a987 Daily bump. 2023-11-29 00:17:27 +00:00
Jason Merrill 305a2686c9 c++: prvalue array decay [PR94264]
My change for PR53220 made array to pointer decay for prvalue arrays
ill-formed to catch well-defined C code that produces a dangling pointer in
C++ due to the shorter lifetime of compound literals.  This wasn't really
correct, but wasn't a problem until C++17 added prvalue arrays, at which
point it started rejecting valid C++ code.

I wanted to make sure that we still diagnose the problematic code;
-Wdangling-pointer covers the array-lit.c case, but I needed to extend
-Wreturn-local-addr to handle the return case.
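
A small C sketch of the kind of code in question (the function name is
hypothetical).  In C the compound literal lives until the end of the
enclosing block, so the pointer stays valid; as the message notes, the
shorter lifetime in C++ would leave the same pointer dangling.

```c
int sum_literal(void)
{
    /* Array-to-pointer decay of a compound literal.  In C the unnamed
       array has the lifetime of the enclosing block, so p is valid
       for the rest of this function.  */
    int *p = (int[]){1, 2, 3};
    return p[0] + p[1] + p[2];
}
```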

	PR c++/94264
	PR c++/53220

gcc/c/ChangeLog:

	* c-typeck.cc (array_to_pointer_conversion): Adjust -Wc++-compat
	diagnostic.

gcc/cp/ChangeLog:

	* call.cc (convert_like_internal): Remove obsolete comment.
	* typeck.cc (decay_conversion): Allow array prvalue.
	(maybe_warn_about_returning_address_of_local): Check
	for returning pointer to temporary.

gcc/testsuite/ChangeLog:

	* c-c++-common/array-lit.c: Adjust.
	* g++.dg/cpp1z/array-prvalue1.C: New test.
	* g++.dg/ext/complit17.C: New test.
2023-11-28 16:29:19 -05:00
Roger Sayle 3d104d93a7 ARC: Consistent use of whitespace in assembler templates.
This minor clean-up patch tweaks arc.md to use whitespace consistently
in output templates, always using a TAB between the mnemonic and its
operands, and avoiding spaces after commas betweem operands.  There
should be no functional changes with this patch, though several test
cases' scan-assembler need to be updated to use \\s+ instead of testing
for a TAB or a space explicitly.

2023-11-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/arc/arc.md: Make output template whitespace consistent.

gcc/testsuite/ChangeLog
	* gcc.target/arc/jli-1.c: Update dg-final whitespace.
	* gcc.target/arc/jli-2.c: Likewise.
	* gcc.target/arc/naked-1.c: Likewise.
	* gcc.target/arc/naked-2.c: Likewise.
	* gcc.target/arc/tmac-1.c: Likewise.
	* gcc.target/arc/tmac-2.c: Likewise.
2023-11-28 18:34:56 +00:00
Jose E. Marchesi 880ae958fa varasm.cc: refer to assemble_external_libcall only ifdef ASM_OUTPUT_EXTERNAL
This fixes bootstrap on targets where ASM_OUTPUT_EXTERNAL is not
defined.

gcc/ChangeLog

	* varasm.cc (assemble_external_libcall): Refer in assert only ifdef
	ASM_OUTPUT_EXTERNAL.
2023-11-28 19:21:32 +01:00
Andrew Pinski 68ffaf8398 MATCH: Fix invalid signed boolean type usage
This fixes the incorrect assumption made in r14-3721-ge6bcf839894783
that doing the negation after the conversion would be valid;
it is not valid for boolean types.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

	PR tree-optimization/112738
	* match.pd (`(nop_convert)-(convert)a`): Reject
	when the outer type is boolean.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2023-11-28 09:49:47 -08:00
Jakub Jelinek b73fa20615 c++: Fix up __has_extension (cxx_init_captures)
On Mon, Nov 27, 2023 at 10:58:04AM +0000, Alex Coplan wrote:
> Many thanks both for the reviews, this is now pushed (with Jason's
> above changes implemented) as g:06280a906cb3dc80cf5e07cf3335b758848d488d.

The new test FAILs everywhere with GXX_TESTSUITE_STDS=98,11,14,17,20,2b
I'm normally using for testing.
FAIL: g++.dg/ext/has-feature.C  -std=gnu++11 (test for excess errors)
Excess errors:
/home/jakub/src/gcc/gcc/testsuite/g++.dg/ext/has-feature.C:185:2: error: #error

This is on
 #if __has_extension (cxx_init_captures) != CXX11
 #error
 #endif
Comparing the values with clang++ on godbolt and with what is actually
implemented:
void foo () { auto a = [b = 3]() { return b; }; }
both clang++ and GCC implement init captures as extension already in C++11
(and obviously not in C++98 because lambdas aren't implemented there),
unless -pedantic-errors/-Werror=pedantic, so I think we should change
the FE to match the test rather than the other way around.

Making __has_extension return __has_feature for -pedantic-errors and not
for -Werror=pedantic is just weird, but as that is what clang++ implements
and this is for compatibility with it, I can live with it (but perhaps
we should mention it in the documentation).  Note, the warnings/errors
can be changed using pragmas inside of the source, so whether one can
use an extension or not depends on where in the code it is (__extension__
to the rescue if it can be specified around it).
I wonder if the has-feature.C test shouldn't be #included in other 2 tests,
one where -pedantic-errors would be in dg-options and through some macro
tell the file that __has_extension will behave like __has_feature, and
another with -Werror=pedantic to document that the option doesn't change
it.

2023-11-28  Jakub Jelinek  <jakub@redhat.com>

	* cp-objcp-common.cc (cp_feature_table): Evaluate
	__has_extension (cxx_init_captures) to 1 even for -std=c++11.
2023-11-28 18:25:14 +01:00
Simon Wright 396db92d3a Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime
In gcc/ada/adaint.c(__gnat_get_file_names_case_sensitive), the current
assumption for __APPLE__ is that file names are case-insensitive
unless __arm__ or __arm64__ are defined, in which case file names are
declared case-sensitive.

The associated comment is
  "By default, we suppose filesystems aren't case sensitive on
  Windows and Darwin (but they are on arm-darwin)."

This means that on aarch64-apple-darwin, file names are treated as
case-sensitive, which is not the default case.

The true default position is that macOS file systems are
case-insensitive, iOS file systems are case-sensitive.

Apple provide a header file <TargetConditionals.h> which permits a
compile-time check for the compiler target (e.g. OSX vs IOS); if
TARGET_OS_IOS is defined as 1, this is a build for iOS.
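
A minimal sketch of such a compile-time check.  The function name is
hypothetical, and returning 1 for targets that are neither Apple nor
Windows is an assumption made for this illustration; this is not the
adaint.c code.

```c
#if defined(__APPLE__)
#include <TargetConditionals.h>
#endif

/* Hypothetical helper: 1 if file names are case-sensitive by default.  */
int default_file_names_case_sensitive(void)
{
#if defined(__APPLE__)
# if defined(TARGET_OS_IOS) && TARGET_OS_IOS
    return 1;   /* iOS: case-sensitive by default */
# else
    return 0;   /* macOS: case-insensitive by default */
# endif
#elif defined(_WIN32)
    return 0;   /* Windows: case-insensitive by default */
#else
    return 1;   /* assumption for this sketch: other targets */
#endif
}
```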

2023-11-22  Simon Wright  <simon@pushface.org>

gcc/ada/

	PR ada/111909
	* adaint.c
	(__gnat_get_file_names_case_sensitive): Split out the __APPLE__
	check and remove the checks for __arm__, __arm64__. For Apple,
	file names are by default case-insensitive unless TARGET_OS_IOS is
	set.

Signed-off-by: Simon Wright <simon@pushface.org>
2023-11-28 17:47:10 +01:00
Richard Biener f45d5e30bd middle-end/112741 - ICE with gimple FE and later regimplification
The GIMPLE frontend, when bypassing gimplification, doesn't set
DECL_SEEN_IN_BIND_EXPR_P given there are no such things in GIMPLE.
But it probably should set the flag anyway to avoid later ICEs
when regimplifying.

	PR middle-end/112741
gcc/c/
	* gimple-parser.cc (c_parser_parse_gimple_body): Also
	set DECL_SEEN_IN_BIND_EXPR_P for locals.

gcc/testsuite/
	* gcc.dg/ubsan/pr112741.c: New testcase.
2023-11-28 16:58:34 +01:00
Richard Biener f26d68d5d1 middle-end/112732 - stray TYPE_ALIAS_SET in type variant
The following fixes a stray TYPE_ALIAS_SET in a type variant built
by build_opaque_vector_type which is diagnosed by type checking
enabled with -flto.

	PR middle-end/112732
	* tree.cc (build_opaque_vector_type): Reset TYPE_ALIAS_SET
	of the newly built type.
2023-11-28 16:58:08 +01:00
Uros Bizjak 99db2ce241 i386: Improve cmpstrnqi_1 insn pattern [PR112494]
REPZ CMPSB instruction does not update FLAGS register when %ecx register
equals zero.  Improve cmpstrnqi_1 insn pattern to set FLAGS_REG to its
previous value instead of (const_int 0) when operand 2 equals zero.
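
The behaviour being modelled can be sketched in C as follows.  The
function is a hypothetical software model that tracks only the zero
flag; it is not the insn pattern itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of REPZ CMPSB, tracking only ZF.
   zf_in is the flag value before the instruction executes.  */
bool repz_cmpsb_zf(const unsigned char *a, const unsigned char *b,
                   size_t count, bool zf_in)
{
    bool zf = zf_in;   /* count == 0: no compare runs, flag is untouched */
    while (count--) {
        zf = (*a++ == *b++);
        if (!zf)
            break;     /* REPZ stops at the first mismatch */
    }
    return zf;
}
```

With a zero count the result equals zf_in, which is why the pattern
should describe FLAGS_REG as keeping its previous value rather than
being set to a constant.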

	PR target/112494

gcc/ChangeLog:

	* config/i386/i386.md (cmpstrnqi_1): Set FLAGS_REG to its previous
	value when operand 2 equals zero.
	(*cmpstrnqi_1): Ditto.
	(*cmpstrnqi_1 peephole2): Ditto.
2023-11-28 16:57:25 +01:00
Cupertino Miranda 82273cd6ed Revert "This patch enables errors when external calls are created."
The reverted commit was breaking the BPF build, because libgcc emits
libcalls to __builtin_abort.

This reverts commit faf5b14858.
2023-11-28 15:47:32 +00:00
Andrew Jenner b247e917ff Fortran: fix reallocation on assignment of polymorphic variables [PR110415]
This patch fixes two bugs related to polymorphic class assignment in the
Fortran front-end. One (described in PR110415) is an issue with the malloc
and realloc calls using the size from the old vptr rather than the new one.
The other is caused by the return value from the realloc call being ignored.
Testcases are added for these issues.
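
As a C analogy for the second bug (this is not the code the Fortran
front-end generates), the result of realloc must replace the old
pointer, because the block may have moved.  The helper name is
hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: resize *buf to new_size bytes.
   Returns 0 on success and -1 on failure; on failure *buf is unchanged.  */
int resize_buffer(void **buf, size_t new_size)
{
    void *tmp;

    if (new_size == 0)
        return -1;             /* keep the sketch away from realloc(p, 0) */
    tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;             /* the old block is still valid */
    *buf = tmp;                /* use the return value: the block may have moved */
    return 0;
}
```

In terms of the first bug, new_size would correspond to the size taken
from the new (right-hand side) vptr rather than the old one.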

2023-11-28  Andrew Jenner  <andrew@codesourcery.com>

gcc/fortran/
	PR fortran/110415
	* trans-expr.cc (trans_class_vptr_len_assignment): Add
	from_vptrp parameter. Populate it. Don't check for DECL_P
	when deciding whether to create temporary.
	(trans_class_pointer_fcn, gfc_trans_pointer_assignment): Add
	NULL argument to trans_class_vptr_len_assignment calls.
	(trans_class_assignment): Get rhs_vptr from
	trans_class_vptr_len_assignment and use it for determining size
	for allocation/reallocation. Use return value from realloc.

gcc/testsuite/
	PR fortran/110415
	* gfortran.dg/pr110415.f90: New test.
	* gfortran.dg/asan/pr110415-2.f90: New test.
	* gfortran.dg/asan/pr110415-3.f90: New test.

Co-Authored-By: Tobias Burnus  <tobias@codesourcery.com>
2023-11-28 15:27:05 +00:00
Jose E. Marchesi f31a019d11 Emit funcall external declarations only if actually used.
There are many places in GCC where alternative local sequences are
tried in order to determine what is the cheapest or best alternative
to use in the current target.  When any of these sequences involve a
libcall, the current implementation of emit_library_call_value_1
introduces a side effect: it emits an external declaration
for the funcall (such as __divdi3), which is thus emitted even if the
sequence that does the libcall is not retained.

This is problematic in targets such as BPF, because the kernel loader
chokes on the spurious symbol __divdi3 and makes the resulting BPF
object unloadable.  Note that BPF objects are not linked before being
loaded.

This patch changes assemble_external_libcall to defer emitting
declarations of external libcall symbols, by saving the call tree
nodes in a temporary list pending_libcall_symbols and letting
process_pending_assemble_externals emit them only if they have been
referenced.  Solution suggested and sketched by Richard Sandiford.
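
The deferral idea can be sketched in C roughly as follows.  All names
and the fixed-size table are hypothetical simplifications for this
illustration; the real code works on tree nodes in varasm.cc.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { MAX_PENDING = 16 };

struct pending_sym {
    const char *name;
    bool referenced;
};

static struct pending_sym pending[MAX_PENDING];
static size_t n_pending;

/* Queue a symbol instead of declaring it right away.
   Returns false only when the table is full.  */
bool defer_symbol(const char *name)
{
    for (size_t i = 0; i < n_pending; i++)
        if (strcmp(pending[i].name, name) == 0)
            return true;               /* already queued */
    if (n_pending == MAX_PENDING)
        return false;
    pending[n_pending].name = name;
    pending[n_pending].referenced = false;
    n_pending++;
    return true;
}

/* Record that the retained code really uses the symbol.  */
void mark_symbol_referenced(const char *name)
{
    for (size_t i = 0; i < n_pending; i++)
        if (strcmp(pending[i].name, name) == 0)
            pending[i].referenced = true;
}

/* Copy the referenced names into out (at most cap of them), clear the
   queue, and return how many names were copied.  */
size_t flush_pending_symbols(const char **out, size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < n_pending; i++)
        if (pending[i].referenced && n < cap)
            out[n++] = pending[i].name;
    n_pending = 0;
    return n;
}
```

A symbol that was only queued by a discarded trial sequence is never
marked as referenced, so it is dropped at flush time instead of ending
up in the object file.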

Regtested in x86_64-linux-gnu.
Tested with host x86_64-linux-gnu with target bpf-unknown-none.

gcc/ChangeLog

	PR target/109253
	* varasm.cc (pending_libcall_symbols): New variable.
	(process_pending_assemble_externals): Process
	pending_libcall_symbols.
	(assemble_external_libcall): Defer emitting external libcall
	symbols to process_pending_assemble_externals.

gcc/testsuite/ChangeLog

	PR target/109253
	* gcc.target/bpf/divmod-libcall-1.c: New test.
	* gcc.target/bpf/divmod-libcall-2.c: Likewise.
	* gcc.c-torture/compile/libcall-2.c: Likewise.
2023-11-28 16:01:09 +01:00
Rainer Orth 8f8db55539 libsanitizer: Update LOCAL_PATCHES
2023-11-28  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	libsanitizer:
	* LOCAL_PATCHES: Update.
2023-11-28 15:00:31 +01:00