[Task description]
We are upgrading to LLVM 15. It would be helpful to know what is new in LLVM 15 compared to our current toolchain (LLVM 12).
It doesn't need to be too detailed; just the big changes, such as new features and optimizations.
[Solution]
[Task source]
The changes of interest are new features, optimizations, tooling enhancements, deprecations and breaking changes. Deprecations deserve attention too: a deprecation can turn into a breaking change in a later version, so it is better to prepare early for a seamless migration.
The original change lists can be found in the release notes of the LLVM and Clang projects:
LLVM13, LLVM14, LLVM15, Clang13, Clang14, Clang15.
I've summarized them in a structured way. You can skip the sections that don't interest you and go straight to what is relevant.
However, not all the changes can be found in release notes: some weak guarantees may be broken for minor reasons that are not highlighted at all (like C++ demangling changes made for debugging simplification).
The most noticeable changes are:
! Note the major breaking changes:
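- The maximum allowed integer type width was reduced from 2^24-1 bits to 2^23 bits.
- The maximum allowed alignment was increased from 2^29 to 2^32.
- The constant expression variants of the following instructions were removed: `extractvalue`, `insertvalue`, `udiv`, `sdiv`, `urem`, `srem`, `fadd`, `fsub`, `fmul`, `fdiv`.
- Support was added for `fmax` and `fmin` in the `atomicrmw` instruction. The comparison is expected to match the behavior of `llvm.maxnum.*` and `llvm.minnum.*` respectively.
- `callbr` instructions no longer use blockaddress arguments for labels. Instead, label constraints starting with `!` refer directly to entries in the `callbr` indirect destination list.
The main efforts went into the ARM, AArch64, RISC-V and PowerPC backends.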
More architectures and CPUs are supported.
AArch64 has auto-vectorization now.
Various extensions were added and stabilized for RISC-V. Various optimizations were also implemented and improved for RISC-V.
Arch:
Features:
- `tune-cpu` function attribute to support the use of the `-mtune` frontend flag. This allows certain scheduling features and optimisations to be enabled independently of the architecture. If the `tune-cpu` attribute is absent, tuning follows the `target-cpu`.
Fixes:
Arch:
Features:
- `-mframe-chain=(none|aapcs|aapcs+leaf)` command-line option, which controls the generation of AAPCS-compliant Frame Records.
Fixes:
Optimizations:
- `addw`/`subw`/`mulw`/`slliw` instructions and removal of redundant `sext.w` instructions (using the new `RISCVSExtWRemoval` pass).
- `RISCVRedundantCopyElimination` pass was added to remove unnecessary zero copies.
- `CodeGenPrepare` pass was added.
- `-Oz`. Additionally, the newly introduced `RISCVMakeCompressible` pass modifies instructions prior to emission at `-Oz` in order to increase opportunities for compression with the RISC-V C extension.
- `RISCVSExtWRemoval` and `RISCVMergeBaseOffset`.
Extensions:
- `Zba`, `Zbb`, `Zbc`, and `Zbs` bit-manipulation extensions were updated to version 1.0 and are no longer experimental.
- `Zfh` and `Zfhmin` extensions for half-precision floating point were updated to version 1.0 and are no longer experimental.
- `Zbproposedc` extension was removed, as was the `B` extension (including all bit-manipulation sub-extensions). Individual `Zb*` extensions should be used instead.
- `Zvfh` extension was added, enabling half-precision floating point in vectors.
- `Zihintpause` (Pause Hint) extension.
- `Zfinx` and `Zdinx` (float / double in integer register) extensions.
- `Zicbom`, `Zicboz`, and `Zicbop` cache management operation extensions.
- `Zmmul` extension (a subextension of the M extension, adding multiplication instructions only).
- `.insn` directive.
- `Sscofpmf`, `Smstateen`, and `Sstc` extensions.
Etc:
See the change lists for 12.0.0, 13.0.0, 14.0.0 and 15.0.0 for details.
Changes and deprecations in CLI flags can affect your automation scripts, so review them carefully. Some tools do not guarantee a stable CLI, neither for input flags nor for output format. Several tools changed their output format to be closer to their GNU counterparts (e.g. `llvm-objdump` <-> `objdump`).
`llvm-mca`: support for in-order CPUs (e.g. ARM Cortex-A55).
`llvm-mca` requires a proper scheduling model for your target to produce relevant results. [MCA docs].
New features and changes in behavior:
- `--thin` archives
- `-X` to specify the type of object file that should be processed
- `-X` to specify the type of object file that should be processed
- `--export-symbols` to dump a list of unique symbols used for export
- `llvm-rc` (port of Windows's `rc.exe`):
- `--update-section` for ELF and Mach-O
- `--subsystem` for PE/COFF
- `--rename-section` now renames relocation sections together with their targets (GNU like)
- `zlib-gnu` format
- `--set-section-flags src=... --rename-section src=tst`
- `--add-section=.foo1=... --rename-section=.foo1=.foo2` now adds `.foo1` instead of `.foo2`.
- `--symbolize-operands` now supports PowerPC
- `-p` now dumps PE header (GNU like)
- `-R` now supports ELF position-dependent executables
- `-T` now prints symbol versions (GNU like)
- `--needed-libs`, `--relocs`, `--syms` for XCOFF
- `--elf-output-style=JSON` and `--pretty-print`
- `llvm-symbolizer`: now has `--filter-markup` to filter Symbolizer Markup into human-readable form.
- `llc`: now parses the code-model attribute from the input file
Tool CLI API changes:
- `llvm-readobj --sections` -> `llvm-readobj --syms`
- `llvm-readobj --syms` -> `llvm-readobj --section-details`
- `llvm-nm` flags `-M`, `-U`, `-W` are deprecated
The most noticeable changes are improved AArch64 and memory tagging (MTE) support.
Read the post https://www.linaro.org/blog/debugging-memory-tagging-with-lldb-13/ for details and usage examples.
AArch64:
Memory tags related changes:
- `memory tag read` and `memory tag write` commands.
- `memory region` command will note when a region has memory tagging enabled.
- `memory find`, `memory read`, `memory region` (see below), `memory tag read`, `memory tag write`
- `memory region` command and `GetMemoryRegionInfo` API method now ignore non-address bits in the address parameter. This also means that on systems with non-address bits the last (usually unmapped) memory region will not extend to `0xF…F`. Instead it will end at the end of the mappable range that the virtual address size allows.
- `memory read` command has a new option `--show-tags`. Use this option to show memory tags beside the contents of tagged memory ranges.
- `memory region` command now has a `--all` option to list all memory regions (including unmapped ranges). This is the equivalent of using address `0` then repeating the command until all regions have been listed.
- `--show-tags` option to the `memory find` command. This is off by default. When enabled, if the target value is found in tagged memory, the tags for that memory will be shown inline with the memory contents.
- `memory read`, `memory write` and `memory find` can now be used with addresses with non-address bits.
Etc:
- `int [N]` to `int[N]`) - LLDB pretty printer type name matching code may need to be updated to handle this.
Guaranteed tail calls are now supported with statement attributes `[[clang::musttail]]` in C++ and `__attribute__((musttail))` in C. The attribute is applied to a return statement (not a function declaration), and an error is emitted if a tail call cannot be guaranteed, for example if the function signatures of caller and callee are not compatible. Guaranteed tail calls enable a class of algorithms that would otherwise use an arbitrary amount of stack space.
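A minimal sketch of the statement attribute in C may help here. The function names `sum_to` and `sum_to_demo` and the `MUSTTAIL` macro are made up for illustration; the macro expands to nothing on compilers that do not report the attribute, in which case the call is an ordinary (non-guaranteed) one.

```c
#include <assert.h>

/* Use the statement attribute only where the compiler reports it;
   otherwise MUSTTAIL expands to nothing (assumption: a plain call is an
   acceptable fallback for this illustration). */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

/* Hypothetical helper: sums 1..n using an accumulator. Caller and callee
   have the same signature, which the attribute requires. */
unsigned long long sum_to(unsigned long long n, unsigned long long acc)
{
    if (n == 0)
        return acc;
    /* The attribute is attached to the return statement itself. */
    MUSTTAIL return sum_to(n - 1, acc + n);
}

/* Hypothetical usage check. */
void sum_to_demo(void)
{
    assert(sum_to(10, 0) == 55);
}
```

If the call in the return statement were changed to a callee with an incompatible signature, a compiler that honors the attribute would be expected to reject the code instead of silently emitting a normal call.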
Added `SPIR-V` triple and binary generation using the external `llvm-spirv` tool.
Completed support of OpenCL C 3.0 and C++ for OpenCL 2021 at experimental state.
Clang now supports the `-fzero-call-used-regs` feature for x86. The purpose of this feature is to limit Return-Oriented Programming (ROP) exploits and information leakage. It works by zeroing out a selected class of registers before function return, e.g. all GPRs that are used within the function. There is an analogous `zero_call_used_regs` attribute to allow for finer control of this feature.
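As an illustration of the per-function attribute, here is a small sketch. The function `tokens_match` and the `ZERO_USED_GPRS` macro are hypothetical names; the macro is only enabled on x86 and only when the compiler reports the attribute, and it expands to nothing elsewhere. The attribute does not change what the function computes, only which registers are cleared before it returns.

```c
#include <assert.h>
#include <stddef.h>

/* Enable the attribute only on x86 and only where the compiler knows it;
   elsewhere the macro expands to nothing. */
#if defined(__has_attribute) && (defined(__x86_64__) || defined(__i386__))
#  if __has_attribute(zero_call_used_regs)
#    define ZERO_USED_GPRS __attribute__((zero_call_used_regs("used-gpr")))
#  endif
#endif
#ifndef ZERO_USED_GPRS
#  define ZERO_USED_GPRS
#endif

/* Hypothetical example: compares two buffers without an early exit.
   With the attribute active, the general-purpose registers used by this
   function are zeroed before it returns. */
ZERO_USED_GPRS
int tokens_match(const unsigned char *a, const unsigned char *b, size_t len)
{
    unsigned char diff = 0;
    for (size_t i = 0; i < len; ++i)
        diff |= (unsigned char)(a[i] ^ b[i]);
    return diff == 0;
}

/* Hypothetical usage check. */
void tokens_match_demo(void)
{
    const unsigned char x[] = "abcd";
    const unsigned char y[] = "abce";
    assert(tokens_match(x, x, 4) == 1);
    assert(tokens_match(x, y, 4) == 0);
}
```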
Clang now supports randomizing structure layout in C. This feature is a compile-time hardening technique, making it more difficult for an attacker to retrieve data from structures. Specify randomization with the `randomize_layout` attribute. The corresponding `no_randomize_layout` attribute can be used to turn the feature off.
A seed value is required to enable randomization, and the resulting layout is deterministic for a given seed. Use the `-frandomize-layout-seed=` or `-frandomize-layout-seed-file=` flags.
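A short sketch of how the attribute might be applied. The `session` structure, `make_session` and the `RANDOMIZE_LAYOUT` macro are hypothetical; the macro expands to nothing on compilers that do not report the attribute. The sketch accesses fields only by name and uses designated initializers, since code that depends on the declared field order cannot be relied on once the layout is randomized.

```c
#include <assert.h>
#include <stddef.h>

/* Apply the attribute only where the compiler reports it. */
#if defined(__has_attribute)
#  if __has_attribute(randomize_layout)
#    define RANDOMIZE_LAYOUT __attribute__((randomize_layout))
#  endif
#endif
#ifndef RANDOMIZE_LAYOUT
#  define RANDOMIZE_LAYOUT
#endif

/* Hypothetical structure whose field order may be shuffled at compile
   time when a seed is supplied on the command line. */
struct RANDOMIZE_LAYOUT session {
    unsigned int uid;
    unsigned int flags;
    void (*on_close)(void);
};

/* Designated initializers keep the code independent of field order. */
struct session make_session(unsigned int uid)
{
    struct session s = { .uid = uid, .flags = 0, .on_close = NULL };
    return s;
}

/* Hypothetical usage check: fields are reached by name only. */
void session_demo(void)
{
    struct session s = make_session(42);
    assert(s.uid == 42);
    assert(s.on_close == NULL);
}
```

Without one of the seed flags the attribute is expected to have no effect on the layout, so the same source builds either way.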
Clang now supports the `-fstrict-flex-arrays=<arg>` option to control which array bounds lead to flexible array members. The option yields more accurate `__builtin_object_size` and `__builtin_dynamic_object_size` results in most cases but may be overly conservative for some legacy code.
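To make the effect concrete, the sketch below queries the size of a legacy one-element trailing array. The `packet` structure and `payload_bound` function are hypothetical. The value returned depends on the compiler, the optimization level and the option setting, so no particular result is claimed: when the trailing array is treated as a flexible array member the bound is looser (or unknown), and with a stricter setting it is expected to shrink to the declared size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical legacy layout: a one-element trailing array used as if it
   were a flexible array member. */
struct packet {
    unsigned int len;
    char payload[1];
};

/* Asks the compiler how many bytes it believes are addressable through
   p->payload (mode 1: bounded by the closest enclosing subobject). */
size_t payload_bound(struct packet *p)
{
    return __builtin_object_size(p->payload, 1);
}

/* Hypothetical usage check: whatever the setting, at least the one
   declared element is addressable; (size_t)-1 means "unknown". */
void payload_bound_demo(void)
{
    struct packet p = { 0, { 0 } };
    size_t bound = payload_bound(&p);
    assert(bound >= 1);
}
```

Comparing the result of `payload_bound` across builds with different `-fstrict-flex-arrays=<arg>` values is one way to find legacy structures that rely on the looser interpretation.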
Experimental support for HLSL has been added. The implementation is incomplete and highly experimental.
See the table at https://clang.llvm.org/cxx_status.html to track which C++ features are implemented in Clang 15.