[HLSL][Matrix] Support row-major `transpose` and `mul` by inserting matrix memory layout transformations (#186898) Fixes #184906 The SPIRV and DXIL backends assume matrices are provided in column-major order when lowering matrix transpose and matrix multiplication intrinsics. To support row-major order matrices from Clang/HLSL, we therefore need to convert row-major order matrices into column-major order matrices before applying matrix transpose and multiplication. A conversion from column-major order back to row-major order is also required for correctness after a matrix transpose or matrix multiply. For the matrix transpose case on row-major order matrices, the last two matrix memory layout transforms cancel each other out. So a row-major order matrix transpose is simply a column-major order transpose with the row and column dimensions swapped. For the matrix multiply case, this PR adds helper functions to the MatrixBuilder to convert a NxM row-/column-major order matrix into a NxM column-/row-major order matrix by applying a matrix transpose. These transformations take advantage of the fact that a row-major order matrix of NxM dimensions `rNxM` interpreted in column-major order is equivalent to its transpose in column-major order. Example: Let `r3x2 = [ 0, 1, 2, 3, 4, 5 ]`. The 3x2 row-major order matrix is visualized as ``` 0 1 2 3 4 5 ``` When `r3x2`, or `[ 0, 1, 2, 3, 4, 5 ]` is interpreted as a 2x3 column-major order matrix, it is visualized as: ``` 0 2 4 1 3 5 ``` which is equal to the transpose of `r3x2` but in column-major order. These matrix memory layout transformations are inserted before and after the matrix multiply and transpose intrinsics when lowering HLSL mul and transpose. We don't simplify the matrix multiply case because HLSL in Clang will eventually need to support the `row_major` and `column_major` keywords that allow matrices to independently be row-major or column-major regardless of the default matrix memory layout. While this method of supporting row-major order matrices is not performant, it is correct and will suffice for now until benchmarks are created and performance becomes a primary concern. Assisted-by: GitHub Copilot (powered by Claude Opus 4.6)
Welcome to the LLVM project!
This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.
The LLVM project has multiple components. The core of the project is itself called “LLVM”. This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.
C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.
Other components include: the libc++ C++ standard library, the LLD linker, and more.
Consult the Getting Started with LLVM page for information on building and running LLVM.
For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.
Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.
The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.