)]}'
{
  "log": [
    {
      "commit": "09c4a9137f3a19b053dc45724a62f4f73ec7746a",
      "tree": "a1a6bfa86072231014f8d5db46d72a4ecf4ecfae",
      "parents": [
        "217cd888533dcd84177b2f0eb22654f795d01d64"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Wed Jul 22 22:07:01 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 22 22:07:52 2026"
      },
      "message": "Use a local accumulator of int16 values within blocks in the int2 and int4 dot kernels on x86\n\nPiperOrigin-RevId: 952353528\n"
    },
    {
      "commit": "217cd888533dcd84177b2f0eb22654f795d01d64",
      "tree": "5d15df73310e04c76f02e1e7f6ea1d5b2a961159",
      "parents": [
        "b474850416e99876ecbff04fcd8bcbed33acd100"
      ],
      "author": {
        "name": "Samuel Fuller",
        "email": "samfuller@google.com",
        "time": "Wed Jul 22 21:40:28 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 22 21:41:02 2026"
      },
      "message": "Add 32768 (2^15) to the kDims array in the fully-connected subgraph benchmark.\n\nPiperOrigin-RevId: 952338902\n"
    },
    {
      "commit": "b474850416e99876ecbff04fcd8bcbed33acd100",
      "tree": "d830a58a39c55c4f3cbd067167b64805c3ae52d8",
      "parents": [
        "4673b3f5f489cf8755332b230eb7a3e2bd680283"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Wed Jul 22 21:24:59 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 22 21:26:26 2026"
      },
      "message": "Make the neonfma fp32 transcendental ops the consistent arithmetic kernels\n\nI didn\u0027t do this in the previous change, because we have no emulated fallback fma kernels. However, I think there is basically no scenario in which this will actually happen (I think fma is available on any target that wants to use the consistent arithmetic flag), and if it does cause a problem, we can just add emulated fma kernels.\n\nPiperOrigin-RevId: 952330983\n"
    },
    {
      "commit": "4673b3f5f489cf8755332b230eb7a3e2bd680283",
      "tree": "7df592f0ac6d86e5ab8ea91a11c2e1c60f4c70ed",
      "parents": [
        "950d955b5f998eafe54fd7ccd36f9dca0cbc0ab2"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Wed Jul 22 18:55:32 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 22 18:56:28 2026"
      },
      "message": "Add neonfma unary kernels\n\nBenchmarks on an S25 (sorted alphabetically, so the fma and non-fma kernels are next to each other).\n```\n-----------------------------------------------------------------------------------------------------------------------------\nBenchmark                                                                   Time             CPU   Iterations UserCounters...\n-----------------------------------------------------------------------------------------------------------------------------\nbench/approx_erf_fp32_neon/m:1/n:4096/real_time                          2107 ns         2097 ns        66063 Bytes\u003d15.5538G/s Op\u003d1.94423G/s\nbench/approx_erf_fp32_neonfma/m:1/n:4096/real_time                       1360 ns         1353 ns       101656 Bytes\u003d24.1007G/s Op\u003d3.01259G/s\nbench/approx_tanh_fp32_neon/m:1/n:4096/real_time                         1918 ns         1909 ns        73034 Bytes\u003d17.0844G/s Op\u003d2.13555G/s\nbench/approx_tanh_fp32_neonfma/m:1/n:4096/real_time                      1350 ns         1343 ns       103357 Bytes\u003d24.2698G/s Op\u003d3.03372G/s\nbench/cosine_fp32_neon/m:1/n:4096/real_time                              2602 ns         2589 ns        53858 Bytes\u003d12.5911G/s Op\u003d1.57388G/s\nbench/cosine_fp32_neonfma/m:1/n:4096/real_time                           1886 ns         1877 ns        74506 Bytes\u003d17.3698G/s Op\u003d2.17122G/s\nbench/cosine_fp64_neon/m:1/n:4096/real_time                              6626 ns         6593 ns        21152 Bytes\u003d9.89028G/s Op\u003d618.142M/s\nbench/erf_fp32_neon/m:1/n:4096/real_time                                 6332 ns         6300 ns        22042 Bytes\u003d5.17466G/s Op\u003d646.832M/s\nbench/erf_fp32_neonfma/m:1/n:4096/real_time                              4474 ns         4451 ns        31211 Bytes\u003d7.32391G/s Op\u003d915.489M/s\nbench/erf_fp64_neon/m:1/n:4096/real_time                                12856 ns        12787 ns 
       11051 Bytes\u003d5.09787G/s Op\u003d318.617M/s\nbench/exp_fp32_neon/m:1/n:4096/real_time                                 2426 ns         2413 ns        57453 Bytes\u003d13.5093G/s Op\u003d1.68866G/s\nbench/exp_fp32_neonfma/m:1/n:4096/real_time                              1742 ns         1733 ns        80541 Bytes\u003d18.8057G/s Op\u003d2.35072G/s\nbench/exp_fp64_neon/m:1/n:4096/real_time                                 4247 ns         4225 ns        32973 Bytes\u003d15.4304G/s Op\u003d964.398M/s\nbench/expm1_fp32_neon/m:1/n:4096/real_time                               2465 ns         2452 ns        56512 Bytes\u003d13.296G/s Op\u003d1.662G/s\nbench/expm1_fp32_neonfma/m:1/n:4096/real_time                            1804 ns         1794 ns        77702 Bytes\u003d18.168G/s Op\u003d2.271G/s\nbench/expm1_fp64_neon/m:1/n:4096/real_time                               4324 ns         4302 ns        32331 Bytes\u003d15.1551G/s Op\u003d947.191M/s\nbench/log_fp32_neon/m:1/n:4096/real_time                                 2536 ns         2523 ns        54889 Bytes\u003d12.9216G/s Op\u003d1.6152G/s\nbench/log_fp32_neonfma/m:1/n:4096/real_time                              2200 ns         2188 ns        63306 Bytes\u003d14.8975G/s Op\u003d1.86218G/s\nbench/log_fp64_neon/m:1/n:4096/real_time                                 5823 ns         5793 ns        24036 Bytes\u003d11.2539G/s Op\u003d703.368M/s\nbench/poly3_fp32_neon/m:1/n:4096/real_time                                381 ns          379 ns       370723 Bytes\u003d86.0492G/s Op\u003d10.7562G/s\nbench/poly3_fp32_neonfma/m:1/n:4096/real_time                             277 ns          276 ns       504702 Bytes\u003d118.263G/s Op\u003d14.7829G/s\nbench/poly3_fp64_neon/m:1/n:4096/real_time                                559 ns          556 ns       251421 Bytes\u003d117.34G/s Op\u003d7.33377G/s\nbench/reciprocal_square_root_fp32_neon/m:1/n:4096/real_time               714 ns          711 ns       195586 
Bytes\u003d45.9254G/s Op\u003d5.74068G/s\nbench/reciprocal_square_root_fp32_neonfma/m:1/n:4096/real_time            708 ns          705 ns       196388 Bytes\u003d46.2502G/s Op\u003d5.78128G/s\nbench/reciprocal_square_root_fp64_neon/m:1/n:4096/real_time              1841 ns         1833 ns        76033 Bytes\u003d35.6025G/s Op\u003d2.22515G/s\nbench/sigmoid_fp32_neon/m:1/n:4096/real_time                             2822 ns         2810 ns        49835 Bytes\u003d11.6107G/s Op\u003d1.45133G/s\nbench/sigmoid_fp32_neonfma/m:1/n:4096/real_time                          2008 ns         1998 ns        70007 Bytes\u003d16.3171G/s Op\u003d2.03964G/s\nbench/sigmoid_fp64_neon/m:1/n:4096/real_time                             4965 ns         4940 ns        28251 Bytes\u003d13.1991G/s Op\u003d824.945M/s\nbench/sine_fp32_neon/m:1/n:4096/real_time                                2546 ns         2533 ns        55392 Bytes\u003d12.8704G/s Op\u003d1.6088G/s\nbench/sine_fp32_neonfma/m:1/n:4096/real_time                             1816 ns         1806 ns        78305 Bytes\u003d18.0487G/s Op\u003d2.25609G/s\nbench/sine_fp64_neon/m:1/n:4096/real_time                                6280 ns         6249 ns        22402 Bytes\u003d10.4362G/s Op\u003d652.265M/s\nbench/square_root_fp32_neon/m:1/n:4096/real_time                          484 ns          481 ns       288586 Bytes\u003d67.7722G/s Op\u003d8.47152G/s\nbench/square_root_fp32_neonfma/m:1/n:4096/real_time                       479 ns          476 ns       290764 Bytes\u003d68.4755G/s Op\u003d8.55944G/s\nbench/square_root_fp64_neon/m:1/n:4096/real_time                         1012 ns         1007 ns       139607 Bytes\u003d64.7688G/s Op\u003d4.04805G/s\nbench/tanh_fp32_neon/m:1/n:4096/real_time                                3363 ns         3345 ns        41664 Bytes\u003d9.74438G/s Op\u003d1.21805G/s\nbench/tanh_fp32_neonfma/m:1/n:4096/real_time                             2205 ns         2194 ns        63757 Bytes\u003d14.8625G/s 
Op\u003d1.85782G/s\nbench/tanh_fp64_neon/m:1/n:4096/real_time                                5819 ns         5789 ns        24112 Bytes\u003d11.2624G/s Op\u003d703.901M/s\n```\n\nPiperOrigin-RevId: 952250416\n"
    },
    {
      "commit": "950d955b5f998eafe54fd7ccd36f9dca0cbc0ab2",
      "tree": "37a4c0c7456244d48ee97d9b24222387a0edbe24",
      "parents": [
        "032ec84d99b24977edec2b63d5947bb833bb5fa8"
      ],
      "author": {
        "name": "Frank Barchard",
        "email": "fbarchard@google.com",
        "time": "Wed Jul 22 12:56:14 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 22 12:57:05 2026"
      },
      "message": "Fix workspace buffer alignment in fully_connected benchmark.\n\nPiperOrigin-RevId: 952066792\n"
    },
    {
      "commit": "032ec84d99b24977edec2b63d5947bb833bb5fa8",
      "tree": "e3798300ca836fbafdd47a25883fbaffe0ffee74",
      "parents": [
        "2598adb390efb0d9077d10a45042b2c0fde1f6fc"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Tue Jul 21 21:10:27 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 21:11:24 2026"
      },
      "message": "Enable splitting of dimension 0 for sub-byte types\n\nPiperOrigin-RevId: 951690981\n"
    },
    {
      "commit": "2598adb390efb0d9077d10a45042b2c0fde1f6fc",
      "tree": "8db7d285cfeaa2fa7aee2188015ac6a6f3d5b99c",
      "parents": [
        "cb8fb83bb4d40bae88faefc4914f4afe2276af1f"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Tue Jul 21 19:33:10 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 19:35:58 2026"
      },
      "message": "Allow using kn kernels even if kn \u003d\u003d 1\n\nFor example, if reducing dimension 1 of shape 1024,1,1024, this is basically a no-op, but if we use the kn kernel, we can vectorize:\n```\nname                                                                   time/op        time/op     vs base\nint8_int32/dim0:1024/dim1:1/dim2:1024/kdims:2/process_time/real_time   1897.5µ ± 12%   170.1µ ± 4%  -91.04% (p\u003d0.002 n\u003d6)\n```\n\nPiperOrigin-RevId: 951638697\n"
    },
    {
      "commit": "cb8fb83bb4d40bae88faefc4914f4afe2276af1f",
      "tree": "48073d8fad99bd10b54068a9d1b050cbc40a53e4",
      "parents": [
        "0eea4b063fedb4dc3487fa350d960914b1244d6c"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Tue Jul 21 19:32:04 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 19:32:53 2026"
      },
      "message": "Allow unrecognized benchmark parameters\n\nPiperOrigin-RevId: 951638172\n"
    },
    {
      "commit": "0eea4b063fedb4dc3487fa350d960914b1244d6c",
      "tree": "8877a0bbb25e4c1d95659b3f134bd4e038e6a6b5",
      "parents": [
        "4c35ef82384c2a2a5c5125fe7e6fa9218f7f62dc"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Tue Jul 21 18:57:27 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 19:01:28 2026"
      },
      "message": "Add generic reduce_bench benchmark\n\nThis is mostly an AI port of dot_bench to reductions.\n\nPiperOrigin-RevId: 951619740\n"
    },
    {
      "commit": "4c35ef82384c2a2a5c5125fe7e6fa9218f7f62dc",
      "tree": "9b5859e681de358e416c9792597c76ec771ac2c9",
      "parents": [
        "8c138a4bd7ff892c8e903d773d84977a903fa324",
        "d0329965b5cc64c2d552913c1317e72d56e118c9"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 18:58:20 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 18:58:20 2026"
      },
      "message": "Merge pull request #10789 from RuwanpuragePawan:patch/nchw-convolution-integer-overflow\n\nPiperOrigin-RevId: 951619708\n"
    },
    {
      "commit": "8c138a4bd7ff892c8e903d773d84977a903fa324",
      "tree": "b3845e717963e6e0d1a2e501be6a9902c260a624",
      "parents": [
        "56b012fc9060c47d48cf0a7930931d74bae69c37"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Tue Jul 21 17:30:45 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 17:32:20 2026"
      },
      "message": "Rewrite sin/cos kernels to improve accuracy\n\nThe main change is to how range reduction works, to avoid catastrophic cancellation in the final [-pi/2, pi/2] reduction by examining the quadrant that the argument lies in. This enables removing any absolute tolerance from the tests for correctness.\n\nOverall changes are:\n- Increases the range in which the range reduction is correct.\n- The implementation moves to the SIMD wrappers, enabling a \"gnu_vector\" implementation.\n- Adds fp64 kernels\n\nThe performance impact of this is mixed, it is a significant speedup on avx512, a wash on avx2+fma3, but a significant regression elsewhere.\n\n```\nbench/cosine_fp32_avx512/m:1/n:4096/real_time        2.418µ ± 7%   1.796µ ±  4%   -25.73% (p\u003d0.002 n\u003d6)\nbench/sine_fp32_avx512/m:1/n:4096/real_time          2.341µ ± 3%   1.735µ ± 12%   -25.89% (p\u003d0.002 n\u003d6)\nbench/cosine_fp32_fma3/m:1/n:4096/real_time          3.547µ ± 2%   3.537µ ±  4%         ~ (p\u003d0.589 n\u003d6)\nbench/sine_fp32_fma3/m:1/n:4096/real_time            3.294µ ± 1%   3.368µ ±  4%         ~ (p\u003d0.065 n\u003d6)\nbench/cosine_fp32_avx/m:1/n:4096/real_time           3.114µ ± 4%   6.525µ ±  5%  +109.51% (p\u003d0.002 n\u003d6)\nbench/sine_fp32_avx/m:1/n:4096/real_time             2.995µ ± 2%   5.489µ ±  1%   +83.30% (p\u003d0.002 n\u003d6)\nbench/cosine_fp32_sse41/m:1/n:4096/real_time         6.220µ ± 4%   8.780µ ±  6%   +41.16% (p\u003d0.002 n\u003d6)\nbench/sine_fp32_sse41/m:1/n:4096/real_time           5.901µ ± 2%   8.661µ ±  5%   +46.79% (p\u003d0.002 n\u003d6)\n```\n\nPiperOrigin-RevId: 951568883\n"
    },
    {
      "commit": "d0329965b5cc64c2d552913c1317e72d56e118c9",
      "tree": "f42296b5da74621f95f326ec743d4fb290346e32",
      "parents": [
        "f825effdd77a839b7c2d2297dbf15f72c624e7a0"
      ],
      "author": {
        "name": "RuwanpuragePawan",
        "email": "ruwanpuragepawannimeshranasing@gmail.com",
        "time": "Sun Jul 19 09:03:00 2026"
      },
      "committer": {
        "name": "RuwanpuragePawan",
        "email": "ruwanpuragepawannimeshranasing@gmail.com",
        "time": "Tue Jul 21 15:38:35 2026"
      },
      "message": "Fix integer overflows in NCHW convolution size calculations\n\nThis patch secures size_t multiplication paths inside both\ncreate_conv2d_hwc2chw_path() and create_dwconv_path() by routing\nthem through xnn_safe_mul().\n\nUnchecked parameter validation allowed massive group and channel\nboundaries to wrap around to small sizes, causing out-of-bounds heap\ncorruption and memory faults. Chaining validation checks forces an\nimmediate xnn_status_out_of_memory error exit before allocation occurs.\n"
    },
    {
      "commit": "56b012fc9060c47d48cf0a7930931d74bae69c37",
      "tree": "92dcac61e608cb67cf0854f05a34ee859791b426",
      "parents": [
        "b33103f153afa783bd8c323143acab196e7b9506",
        "931a8e95a39eb0c0cb919954d2706df58adabbb2"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 13:21:46 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 13:21:46 2026"
      },
      "message": "Merge pull request #10780 from destro4evr-rgb:fix/resize-bilinear-indirection-buffer-size-overflow\n\nPiperOrigin-RevId: 951450473\n"
    },
    {
      "commit": "b33103f153afa783bd8c323143acab196e7b9506",
      "tree": "fa3285ba5ed20cd88a43955a3da76e06aa58e086",
      "parents": [
        "638038e507afbd87df58bbc261498c9c51b7888c",
        "80e2cdeebada2fdcd278a35a4695ed3df1c18020"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 12:10:22 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 12:10:22 2026"
      },
      "message": "Merge pull request #10782 from destro4evr-rgb:fix/unpooling-indirection-size-overflow\n\nPiperOrigin-RevId: 951422248\n"
    },
    {
      "commit": "638038e507afbd87df58bbc261498c9c51b7888c",
      "tree": "b4911ee9baec395e304f43bd8285a4ab23c86261",
      "parents": [
        "f08e7e8c616103055737f9ff453402ea0d03f036"
      ],
      "author": {
        "name": "Frank Barchard",
        "email": "fbarchard@google.com",
        "time": "Tue Jul 21 10:25:01 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 10:25:52 2026"
      },
      "message": "Add int8xint8 FC XNNPACK kernels for AVXVNNIINT8\n\n- Was 1x8C8\n- Now generate MR\u003d2 to MR\u003d8\n\nPiperOrigin-RevId: 951379740\n"
    },
    {
      "commit": "f08e7e8c616103055737f9ff453402ea0d03f036",
      "tree": "d4d821823b267e699cad9a38be403052836a6c0e",
      "parents": [
        "8def0d9711b9c6b9a4be2c5b05cafda48728a743",
        "bd7f032a826f5e016f02200bb89d8eb8ac17a3a1"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 06:37:10 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 06:37:10 2026"
      },
      "message": "Merge pull request #10781 from destro4evr-rgb:fix/dwconv-step-height-indirection-size-overflow\n\nPiperOrigin-RevId: 951277783\n"
    },
    {
      "commit": "8def0d9711b9c6b9a4be2c5b05cafda48728a743",
      "tree": "0968cd2f9eb96b528af727126c9869cbe5ce91e3",
      "parents": [
        "ec5a9e058bb2a8f6af6bdee231585d1d75adf415"
      ],
      "author": {
        "name": "Ping Yu",
        "email": "piyu@google.com",
        "time": "Tue Jul 21 01:31:03 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 21 01:32:52 2026"
      },
      "message": "Implement parsing for CUSTOM and STABLEHLO_COMPOSITE operations.\n\nPiperOrigin-RevId: 951166041\n"
    },
    {
      "commit": "ec5a9e058bb2a8f6af6bdee231585d1d75adf415",
      "tree": "9fd007d7f1afb0d776eabd61d980fc9b83c645fc",
      "parents": [
        "b97724553d890bd4dfdc3d56c89094ba1323f4d8"
      ],
      "author": {
        "name": "Volodymyr Kysenko",
        "email": "vksnk@google.com",
        "time": "Mon Jul 20 21:56:58 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 20 21:57:36 2026"
      },
      "message": "Add a couple more shapes into XNNPACK attention benchmark.\n\nPiperOrigin-RevId: 951069323\n"
    },
    {
      "commit": "b97724553d890bd4dfdc3d56c89094ba1323f4d8",
      "tree": "35620bcfd51178812b65658a2c050dfc1f2f316a",
      "parents": [
        "9bc10e6756445da5ceda43ffcfd9d16760cb6f5b"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Mon Jul 20 18:51:30 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 20 18:52:11 2026"
      },
      "message": "Fix schedule of pack_b for dot\n\nIt is dimensions 0 and 1 that need the specific step, not 0 and 2.\n\nPiperOrigin-RevId: 950969195\n"
    },
    {
      "commit": "9bc10e6756445da5ceda43ffcfd9d16760cb6f5b",
      "tree": "1cb0d86f3bf84319308017e445f352f5e6701aa0",
      "parents": [
        "d9d3702e1b1b1ddb0d0855e0c7191ed3055796bb"
      ],
      "author": {
        "name": "Volodymyr Kysenko",
        "email": "vksnk@google.com",
        "time": "Mon Jul 20 17:27:51 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 20 17:28:45 2026"
      },
      "message": "Compute loop workers globally on the assembled loop nest\n\nWorkers were computed per-function in make_schedule, before fusion:\nthe accounting of tasks produced by outer loops only saw the function\u0027s own loop nest. After fusion a function\u0027s loops can end up inside loops of other functions (and shared loops can change steps), so those counts are stale.\n\nInstead, make_schedule no longer computes workers at all. Once\nschedule() has assembled the whole global loop nest (and all steps are final), a new compute_workers() pass walks the nest and computes the workers of every loop from the number of tasks its ancestor chain can produce, using the same target task count formula as before. Emission reads the workers from the loop level, next to the step.\n\nThis is pretty significant improvement for some of the benchmarks, for example:\n\n```\nQD8TransformerBlock/T:128/D:1536/N:6/H:256/F:12288/process_time/real_time            42.14m ±   5%    10.79m ±   2%  -74.39% (p\u003d0.002 n\u003d6)\nQD8TransformerBlock/T:128/D:2048/N:8/H:256/F:16384/process_time/real_time            72.22m ±   8%    18.46m ±   3%  -74.43% (p\u003d0.002 n\u003d6)\nQD8TransformerBlock/T:128/D:2304/N:8/H:256/F:9216/process_time/real_time             47.47m ±   4%    13.06m ±   4%  -72.48% (p\u003d0.002 n\u003d6)\nQD8TransformerBlock/T:128/D:1152/N:4/H:256/F:6912/process_time/real_time             5.438m ±   6%    5.489m ±   4%        ~ (p\u003d0.699 n\u003d6)\nFP32TransformerBlock/T:128/D:1536/N:6/H:256/F:12288/process_time/real_time           26.64m ±   2%    11.41m ±   6%  -57.17% (p\u003d0.002 n\u003d6)\nFP32TransformerBlock/T:128/D:2048/N:8/H:256/F:16384/process_time/real_time           46.85m ±   6%    19.62m ±   6%  -58.11% (p\u003d0.002 n\u003d6)\nFP32TransformerBlock/T:128/D:2304/N:8/H:256/F:9216/process_time/real_time            31.25m ±   4%    14.14m ±   6%  -54.76% (p\u003d0.002 
n\u003d6)\nFP32TransformerBlock/T:128/D:1152/N:4/H:256/F:6912/process_time/real_time            5.347m ±   3%    5.357m ±   8%        ~ (p\u003d0.589 n\u003d6)\nFP16TransformerBlock/T:128/D:1536/N:6/H:256/F:12288/process_time/real_time           26.59m ±   4%    11.44m ±   7%  -56.99% (p\u003d0.002 n\u003d6)\nFP16TransformerBlock/T:128/D:2048/N:8/H:256/F:16384/process_time/real_time           46.98m ±   4%    19.79m ±   6%  -57.88% (p\u003d0.002 n\u003d6)\nFP16TransformerBlock/T:128/D:2304/N:8/H:256/F:9216/process_time/real_time            31.56m ±   4%    14.21m ±   4%  -54.98% (p\u003d0.002 n\u003d6)\nFP16TransformerBlock/T:128/D:1152/N:4/H:256/F:6912/process_time/real_time            5.405m ±   4%    5.352m ±   5%        ~ (p\u003d0.394 n\u003d6)\n```\n\nPiperOrigin-RevId: 950918457\n"
    },
    {
      "commit": "931a8e95a39eb0c0cb919954d2706df58adabbb2",
      "tree": "db241648473f086d2f116c9403882a7039ebbeac",
      "parents": [
        "d306a347e0959349910bd6feef2cb0030748c9b9"
      ],
      "author": {
        "name": "destro4evr-rgb",
        "email": "destro4evr@proton.me",
        "time": "Mon Jul 20 17:04:49 2026"
      },
      "committer": {
        "name": "destro4evr-rgb",
        "email": "destro4evr@proton.me",
        "time": "Mon Jul 20 17:04:49 2026"
      },
      "message": "resize-bilinear-nhwc: revert unnecessary cosmetic change in setup function\n\nThe packed_weights_size computation in xnn_setup_resize_bilinear2d_nhwc\ndoes not need overflow protection (dimensions were already validated in\nreshape). Restore the original one-liner.\n"
    },
    {
      "commit": "d9d3702e1b1b1ddb0d0855e0c7191ed3055796bb",
      "tree": "f8a81a30d58888d684339755c0dcb10de5de1a7e",
      "parents": [
        "f825effdd77a839b7c2d2297dbf15f72c624e7a0",
        "9175dd9f28f872cd3aa163d9c6e33807965019d7"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 20 14:37:42 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 20 14:37:42 2026"
      },
      "message": "Merge pull request #10779 from aizu-m:reduce-compute-dim-narrowing\n\nPiperOrigin-RevId: 950830317\n"
    },
    {
      "commit": "f825effdd77a839b7c2d2297dbf15f72c624e7a0",
      "tree": "47d6f4e8decb37ca2b147ed2aa88dcf7245b7714",
      "parents": [
        "32cad9dd73ba05f969317765a0a7ec88b20b1765"
      ],
      "author": {
        "name": "Volodymyr Kysenko",
        "email": "vksnk@google.com",
        "time": "Sat Jul 18 18:26:06 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Sat Jul 18 18:27:11 2026"
      },
      "message": "Adjust thread counts in YNNPACK benchmarks and runtime.\n\nThe TestScheduler and ynn::threadpool now consistently treat the provided thread count as the number of *background* worker threads. The thread that invokes the runtime or benchmark is also a participant in the work, so the total number of threads executing tasks is one more than the number of background threads. Benchmarks are updated to pass thread_count - 1 to the scheduler, and the runtime adjusts its internal thread count calculation accordingly.\n\nPiperOrigin-RevId: 950120361\n"
    },
    {
      "commit": "32cad9dd73ba05f969317765a0a7ec88b20b1765",
      "tree": "95424287858b684dd7df5374b04af964e513d480",
      "parents": [
        "fd976d9277e8ccc49991a29230e1eb604d07e1c3"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Fri Jul 17 23:39:06 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 23:39:43 2026"
      },
      "message": "Disable fp8 on Apple by default\n\nDue to lack of compiler support\n\nPiperOrigin-RevId: 949820854\n"
    },
    {
      "commit": "bd7f032a826f5e016f02200bb89d8eb8ac17a3a1",
      "tree": "10d706273f60b7d61cca681fe7cf38921dda27ca",
      "parents": [
        "fd976d9277e8ccc49991a29230e1eb604d07e1c3"
      ],
      "author": {
        "name": "destro4evr-rgb",
        "email": "destro4evr@proton.me",
        "time": "Fri Jul 17 22:01:17 2026"
      },
      "committer": {
        "name": "destro4evr-rgb",
        "email": "destro4evr@proton.me",
        "time": "Fri Jul 17 22:01:17 2026"
      },
      "message": "operators: guard dwconv step_height and indirection buffer size against overflow\n\nreshape_dwconv computed step_height and indirection_buffer_size through\nconsecutive unguarded multiplications. On 32-bit targets (WebAssembly,\nARM32) where size_t is 32 bits, large output_width, kernel_height, or\noutput_height values cause both products to silently wrap, resulting in\nan undersized heap allocation followed by a heap out-of-bounds write\nwhen xnn_indirection_init_dwconv2d fills the full untruncated extent.\n\nReplace both computations with xnn_safe_mul/xnn_safe_add chains that\nreturn xnn_status_out_of_memory on overflow.\n"
    },
    {
      "commit": "d306a347e0959349910bd6feef2cb0030748c9b9",
      "tree": "e7f91a7e4bad2558cfe8060b248dddc021a7efc3",
      "parents": [
        "fd976d9277e8ccc49991a29230e1eb604d07e1c3"
      ],
      "author": {
        "name": "destro4evr-rgb",
        "email": "destro4evr@proton.me",
        "time": "Fri Jul 17 21:31:05 2026"
      },
      "committer": {
        "name": "destro4evr-rgb",
        "email": "destro4evr@proton.me",
        "time": "Fri Jul 17 21:31:05 2026"
      },
      "message": "resize-bilinear: guard indirection buffer and packed weights size against overflow\n\nxnn_reshape_resize_bilinear2d_nhwc_* and xnn_reshape_resize_bilinear2d_nchw_*\ncomputed the indirection buffer size and packed-weights size with raw\nmultiplication:\n\n  sizeof(void*) * (output_height * output_width * 4)\n  (output_height * output_width * 2) \u003c\u003c log2_weight_element_size\n\nOn 32-bit targets (WebAssembly, ARM32) where size_t is 32 bits, the product\noverflows when output_height * output_width \u003e\u003d 2^28 (e.g. 16384 x 16384),\nwrapping the allocation size to zero. xnn_indirection_init_resize_bilinear2d_*\nthen writes 4 * output_height * output_width pointer entries into this\nundersized buffer, producing a heap out-of-bounds write.\n\nReplace the raw multiplications with chained xnn_safe_mul calls that return\nxnn_status_out_of_memory on overflow, matching the pattern used in other\nreshape paths.\n"
    },
    {
      "commit": "9175dd9f28f872cd3aa163d9c6e33807965019d7",
      "tree": "b20b5a28c6db1972609c05c01375460f33e754d4",
      "parents": [
        "fd976d9277e8ccc49991a29230e1eb604d07e1c3"
      ],
      "author": {
        "name": "Aizal Khan",
        "email": "aizumusheer2@gmail.com",
        "time": "Fri Jul 17 19:21:44 2026"
      },
      "committer": {
        "name": "Aizal Khan",
        "email": "aizumusheer2@gmail.com",
        "time": "Fri Jul 17 19:21:44 2026"
      },
      "message": "use size_t for reduced input dims in reduce compute kernels\n"
    },
    {
      "commit": "fd976d9277e8ccc49991a29230e1eb604d07e1c3",
      "tree": "599e99a3bb30449b7849b90d8ccade12886413ce",
      "parents": [
        "c2a632ded5abfaa93fc9de9acda609b8197a24fe"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Fri Jul 17 18:49:39 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 18:50:39 2026"
      },
      "message": "Use cpuinfo support for ARM fp8\n\nNow that we can detect whether fp8 is supported at runtime, we can enable these kernels in the BUILD by default.\n\nPiperOrigin-RevId: 949683820\n"
    },
    {
      "commit": "c2a632ded5abfaa93fc9de9acda609b8197a24fe",
      "tree": "1d979230092ad3218358c8400bf208b5d39cd3ae",
      "parents": [
        "aaecd5dae3cbc9ef16d961bcd006f894eea2b664"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Fri Jul 17 17:42:42 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 17:43:39 2026"
      },
      "message": "Fix validation tests when running in YNNPACK\n\n- Return error code matching XNNPACK for reshape errors.\n- Fix missing validation for reshape.\n\nPiperOrigin-RevId: 949648827\n"
    },
    {
      "commit": "aaecd5dae3cbc9ef16d961bcd006f894eea2b664",
      "tree": "5d1c7042f5ab125f3d93a45d67c27f39c5ecb8cd",
      "parents": [
        "53a1797ba4360cbde068f2a984652be0f0b7b6fe"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Fri Jul 17 17:33:57 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 17:35:10 2026"
      },
      "message": "Update cpuinfo in XNNPACK\n\nCommit history for pytorch/cpuinfo (ea6b9f1b -\u003e 6882af58):\n- 4628dc06 Vineeth Chelur: Implement `cpuinfo_deinitialize()` to free heap-allocated globals (#387)\n- 398ad9f5 fbarchard: fix: remove unused import \u0027string\u0027 in android-device-dump.py (#391)\n- 0c0ab15c fbarchard: Add Apple M1/M2/M3/M4 Linux MIDR detection (#385)\n- f9176bdf maxim-davgalev: Initialize vendor/uarch out-parameters in cpuinfo_arm_decode_vendor_uarch (#395)\n- f72c5c54 fbarchard: Add Nova Lake (Coyote Cove) uarch support (#393)\n- f6f4b3ed vozvivan: arm/linux: zero-initialize stack buffers to fix Valgrind warnings (#376)\n- 66eb8598 johnthacker: Normalization of AMD brand string with \"with Radeon Graphics\" (#389)\n- 317b8b50 Dmitry Baryshkov: Fix syscall() undeclared error if built with stricter build options (#374)\n- 315d03ca James Y Knight: Delete unused variable \"core_apic_id\" (#272)\n- ae544364 Nikita Shulga: [CI] Fix Win UWP builds (#401)\n- 69e20fa2 Mark Hansen: Remove obsolete .travis.yml (#410)\n- b1a5d63f Nikita Shulga: Revert \"Implement `cpuinfo_deinitialize()` to free heap-allocated globals (#387)\" (#411)\n- 9c6d2485 fbarchard: Add AMX-FP8 ISA feature detection and reporting to cpuinfo (#416)\n- c3766aef fbarchard: Guard _DEFAULT_SOURCE with #ifndef in src/api.c (#422)\n- 6882af58 fbarchard: Add ARM v8.7 FP8 ISA feature detection and reporting to cpuinfo. (#413)\n\nPiperOrigin-RevId: 949644336\n"
    },
    {
      "commit": "53a1797ba4360cbde068f2a984652be0f0b7b6fe",
      "tree": "6f4286612ab41f80e1fcc59ab6ea88ce345f0219",
      "parents": [
        "5251854e5249238c3a6b0a2e6c103dcc27dccc5b"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Fri Jul 17 16:46:40 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 16:47:22 2026"
      },
      "message": "Fix crash due to not allowing internal extra dimensions.\n\nFixes https://github.com/jax-ml/jax/issues/39257 (after updating XNNPACK in XLA).\n\nIn this case, the following happens:\n\n- `sum(multiply(A*B))` gets rewritten to `dot(A, B)`\n- The dot wants to skip packing of B based on the dot shape. This usually results in a transpose that inserts two new dimensions that can alias. This handled the extra 2 dimensions like dot expects.\n- However, in this case, that transpose fuses with another transpose, and cannot alias. Transpose doesn\u0027t support the extra 2 internal dimensions \u003d\u003e crash.\n\nIn this fix, I went ahead and replaced all internal rank upper bounds with the internal max rank, rather than trying to only do it where we think they will be needed by dot ops.\n\nPiperOrigin-RevId: 949619272\n"
    },
    {
      "commit": "5251854e5249238c3a6b0a2e6c103dcc27dccc5b",
      "tree": "eced340a9c6ef924f48c3fbba05005641e5c65f4",
      "parents": [
        "9b0e561dcfdedcf389c2d03897772ed603fd1d6a",
        "9fe71df82d9a31e67235f7c05b820d89d41bcd0b"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 11:51:21 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 11:51:21 2026"
      },
      "message": "Merge pull request #10702 from destro4evr-rgb:fix/litert-arithmetic-axis-bounds-checks\n\nPiperOrigin-RevId: 949500929\n"
    },
    {
      "commit": "9fe71df82d9a31e67235f7c05b820d89d41bcd0b",
      "tree": "eced340a9c6ef924f48c3fbba05005641e5c65f4",
      "parents": [
        "9b0e561dcfdedcf389c2d03897772ed603fd1d6a"
      ],
      "author": {
        "name": "destro4evr-rgb",
        "email": "destro4evr@proton.me",
        "time": "Thu Jul 16 14:26:56 2026"
      },
      "committer": {
        "name": "destro4evr-rgb",
        "email": "destro4evr@proton.me",
        "time": "Fri Jul 17 09:56:48 2026"
      },
      "message": "litert/tensor/arithmetic.h: bounds-check axis and index values before vector subscript\n\nRebased onto master after lvalue LockedBufferSpan refactor.\n"
    },
    {
      "commit": "9b0e561dcfdedcf389c2d03897772ed603fd1d6a",
      "tree": "49da882dda33d581460b4886d0102b86cc376bd5",
      "parents": [
        "0e925591be0b8654b89383e47c4d4c16ee26e786",
        "c8306cee9cb65aaefc09ba98693be734eeb14f61"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 09:02:24 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 09:02:24 2026"
      },
      "message": "Merge pull request #10765 from destro4evr-rgb:fix/litert-unpack-axis-bounds-check\n\nPiperOrigin-RevId: 949440833\n"
    },
    {
      "commit": "0e925591be0b8654b89383e47c4d4c16ee26e786",
      "tree": "c2b820c71752660b72b306c3c10178f60eb6cc60",
      "parents": [
        "1dfe86ea44889c8fca254bcacf4d3378c37565fe",
        "11f8dc86bb1f1aed055126d130bac5545c33b5e6"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 08:40:45 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 08:40:45 2026"
      },
      "message": "Merge pull request #10756 from sachinhambar:fix/lut-fusion-invalid-unary-node\n\nPiperOrigin-RevId: 949432896\n"
    },
    {
      "commit": "1dfe86ea44889c8fca254bcacf4d3378c37565fe",
      "tree": "20630375eb6ecce8f92d3e42486f5166ac5bf201",
      "parents": [
        "218bb2e0ad827f97bae844c2f57ea53ed1f16192",
        "79cbaa9ac10098315f4e32e0a0068f1877886c7e"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 08:33:30 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 08:33:30 2026"
      },
      "message": "Merge pull request #10748 from aizu-m:conv-reshape-input-channels\n\nPiperOrigin-RevId: 949430405\n"
    },
    {
      "commit": "218bb2e0ad827f97bae844c2f57ea53ed1f16192",
      "tree": "805f6e710a3afa0cb45f7504d8b1e12cf5e390a9",
      "parents": [
        "ff8907d9dd17d9cde438a13ccf4c4a39c6cd57e9"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Fri Jul 17 08:11:43 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 08:12:35 2026"
      },
      "message": "Change x86 uint8 x int2 dot kernels to use tile_k \u003d 8\n\nPiperOrigin-RevId: 949423256\n"
    },
    {
      "commit": "11f8dc86bb1f1aed055126d130bac5545c33b5e6",
      "tree": "0a9c0d2b474c8b5530fac30242978b598c5c1cbe",
      "parents": [
        "097670022557a7efd79139317d75497d41a9da2a"
      ],
      "author": {
        "name": "Sachin Hambar",
        "email": "sachinhambar@gmail.com",
        "time": "Fri Jul 17 04:20:11 2026"
      },
      "committer": {
        "name": "Sachin Hambar",
        "email": "sachinhambar@gmail.com",
        "time": "Fri Jul 17 04:20:11 2026"
      },
      "message": "Move num_inputs assert past the LUT-fused-node check in is_pure_unary_elementwise\n\nThe assert(node-\u003enum_inputs \u003e\u003d 1) ran before the num_inputs !\u003d 1 check,\nso a node already fused into a LUT (num_inputs \u003d\u003d 2) never reached that\nguard safely in debug builds—asserting is fine once we know we\u0027re past\nthe fused-node case, not before.\n"
    },
    {
      "commit": "ff8907d9dd17d9cde438a13ccf4c4a39c6cd57e9",
      "tree": "3693a6c605c096e4b8e933fbc188450f4726119b",
      "parents": [
        "1336d27f8ed9e4716dc4e103b9c5895aa05218dd"
      ],
      "author": {
        "name": "Volodymyr Kysenko",
        "email": "vksnk@google.com",
        "time": "Fri Jul 17 01:42:54 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 01:43:39 2026"
      },
      "message": "Actually provide splits for the pack.\n\nPreviously, we did have splits, but they always were equal to extent, which allowed them to be fused with the dot, but didn\u0027t give pack any loops on its own. Now, we actually provide splits so pack can be, for example, parallelized on its own if it\u0027s not fused.\n\nI ran benchmarks and it\u0027s pretty much neutral.\n\nPiperOrigin-RevId: 949293944\n"
    },
    {
      "commit": "1336d27f8ed9e4716dc4e103b9c5895aa05218dd",
      "tree": "94abfe226a1724f5e37805afa4853a981744c714",
      "parents": [
        "f6d310ea256480361335d3370f3b61e0cea75a2a"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Thu Jul 16 22:47:41 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 17 00:29:32 2026"
      },
      "message": "Add `YNN_NODE_FLAG_KEEP_SHAPE` flag and implement it in `ynn_define_slice_like`\n\nThis allows cropping computations without actually changing the shape.\n\nPiperOrigin-RevId: 949219372\n"
    },
    {
      "commit": "f6d310ea256480361335d3370f3b61e0cea75a2a",
      "tree": "47586996e3def5a17f494586dae817fe3380186e",
      "parents": [
        "6d553e8984714aea57dfe95480913e9a4c9738bd"
      ],
      "author": {
        "name": "Misha Gutman",
        "email": "aelphy@google.com",
        "time": "Thu Jul 16 21:59:42 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Thu Jul 16 22:00:37 2026"
      },
      "message": "Fixed static tensor behavior with ynnpack to match XNNPACK. Old behavior didn\u0027t allow for external tensors with static weights.\n\nPiperOrigin-RevId: 949192669\n"
    },
    {
      "commit": "c8306cee9cb65aaefc09ba98693be734eeb14f61",
      "tree": "f5f22b9ee88e99e86df57cd4e3cb0144856b36f5",
      "parents": [
        "b1858faaf6ba76f9df1a056524b216bbfa968121"
      ],
      "author": {
        "name": "destro4evr-rgb",
        "email": "destro4evr@proton.me",
        "time": "Thu Jul 16 21:04:21 2026"
      },
      "committer": {
        "name": "destro4evr-rgb",
        "email": "destro4evr@proton.me",
        "time": "Thu Jul 16 21:04:21 2026"
      },
      "message": "litert/arithmetic: add axis bounds check in Unpack()\n\nUnpack() passed the axis parameter directly to vector::erase() without\nvalidating it was within [0, input_rank). An out-of-bounds axis produces\nUB and heap corruption. Add the same guard that Pack() already has.\n"
    },
    {
      "commit": "6d553e8984714aea57dfe95480913e9a4c9738bd",
      "tree": "01acca47deac4d3da9ee7b2c0c3e1d7a830a0545",
      "parents": [
        "b1858faaf6ba76f9df1a056524b216bbfa968121"
      ],
      "author": {
        "name": "Frank Barchard",
        "email": "fbarchard@google.com",
        "time": "Thu Jul 16 20:56:46 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Thu Jul 16 20:57:54 2026"
      },
      "message": "Add xnn_pack_qs8_to_qu8_qc4uw_gemm_gio_w\n- needed for x86 to support gio layout for QD8 with 4 bit weights\n\nPiperOrigin-RevId: 949157997\n"
    },
    {
      "commit": "b1858faaf6ba76f9df1a056524b216bbfa968121",
      "tree": "ab67f6b8d485481d24a3970198d2d19292cc6453",
      "parents": [
        "dff8cb5db8c102d0f65140838f11de8bc9668c86"
      ],
      "author": {
        "name": "Frank Barchard",
        "email": "fbarchard@google.com",
        "time": "Thu Jul 16 18:30:10 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Thu Jul 16 18:31:05 2026"
      },
      "message": "Re-enable packw optimized kernels and refine ISA guards in GEMM configs.\n\nPiperOrigin-RevId: 949078590\n"
    },
    {
      "commit": "dff8cb5db8c102d0f65140838f11de8bc9668c86",
      "tree": "dc818519a2dcf3fe7b2076db22d3b1a5bee17820",
      "parents": [
        "464e9d3f0376d56bc650e5d4358aac27835f277e"
      ],
      "author": {
        "name": "Quentin Khan",
        "email": "qkhan@google.com",
        "time": "Thu Jul 16 10:47:27 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Thu Jul 16 10:51:28 2026"
      },
      "message": "Force `LockedBufferSpan` data access through an lvalue.\n\n`LockedBufferSpan` used through an rvalue is dangerous since the buffer\nis unlocked on destruction. We want to ensure that someone doesn\u0027t write\na one-liner that accesses the data of a `LockedBufferSpan` without\nensuring that it survives until the end.\n\nThis also fixes the places in the codebase where these invalid uses have\nbeen used.\n\nPiperOrigin-RevId: 948877587\n"
    },
    {
      "commit": "80e2cdeebada2fdcd278a35a4695ed3df1c18020",
      "tree": "8a2fd91d3e2b50265967b7d198c7020cc4f15dbd",
      "parents": [
        "464e9d3f0376d56bc650e5d4358aac27835f277e"
      ],
      "author": {
        "name": "destro4evr-rgb",
        "email": "destro4evr@proton.me",
        "time": "Thu Jul 16 04:24:23 2026"
      },
      "committer": {
        "name": "destro4evr-rgb",
        "email": "destro4evr@proton.me",
        "time": "Thu Jul 16 04:24:23 2026"
      },
      "message": "unpooling: guard indirection buffer size computation against overflow\n\nxnn_reshape_unpooling2d_nhwc_x32 computed the indirection buffer size\nwith a raw four-factor multiplication (batch_size * input_height *\ninput_width * pooling_size * sizeof(void*)) with no overflow check.\nLarge caller-supplied dimensions silently wrap the product to a value far\nsmaller than required. xnn_reallocate_memory then returns a valid pointer\nto an undersized buffer, and xnn_indirection_init_unpool2d subsequently\nwrites the full (attacker-sized) number of pointer entries out of bounds.\n\nReplace the raw multiplication with chained xnn_safe_mul calls, matching\nthe pattern already used in argmax-pooling-nhwc.c and average-pooling-nhwc.c\nafter PR #10646.\n"
    },
    {
      "commit": "464e9d3f0376d56bc650e5d4358aac27835f277e",
      "tree": "1ec2dfa557de03e54e17914d41794337221da797",
      "parents": [
        "66b5f2b0774f774ff444d00a5a732b5e1fe23586"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Wed Jul 15 19:02:06 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 15 19:02:53 2026"
      },
      "message": "Remove deprecated quantization parameters from `ynn_define_convert`\n\nPiperOrigin-RevId: 948469402\n"
    },
    {
      "commit": "66b5f2b0774f774ff444d00a5a732b5e1fe23586",
      "tree": "690680fbb3d577880d6a78481457ee74fc5d4126",
      "parents": [
        "a635548f566e6c21faf7863bd5a20f35a541ccfd"
      ],
      "author": {
        "name": "Quentin Khan",
        "email": "qkhan@google.com",
        "time": "Wed Jul 15 18:52:31 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 15 18:53:22 2026"
      },
      "message": "Try keeping buffer and tensor type coherent.\n\nThe `vector` buffer variants in `TensorInit` are mainly used for testing\nand for small ad-hoc buffer creation. In these cases, we usually know the\ntensor type and it should match the buffer type.\n\nThis change simplifies the code by sharing the code path for all vector\nvariants. Because the `vector` is always copied into a new buffer, we\nalso convert the elements to the tensor type during the copy. If the\ntensor type is unspecified, we deduce it from the vector type.\n\nA fix for the numerical test suite is also applied as it would set\npre-packed data as `int8_t` values which conflict with this change.\n\nPiperOrigin-RevId: 948464453\n"
    },
    {
      "commit": "a635548f566e6c21faf7863bd5a20f35a541ccfd",
      "tree": "571f885dddeae11c6236378ead0e951e6a9117d3",
      "parents": [
        "7647747ba01ee3345cb6358e9db5e879a59ff75b"
      ],
      "author": {
        "name": "Volodymyr Kysenko",
        "email": "vksnk@google.com",
        "time": "Wed Jul 15 18:17:28 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 15 18:18:19 2026"
      },
      "message": "Add a loop fusion test for combining dot steps with LCM.\n\nPiperOrigin-RevId: 948443900\n"
    },
    {
      "commit": "7647747ba01ee3345cb6358e9db5e879a59ff75b",
      "tree": "bcf16cd605cf53fd81765d7f2d6178452dd15469",
      "parents": [
        "b372ff19f1f087f22bf1a6db5d3287657ebc62d8"
      ],
      "author": {
        "name": "Volodymyr Kysenko",
        "email": "vksnk@google.com",
        "time": "Wed Jul 15 18:00:51 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 15 18:01:45 2026"
      },
      "message": "Add a transposed decode1 attention benchmark.\n\nThis mirrors the generic attention benchmark and is useful for comparison.\n\nPiperOrigin-RevId: 948433049\n"
    },
    {
      "commit": "b372ff19f1f087f22bf1a6db5d3287657ebc62d8",
      "tree": "95f0437aef4f7b23426643c858f258ba48e8ebda",
      "parents": [
        "b692f9111d05a7cd3c5f6d40bc3c892dde61ab90"
      ],
      "author": {
        "name": "Volodymyr Kysenko",
        "email": "vksnk@google.com",
        "time": "Wed Jul 15 17:15:49 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 15 17:16:49 2026"
      },
      "message": "Allow changing the order of the loops when looking for the match to fuse with.\n\nThis change is motivated by the ```dot(A, pack(f(B)))``` pipeline, where we would like all three functions to share the loops. For the func f, it\u0027s only possible if we reorder its loops.\n\nWhile working on this I also ran into an issue with the functions which had loops with extent-1: we skipped adding such loops for consumers, but when we still had them for producers when tried to match them, which was breaking perfectly valid fusions. This made loops in the attention_bench/AttentionDecode fully fused, which improved multi-threaded performance significantly (geomean improvement over all Decode cases is 24%).\n\nPiperOrigin-RevId: 948405387\n"
    },
    {
      "commit": "097670022557a7efd79139317d75497d41a9da2a",
      "tree": "72cce24bf5d50a223920d32072b104684ddde472",
      "parents": [
        "b692f9111d05a7cd3c5f6d40bc3c892dde61ab90"
      ],
      "author": {
        "name": "Sachin Hambar",
        "email": "sachinhambar@gmail.com",
        "time": "Wed Jul 15 15:10:02 2026"
      },
      "committer": {
        "name": "Sachin Hambar",
        "email": "sachinhambar@gmail.com",
        "time": "Wed Jul 15 15:10:02 2026"
      },
      "message": "Fix LUT fusion treating an already-fused LUT node as a plain unary op\n\n`xnn_define_unary_elementwise_lut_in_place` converts a chain of unary\nops in place into a node with 2 inputs (tensor + LUT table) and\n`unary_operator \u003d\u003d xnn_unary_invalid`. `is_pure_unary_elementwise` did\nnot check for this, so if the fusion pass encounters that node again\nit misreads `inputs[0]` as an ordinary single-input operand and tries\nto fuse the LUT node into a further LUT.\n"
    },
    {
      "commit": "b692f9111d05a7cd3c5f6d40bc3c892dde61ab90",
      "tree": "c90151f87bfecc8156c6eb8f0c9785c4b27acd46",
      "parents": [
        "f0fd32b3905f93d70c25604adddc759b65795480"
      ],
      "author": {
        "name": "Quentin Khan",
        "email": "qkhan@google.com",
        "time": "Wed Jul 15 11:43:18 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 15 11:44:11 2026"
      },
      "message": "Allow passing a scalar value as the `TensorInit` buffer.\n\nThe element is converted to the tensor type into a new buffer owned by\nthe Tensor.\n\nPiperOrigin-RevId: 948260670\n"
    },
    {
      "commit": "79cbaa9ac10098315f4e32e0a0068f1877886c7e",
      "tree": "f268b869c53ff46445445f60ad832036837e7079",
      "parents": [
        "599a1be6e4d564490ea4e66f869c617399d3637b"
      ],
      "author": {
        "name": "Aizal Khan",
        "email": "aizumusheer2@gmail.com",
        "time": "Wed Jul 15 10:21:52 2026"
      },
      "committer": {
        "name": "Aizal Khan",
        "email": "aizumusheer2@gmail.com",
        "time": "Wed Jul 15 10:21:52 2026"
      },
      "message": "Match channel-mismatch reshape error to batch-matrix-multiply wording\n"
    },
    {
      "commit": "f0fd32b3905f93d70c25604adddc759b65795480",
      "tree": "452c596177a66883e6a55824e0d5709fe943540b",
      "parents": [
        "0de517a4d700e0e59899f6edfab05ed31d246d73"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Wed Jul 15 04:30:45 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 15 04:34:42 2026"
      },
      "message": "Make B a dynamic value in concatenated_mixed_dot_bench\n\nThis more accurately reproduces the workload this is intended to represent, and also avoids inconsistency in whether pack_b constant folds or not, depending on the type of the conversion.\n\nPiperOrigin-RevId: 948084242\n"
    },
    {
      "commit": "0de517a4d700e0e59899f6edfab05ed31d246d73",
      "tree": "34299a282e83b70a02a6fda90bef3b9aab8b1e11",
      "parents": [
        "5fe544dfb1a761e85639991510d675bc226d85df"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 15 01:41:54 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 15 01:42:41 2026"
      },
      "message": "Fix quantization export to allow explicit q/ dq nodes in the graph\n\nPiperOrigin-RevId: 948024623\n"
    },
    {
      "commit": "5fe544dfb1a761e85639991510d675bc226d85df",
      "tree": "4a8bb23450f31ab426c8884f05e25489890274be",
      "parents": [
        "e74ee67f2b0424b672273541519664bf88f442d2"
      ],
      "author": {
        "name": "Frank Barchard",
        "email": "fbarchard@google.com",
        "time": "Tue Jul 14 23:08:31 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 14 23:09:22 2026"
      },
      "message": "Support static FP32 bias for FP16 depthwise convolution 2D in XNNPACK subgraph.\n\nPiperOrigin-RevId: 947959414\n"
    },
    {
      "commit": "e74ee67f2b0424b672273541519664bf88f442d2",
      "tree": "cd408d6503b6d4eea7945a380e2da56c02991f8b",
      "parents": [
        "31020912dcb28db709a01f9b52a422d88cf9e5cd"
      ],
      "author": {
        "name": "Frank Barchard",
        "email": "fbarchard@google.com",
        "time": "Tue Jul 14 22:15:52 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 14 22:16:50 2026"
      },
      "message": "Fix for fingerprint test\n\n- set kernel_scale_element_size to size of float to avoid ASAN failure\n\nPiperOrigin-RevId: 947931865\n"
    },
    {
      "commit": "31020912dcb28db709a01f9b52a422d88cf9e5cd",
      "tree": "328511ba51524f7c7789ac05a26725ae4fe2515f",
      "parents": [
        "2c59699a70bb4a1c55ba7b5b5181de401d9bc00b"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Tue Jul 14 17:16:28 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 14 17:17:23 2026"
      },
      "message": "Add dynamic shapes to `concatenated_mixed_dot_bench`\n\nPiperOrigin-RevId: 947758849\n"
    },
    {
      "commit": "2c59699a70bb4a1c55ba7b5b5181de401d9bc00b",
      "tree": "035f9d5d6f12549f0b69a5d286d59b158c5385f5",
      "parents": [
        "f303466171f74b3614302617c64730359ebb14c9"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Tue Jul 14 16:48:55 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 14 16:49:50 2026"
      },
      "message": "Remove exception for `pack_b` from the constant folding heuristic\n\nPiperOrigin-RevId: 947740722\n"
    },
    {
      "commit": "f303466171f74b3614302617c64730359ebb14c9",
      "tree": "a94726dfaed9512873176f23f88b39cf5adfc4e4",
      "parents": [
        "44db232920d81d2993720e349a55918c4524e5b6"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Tue Jul 14 16:24:33 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 14 16:25:13 2026"
      },
      "message": "Update KleidiAI and Slinky in XNNPACK\n\nCommit history for kleidi/kleidiai (dc50c2e6 -\u003e b87ef9c9):\n- ded0f58c Emil Ohlsson: fix: Clean up the use of `clamp_range`\n- 8e08e0ed Emil Ohlsson: docs: Split kernel naming\n- e683484a Cathal Lawlor: chore: add electric fence negative CI canary\n- ae936c0e Aude Vuilliomenet: chore: CI Re-enable API compatibility checker.\n- 2ff86189 Emil Ohlsson: feat: bf16 LHS packing for SME\n- f0cac762 Emil Ohlsson: feat: Add SME version of f32_bf16p_bf16p kernel\n- d96e3f0a Dan Johansson: feat: Add CI job that checks micro-kernel header includes\n- ea78dad7 Viet-Hoa Do: chore: Rename kernel wrapper registry files\n- 66135f72 Puneet Matharu: feat: Add experimental ops library\n- 9bcaee2e John McLoughlin: chore: Bump to release 1.27.0\n- 678b268d Emil Ohlsson: fix: Controllable accumulator type\n- 546ead02 John McLoughlin: chore: Extend the benchmark tool to facilitate kernel \u0026 LHS combinations\n- c8aa2b14 Viet-Hoa Do: chore: Add documentation for micro-kernel API design\n- b431044a Emil Ohlsson: chore: Add name checker\n- 45ee3ee2 Emil Ohlsson: chore: Automated naming documentation\n- 7ca389b8 Emil Ohlsson: chore: trim coverage test output\n- f6503d33 Dan Johansson: feat: Add fp16 \u003c- qai8dxp x qsi4cxp SME2 GEMM\n- 2f057318 Aude Vuilliomenet: docs: add vscale information\n- 142c5852 Suhail M: feat: Add SVE2p1 FP16 Matmul kernels with FP32 Accumulation\n- b9786177 David Mansell: fix: ops: Remove C-style casts in hybrid kernels.\n- 1231da54 John McLoughlin: chore: Bump release to 1.28.0\n- 8a3aadbc Dan Johansson: fix: Update pack_matmul benchmark suite\n- 5b8a014c Cathal Lawlor: chore: disable experimental/ops CI for rework\n- b87ef9c9 Dan Johansson: feat: Add fp16 \u003c- qai8dxp x qsi4cxp SME2 GEMV\n\nCommit history for dsharlet/slinky (8480fc77 -\u003e 8eda4a05):\n- 8eda4a05 Dillon: Avoid substituting buffer bounds that are defined in terms of itself (#855)\n\nPiperOrigin-RevId: 947728024\n"
    },
    {
      "commit": "599a1be6e4d564490ea4e66f869c617399d3637b",
      "tree": "192c835f3db0d2b7e2cdaa0f4844e52dcb122db1",
      "parents": [
        "44db232920d81d2993720e349a55918c4524e5b6"
      ],
      "author": {
        "name": "Aizal Khan",
        "email": "aizumusheer2@gmail.com",
        "time": "Tue Jul 14 13:27:05 2026"
      },
      "committer": {
        "name": "Aizal Khan",
        "email": "aizumusheer2@gmail.com",
        "time": "Tue Jul 14 13:27:05 2026"
      },
      "message": "validate runtime input channels in convolution reshape paths\n"
    },
    {
      "commit": "44db232920d81d2993720e349a55918c4524e5b6",
      "tree": "9778c00fadf21d22bb645becace3fde1b9cd9b17",
      "parents": [
        "11d93dd04f90c77b363ac3aaeb8ef021667713d3"
      ],
      "author": {
        "name": "Quentin Khan",
        "email": "qkhan@google.com",
        "time": "Tue Jul 14 00:59:40 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 14 01:02:34 2026"
      },
      "message": "Fix presubmit checks about unchecked absl::StatusOr access.\n\nPiperOrigin-RevId: 947344210\n"
    },
    {
      "commit": "11d93dd04f90c77b363ac3aaeb8ef021667713d3",
      "tree": "eab39a9a606efb95d3151a9ae5c5776891639398",
      "parents": [
        "a1f8b25718b22874c1fa28500365340f2a1913e2"
      ],
      "author": {
        "name": "Volodymyr Kysenko",
        "email": "vksnk@google.com",
        "time": "Tue Jul 14 00:41:33 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Tue Jul 14 00:42:20 2026"
      },
      "message": "Move scheduler bounds from producer loop splits to consumer inputs.\n\nThis was really a hack ported from the previous matcher which used loop extents to decide when loops should be fused. It really only worked for the specific case of dot(A, pack(B)) and anything even slightly more complex wasn\u0027t fusing (see the test case for an example). I believe we should be overriding the bounds which are used to infer the source regions, which makes it independent of what the producer is.\n\nI ran a whole lot of benchmarks, it\u0027s generally neutral for most of the existing ones, except for a minor gain of ~10% in bench/subgraph:depthwise.\n\nPiperOrigin-RevId: 947335372\n"
    },
    {
      "commit": "a1f8b25718b22874c1fa28500365340f2a1913e2",
      "tree": "5ca88c1a6f72e0aabc47a3b8e2563f14f37867b1",
      "parents": [
        "fdd1e56341e8843e67744e7692ead43cef510926"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Mon Jul 13 22:44:04 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 22:44:45 2026"
      },
      "message": "Update XNNPACK dependencies\n\nCommit history for pytorch/cpuinfo (bc3c01e2 -\u003e ea6b9f1b):\n- d05fbcd5 Ken Unger: Add riscv half-precision floating point detection (#375)\n- e829e80f fbarchard: Adding ro.soc.model support to cpuinfo to detect Qualcomm SM8850 SoC (#381)\n- 3681f0ce Nicolas Pitre: Add Cortex-A320 to MIDR decode table (#384)\n- ea6b9f1b fbarchard: Add Google Tensor SoC detection. (#382)\n\nCommit history for kleidi/kleidiai (51f71905 -\u003e dc50c2e6):\n- 07809921 Declan Cox: feat: Add SME matmul clamp FP32 \u003c- qsi8d32p x qsi4c32p kernels and supporting RHS packing kernels\n- 51b3b669 Emil Ohlsson: docs: Indicate deprecated naming\n- aef56b18 Evie Wright: feat: f32 \u003c- qsi8d32p x qsi4c32p matmul ukernels using advanced SIMD support variable block length\n- 0e6d26d9 Emil Ohlsson: fix: Make kernel commit ZA\n- cd569730 Felix Johnny Thomasmathibalan: feat: Skip CI pipelines for draft merge requests\n- c6b34654 Anitha Raj: feat: Advanced SIMD Matmul Micro-kernels F32 \u003c- QAI8DXP(LHS) x QSU2CXP(RHS)\n- d01637c3 Felix Johnny Thomasmathibalan: docs: Update documentation in the interface file\n- 09bcd563 Patryk Kaiser: feat: Add NxK RHS packers for F32 and F16 Adv SIMD kernels\n- 4581dbcb Patryk Kaiser: chore: Enable overflow protection by default\n- 1cb4545b James Gross: fix: Fix qai8 SME2 test failures when k\u003d1\n- 89a7ef40 Dan Johansson: docs: 3rd party contribution policy\n- fb22f766 Patryk Kaiser: docs: Update changelog with new packing kernels\n- 5866364d Viet-Hoa Do: feat: Implement SME2 FP32 GEMV with 4vsx1 RHS format\n- e3ef8a28 Emil Ohlsson: feat: Add Decomposable GEMM kernels\n- 158dbabb Felix Johnny Thomasmathibalan: feat: Add depthwise indirect kernels and packing for FP16 SME2\n- 147cf38f Emil Ohlsson: chore: Add debug build\n- 29d3e891 Viet-Hoa Do: feat: Implement SME2 static Int8 GEMM and GEMV kernels\n- 41560cc9 Emil Ohlsson: fix: Replace invalid clamp ranges in test file\n- 0b360985 Dan Johansson: fix: Enable random seed for matmul_clamp_f32_qai8dxp_qsi4cxp_test\n- b6aac192 James Gross: fix: Fix get_lhs/rhs_get_offset API to use mr and nr\n- 8d6e32ff Emil Ohlsson: fix: Skip unsupported cases\n- 780a951d Dan Johansson: fix: Reduce scope of public sme1-only testing\n- 4f32151f John McLoughlin: fix: Update api checker to cater for additional source files\n- dc50c2e6 Mohammed Suhail Munshi: chore: Bump version to v1.26.0\n\nCommit history for google/pthreadpool (9003ee6c -\u003e 02460584):\n- 283bff7f Pranav P: Fix endianness issue which fixes deadlock\n- 6bde301a Gregory Comer: Directly join owned threads on cleanup\n- 85d17182 Alexander Shaposhnikov: Update googletest, cpuinfo version.\n- 65eb1fd3 Alexander Shaposhnikov: Add PTHREADPOOL_NO_SANITIZE_FUNCTION macro.\n- a56dcd79 XNNPACK Team: Merge pull request #92 from GregoryComer:win-cleanup-fix\n- 02460584 XNNPACK Team: Merge pull request #88 from pranavkaruvally:s390x-fix\n\nPiperOrigin-RevId: 947284345\n"
    },
    {
      "commit": "fdd1e56341e8843e67744e7692ead43cef510926",
      "tree": "98eae368d71b57478c642b6a88a57168ec20cb81",
      "parents": [
        "5d9f5ae30964fc7a97af6d0e67b591eb492aaf2c"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 22:17:28 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 22:18:19 2026"
      },
      "message": "Added option to enable ynnpack integration with litert\n\nPiperOrigin-RevId: 947270459\n"
    },
    {
      "commit": "5d9f5ae30964fc7a97af6d0e67b591eb492aaf2c",
      "tree": "887c356da4c98c65c263377eafcbe0126c5b3fe3",
      "parents": [
        "bbc8bb6becfde6898a8f168b996c3104dbad1c4f"
      ],
      "author": {
        "name": "Frank Barchard",
        "email": "fbarchard@google.com",
        "time": "Mon Jul 13 21:53:59 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 21:54:47 2026"
      },
      "message": "xx-pad AVX512SKX microkernel with p64_u128\n\nPiperOrigin-RevId: 947258523\n"
    },
    {
      "commit": "bbc8bb6becfde6898a8f168b996c3104dbad1c4f",
      "tree": "bf96a2254d3aca5abc39c1a5a4cb034eddbd90ec",
      "parents": [
        "cdbda0336b251187b0bad12ec310354f8bc84b6f"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Mon Jul 13 19:23:37 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 19:24:18 2026"
      },
      "message": "Add concatenated_mixed_dot_bench\n\nThis benchmark reproduces part of gemma4 E2B\n\nPiperOrigin-RevId: 947179454\n"
    },
    {
      "commit": "cdbda0336b251187b0bad12ec310354f8bc84b6f",
      "tree": "61cc0e721eae7eb41d2c48faf9576817d733ac73",
      "parents": [
        "79dc6c43baf5432dd5c0a10c014e1b8711f3007e"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Mon Jul 13 19:06:20 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 19:07:39 2026"
      },
      "message": "Update Slinky in XNNPACK\n\nCommit history for dsharlet/slinky (dad55945 -\u003e 8480fc77):\n- 22c8c113 Dillon: Avoid unnecessary body traversals in `depends_on` (#846)\n- d9b5f83f Dillon: Fix padded aliases with translation (#847)\n- 50445524 Volodymyr Kysenko: Fix slide_and_fold_storage corrupting sibling crops of a concatenate output. (#850)\n- 76510c89 Dillon: Fix simplify rules and associated test coverage hole (#853)\n- 8480fc77 Dillon:  Refactor `slide_and_fold_storage` (#854)\n\nPiperOrigin-RevId: 947170440\n"
    },
    {
      "commit": "79dc6c43baf5432dd5c0a10c014e1b8711f3007e",
      "tree": "eb298dc5b5567af3b8e073963180fbf2d3cb857a",
      "parents": [
        "894aab8e6bfbfe63f97a0b9d2dd1d11288a18e08"
      ],
      "author": {
        "name": "Quentin Khan",
        "email": "qkhan@google.com",
        "time": "Mon Jul 13 17:55:20 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 17:55:59 2026"
      },
      "message": "Fix warnings about control reaching the end of a non-void function.\n\nThe macro handles all the cases but some older compilers aren\u0027t capable\nof distinguishing this.\n\nPiperOrigin-RevId: 947131102\n"
    },
    {
      "commit": "894aab8e6bfbfe63f97a0b9d2dd1d11288a18e08",
      "tree": "9029e97154892eeb02c411d3a6be34936c6b9151",
      "parents": [
        "d3f0715a6b6a31d5a2b05dbfb16502c4781eee03"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Mon Jul 13 17:45:20 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 17:46:16 2026"
      },
      "message": "Add `convert(ternary(...))` and `ternary(convert(...))` fusion rules\n\nThis handles quantize and dequantize\n\nPiperOrigin-RevId: 947125622\n"
    },
    {
      "commit": "d3f0715a6b6a31d5a2b05dbfb16502c4781eee03",
      "tree": "601018672dea0bc6148fab26d8850d4a0ce675c0",
      "parents": [
        "d5b1ae6977a377fda80794ce2e6b4a10e0a43027",
        "e6f7ecf03677a4cd8a41ffd6f46079537af3f04f"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 17:14:19 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 17:14:19 2026"
      },
      "message": "Merge pull request #10690 from velonica0:rvv-ppmm-new\n\nPiperOrigin-RevId: 947106996\n"
    },
    {
      "commit": "d5b1ae6977a377fda80794ce2e6b4a10e0a43027",
      "tree": "2876ffa787add8d920dd66b966297c3c537f73a0",
      "parents": [
        "64c959c0042e8d1de96820274a171b57ed0c401f",
        "9f2ea13a816867b508c9ccbedd68d27b806b06c9"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 16:48:26 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 16:48:26 2026"
      },
      "message": "Merge pull request #10629 from GregoryComer:bf16-rewrite\n\nPiperOrigin-RevId: 947093062\n"
    },
    {
      "commit": "64c959c0042e8d1de96820274a171b57ed0c401f",
      "tree": "80101629d9126d7c77e2da7c948a9d43e70d59a6",
      "parents": [
        "7365892223cb8b88d6e1f66aecb6665f15d78043"
      ],
      "author": {
        "name": "Quentin Khan",
        "email": "qkhan@google.com",
        "time": "Mon Jul 13 15:06:19 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 15:07:13 2026"
      },
      "message": "Implement conversion over ranges of elements.\n\nThis allows for conversion to and from sub-byte sized elements.\n\nWe introduce a `RangeConversion` helper that defines the conversion\nalgorithm and that is reused by all specializations.\n\nPiperOrigin-RevId: 947045672\n"
    },
    {
      "commit": "7365892223cb8b88d6e1f66aecb6665f15d78043",
      "tree": "c9dc9c0ace5405eb924f078a94aee516c761301a",
      "parents": [
        "225cae503300b991baba200a7f629087b8d55765"
      ],
      "author": {
        "name": "Quentin Khan",
        "email": "qkhan@google.com",
        "time": "Mon Jul 13 14:12:42 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 14:13:40 2026"
      },
      "message": "Add equality operator to int4_t and int2_t.\n\nThis will simplify equality checks in tests.\n\nPiperOrigin-RevId: 947022731\n"
    },
    {
      "commit": "225cae503300b991baba200a7f629087b8d55765",
      "tree": "868e6effea4fb6f0f5b30f964d3995b6c978e365",
      "parents": [
        "1890ca2fd8c29f7a7ae0321ca43050e117c112b9",
        "a3fc80e2be418a1e896215134c98f9564fdd1f8b"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 11:45:45 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 11:45:45 2026"
      },
      "message": "Merge pull request #10700 from aizu-m:fully-connected-input-channel-mismatch\n\nPiperOrigin-RevId: 946958498\n"
    },
    {
      "commit": "1890ca2fd8c29f7a7ae0321ca43050e117c112b9",
      "tree": "2c2df523143e05fb2259ad8d73f22ca80fd30100",
      "parents": [
        "5b5b4a35e39ec52c06d1e5989d684a6ddacb3096",
        "49bd3ab2d3787aa62253ba70b8dd06b720f0a46e"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 11:20:07 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 11:20:07 2026"
      },
      "message": "Merge pull request #10677 from JakeStevens:avx2_bf16_qb4w_gemm\n\nPiperOrigin-RevId: 946948149\n"
    },
    {
      "commit": "5b5b4a35e39ec52c06d1e5989d684a6ddacb3096",
      "tree": "7624695b5573d77da72089a7246feee3771e8988",
      "parents": [
        "f45b3ea023012bd29deaa1900272933bca798ab5"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Mon Jul 13 09:05:09 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Mon Jul 13 09:05:52 2026"
      },
      "message": "Relax tolerance of sum of squares in YNNPACK\n\nThis test is flaky right now\n\nPiperOrigin-RevId: 946887202\n"
    },
    {
      "commit": "f45b3ea023012bd29deaa1900272933bca798ab5",
      "tree": "48240f3908bcb13a8d54ac3f883f8b6bb34ce19e",
      "parents": [
        "43664800cb7fe4b60f30b569012c8699cc81c183"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Sun Jul 12 15:48:16 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Sun Jul 12 15:49:12 2026"
      },
      "message": "Fix int8 -\u003e uint8 rewrite for dots\n\nThe current rewrite is incorrect because it does not account for the change in zero point of the int8 -\u003e uint8 rewrite. We didn\u0027t catch this bug because both XNNPACK and YNNPACK\u0027s tests for statically quantized int8 use an int8 input value, so the rewrite did not occur in these tests.\n\nThis changes the rewrite to work in a few different ways:\n\n1. If the producer of the int8 value is a dynamic quantization, adjust the dynamic quantization to produce uint8 instead of int8.\n2. Otherwise, insert a `requantize_to_int8` op, which may later fuse with a `quantize_int8` if it exists. This rewrite requires the sum of B to account for the change in zero point.\n\nPiperOrigin-RevId: 946592249\n"
    },
    {
      "commit": "43664800cb7fe4b60f30b569012c8699cc81c183",
      "tree": "5a8a6c2152b754e8bd3238ca50dcb59829fe5797",
      "parents": [
        "7e739a47f85e1b4c89e7aba355f40e2d42ce6849"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Sun Jul 12 02:22:47 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Sun Jul 12 02:23:47 2026"
      },
      "message": "Optimize SIMD sub-byte interleave wrappers.\n\nDetails:\n- Optimized 2-bit and 4-bit interleave SIMD wrappers in byte_vec.h and x86_vec128_base.h using delta swaps, reducing operation count.\n- Optimized and corrected u8x8 interleave wrappers in byte_vec.h using delta swaps.\n\nBenchmarks:\n```\nname                                               time/op         time/op     vs base\nbench/transpose_x4_avx512/m:128/n:126/real_time     1.347µ ±  3%   1.339µ ±  4%       ~ (p\u003d0.937 n\u003d6)\nbench/transpose_x4_avx512/m:126/n:128/real_time     1.310µ ±  4%   1.306µ ± 10%       ~ (p\u003d0.589 n\u003d6)\nbench/transpose_x4_avx512/m:128/n:128/real_time     1.272µ ±  1%   1.266µ ±  2%       ~ (p\u003d1.000 n\u003d6)\nbench/transpose_x4_avx512/m:512/n:512/real_time     24.00µ ±  3%   24.49µ ±  1%  +2.05% (p\u003d0.041 n\u003d6)\nbench/transpose_x4_sse2/m:128/n:126/real_time       1.778µ ±  7%   1.693µ ±  8%       ~ (p\u003d0.240 n\u003d6)\nbench/transpose_x4_sse2/m:126/n:128/real_time       1.828µ ±  2%   1.718µ ±  4%  -6.03% (p\u003d0.002 n\u003d6)\nbench/transpose_x4_sse2/m:128/n:128/real_time       1.376µ ±  1%   1.308µ ±  1%  -4.91% (p\u003d0.002 n\u003d6)\nbench/transpose_x4_sse2/m:512/n:512/real_time       27.54µ ±  3%   26.26µ ±  2%  -4.68% (p\u003d0.002 n\u003d6)\nbench/transpose_x2/m:256/n:252/real_time            18.94µ ± 27%   17.40µ ±  3%  -8.09% (p\u003d0.002 n\u003d6)\nbench/transpose_x2/m:252/n:256/real_time            19.54µ ± 39%   17.65µ ±  2%  -9.66% (p\u003d0.002 n\u003d6)\nbench/transpose_x2/m:256/n:256/real_time            18.71µ ± 33%   17.05µ ±  2%  -8.86% (p\u003d0.002 n\u003d6)\nbench/transpose_x2/m:512/n:512/real_time            83.01µ ± 66%   74.81µ ±  4%  -9.87% (p\u003d0.002 n\u003d6)\nbench/transpose_x4/m:128/n:126/real_time            5.493µ ± 32%   5.079µ ±  1%  -7.53% (p\u003d0.002 n\u003d6)\nbench/transpose_x4/m:126/n:128/real_time            5.635µ ±  9%   5.303µ ±  2%  -5.89% (p\u003d0.002 
n\u003d6)\nbench/transpose_x4/m:128/n:128/real_time            5.345µ ±  8%   5.003µ ±  5%  -6.39% (p\u003d0.004 n\u003d6)\nbench/transpose_x4/m:512/n:512/real_time           101.45µ ±  3%   97.23µ ±  2%  -4.16% (p\u003d0.009 n\u003d6)\ngeomean                                            7.293µ         6.923µ        -5.08%\n```\nPiperOrigin-RevId: 946374836\n"
    },
    {
      "commit": "7e739a47f85e1b4c89e7aba355f40e2d42ce6849",
      "tree": "b45d09f915e5b29ea04dfe0f783b4391d7828eb3",
      "parents": [
        "ba90c6597f2e55c4eb7baf65086ef72d40ae451a"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Sat Jul 11 02:33:50 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Sat Jul 11 02:34:45 2026"
      },
      "message": "Add 2- and 4-bit 8-way interleaving kernels\n\nThese kernels are sometimes used by the new 2- and 4-bit dot kernels for packing.\n\nBenchmarks, 2-bit:\n```\n---------------------------------------------------------------------------------------------------------------------\nBenchmark                                                           Time             CPU   Iterations UserCounters...\n---------------------------------------------------------------------------------------------------------------------\nbench/interleave8_x2_avx512/factor:8/m:4/n:8192/real_time       0.020 ns        0.020 ns   6993444864 Bytes\u003d37.7647G/s\nbench/interleave8_x2_avx512/factor:8/m:8/n:8192/real_time       0.010 ns        0.010 ns   13874364416 Bytes\u003d48.679G/s\nbench/interleave8_x2_avx2/factor:8/m:4/n:8192/real_time         0.051 ns        0.051 ns   2722299904 Bytes\u003d14.7926G/s\nbench/interleave8_x2_avx2/factor:8/m:8/n:8192/real_time         0.026 ns        0.026 ns   5444075520 Bytes\u003d19.0996G/s\nbench/interleave8_x2_sse2/factor:8/m:4/n:8192/real_time         0.089 ns        0.089 ns   1573748736 Bytes\u003d8.41797G/s\nbench/interleave8_x2_sse2/factor:8/m:8/n:8192/real_time         0.051 ns        0.051 ns   2910650368 Bytes\u003d9.76529G/s\nbench/interleave8_x2/factor:8/m:4/n:8192/real_time              0.375 ns        0.375 ns    369655808 Bytes\u003d1.99753G/s\nbench/interleave8_x2/factor:8/m:8/n:8192/real_time              0.077 ns        0.077 ns   1748566016 Bytes\u003d6.47937G/s\nbench/interleave_x2/factor:8/m:4/n:8192/real_time               0.891 ns        0.891 ns    155353088 Bytes\u003d841.932M/s\nbench/interleave_x2/factor:8/m:8/n:8192/real_time               0.492 ns        0.492 ns    287113216 Bytes\u003d1.01659G/s\n```\n\nAnd 4-bit:\n```\n---------------------------------------------------------------------------------------------------------------------\nBenchmark                                                           Time   
          CPU   Iterations UserCounters...\n---------------------------------------------------------------------------------------------------------------------\nbench/interleave8_x4_avx512/factor:8/m:4/n:8192/real_time       0.037 ns        0.037 ns   3156148224 Bytes\u003d40.0814G/s\nbench/interleave8_x4_avx512/factor:8/m:8/n:8192/real_time       0.020 ns        0.020 ns   6919684096 Bytes\u003d50.5271G/s\nbench/interleave8_x4_avx2/factor:8/m:4/n:8192/real_time         0.073 ns        0.073 ns   1950810112 Bytes\u003d20.5901G/s\nbench/interleave8_x4_avx2/factor:8/m:8/n:8192/real_time         0.036 ns        0.036 ns   3793813504 Bytes\u003d27.6633G/s\nbench/interleave8_x4_sse2/factor:8/m:4/n:8192/real_time         0.082 ns        0.082 ns   1729003520 Bytes\u003d18.199G/s\nbench/interleave8_x4_sse2/factor:8/m:8/n:8192/real_time         0.044 ns        0.044 ns   3195928576 Bytes\u003d22.709G/s\nbench/interleave8_x4/factor:8/m:4/n:8192/real_time              0.220 ns        0.220 ns    641138688 Bytes\u003d6.82579G/s\nbench/interleave8_x4/factor:8/m:8/n:8192/real_time              0.110 ns        0.110 ns   1273692160 Bytes\u003d9.09961G/s\nbench/interleave_x4/factor:8/m:4/n:8192/real_time                1.04 ns         1.04 ns    131432448 Bytes\u003d1.44017G/s\nbench/interleave_x4/factor:8/m:8/n:8192/real_time               0.585 ns        0.585 ns    247988224 Bytes\u003d1.71044G/s\n```\n\nPiperOrigin-RevId: 946006797\n"
    },
    {
      "commit": "ba90c6597f2e55c4eb7baf65086ef72d40ae451a",
      "tree": "54e123171400271bc79b1de8f9970eb15cfd3260",
      "parents": [
        "d0133af81ecae71b0457c69690a618a5e925c56a"
      ],
      "author": {
        "name": "Volodymyr Kysenko",
        "email": "vksnk@google.com",
        "time": "Fri Jul 10 21:40:48 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 10 21:41:41 2026"
      },
      "message": "Add benchmarks for various attention cases using ynnpack API to define the graphs.\n\nThis includes a couple of different variations:\n* T \u003e\u003e 1 (i.e. prefill case)\n* T \u003d\u003d 1 (i.e. decode case)\n* two of the above, but the inputs and output are transposed\n* T \u003d\u003d 1, but the graph is specialized for the decode case.\n\nPiperOrigin-RevId: 945898685\n"
    },
    {
      "commit": "d0133af81ecae71b0457c69690a618a5e925c56a",
      "tree": "0c1dc6fb6b09f02fabf063ecbd175d48a402b335",
      "parents": [
        "be502dd139a9b2980f4eb021a72e1c1477397c9c"
      ],
      "author": {
        "name": "Volodymyr Kysenko",
        "email": "vksnk@google.com",
        "time": "Fri Jul 10 18:00:53 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 10 18:01:46 2026"
      },
      "message": "Add x86 AVX2/FMA3 FP32 GEMV-M dot kernels.\n\nThese kernels are specialized for the case where the second operand has only one column (n\u003d1). Unlike other dot kernels which vectorize along \u0027n\u0027, these kernels vectorize along \u0027k\u0027 and perform a horizontal reduction to compute the scalar dot product for each row. This avoids wasting SIMD lanes when n is small, improving efficiency for GEMV operations.\n\nPiperOrigin-RevId: 945796777\n"
    },
    {
      "commit": "be502dd139a9b2980f4eb021a72e1c1477397c9c",
      "tree": "298dca3f9979206b63ab81e6c39ed4cbafe50fcb",
      "parents": [
        "ae5e2eac7740660522dbd9a263133ae243e22994"
      ],
      "author": {
        "name": "Quentin Khan",
        "email": "qkhan@google.com",
        "time": "Fri Jul 10 16:22:04 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 10 16:23:00 2026"
      },
      "message": "Fix allocation behaviour in OwningCpuBuffer::(Copy|Transform).\n\nThe number of elements passed to `Allocate` was the size of the input\nsequence. For types that pack multiple elements (`int2_t`, `int4_t`)\nthis means that the count is wrong.\n\nTo fix this, we add a `kNumElements` member to `StorageImpl` and use it\nto scale the container size.\n\nPiperOrigin-RevId: 945746661\n"
    },
    {
      "commit": "ae5e2eac7740660522dbd9a263133ae243e22994",
      "tree": "0438d616d18d3b48a79c43214cfdafb442d2dfba",
      "parents": [
        "efa2e754e390ab3024f6aa617385cb28bdfa9969"
      ],
      "author": {
        "name": "Quentin Khan",
        "email": "qkhan@google.com",
        "time": "Fri Jul 10 15:03:30 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 10 15:04:35 2026"
      },
      "message": "Suppress linter warning for TensorHandle error status implicit construction.\n\nPiperOrigin-RevId: 945714801\n"
    },
    {
      "commit": "efa2e754e390ab3024f6aa617385cb28bdfa9969",
      "tree": "a261c5941227d08aeabac7b54d5917518e1ad596",
      "parents": [
        "17b23923d58ff9c96aef0626e96b5292150e5124"
      ],
      "author": {
        "name": "Misha Gutman",
        "email": "aelphy@google.com",
        "time": "Fri Jul 10 12:05:35 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Fri Jul 10 12:06:20 2026"
      },
      "message": "Expose native FP16 support query in XNNPACK.\n\nPiperOrigin-RevId: 945649583\n"
    },
    {
      "commit": "a3fc80e2be418a1e896215134c98f9564fdd1f8b",
      "tree": "71bb14c59c47fac6619976ee4e5683f4b6d08c12",
      "parents": [
        "47f738f4519e1a0454252a365e4d4cdf7431dfa9"
      ],
      "author": {
        "name": "Aizal Khan",
        "email": "aizumusheer2@gmail.com",
        "time": "Fri Jul 10 05:16:40 2026"
      },
      "committer": {
        "name": "Aizal Khan",
        "email": "aizumusheer2@gmail.com",
        "time": "Fri Jul 10 05:16:40 2026"
      },
      "message": "reject input channel mismatch in fully-connected reshape\n"
    },
    {
      "commit": "9f2ea13a816867b508c9ccbedd68d27b806b06c9",
      "tree": "ea58e29ab054f62f878b7648a43592ba57ee85d0",
      "parents": [
        "cb9c05adaf87a43c96a74c2a95d11935037ceb9c"
      ],
      "author": {
        "name": "Gregory Comer",
        "email": "gjcomer@meta.com",
        "time": "Thu Jul 09 20:31:56 2026"
      },
      "committer": {
        "name": "Gregory Comer",
        "email": "gjcomer@meta.com",
        "time": "Thu Jul 09 20:31:56 2026"
      },
      "message": "Add :math dep\n"
    },
    {
      "commit": "17b23923d58ff9c96aef0626e96b5292150e5124",
      "tree": "fc605b1cdcd2a9019ecf527fdf8a672cf9a1ed5b",
      "parents": [
        "47f738f4519e1a0454252a365e4d4cdf7431dfa9"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Thu Jul 09 20:12:26 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Thu Jul 09 20:13:09 2026"
      },
      "message": "Fix bug where static tensors are incorrectly deduplicated\n\nThe `YNN_VALUE_FLAG_COPY_DATA_FP32` flag implementation had a bug, where it would deduplicate tensors if their data was equivalent, ignoring the need for converting the data.\n\nPiperOrigin-RevId: 945278582\n"
    },
    {
      "commit": "cb9c05adaf87a43c96a74c2a95d11935037ceb9c",
      "tree": "88ca129dea0b798baafce7b619662cdcb71bbc59",
      "parents": [
        "0d05b116d72ff4286ce64c4c08e4ee9a854a9024"
      ],
      "author": {
        "name": "Gregory Comer",
        "email": "gjcomer@meta.com",
        "time": "Thu Jul 09 07:38:48 2026"
      },
      "committer": {
        "name": "Gregory Comer",
        "email": "gjcomer@meta.com",
        "time": "Thu Jul 09 07:38:48 2026"
      },
      "message": "Update GN build and fix bf16 sqrt tolerance\n"
    },
    {
      "commit": "e6f7ecf03677a4cd8a41ffd6f46079537af3f04f",
      "tree": "706eff95f14fa3a37e4c7a6aecdc71008136c73d",
      "parents": [
        "b3d2c9adb6303d3102ddfd1dc97f56c009b95a27"
      ],
      "author": {
        "name": "velonica0",
        "email": "like@mail.nankai.edu.cn",
        "time": "Tue Jul 07 02:14:27 2026"
      },
      "committer": {
        "name": "velonica0",
        "email": "like@mail.nankai.edu.cn",
        "time": "Thu Jul 09 01:28:17 2026"
      },
      "message": "[RVV] add rvv f32 kernel for ppmm\n"
    },
    {
      "commit": "47f738f4519e1a0454252a365e4d4cdf7431dfa9",
      "tree": "80659dd1d4cfbf1c7e6c55f2a4e346e3567d698f",
      "parents": [
        "203a0ad8d823a2a02593651f3f9bfebc96c9af78"
      ],
      "author": {
        "name": "Frank Barchard",
        "email": "fbarchard@google.com",
        "time": "Wed Jul 08 20:59:42 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 08 21:01:16 2026"
      },
      "message": "Fix wasm undeclared identifier \u0027XNN_SIMD_NUM_RCP_ITER_F32\u0027\n\nPiperOrigin-RevId: 944685467\n"
    },
    {
      "commit": "203a0ad8d823a2a02593651f3f9bfebc96c9af78",
      "tree": "9e987876f3ab2e89684edb1e3509565420d5b445",
      "parents": [
        "316f4cfab5a877dd512f678b9b20ffe1b83d0460"
      ],
      "author": {
        "name": "Frank Barchard",
        "email": "fbarchard@google.com",
        "time": "Wed Jul 08 20:39:43 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 08 20:40:31 2026"
      },
      "message": "Fix SEH access violation in static broadcast optimization by allocating XNN_EXTRA_BYTES.\n\nPiperOrigin-RevId: 944674440\n"
    },
    {
      "commit": "316f4cfab5a877dd512f678b9b20ffe1b83d0460",
      "tree": "0f266eee2994420e42db7dcef5cee91d2c0c7cc3",
      "parents": [
        "303b083a93af4f05ee3504e22c1c8467dc12286e",
        "dfde8dc943d29f3621fdf72e70d6b5cdb05d71d3"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 08 19:13:22 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 08 19:13:22 2026"
      },
      "message": "Merge pull request #10682 from velonica0:rvv-ibilinear-new\n\nPiperOrigin-RevId: 944630470\n"
    },
    {
      "commit": "303b083a93af4f05ee3504e22c1c8467dc12286e",
      "tree": "eec14822af91305f8be60e7104381a2273b1a604",
      "parents": [
        "4906b934053c4df4550765538d3182122b07db9b"
      ],
      "author": {
        "name": "Quentin Khan",
        "email": "qkhan@google.com",
        "time": "Wed Jul 08 19:04:43 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 08 19:05:23 2026"
      },
      "message": "Overload `XnnpackRunner::(Set|Write)Input` for contiguous data containers.\n\nThe new overload will catch any type that has a `data()` member and\ncreate a `byte` span from it.\n\nPiperOrigin-RevId: 944626796\n"
    },
    {
      "commit": "4906b934053c4df4550765538d3182122b07db9b",
      "tree": "2ac36c4ecbe7b6af0d36f11dd13282e34d7bca35",
      "parents": [
        "83b8931629792c1e2fcfaf0138cac1ff8df282a4"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Wed Jul 08 17:32:58 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 08 17:34:09 2026"
      },
      "message": "Use transpose kernels for all transpose ops\n\nCurrently, if see a transpose that looks like it only transposes dimensions other than the stride 1 dimensions, we use `slinky::copy`. However, this misses a lot of cases where we should use transpose kernels. Consider the following example: A is a tensor with 8 bit elements of shape `[X, Y, Z, 4]`, and we transpose it with permutation `{0, 2, 1, 3}` to `[X, Z, Y, 4]`. This is equivalent to transposing a tensor with 32-bit elements of shape `[X, Y, Z]` to `[X, Z, Y]`. Currently, we would use a fast transpose kernel for the latter case, but not the former.\n\nThis change refactors the implementation of static_transpose to always use transpose kernels by fusing such dimensions with the element size before finding the kernel to use.\n\nPiperOrigin-RevId: 944575562\n"
    },
    {
      "commit": "83b8931629792c1e2fcfaf0138cac1ff8df282a4",
      "tree": "57be6ac10b8fbf41d7abe5ead5b04efcbdc7428f",
      "parents": [
        "b3d2c9adb6303d3102ddfd1dc97f56c009b95a27"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Wed Jul 08 16:59:38 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 08 17:00:15 2026"
      },
      "message": "Add transpose subgraph benchmark\n\nThis exposes that we fail to use transpose kernels in many cases that we could do so, because the innermost dimension is not transposed and could be treated as a larger element instead.\n\nPiperOrigin-RevId: 944557285\n"
    },
    {
      "commit": "dfde8dc943d29f3621fdf72e70d6b5cdb05d71d3",
      "tree": "88f5826600ad78364d70ccab637b860c260dd8b5",
      "parents": [
        "b3d2c9adb6303d3102ddfd1dc97f56c009b95a27"
      ],
      "author": {
        "name": "velonica0",
        "email": "like@mail.nankai.edu.cn",
        "time": "Tue Jul 07 01:58:32 2026"
      },
      "committer": {
        "name": "velonica0",
        "email": "like@mail.nankai.edu.cn",
        "time": "Wed Jul 08 13:57:19 2026"
      },
      "message": "[RVV] add rvv f32 kernel for ibilinear\n"
    },
    {
      "commit": "b3d2c9adb6303d3102ddfd1dc97f56c009b95a27",
      "tree": "ffb1485f1cbc2a6011d0ec63aff9d992386d892b",
      "parents": [
        "c7361e45db682dea6cbca98b4ce5a719e545aade",
        "fcd4329f85b46e30837d78c0a93fb4c0120aeed0"
      ],
      "author": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 08 03:03:38 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 08 03:03:38 2026"
      },
      "message": "Merge pull request #10656 from yolanda15:update_f32_gemm\n\nPiperOrigin-RevId: 944231519\n"
    },
    {
      "commit": "c7361e45db682dea6cbca98b4ce5a719e545aade",
      "tree": "49d656407b8709c48751cb8d7cbee5f4f0dc6674",
      "parents": [
        "1db304f21e33703aa377155021bc5264c1908812"
      ],
      "author": {
        "name": "Dillon Sharlet",
        "email": "dsharlet@google.com",
        "time": "Wed Jul 08 01:47:40 2026"
      },
      "committer": {
        "name": "XNNPACK Team",
        "email": "xnnpack-github-robot@google.com",
        "time": "Wed Jul 08 01:48:36 2026"
      },
      "message": "Add a memory bandwidth measurement to transpose benchmarks\n\nPiperOrigin-RevId: 944209195\n"
    }
  ],
  "next": "1db304f21e33703aa377155021bc5264c1908812"
}