chore: merge upstream MNN 3.4.1 #1
Merged
pruthvikar merged 358 commits into cleanup/remove-unused-apps-and-projects from Mar 26, 2026
[BugFix] fix a bug in compute mGroupWithComputeRate GitOrigin-RevId: 0a30b5c040bc34aff1de94e7fa571ebb8f2c20fa
Feature/smallmodel opt GitOrigin-RevId: 5610add6e64c6d49f8b984d0d744c85f206f2be7
Title: [Metal Feature] check UI Status for metal command commit This code review mainly adds execution-status checks and error handling, and introduces a new logging method to improve debugging and monitoring. Link: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/24965986 GitOrigin-RevId: b7ad051c324c1b7d4aa231fc062f2f5d8e7f7a0f
Title: [Bugfix:CI] Fix duplicate msg when sync to github. This change adds a feature to the `copybara_sync.sh` script that detects and skips commits imported from GitHub: it identifies commits containing `GitOrigin-RevId` to determine the last sync point, and starts syncing from the first non-imported commit after that point. Link: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25060238 GitOrigin-RevId: ca65f11f52c1b76a826cbdc260a063d1467a8f35
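The sync-point rule described in this commit message can be sketched as follows. This is only an illustration in Python, not the contents of `copybara_sync.sh`: the function name, the `(sha, message)` input shape, and the behaviour when no imported commit exists are all assumptions.

```python
from typing import List, Optional, Tuple

# Marker that imported commits carry in their message (taken from the commit text).
MARKER = "GitOrigin-RevId:"


def first_commit_to_sync(commits: List[Tuple[str, str]]) -> Optional[str]:
    """Pick the commit at which the next sync should start.

    `commits` is a hypothetical list of (sha, message) pairs, oldest first.
    The most recent commit whose message contains the marker is treated as
    the last sync point; the commit right after it is the first one to sync.
    Returns None when there is nothing new to sync.
    """
    last_imported = -1
    for index, (_, message) in enumerate(commits):
        if MARKER in message:
            last_imported = index

    # Assumption: with no imported commit at all, sync starts at the oldest commit.
    next_index = last_imported + 1
    if next_index >= len(commits):
        return None
    return commits[next_index][0]
```

Every commit after the last marked one is by construction non-imported, so taking the next entry is enough; a real script would read the history from `git log` rather than from a list.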
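Title: [feature:opencl] OpenCL supports storing weights in a single file. This code review mainly optimizes the OpenCL backend: it introduces `MmapPool` to support memory-mapped pool management, adds checks and handling for memory-mapping errors in multiple execution units, and adjusts some data transfer and conversion logic to improve performance and stability. Link: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/24702294 GitOrigin-RevId: 1a83d5da23cbb011d0cf522cdc6d49f5778c0999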
…l_opt Feature/reduce conv small opt
GitOrigin-RevId: 59c693e6995609611fb7197f2288cc929370bdd6
opt(RVV): Optimize blitter functions with intrinsics GitOrigin-RevId: 880fb7a3a8e93edb188ef4804f24cd88ea29c76c
opt(RVV): Optimize resize functions with intrinsics GitOrigin-RevId: c9e9ac1362e1613acb11e924268b4e1284c9f142
opt(RVV): Optimize top1 functions with intrinsics GitOrigin-RevId: fc3cad1eae2ea3c93fe34b8bfec58a2f7201de9c
opt(RVV): Optimize Softmax and ReluWithSlopeChannel with intrinsics GitOrigin-RevId: bb4fb7cd6ac13a67582556277c91d38a958f2da8
opt(RVV): Optimize conv and strassen functions with intrinsics GitOrigin-RevId: 4c8794c50d00acf88baee977d20694a3f9b8b1cf
opt(RVV): Optimize max and min float functions with intrinsics GitOrigin-RevId: d246089d9de5602aeb58e91d1169923d58ed9712
opt(RVV): Optimize core math and stride functions with intrinsics GitOrigin-RevId: 767fdd24db8ead2a04086edab37f3785dd0e80df
…nctions opt(RVV): Optimize transpose functions with intrinsics GitOrigin-RevId: e643e7c3e1cda978161cc3921355cb4d1d3eec69
opt(RVV): Optimize pack and unpack functions with intrinsics GitOrigin-RevId: d786e4f5f353fa1e29319783f0bf7c3d2df00eb7
fix(diffusion): simplify export logic and fix dynamic axes GitOrigin-RevId: 24b3e6fb92a32c193260fc39d82e70e70abba762
Automated build script for the MNN libraries GitOrigin-RevId: cb0a6d77c72cf6c04cd256355dd5989460821ceb
Add a compile option and macro to default enable kleidiAI GitOrigin-RevId: 96323077925a4788927649b4d262dc3d8288a66d
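Title: [Doc:Update] update dingtalk in README. The main change in this code review updates the DingTalk group information in the README, including the group numbers and their status, and removes some outdated information. Link: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25029869 GitOrigin-RevId: da3eed28af8d3cf35cd2578f76ad40d75f00158b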
[BugFix] fix a bug in compute mGroupWithComputeRate GitOrigin-RevId: 5d5b47cfb2c6278818dde17c4efc8ffbbb9b779a
Feature/smallmodel opt GitOrigin-RevId: 99ed7ba5bc1eefac17236785e3fadde5d0f372e8
Title: [Metal Feature] check UI Status for metal command commit This code review mainly adds execution-status checks and error handling, and introduces a new logging method to improve debugging and monitoring. Link: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/24965986 GitOrigin-RevId: 7b85c75bae9cab7744a249ff10510e581c808e94
Title: [Bugfix:CI] Fix duplicate msg when sync to github. This change adds a feature to the `copybara_sync.sh` script that detects and skips commits imported from GitHub: it identifies commits containing `GitOrigin-RevId` to determine the last sync point, and starts syncing from the first non-imported commit after that point. Link: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25060238 GitOrigin-RevId: 9caa49c6127112c4dc317584143e8ef041bab77d
GitOrigin-RevId: 804774d5836618d85384e4f0ce815ec94dec02de
Title: [Attention Feature] Support metal flash attention with lower memory and speedup The main changes in this code review introduce new `flash_softmax`, `flash_matmul_qkv`, `flash_scale`, and `flash_attention_fused` kernel functions to optimize attention computation, adjust the related parameters and buffer-management logic, and update mask handling to support more flexible control over the key/value sequence length. Link: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25100824 GitOrigin-RevId: 4bec555e71af031749770b66a29c9fb5f28f438e ORIGINAL_AUTHOR=MNNSyncBot <hi@zhaode.wang>
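For readers unfamiliar with why flash attention lowers memory use, here is a minimal sketch of the general online-softmax idea that flash-attention implementations commonly rely on. It is plain Python for a single query vector, not the Metal kernels named above; the function names, the block size, and the omission of the usual score scaling and masking are all simplifying assumptions.

```python
import math
from typing import List, Sequence


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def attention_reference(q, keys, values) -> List[float]:
    """Standard softmax attention for one query: materializes all scores."""
    scores = [_dot(q, k) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]


def attention_streaming(q, keys, values, block_size: int = 2) -> List[float]:
    """Same result, but keys/values are consumed block by block.

    Only a running max, a running sum, and one accumulator vector are kept,
    so the full score row never has to be stored.
    """
    running_max = -math.inf
    running_sum = 0.0
    acc = [0.0] * len(values[0])
    for start in range(0, len(keys), block_size):
        block_k = keys[start:start + block_size]
        block_v = values[start:start + block_size]
        scores = [_dot(q, k) for k in block_k]
        new_max = max(running_max, max(scores))
        # Rescale what was accumulated so far to the new reference maximum.
        rescale = math.exp(running_max - new_max)  # 0.0 for the first block
        running_sum *= rescale
        acc = [a * rescale for a in acc]
        for score, value in zip(scores, block_v):
            weight = math.exp(score - new_max)
            running_sum += weight
            acc = [a + weight * x for a, x in zip(acc, value)]
        running_max = new_max
    return [a / running_sum for a in acc]
```

Both functions should agree up to floating-point rounding; the streaming version trades a few extra multiplications for not having to hold every score at once.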
…gfix mask shape. 1. LLM's mask is scalar when mask is lower triangular and use cpu backend. 2. Bugfix CPU LLM supports any shape mask. Signed-off-by: jingbang.yjb <jingbang.yjb@alibaba-inc.com> Discussed-in: Merge-Request 25279138 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25279138 GitOrigin-RevId: ce5d6d2fa7fbebd151b68c021dce9af98b383cdb
Discussed-in: Merge-Request 25804622 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25804622 GitOrigin-RevId: 7ebc965b93a9fd1e8a54f53c05bc6bdb37a432b4
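This code review mainly refactors the diffusion model engine, introduces the new Sana Diffusion model, and optimizes the existing Stable Diffusion model, including a unified generation interface definition, new factory methods for creating the different types of diffusion model instances, and the corresponding demo programs and documentation. Link: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25760822 * feat: add sana diffusion, refactor code * chore: keep the run interface, keep diffusion_demo unchanged * docs: update diffusion usage * chore: drop tokenizer.cpp * fix: sana_llm.hpp includes llm.hpp via a package-style include, otherwise downstream builds fail * fix: fix the sana_llm.hpp include issue * fix: include sana_llm.hpp when packaging the framework GitOrigin-RevId: 00b1fe902ce163a6a567169b1f964f541aeba2ee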
GitOrigin-RevId: dbfc4a7db7d3ad4fea182aac413fe5f0bf2db031
Discussed-in: Merge-Request 25729546 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25729546 GitOrigin-RevId: 6bd567d39b386132e094db11e44d8434b6681fa0
Discussed-in: Merge-Request 25913990 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25913990 GitOrigin-RevId: 290df25642c6fc47bb49f7facdf42e81b5e0cd41
Discussed-in: Merge-Request 25939666 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25939666 GitOrigin-RevId: fdc2f722daa7e156757279bad607bdb946de416c
Discussed-in: Merge-Request 25961499 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25961499 GitOrigin-RevId: af16a87d90931191281da636738dccff171c7f3e
[VULKAN] Support configuring coopMat when creating VkDevice. Optimize codes related to creating VkDevice. [VULKAN] Support setting extra spec consts when creating compute pipelines. [VULKAN][BUFFER] Support using coopMat in Conv1x1. [VULKAN][BUFFER] Add device check to coopMat branch conditions. Modify local size setting for shaders in VulkanConv1x1Coop. [VULKAN][BUFFER] Support onClone in <VulkanConv1x1Coop>. And check subgroupSize before creating Conv Op. [VULKAN][BUFFER] Update compile results. Discussed-in: Merge-Request 25276376 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25276376 GitOrigin-RevId: 1e4249e3b8d3b7a81057d91717cdabd2509d6c73
Discussed-in: Merge-Request 25982500 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25982500 GitOrigin-RevId: b0b36b349c52bf8f1fcd2bd5172c4afab9a4f810
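Title: [LLM:Bugfix] Fix HQQ OOM and Embedding overflow. Fixes two issues found while adding Qwen3.5-27B support: 1. HQQ quantization optimization: - Fixed running out of GPU memory (OOM) when quantizing the `lm_head` weights on a single 3090. - Solution: use a chunk-based quantization strategy to lower peak GPU memory usage. 2. DiskEmbedding overflow fix: - Fixed the integer overflow caused by storing offsets in the default `int` type when the vocabulary is large (e.g. 240,000 entries). - Solution: change the types of the relevant index and offset variables to `size_t`. Link: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/26078666 GitOrigin-RevId: aa8545496f2c96913afc3217d5949de0930b8f3d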
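The DiskEmbedding overflow can be illustrated with a little arithmetic. The sketch below is Python, so the 32-bit wrap-around is simulated by hand; the hidden size and bytes-per-element values are hypothetical and chosen only to show the effect, and the helper names are not from MNN.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Simulate storing `value` in a 32-bit two's-complement signed int."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def row_offset(token_id: int, hidden_size: int, bytes_per_element: int) -> int:
    """Byte offset of one embedding row in a flat on-disk table."""
    return token_id * hidden_size * bytes_per_element


vocab_size = 240_000      # order of magnitude mentioned in the commit message
hidden_size = 5120        # hypothetical
bytes_per_element = 2     # hypothetical (16-bit weights)

last_row = row_offset(vocab_size - 1, hidden_size, bytes_per_element)
overflows = last_row > INT32_MAX       # the true offset no longer fits in 32 bits
as_int32 = wrap_int32(last_row)        # what a wrapped 32-bit value would hold
```

Under these assumed sizes the offset of the last row exceeds the 32-bit signed range, which is why widening the index and offset variables to `size_t` removes the problem on 64-bit targets.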
…rators. Discussed-in: Merge-Request 26095381 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/26095381 GitOrigin-RevId: 024a6654879090799368764bd2b2e0a23c9cb428
…libaba#4189) * [feat] use new markdown view and streaming message * [feat] support useMarkdown * [update] preview message * [reformat] formate with swift format * [refactor]: change DispatchQueue.main.async to MainActor.run * [fix] send sequence * [feat] MTL config, batch file test and support local models - change config to support MTL - support more local models - support batch file test * [update] gitignore * [add] local model json * [add] local batch test * [feat] support text, image and audio batch test * [refactor] batch test view and model * [update] localizations * [feat] support video input * [feat] support backend, precision and thread config * feat: Add multimodal processing support and configuration options * update: support switch use multimodal prompt API * update: support video with imgs * feat: support audio output * delete: unused code and comments * feat: support sana diffusion * feat: add support for Sana Diffusion style transfer model # Conflicts: # apps/iOS/MNNLLMChat/MNNLLMiOS/Chat/ViewModels/LLMChatViewModel.swift # apps/iOS/MNNLLMChat/MNNLLMiOS/Chat/Views/LLMChatView.swift # apps/iOS/MNNLLMChat/MNNLLMiOS/Chat/Views/ModelSettingsView.swift # apps/iOS/MNNLLMChat/MNNLLMiOS/Localizable.xcstrings # apps/iOS/MNNLLMChat/MNNLLMiOS/MainTab/ModelList/Models/ModelListViewModel.swift # apps/iOS/MNNLLMChat/MNNLLMiOS/Service/Util/AssetExtractor.swift * refactor: remove cfgPrompt from style transfer and update prompt processing - Removed cfgPrompt parameter from runStyleTransfer and related methods. - Updated processPrompt to processSinglePrompt for improved clarity and functionality. - Added benchmark result saving functionality to track performance metrics. - Cleaned up unused code and comments related to cfgPrompt. 
* feat: use sana llm as engine * update: readme version * update: iOS backend * update: Chat Package * update: set iterations to 10 * update: set default seed to 42 for sana diffusion * update: load diffusion on background thread * update: sana diffusion api * feat: rotate the input image according to its EXIF information * feat: show diffusion progress * add: diffusion total cost time * update: Chat version * update: change image and text position --------- Co-authored-by: 游薪渝(揽清) <azure.yxy@alibaba-inc.com> Co-authored-by: 蔚山 <weishan.wyf@alibaba-inc.com>
…_bench Discussed-in: Merge-Request 26143023 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/26143023 GitOrigin-RevId: cfe809ec16e10ff1ff6bf16a75daede9cd6bd50c
Title: [LLM:Fix] Fix JSON merge issue in merge_and_clear for jinja config Moves the `merge_json` function from `llmconfig.cpp` to `llmconfig.hpp`, defines it as a `static inline` function, and updates its implementation in `llmconfig.hpp` to support recursively merging JSON objects. Link: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/26158563 GitOrigin-RevId: 583824991d06804eaf75e37471918fbf1215eb70
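To show what recursively merging JSON objects means in practice, here is a minimal Python sketch. It is not the C++ `merge_json` from `llmconfig.hpp`; the merge rule (nested objects merged key by key, every other value overwritten) and the config keys in the usage note are assumptions made for illustration.

```python
from typing import Any, Dict


def merge_json(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Merge `override` into `base` in place and return `base`.

    When both sides hold an object under the same key, the two objects are
    merged recursively; otherwise the value from `override` wins.
    """
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge_json(base[key], value)
        else:
            base[key] = value
    return base
```

With a shallow merge, overriding one field of a nested section such as a hypothetical `{"jinja": {"eos": "..."}}` would replace the whole `jinja` object and drop its other fields; the recursive version keeps them.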
Discussed-in: Merge-Request 25973150 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/25973150 GitOrigin-RevId: 83adccd7e28417c68280e8b4c8d0392816cee994
Discussed-in: Merge-Request 26168056 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/26168056 GitOrigin-RevId: 0506c76c610e1b297c38e8fc9e237e31ea7a36ac
Discussed-in: Merge-Request 26193851 , URL: https://code.alibaba-inc.com/AliNN/AliNNPrivate/codereview/26193851 GitOrigin-RevId: bec038578d1f788fea87657c01f660e4282eee69
Merge alibaba/MNN tag 3.4.1 into fork, preserving the cleanup of removed apps/ and project/ directories.
Key improvements from upstream:
- Metal TensorAPI support (M-series perf boost)
- CPU MatMul/LayerNorm/broadcast optimization + ThreadPool overhead reduction
- KleidiAI fp32 depthwise conv kernels (ARM)
- Loop Op GPU optimization (Metal/OpenCL) - pure GPU path
- Metal INT8/INT4 Conv2D fix
- 7 memory safety fixes in shape/execution operators
- Vulkan CoopMat Conv1x1 acceleration
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Seems you are using me but didn't get OPENAI_API_KEY set in Variables/Secrets for this repo. You could follow the README for more information.
pruthvikar added a commit to getcarv/mnn-sys that referenced this pull request on Mar 26, 2026
Update MNN submodule from post-3.3.0 (a5d3b04) to 3.4.1 merge (a1803f7). Depends on: getcarv/MNN#1 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
8f9264c
into
cleanup/remove-unused-apps-and-projects
1 check passed
What
apps/ and project/ directories
Why
Upstream MNN 3.4.0 and 3.4.1 contain significant performance and stability improvements relevant to our inference workloads:
Performance (3.4.0)
Stability (3.4.1)
How
git merge 3.4.1: all 333 conflicts were modify/delete in apps/ and project/ (files deleted by cleanup, modified by upstream)
git rm on all conflicted files + removed any new upstream additions to those dirs
Testing
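One way to check, before running `git rm`, that every conflict really is a modify/delete inside the removed directories is to classify the output of `git status --porcelain`. The sketch below is an assumption-laden illustration in Python, not the procedure actually used for this PR: the function name is hypothetical, it assumes the merge is run from the fork branch (so the deleted side is "ours" and the status code is `DU`), and it ignores the quoting git applies to unusual path names.

```python
from typing import Iterable, List, Tuple

# Directories removed by the cleanup branch (from the PR description).
REMOVED_DIRS = ("apps/", "project/")

# Two-letter codes that `git status --porcelain` uses for unmerged paths.
UNMERGED = {"DD", "AU", "UD", "UA", "DU", "AA", "UU"}


def split_conflicts(porcelain_lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split unmerged paths into (safe_to_remove, needs_review).

    A path counts as safe when it was deleted on our side, modified upstream
    (code `DU`), and lives under one of the removed directories.
    """
    safe: List[str] = []
    review: List[str] = []
    for line in porcelain_lines:
        status, path = line[:2], line[3:]
        if status not in UNMERGED:
            continue  # not a conflict, nothing to decide
        if status == "DU" and path.startswith(REMOVED_DIRS):
            safe.append(path)
        else:
            review.append(path)
    return safe, review
```

If `needs_review` comes back empty, the conflicted paths match the pattern described in this PR and can be passed to `git rm`; anything left in it would need manual resolution.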
🤖 Generated with Claude Code