
Added Qwen3.5 0.8B #657

Draft
FarmersWrap wants to merge 3 commits into UbiquitousLearning:main from FarmersWrap:feat/qwen3.5-qnn

Conversation


@FarmersWrap FarmersWrap commented Mar 22, 2026

Please check Guidelines for Contributing.

Summary by CodeRabbit

  • New Features
    • Added support for Qwen3.5 model with hybrid attention architecture combining linear-attention GDN layers and full-attention blocks.
    • Introduced interactive CLI example for loading and chatting with Qwen3.5.
    • Added tokenizer implementation for Qwen3.5 with SentencePiece BPE integration.
    • Extended QNN AOT optimization with RoPE and SiLU operation handlers.


coderabbitai bot commented Mar 22, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d7893796-2325-4e49-bb38-22d788da2387

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough


This PR introduces comprehensive support for the Qwen3.5 language model, including model configuration, architecture with hybrid attention mechanisms (full attention and GDN layers), tokenization, and a CLI example. It also extends the QNN AOT backend with new lowering patterns for SiLU and RoPE operations.

Changes

Cohort / File(s) Summary
QNN AOT Lowering Visitors
mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp, mllm/backends/qnn/aot/visitor/SiLU.*, mllm/backends/qnn/aot/visitor/RoPE.*
Added QNN AOT pattern handlers for SiLU and RoPE operations. SiLU decomposes into Sigmoid and ElementWiseMultiply nodes; RoPE creates a custom HTP node with pose_type parameter. Patterns are registered in the lowering pass registry.
Qwen3.5 Model Architecture
mllm/models/qwen3_5/configuration_qwen3_5.hpp, mllm/models/qwen3_5/modeling_qwen3_5.hpp, mllm/models/qwen3_5/modeling_qwen3_5_qnn_aot.hpp
Introduced Qwen3.5 configuration parsing (hybrid attention settings, partial RoPE, GDN parameters) and model implementation featuring 18 GDN layers with linear-attention-style computation and 6 full-attention layers with output gating, plus helper routines for partial RoPE rotation.
Qwen3.5 Tokenization
mllm/models/qwen3_5/tokenization_qwen3_5.hpp
Added Qwen3.5-compatible tokenizer with regex-based splitting, BPE encoding/decoding, special token handling, and message template conversion for chat inputs.
Qwen3.5 Example Application
examples/qwen3_5/main.cpp, examples/qwen3_5/CMakeLists.txt, examples/CMakeLists.txt
Added interactive CLI example (mllm-qwen3-5-runner) for loading Qwen3.5 models, supporting streaming chat generation, Perfetto profiling, and memory reporting.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested reviewers

  • liang1232018
  • chenghuaWang
  • oreomaker

Poem

🐰 A Qwen hops forth with hybrid grace,
Full attention and GDN intertwine,
RoPE spins while SiLU graces the place,
QNN patterns align so fine,
Chat flows stream, memories trace! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

  • Description check ⚠️ Warning — The PR description only contains the contribution template and a link to guidelines, without any implementation details, objectives, or actual description content. Resolution: provide a detailed description explaining the Qwen3.5 addition, including key features (hybrid attention, GDN layers, partial RoPE), QNN AOT support, and any breaking changes or migration notes.
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 10.20%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
  • Title check ✅ Passed — The title 'Added Qwen3.5 0.8B' directly matches the main change: adding comprehensive support for the Qwen3.5 model including configuration, tokenizer, modeling, and example code.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Tip

You can make CodeRabbit's review stricter and more nitpicky using the `assertive` profile, if that's what you prefer.

Change the reviews.profile setting to assertive to make CodeRabbit nitpick more issues in your PRs.

@FarmersWrap FarmersWrap marked this pull request as draft March 22, 2026 00:50

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 12

🧹 Nitpick comments (3)
examples/qwen3_5/main.cpp (2)

55-64: Consider resetting KV cache before inference if multi-turn support is added.

Based on the ARGeneration implementation (see mllm/models/ARGeneration.cpp:445-447), chat() does not reset the KV cache. For single-prompt usage this works, but if you add a loop, you'll need to call model.kvCache().reset() between independent prompts to avoid stale cached states affecting new responses.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/qwen3_5/main.cpp` around lines 55 - 64, The example currently calls
model.chat(inputs) directly and ARGeneration::chat does not reset the KV cache;
to avoid stale state when adding multi-turn or repeated independent prompts,
call model.kvCache().reset() before each independent inference (e.g., right
before invoking model.chat(inputs) in the try block). Reference the ARGeneration
behavior and use model.kvCache().reset() to clear the cache between prompts so
each new prompt starts with a fresh KV state.

47-64: UI text is misleading: "Interactive CLI" but only single prompt.

The banner says "Qwen3.5 Interactive CLI" and mentions "Enter 'exit' or 'quit' to end the session", but the code only processes one prompt and exits. This may confuse users expecting a multi-turn session.

Either add a loop for actual interactive behavior or update the messaging.

♻️ Proposed fix for multi-turn interaction
     fmt::print("\n{:*^60}\n", " Qwen3.5 Interactive CLI ");
     fmt::print("Enter 'exit' or 'quit' to end the session\n\n");

     std::string prompt_text;
+    while (true) {
+      fmt::print("Prompt text (or 'exit/quit'): ");
+      std::getline(std::cin, prompt_text);
+
+      if (prompt_text == "exit" || prompt_text == "quit" || std::cin.eof()) {
+        break;
+      }

-    fmt::print("Prompt text (or 'exit/quit'): ");
-    std::getline(std::cin, prompt_text);
-
-    try {
-      fmt::print("Processing...\n");
-      auto inputs = tokenizer.convertMessage({.prompt = prompt_text});
+      try {
+        fmt::print("Processing...\n");
+        model.kvCache().reset();  // Reset KV cache between turns
+        auto inputs = tokenizer.convertMessage({.prompt = prompt_text});

-      fmt::print("\nResponse: ");
+        fmt::print("\nResponse: ");

-      for (auto& step : model.chat(inputs)) { std::wcout << tokenizer.detokenize(step.cur_token_id) << std::flush; }
+        for (auto& step : model.chat(inputs)) { std::wcout << tokenizer.detokenize(step.cur_token_id) << std::flush; }

-      fmt::print("\n{}\n", std::string(60, '-'));
-    } catch (const std::exception& e) { fmt::print("\nError: {}\n{}\n", e.what(), std::string(60, '-')); }
+        fmt::print("\n{}\n", std::string(60, '-'));
+      } catch (const std::exception& e) { fmt::print("\nError: {}\n{}\n", e.what(), std::string(60, '-')); }
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/qwen3_5/main.cpp` around lines 47 - 64, The current UI displays
"Qwen3.5 Interactive CLI" and instructs to use 'exit'/'quit' but only reads a
single prompt; change to a real interactive loop: wrap the
prompt/read/process/response block in a while loop that repeatedly prints
"Prompt text (or 'exit/quit'): ", reads prompt_text, breaks when prompt_text is
"exit" or "quit", and otherwise calls tokenizer.convertMessage(...) and iterates
model.chat(...) to stream responses (use the existing tokenizer.detokenize and
step.cur_token_id usage). Ensure exceptions are caught per-iteration and the
prompt/response separator (std::string(60,'-')) is printed after each exchange
so multi-turn behavior matches the banner.
mllm/models/qwen3_5/modeling_qwen3_5.hpp (1)

537-546: KV cache allocates slots for all layers, but only full attention layers use it.

The StaticCache is initialized with cfg.num_hidden_layers (all 24 layers), but only the 6 full attention layers use KV caching. GDN layers (line 438) explicitly don't use the cache. This wastes memory allocating 18 unused cache slots.

♻️ Suggested optimization

Consider only allocating cache for full attention layers:

-    kv_cache_ = nn::StaticCache(cfg.max_cache_length, cfg.num_hidden_layers,
+    kv_cache_ = nn::StaticCache(cfg.max_cache_length, cfg.numFullAttentionLayers(),
                                 cfg.num_attention_heads,

This would also require updating layer_idx_ in full attention layers to be an index into the reduced cache rather than the global layer index.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mllm/models/qwen3_5/modeling_qwen3_5.hpp` around lines 537 - 546, The
constructor Qwen3_5ForCausalLM currently initializes kv_cache_ with
cfg.num_hidden_layers causing slots for GDN layers that never use the cache;
change the allocation to use only the number of full-attention layers (compute
count of layers where attention is enabled) when constructing nn::StaticCache
and adjust any code that sets layer_idx_ in those full-attention layer objects
so layer_idx_ becomes an index into this reduced cache rather than the original
global layer index (update places that reference layer_idx_ to use the new
mapping). Ensure the cache creation uses the same cfg.max_cache_length,
cfg.num_attention_heads, cfg.num_key_value_heads, head_dim and types as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/qwen3_5/main.cpp`:
- Around line 30-34: The help-check currently runs after Argparse::parse(argc,
argv) which will fail if required(true) arguments are missing; modify the flow
so help is detected before required-arg validation by either (a) checking argv
for "-h" or "--help" before calling Argparse::parse, or (b) using a parser
option/alternate parse method that allows short-circuiting validation when help
is present; update the code around Argparse::parse(argc, argv), help.isSet(),
and the subsequent Argparse::printHelp()/mllm::shutdownContext() so that
printHelp() is invoked and the program exits without triggering
required-argument errors.

In `@mllm/backends/qnn/aot/visitor/RoPE.cpp`:
- Around line 45-52: The code in RoPE.cpp dereferences op->inputs() and
op->outputs() without checking arity (uses inputs_it, i_0, i_sin, i_cos and
o_0); add explicit validation that op->inputs().size() >= 3 and
op->outputs().size() >= 1 before creating inputs_it or dereferencing std::next,
and handle the error path (return, throw, or log and abort) consistent with
surrounding error handling conventions so you never dereference invalid
iterators for the RoPE operator.
- Around line 57-60: The lowering currently hardcodes pose_type=0 in RoPE.cpp
when calling qnn_op_node->emplaceParamScalar, but the custom kernel expects
pose_type 2 or 4; add a pose_type field to the IR struct/class RoPEOpOptions (or
derive it deterministically from an existing field like rope_theta) and populate
it from the op creation site, then replace the hardcoded
static_cast<uint32_t>(0) in RoPE.cpp with the value from RoPEOpOptions (passed
through into QNNParamScalarWrapper::create("pose_type",
static_cast<uint32_t>(pose_type))). Ensure the new field is
serialized/constructed where RoPEOpOptions is instantiated so the lowering can
read it.

In `@mllm/backends/qnn/aot/visitor/SiLU.cpp`:
- Around line 44-47: The SiLU visitor currently copies quant_recipe only if
present which is inconsistent with other QNN AOT patterns; update the SiLU
handling to require the quantization recipe the same way RMSNorm/Add/Mul/Neg do
by checking for the attribute with MLLM_RETURN_FALSE_IF_NOT (or equivalent) on
op->getAttr("quant_recipe") and then unconditionally set it on sigmoid_out via
sigmoid_out->setAttr("quant_recipe", op->getAttr("quant_recipe")); this enforces
presence of quant metadata and matches existing QNN patterns.

In `@mllm/models/qwen3_5/configuration_qwen3_5.hpp`:
- Around line 115-117: The method isFullAttentionLayer(int layer_idx) performs
unchecked indexing into layer_types; guard against out-of-range and negative
indexes by validating layer_idx before accessing layer_types: check that
layer_idx >= 0 and layer_idx < static_cast<int>(layer_types.size()) and return a
safe default (e.g., false) or handle the error (assert/log/throw) if the index
is invalid; update isFullAttentionLayer to perform this bounds check against
layer_types.size() and then compare layer_types[layer_idx] == "full_attention".
- Around line 34-61: The code computes layer_types using (i + 1) %
full_attention_interval which is unsafe when full_attention_interval <= 0 and it
accepts provided layer_types without validating length or allowed values; update
initialization to first validate full_attention_interval is > 0 (otherwise set
to 1 or return an error) before using it in the loop, and when
tc.contains("layer_types") verify that layer_types.size() == num_hidden_layers
and each entry is one of the allowed strings (e.g., "full_attention" or
"linear_attention")—if validation fails, either normalize/replace invalid
entries or emit an error so downstream helpers do not break. Ensure these checks
reference the variables full_attention_interval, layer_types, and
num_hidden_layers and are performed before deriving the computed layer layout.

In `@mllm/models/qwen3_5/modeling_qwen3_5.hpp`:
- Around line 553-555: The logic for tie_word_embeddings is inverted: instead of
creating and using lm_head_ when cfg.tie_word_embeddings is true, create
lm_head_ only when cfg.tie_word_embeddings is false (or alternatively always
create but copy/alias embedding weights into lm_head_ when
cfg.tie_word_embeddings is true). Update the construction in the constructor
around lm_head_ (currently guarded by cfg.tie_word_embeddings) to be guarded by
!cfg.tie_word_embeddings (or add weight sharing code to copy embedding->lm_head_
when tying), and update the forward usage (the branch keyed by
tie_word_embeddings_) to use lm_head_ only when embeddings are not tied (or use
the shared embedding weights when tying). Ensure references to lm_head_,
cfg.tie_word_embeddings and tie_word_embeddings_ are consistent after the
change.
- Around line 360-363: The attention code omits the softmax step: after
computing attn_weights = matmul(q, k, false, true) and scaling by
1/sqrt(head_k_dim_), apply a softmax over the source sequence dimension (the S
dimension of attn_weights) before multiplying by v; update the sequence that
builds attn_weights (using nn::functional::softmax or equivalent) so
attn_weights becomes a probability distribution prior to the subsequent matmul
with v, keeping existing variables q, k, v, head_k_dim_, and the final output =
matmul(attn_weights, v).
- Around line 339-363: The simplified GDN attention path is missing the causal
mask and softmax, breaking autoregression; in the GDN fallback (where q, k, v
are transposed and attn_weights is computed) apply the same causal masking used
by Qwen3_5FullAttention (e.g., reuse the class CausalMask member or call
mask_(attn_weights)) after scaling and then run softmax before multiplying by v,
ensuring attn_weights becomes a masked softmax(QK^T / sqrt(d)) and preserving
the autoregressive property.

In `@mllm/models/qwen3_5/tokenization_qwen3_5.hpp`:
- Around line 226-229: In convertMessage, guard against std::string::npos
returned by Qwen3_5Message::message_template.find("{{{prompt}}}") before calling
replace: check the found position (pos) and if it is npos handle the missing
placeholder (for example, log/throw/return an error or append the prompt into
the template) instead of calling applied_string.replace(pos, 12,
message.prompt); this prevents undefined behavior when the "{{{prompt}}}" token
is absent.
- Around line 206-211: In detokenize, avoid using
bytes_2_unicode_dict_inverse_[c] (operator[]) which inserts defaults for unknown
wchar_t; instead, validate lookups from bytes_2_unicode_dict_inverse_ for each
wchar_t returned by _detokenize and handle misses explicitly (e.g., use find()
or at() with try/catch) — on missing mapping append a safe UTF-8 replacement
sequence (such as the UTF-8 bytes for U+FFFD or a question mark) to utf_8_str or
return an error; update the function around _detokenize, utf_8_str, and
bytes_2_unicode_dict_inverse_ to perform this check before converting with
mllm::preprocessor::utf8string2WideString.
- Around line 103-113: The lookahead logic for pattern 6 is wrong: after
consuming all whitespace in the while loop inside the tokenization code (the
block that sets start and advances pos), the subsequent check should only accept
the match when we reached end-of-string (trailing whitespace), because next char
after the loop will never be whitespace; update the condition in that block to
test only for pos >= str.size() (remove the std::iswspace(str[pos]) check) so
matched = str.substr(start, pos-start) is returned only for trailing whitespace.

---

Nitpick comments:
In `@examples/qwen3_5/main.cpp`:
- Around line 55-64: The example currently calls model.chat(inputs) directly and
ARGeneration::chat does not reset the KV cache; to avoid stale state when adding
multi-turn or repeated independent prompts, call model.kvCache().reset() before
each independent inference (e.g., right before invoking model.chat(inputs) in
the try block). Reference the ARGeneration behavior and use
model.kvCache().reset() to clear the cache between prompts so each new prompt
starts with a fresh KV state.
- Around line 47-64: The current UI displays "Qwen3.5 Interactive CLI" and
instructs to use 'exit'/'quit' but only reads a single prompt; change to a real
interactive loop: wrap the prompt/read/process/response block in a while loop
that repeatedly prints "Prompt text (or 'exit/quit'): ", reads prompt_text,
breaks when prompt_text is "exit" or "quit", and otherwise calls
tokenizer.convertMessage(...) and iterates model.chat(...) to stream responses
(use the existing tokenizer.detokenize and step.cur_token_id usage). Ensure
exceptions are caught per-iteration and the prompt/response separator
(std::string(60,'-')) is printed after each exchange so multi-turn behavior
matches the banner.

In `@mllm/models/qwen3_5/modeling_qwen3_5.hpp`:
- Around line 537-546: The constructor Qwen3_5ForCausalLM currently initializes
kv_cache_ with cfg.num_hidden_layers causing slots for GDN layers that never use
the cache; change the allocation to use only the number of full-attention layers
(compute count of layers where attention is enabled) when constructing
nn::StaticCache and adjust any code that sets layer_idx_ in those full-attention
layer objects so layer_idx_ becomes an index into this reduced cache rather than
the original global layer index (update places that reference layer_idx_ to use
the new mapping). Ensure the cache creation uses the same cfg.max_cache_length,
cfg.num_attention_heads, cfg.num_key_value_heads, head_dim and types as before.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3d4151fc-e18a-478c-96cc-f69c90a24e1f

📥 Commits

Reviewing files that changed from the base of the PR and between 05c75f8 and 1e672a9.

📒 Files selected for processing (12)
  • examples/CMakeLists.txt
  • examples/qwen3_5/CMakeLists.txt
  • examples/qwen3_5/main.cpp
  • mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp
  • mllm/backends/qnn/aot/visitor/RoPE.cpp
  • mllm/backends/qnn/aot/visitor/RoPE.hpp
  • mllm/backends/qnn/aot/visitor/SiLU.cpp
  • mllm/backends/qnn/aot/visitor/SiLU.hpp
  • mllm/models/qwen3_5/configuration_qwen3_5.hpp
  • mllm/models/qwen3_5/modeling_qwen3_5.hpp
  • mllm/models/qwen3_5/modeling_qwen3_5_qnn_aot.hpp
  • mllm/models/qwen3_5/tokenization_qwen3_5.hpp

Comment on lines +30 to +34
if (help.isSet()) {
Argparse::printHelp();
mllm::shutdownContext();
return 0;
}

⚠️ Potential issue | 🟡 Minor

Help flag check may fail due to required argument validation.

The help flag is checked after Argparse::parse(argc, argv) (line 17), but arguments are marked as required(true). If the user runs with just -h, the parser may fail before reaching the help check.

🛠️ Proposed fix

Consider checking for help before validating required arguments, or ensuring Argparse::parse doesn't error on missing required args when -h is present. A common pattern is:

+  // Check for help before full parse
+  for (int i = 1; i < argc; ++i) {
+    if (std::string(argv[i]) == "-h" || std::string(argv[i]) == "--help") {
+      Argparse::printHelp();
+      mllm::shutdownContext();
+      return 0;
+    }
+  }
+
   Argparse::parse(argc, argv);
-
-  if (help.isSet()) {
-    Argparse::printHelp();
-    mllm::shutdownContext();
-    return 0;
-  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/qwen3_5/main.cpp` around lines 30 - 34, The help-check currently
runs after Argparse::parse(argc, argv) which will fail if required(true)
arguments are missing; modify the flow so help is detected before required-arg
validation by either (a) checking argv for "-h" or "--help" before calling
Argparse::parse, or (b) using a parser option/alternate parse method that allows
short-circuiting validation when help is present; update the code around
Argparse::parse(argc, argv), help.isSet(), and the subsequent
Argparse::printHelp()/mllm::shutdownContext() so that printHelp() is invoked and
the program exits without triggering required-argument errors.

Comment on lines +45 to +52
auto inputs_it = op->inputs().begin();
auto i_0 = (*inputs_it)->cast_<ir::tensor::TensorValue>(); // input tensor
auto i_sin = (*std::next(inputs_it))->cast_<ir::tensor::TensorValue>(); // sin embeddings
auto i_cos = (*std::next(inputs_it, 2))->cast_<ir::tensor::TensorValue>(); // cos embeddings

// RoPE output
auto o_0 = op->outputs().front()->cast_<ir::tensor::TensorValue>();


⚠️ Potential issue | 🟠 Major

Validate RoPE input/output arity before dereferencing iterators.

Line 47, Line 48, and Line 51 assume 3 inputs and 1 output exist. Missing checks here can dereference invalid iterators on malformed IR.

Proposed fix
   // RoPE inputs: x, sin, cos
+  if (op->inputs().size() < 3 || op->outputs().empty()) {
+    MLLM_ERROR("RoPE lowering expects at least 3 inputs and 1 output");
+    return false;
+  }
   auto inputs_it = op->inputs().begin();
   auto i_0 = (*inputs_it)->cast_<ir::tensor::TensorValue>();          // input tensor
   auto i_sin = (*std::next(inputs_it))->cast_<ir::tensor::TensorValue>();  // sin embeddings
   auto i_cos = (*std::next(inputs_it, 2))->cast_<ir::tensor::TensorValue>();  // cos embeddings

As per coding guidelines, "Validate inputs for public APIs and critical internal functions."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mllm/backends/qnn/aot/visitor/RoPE.cpp` around lines 45 - 52, The code in
RoPE.cpp dereferences op->inputs() and op->outputs() without checking arity
(uses inputs_it, i_0, i_sin, i_cos and o_0); add explicit validation that
op->inputs().size() >= 3 and op->outputs().size() >= 1 before creating inputs_it
or dereferencing std::next, and handle the error path (return, throw, or log and
abort) consistent with surrounding error handling conventions so you never
dereference invalid iterators for the RoPE operator.

Comment on lines +57 to +60
// pose_type parameter: 0 for standard RoPE
// The custom HTP op uses this to select between different RoPE variants
qnn_op_node->emplaceParamScalar(mllm::qnn::QNNParamScalarWrapper::create("pose_type", static_cast<uint32_t>(0)));


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "1) Verify RoPE kernel pose_type handling in custom package:"
rg -n -C3 'pose_type|only support pose_type' mllm/backends/qnn/custom-op-package/LLaMAPackage/src/ops/RoPE.cpp

echo
echo "2) Verify current lowering hardcodes pose_type:"
rg -n -C3 'emplaceParamScalar|pose_type' mllm/backends/qnn/aot/visitor/RoPE.cpp

echo
echo "3) Find where RoPE variant is represented in IR/AOp to thread through lowering:"
rg -n -C3 'class .*RoPEOp|pose_type' mllm/core mllm/compile mllm/models

Repository: UbiquitousLearning/mllm

Length of output: 6375


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check RoPEOp struct/class definition ===" 
cat -n mllm/core/aops/RoPEOp.hpp

echo
echo "=== Check visitor code to see how RoPEOp variant is extracted ==="
cat -n mllm/backends/qnn/aot/visitor/RoPE.cpp | head -80

Repository: UbiquitousLearning/mllm

Length of output: 5148


Add pose_type field to RoPEOpOptions and pass through the correct variant to the custom op kernel.

The hardcoded pose_type = 0 at line 59 will cause a fatal error in the custom HTP op. The kernel (mllm/backends/qnn/custom-op-package/LLaMAPackage/src/ops/RoPE.cpp lines 664–665) only supports pose_type == 2 (LLaMA) or pose_type == 4, and rejects all other values with ErrorFatal. The IR representation (RoPEOpOptions) currently lacks a field to capture the variant, so the lowering cannot pass the correct value. Either add a pose_type field to RoPEOpOptions, or derive it from existing parameters (e.g., rope_theta).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mllm/backends/qnn/aot/visitor/RoPE.cpp` around lines 57 - 60, The lowering
currently hardcodes pose_type=0 in RoPE.cpp when calling
qnn_op_node->emplaceParamScalar, but the custom kernel expects pose_type 2 or 4;
add a pose_type field to the IR struct/class RoPEOpOptions (or derive it
deterministically from an existing field like rope_theta) and populate it from
the op creation site, then replace the hardcoded static_cast<uint32_t>(0) in
RoPE.cpp with the value from RoPEOpOptions (passed through into
QNNParamScalarWrapper::create("pose_type", static_cast<uint32_t>(pose_type))).
Ensure the new field is serialized/constructed where RoPEOpOptions is
instantiated so the lowering can read it.

Comment on lines +44 to +47
// Copy quantization recipe from output to intermediate if available
if (op->getAttr("quant_recipe")) {
sigmoid_out->setAttr("quant_recipe", op->getAttr("quant_recipe"));
}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how other QNN AOT patterns handle quant_recipe
rg -n "quant_recipe" mllm/backends/qnn/aot/visitor/ -A2 -B2

Repository: UbiquitousLearning/mllm

Length of output: 3254


🏁 Script executed:

# Get the full SiLU.cpp file to understand the complete context
cat -n mllm/backends/qnn/aot/visitor/SiLU.cpp

Repository: UbiquitousLearning/mllm

Length of output: 3503


🏁 Script executed:

# Check if there are other activation/unary patterns with optional quant_recipe handling
rg "if.*getAttr.*quant_recipe" mllm/backends/qnn/aot/visitor/ -B5 -A5

Repository: UbiquitousLearning/mllm

Length of output: 1043


🏁 Script executed:

# Count pattern frequency - how many patterns use MLLM_RETURN_FALSE_IF_NOT vs optional check
echo "=== Patterns requiring quant_recipe (MLLM_RETURN_FALSE_IF_NOT) ===" && \
rg "MLLM_RETURN_FALSE_IF_NOT.*quant_recipe" mllm/backends/qnn/aot/visitor/ | wc -l && \
echo "=== Patterns with optional quant_recipe (if check) ===" && \
rg "if.*getAttr.*quant_recipe" mllm/backends/qnn/aot/visitor/ | wc -l

Repository: UbiquitousLearning/mllm

Length of output: 192


Ensure quantization metadata is handled consistently with other QNN patterns.

SiLU treats quant_recipe as optional (lines 45-47), but all other QNN AOT patterns—RMSNorm, Add, Mul, and Neg—require it via MLLM_RETURN_FALSE_IF_NOT. This inconsistency means intermediate sigmoid_out tensors may lack quantization metadata if the attribute is missing, potentially breaking quantization expectations in QNN deployment.

Either make quant_recipe required here for consistency with other patterns, or document why optional handling is intentional for SiLU decomposition.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mllm/backends/qnn/aot/visitor/SiLU.cpp` around lines 44 - 47, The SiLU
visitor currently copies quant_recipe only if present which is inconsistent with
other QNN AOT patterns; update the SiLU handling to require the quantization
recipe the same way RMSNorm/Add/Mul/Neg do by checking for the attribute with
MLLM_RETURN_FALSE_IF_NOT (or equivalent) on op->getAttr("quant_recipe") and then
unconditionally set it on sigmoid_out via sigmoid_out->setAttr("quant_recipe",
op->getAttr("quant_recipe")); this enforces presence of quant metadata and
matches existing QNN patterns.

Comment on lines +34 to +61
full_attention_interval = tc.value("full_attention_interval", 4);

// GDN (Gated Delta Network) parameters
linear_num_key_heads = tc.value("linear_num_key_heads", 16);
linear_num_value_heads = tc.value("linear_num_value_heads", 16);
linear_key_head_dim = tc.value("linear_key_head_dim", 128);
linear_value_head_dim = tc.value("linear_value_head_dim", 128);
linear_conv_kernel_dim = tc.value("linear_conv_kernel_dim", 4);

// RoPE parameters (nested under rope_parameters)
if (tc.contains("rope_parameters")) {
auto& rp = tc["rope_parameters"];
rope_theta = rp.value("rope_theta", 10000000.0f);
partial_rotary_factor = rp.value("partial_rotary_factor", 0.25f);
}

// Layer types: explicit list or computed from full_attention_interval
if (tc.contains("layer_types")) {
for (auto& lt : tc["layer_types"]) { layer_types.push_back(lt.get<std::string>()); }
} else {
for (int i = 0; i < num_hidden_layers; ++i) {
if ((i + 1) % full_attention_interval == 0) {
layer_types.push_back("full_attention");
} else {
layer_types.push_back("linear_attention");
}
}
}

⚠️ Potential issue | 🔴 Critical

Validate full_attention_interval and layer_types before deriving layer layout.

At Line 55, (i + 1) % full_attention_interval is unsafe when full_attention_interval <= 0. Also, when layer_types is provided (Line 51-53), there is no validation for allowed values or expected length vs num_hidden_layers, which can break downstream helpers.

Proposed fix
     // Qwen3.5 hybrid attention
     attn_output_gate = tc.value("attn_output_gate", true);
     full_attention_interval = tc.value("full_attention_interval", 4);
+    if (full_attention_interval <= 0) {
+      throw std::invalid_argument("full_attention_interval must be > 0");
+    }
@@
     if (tc.contains("layer_types")) {
       for (auto& lt : tc["layer_types"]) { layer_types.push_back(lt.get<std::string>()); }
+      if (static_cast<int32_t>(layer_types.size()) != num_hidden_layers) {
+        throw std::invalid_argument("layer_types length must equal num_hidden_layers");
+      }
+      for (const auto& lt : layer_types) {
+        if (lt != "full_attention" && lt != "linear_attention") {
+          throw std::invalid_argument("layer_types must be 'full_attention' or 'linear_attention'");
+        }
+      }
     } else {

As per coding guidelines, "Validate inputs for public APIs and critical internal functions."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mllm/models/qwen3_5/configuration_qwen3_5.hpp` around lines 34 - 61, The code
computes layer_types using (i + 1) % full_attention_interval which is unsafe
when full_attention_interval <= 0 and it accepts provided layer_types without
validating length or allowed values; update initialization to first validate
full_attention_interval is > 0 (otherwise set to 1 or return an error) before
using it in the loop, and when tc.contains("layer_types") verify that
layer_types.size() == num_hidden_layers and each entry is one of the allowed
strings (e.g., "full_attention" or "linear_attention")—if validation fails,
either normalize/replace invalid entries or emit an error so downstream helpers
do not break. Ensure these checks reference the variables
full_attention_interval, layer_types, and num_hidden_layers and are performed
before deriving the computed layer layout.
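The validation this comment asks for can be sketched in isolation. The `Config` struct and `finalizeLayerTypes` helper below are hypothetical stand-ins for the real `Qwen3_5Config`; the 24-layer / interval-4 defaults mirror the 18 GDN + 6 full-attention layout mentioned in this PR's commits:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for the real config; only the fields discussed above.
struct Config {
  int num_hidden_layers = 24;
  int full_attention_interval = 4;
  std::vector<std::string> layer_types;
};

// Validate an explicit layer_types list, or derive one from the interval,
// following the checks suggested in the review comment.
void finalizeLayerTypes(Config& cfg) {
  if (cfg.full_attention_interval <= 0) {
    throw std::invalid_argument("full_attention_interval must be > 0");
  }
  if (!cfg.layer_types.empty()) {
    if (static_cast<int>(cfg.layer_types.size()) != cfg.num_hidden_layers) {
      throw std::invalid_argument("layer_types length must equal num_hidden_layers");
    }
    for (const auto& lt : cfg.layer_types) {
      if (lt != "full_attention" && lt != "linear_attention") {
        throw std::invalid_argument("unknown layer type: " + lt);
      }
    }
    return;
  }
  for (int i = 0; i < cfg.num_hidden_layers; ++i) {
    cfg.layer_types.push_back((i + 1) % cfg.full_attention_interval == 0
                                  ? "full_attention"
                                  : "linear_attention");
  }
}
```

With the defaults this yields exactly 6 full-attention layers out of 24, and a non-positive interval fails fast instead of causing a division by zero.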

Comment on lines +360 to +363
auto attn_weights = nn::functional::matmul(q, k, false, true); // [B, H, S, S]
auto scale = 1.f / sqrtf(static_cast<float>(head_k_dim_));
attn_weights = attn_weights * scale;
auto output = nn::functional::matmul(attn_weights, v); // [B, H, S, Dv]

⚠️ Potential issue | 🟠 Major

Missing softmax in GDN attention computation.

The comment at line 353 describes softmax(Q K^T / sqrt(d)) V, but the code only computes (Q K^T / sqrt(d)) V without applying softmax. This produces unbounded attention weights instead of a probability distribution.

🐛 Proposed fix
     auto attn_weights = nn::functional::matmul(q, k, false, true);  // [B, H, S, S]
     auto scale = 1.f / sqrtf(static_cast<float>(head_k_dim_));
     attn_weights = attn_weights * scale;
+    // TODO: Add causal mask here for autoregressive behavior
+    attn_weights = nn::functional::softmax(attn_weights, -1);
     auto output = nn::functional::matmul(attn_weights, v);  // [B, H, S, Dv]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
     auto attn_weights = nn::functional::matmul(q, k, false, true);  // [B, H, S, S]
     auto scale = 1.f / sqrtf(static_cast<float>(head_k_dim_));
     attn_weights = attn_weights * scale;
+    // TODO: Add causal mask here for autoregressive behavior
+    attn_weights = nn::functional::softmax(attn_weights, -1);
     auto output = nn::functional::matmul(attn_weights, v);  // [B, H, S, Dv]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mllm/models/qwen3_5/modeling_qwen3_5.hpp` around lines 360 - 363, The
attention code omits the softmax step: after computing attn_weights = matmul(q,
k, false, true) and scaling by 1/sqrt(head_k_dim_), apply a softmax over the
source sequence dimension (the S dimension of attn_weights) before multiplying
by v; update the sequence that builds attn_weights (using
nn::functional::softmax or equivalent) so attn_weights becomes a probability
distribution prior to the subsequent matmul with v, keeping existing variables
q, k, v, head_k_dim_, and the final output = matmul(attn_weights, v).
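The missing step is a row-wise softmax along the last dimension of the scores. Independent of the mllm tensor API, a minimal scalar sketch of that normalization (with the usual max-subtraction for numerical stability) looks like:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Row-wise softmax over attention scores, applied along the last dimension.
// Subtracting the row max keeps the exponentials numerically stable.
std::vector<float> softmaxRow(const std::vector<float>& scores) {
  float m = scores[0];
  for (float s : scores) m = std::max(m, s);
  float sum = 0.f;
  std::vector<float> out(scores.size());
  for (size_t i = 0; i < scores.size(); ++i) {
    out[i] = std::exp(scores[i] - m);
    sum += out[i];
  }
  for (float& o : out) o /= sum;  // each row now sums to 1
  return out;
}
```

Without this step the weights are unbounded raw scores, which is exactly the issue flagged above.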

Comment on lines +553 to +555
    if (cfg.tie_word_embeddings) {
      lm_head_ = reg<nn::Linear>("lm_head_out", cfg.hidden_size, cfg.vocab_size, false, cfg.linear_impl_type);
    }

⚠️ Potential issue | 🟠 Major

Inverted logic for tie_word_embeddings.

The standard meaning of tie_word_embeddings=true is that the embedding weights are shared with the output projection, so no separate lm_head is needed. The code appears to have inverted logic:

  • Line 553-555: Creates lm_head_ only when tie_word_embeddings is true
  • Line 591: Uses lm_head_ only when tie_word_embeddings_ is true

This means when embeddings are tied, a separate projection is created and used, which defeats the purpose of tying.

🐛 Proposed fix
-    if (cfg.tie_word_embeddings) {
+    if (!cfg.tie_word_embeddings) {
       lm_head_ = reg<nn::Linear>("lm_head_out", cfg.hidden_size, cfg.vocab_size, false, cfg.linear_impl_type);
     }

And for the forward pass:

-    if (tie_word_embeddings_) { sequence = lm_head_(sequence); }
+    if (tie_word_embeddings_) {
+      // Reuse embedding weights transposed as output projection
+      sequence = nn::functional::linear(sequence, llm.language_model_.embedding_.weight(), std::nullopt);
+    } else {
+      sequence = lm_head_(sequence);
+    }

Or alternatively, always create lm_head_ but load embedding weights into it when tying.

Also applies to: 591-591

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mllm/models/qwen3_5/modeling_qwen3_5.hpp` around lines 553 - 555, The logic
for tie_word_embeddings is inverted: instead of creating and using lm_head_ when
cfg.tie_word_embeddings is true, create lm_head_ only when
cfg.tie_word_embeddings is false (or alternatively always create but copy/alias
embedding weights into lm_head_ when cfg.tie_word_embeddings is true). Update
the construction in the constructor around lm_head_ (currently guarded by
cfg.tie_word_embeddings) to be guarded by !cfg.tie_word_embeddings (or add
weight sharing code to copy embedding->lm_head_ when tying), and update the
forward usage (the branch keyed by tie_word_embeddings_) to use lm_head_ only
when embeddings are not tied (or use the shared embedding weights when tying).
Ensure references to lm_head_, cfg.tie_word_embeddings and tie_word_embeddings_
are consistent after the change.

Comment on lines +103 to +113
    // 6. Match \s+(?!\S)
    if (std::iswspace(str[pos])) {
      size_t start = pos;
      while (pos < str.size() && std::iswspace(str[pos])) ++pos;
      if (pos >= str.size() || std::iswspace(str[pos])) {
        matched = str.substr(start, pos - start);
        return true;
      } else {
        pos = start;
      }
    }

⚠️ Potential issue | 🟡 Minor

Logic issue in pattern 6: lookahead (?!\S) is incorrectly implemented.

The comment says "Match \s+(?!\S)" (whitespace not followed by non-whitespace), but the check std::iswspace(str[pos]) at line 107 is dead code: the while loop at line 106 has already consumed every whitespace character, so str[pos] can never be whitespace at that point. The condition therefore reduces to pos >= str.size(), meaning the pattern only ever matches trailing whitespace; the redundant test obscures that intent and should be removed.

🔧 Proposed fix

The condition should check that we're at end-of-string OR the next character after whitespace is also whitespace (which can't happen since we consumed all whitespace). The intent seems to be matching trailing whitespace. Consider:

   // 6. Match \s+(?!\S) — trailing whitespace not followed by non-whitespace
   if (std::iswspace(str[pos])) {
     size_t start = pos;
     while (pos < str.size() && std::iswspace(str[pos])) ++pos;
-    if (pos >= str.size() || std::iswspace(str[pos])) {
+    if (pos >= str.size()) {
+      // Only match if this whitespace is at end of string (true trailing whitespace)
       matched = str.substr(start, pos - start);
       return true;
-    } else {
+    }
+    // Otherwise, fall through to pattern 7 for general whitespace
       pos = start;
-    }
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    // 6. Match \s+(?!\S)
    if (std::iswspace(str[pos])) {
      size_t start = pos;
      while (pos < str.size() && std::iswspace(str[pos])) ++pos;
-     if (pos >= str.size() || std::iswspace(str[pos])) {
+     if (pos >= str.size()) {
+       // Only match if this whitespace is at end of string (true trailing whitespace)
        matched = str.substr(start, pos - start);
        return true;
-     } else {
-       pos = start;
-     }
+     }
+     // Otherwise, fall through to pattern 7 for general whitespace
+     pos = start;
    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mllm/models/qwen3_5/tokenization_qwen3_5.hpp` around lines 103 - 113, The
lookahead logic for pattern 6 is wrong: after consuming all whitespace in the
while loop inside the tokenization code (the block that sets start and advances
pos), the subsequent check should only accept the match when we reached
end-of-string (trailing whitespace), because next char after the loop will never
be whitespace; update the condition in that block to test only for pos >=
str.size() (remove the std::iswspace(str[pos]) check) so matched =
str.substr(start, pos-start) is returned only for trailing whitespace.
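The intended behavior can be checked in isolation. `matchTrailingWhitespace` is a hypothetical helper mirroring the fixed pattern-6 block, extracted from the tokenizer context:

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// Match \s+(?!\S): a whitespace run is accepted only when nothing follows it,
// i.e. it is trailing whitespace. Otherwise pos is restored so a later,
// more general \s+ rule can take over.
bool matchTrailingWhitespace(const std::wstring& str, size_t& pos, std::wstring& matched) {
  if (pos >= str.size() || !std::iswspace(str[pos])) return false;
  size_t start = pos;
  while (pos < str.size() && std::iswspace(str[pos])) ++pos;
  if (pos >= str.size()) {
    matched = str.substr(start, pos - start);
    return true;
  }
  pos = start;  // interior whitespace: decline the match
  return false;
}
```

Trailing whitespace matches and advances `pos`; interior whitespace declines and restores `pos` for the next rule.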

Comment on lines +206 to +211
  std::wstring detokenize(int64_t pos_idx) override {
    auto str = _detokenize(pos_idx);
    std::string utf_8_str;
    for (wchar_t c : str) { utf_8_str.push_back((unsigned char)(bytes_2_unicode_dict_inverse_[c])); }
    return {mllm::preprocessor::utf8string2WideString(utf_8_str)};
  }

⚠️ Potential issue | 🟡 Minor

Missing error handling for unknown characters in detokenize.

If a wchar_t from the BPE output is not found in bytes_2_unicode_dict_inverse_, std::unordered_map::operator[] inserts and returns a value-initialized entry (0), which puts a null byte into the UTF-8 string. This can silently truncate or corrupt the decoded text.

🛡️ Proposed fix with validation
   std::wstring detokenize(int64_t pos_idx) override {
     auto str = _detokenize(pos_idx);
     std::string utf_8_str;
-    for (wchar_t c : str) { utf_8_str.push_back((unsigned char)(bytes_2_unicode_dict_inverse_[c])); }
+    for (wchar_t c : str) {
+      auto it = bytes_2_unicode_dict_inverse_.find(c);
+      if (it != bytes_2_unicode_dict_inverse_.end()) {
+        utf_8_str.push_back(static_cast<unsigned char>(it->second));
+      }
+      // Silently skip unmapped characters, or optionally log a warning
+    }
     return {mllm::preprocessor::utf8string2WideString(utf_8_str)};
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mllm/models/qwen3_5/tokenization_qwen3_5.hpp` around lines 206 - 211, In
detokenize, avoid using bytes_2_unicode_dict_inverse_[c] (operator[]) which
inserts defaults for unknown wchar_t; instead, validate lookups from
bytes_2_unicode_dict_inverse_ for each wchar_t returned by _detokenize and
handle misses explicitly (e.g., use find() or at() with try/catch) — on missing
mapping append a safe UTF-8 replacement sequence (such as the UTF-8 bytes for
U+FFFD or a question mark) to utf_8_str or return an error; update the function
around _detokenize, utf_8_str, and bytes_2_unicode_dict_inverse_ to perform this
check before converting with mllm::preprocessor::utf8string2WideString.
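A defensive lookup along the lines of the proposed fix, substituting U+FFFD for unmapped characters; `mapBytesBack` and its map parameter are stand-ins for the tokenizer's member `bytes_2_unicode_dict_inverse_`:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Map BPE byte-level characters back to raw bytes; unknown characters become
// the UTF-8 encoding of U+FFFD instead of a silent null byte from operator[].
std::string mapBytesBack(const std::wstring& str,
                         const std::unordered_map<wchar_t, unsigned char>& inverse) {
  std::string out;
  for (wchar_t c : str) {
    auto it = inverse.find(c);  // find() never inserts, unlike operator[]
    if (it != inverse.end()) {
      out.push_back(static_cast<char>(it->second));
    } else {
      out += "\xEF\xBF\xBD";  // U+FFFD REPLACEMENT CHARACTER
    }
  }
  return out;
}
```

Using `find()` also keeps the method logically const, since `operator[]` mutates the map on a miss.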

Comment on lines +226 to +229
  ARGenerationOutputPast convertMessage(const Qwen3_5Message& message) {
    auto applied_string = Qwen3_5Message::message_template;
    size_t pos = applied_string.find("{{{prompt}}}");
    applied_string.replace(pos, 12, message.prompt);

⚠️ Potential issue | 🟡 Minor

Potential undefined behavior if template placeholder is missing.

If message_template does not contain "{{{prompt}}}", find() returns std::string::npos, and calling replace(npos, 12, ...) throws std::out_of_range, aborting message conversion with an unhandled exception.

🛡️ Proposed fix with validation
   ARGenerationOutputPast convertMessage(const Qwen3_5Message& message) {
     auto applied_string = Qwen3_5Message::message_template;
     size_t pos = applied_string.find("{{{prompt}}}");
+    if (pos == std::string::npos) {
+      throw std::runtime_error("Message template missing {{{prompt}}} placeholder");
+    }
     applied_string.replace(pos, 12, message.prompt);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
  ARGenerationOutputPast convertMessage(const Qwen3_5Message& message) {
    auto applied_string = Qwen3_5Message::message_template;
    size_t pos = applied_string.find("{{{prompt}}}");
+   if (pos == std::string::npos) {
+     throw std::runtime_error("Message template missing {{{prompt}}} placeholder");
+   }
    applied_string.replace(pos, 12, message.prompt);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mllm/models/qwen3_5/tokenization_qwen3_5.hpp` around lines 226 - 229, In
convertMessage, guard against std::string::npos returned by
Qwen3_5Message::message_template.find("{{{prompt}}}") before calling replace:
check the found position (pos) and if it is npos handle the missing placeholder
(for example, log/throw/return an error or append the prompt into the template)
instead of calling applied_string.replace(pos, 12, message.prompt); this
prevents undefined behavior when the "{{{prompt}}}" token is absent.
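The guard can live in a small helper; `applyPrompt` is hypothetical and just demonstrates the npos check (it also derives the replace length from the placeholder instead of the magic constant 12):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Substitute {{{prompt}}} into a chat template, failing loudly when the
// placeholder is absent instead of calling replace() with npos.
std::string applyPrompt(std::string tmpl, const std::string& prompt) {
  static const std::string kPlaceholder = "{{{prompt}}}";
  size_t pos = tmpl.find(kPlaceholder);
  if (pos == std::string::npos) {
    throw std::runtime_error("template missing {{{prompt}}} placeholder");
  }
  tmpl.replace(pos, kPlaceholder.size(), prompt);
  return tmpl;
}
```

A malformed template now produces a clear error at the call site rather than an unexplained std::out_of_range from deep inside the tokenizer.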

Ryan Yu and others added 2 commits March 21, 2026 22:24
…ect inference

Replace simplified linear attention placeholder with complete Gated Delta Network:
- Manual causal depthwise Conv1D for prefill and decode (framework Conv1D
  had stride issues with transposed input)
- Sequential scan with delta rule: decay/beta gating from A_log/dt_bias,
  L2 normalization on q/k, query scaling by 1/sqrt(d_k)
- Conv state and recurrent state management across decode steps
- Standard RMSNorm (no add_unit_offset) matching Qwen3_5RMSNormGated
- EOS set to im_end_token_id for proper chat termination

Verified output matches HuggingFace reference for "Hello world" and math prompts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
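The delta-rule scan this commit describes can be illustrated with a heavily simplified single-step recurrence. This is a generic gated delta update on plain vectors, not the model's actual implementation; `gdnStep`, `readOut`, and the scalar `alpha`/`beta` gates are illustrative stand-ins for the per-head decay and beta derived from A_log/dt_bias:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<std::vector<float>>;  // state: d_v rows x d_k cols

// One step of a simplified gated delta rule: decay the state by alpha, then
// write the prediction error (v - S k) along direction k, scaled by beta.
void gdnStep(Mat& S, const Vec& k, const Vec& v, float alpha, float beta) {
  size_t dv = S.size(), dk = S[0].size();
  Vec Sk(dv, 0.f);  // current prediction for key k
  for (size_t i = 0; i < dv; ++i)
    for (size_t j = 0; j < dk; ++j) Sk[i] += S[i][j] * k[j];
  for (size_t i = 0; i < dv; ++i)
    for (size_t j = 0; j < dk; ++j)
      S[i][j] = alpha * S[i][j] + beta * (v[i] - Sk[i]) * k[j];
}

// Read out the recurrent state for a query: o = S q.
Vec readOut(const Mat& S, const Vec& q) {
  Vec o(S.size(), 0.f);
  for (size_t i = 0; i < S.size(); ++i)
    for (size_t j = 0; j < q.size(); ++j) o[i] += S[i][j] * q[j];
  return o;
}
```

The fixed-size state matrix `S` is what the commit's "recurrent state management across decode steps" carries forward, in place of a growing KV cache.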
…d model

Implements the full QNN ahead-of-time compilation and quantization pipeline
for Qwen3.5's hybrid architecture (18 GDN + 6 full attention layers).
Only the 6 full attention layers are compiled to QNN; GDN layers stay on CPU.

Python quantization pipeline:
- modeling_qwen3_5.py: PyTorch model with QDQ nodes for full attention layers
- runner.py: Qwen3_5Quantizer with calibrate/convert/save workflow
- train.py: CLI entry point for end-to-end quantization

C++ AOT compile & runtime:
- compile.cpp: traces 6 attention layers x 2 seq lengths into 12 QNN graphs
- modeling_qwen3_5_qnn_aot.hpp: per-layer QNN trace model with partial RoPE,
  output gating, and Conv2D LPBQ w4a16 quantization
- aot_run.cpp: hybrid CPU+QNN runtime with per-layer dispatch
- convert_weights.py: pre-bakes +1.0 RMSNorm offset and RoPE tables

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>