feat(vm): 8 opcode handler modules (4)#7

Open
haeter525 wants to merge 16 commits into main from pr-d/handlers
@haeter525
Member

@haeter525 haeter525 commented May 21, 2026

Summary

Fills in the handler modules for the following types of instructions:

  • arithmetic
  • array
  • branch
  • compare
  • field
  • throw
  • type_check
  • type_conv

With these in place, the VM can execute every non-stub-dependent opcode end-to-end.

Depends on #4 (Emulator Foundation), which in turn depends on #5 (parser). The base of this PR is pr-b/vm-scaffold; once #4 and #5 merge, the base auto-updates to main and the diff shrinks to handler-only changes.

Scope

| Area | Change |
| --- | --- |
| src/dextrace/vm/handlers/ | 8 new handler modules (arithmetic, array, branch, compare, field, throw, type_check, type_conv) |
| src/dextrace/vm/engine.py | Add handler imports + register() calls |
| tests/vm/ | test_{arithmetic,wide_arithmetic,handlers,type_check,throw_signal,trace,call_return}.py: handler unit tests + engine integration |
| tests/test_vm_run_*.py | arrays, fibonacci, inheritance, instance_fields, interface_dispatch, long_arithmetic, packed_switch, try_catch, try_catch_with_fields: end-to-end execution tests |
| tools/gen_*_fixture.py | 8 new fixture generators (auto-run by tests/conftest.py at session start; the generated .dex files are gitignored) |

One-line verification

pytest tests/vm/test_arithmetic.py tests/vm/test_wide_arithmetic.py -q

Full suite:

pytest -q   # 314 passed

haeter525 and others added 3 commits May 21, 2026 21:37
Adds DEX try_item and encoded_catch_handler decoding to the existing
DexParser, exposed as new TryItem and CatchHandler dataclasses. No
existing parser behaviour changes.

Includes:
- src/dextrace/core/dex_parser.py: TryItem, CatchHandler, parse_try_items
- tests/test_dex_catch.py: parser-level coverage against the try_catch fixture
- tools/gen_try_catch_fixture.py: generator for the .dex fixture
- tests/conftest.py: pytest_configure auto-runs every tools/gen_*_fixture.py
  at session start so .dex files don't need to be committed
- .gitignore: tests/fixtures/samples/*.dex

Verification (one-liner):
    pytest tests/test_dex_catch.py -q

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Lays down the Dalvik interpreter scaffolding (primitives + engine loop)
with the absolute minimum handler (`move` only — covers const-* opcodes)
and an empty android_stubs registry. Engine wiring for the remaining
handlers and stub modules is held back behind explicit "scaffold:
re-enable in follow-up" comments so the next PRs become pure additions.

End-to-end demo: const_return.dex (one `const/16` + one `return`) runs
through the engine and produces 42.

Primitives (src/dextrace/vm/):
- decoder, register_file, state, call_frame, signals, errors, int_ops
- heap, class_hierarchy, trace
- engine.py (~900 LoC interpreter loop)
- handlers/__init__.py, handlers/move.py (only handler shipped)
- android_stubs/__init__.py (REGISTRY/types only; stub modules deferred)

Supporting:
- core/dex_class_iter.py (used by class_hierarchy)
- dalvik/payload.py: PackedSwitchTable / SparseSwitchTable /
  FillArrayDataTable decoders (consumed by engine for switch + array ops)

Tests (117 passing):
- tests/vm/test_{register_file,int_ops,heap,class_hierarchy,
  class_hierarchy_subtype,switch_payload}.py — primitive unit tests
- tests/test_vm_run_const_return.py — Python-API end-to-end (CLI
  variants land alongside cli/cmd_run.py in the CLI PR)

Held back for follow-up PRs:
- handlers/{arithmetic,array,branch,compare,field,throw,
  type_check,type_conv}.py — engine has matching scaffold markers
  at each import + register() site
- android_stubs/{sms,text,intent,telephony,network,runtime,
  filesystem,content}.py — package __init__ has matching marker
- Integration tests that need full handlers / stubs / CLI

Verification (one-liner):
    pytest tests/test_vm_run_const_return.py -q

Full scaffold suite:
    pytest -q   # 117 passed

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Drop P1/P4/P5a-f and OV-1..OV-6 references from inline comments — the
codenames are noise without the design doc next to them. Also remove the
empty stub-bootstrap block in android_stubs since stub modules will
re-add it when they land.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@haeter525 haeter525 changed the title feat(vm): 8 opcode handler modules feat(vm): 8 opcode handler modules (4) May 21, 2026
@haeter525 haeter525 marked this pull request as ready for review May 21, 2026 16:09
haeter525 and others added 3 commits May 23, 2026 16:54
Void invokes and void external misses no longer synthesize a fake 0 into
pending_result; the dispatch loop now clears pending_result_* whenever
state.pc != pending_result_pc, so only the immediately following
move-result* can consume an invoke's value (per Dalvik verifier).

Addresses PR #4 review feedback.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The inheritance.dex fixture (and its generator on the stacked-PR
chain) produces classes `LBase;` / `LMid;` / `LMain;` — no package
prefix. The test was querying `Lp3/Base;` / `Lp3/Mid;` and always
hit "vtable miss" instead of exercising vtable resolution.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The test_class_hierarchy.py tests added in 06b286a reference
inheritance.dex and fib_recursive.dex, but their generators only
existed downstream on pr-e/android-stubs. On a clean clone (or CI),
conftest.py had nothing to regenerate, the fixtures were missing,
and the tests errored with FileNotFoundError. Pull the generators
upstream to the PR that needs them.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
haeter525 and others added 8 commits May 29, 2026 00:26
The try_catch.dex fixture used a placeholder class descriptor that
leaked an internal short codename into the public test corpus.
Rename it to a descriptive name so downstream tests (test_throw_signal,
test_vm_run_try_catch) query the same descriptor the generator emits
without any decoder ring.

String-table reorder ripples through type_string_ids meanings, so
method_ids, class_def class/superclass, and the catch handler's
type_idx are updated to match.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds the `dextrace run <dex> --entry <sig>` subcommand to the existing
`dextrace` entry point. Flags: --arg / --args (JSON), --json, --trace,
--strict-stubs, --dump-regs, --verbose.

Verification (one-liner):
    pytest tests/test_vm_run_const_return.py -q

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Address review feedback on PR #6:

- `--json --dump-regs` previously printed JSON then appended a plain
  `registers: ...` line, breaking machine consumers. Fold registers
  into the JSON payload instead; text-mode output unchanged.
- `--args '[true]'` slipped past validation because `bool` is a
  subclass of `int` in Python. Use `type(v) not in (int, str)` to
  reject JSON booleans explicitly.
- Strengthen `test_cli_dump_regs` to assert the register line, and
  add coverage for `--json --dump-regs` and `--args '[true]'`.
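
A minimal sketch of the boolean pitfall described above. The function name `parse_args_json` and the error message are hypothetical, not taken from the CLI source; only the `type(v) not in (int, str)` check comes from this commit message.

```python
import json


def parse_args_json(raw: str) -> list:
    """Parse a JSON array of VM arguments, accepting only int and str items.

    Hypothetical helper for illustration; the real CLI code may differ.
    """
    values = json.loads(raw)
    for v in values:
        # isinstance(True, int) is True because bool subclasses int,
        # so an exact type check is needed to reject JSON booleans.
        if type(v) not in (int, str):
            raise ValueError(f"unsupported argument type: {type(v).__name__}")
    return values


if __name__ == "__main__":
    print(parse_args_json('[1, "abc"]'))
    try:
        parse_args_json("[true]")
    except ValueError as exc:
        print("rejected:", exc)
```

An `isinstance(v, (int, str))` check would accept `true` here, which is the behaviour the fix removes.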

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fills in the held-back handler modules from the scaffold: arithmetic,
array, branch, compare, field, throw, type_check, type_conv. Re-enables
their imports and register() calls in engine.py so the VM can execute
every non-Android-stub-dependent opcode.

Verification (one-liner):
    pytest tests/vm/test_arithmetic.py tests/vm/test_wide_arithmetic.py -q

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Four test files queried the VM with codename signatures that no
longer matched what the generators emit, producing
DexTraceVMError: method not found:

  Lp2/Fib;     → LFibonacciTest;       (test_vm_run_fibonacci.py)
  LConstReturnTest; → Lcom/example/ConstReturn;
                                       (test_vm_run_inheritance.py,
                                        tests/vm/test_handlers.py)
  Lp5e;        → LArraysTest;          (tests/vm/test_trace.py)
  Lp5f;        → LInterfaceDispatchTest; (tests/vm/test_trace.py)

Rename the queries to match what the fixture/generator already
produces — the codenames were stale, not the fixtures.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@haeter525 haeter525 changed the base branch from pr-b/vm-scaffold to main June 2, 2026 00:07
# Conflicts:
#	src/dextrace/vm/engine.py
#	tools/gen_const_return_fixture.py
#	tools/gen_fibonacci_fixture.py
#	tools/gen_inheritance_fixture.py
@sidra-asa
Collaborator

The PR is structurally cleaner after rebasing onto main, and the existing test suite passes. However, several VM semantics still need tightening before merge.

1. Set pending_result_pc for filled-new-array

File: src/dextrace/vm/handlers/array.py

filled-new-array produces a pending result that must be consumed by the immediately following move-result-object. It currently sets pending_result but does not set pending_result_pc, so the engine cannot enforce the pending-result consumer invariant.

Suggested fix

state.pending_result = handle
state.pending_result_is_wide = False
state.pending_result_pc = insn.uoff + insn.size_units

Also clear pending_result_pc in all move-result* handlers after consumption.
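
To illustrate the invariant, here is a rough sketch assuming a simplified state object. The `State` class and the three helper functions are hypothetical stand-ins; only the `pending_result*` field names come from the PR.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class State:
    # Simplified, hypothetical stand-in for the VM state.
    pc: int = 0
    pending_result: Any = None
    pending_result_is_wide: bool = False
    pending_result_pc: Optional[int] = None


def clear_pending_result(state: State) -> None:
    state.pending_result = None
    state.pending_result_is_wide = False
    state.pending_result_pc = None


def set_pending_result(state: State, value: Any, next_pc: int) -> None:
    """Producer side (invoke-*, filled-new-array): record value and consumer pc."""
    state.pending_result = value
    state.pending_result_is_wide = False
    state.pending_result_pc = next_pc


def before_dispatch(state: State) -> None:
    """Dispatch-loop side: drop the result unless this pc is the expected consumer."""
    if state.pc != state.pending_result_pc:
        clear_pending_result(state)


def consume_pending_result(state: State) -> Any:
    """Consumer side (move-result*): take the value, then clear all three fields."""
    value = state.pending_result
    clear_pending_result(state)
    return value
```

With this shape, a producer that forgets to set `pending_result_pc` loses its value at the next dispatch, which makes the omission visible instead of silently leaking a stale result.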

2. Implement Java/Dalvik float-to-integral narrowing semantics

File: src/dextrace/vm/handlers/type_conv.py

float-to-int, double-to-int, float-to-long, and double-to-long currently call Python int(). This raises ValueError on NaN and OverflowError on Infinity, whereas Java/Dalvik conversions are deterministic: NaN becomes 0 and out-of-range values saturate to the target type's minimum or maximum.

Suggested fix

Add conversion helpers that handle NaN, +Infinity, -Infinity, and overflow boundaries, then replace direct int(f) / int(d) calls with those helpers.
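
One possible shape for those helpers, as a sketch only. The function names below are hypothetical; the rules follow Java's narrowing conversion (NaN to 0, saturation at the bounds, truncation toward zero otherwise).

```python
import math

INT_MIN, INT_MAX = -(2**31), 2**31 - 1
LONG_MIN, LONG_MAX = -(2**63), 2**63 - 1


def _narrow(value: float, lo: int, hi: int) -> int:
    """Convert a float to an integer in [lo, hi] using Java narrowing rules."""
    if math.isnan(value):
        return 0
    # Python compares float and int exactly, so these checks also
    # cover +Infinity and -Infinity without special-casing them.
    if value >= hi:
        return hi
    if value <= lo:
        return lo
    return int(value)  # int() truncates toward zero, as Java does


def float_to_int(value: float) -> int:
    # Hypothetical name; would back float-to-int and double-to-int.
    return _narrow(value, INT_MIN, INT_MAX)


def float_to_long(value: float) -> int:
    # Hypothetical name; would back float-to-long and double-to-long.
    return _narrow(value, LONG_MIN, LONG_MAX)
```

The same two functions can serve both the float and double opcodes, since the source value is already a Python float by the time it is narrowed.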

3. Store ushr-int* results as signed 32-bit register values

File: src/dextrace/vm/handlers/arithmetic.py

ushr-int* currently writes raw unsigned Python integers. For example, -1 >>> 0 becomes 4294967295, but the Dalvik register should hold the signed 32-bit value -1.

Suggested fix

Wrap all ushr-int* results with i32(...):

state.registers.set(dest, i32(u32(a) >> (b & 0x1F)))

Apply the same change to /2addr and /lit8.
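
For reference, a standalone sketch of the wrapping. The bodies of `u32` and `i32` here are assumptions about what the project's helpers do, and `ushr_int` is a hypothetical wrapper around the expression above.

```python
def u32(x: int) -> int:
    """Reinterpret x as an unsigned 32-bit value."""
    return x & 0xFFFFFFFF


def i32(x: int) -> int:
    """Reinterpret the low 32 bits of x as a signed two's-complement value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def ushr_int(a: int, b: int) -> int:
    # Only the low 5 bits of the shift count are used for 32-bit shifts.
    return i32(u32(a) >> (b & 0x1F))


if __name__ == "__main__":
    print(ushr_int(-1, 0))   # stays -1 instead of 4294967295
    print(ushr_int(-1, 1))   # 2147483647
```

The outer `i32` only changes the result when the shift count masks to 0 and the input is negative, which is exactly the case the review calls out.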

4. Add width-specific array and field handlers for byte/short/char/boolean

Files:

  • src/dextrace/vm/handlers/array.py
  • src/dextrace/vm/handlers/field.py

The typed variants currently share generic _aget/_aput and _iget/_iput logic. This misses Dalvik width semantics: byte and short should sign-extend, char should zero-extend, and boolean should normalize to 0/1.

Suggested fix

Add width helpers such as _i8, _i16, _u16, and _bool, then register typed closures separately for array and field handlers. Add regression tests like:

255 -> aput-byte -> aget-byte == -1
255 -> iput-byte -> iget-byte == -1
