Skip to content

Conversation

@IamYJLee
Copy link
Contributor

@IamYJLee IamYJLee commented Oct 27, 2025

This PR fixes incorrect handling of allocation functions (marked withallockind("alloc") or allockind("realloc")) that do not have anallocsize attribute.

Previously, allocation functions without allocsize could produce invalid expr during verification because their allocation size was not modeled properly.

Changes

Core Fix (instr.cpp)

  • Handle allocation functions without allocsize by calling addFnCall to obtain a symbolic allocation size

Preprocessing Enhancement (transform.cpp)

  • Add isAllocWithoutSize() helper to explicitly identify allocation functions without size information
  • Add has_write_fncall detection in the preprocessor stage, enabled under the isAllocWithoutSize() condition

Tests

  • Add a test case covering allocation functions without an allocsize attribute

Fixes #1238

@IamYJLee IamYJLee changed the title [Draft PR] Added comments where code changes may be needed [Draft PR] Added comments where code changes may be needed - #1238 Oct 27, 2025
ir/state.h Outdated
std::vector<Memory::FnRetData> ret_data;

// Add non-deterministic local_blk_size variable member

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it goes here

Copy link
Contributor Author

@IamYJLee IamYJLee Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your feedback, Nuno.
When managing the non-deterministic variable for local_blk_size in the FnCallOutput structure, should the condition inside FnCall::toSMT be modified to perform both m.alloc and addFnCall?
Or would it be better to handle this by adding a new function to the State class that accesses FnCallOutput after performing m.alloc?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm, maybe we can change addFnCall to return the size, and then pass that size to alloc() in FnCall::toSMT.
Actually, that way we don't even need to add another field to FnCallOutput, since the output of the function is the size. We won't store the pointer, since that's local.

Copy link
Contributor Author

@IamYJLee IamYJLee Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t fully understand your comment yet.
In this case, the @myalloc function is processed through FnCall::toSMT.
To explain in more detail — inside FnCall::toSMT, m.alloc is called first, but addFnCall is not called.
Meanwhile, the @llvm.objectsize function is processed through Ternary::toSMT.

@myalloc 
    ⇒ FnCall::toSMT → m.alloc (does not call addFnCall)
@llvm.objectsize 
    ⇒ Ternary::toSMT

So I’m wondering how the return value from addFnCall should be passed in this situation.
By the way, when I tried calling addFnCall before m.alloc during the processing of @myalloc, I got the following error message:

Assertion failed: (has_write_fncall || num_inaccessiblememonly_fns > 0),
function get_fncallmem_bid, file memory.cpp, line 73.

Actually, I don’t think this error is particularly meaningful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to call addFnCall, but change the return type to be the size of the allocation. Then take that variable and pass it to m.alloc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the addFnCall function before calling m.alloc(). (0bd72f5)
However, I encountered the following assertion in get_fncallmem_bid() (memory.cpp):

Assertion failed: (has_write_fncall || num_inaccessiblememonly_fns > 0), function get_fncallmem_bid, file memory.cpp, line 75.
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  libLLVMSupport.dylib     0x000000010affae8c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 56
1  libLLVMSupport.dylib     0x000000010aff8470 llvm::sys::RunSignalHandlers() + 64
2  libLLVMSupport.dylib     0x000000010affb984 SignalHandler(int, __siginfo*, void*) + 360
3  libsystem_platform.dylib 0x000000019de54624 _sigtramp + 56
4  libsystem_pthread.dylib  0x000000019de1a88c pthread_kill + 296
5  libsystem_c.dylib        0x000000019dd23c60 abort + 124
6  libsystem_c.dylib        0x000000019dd22eec err + 0
7  alive-tv                 0x00000001025303a0 IR::Memory::setState(IR::Memory::CallState const&, IR::SMTMemoryAccess const&, std::__1::vector<IR::PtrInput, std::__1::allocator<IR::PtrInput>> const&, unsigned int) + 3208
8  alive-tv                 0x0000000102552ff0 IR::State::addFnCall(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::vector<IR::StateValue, std::__1::allocator<IR::StateValue>>&&, std::__1::vector<IR::PtrInput, std::__1::allocator<IR::PtrInput>>&&, IR::Type const&, IR::StateValue&&, IR::Type const*, std::__1::vector<IR::StateValue, std::__1::allocator<IR::StateValue>>&&, IR::FnAttrs const&, unsigned int) + 6516

...

According to the error log, has_write_fncall is false.
When allocsize is present, has_write_fncall remains false.
If has_write_fncall is true, it means that the function call writes to global memory (i.e., not inaccessible-only).
Therefore, I think has_write_fncall should be true when both allocsize and allockind are present — or even when only allockind is present.

Also, since Bid is added in Memory::mkFnRet, creating a block during this process seems to result in duplication.

Could you please advise on this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nunoplopes Could you please review this when you get a chance?

alloc_size = I->second.ret_data[0].size;
}

memory.setState(I->second.callstate, memaccess, I->first.args_ptr,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be okay to skip this part when addFnCall is executed before m.alloc?

@IamYJLee
Copy link
Contributor Author

IamYJLee commented Dec 4, 2025

@nunoplopes
Hi Nuno, I know you’re busy, but I’d like to follow up on this draft PR.
I’m having trouble with some parts of the implementation, so if you have a moment, any guidance or feedback would be greatly appreciated.

Copy link
Member

@nunoplopes nunoplopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay.
Looking again with fresh eyes, I think only the first change in instr.cpp is required. Everything else is not needed.

ir/instr.cpp Outdated

if (!attrs.has(FnAttrs::AllocSize)) {
auto result = s.addFnCall(std::move(fnName_mangled).str(), std::move(inputs),
std::move(ptr_inputs), getType(), std::move(ret_val),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type should be an integer with number of bits equal to the size_t width.
Also, you shouldn't set the ret_arg_ty / ret_val here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Address review feedback.

This is the verification result.

define i64 @src() {
#0:
  %stack = call ptr @myalloc() allockind(alloc)
  %sz = objectsize ptr %stack, i1 0, i1 0
  ret i64 %sz
}
=>
define i64 @tgt() {
#0:
  ret i64 -1
}
Transformation doesn't verify!

ERROR: Source and target don't have the same return domain

Example:

Source:
ptr %stack = function did not return!

SOURCE MEMORY STATE
===================
NON-LOCAL BLOCKS:
Block 0 >       size: 0 align: 1        alloc type: 0   alive: false    address: #x00

LOCAL BLOCKS:
Block 2 >       size: 0 align: 8        alloc type: 2   alive: true     address: #x10

Target:

TARGET MEMORY STATE
===================
NON-LOCAL BLOCKS:
Block 0 >       size: 0 align: 1        alloc type: 0   alive: false    address: #x00

If the myalloc() function has the willreturn attribute, the verification result becomes correct.

----------------------------------------
declare ptr @myalloc() willreturn allockind(alloc)

define i64 @src() {
#0:
  %stack = call ptr @myalloc() willreturn allockind(alloc)
  %sz = objectsize ptr %stack, i1 0, i1 0
  ret i64 %sz
}
=>
define i64 @tgt() {
#0:
  ret i64 -1
}
Transformation seems to be correct!

If a function is marked as alloc or realloc but has no AllocSize attribute, do not try to compute the allocation size.
Instead, model the size as a symbolic value returned by the call.
@IamYJLee
Copy link
Contributor Author

IamYJLee commented Jan 5, 2026

Sorry for the delay. Looking again with fresh eyes, I think only the first change in instr.cpp is required. Everything else is not needed.

@nunoplopes No problem. Happy New Year.
Thanks for taking another look and for the feedback. 👍

ir/instr.cpp Outdated
smt::expr np_size;

if (!attrs.has(FnAttrs::AllocSize)) {
has_write_fncall |= true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this must go into the preprocessor. Here it's too late.

ir/instr.cpp Outdated
std::move(ptr_inputs), sizet_type, StateValue{},
nullptr, std::vector<StateValue>{}, attrs, indirect_hash);
size = std::move(result.value);
np_size = expr(true);
Copy link
Member

@nunoplopes nunoplopes Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here we can use result.non_poison.

@nunoplopes
Copy link
Member

It's looking good. Fixing the couple of things I pointed out + adding some tests would make the patch ready to go. Thanks!

- Replace hardcoded true with  in instr.cpp
- Move  handling into the preprocessor stage (transform.cpp)
- Keep AllocSize attribute checks for allocation functions
- Add a test case covering allocations without allocsize
@IamYJLee IamYJLee changed the title [Draft PR] Added comments where code changes may be needed - #1238 Fix handling of allocation functions without allocsize attribute - #1238 Jan 7, 2026
@IamYJLee IamYJLee marked this pull request as ready for review January 7, 2026 02:53
@IamYJLee IamYJLee requested a review from nunoplopes January 7, 2026 02:53
inaccessiblememonly_fns.emplace(call->getName()).second)
++num_inaccessiblememonly_fns;
}
if(attrs.isAllocWithoutSize()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing space here before (

@@ -0,0 +1,14 @@
; ERROR: Source and target don't have the same return domain
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the wrong output. A call into an allocator can be removed.

++num_inaccessiblememonly_fns;
}
if(attrs.isAllocWithoutSize()) {
has_write_fncall |= attrs.mem.canWriteSomething();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want the call to get the object size to write anything.

Copy link
Contributor Author

@IamYJLee IamYJLee Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will adjust the arguments to addFnCall to make the call read-only, which enables the removal of this code.

ir/instr.cpp Outdated
static IntType sizet_type(string("size_t"), config::max_sizet_bits);
auto result = s.addFnCall(std::move(fnName_mangled).str(), std::move(inputs),
std::move(ptr_inputs), sizet_type, StateValue{},
nullptr, std::vector<StateValue>{}, attrs, indirect_hash);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to adjust the attrs here so the call is read-only.

Copy link
Contributor Author

@IamYJLee IamYJLee Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FnAttrs readonly_attrs = attrs;
readonly_attrs.mem.setCanOnlyRead();
auto result = s.addFnCall(std::move(fnName_mangled).str(), std::move(inputs),
      std::move(ptr_inputs), sizet_type, StateValue{},
      nullptr, std::vector<StateValue>{}, readonly_attrs, indirect_hash);

I attempted the change as above, yet the error ERROR: Source and target don't have the same return domain persists.
I think this may be due to the existence of counterexamples in which the myalloc function fails to return.
Would it make sense to add the willreturn attribute to explicitly state that the user-defined allocator always returns (possibly with any value)?

Copy link
Contributor Author

@IamYJLee IamYJLee Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The semantics of this behavior are not explicitly specified in the LLVM Language Reference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm, that's a good point about willreturn. LLVM currently removes calls to custom allocators, though.
Would you mind opening either a discussion on discourse or a bug report to clarify the assumptions on custom allocators, (and CC me) please? Thanks!

@IamYJLee IamYJLee requested a review from nunoplopes January 12, 2026 00:11
@nunoplopes
Copy link
Member

Given the discussion on discourse, I think we want to: 1) mark the function call as touching only inaccessible memory, and 2) introduce a non-deterministic boolean to determine if the function terminates or not.
We will have to the same for other non-library allocation functions. This will make it illegal to reorder e.g. I/O across allocation functions. I don't know if LLVM respects that, but we need to implement it to check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for allockind without allocsize

2 participants