Skip to content

Conversation

@sauraen
Copy link

@sauraen sauraen commented Sep 21, 2025

The existing check for whether immediate values are in range is incorrect by being too loose. For example, for a hypothetical 8-bit immediate value, it would allow values in the range -0xFF to 0xFF. In reality, if this argument is interpreted by the hardware as unsigned, the legal range is 0x00 to 0xFF, and if it's interpreted as signed, the legal range is -0x80 to 0x7F.

This PR changes the check to what is described above, and adds a flag to instruction definitions signifying whether the immediate is interpreted as signed or unsigned. It also adds an option to relax this check and allow the full signed + unsigned range, though this is still tighter than the existing code (in the example above, it allows -0x80 to 0xFF).

Currently, the relax option is never set. Please let me know whether you would like this to be a new assembler intrinsic, if so what, and if not how you would like it to be configured.

The context is that I have a N64 RSP assembly project using armips which makes heavy use of immediates defined programmatically in terms of labels and other indirect values. One of these worked out to be a value which was outside the signed range but inside the unsigned range for an instruction's immediate (in the example above, like 0xC0). armips accepted it without an error, but the hardware interpreted it as signed (continuing the example, like -0x40), leading to incorrect behavior.

@Kingcom
Copy link
Owner

Kingcom commented Sep 24, 2025

Thanks! I can definitely see why this would be desirable, yeah. But I've also seen code and disassembly that used these unsigned immediates with the intention of being parsed as a two's complement value - which is what I originally based this on. There are probably arguments for e.g. a lui/addiu pair being easier to understand when written that way - as the lower half of the immediate will appear as it would in the final value. But then there are pseudo opcodes for those, so it probably doesn't matter too much.

Still, as plenty such code may exist I think an option to retain the old behavior would be best. Maybe taking a hint from GCC warnings and using something like sign conversion (in our case, with strict and relaxed settings)? Would be nicer to implement if armips had specific warning/error categories one could promote to errors or silence, but that's probably something for the future.

We should also have a few simple tests to verify both cases. Could you add those?

@sauraen
Copy link
Author

sauraen commented Sep 25, 2025

I agree that this change--at least with strict settings--could cause code which is working and "good enough" to not compile, which is why I added the relax option. But I wanted your input about how to set this up.

  • Do you want a command-line flag, or an intrinsic, or something else?
  • If we make the default "warn if strict violated, error if relax violated", do you think this will be too noisy and we should disable even the warning by default?
  • Are you okay with the new relax range being enforced (e.g. now -0x80 to 0xFF is legal, previously -0xFF to 0xFF was legal)? If someone wants -0xFF to -0x81 are accepted despite this not being correctly interpreted as either signed or unsigned, maybe they don't want a range check at all and just mask to the lower N bits.

@Kingcom
Copy link
Owner

Kingcom commented Sep 25, 2025

Do you want a command-line flag, or an intrinsic, or something else?

Most settings can be changed inside of the assembly code, so it would probably be most consistent to make it into a directive. That would also allow to adapt the code bit by bit. As the command caches the value, a simple directive such as this should work.

If we make the default "warn if strict violated, error if relax violated", do you think this will be too noisy and we should disable even the warning by default?

Using the relaxed setting should probably not output a warning, as it's specifically to retain the legacy behavior. I'd think of this like -Wno-sign-conversion. The strict mode (which can be the default) should output an error, i.e. -Werror=sign-conversion. It could potentially have a hint as to why and how to circumvent, but that would likely turn it to be too verbose.

Are you okay with the new relax range being enforced (e.g. now -0x80 to 0xFF is legal, previously -0xFF to 0xFF was legal)? If someone wants -0xFF to -0x81 are accepted despite this not being correctly interpreted as either signed or unsigned, maybe they don't want a range check at all and just mask to the lower N bits.

I think making this always an error is fine. This is just asking for mistakes.

@sauraen
Copy link
Author

sauraen commented Sep 28, 2025

I've connected the relax mode to the intrinsic .relaximmsign <on/off> and added some basic tests for the in/out of range cases. Please let me know if you would like any further changes.

Copy link
Owner

@Kingcom Kingcom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late review. Looks good! Just possibly one small minor change.

return nullptr;
}

std::unique_ptr<CAssemblerCommand> parseDirectiveRelaxImmSign(Parser& parser, int flags)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MIPS specific directives could go into https://github.com/Kingcom/armips/blob/master/Archs/MIPS/MipsParser.cpp#L212 - unless we would want to make this a general feature for all architectures. But as it's mostly a legacy behavior for MIPS it's probably not needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants