If we implement co_switch as a #define that expands to a GCC __asm__ statement, instead of the current machine code trickery, we can let GCC deal with spilling the registers and, more importantly, let it skip registers that are not in use at the call site.
I believe this would yield a performance boost (though most likely a small one, possibly zero, depending on the platform and what the callers look like). The drawback is an ifdef in the header. It can't go in the backend: there co_switch would be an ordinary function whose callers GCC can't see, so GCC would just dump every callee-saved register to the stack, just like the current backends do.
Does this sound like an interesting optimization, or would ifdefs in headers be unappealing? If it sounds interesting, I'll implement it, check how bsnes performance reacts, and submit a PR later today or tomorrow.
(GCC inline asm obviously won't work on MSVC, but we can keep the current backends for that. It will work on Clang, though probably not on clang-cl.)
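
To make the idea concrete, here is a minimal sketch of the clobber-list mechanism only. It is not a working co_switch: the asm body is empty and no stack is switched. It assumes GCC or Clang on x86-64, and the names CO_SWITCH_CLOBBERS, scaled_sum_across_switch and check_live_value_survives are hypothetical, made up for this illustration. The clobber list is how the asm would tell the compiler which registers the switch destroys, leaving the save/restore decisions to the compiler instead of hand-written machine code.

```c
#include <assert.h>

/* Hypothetical macro, not an existing API. It only declares that the listed
 * callee-saved registers may change at this point; a real co_switch would
 * put the stack-switching instructions in the asm body. */
#if defined(__GNUC__) && defined(__x86_64__)
/* rbp is left out on purpose: GCC can reject it as a clobber while it is
 * serving as the frame pointer, so a real implementation would probably
 * have to save and restore it by hand (assumption, not tested here). */
#define CO_SWITCH_CLOBBERS() \
    __asm__ volatile("" : : : "rbx", "r12", "r13", "r14", "r15", "memory", "cc")
#else
/* Other compilers or targets: no-op so the sketch still builds. */
#define CO_SWITCH_CLOBBERS() ((void)0)
#endif

/* Hypothetical caller: `live` has to survive the pretend switch. The
 * compiler may not keep it in a clobbered register across the asm, and it
 * chooses for itself how to preserve it. */
static long scaled_sum_across_switch(long a, long b)
{
    long live = a * 3;
    CO_SWITCH_CLOBBERS();
    return live + b;
}

/* Small self-check: the value computed before the asm is still correct. */
void check_live_value_survives(void)
{
    assert(scaled_sum_across_switch(2, 5) == 11);
}
```

Because the macro expands inside each caller, the compiler sees the clobbers in the context of that caller's register usage, which is why this would have to live in the header rather than in a backend function.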