-
Notifications
You must be signed in to change notification settings - Fork 382
Add fallback for ARM Windows fp16 detection. #348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Hi @tonybaloney, would you mind taking a look at this one? |
|
What chip is it? |
This was observed on an Azure Standard D16pds v5 machine with an Ampere Altra processor. |
|
|
||
| // PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE may not be available in older | ||
| // Windows versions. If fp16arith was not detected with | ||
| // IsProcessorFeaturePresent(PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE), fall |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although this is true in practice, if the detection for FP16 is available and used, and it says false, we should respect that?
There have been recent A75 without FP16. They also dont have dot product.
Can you clarify a case for when this is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although this is true in practice, if the detection for FP16 is available and used, and it says false, we should respect that?
Yes, I agree that if detection is available, we should respect it. However, I am not sure that it is actually available. IsProcessorFeaturePresent() returns 0 both when the feature is actually not detected to be present and also when detection is not available.
Can you clarify a case for when this is necessary?
The behavior I observed was that, on a CI build system with an Ampere Altra processor, cpuinfo_has_arm_fp16_arith() used to return true (prior to the use of IsProcessorFeaturePresent(PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE) in src/arm/windows/init.c) and then started to return false. On that same system, I was able to run a test program with NEON FP16 intrinsics successfully, so it appears that FP16 instructions are actually available.
fbarchard
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice src/arm/mach/init.c has a similar issue with older OS not supporting the fp16 detect, so if the cpu is known it sets it
cpuinfo_isa.fp16arith = get_sys_info_by_name("hw.optional.arm.FEAT_FP16") != 0;
if (!cpuinfo_isa.fp16arith) {
// Optional in ARMv8.2-A (implemented in Apple cores),
// list only cores released before iOS 15 / macOS 12
switch (cpu_family) {
case CPUFAMILY_ARM_MONSOON_MISTRAL:
case CPUFAMILY_ARM_VORTEX_TEMPEST:
case CPUFAMILY_ARM_LIGHTNING_THUNDER:
case CPUFAMILY_ARM_FIRESTORM_ICESTORM:
cpuinfo_isa.fp16arith = true;
}
}
|
+1 I think this is either a limitation on the kernel or on the chipset that it's not lighting up the feature. This workaround will have to exist |
Background: On a Windows ARM system, I observed that
cpuinfo_has_arm_fp16_arith()started to return false after upgrading to a more recent cpuinfo version.In #333, the initialization of
cpuinfo_isa.fp16arithwas updated to useIsProcessorFeaturePresent(PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE). I suspect that this is not supported on older Windows versions.This change adds a fallback path to set
cpuinfo_isa.fp16ariththe old way.cpuinfo/src/arm/windows/init.c
Lines 205 to 208 in d3a86a8