Fix SIGFPE crash when Cluster directive used with DSCONV layers #40

Open
velcroapple wants to merge 1 commit into maestro-project:master from velcroapple:fix/cluster-directive-divzero

Conversation

@velcroapple

Bug

Running MAESTRO with mappings that apply a Cluster directive to DSCONV
layers (e.g. Resnet50_rs.m) causes a floating point exception (SIGFPE)
and core dump.

Root Cause

Two issues work together to cause the crash:

  1. In DFA_iteration-analysis.hpp, Cluster directives push an empty
    iter_state_list into valid_iteration_states_ because there is no
    handler for DirectiveClass::Cluster. This causes num_total_cases = 0
    and empty sub-cluster results.

  2. In CA_cost-analysis-engine.hpp, computation_delay is used as a divisor
    on line 344, before the existing zero-guard on line 365 takes effect.
    When sub-cluster results are empty, computation_delay stays 0, causing
    division by zero.

Note: Cluster::GetOfs() returns 0 by design in DFA_directives.hpp, and
the else branch in DFA_cluster-unit.hpp has a "//TODO: Handle this error"
comment, indicating this case was known but unhandled.
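The chain from an empty state list to a zero divisor can be illustrated with a small standalone sketch. It is not MAESTRO code: `IterState`, `CountTotalCases`, and `AccumulateDelay` are hypothetical names, and the assumption that the total case count is a product of per-directive list sizes is a simplification made for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one iteration state of a directive.
struct IterState {
  int id;
};

// Assumed model: the total case count is the product of the per-directive
// state-list sizes, so a single empty list makes the product 0.
long CountTotalCases(
    const std::vector<std::vector<IterState>>& states_per_directive) {
  long total = 1;
  for (const auto& states : states_per_directive) {
    total *= static_cast<long>(states.size());
  }
  return total;
}

// Sums per-sub-cluster delays. With no sub-cluster results the loop body
// never runs and the delay stays at its initial value of 0.
long AccumulateDelay(const std::vector<long>& sub_cluster_delays) {
  long delay = 0;
  for (long d : sub_cluster_delays) {
    assert(d >= 0);  // delays are assumed non-negative in this sketch
    delay += d;
  }
  return delay;
}
```

Dividing an integer by the resulting 0 is undefined behavior in C++; on x86 Linux it typically surfaces as SIGFPE, which is consistent with the crash described above.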

Fix

  • Skip pushing empty iter_state_list for Cluster directives in
    DFA_iteration-analysis.hpp
  • Move the zero-guard for computation_delay to before the first
    division in CA_cost-analysis-engine.hpp
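A minimal sketch of the shape of both changes, under stated assumptions: `AppendStates` and `DivideByDelay` are hypothetical helpers, the enum members other than `Cluster` are placeholders, and returning 0 for a zero delay is an illustrative fallback rather than the value the patch necessarily uses.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins; only DirectiveClass::Cluster is named in the PR.
struct IterState {
  int id;
};
enum class DirectiveClass { TemporalMap, SpatialMap, Cluster };

// Change 1 (sketch): do not record an empty state list for a Cluster
// directive, so it cannot zero out the total case count later.
void AppendStates(std::vector<std::vector<IterState>>& valid_states,
                  DirectiveClass cls, std::vector<IterState> states) {
  if (cls == DirectiveClass::Cluster && states.empty()) {
    return;  // skip instead of pushing an empty list
  }
  valid_states.push_back(std::move(states));
}

// Change 2 (sketch): check the divisor before the first division that
// uses it, not after.
long DivideByDelay(long num_ops, long computation_delay) {
  assert(computation_delay >= 0);  // assumed non-negative in this sketch
  if (computation_delay == 0) {
    return 0;  // hypothetical fallback value
  }
  return num_ops / computation_delay;
}
```

The point of the second helper is ordering: the guard runs on every path that reaches the division, so no earlier use of the divisor can execute with a zero value.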

Reproducer

./maestro --HW_file='data/hw/accelerator_1.m' \
          --Mapping_file='data/mapping/Resnet50_rs.m' \
          --print_res=true

Crashes with SIGFPE before this fix, runs cleanly after.

- Skip pushing empty iter_state_list for Cluster directives in
  DFA_iteration-analysis.hpp to prevent num_total_cases=0
- Add zero-guard for computation_delay before division in
  CA_cost-analysis-engine.hpp line 344
- Reproducer: ./maestro with Resnet50_rs.m mapping crashes on
  DSCONV layers that use Cluster(3,P) with K=1 dimensions