Skip to content

Reimplementing histograms with accumulators #2430

@athas

Description

@athas

Currently histograms are a distinct construct throughout the compiler (Hist in the SOACS IR, and SegHist in the GPU/MC IR). Just as we rewrote scatters to be expressed through accumulators, we could do the same for histograms. The advantages would be the following:

  1. An overall simplification of the compiler.
  2. Allowing multiple histogram updates per input element.
  3. Consistent code generation no matter whether it is a "real" histogram in the source program, or produced by a program transformation such as AD.
  4. All other UpdateAdd-based optimisations would work transparently with histograms, such as what is asked for in Fuse concat with reduce_by_index #2418.

I think Cosmin proposed this some time ago as well.

The main challenge is that we do very careful and precise code generation for histograms. However, I believe it may not be so hard to detect that a SegMap is actually "histogram-like", and do our usual subhistogramming/multipass/whatever strategies. Here is an idea:

  1. When generating code for a SegMap, see if it performs any UpdateAccs on free accumulators with operators. If so, record these and treat it as a histogram.
  2. Essentially generate code like we currently do for SegHist, except that instead of having a list of HistOps that we manually apply after the KernelBody has run, the KernelBody itself does the updates. In order to do multihistogramming, we simply change which array a given accumulator (identified by its certificate) refers to. We still need to have all the same logic for combining multiple subhistograms. This is tracked in the ImpState and is currently looked up by lookupAcc, but we could easily add a combinator for locally overriding the underlying array.

The main challenges are the following:

  1. Cosmin's carefully tuned histogram strategy assumes only one update per input element. If we eventually add support for multiple updates, we will have to refine the strategy. This is probably not so difficult and can be postponed.
  2. We will need to handle arbitrary accumulators in reverse-mode AD.

Point (2) here is really difficult, I fear. I added support for scatter-like WithAccs to AD, which was difficult enough, but I don't really know how we could possibly handle arbitrary update operators without adding a tape. Currently we assume that all WithAcc operators are the result of preceding applications of AD, which means they are linear, and so have a constant adjoint. One solution is to preserve Hist at the SOACS level until after AD, at which point we rewrite it to accumulators.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions