[WIP] DL4J toSameDiff conversion method #495

Draft
rnett wants to merge 70 commits into master from rn_dl4j_to_samediff

Conversation

@rnett commented Jun 23, 2020

Adds a toSameDiff method (plus overloads) to MultiLayerNetwork and ComputationGraph, and adds the corresponding define* methods for layers, vertices, activations, and losses.
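
A rough usage sketch (only the method name toSameDiff comes from this PR; the no-argument overload and the surrounding setup are assumptions):

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class ToSameDiffExample {
    public static void main(String[] args) {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .list()
                .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                        .nIn(4).nOut(3).activation(Activation.SOFTMAX).build())
                .build();
        MultiLayerNetwork network = new MultiLayerNetwork(conf);
        network.init();

        // Convert the initialized DL4J network to a SameDiff graph.
        // Assumed no-argument overload; the PR adds toSameDiff plus overloads.
        SameDiff sd = network.toSameDiff();
        System.out.println(sd.summary());
    }
}
```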

Need a SameDiff Mish function (a composition workaround is sketched below).
There is currently no support for the Truncate convolution mode in SameDiff.
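
Until a dedicated op exists, Mish can in principle be composed from existing ops, since mish(x) = x * tanh(softplus(x)). A minimal sketch; the helper name is ours:

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;

// Sketch: compose Mish from tanh and softplus while no fused op exists.
static SDVariable mish(SameDiff sd, SDVariable x) {
    // mish(x) = x * tanh(softplus(x))
    return x.mul(sd.math().tanh(sd.nn().softplus(x)));
}
```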

rectifiedTanh and rationalTanh were in SameDiff.math but not SameDiff.nn.

New SameDiff functions needed for optimization:

  • A ReLU that supports a leaky slope and a custom threshold at the same time (a composition sketch follows this list).
  • A non-weighted cross-entropy loss. The weightedCrossEntropyWithLogits javadoc says it supports null weights, but I get an exception. Same for BinaryCrossentropy.
  • For LossSparseMCXENT, a version of OneHot that takes depth as an SDVariable. The current version takes a double and should be passed input.shape[-1], which I can't get without offline shape inference.
  • 1D versions of the subsampling and upsampling ops.
  • A way to set depthMultiplier for depthwise convolutions.
  • Weight format on all convolution configurations.
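
A stopgap sketch for the first item, composed from comparison and arithmetic ops (the semantics below, alpha * x under the threshold, are an assumption about what the fused op should do):

```java
import org.nd4j.autodiff.samediff.SDVariable;

// Sketch: leaky ReLU with a custom threshold,
// out = x where x > threshold, alpha * x otherwise, blended via a 0/1 mask.
static SDVariable leakyReluWithThreshold(SDVariable x, double alpha, double threshold) {
    SDVariable mask = x.gt(threshold).castTo(x.dataType()); // 1.0 where x > threshold
    return x.mul(mask).add(x.mul(alpha).mul(mask.rsub(1.0)));
}
```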

Not being implemented in the first pass:

  • Losses
    • LossMultiLabel
    • OCNNLossFunction
    • LossMixtureDensity
  • Activations
    • ActivationMish
  • Layers
    • Mask Layer (no mask support yet)
    • GlobalPoolingLayer (no SameDiff op?)
    • MaskZeroLayer
    • TFOpLayer
    • Yolo2OutputLayer
    • CenterLossOutputLayer (should be done as a loss function)
    • OCNNOutputLayer maybe?
    • EmbeddingLayer (JVM crash issues)
    • EmbeddingSequenceLayer
    • Autoencoder and VAE
  • Vertices (mostly because there's no way to get the rank of an SDVariable as an int; an SDIndex.ellipses() would also solve this)
    • SubsetVertex
    • DuplicateToTimeSeriesVertex
  • Every dropout type except Dropout (needs SameDiff support).

Need to add support for custom ILossLayer.computeScore methods, e.g. CnnLossLayer.

What's the difference between FrozenLayerWithBackprop and FrozenLayer?

  • toSameDiff
    • MultiLayerNetwork
    • ComputationGraph
  • define*
    • Layers
    • Vertices
    • Activations
    • Losses
    • Dropout (and support).
  • Updater state
  • MNIST test
  • Add to layer tests
  • Add to loss gradient tests

Ryan Nett added 3 commits June 22, 2020 14:00
@rnett requested a review from AlexDBlack June 23, 2020 19:04
@rnett self-assigned this Jun 23, 2020
Ryan Nett added 5 commits June 23, 2020 12:12
@rnett removed the request for review from AlexDBlack June 23, 2020 22:11
Ryan Nett added 9 commits June 23, 2020 15:27
@AlexDBlack left a comment

Overall looks good so far. See comments though.

}

// layer
//TODO regularizations? No SameDiff support for per-layer/weight regularizers

Hm... it's only global ATM, true. Might be worth adding (just not in this PR). Another issue to be opened perhaps.

}

public static SDVariable batchAverage(@NonNull SDVariable loss) {
    return loss.sum().div(loss.shape().get(SDIndex.point(0)));
}

loss.shape().get(SDIndex.point(0)) -> sd.sizeAt(loss, 0)

@rnett (author) replied:

Can we add that to SDVariable (not this PR)? squeeze and expandDims too.
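
For reference, the helper with that suggestion applied (the getSameDiff() hop is how you'd reach sizeAt from the variable; a sketch, not necessarily the final form):

```java
public static SDVariable batchAverage(@NonNull SDVariable loss) {
    // Use sizeAt instead of indexing into the shape vector, per the review.
    SDVariable batchSize = loss.getSameDiff().sizeAt(loss, 0);
    return loss.sum().div(batchSize);
}
```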

Ryan Nett added 5 commits June 26, 2020 20:49
@AlexDBlack left a comment

Only halfway through the second review; I'll submit another review for the rest soon.

Main issues here are:
(a) We'll need to work out a way to do reshapes, permutes, etc of weights at the INDArray level (not SDVariable level) for performance and memory reasons.
(b) We'll need some sort of coverage checking. There are two models we can use for that:
https://github.com/eclipse/deeplearning4j/blob/master/deeplearning4j/deeplearning4j-core/src/test/java/org/deeplearning4j/nn/dtypes/DTypeTests.java
or
https://github.com/eclipse/deeplearning4j/blob/master/nd4j/nd4j-backends/nd4j-tests/src/test/java/org/nd4j/OpValidationSuite.java

Having the tests in the gradient checks is definitely good, and we don't want to needlessly write redundant tests...
That said, the DTypeTests approach allows us to assert that all layers/preprocessors/etc. are checked (and fail a test if not), which is definitely nice; i.e., it stops us introducing a bug if we add a new layer and forget to write/test the SameDiff conversion. A rough sketch of that pattern is below.
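
Roughly what that DTypeTests-style check could look like for conversion coverage (the class name, registration hook, and use of org.reflections are assumptions; DTypeTests uses a similar reflection scan):

```java
import java.lang.reflect.Modifier;
import java.util.HashSet;
import java.util.Set;
import org.deeplearning4j.nn.conf.layers.Layer;
import org.junit.AfterClass;
import org.reflections.Reflections;
import static org.junit.Assert.fail;

public class ToSameDiffCoverageTest { // hypothetical test class
    // Every conversion test registers the layer configurations it exercised.
    private static final Set<Class<?>> seen = new HashSet<>();

    public static void registerSeen(Layer layerConf) {
        seen.add(layerConf.getClass());
    }

    @AfterClass
    public static void assertAllLayersCovered() {
        // Scan for all concrete Layer configs and fail if any were never tested.
        Reflections r = new Reflections("org.deeplearning4j");
        for (Class<? extends Layer> c : r.getSubTypesOf(Layer.class)) {
            if (!Modifier.isAbstract(c.getModifiers()) && !seen.contains(c)) {
                fail("No toSameDiff conversion test for layer: " + c.getName());
            }
        }
    }
}
```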

}

testSameDiffActivations(model, network, input, true);
testSameDiffLoss(model, network, input, labels);

It would also be great to test fitting, i.e., that parameters are the same after each fit step, for say 3 steps. A sketch of such a check is below.
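
A sketch of that check, assuming the conversion keeps DL4J parameter names as SameDiff variable names and that the training config carries over (both assumptions):

```java
import java.util.Map;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.MultiDataSet;
import org.nd4j.linalg.dataset.adapter.SingletonMultiDataSetIterator;
import static org.junit.Assert.assertEquals;

static void testSameDiffFitting(MultiLayerNetwork network, SameDiff sd,
                                INDArray input, INDArray labels) {
    for (int step = 0; step < 3; step++) {
        network.fit(input, labels); // DL4J side
        sd.fit(new SingletonMultiDataSetIterator(new MultiDataSet(
                new INDArray[]{input}, new INDArray[]{labels})), 1); // SameDiff side
        for (Map.Entry<String, INDArray> e : network.paramTable().entrySet()) {
            // assumes parameter names survive the conversion unchanged
            assertEquals("Param mismatch at step " + step + ": " + e.getKey(),
                    e.getValue(), sd.getVariable(e.getKey()).getArr());
        }
    }
}
```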

// out = out.add(bias);
//
// return doActivation(out);
throw new UnsupportedOperationException("Can't convert EmbeddingLayer to SameDiff");

Let's try to isolate this crash.
If it's the squeeze (bad shape) we can maybe work around it via .reshape(-1).castTo(DataType.INT64)

Also, int32 input should be fine for this:
https://github.com/eclipse/deeplearning4j/blob/master/libnd4j/include/ops/declarable/generic/transforms/gather.cpp#L88
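
Sketch of that workaround inside the (assumed) embedding-layer define method; variable names here are assumptions:

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.linalg.api.buffer.DataType;

// Flatten and cast the index input instead of squeezing it:
SDVariable indices = input.reshape(-1).castTo(DataType.INT64); // int32 also fine per gather.cpp
SDVariable out = sd.gather(weights, indices, 0); // look up embedding rows
```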

Ryan Nett added 3 commits June 30, 2020 13:57
Ryan Nett added 30 commits July 2, 2020 15:17