Conversation
```java
// layer
//TODO regularizations? No SameDiff support for per-layer/weight regularizers
```
Hm... it's only global ATM, true. Might be worth adding (just not in this PR). Another issue to be opened perhaps.
```java
public static SDVariable batchAverage(@NonNull SDVariable loss){
    return loss.sum().div(loss.shape().get(SDIndex.point(0)));
}
```
`loss.shape().get(SDIndex.point(0))` -> `sd.sizeAt(loss, 0)`
Can we add that to SDVariable (not this PR)? squeeze and expandDims too.
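Putting the two suggestions together, a hedged sketch of the revised helper; it assumes the owning graph is reachable via `loss.getSameDiff()`, which is not something this PR establishes:

```java
public static SDVariable batchAverage(@NonNull SDVariable loss) {
    // Assumption: the SDVariable exposes its owning SameDiff graph
    SameDiff sd = loss.getSameDiff();
    // Average over the minibatch: total loss divided by the size of dimension 0
    return loss.sum().div(sd.sizeAt(loss, 0));
}
```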
AlexDBlack left a comment
Only halfway through the second review. I'll submit another review for the rest soon.
Main issues here are:
(a) We'll need to work out a way to do reshapes, permutes, etc. of weights at the INDArray level (not the SDVariable level), for performance and memory reasons.
(b) We'll need some sort of coverage checking. There are two models we can use for that:
https://github.com/eclipse/deeplearning4j/blob/master/deeplearning4j/deeplearning4j-core/src/test/java/org/deeplearning4j/nn/dtypes/DTypeTests.java
or
https://github.com/eclipse/deeplearning4j/blob/master/nd4j/nd4j-backends/nd4j-tests/src/test/java/org/nd4j/OpValidationSuite.java
Having the tests in the gradient checks is definitely good, and we don't want to needlessly write redundant tests...
That said, the DTypeTests approach allows us to assert that all layers/preprocessors/etc. are checked (and fail a test if not), which is definitely nice. That is, it stops us introducing a bug if we add a new layer and forget to write/test the SameDiff conversion.
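For illustration, a hedged sketch of a DTypeTests-style coverage check; the `seenLayers` registry and the package scan are assumptions about how it could be wired up, not anything in this PR:

```java
// Illustrative coverage test, modeled on the DTypeTests approach: fail if any
// concrete Layer subclass was never exercised by a SameDiff conversion test
@Test
public void testAllLayersCovered() {
    Reflections reflections = new Reflections("org.deeplearning4j");
    Set<Class<? extends Layer>> notCovered = reflections.getSubTypesOf(Layer.class);
    notCovered.removeIf(c -> Modifier.isAbstract(c.getModifiers()));
    notCovered.removeAll(seenLayers); // hypothetical registry filled in by each conversion test
    assertTrue("Layers with no toSameDiff coverage: " + notCovered, notCovered.isEmpty());
}
```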
```java
testSameDiffActivations(model, network, input, true);
testSameDiffLoss(model, network, input, labels);
```
It would also be great to test fitting - i.e., that parameters are the same after each fit step, for say 3 steps.
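A hedged sketch of such a check; it assumes parameter names line up between the network's `paramTable()` and the converted SameDiff variables, and `fitSameDiff` is a hypothetical stand-in for however the converted graph gets trained:

```java
// Hypothetical fitting check: after each fit step, every parameter in the
// converted SameDiff graph should equal the corresponding network parameter
for (int step = 0; step < 3; step++) {
    network.fit(input, labels);      // one fit step on the DL4J network
    fitSameDiff(sd, input, labels);  // hypothetical: the equivalent step on the SameDiff graph
    for (Map.Entry<String, INDArray> e : network.paramTable().entrySet()) {
        assertEquals("Param mismatch at step " + step + ": " + e.getKey(),
                e.getValue(), sd.getVariable(e.getKey()).getArr());
    }
}
```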
```java
// out = out.add(bias);
//
// return doActivation(out);
throw new UnsupportedOperationException("Can't convert EmbeddingLayer to SameDiff");
```
Let's try to isolate this crash.
If it's the squeeze (bad shape) we can maybe work around it via `.reshape(-1).castTo(DataType.INT64)`.
Also, int32 input should be fine for this too:
https://github.com/eclipse/deeplearning4j/blob/master/libnd4j/include/ops/declarable/generic/transforms/gather.cpp#L88
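A hedged sketch of that workaround inside the embedding conversion; the variable names are illustrative, and the key point is flattening and casting the indices instead of squeezing:

```java
// Illustrative only: flatten the index input to a rank-1 integer vector,
// then gather the corresponding rows from the embedding weight table
SDVariable indices = input.reshape(-1).castTo(DataType.INT64); // int32 should work too, per gather.cpp
SDVariable out = doActivation(sd.gather(weights, indices, 0).add(bias));
```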
Adding a `toSameDiff` method + overloads to `MultiLayerNetwork` and `ComputationGraph`, and adding the used `define*` methods for layers, vertices, activations, and losses.

- Need a SameDiff `Mish` function.
- Currently no support for Truncate convolution mode in SameDiff.
- `rectifiedTanh` and `rationalTanh` were in `SameDiff.math` but not `SameDiff.nn`.
- Needed new SameDiff functions for optimization: `weightedCrossEntropyWithLogits` javadoc says it supports null weights, but I get an exception. Same for `BinaryCrossentropy`.
- Not being implemented in the first pass: `Dropout` (needs SameDiff support).
- Need to add support for custom `ILossLayer.computeScore` methods, e.g. `CnnLossLayer`.
- What's the difference between `FrozenLayerWithBackprop` and `FrozenLayer`?
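For context, a minimal usage sketch of the new entry point; the no-arg `toSameDiff()` overload and the `"input"`/`"output"` variable names are assumptions based on the description above, not confirmed API:

```java
// Hedged sketch: convert a trained DL4J network to SameDiff and run inference
SameDiff sd = network.toSameDiff(); // conversion entry point added in this PR
Map<String, INDArray> placeholders = Collections.singletonMap("input", features);
INDArray out = sd.output(placeholders, "output").get("output"); // assumed variable names
```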