Hello. I was really delighted by this new type of structural optimization of neural networks.
Thank you for your work, it is really awesome. 👏
I am currently doing detailed research on the architectures that WANN algorithms can generate for classification tasks. At some point I tried changing the activation functions in the input and output layers, and I got interesting results:
Experiments
Hyperparameters used for all experiments:
Weights set: -2, -1, -0.5, 0.5, 1, 2
Max generations: 1024
Population size: 512
Rank by performance only \ by network complexity: 20% \ 80%
Add connection probability: 20%
Add neuron probability: 25%
Change activation function probability: 50%
Enable disabled connection probability: 5%
Keep best species in next population (elitism): 20
Destroy bad species in next population (cull): 20
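For anyone who wants to reproduce the runs, the settings above could be collected in one place. This is only a sketch: the class and field names are hypothetical and not taken from any WANN code base, and only the values come from the list above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WannHyperparameters:
    """Hypothetical container; field names are illustrative, values are from the post."""
    shared_weights: Tuple[float, ...] = (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)
    max_generations: int = 1024
    population_size: int = 512
    # Share of rankings done by performance only vs. by network complexity.
    rank_by_performance_only: float = 0.20
    rank_by_complexity: float = 0.80
    # Mutation probabilities.
    p_add_connection: float = 0.20
    p_add_neuron: float = 0.25
    p_change_activation: float = 0.50
    p_enable_connection: float = 0.05
    # Given as plain numbers in the post; the unit is not stated there.
    elitism: int = 20
    cull: int = 20


HYPERPARAMETERS = WannHyperparameters()
```

With these values the four mutation probabilities add up to 100%, so exactly one mutation type can be drawn per offspring.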
XOR
Experiment 1:
Generated architecture without changing activation functions in the input and output layers:

Mean error (for all weights): 0
Experiment 2:
Generated architecture with changing activation functions in the input and output layers:

Mean error (for all weights): 0
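To make "mean error (for all weights)" concrete, here is a minimal sketch of one way to compute it. It assumes mean absolute error averaged over every shared weight and every sample; the exact metric is not stated above. The names `mean_error_over_weights` and `stand_in_forward` are hypothetical, and `stand_in_forward` is only a placeholder, not one of the generated architectures.

```python
from typing import Callable, Sequence, Tuple

WEIGHTS = (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)  # shared weight set from the post

XOR_INPUTS = ((0, 0), (0, 1), (1, 0), (1, 1))
XOR_TARGETS = (0, 1, 1, 0)


def mean_error_over_weights(
    forward: Callable[[Tuple[int, int], float], float],
    inputs: Sequence[Tuple[int, int]],
    targets: Sequence[float],
    weights: Sequence[float] = WEIGHTS,
) -> float:
    """Average absolute error over all shared weights and all samples (assumed metric)."""
    total = 0.0
    for w in weights:  # every connection in the network uses the same value w
        for x, t in zip(inputs, targets):
            total += abs(forward(x, w) - t)
    return total / (len(weights) * len(inputs))


def stand_in_forward(x: Tuple[int, int], w: float) -> float:
    """Placeholder network: ignores w and returns XOR directly."""
    return float(x[0] != x[1])


if __name__ == "__main__":
    print(mean_error_over_weights(stand_in_forward, XOR_INPUTS, XOR_TARGETS))
```

A real evaluation would pass in the forward pass of the evolved network instead of the placeholder.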
Straight lines detection

Inputs: 3x3 square images
Outputs: 2 squares on the right side of each set.
If only a horizontal line exists, the output must be (1, 0).
If only a vertical line exists, the output must be (0, 1).
If both exist, the output must be (1, 1).
If no straight line exists, the output must be (0, 0).
Target: teach a neural network to detect straight (black) lines in a 3x3 image.
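The labeling rules above can be sketched as a small function. This is an illustration under stated assumptions: black pixels are encoded as 1 and white as 0, a "line" means a full row or a full column of three black pixels, and diagonals do not count. The name `line_labels` is hypothetical.

```python
from typing import Sequence, Tuple

Image = Sequence[Sequence[int]]  # 3x3 grid; assumed encoding: 1 = black, 0 = white


def line_labels(image: Image) -> Tuple[int, int]:
    """Return (horizontal, vertical) target outputs for a 3x3 image."""
    # A horizontal line is any row that is entirely black.
    has_horizontal = any(all(px == 1 for px in row) for row in image)
    # A vertical line is any column that is entirely black.
    has_vertical = any(all(row[c] == 1 for row in image) for c in range(3))
    return int(has_horizontal), int(has_vertical)


if __name__ == "__main__":
    cross = [[0, 1, 0],
             [1, 1, 1],
             [0, 1, 0]]
    print(line_labels(cross))  # both a full row and a full column are black
```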
Experiment 1:
Generated architecture without changing activation functions in the input and output layers:

Mean error (for all weights): 0.0455
Experiment 2:
Generated architecture with changing activation functions in the input and output layers:

Mean error (for all weights): 0.0469
Conclusions
Changing activation functions in the input and output layers could reduce complexity without a significant loss of accuracy (mean error 0 vs. 0 on XOR, 0.0455 vs. 0.0469 on straight line detection).
It may also reduce the amount of computation required.
I guess this is because of the connections that go from input to hidden nodes and from hidden to output nodes. In some tasks they can really interfere with optimization, so the synthesis algorithm must "destroy" them by adding extra nodes and connections.
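The difference between the two experiment variants can be sketched as a single switch in the activation mutation: whether input and output nodes are eligible. This is a hypothetical illustration, not code from any WANN implementation. `Node`, `mutate_activation`, `allow_io` and the `ACTIVATIONS` list are all assumed names and values.

```python
import random
from dataclasses import dataclass
from typing import List

# Illustrative activation set; the set actually used in the experiments is not listed above.
ACTIVATIONS = ("linear", "step", "sin", "gaussian", "tanh", "sigmoid", "abs", "relu")


@dataclass
class Node:
    kind: str                    # "input", "hidden" or "output"
    activation: str = "linear"


def mutate_activation(nodes: List[Node], allow_io: bool, rng: random.Random) -> bool:
    """Give one eligible node a new activation; return False if no node is eligible."""
    # Experiment 1 corresponds to allow_io=False, experiment 2 to allow_io=True.
    candidates = [n for n in nodes if allow_io or n.kind == "hidden"]
    if not candidates:
        return False
    node = rng.choice(candidates)
    # Always pick an activation different from the current one.
    node.activation = rng.choice([a for a in ACTIVATIONS if a != node.activation])
    return True
```

With `allow_io=False` a network that has no hidden nodes cannot change any activation, so it has to grow a hidden node first, which matches the guess above about extra nodes and connections.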
P.S. I really hope my investigation helps improve this awesome algorithm for the structural synthesis of neural networks.
❤