Here, we generate a random number that decides how many mutations are applied;
each mutated value is increased or decreased by 30%. We chose this mutation
because we observed that the initial population vector is very sensitive: changing
its values leads to huge differences in the errors. So we tried to bound the change.
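A minimal sketch of this bounded mutation follows. It assumes that "30%" is an upper bound on the relative change (the text could also mean a fixed 30% step), and the names `mutate` and `max_change` are hypothetical, not taken from the project code.

```python
import random


def mutate(vector, max_change=0.30, rng=random):
    """Return a copy of `vector` with a random number of genes rescaled.

    Assumption: each chosen gene changes by at most `max_change` (30%)
    of its current value, up or down.
    """
    child = list(vector)
    # A random number decides how many genes are mutated (at least one).
    n_mutations = rng.randint(1, len(child))
    for i in rng.sample(range(len(child)), n_mutations):
        factor = 1.0 + rng.uniform(-max_change, max_change)
        child[i] *= factor
    return child
```

Because the change is multiplicative, a weight never changes sign or order of magnitude in a single mutation, which keeps the search close to the sensitive starting vector.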
Hyperparameters
Pool size:
We selected our pool size to be 10 because of the vector size.
Since a vector has 10 features (and so 10 coefficients, omitting the first element since
its contribution is the least) and we are given an overfit vector as reference, we
optimised one coefficient at a time and got 10 vectors, so we started with these 10
vectors as the initial population. Every time we get better results, we update the
initial population with these newly generated best vectors.
So our pool size is 10 × 10 = 100.
Splitting point:
Since we are using the diagonal multi-parent crossover function, we had to pick 4
crossover points (splitting the chromosome into 5 parts) to have 5 children. We chose
to distribute those 4 points evenly in the chromosome by fixing a point after every
two genes.
There are two reasons to expect that the use of more parents in diagonal crossover
leads to improved GA performance: a high level of disruption and a large sample of the
search space used when creating offspring. As for the first aspect, by using more
crossover points the operator becomes more disruptive, thus more explorative and less
sensitive to premature convergence. Secondly, by the use of more parents there is
more information on the search space and more consensus is needed to focus the
search on a certain region; that is, the danger of (too) early commitment is reduced.
Using this method, we have a higher probability of not getting stuck in a local best
and of reaching the global best.
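The diagonal crossover described above can be sketched as follows. This is an illustration under assumptions, not the project's code: it assumes a 10-gene chromosome (the first coefficient left out), 5 parents, and cut points after genes 2, 4, 6 and 8. The function name `diagonal_crossover` is hypothetical.

```python
def diagonal_crossover(parents, cut_points):
    """Diagonal multi-parent crossover: n parents, n - 1 cuts, n children.

    Child i takes segment j from parent (i + j) mod n, so every child
    mixes genes from all parents along a "diagonal".
    """
    n = len(parents)
    if len(cut_points) != n - 1:
        raise ValueError("need exactly len(parents) - 1 cut points")
    length = len(parents[0])
    bounds = [0, *cut_points, length]  # segment j spans bounds[j]:bounds[j + 1]
    children = []
    for i in range(n):
        child = []
        for j in range(n):
            donor = parents[(i + j) % n]
            child.extend(donor[bounds[j]:bounds[j + 1]])
        children.append(child)
    return children


# Five dummy parents whose genes are all equal to the parent index.
parents = [[p] * 10 for p in range(5)]
children = diagonal_crossover(parents, cut_points=[2, 4, 6, 8])
print(children[0])  # [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
```

With the dummy parents it is easy to see that each child receives exactly one two-gene segment from every parent.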
Process / Path we followed in reaching our current best and its Heuristics
Analysis of given Overfit vector and Generating Initial population:
First we analyzed the given overfit vector by making some changes to each weight. Then we
took the best of the resulting vectors as our initial population. So our first initial population
consists of 10 vectors, where each vector differs slightly from the given overfit vector in one
of its weights. (The weight of the first coefficient stays the same as in the overfit vector, since
we did not observe much change on changing it.)
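A rough sketch of how such an initial population could be built is shown below. The fixed relative step `delta` is an assumption made for illustration; in the process described above, the change to each weight was chosen by comparing the resulting errors. The name `initial_population` is hypothetical.

```python
def initial_population(overfit, delta=0.05, skip_first=True):
    """Build one candidate per weight, each differing from `overfit`
    in exactly one position.

    Assumption: every perturbed weight is scaled by (1 + delta); the
    first coefficient is left untouched when `skip_first` is True.
    Note that a weight equal to 0 would stay unchanged under scaling.
    """
    start = 1 if skip_first else 0
    population = []
    for i in range(start, len(overfit)):
        candidate = list(overfit)
        candidate[i] *= 1.0 + delta
        population.append(candidate)
    return population
```

For a vector with 11 entries (the first coefficient plus 10 others), this yields the 10 starting vectors.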
Initial Fitness Function:
After generating our initial population, we started calculating the fitness function. Our first
fitness function was the sum of the two errors (i.e. Train_Error + Validation_Error). We observed
that the errors were decreasing to some extent, but not by much.
We noticed that the train errors are considerably low, but the validation errors are huge.
Crossover and Mutations:
The crossovers we tried are uniform crossover and diagonal multi-parent crossover, since
they give a wide range of results.
For mutation, we randomly selected some positions in the array and replaced each of them with a random number%
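For comparison with the diagonal operator, the uniform crossover mentioned above can be sketched like this. The swap probability of 0.5 and the name `uniform_crossover` are assumptions; the text does not say which settings were used.

```python
import random


def uniform_crossover(parent_a, parent_b, swap_prob=0.5, rng=random):
    """Uniform crossover: each gene position is swapped between the two
    parents independently with probability `swap_prob`."""
    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(len(child_a)):
        if rng.random() < swap_prob:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b
```

Unlike diagonal crossover, this operator uses only two parents, so each child carries information from at most two regions of the search space.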
Updating Fitness Function:
Then we started updating our fitness function so that it decreases the validation
error. We tried different variants, and the best one was:
(Train_Error*(X) + Validation_Error*(Y))
We tried different values of X and Y and noticed that the best results are obtained when
X<=4 and Y>=6 and X+Y=
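The weighted fitness can be written as a small helper, sketched below. The function names are hypothetical, and the weights 4 and 6 in the usage lines are only example values that satisfy the stated bounds; the error values are made up for illustration.

```python
def weighted_fitness(train_error, validation_error, x, y):
    """Weighted sum Train_Error*X + Validation_Error*Y; lower is better."""
    return train_error * x + validation_error * y


def rank_population(scored, x, y):
    """Sort (vector, train_error, validation_error) tuples, best first."""
    return sorted(
        scored,
        key=lambda item: weighted_fitness(item[1], item[2], x, y),
    )


# Made-up errors: "a" fits the training set well but validates badly.
scored = [("a", 1.0, 100.0), ("b", 50.0, 10.0)]
best = rank_population(scored, x=4, y=6)[0]
print(best[0])  # b
```

Giving the validation error the larger weight makes a candidate like "b" win even though its training error is worse, which is the behaviour the weighting is meant to encourage.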
Then we observed some good results where the train and validation errors are almost equal
and are in the range of 10^7.
Then we manually crossed over these best results and mutated them, and the errors
decreased to the range of 7 × 10^6.
Mutating the best results:
After the best possible results were achieved, we thought of mutating them. Until now our
mutation updated the value itself; now we thought of updating only the power (the exponent
part). So we tried and analyzed different values as exponents and decided on a range for each
weight.
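One way to mutate only the exponent is sketched below. It assumes base-10 exponents and an inclusive integer range per weight; the name `mutate_exponent` and the range in the usage line are hypothetical, since the actual ranges are not listed here.

```python
import math
import random


def mutate_exponent(value, exp_range, rng=random):
    """Keep the mantissa of `value` and redraw its base-10 exponent.

    `exp_range` is an inclusive (low, high) pair of integer exponents.
    Zero has no meaningful exponent, so it is returned unchanged.
    """
    if value == 0.0:
        return value
    exponent = math.floor(math.log10(abs(value)))
    mantissa = value / 10.0 ** exponent  # keeps the sign of `value`
    new_exponent = rng.randint(exp_range[0], exp_range[1])
    return mantissa * 10.0 ** new_exponent


# Example: the result keeps the mantissa -3.5 and gets an exponent
# drawn from -13, -12 or -11.
print(mutate_exponent(-3.5e-12, (-13, -11)))
```

This changes a weight by whole orders of magnitude while keeping its leading digits, which complements the earlier percentage-based mutation.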
Because of the errors we got: Train error: 259295.9197084011 and Validation
error: 262101.
Here the two errors are almost equal, which suggests that we are not overfitting the dataset;
what we need is the best fit derived from the given overfit vector. Since the given vector is
overfit, it performs well on the training data but not on the validation data, so we need to
prioritise reducing the validation error, compromising the training error to an extent, as it is
already a good one. Here we reduced the validation error from 10^8 to 10^6, compromising the
training error to some extent.
The iteration diagrams are in a separate file named “IterationDiagrams.pdf”, and we have
2 trace outputs: “trace1.txt” contains our intermediate best initial population and
“trace2.txt” contains our final best initial population.
About
A Genetic Algorithm using Diagonal Crossover to reduce overfitting