Optimizers#
helixnet.optimizers is the module that contains the optimizers.
- class helixnet.optimizers.Optimizer(lr_o: LearnRate, regularizers: List[Regularizer] | None = None, grad_clip=None)#
An abstract base class for the other optimizers. It also implements the primary training loop.
- Parameters:
lr_o (float | LearnRate) – The learn rate of the optimizer, given either as a float or as a LearnRate object
regularizers (List[Regularizer]) – The regularizers that will be applied to the parameters by the optimizer
grad_clip (float) – The value to which the gradients will be clipped. Pass None to disable gradient clipping
- epoch_done()#
Call this method after every epoch so that the optimizer can update its parameters, such as weight decay.
- optimize(model: Sequential, loss: Tensor) → None#
- This method trains the model by calling optimize_param for every parameter in the model's layers. It is called during training.
- Any parameter in the model that has no gradient is skipped.
- Parameters:
model (models.Sequential) – The model that will be trained
loss (mg.Tensor) – The loss of the model
- optimize_param(parameter: Tensor) → None#
This method receives the parameters one at a time. Subclasses must override it with their update logic.
- Parameters:
parameter (mg.Tensor) – The parameter that will be optimized
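The relationship between optimize and optimize_param described above can be sketched in plain Python. This is only an illustration under stated assumptions: Param, SketchOptimizer and PlainGD are hypothetical stand-ins, not HelixNet classes; the loop takes a list of parameters rather than a model and a loss; and grad_clip is assumed to clip each gradient by value to the range [-grad_clip, grad_clip].

```python
from typing import List, Optional


class Param:
    """Hypothetical stand-in for a trainable tensor (not a HelixNet class)."""

    def __init__(self, value: float, grad: Optional[float] = None):
        self.value = value
        self.grad = grad


class SketchOptimizer:
    """Illustrative base class that owns the loop over parameters."""

    def __init__(self, lr: float, grad_clip: Optional[float] = None):
        self.lr = lr
        self.grad_clip = grad_clip

    def optimize(self, parameters: List[Param]) -> None:
        for p in parameters:
            if p.grad is None:
                continue  # parameters without a gradient are skipped
            if self.grad_clip is not None:
                # Assumption: clip by value to [-grad_clip, grad_clip]
                p.grad = max(-self.grad_clip, min(self.grad_clip, p.grad))
            self.optimize_param(p)

    def optimize_param(self, parameter: Param) -> None:
        raise NotImplementedError("subclasses supply the update logic")


class PlainGD(SketchOptimizer):
    """Smallest possible subclass: plain gradient descent."""

    def optimize_param(self, parameter: Param) -> None:
        parameter.value -= self.lr * parameter.grad


params = [Param(1.0, grad=5.0), Param(2.0, grad=None)]
PlainGD(lr=0.1, grad_clip=1.0).optimize(params)
print(params[0].value, params[1].value)
```

The first parameter moves by lr times the clipped gradient, while the second is left untouched because it has no gradient.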
- class helixnet.optimizers.SGD(lr, momentum=None, regularizers: List[Regularizer] | None = None, clip=None)#
- Stochastic Gradient Descent (SGD) optimizer. It is numerically more stable than Adam.
- optimize_param(parameter: Tensor) → None#
Applies the SGD update to a single parameter.
- Parameters:
parameter (mg.Tensor) – The parameter that will be optimized
Important
Avoid high hyperparameter values, especially for the momentum, because training may become numerically unstable.
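The warning about momentum can be illustrated with a small experiment. The sketch below assumes the classical momentum form (velocity = momentum * velocity - lr * grad), which may differ from what HelixNet implements; momentum_trajectory is a hypothetical helper, and the objective f(x) = x**2 is chosen only for illustration.

```python
def momentum_trajectory(momentum, lr=0.1, steps=200, start=5.0):
    """Minimize f(x) = x**2 with classical momentum; returns every visited x."""
    x, velocity = start, 0.0
    xs = [x]
    for _ in range(steps):
        grad = 2.0 * x                              # derivative of x**2
        velocity = momentum * velocity - lr * grad  # assumed update form
        x += velocity
        xs.append(x)
    return xs


stable = momentum_trajectory(momentum=0.9)
unstable = momentum_trajectory(momentum=1.1)

# Largest |x| over the final steps: small when stable, large when unstable.
print(max(abs(x) for x in stable[-10:]))
print(max(abs(x) for x in unstable[-10:]))
```

With a momentum of 0.9 the iterates settle near the minimum, whereas a momentum above 1 makes the oscillations grow at every step.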
- class helixnet.optimizers.Adam(learning_rate=<helixnet.optimizers.LearnRate object>, epsilon=1e-07, beta_1=0.9, beta_2=0.999, regularizers: List[Regularizer] | None = None, clip=None)#
The Adam optimizer. It can converge quickly but is less numerically stable than SGD.
- Parameters:
learning_rate (float|LearnRate) – The learn rate of the optimizer
- optimize_param(parameter: Tensor) → None#
Applies the Adam update to a single parameter.
- Parameters:
parameter (mg.Tensor) – The parameter that will be optimized
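For readers unfamiliar with the update rule, here is a minimal sketch of one Adam step on a scalar. It follows the commonly published form of Adam with bias correction and reuses the beta_1, beta_2 and epsilon defaults from the signature above. Whether HelixNet applies bias correction in exactly this way is an assumption, adam_step is a hypothetical helper, and lr=0.001 is an illustrative value, not a documented default.

```python
import math


def adam_step(value, grad, m, v, t, lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    """One Adam update on a scalar; returns (new_value, new_m, new_v)."""
    m = beta_1 * m + (1.0 - beta_1) * grad         # first-moment estimate
    v = beta_2 * v + (1.0 - beta_2) * grad * grad  # second-moment estimate
    m_hat = m / (1.0 - beta_1 ** t)                # bias correction (assumed)
    v_hat = v / (1.0 - beta_2 ** t)
    value -= lr * m_hat / (math.sqrt(v_hat) + epsilon)
    return value, m, v


# One step on f(x) = x**2 starting from x = 5 (gradient 2 * x = 10).
x, m, v = 5.0, 0.0, 0.0
x, m, v = adam_step(x, 2.0 * x, m, v, t=1)
print(x)
```

With bias correction, the first step has a magnitude of roughly lr regardless of the gradient's scale, which is one reason Adam tends to make fast early progress.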
- class helixnet.optimizers.RMSProp(lr=<helixnet.optimizers.LearnRate object>, epsilon=1e-07, rho=0.9, regularizers: List[Regularizer] | None = None)#
The Root Mean Square Propagation optimizer, RMSProp for short.
- optimize_param(parameter: Tensor) → None#
This method contains the update logic for a single parameter.
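As a rough illustration of the update logic for a single parameter, the sketch below shows the textbook RMSProp step on a scalar, using the rho and epsilon defaults from the signature above. The exact form used by HelixNet is an assumption, rmsprop_step is a hypothetical helper, and lr=0.001 is an illustrative value.

```python
import math


def rmsprop_step(value, grad, cache, lr=0.001, rho=0.9, epsilon=1e-7):
    """One RMSProp update on a scalar; returns (new_value, new_cache)."""
    # Running average of squared gradients
    cache = rho * cache + (1.0 - rho) * grad * grad
    # Scale the step by the root of that average
    value -= lr * grad / (math.sqrt(cache) + epsilon)
    return value, cache


# One step on f(x) = x**2 starting from x = 5 (gradient 2 * x = 10).
x, cache = 5.0, 0.0
x, cache = rmsprop_step(x, 2.0 * x, cache)
print(x, cache)
```

Dividing by the root of the running average shrinks steps along directions with consistently large gradients and enlarges them where gradients are small.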
- class helixnet.optimizers.NesterovSGD(lr, momentum=0.9, regularizers: List[Regularizer] | None = None, clip=None)#
A Nesterov Stochastic Gradient Descent optimizer.
- Parameters:
lr (float|LearnRate) – The learn rate of the optimiser
momentum (float) – The momentum value
regularizers (list) – The list which contains the regularizers
- optimize_param(parameter: Tensor) → None#
Applies the Nesterov SGD update to a single parameter.
- Parameters:
parameter (mg.Tensor) – The parameter that will be optimized
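A minimal sketch of a Nesterov momentum step on a scalar follows. It uses one widely used reformulation in which the stored value is the look-ahead position; HelixNet may implement the update differently, and nesterov_step is a hypothetical helper. The momentum default of 0.9 matches the signature above, while lr=0.1 is illustrative.

```python
def nesterov_step(value, grad, velocity, lr=0.1, momentum=0.9):
    """One Nesterov momentum update; returns (new_value, new_velocity)."""
    previous = velocity
    velocity = momentum * velocity - lr * grad
    # Look-ahead correction: blend the previous and the new velocity
    value += -momentum * previous + (1.0 + momentum) * velocity
    return value, velocity


# Minimize f(x) = x**2 (gradient 2 * x) starting from x = 5.
x, v = 5.0, 0.0
for _ in range(100):
    x, v = nesterov_step(x, 2.0 * x, v)
print(x)
```

On this quadratic the iterates contract toward zero, and after 100 steps x is very close to the minimum.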
Learn Rates#
HelixNet provides several classes for keeping the learn rate constant or decaying it over time.
- class helixnet.optimizers.LearnRate(lr: float)#
A constant learn rate
- Parameters:
lr (float) – The constant learn rate value
- class helixnet.optimizers.ExpDecay(lr: float, decay: float)#
Exponential decay of the learn rate
- Parameters:
lr (float) – The initial learn rate
decay (float) – The decay rate of the learn rate
- class helixnet.optimizers.LinearDecay(lr: float, decay: float)#
Linear decay of the learn rate, where \(\mathrm{lr} = \mathrm{decay} \cdot \mathrm{steps} + \mathrm{lr}_{\mathrm{init}}\)
- Parameters:
lr (float) – The initial learn rate
decay (float) – The decay rate of the learn rate
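The schedules can be compared with a short sketch. linear_decay follows the formula documented for LinearDecay, so a negative decay is needed for the rate to shrink. The exponential form lr_init * exp(-decay * steps) is only an assumption, since the exact expression used by ExpDecay is not stated here. Both functions are hypothetical helpers, and steps is assumed to count completed epochs.

```python
import math


def linear_decay(lr_init, decay, steps):
    """Documented form: lr = decay * steps + lr_init."""
    return decay * steps + lr_init


def exp_decay(lr_init, decay, steps):
    """Assumed form: lr = lr_init * exp(-decay * steps)."""
    return lr_init * math.exp(-decay * steps)


linear = [linear_decay(0.1, -0.01, s) for s in range(3)]
exponential = [exp_decay(0.1, 0.5, s) for s in range(3)]
print(linear)
print(exponential)
```

The linear schedule changes by the same amount every step, while the assumed exponential schedule shrinks by the same factor every step.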
Regularizers#
HelixNet provides two regularizers, helixnet.optimizers.L1 and
helixnet.optimizers.L2. Custom regularizers can be created by
subclassing helixnet.optimizers.Regularizer
- class helixnet.optimizers.Regularizer#
A simple base class for parameter regularization
- regularize(parameter: Tensor) → Tensor#
Subclasses should override this method, because it performs the regularization calculations.
- Parameters:
parameter (mg.Tensor) – The parameter that will be regularized
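The override pattern can be sketched in plain Python without HelixNet installed. Everything here is hypothetical: SketchRegularizer, SketchL1 and SketchL2 are stand-ins, parameters are plain lists of floats rather than tensors, and regularize is assumed to return the gradient contribution of the penalty, which may not match what the library's method returns.

```python
from typing import List


class SketchRegularizer:
    """Hypothetical base class mirroring the override pattern."""

    def __init__(self, strength: float = 0.01):
        self.strength = strength

    def regularize(self, parameter: List[float]) -> List[float]:
        raise NotImplementedError("subclasses perform the calculation")


class SketchL1(SketchRegularizer):
    def regularize(self, parameter: List[float]) -> List[float]:
        # Gradient of strength * sum(|w|) is strength * sign(w)
        return [self.strength * ((w > 0) - (w < 0)) for w in parameter]


class SketchL2(SketchRegularizer):
    def regularize(self, parameter: List[float]) -> List[float]:
        # Gradient of strength * sum(w**2) is 2 * strength * w
        return [2.0 * self.strength * w for w in parameter]


weights = [1.0, -2.0, 0.0]
l1_term = SketchL1(0.1).regularize(weights)
l2_term = SketchL2(0.1).regularize(weights)
print(l1_term)
print(l2_term)
```

The L1 term has the same magnitude for every non-zero weight, which pushes small weights toward exactly zero, while the L2 term grows with the weight and mainly penalizes large values.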