class MXNet::Optimizer::SGD
- MXNet::Optimizer::SGD
- MXNet::Optimizer
- Reference
- Object
Overview
The SGD optimizer with momentum and weight decay.
Updates are calculated by:
rescaled_grad = lr * (rescale_grad * clip(grad, clip_gradient) + wd * weight)
state = momentum * state + rescaled_grad
weight = weight - state
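The update rule can be checked numerically. Below is a minimal scalar sketch in Ruby (whose syntax closely mirrors Crystal); the function name and layout are illustrative and are not part of the mxnet.cr API:

```ruby
# Illustrative sketch only (Ruby; not the mxnet.cr API).
# One scalar SGD-with-momentum step, following the update rule above.
def sgd_update(weight, grad, state, lr:, momentum:, wd: 0.0,
               rescale_grad: 1.0, clip_gradient: -1.0)
  # clip(grad, clip_gradient): clipping applies only when the bound is positive
  g = clip_gradient > 0 ? grad.clamp(-clip_gradient, clip_gradient) : grad
  rescaled_grad = lr * (rescale_grad * g + wd * weight)
  state = momentum * state + rescaled_grad
  weight -= state
  [weight, state]
end

w, s = sgd_update(1.0, 0.5, 0.0, lr: 0.1, momentum: 0.9)
# first step: state = 0.1 * 0.5 = 0.05, weight = 1.0 - 0.05 = 0.95
```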
Defined in:
mxnet/optimizer.cr

Constructors
- .new(momentum = 0.0, **kwargs): Creates a new instance.
Instance Method Summary
Instance methods inherited from class MXNet::Optimizer
- create_state(index, weight)
- lr : Float64
- rescale_grad : Float64
- rescale_grad=(rescale_grad)
- set_lr_mult(lr_mult)
- set_wd_mult(wd_mult)
- update(index, weight, gradient, state)
- wd : Float64
Constructor methods inherited from class MXNet::Optimizer
- new(rescale_grad = 1.0, clip_gradient = -1.0, lr = 0.01, wd = 0.0)
Class methods inherited from class MXNet::Optimizer
- create(optimizer, **kwargs)
Constructor Detail
def self.new(momentum = 0.0, **kwargs)
Creates a new instance.

This optimizer accepts the following parameters in addition to those accepted by Optimizer.

Parameters
- momentum (Float, optional) The momentum value.
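As a sketch of what the momentum parameter does (plain Ruby arithmetic, illustrative only, not the mxnet.cr API): with a constant gradient, the state accumulates past updates, so each step grows larger than plain SGD's lr * grad.

```ruby
# Illustrative only: two steps with a constant gradient and momentum = 0.9.
lr, momentum, grad = 0.1, 0.9, 0.5
weight, state = 1.0, 0.0
2.times do
  state = momentum * state + lr * grad  # wd = 0, no clipping
  weight -= state
end
# step 1: state = 0.05; step 2: state = 0.9 * 0.05 + 0.05 = 0.095
```

With momentum = 0.0 both steps would be 0.05; momentum makes the second step nearly twice as large.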