class MXNet::Symbol
- MXNet::Symbol
- MXNet::Base
- Reference
- Object
The Symbol API provides neural network graphs and
auto-differentiation. A symbol represents a multi-output symbolic
expression. Symbols are composed from operators, such as simple
matrix operations (e.g. “+”) or neural network layers (e.g. a
convolution layer). An operator can take several input variables,
produce more than one output variable, and have internal state
variables. A variable is either free, in which case we can bind it
to a value later, or the output of another symbol.
a = MXNet::Symbol.var("a")
b = MXNet::Symbol.var("b")
c = 2 * a + b
e = c.bind({"a" => MXNet::NDArray.array([1, 2]), "b" => MXNet::NDArray.array([2, 3])}, MXNet.cpu)
e.forward.first # => [4, 7]
# <NDArray 2 int32 cpu(0)>
A detailed (albeit in Python) tutorial is available at Symbol - Neural network graphs.
Note: most operators provided in Symbol are similar to those in NDArray, although there are a few differences:
- Symbol adopts a declarative programming style. In other words, we first compose the computation and then feed it with data for execution, whereas NDArray adopts an imperative programming style.
- Most binary operators in Symbol, such as #+, don’t broadcast. You need to call the broadcast version of the operator, such as #broadcast_plus, explicitly.
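As a rough illustration of both points, the sketch below composes a broadcasting sum declaratively and only afterwards binds data to it. It relies only on calls that appear in this reference (.var, .broadcast_plus, #bind, forward, MXNet::NDArray.array, MXNet.cpu). Two things are assumptions rather than documented facts: that the shard is required as "mxnet", and that MXNet::NDArray.array accepts nested arrays for 2-D input.

```crystal
require "mxnet" # assumption: the shard is required under this name

# Step 1: compose the computation. No data is involved yet.
x = MXNet::Symbol.var("x")
y = MXNet::Symbol.var("y")
z = MXNet::Symbol.broadcast_plus(x, y) # explicit broadcasting variant of #+

# Step 2: feed the computation with data and execute it.
# Assumption: MXNet::NDArray.array accepts nested arrays for 2-D input.
e = z.bind({
  "x" => MXNet::NDArray.array([[1, 1, 1], [1, 1, 1]]),
  "y" => MXNet::NDArray.array([[0], [1]]),
}, MXNet.cpu)
puts e.forward.first
```

For these inputs, the .broadcast_plus entry in this reference lists [[1, 1, 1], [2, 2, 2]] as the result.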
Included Modules
Extended Modules
Defined in:
mxnet/symbol.cr
Class Method Summary
.abs(data, **kwargs)
Returns the element-wise absolute value of the input.
.activation(data : self, act_type, **kwargs)
Applies an activation function element-wise to the input.
.add(lhs : self | Number, rhs : self | Number)
Returns element-wise sum of the input arrays.
.add_n(data : Array(self), **kwargs)
Adds all input arguments element-wise.
.arange(start : Number, stop : Number? = nil, ctx = Context.current, **kwargs)
Returns evenly spaced values within a given interval.
.arccos(data, **kwargs)
Returns element-wise inverse cosine of the input array.
.arccosh(data, **kwargs)
Returns the inverse hyperbolic cosine of the input array, computed element-wise.
.arcsin(data, **kwargs)
Returns element-wise inverse sine of the input array.
.arcsinh(data, **kwargs)
Returns the inverse hyperbolic sine of the input array, computed element-wise.
.arctan(data, **kwargs)
Returns element-wise inverse tangent of the input array.
.arctanh(data, **kwargs)
Returns the inverse hyperbolic tangent of the input array, computed element-wise.
.argmax(data, **kwargs)
Returns indices of the maximum values along an axis.
.argmin(data, **kwargs)
Returns indices of the minimum values along an axis.
.argsort(data, **kwargs)
Returns the indices that would sort an input array along the given axis.
.broadcast_add(lhs, rhs, **kwargs)
Returns element-wise sum of the input arrays with broadcasting.
.broadcast_axis(data, **kwargs)
Broadcasts the input array over particular axis.
.broadcast_div(lhs, rhs, **kwargs)
Returns element-wise division of the input arrays with broadcasting.
.broadcast_equal(lhs, rhs, **kwargs)
Returns the result of element-wise equal to (#==) comparison operation with broadcasting.
.broadcast_greater(lhs, rhs, **kwargs)
Returns the result of element-wise greater than (#>) comparison operation with broadcasting.
.broadcast_greater_equal(lhs, rhs, **kwargs)
Returns the result of element-wise greater than or equal to (#>=) comparison operation with broadcasting.
.broadcast_lesser(lhs, rhs, **kwargs)
Returns the result of element-wise less than (#<) comparison operation with broadcasting.
.broadcast_lesser_equal(lhs, rhs, **kwargs)
Returns the result of element-wise less than or equal to (#<=) comparison operation with broadcasting.
.broadcast_like(lhs, rhs, **kwargs)
Broadcasts the left hand side to have the same shape as right hand side.
.broadcast_logical_and(lhs, rhs, **kwargs)
Returns element-wise logical and of the input arrays with broadcasting.
.broadcast_logical_or(lhs, rhs, **kwargs)
Returns element-wise logical or of the input arrays with broadcasting.
.broadcast_logical_xor(lhs, rhs, **kwargs)
Returns element-wise logical xor of the input arrays with broadcasting.
.broadcast_maximum(lhs, rhs, **kwargs)
Returns element-wise maximum of the input arrays with broadcasting.
.broadcast_minimum(lhs, rhs, **kwargs)
Returns element-wise minimum of the input arrays with broadcasting.
.broadcast_minus(lhs, rhs, **kwargs)
Returns element-wise difference of the input arrays with broadcasting.
.broadcast_mul(lhs, rhs, **kwargs)
Returns element-wise product of the input arrays with broadcasting.
.broadcast_not_equal(lhs, rhs, **kwargs)
Returns the result of element-wise not equal to (#!=) comparison operation with broadcasting.
.broadcast_plus(lhs, rhs, **kwargs)
Returns element-wise sum of the input arrays with broadcasting.
.broadcast_power(lhs, rhs, **kwargs)
Returns result of first array elements raised to powers from second array, element-wise with broadcasting.
.broadcast_sub(lhs, rhs, **kwargs)
Returns element-wise difference of the input arrays with broadcasting.
.broadcast_to(data, **kwargs)
Broadcasts the input array to a new shape.
.cbrt(data, **kwargs)
Returns element-wise cube-root value of the input.
.ceil(data, **kwargs)
Returns element-wise ceiling of the input.
.clip(data, a_min, a_max, **kwargs)
Clips (limits) the values in an array.
.concat(data : Array(self), **kwargs)
Joins input arrays along a given axis.
.convolution(data : self, weight : self?, bias : self?, kernel, num_filter, **kwargs)
Computes N-D convolution on (N+2)-D input.
.cos(data, **kwargs)
Computes the element-wise cosine of the input array.
.cosh(data, **kwargs)
Returns the hyperbolic cosine of the input array, computed element-wise.
.create_symbol(op, *args, name : String? = nil, **kwargs)
TODO cache op handles
.degrees(data, **kwargs)
Converts each element of the input array from radians to degrees.
.diag(data, **kwargs)
Extracts a diagonal or constructs a diagonal array.
.divide(lhs : self | Number, rhs : self | Number)
Returns element-wise division of the input arrays.
.dot(lhs, rhs, **kwargs)
Computes the dot product of two arrays.
.equal(lhs : self | Number, rhs : self | Number)
Returns the result of element-wise equal to (#==) comparison operation.
.exp(data, **kwargs)
Returns element-wise exponential value of the input.
.expand_dims(data, axis, **kwargs)
Inserts a new axis of size 1 into the array shape.
.expm1(data, **kwargs)
Returns exp(x) - 1 computed element-wise on the input.
.fix(data, **kwargs)
Returns element-wise rounded value to the nearest integer towards zero.
.flatten(data, **kwargs)
Flattens the input array into a 2-D array by collapsing the higher dimensions.
.flip(data, axis, **kwargs)
Reverses the order of elements along given axis while preserving array shape.
.floor(data, **kwargs)
Returns the element-wise floor of the input.
.fully_connected(data : self, weight : self?, bias : self?, num_hidden : Int, **kwargs)
Applies a linear transformation: Y = XWᵀ + b.
.gamma(data, **kwargs)
Returns the gamma function (extension of the factorial function to the reals), computed element-wise on the input array.
.gammaln(data, **kwargs)
Returns the log of the absolute value of the gamma function, computed element-wise on the input array.
.greater(lhs : self | Number, rhs : self | Number)
Returns the result of element-wise greater than (#>) comparison operation.
.greater_equal(lhs : self | Number, rhs : self | Number)
Returns the result of element-wise greater than or equal to (#>=) comparison operation.
.group(symbols : Array(MXNet::Symbol)) : MXNet::Symbol
Creates a symbol that contains a collection of other symbols, grouped together.
.hypot(lhs : self, rhs : self, **kwargs)
Given the legs of a right triangle, return its hypotenuse.
.lesser(lhs : self | Number, rhs : self | Number)
Returns the result of element-wise less than (#<) comparison operation.
.lesser_equal(lhs : self | Number, rhs : self | Number)
Returns the result of element-wise less than or equal to (#<=) comparison operation.
.load(fname)
.log(data, **kwargs)
Returns element-wise natural logarithmic value of the input.
.log10(data, **kwargs)
Returns element-wise base-10 logarithmic value of the input.
.log1p(data, **kwargs)
Returns log(1 + x) computed element-wise on the input.
.log2(data, **kwargs)
Returns element-wise base-2 logarithmic value of the input.
.log_softmax(data, **kwargs)
Computes the log softmax of the input.
.logical_and(lhs : self | Number, rhs : self | Number)
Returns the result of element-wise logical and (#&) comparison operation.
.logical_not(data, **kwargs)
Performs element-wise logical not of the input array.
.logical_or(lhs : self | Number, rhs : self | Number)
Returns the result of element-wise logical or (#|) comparison operation.
.logical_xor(lhs : self | Number, rhs : self | Number)
Returns the result of element-wise logical xor (#^) comparison operation.
.max(data, **kwargs)
Computes the max of array elements over given axes.
.maximum(lhs : self | Number, rhs : self | Number)
Returns element-wise maximum of the input arrays.
.mean(data, **kwargs)
Computes the mean of array elements over given axes.
.min(data, **kwargs)
Computes the min of array elements over given axes.
.minimum(lhs : self | Number, rhs : self | Number)
Returns element-wise minimum of the input arrays.
.modulo(lhs : self | Number, rhs : self | Number)
Returns element-wise modulo of the input arrays.
.multiply(lhs : self | Number, rhs : self | Number)
Returns element-wise product of the input arrays.
.nanprod(data, **kwargs)
Computes the product of array elements over given axes treating not-a-number values (NaN) as one.
.nansum(data, **kwargs)
Computes the sum of array elements over given axes treating not-a-number values (NaN) as zero.
.norm(data, **kwargs)
Computes the norm.
.not_equal(lhs : self | Number, rhs : self | Number)
Returns the result of element-wise not equal to (#!=) comparison operation.
.one_hot(indices, depth, **kwargs)
Returns a one-hot array.
.ones(shape : Int | Array(Int), ctx = Context.current, **kwargs)
Returns an array filled with all ones, with the given shape.
.ones_like(data, **kwargs)
Returns an array of ones with the same shape, data type and storage type as the input array.
.pick(data, index, **kwargs)
Picks elements from an input array according to the indices along the given axis.
.pooling(data : self, **kwargs)
Performs pooling on the input.
.power(base : self | Number, exp : self | Number)
Returns result of first array elements raised to powers from second array, element-wise.
.prod(data, **kwargs)
Computes the product of array elements over given axes.
.radians(data, **kwargs)
Converts each element of the input array from degrees to radians.
.random_exponential(lam : Number = 1.0, ctx : Context = Context.current, **kwargs)
Draws random samples from an exponential distribution.
.random_gamma(alpha : Number = 1.0, beta : Number = 1.0, ctx : Context = Context.current, **kwargs)
Draws random samples from a gamma distribution.
.random_normal(loc : Number = 0.0, scale : Number = 1.0, ctx : Context = Context.current, **kwargs)
Draws random samples from a normal (Gaussian) distribution.
.random_poisson(lam : Number = 1.0, ctx : Context = Context.current, **kwargs)
Draws random samples from a Poisson distribution.
.random_randint(low : Int, high : Int, ctx : Context = Context.current, **kwargs)
Draws random samples from a discrete uniform distribution.
.random_uniform(low : Number = 0.0, high : Number = 1.0, ctx : Context = Context.current, **kwargs)
Draws random samples from a uniform distribution.
.rcbrt(data, **kwargs)
Returns element-wise inverse cube-root value of the input.
.reciprocal(data, **kwargs)
Returns the reciprocal of the argument, element-wise.
.relu(data, **kwargs)
Computes the rectified linear activation.
.reshape(data, shape, **kwargs)
Reshapes the input array.
.reshape_like(lhs, rhs, **kwargs)
Reshape some or all dimensions of lhs to have the same shape as some or all dimensions of rhs.
.rint(data, **kwargs)
Returns element-wise rounded value to the nearest integer.
.round(data, **kwargs)
Returns element-wise rounded value to the nearest integer.
.rsqrt(data, **kwargs)
Returns element-wise inverse square-root value of the input.
.sample_exponential(lam : self, shape = [] of Int32, **kwargs)
Draws concurrent samples from exponential distributions.
.sample_gamma(alpha : self, beta : self, shape = [] of Int32, **kwargs)
Draws random samples from gamma distributions.
.sample_multinomial(data : self, get_prob : Bool = false, **kwargs)
Draws random samples from multinomial distributions.
.sample_normal(mu : self, sigma : self, shape = [] of Int32, **kwargs)
Draws concurrent samples from normal (Gaussian) distributions.
.sample_poisson(lam : self, shape = [] of Int32, **kwargs)
Draws concurrent samples from Poisson distributions.
.sample_uniform(low : self, high : self, shape = [] of Int32, **kwargs)
Draws concurrent samples from uniform distributions.
.save(fname, symbol)
Saves symbol to a JSON file.
.sgd_mom_update(weight : self, grad : self, mom : self, lr : Float, **kwargs)
Momentum update function for Stochastic Gradient Descent (SGD) optimizer.
.sgd_update(weight : self, grad : self, lr : Float, **kwargs)
Update function for Stochastic Gradient Descent (SGD) optimizer.
.shape_array(data, **kwargs)
Returns a 1-D array containing the shape of the data.
.shuffle(data, **kwargs)
Randomly shuffles the elements.
.sigmoid(data, **kwargs)
Computes the sigmoid activation.
.sign(data, **kwargs)
Returns the element-wise sign of the input.
.sin(data, **kwargs)
Computes the element-wise sine of the input array.
.sinh(data, **kwargs)
Returns the hyperbolic sine of the input array, computed element-wise.
.size_array(data, **kwargs)
Returns a 1-D array containing the size of the data.
.slice(data, begin _begin, end _end, **kwargs)
Slices a region of the array.
.slice_axis(data, axis, begin _begin, end _end, **kwargs)
Slices along a given axis.
.slice_like(data, shape_like, **kwargs)
Slices like the shape of another array.
.softmax(data, **kwargs)
Applies the softmax function.
.sort(data, **kwargs)
Returns a sorted copy of an input array along the given axis.
.sqrt(data, **kwargs)
Returns element-wise square-root value of the input.
.square(data, **kwargs)
Returns element-wise squared value of the input.
.subtract(lhs : self | Number, rhs : self | Number)
Returns element-wise difference of the input arrays.
.sum(data, **kwargs)
Computes the sum of array elements over given axes.
.take(a, indices, **kwargs)
Takes elements from an input array along the given axis.
.tan(data, **kwargs)
Computes the element-wise tangent of the input array.
.tanh(data, **kwargs)
Returns the hyperbolic tangent of the input array, computed element-wise.
.tile(data, reps, **kwargs)
Repeats the array multiple times.
.topk(data, **kwargs)
Returns the top k elements in an input array along the given axis.
.transpose(data, **kwargs)
Permutes the dimensions of an array.
.trunc(data, **kwargs)
Returns the element-wise truncated value of the input.
.var(name : String, attr = nil, shape = nil, dtype = nil)
Creates a symbolic variable with the specified name.
.where(condition, x, y, **kwargs)
Returns elements, either from x or y, depending on the condition.
.zeros(shape : Int | Array(Int), ctx = Context.current, **kwargs)
Returns an array filled with all zeros, with the given shape.
.zeros_like(data, **kwargs)
Returns an array of zeros with the same shape, data type and storage type as the input array.
Instance Method Summary
Performs element-wise not equal to (#!=) comparison operation (without broadcasting).
Performs element-wise modulo (without broadcasting).
Performs element-wise logical and (#&) comparison operation (without broadcasting).
Performs element-wise multiplication (without broadcasting).
Returns the result of the first array elements raised to powers from the second array (or scalar), element-wise (without broadcasting).
Performs element-wise addition (without broadcasting).
Leaves the values unchanged.
Performs element-wise subtraction (without broadcasting).
Performs element-wise numerical negative.
Performs element-wise division (without broadcasting).
Performs element-wise less than (#<) comparison operation (without broadcasting).
Performs element-wise less than or equal to (#<=) comparison operation (without broadcasting).
Performs element-wise equal to (#==) comparison operation (without broadcasting).
Performs element-wise greater than (#>) comparison operation (without broadcasting).
Performs element-wise greater than or equal to (#>=) comparison operation (without broadcasting).
Performs element-wise logical xor (#^) comparison operation (without broadcasting).
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
#as_type(dtype : ::Symbol)
Casts all elements of the input to the specified type.
Gets the attribute for specified key.
Recursively gets all attributes from the symbol and its children.
#bind(args : Array(MXNet::NDArray) | Hash(String, MXNet::NDArray) = [] of MXNet::NDArray, ctx : Context = MXNet::Context.current)
Binds the current symbol to an executor and returns the executor.
#broadcast_add(rhs, **kwargs)
Convenience fluent method for .broadcast_add.
Convenience fluent method for
. -
#broadcast_div(rhs, **kwargs)
Convenience fluent method for .broadcast_div.
#broadcast_equal(rhs, **kwargs)
Convenience fluent method for .broadcast_equal.
#broadcast_greater(rhs, **kwargs)
Convenience fluent method for .broadcast_greater.
#broadcast_greater_equal(rhs, **kwargs)
Convenience fluent method for .broadcast_greater_equal.
#broadcast_lesser(rhs, **kwargs)
Convenience fluent method for .broadcast_lesser.
#broadcast_lesser_equal(rhs, **kwargs)
Convenience fluent method for .broadcast_lesser_equal.
#broadcast_like(rhs, **kwargs)
Convenience fluent method for .broadcast_like.
#broadcast_logical_and(rhs, **kwargs)
Convenience fluent method for .broadcast_logical_and.
#broadcast_logical_or(rhs, **kwargs)
Convenience fluent method for .broadcast_logical_or.
#broadcast_logical_xor(rhs, **kwargs)
Convenience fluent method for .broadcast_logical_xor.
#broadcast_maximum(rhs, **kwargs)
Convenience fluent method for .broadcast_maximum.
#broadcast_minimum(rhs, **kwargs)
Convenience fluent method for .broadcast_minimum.
#broadcast_minus(rhs, **kwargs)
Convenience fluent method for .broadcast_minus.
#broadcast_mul(rhs, **kwargs)
Convenience fluent method for .broadcast_mul.
#broadcast_not_equal(rhs, **kwargs)
Convenience fluent method for .broadcast_not_equal.
#broadcast_plus(rhs, **kwargs)
Convenience fluent method for .broadcast_plus.
#broadcast_power(rhs, **kwargs)
Convenience fluent method for .broadcast_power.
#broadcast_sub(rhs, **kwargs)
Convenience fluent method for .broadcast_sub.
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
#clip(a_min, a_max, **kwargs)
Convenience fluent method for .clip.
Returns a deep copy of this symbol.
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
#dot(rhs, **kwargs)
Convenience fluent method for .dot.
Returns a shallow copy of this symbol.
#eval(ctx : Context = MXNet::Context.current)
Evaluates a symbol given arguments.
#eval(*ndargs : MXNet::NDArray, ctx : Context = MXNet::Context.current)
Evaluates a symbol given arguments.
#eval(ctx : Context = MXNet::Context.current, **ndargs : MXNet::NDArray)
Evaluates a symbol given arguments.
Convenience fluent method for
. -
#expand_dims(axis, **kwargs)
Convenience fluent method for .expand_dims.
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
#flip(axis, **kwargs)
Convenience fluent method for .flip.
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Infers the dtypes of all arguments and all outputs, given the known dtypes of some arguments.
Infers the dtypes partially.
Infers the shapes of all arguments and all outputs, given the known shapes of some arguments.
Infers the shapes partially.
Lists all the arguments of the symbol.
Gets all attributes.
Lists all the auxiliary states of the symbol.
Lists all the outputs of the symbol.
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Gets name of the symbol.
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
#one_hot(depth, **kwargs)
Convenience fluent method for .one_hot.
Convenience fluent method for
. -
#pick(index, **kwargs)
Convenience fluent method for .pick.
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
#reshape(shape, **kwargs)
Convenience fluent method for .reshape.
#reshape_like(rhs, **kwargs)
Convenience fluent method for .reshape_like.
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
#slice(begin _begin, end _end, **kwargs)
Convenience fluent method for .slice.
#slice_axis(axis, begin _begin, end _end, **kwargs)
Convenience fluent method for .slice_axis.
#slice_like(shape_like, **kwargs)
Convenience fluent method for .slice_like.
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
#take(indices, **kwargs)
Convenience fluent method for .take.
Convenience fluent method for
. -
Convenience fluent method for
. -
#tile(reps, **kwargs)
Convenience fluent method for .tile.
#to_s(io)
Convenience fluent method for
. -
Convenience fluent method for
. -
Convenience fluent method for
. -
#where(x, y, **kwargs)
Convenience fluent method for .where.
Convenience fluent method for
. -
Performs element-wise logical or (#|) comparison operation (without broadcasting).
Class Method Detail
Returns the element-wise absolute value of the input.
Assume x is an array with the following elements:
[-2, 0, 3]
abs(x) # => [2, 0, 3]
- data (required) Input data.
- name (optional) Name of the symbol.
Applies an activation function element-wise to the input.
The following activation functions are supported:
- relu: Rectified Linear Unit, y = max(x, 0)
- softrelu: Soft ReLU or SoftPlus, y = log(1 + exp(x))
- tanh: Hyperbolic tangent, y = (exp(x) − exp(−x)) / (exp(x) + exp(−x))
- sigmoid: y = 1 / (1 + exp(−x))
- softsign: y = x / (1 + abs(x))
- data (required) The input array.
- act_type (required) Activation function to be applied; one of the functions listed above, e.g. :softsign.
- name (optional) Name of the symbol.
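The activation formulas above can be checked with a few scalar helpers. This is a plain-Crystal sketch with the fractions fully parenthesized. The method names are hypothetical and are not part of the MXNet bindings, which apply these functions element-wise to whole arrays.

```crystal
# Scalar versions of the listed activation formulas.
# Illustrative helpers only; they do not call into MXNet.
def relu(x : Float64) : Float64
  Math.max(x, 0.0)
end

def softrelu(x : Float64) : Float64
  Math.log(1.0 + Math.exp(x))
end

def tanh_activation(x : Float64) : Float64
  (Math.exp(x) - Math.exp(-x)) / (Math.exp(x) + Math.exp(-x))
end

def sigmoid(x : Float64) : Float64
  1.0 / (1.0 + Math.exp(-x))
end

def softsign(x : Float64) : Float64
  x / (1.0 + x.abs)
end

puts relu(-2.0)    # => 0.0
puts sigmoid(0.0)  # => 0.5
puts softsign(3.0) # => 0.75
```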
Returns element-wise sum of the input arrays.
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs + rhs.
Adds all input arguments element-wise.
.add_n is potentially more efficient than calling .add n times.
- data (required) List of arrays to add.
- name (optional) Name of the symbol.
Returns evenly spaced values within a given interval.
Values are generated within the half-open interval [start, stop). In other words, the interval includes start but excludes stop.
arange(3) # => [0.0, 1.0, 2.0]
arange(2, 6) # => [2.0, 3.0, 4.0, 5.0]
arange(2, 6, step: 2) # => [2.0, 4.0]
arange(2, 6, step: 1.5, repeat: 2) # => [2.0, 2.0, 3.5, 3.5, 5.0, 5.0]
arange(2, 6, step: 2, repeat: 3, dtype: :int32) # => [2, 2, 2, 4, 4, 4]
- data (required) Input data.
- start (optional, default = 0.0) Start of interval.
- stop (required) End of interval.
- step (optional, default = 1.0) Spacing between values.
- repeat (optional, default = 1) Number of times to repeat each value.
- dtype (default = :float32) The data type of the output array.
- ctx (optional) Device context (default is the current context). Only used for imperative calls.
- name (optional) Name of the symbol.
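To make the interaction of step, repeat and the half-open interval concrete, here is a minimal plain-Crystal model of the documented behaviour. The name arange_model is hypothetical, the sketch handles only a positive step, and it does not call into MXNet. Unlike .arange, it always takes both start and stop.

```crystal
# Plain-Crystal model of the documented arange semantics (positive step only).
def arange_model(start : Float64, stop : Float64, step : Float64 = 1.0, repeat : Int32 = 1) : Array(Float64)
  raise ArgumentError.new("step must be positive") unless step > 0
  values = [] of Float64
  i = 0
  loop do
    value = start + i * step
    break if value >= stop # half-open interval: stop is excluded
    repeat.times { values << value }
    i += 1
  end
  values
end

p arange_model(2.0, 6.0)                       # => [2.0, 3.0, 4.0, 5.0]
p arange_model(2.0, 6.0, step: 1.5, repeat: 2) # => [2.0, 2.0, 3.5, 3.5, 5.0, 5.0]
```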
Returns element-wise inverse cosine of the input array.
The input should be in the range [-1, 1].
The output is in the closed interval [0, 𝜋].
arccos([-1, -.707, 0, .707, 1]) = [𝜋, 3𝜋/4, 𝜋/2, 𝜋/4, 0]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns the inverse hyperbolic cosine of the input array, computed element-wise.
- data (required) Input data.
- name (optional) Name of the symbol.
Returns element-wise inverse sine of the input array.
The input should be in the range [-1, 1].
The output is in the closed interval [-𝜋/2, 𝜋/2].
arcsin([-1, -.707, 0, .707, 1]) = [-𝜋/2, -𝜋/4, 0, 𝜋/4, 𝜋/2]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns the inverse hyperbolic sine of the input array, computed element-wise.
- data (required) Input data.
- name (optional) Name of the symbol.
Returns element-wise inverse tangent of the input array.
The output is in the closed interval [-𝜋/2, 𝜋/2].
arctan([-1, 0, 1]) = [-𝜋/4, 0, 𝜋/4]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns the inverse hyperbolic tangent of the input array, computed element-wise.
- data (required) Input data.
- name (optional) Name of the symbol.
Returns indices of the maximum values along an axis.
In the case of multiple occurrences of maximum values, the indices corresponding to the first occurrence are returned.
Assume x is an array with the following elements:
[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
argmax(x, axis: 0) = [1.0, 1.0, 1.0]
argmax(x, axis: 1) = [2.0, 2.0]
argmax(x, axis: 1, keepdims: true) = [[2.0], [2.0]]
- data (required) Input data.
- axis (optional, default = -1) The axis along which to perform the reduction. If omitted, the last axis is used.
- keepdims (optional, default = false) If true, the reduced axis is left in the result as a dimension with size one.
- name (optional) Name of the symbol.
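The first-occurrence rule for ties can be modelled in a few lines of plain Crystal. The helper argmax_last_axis is hypothetical and reduces only over the last axis of a 2-D array. It returns integer indices, whereas the examples above show the operator returning them as floating-point values.

```crystal
# Plain-Crystal model of argmax over the last axis of a 2-D array.
def argmax_last_axis(rows : Array(Array(Float64))) : Array(Int32)
  rows.map do |row|
    best = 0
    row.each_with_index do |value, i|
      # Strict comparison keeps the earliest index when values tie.
      best = i if value > row[best]
    end
    best
  end
end

p argmax_last_axis([[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]) # => [2, 2]
p argmax_last_axis([[1.0, 3.0, 3.0]])                  # => [1]
```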
Returns indices of the minimum values along an axis.
In the case of multiple occurrences of minimum values, the indices corresponding to the first occurrence are returned.
Assume x is an array with the following elements:
[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
argmin(x, axis: 0) = [0.0, 0.0, 0.0]
argmin(x, axis: 1) = [0.0, 0.0]
argmin(x, axis: 1, keepdims: true) = [[0.0], [0.0]]
- data (required) Input data.
- axis (optional, default = -1) The axis along which to perform the reduction. If omitted, the last axis is used.
- keepdims (optional, default = false) If true, the reduced axis is left in the result as a dimension with size one.
- name (optional) Name of the symbol.
Returns the indices that would sort an input array along the given axis.
This function performs sorting along the given axis and returns an array of indices having the same shape as an input array that index data in the sorted order.
Assume x is an array with the following elements:
[[0.3, 0.2, 0.4], [0.1, 0.3, 0.2]]
argsort(x) = [[1.0, 0.0, 2.0], [0.0, 2.0, 1.0]]
argsort(x, axis: 0) = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
argsort(x, axis: None) = [3.0, 1.0, 5.0, 0.0, 4.0, 2.0]
argsort(x, is_ascend: false) = [[2.0, 0.0, 1.0], [1.0, 2.0, 0.0]]
- data (required) Input data.
- axis (optional, default = -1) The axis along which to sort the input tensor. If omitted, the last axis is used. If None, the flattened array is used.
- is_ascend (optional, default = true) Whether to sort in ascending or descending order.
- dtype (optional, default = :float32) The data type of the output indices.
- name (optional) Name of the symbol.
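The relationship between the input and the returned indices can be sketched for a single 1-D row in plain Crystal. The helper argsort_model is hypothetical. Breaking ties by original position is an assumption of this sketch, not documented operator behaviour, and the sort direction is passed explicitly so the sketch does not depend on the operator's default.

```crystal
# Plain-Crystal model of argsort for one row.
def argsort_model(values : Array(Float64), is_ascend : Bool) : Array(Int32)
  indices = (0...values.size).to_a
  indices.sort_by do |i|
    # Negating the key reverses the order; the index breaks ties (assumption).
    key = is_ascend ? values[i] : -values[i]
    {key, i}
  end
end

p argsort_model([0.3, 0.2, 0.4], true)  # => [1, 0, 2]
p argsort_model([0.3, 0.2, 0.4], false) # => [2, 0, 1]
```

Both results match the first row of the argsort(x) and argsort(x, is_ascend: false) examples above.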
Returns element-wise sum of the input arrays with broadcasting.
.broadcast_add is an alias for .broadcast_plus.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_add(x, y) # => [[1, 1, 1], [2, 2, 2]]
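The broadcasting rule behind this example can be modelled for 2-D integer arrays in plain Crystal. The helper broadcast_add_model is hypothetical and only illustrates the semantics (an axis of size 1 is stretched to match the other operand). It is not how MXNet implements the operator.

```crystal
# Plain-Crystal model of a broadcasting sum for 2-D integer arrays.
def broadcast_add_model(x : Array(Array(Int32)), y : Array(Array(Int32))) : Array(Array(Int32))
  x_rows, x_cols = x.size, x[0].size
  y_rows, y_cols = y.size, y[0].size
  unless (x_rows == y_rows || x_rows == 1 || y_rows == 1) &&
         (x_cols == y_cols || x_cols == 1 || y_cols == 1)
    raise ArgumentError.new("shapes are not broadcast-compatible")
  end
  rows = Math.max(x_rows, y_rows)
  cols = Math.max(x_cols, y_cols)
  Array.new(rows) do |r|
    Array.new(cols) do |c|
      # An axis of size 1 is reused for every index along that axis.
      x[x_rows == 1 ? 0 : r][x_cols == 1 ? 0 : c] +
        y[y_rows == 1 ? 0 : r][y_cols == 1 ? 0 : c]
    end
  end
end

p broadcast_add_model([[1, 1, 1], [1, 1, 1]], [[0], [1]]) # => [[1, 1, 1], [2, 2, 2]]
```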
Broadcasts the input array over particular axis.
Broadcasting is allowed on axes with size 1, such as from [2, 1, 3, 1] to [2, 8, 3, 9]. Elements will be duplicated on the broadcasted axes.
Assume x is an array with the following elements:
[[[1], [2]]]
broadcast_axis(x, axis: 2, size: 3) = [[[1, 1, 1], [2, 2, 2]]]
broadcast_axis(x, axis: [0, 2], size: [2, 3]) = [[[1, 1, 1], [2, 2, 2]], [[1, 1, 1], [2, 2, 2]]]
- axis (optional) The axis on which to perform the broadcasting.
- size (optional) Target sizes of the broadcasting axis.
- name (optional) Name of the symbol.
Returns element-wise division of the input arrays with broadcasting.
Assume x and y are arrays with the following elements:
[[6, 6, 6], [6, 6, 6]] # x
[[2], [3]] # y
broadcast_div(x, y) # => [[3, 3, 3], [2, 2, 2]]
Returns the result of element-wise equal to (#==) comparison operation with broadcasting.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_equal(x, y) # => [[0, 0, 0], [1, 1, 1]]
Returns the result of element-wise greater than (#>) comparison operation with broadcasting.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_greater(x, y) # => [[1, 1, 1], [0, 0, 0]]
Returns the result of element-wise greater than or equal to (#>=) comparison operation with broadcasting.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_greater_equal(x, y) # => [[1, 1, 1], [1, 1, 1]]
Returns the result of element-wise less than (#<) comparison operation with broadcasting.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_lesser(x, y) # => [[0, 0, 0], [0, 0, 0]]
Returns the result of element-wise less than or equal to (#<=) comparison operation with broadcasting.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_lesser_equal(x, y) # => [[0, 0, 0], [1, 1, 1]]
Broadcasts the left hand side to have the same shape as right hand side.
Broadcasting is a mechanism that allows NDArray to perform arithmetic operations with other arrays of different shapes efficiently without creating multiple copies of arrays.
Broadcasting is allowed on axes with size 1, such as from [2, 1, 3, 1] to [2, 8, 3, 9]. Elements will be duplicated on the broadcasted axes.
Assume x and y are arrays with the following elements:
[[1, 2, 3]] # x
[[5, 6, 7], [7, 8, 9]] # y
broadcast_like(x, y) = [[1, 2, 3], [1, 2, 3]]
Returns element-wise logical and of the input arrays with broadcasting.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_logical_and(x, y) # => [[0, 0, 0], [1, 1, 1]]
Returns element-wise logical or of the input arrays with broadcasting.
Assume x and y are arrays with the following elements:
[[1, 1, 0], [1, 1, 0]] # x
[[1], [0]] # y
broadcast_logical_or(x, y) # => [[1, 1, 1], [1, 1, 0]]
Returns element-wise logical xor of the input arrays with broadcasting.
Assume x and y are arrays with the following elements:
[[1, 1, 0], [1, 1, 0]] # x
[[1], [0]] # y
broadcast_logical_xor(x, y) # => [[0, 0, 1], [1, 1, 0]]
Returns element-wise maximum of the input arrays with broadcasting.
This function compares two input arrays and returns a new array having the element-wise maxima.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_maximum(x, y) # => [[1, 1, 1], [1, 1, 1]]
Returns element-wise minimum of the input arrays with broadcasting.
This function compares two input arrays and returns a new array having the element-wise minima.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_minimum(x, y) # => [[0, 0, 0], [1, 1, 1]]
Returns element-wise difference of the input arrays with broadcasting.
.broadcast_minus is an alias for .broadcast_sub.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_minus(x, y) # => [[1, 1, 1], [0, 0, 0]]
Returns element-wise product of the input arrays with broadcasting.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_mul(x, y) # => [[0, 0, 0], [1, 1, 1]]
Returns the result of element-wise not equal to (#!=) comparison operation with broadcasting.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_not_equal(x, y) # => [[1, 1, 1], [0, 0, 0]]
Returns element-wise sum of the input arrays with broadcasting.
.broadcast_plus is an alias for .broadcast_add.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_plus(x, y) # => [[1, 1, 1], [2, 2, 2]]
Returns result of first array elements raised to powers from second array, element-wise with broadcasting.
Assume x and y are arrays with the following elements:
[[2, 2, 2], [2, 2, 2]] # x
[[1], [2]] # y
broadcast_power(x, y) # => [[2, 2, 2], [4, 4, 4]]
Returns element-wise difference of the input arrays with broadcasting.
.broadcast_sub is an alias for .broadcast_minus.
Assume x and y are arrays with the following elements:
[[1, 1, 1], [1, 1, 1]] # x
[[0], [1]] # y
broadcast_sub(x, y) # => [[1, 1, 1], [0, 0, 0]]
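The broadcast_* operators above share one pattern: an axis of size 1 is stretched to match the other input, and the operation is then applied element by element. A minimal sketch of that pattern in plain Python on 2-D nested lists follows. The helper `broadcast_binary` is hypothetical and not part of MXNet, and it assumes every axis either matches or has size 1.

```python
import operator

def broadcast_binary(op, x, y):
    """Apply `op` element-wise to two 2-D nested lists, broadcasting size-1 axes."""
    rows = max(len(x), len(y))
    cols = max(len(x[0]), len(y[0]))

    def at(a, i, j):
        # An axis of size 1 always reads index 0, which duplicates its element.
        return a[i if len(a) > 1 else 0][j if len(a[0]) > 1 else 0]

    return [[op(at(x, i, j), at(y, i, j)) for j in range(cols)] for i in range(rows)]

x = [[1, 1, 1], [1, 1, 1]]
y = [[0], [1]]
difference = broadcast_binary(operator.sub, x, y)
power = broadcast_binary(operator.pow, [[2, 2, 2], [2, 2, 2]], [[1], [2]])
```

Passing a different function from the `operator` module (for example `operator.add` or `operator.xor`) gives the corresponding broadcast variant.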
Broadcasts the input array to a new shape.
Broadcasting is a mechanism that allows NDArray to perform arithmetic operations with other arrays of different shapes efficiently without creating multiple copies of arrays.
Broadcasting is allowed on axes with size 1, such as from [2, 1, 3, 1] to [2, 8, 3, 9]. Elements will be duplicated on the broadcasted axes.
Assume x is an array with the following elements:
[[1, 2, 3]]
broadcast_to(x, shape: [2, 3]) = [[1, 2, 3], [1, 2, 3]]
A dimension that you do not want to change can also be specified as 0. So with shape: [2, 0], we obtain the same result as in the above example.
- data (required) Input data.
- shape (required) The shape of the desired array.
- name (optional) Name of the symbol.
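The shape rule described for .broadcast_to (size-1 axes are duplicated, and a 0 in the target shape keeps the input dimension) can be sketched in plain Python on nested lists. The names `shape_of` and `broadcast_to` below are illustrative helpers, not the MXNet implementation, and the sketch assumes the input and target shapes have the same number of dimensions.

```python
def shape_of(a):
    """Return the shape of a rectangular, non-empty nested list."""
    shape = []
    while isinstance(a, list):
        shape.append(len(a))
        a = a[0]
    return shape

def _expand(a, src, target):
    if not src:
        return a
    if src[0] == target[0]:
        return [_expand(item, src[1:], target[1:]) for item in a]
    if src[0] == 1:
        # Duplicate the single element along the broadcast axis.
        return [_expand(a[0], src[1:], target[1:]) for _ in range(target[0])]
    raise ValueError("only axes of size 1 can be broadcast")

def broadcast_to(a, shape):
    """Broadcast nested list `a` to `shape`; a 0 entry keeps the input dimension."""
    src = shape_of(a)
    if len(src) != len(shape):
        raise ValueError("input and target shapes must have the same rank")
    target = [s if t == 0 else t for s, t in zip(src, shape)]
    return _expand(a, src, target)

x = [[1, 2, 3]]
explicit = broadcast_to(x, [2, 3])
with_zero = broadcast_to(x, [2, 0])
```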
Returns element-wise cube-root value of the input.
Assume x is an array with the following elements:
[1, 8, -125]
cbrt(x) = [1, 2, -5]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns element-wise ceiling of the input.
The ceiling of x is the smallest integer i such that i >= x.
Assume x is an array with the following elements:
[-2.1, -1.9, 1.5, 1.9, 2.1]
ceil(x) = [-2.0, -1.0, 2.0, 2.0, 3.0]
- data (required) Input data.
- name (optional) Name of the symbol.
Clips (limits) the values in an array.
Given an interval, values outside the interval are clipped to the interval edges. Clipping x between a_min and a_max would be:
clip(x, a_min, a_max) = max(min(x, a_max), a_min)
Assume x is an array with the following elements:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
clip(x, 1, 8) # => [1, 1, 2, 3, 4, 5, 6, 7, 8, 8]
- data (required) Input data.
- a_min (required) Minimum value.
- a_max (required) Maximum value.
- name (optional) Name of the symbol.
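The clipping formula translates directly into code. The following plain-Python sketch applies it to a flat list; the function name `clip` mirrors the operator, but this is only an illustration of the formula, not the MXNet operator itself.

```python
def clip(values, a_min, a_max):
    """Limit each value to the closed interval [a_min, a_max]."""
    return [max(min(v, a_max), a_min) for v in values]

clipped = clip(list(range(10)), 1, 8)
```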
Joins input arrays along a given axis.
The dimensions of the input arrays should be the same except for the axis along which they will be concatenated. The dimension of the output array along the concatenated axis will be equal to the sum of the corresponding dimensions of the input arrays.
Assume x and y are arrays with the following elements:
[[1, 2], [3, 4]] # x
[[1, 4], [1, 1]] # y
concat(x, y) # => [[1, 2, 1, 4], [3, 4, 1, 1]]
- data (required) List of arrays to concatenate.
- dim (default = 1) The dimension along which to concatenate.
- name (optional) Name of the symbol.
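To make the role of dim concrete, here is a small plain-Python sketch for 2-D nested lists. The helper `concat` is illustrative only and handles just dim 0 and dim 1.

```python
def concat(arrays, dim=1):
    """Join 2-D nested lists along axis `dim` (0 or 1)."""
    if dim == 0:
        # Stack the rows of each input one after another.
        return [list(row) for a in arrays for row in a]
    if dim == 1:
        # Extend row i of the output with row i of every input.
        return [[v for a in arrays for v in a[i]] for i in range(len(arrays[0]))]
    raise ValueError("this sketch only supports dim 0 or 1")

x = [[1, 2], [3, 4]]
y = [[1, 4], [1, 1]]
along_columns = concat([x, y])
along_rows = concat([x, y], dim=0)
```

Along dim 1 the output has 2 + 2 = 4 columns, and along dim 0 it has 2 + 2 = 4 rows, matching the rule that the concatenated dimension is the sum of the input dimensions.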
Compute N-D convolution on (N+2)-D input.
For general 2-D convolution, the shapes are:
- data: [batch_size, channel, height, width]
- weight: [num_filter, channel, kernel[0], kernel[1]]
- bias: [num_filter]
- out: [batch_size, num_filter, out_height, out_width]
If no_bias is set to be true, then the bias term is ignored.
The default data layout is NCHW, namely (batch_size, channel, height, width). We can choose other layouts such as NWC.
If num_group is larger than 1, denoted by g, then split the input data evenly into g parts along the channel axis, and also evenly split weight along the first dimension. Next compute the convolution on the i-th part of the data with the i-th weight part. The output is obtained by concatenating all the g results.
1-D convolution does not have height dimension but only width in space. The shapes are:
- data: [batch_size, channel, width]
- weight: [num_filter, channel, kernel[0]]
- bias: [num_filter]
- out: [batch_size, num_filter, out_width]
3-D convolution adds an additional depth dimension besides height and width. The shapes are:
- data: [batch_size, channel, depth, height, width]
- weight: [num_filter, channel, kernel[0], kernel[1], kernel[2]]
- bias: [num_filter]
- out: [batch_size, num_filter, out_depth, out_height, out_width]
Both weight and bias are learnable parameters.
There are other options to tune the performance:
- cudnn_tune: enabling this option leads to higher startup time but may give faster speed. Options are:
  - "off": no tuning
  - "limited_workspace": run test and pick the fastest algorithm that doesn't exceed workspace limit
  - "fastest": pick the fastest algorithm and ignore workspace limit
  - not set (default): the behavior is determined by the environment variable "MXNET_CUDNN_AUTOTUNE_DEFAULT": 0 for off, 1 for limited workspace (default), 2 for fastest
- workspace: a larger number leads to more (GPU) memory usage but may improve the performance.
- data (required) Input data.
- weight (required) Weight matrix.
- bias (required) Bias parameter.
- kernel (shape, required) Convolution kernel size: [w], [h, w] or [d, h, w].
- stride (shape, optional, default = []) Convolution stride: [w], [h, w] or [d, h, w]. Defaults to 1 for each dimension.
- dilate (shape, optional, default = []) Convolution dilation: [w], [h, w] or [d, h, w]. Defaults to 1 for each dimension.
- pad (shape, optional, default = []) Zero pad for convolution: [w], [h, w] or [d, h, w]. Defaults to no padding.
- num_filter (required) Convolution filter (channel) number.
- num_group (optional, default = 1) Number of group partitions.
- workspace (optional, default = 1024) Maximum temporary workspace allowed (MB) for convolution. This parameter has two usages. When CUDNN is not used, it determines the effective batch size of the convolution kernel. When CUDNN is used, it controls the maximum temporary storage used for tuning the best CUDNN kernel when "limited_workspace" strategy is used.
- no_bias (optional, default = false) Whether to disable bias parameter.
- cudnn_tune (optional) Whether to pick the convolution algorithm by running a performance test.
- cudnn_off (optional, default = false) Turn off cudnn for this layer.
- layout (optional) Set layout for input, output and weight. Empty for default layout: "NCW" for 1D, "NCHW" for 2D and "NCDHW" for 3D. "NHWC" and "NDHWC" are only supported on GPU.
- name (optional) Name of the symbol.
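The shapes listed above name out_height and out_width without saying how they follow from kernel, stride, dilate and pad. The sketch below assumes the standard convolution output-size formula, floor((x + 2*p - d*(k - 1) - 1) / s) + 1, which this section does not state; the function names are hypothetical helpers.

```python
def conv_out_size(size, kernel, stride=1, dilate=1, pad=0):
    """Output length along one spatial axis (standard formula, assumed here)."""
    return (size + 2 * pad - dilate * (kernel - 1) - 1) // stride + 1

def conv2d_out_shape(data_shape, num_filter, kernel,
                     stride=(1, 1), dilate=(1, 1), pad=(0, 0)):
    """Output shape for NCHW data: [batch_size, num_filter, out_height, out_width]."""
    batch_size, _channel, height, width = data_shape
    out_h = conv_out_size(height, kernel[0], stride[0], dilate[0], pad[0])
    out_w = conv_out_size(width, kernel[1], stride[1], dilate[1], pad[1])
    return [batch_size, num_filter, out_h, out_w]

plain = conv2d_out_shape([1, 3, 32, 32], num_filter=8, kernel=(3, 3))
padded = conv2d_out_shape([1, 3, 32, 32], num_filter=8, kernel=(3, 3), pad=(1, 1))
strided = conv2d_out_shape([1, 3, 32, 32], num_filter=8, kernel=(3, 3),
                           stride=(2, 2), pad=(1, 1))
```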
Computes the element-wise cosine of the input array.
The input should be in radians (2𝜋 radians equals 360 degrees).
cos([0, 𝜋/4, 𝜋/2]) = [1, 0.707, 0]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns the hyperbolic cosine of the input array, computed element-wise.
cosh(x) = (exp(x) + exp(-x)) / 2
- data (required) Input data.
- name (optional) Name of the symbol.
Converts each element of the input array from radians to degrees.
degrees([0, 𝜋/2, 𝜋, 3𝜋/2, 2𝜋]) = [0, 90, 180, 270, 360]
- data (required) Input data.
- name (optional) Name of the symbol.
Extracts a diagonal or constructs a diagonal array.
.diag's behavior depends on the input array dimensions:
- 1-D arrays: constructs a 2-D array with the input as its diagonal; all other elements are zero.
- N-D arrays: extracts the diagonals of the sub-arrays with axes specified by axis1 and axis2. The output shape is decided by removing the axes numbered axis1 and axis2 from the input shape and appending to the result a new axis with the size of the diagonals in question.
For example, when the input shape is [2, 3, 4, 5], axis1 and axis2 are 0 and 2 respectively and k is 0, the resulting shape is [3, 5, 2].
Assume x and y are arrays with the following elements:
[[1, 2, 3], [4, 5, 6]] # x
[[[1, 2], [3, 4]], [[5, 6], [7, 8]]] # y
diag(x) = [1, 5]
diag(x, k: 1) = [2, 6]
diag(x, k: -1) = [4]
diag(y) = [[1, 7], [2, 8]]
diag(y, k: 1) = [[3], [4]]
diag(y, axis1: -2, axis2: -1) = [[1, 4], [5, 8]]
- data (required) Input data.
- k (optional, default = 0) The diagonal in question. The default is 0. Use k > 0 for diagonals above the main diagonal, and k < 0 for diagonals below the main diagonal.
- axis1 (optional, default = 0) The first axis of the sub-arrays of interest. Ignored when the input is a 1-D array.
- axis2 (optional, default = 1) The second axis of the sub-arrays of interest. Ignored when the input is a 1-D array.
- name (optional) Name of the symbol.
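The output-shape rule for N-D inputs (drop axis1 and axis2, then append the diagonal length) can be checked with a short plain-Python sketch. `diag_output_shape` is a hypothetical helper that only computes shapes; the diagonal-length expression for non-zero k is the usual one for a rows-by-cols matrix and is an assumption here.

```python
def diag_output_shape(shape, k=0, axis1=0, axis2=1):
    """Shape of the diagonal extracted from an N-D array (N >= 2)."""
    ndim = len(shape)
    if ndim < 2:
        raise ValueError("this sketch covers only the N-D extraction case")
    a1, a2 = axis1 % ndim, axis2 % ndim   # allow negative axes
    rows, cols = shape[a1], shape[a2]
    # Number of elements on the k-th diagonal of a rows x cols matrix.
    length = min(rows, cols - k) if k >= 0 else min(rows + k, cols)
    rest = [d for i, d in enumerate(shape) if i not in (a1, a2)]
    return rest + [max(length, 0)]

from_text = diag_output_shape([2, 3, 4, 5], k=0, axis1=0, axis2=2)
```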
Returns element-wise division of the input arrays.
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs / rhs
Computes the dot product of two arrays.
.dot's behavior depends on the input array dimensions:
- 1-D arrays: inner product of vectors
- 2-D arrays: matrix multiplication
- N-D arrays: a sum product over the last axis of the first input and the first axis of the second input
Assume x and y are arrays with the following elements:
[[1, 2], [3, 4]] # x
[[4, 3], [1, 1]] # y
dot(x, y) # => [[6, 5], [16, 13]]
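The 1-D and 2-D cases can be written out in a few lines of plain Python. This `dot` is an illustrative helper on nested lists, not the MXNet operator, and it does not cover the general N-D case.

```python
def dot(x, y):
    """Inner product for 1-D lists, matrix multiplication for 2-D nested lists."""
    if not isinstance(x[0], list):
        # 1-D: inner product of two vectors.
        return sum(a * b for a, b in zip(x, y))
    # 2-D: pair each row of x with each column of y.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

inner = dot([1, 2, 3], [4, 5, 6])
row_times_column = dot([[1, 2, 3]], [[1], [2], [3]])
```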
Returns the result of element-wise equal to (#==) comparison operation.
For each element in input arrays, return 1 (true) if corresponding elements are the same, otherwise return 0 (false).
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs == rhs
Returns element-wise exponential value of the input.
Assume x is an array with the following elements:
[0.0, 1.0, 2.0]
exp(x) = [1.0, 2.71828175, 7.38905621]
The storage type of .exp output is always dense.
- data (required) Input data.
- name (optional) Name of the symbol.
Inserts a new axis of size 1 into the array shape.
For example, given x with shape [2, 3, 4], .expand_dims(x, axis: 1) will return a new array with shape [2, 1, 3, 4].
- data (required) Input data.
- axis (required) Position where the new axis is to be inserted. If the input array has ndim dimensions, the inserted axis must lie in the range [-ndim, ndim].
- name (optional) Name of the symbol.
Returns exp(x) - 1 computed element-wise on the input.
This function provides greater precision than explicitly calculating exp(x) - 1 for small values of x.
Assume x is an array with the following elements:
[0.0, 1.0, 2.0]
expm1(x) = [0.0, 1.71828182, 6.38905609]
- data (required) Input data.
- name (optional) Name of the symbol.
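The precision claim is easy to observe with double-precision floats. The sketch below uses Python's standard `math.expm1` as a stand-in for the operator; it illustrates the numerical effect only and says nothing about how MXNet implements it.

```python
import math

x = 1e-10
naive = math.exp(x) - 1.0   # exp(x) is rounded to a value near 1 before subtracting
accurate = math.expm1(x)    # avoids the intermediate rounding

# For tiny x, exp(x) - 1 is very close to x, so x serves as the reference.
naive_error = abs(naive - x) / x
accurate_error = abs(accurate - x) / x
```

The naive form loses most of its significant digits because the result of exp(x) has already been rounded to the spacing of doubles around 1.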
Returns element-wise rounded value to the nearest integer towards zero.
Assume x is an array with the following elements:
[-2.1, -1.9, 1.5, 1.9, 2.1]
fix(x) = [-2.0, -1.0, 1.0, 1.0, 2.0]
- data (required) Input data.
- name (optional) Name of the symbol.
Flattens the input array into a 2-D array by collapsing the higher dimensions.
For an input array with shape (d1, d2, ..., dk), .flatten reshapes the input array into an output array of shape (d1, d2 * ... * dk).
Note that the behavior of this function differs from a flatten that collapses every dimension, which behaves similarly to .reshape(shape: [-1]).
Assume x is an array with the following elements:
[[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]]
flatten(x).shape # => [2, 6]
- data (required) Input data.
- name (optional) Name of the symbol.
Reverses the order of elements along given axis while preserving array shape.
Assume x is an array with the following elements:
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
flip(x, axis: 0) # => [[5, 6, 7, 8, 9], [0, 1, 2, 3, 4]]
flip(x, axis: 1) # => [[4, 3, 2, 1, 0], [9, 8, 7, 6, 5]]
- data (required) Input data.
- axis (required) The axis on which to reverse elements.
- name (optional) Name of the symbol.
Returns the element-wise floor of the input.
The floor of x is the largest integer i such that i <= x.
Assume x is an array with the following elements:
[-2.1, -1.9, 1.5, 1.9, 2.1]
floor(x) = [-3.0, -2.0, 1.0, 1.0, 2.0]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns the gamma function (extension of the factorial function to the reals), computed element-wise on the input array.
- data (required) Input data.
- name (optional) Name of the symbol.
Returns the log of the absolute value of the gamma function, computed element-wise on the input array.
- data (required) Input data.
- name (optional) Name of the symbol.
Returns the result of element-wise greater than (#>) comparison operation.
For each element in input arrays, return 1 (true) if lhs element is greater than corresponding rhs element, otherwise return 0 (false).
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs > rhs
Returns the result of element-wise greater than or equal to (#>=) comparison operation.
For each element in input arrays, return 1 (true) if lhs element is greater than or equal to rhs element, otherwise return 0 (false).
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs >= rhs
Creates a symbol that contains a collection of other symbols, grouped together.
a = MXNet::Symbol.var("a")
b = MXNet::Symbol.var("b")
MXNet::Symbol.group([a, b]) # => grouped symbol
Returns the result of element-wise less than (#<) comparison operation.
For each element in input arrays, return 1 (true) if lhs element is less than corresponding rhs element, otherwise return 0 (false).
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs < rhs
Returns the result of element-wise less than or equal to (#<=) comparison operation.
For each element in input arrays, return 1 (true) if lhs element is less than or equal to rhs element, otherwise return 0 (false).
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs <= rhs
Returns element-wise natural logarithmic value of the input.
The natural logarithm is the logarithm in base e, so that log(exp(x)) = x.
The storage type of .log output is always dense.
- data (required) Input data.
- name (optional) Name of the symbol.
Returns element-wise base-10 logarithmic value of the input.
10**log10(x) = x
- data (required) Input data.
- name (optional) Name of the symbol.
Returns .log(1 + x) computed element-wise on the input.
This function is more accurate than explicitly calculating .log(1 + x) for small x.
- data (required) Input data.
- name (optional) Name of the symbol.
Returns element-wise base-2 logarithmic value of the input.
2**log2(x) = x
- data (required) Input data.
- name (optional) Name of the symbol.
Computes the log softmax of the input.
This is equivalent to computing .softmax followed by .log.
Assume x is an array with the following elements:
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
log_softmax(x, axis: 0) # => [[-0.6931, -0.6931, -0.6931], [-0.6931, -0.6931, -0.6931]]
log_softmax(x, axis: 1) # => [[-1.0986, -1.0986, -1.0986], [-1.0986, -1.0986, -1.0986]]
- data (required) Input data.
- axis (optional, default = -1) The axis along which to compute softmax.
- temperature (optional, default = 1.0) Temperature parameter in softmax.
- dtype (optional) Type of the output in case this can't be inferred. Defaults to the same type as the input if not defined.
- name (optional) Name of the symbol.
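The equivalence to softmax followed by log can be sketched for a single vector in plain Python. The helper below is illustrative, not the MXNet operator. It subtracts the maximum before exponentiating, a common stabilisation trick, and it assumes that temperature divides the input before the softmax.

```python
import math

def log_softmax(values, temperature=1.0):
    """Log of the softmax of a 1-D list, computed in a numerically stable way."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    # log(sum(exp(v))) computed relative to the largest entry to avoid overflow.
    log_total = peak + math.log(sum(math.exp(v - peak) for v in scaled))
    return [v - log_total for v in scaled]

three_equal = log_softmax([1.0, 1.0, 1.0])
two_equal = log_softmax([1.0, 1.0])
```

With n equal inputs every output is -log(n), which is where the -1.0986 (n = 3) and -0.6931 (n = 2) values in the example come from.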
Returns the result of element-wise logical and (#&) comparison operation.
For each element in input arrays, return 1 (true) if both lhs element and rhs element are true (not zero), otherwise return 0 (false).
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs & rhs
Performs element-wise logical not of the input array.
logical_not([-2, 0, 1]) = [0, 1, 0]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns the result of element-wise logical or (#|) comparison operation.
For each element in input arrays, return 1 (true) if lhs element or rhs element is true (not zero), otherwise return 0 (false).
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs | rhs
Returns the result of element-wise logical xor (#^) comparison operation.
For each element in input arrays, return 1 (true) if either lhs element or rhs element is true (not zero) but not both, otherwise return 0 (false).
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs ^ rhs
Computes the max of array elements over given axes.
- data (required) Input data.
- axis (optional) The axis or axes along which to perform the reduction. By default it computes over all elements into a scalar array with shape [1]. If axis is Int, a reduction is performed on a particular axis. If axis is Array(Int), a reduction is performed on all the axes specified in the list. If exclude is true, reduction will be performed on the axes that are not in axis instead. Negative values mean indexing from right to left.
- keepdims (optional, default = false) If true, the reduced axes are left in the result as a dimension with size one.
- exclude (optional, default = false) Whether to perform reduction on axes that are not in axis instead.
- name (optional) Name of the symbol.
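The interplay of axis, keepdims and exclude is the same for every reduction in this section (.max, .mean, .min, .prod, .sum and the NaN-aware variants). The following plain-Python sketch computes only the resulting shape. `reduced_shape` is a hypothetical helper, and treating a missing axis as "reduce everything" regardless of exclude is an assumption of the sketch.

```python
def reduced_shape(shape, axis=None, keepdims=False, exclude=False):
    """Shape of a reduction result, following the parameter rules described above."""
    ndim = len(shape)
    if axis is None:
        axes = set(range(ndim))            # assumption: no axis means reduce everything
    else:
        listed = [axis] if isinstance(axis, int) else list(axis)
        axes = {a % ndim for a in listed}  # negative values index from the right
        if exclude:
            axes = set(range(ndim)) - axes
    if keepdims:
        return [1 if i in axes else d for i, d in enumerate(shape)]
    kept = [d for i, d in enumerate(shape) if i not in axes]
    return kept or [1]                     # full reduction yields shape [1]

full = reduced_shape([2, 3, 4])
single_axis = reduced_shape([2, 3, 4], axis=1)
kept_dims = reduced_shape([2, 3, 4], axis=[0, 2], keepdims=True)
excluded = reduced_shape([2, 3, 4], axis=1, exclude=True)
```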
Returns element-wise maximum of the input arrays.
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Computes the mean of array elements over given axes.
- data (required) Input data.
- axis (optional) The axis or axes along which to perform the reduction. By default it computes over all elements into a scalar array with shape [1]. If axis is Int, a reduction is performed on a particular axis. If axis is Array(Int), a reduction is performed on all the axes specified in the list. If exclude is true, reduction will be performed on the axes that are not in axis instead. Negative values mean indexing from right to left.
- keepdims (optional, default = false) If true, the reduced axes are left in the result as a dimension with size one.
- exclude (optional, default = false) Whether to perform reduction on axes that are not in axis instead.
- name (optional) Name of the symbol.
Computes the min of array elements over given axes.
- data (required) Input data.
- axis (optional) The axis or axes along which to perform the reduction. By default it computes over all elements into a scalar array with shape [1]. If axis is Int, a reduction is performed on a particular axis. If axis is Array(Int), a reduction is performed on all the axes specified in the list. If exclude is true, reduction will be performed on the axes that are not in axis instead. Negative values mean indexing from right to left.
- keepdims (optional, default = false) If true, the reduced axes are left in the result as a dimension with size one.
- exclude (optional, default = false) Whether to perform reduction on axes that are not in axis instead.
- name (optional) Name of the symbol.
Returns element-wise minimum of the input arrays.
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Returns element-wise modulo of the input arrays.
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs % rhs
Returns element-wise product of the input arrays.
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs * rhs
Computes the product of array elements over given axes treating not-a-number values (NaN) as one.
See .prod
- data (required) Input data.
- axis (optional) The axis or axes along which to perform the reduction. axis: [] or axis: nil will compute over all elements into a scalar array with shape [1]. If axis is an Int, a reduction is performed on a particular axis. If axis is an array of Int, a reduction is performed on all the axes specified in the array. If exclude is true, reduction will be performed on the axes that are not in axis instead. Negative values mean indexing from right to left.
- keepdims (optional, default = false) If this is set to true, the reduced axes are left in the result as a dimension with size one.
- exclude (optional, default = false) Whether to perform reduction on axes that are not in axis instead.
- name (optional) Name of the symbol.
Computes the sum of array elements over given axes treating not-a-number values (NaN) as zero.
See .sum
- data (required) Input data.
- axis (optional) The axis or axes along which to perform the reduction. axis: [] or axis: nil will compute over all elements into a scalar array with shape [1]. If axis is an Int, a reduction is performed on a particular axis. If axis is an array of Int, a reduction is performed on all the axes specified in the array. If exclude is true, reduction will be performed on the axes that are not in axis instead. Negative values mean indexing from right to left.
- keepdims (optional, default = false) If this is set to true, the reduced axes are left in the result as a dimension with size one.
- exclude (optional, default = false) Whether to perform reduction on axes that are not in axis instead.
- name (optional) Name of the symbol.
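Treating NaN as the identity element (one for a product, zero for a sum) means NaN entries simply drop out of the result. A plain-Python sketch over a flat list follows; the helper names mirror the operators but are only illustrations.

```python
import math

def nansum(values):
    """Sum of a flat list, treating NaN entries as zero."""
    return sum(0.0 if math.isnan(v) else v for v in values)

def nanprod(values):
    """Product of a flat list, treating NaN entries as one."""
    return math.prod(1.0 if math.isnan(v) else v for v in values)

data = [2.0, float("nan"), 3.0]
total = nansum(data)
product = nanprod(data)
```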
Computes the norm.
This operator computes the norm on an array with the specified axis, depending on the value of the ord parameter. By default, it computes the L2 norm on the entire array. Currently only ord: 2 supports sparse arrays.
Assume x is an array with the following elements:
[[[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [5.0, 6.0]]]
norm(x, ord: 2, axis: 1) # => [[3.1622, 4.4721], [5.3851, 6.3245]]
norm(x, ord: 1, axis: 1) # => [[4.0, 6.0], [7.0, 8.0]]
- data (required) Input data.
- ord (optional, default = 2) Order of the norm. Currently ord: 1 and ord: 2 are supported.
- axis (optional) The axis or axes along which to perform the reduction. By default it computes over all elements into a scalar array with shape [1]. If axis is Int, a reduction is performed on a particular axis. If axis is Array(Int), it specifies the axes that hold 2-D matrices, and the matrix norms of these matrices are computed.
- out_dtype (optional) The data type of the output.
- keepdims (optional, default = false) If true, the reduced axes are left in the result as a dimension with size one.
- name (optional) Name of the symbol.
Returns the result of element-wise not equal to (#!=) comparison operation.
For each element in input arrays, return 1 (true) if corresponding elements are different, otherwise return 0 (false).
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs != rhs
Returns a one-hot array.
The locations represented by indices take value on_value, while all other locations take value off_value.
.one_hot with indices of shape [i0, i1] and depth of d would result in an output array of shape [i0, i1, d] with:
output[i, j, 0..-1] = off_value
output[i, j, indices[i, j]] = on_value
Assume x is an array with the following elements:
[1, 0, 2, 0]
one_hot(x, 3) # => [[0, 1, 0], [1, 0, 0], [0, 0, 1], [1, 0, 0]]
- indices (required) Array of locations where to set on_value.
- depth (required) Depth of the one hot dimension.
- on_value (optional, default = 1.0) The value assigned to the locations represented by indices.
- off_value (optional, default = 0.0) The value assigned to the locations not represented by indices.
- dtype (optional, default = :float32) Type of the output.
- name (optional) Name of the symbol.
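The two assignment rules above (fill with off_value, then set the indexed position to on_value) map onto a short plain-Python sketch for a 1-D index list. The helper is illustrative only; leaving a row entirely at off_value when an index falls outside [0, depth) is an assumption of the sketch.

```python
def one_hot(indices, depth, on_value=1.0, off_value=0.0):
    """One-hot encode a flat list of integer indices."""
    output = []
    for index in indices:
        row = [off_value] * depth          # output[i, 0..-1] = off_value
        if 0 <= index < depth:             # assumption: out-of-range indices set nothing
            row[index] = on_value          # output[i, indices[i]] = on_value
        output.append(row)
    return output

encoded = one_hot([1, 0, 2, 0], 3)
```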
Returns an array filled with all ones, with the given shape.
- data (required) Input data.
- shape The shape of the array.
- dtype (default = :float32) The data type of the output array.
- ctx (optional) Device context (default is the current context). Only used for imperative calls.
- name (optional) Name of the symbol.
Returns an array of ones with the same shape, data type and storage type as the input array.
Assume x is an array with the following elements:
[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
ones_like(x) # => [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
- data (required) Input data.
- name (optional) Name of the symbol.
Picks elements from an input array according to the indices along the given axis.
Given an input array of shape [d0, d1] and indices of shape [i0], the result will be an output array of shape [i0] with:
output[i] = input[i, indices[i]]
By default, if any index mentioned is too large, it is replaced by the index that addresses the last element along an axis (clip mode).
This function supports n-dimensional input and (n-1)-dimensional indices arrays.
Assume x, i, j, and k are arrays with the following elements:
[[1, 2], [3, 4], [5, 6]] # x
[0, 1] # i
[0, 1, 0] # j
[1, 0, 2] # k
# pick elements with specified indices along axis 0
pick(x, index: i, axis: 0) # => [1, 4]
# pick elements with specified indices along axis 1
pick(x, index: j, axis: 1) # => [1, 4, 5]
# pick elements with specified indices along axis 1 --
# dims are maintained
pick(x, index: k, axis: 1, keepdims: true) # => [[2], [3], [6]]
- data (required) The input array.
- index (required) The index array.
- axis (optional, default = -1) The axis to pick the elements. Negative values mean indexing from right to left. If nil, elements in the index with respect to the flattened input will be picked.
- keepdims (optional, default = false) If true, the axis where we pick the elements is left in the result as a dimension with size one.
- name (optional) Name of the symbol.
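The indexing rule output[i] = input[i, indices[i]], together with the default clip mode, can be sketched in plain Python for a 2-D input. This `pick` is an illustrative helper rather than the MXNet operator, and it does not cover the nil-axis (flattened) case.

```python
def pick(data, index, axis=-1, keepdims=False):
    """Pick one element per row (axis 1) or per column (axis 0) of a 2-D nested list."""
    rows, cols = len(data), len(data[0])

    def clipped(i, size):
        # Clip mode: out-of-range indices address the first or last element.
        return min(max(int(i), 0), size - 1)

    if axis % 2 == 1:
        picked = [data[r][clipped(index[r], cols)] for r in range(rows)]
        return [[v] for v in picked] if keepdims else picked
    picked = [data[clipped(index[c], rows)][c] for c in range(cols)]
    return [picked] if keepdims else picked

x = [[1, 2], [3, 4], [5, 6]]
along_axis0 = pick(x, [0, 1], axis=0)
along_axis1 = pick(x, [0, 1, 0], axis=1)
clipped_keepdims = pick(x, [1, 0, 2], axis=1, keepdims=True)
```

In the last call the index 2 is out of range for a row of length 2, so clip mode replaces it with 1 and the element 6 is picked.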
Performs pooling on the input.
The shapes for 1-D pooling are:
- data and out: [batch_size, channel, width] ("NCW" layout) or [batch_size, width, channel] ("NWC" layout)
The shapes for 2-D pooling are:
- data and out: [batch_size, channel, height, width] ("NCHW" layout) or [batch_size, height, width, channel] ("NHWC" layout)
Four pooling options are supported by pool_type:
- avg: average pooling
- max: max pooling
- sum: sum pooling
- lp: Lp pooling
For 3-D pooling, an additional depth dimension is added before height. Namely the input data and output will have shape: [batch_size, channel, depth, height, width] ("NCDHW" layout) or [batch_size, depth, height, width, channel] ("NDHWC" layout).
Notes on Lp pooling:
Lp pooling was first introduced by this paper:
L-1 pooling is simply sum pooling, while L-inf pooling is simply max pooling. Lp pooling stands between those two; in practice, the most common value for p is 2.
- data (required) Input data.
- kernel (shape, optional, default = []) Pooling kernel size: [y, x] or [d, y, x].
- pool_type (optional, default = :max) Pooling type to be applied.
- global_pool (optional, default = false) Ignore kernel size; do global pooling based on current input feature map.
- cudnn_off (optional, default = false) Turn off cudnn pooling and use MXNet pooling operator.
- pooling_convention (optional, default = :valid) Pooling convention to be applied.
- stride (shape, optional, default = []) Stride for pooling: [y, x] or [d, y, x]. Defaults to 1 for each dimension.
- pad (shape, optional, default = []) Pad for pooling: [y, x] or [d, y, x]. Defaults to no padding.
- p_value (optional) Value of p for Lp pooling, can be 1 or 2, required for Lp pooling.
- count_include_pad (optional) Only used for average pooling. Specify whether to count padding elements for average calculation. For example, with a 5x5 kernel on a 3x3 corner of an image, the sum of the 9 valid elements will be divided by 25 if this is set to true, or it will be divided by 9 if this is set to false. Defaults to true.
- layout (optional) Set layout for input, output and weight. Empty for default layout: "NCW" for 1D, "NCHW" for 2D and "NCDHW" for 3D. "NHWC" and "NDHWC" are only supported on GPU.
- name (optional) Name of the symbol.
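The effect of count_include_pad is just a choice of divisor. The sketch below reproduces the corner case from the parameter description, a 5x5 kernel covering a 3x3 corner with 9 valid elements out of 25, in plain Python; `average_window` is a hypothetical helper and the padding elements are taken to be zero.

```python
def average_window(valid_values, kernel_area, count_include_pad=True):
    """Average of one pooling window whose padded positions contribute zero."""
    divisor = kernel_area if count_include_pad else len(valid_values)
    return sum(valid_values) / divisor

corner = [1.0] * 9                       # the 9 valid elements of the 3x3 corner
with_padding = average_window(corner, kernel_area=25)
without_padding = average_window(corner, kernel_area=25, count_include_pad=False)
```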
Returns result of first array elements raised to powers from second array, element-wise.
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to base ** exp
Computes the product of array elements over given axes.
Assume x is an array with the following elements:
[[[1, 2], [2, 3], [1, 3]],
[[1, 4], [4, 3], [5, 2]],
[[7, 1], [7, 2], [7, 3]]]
prod(x, axis: 1) # => [[2, 18], [20, 24], [343, 6]]
prod(x, axis: [1, 2]) # => [36, 480, 2058]
- data (required) Input data.
- axis (optional) The axis or axes along which to perform the reduction. axis: [] or axis: nil will compute over all elements into a scalar array with shape [1]. If axis is an Int, a reduction is performed on a particular axis. If axis is an array of Int, a reduction is performed on all the axes specified in the array. If exclude is true, reduction will be performed on the axes that are not in axis instead. Negative values mean indexing from right to left.
- keepdims (optional, default = false) If this is set to true, the reduced axes are left in the result as a dimension with size one.
- exclude (optional, default = false) Whether to perform reduction on axes that are not in axis instead.
- name (optional) Name of the symbol.
Converts each element of the input array from degrees to radians.
radians([0, 90, 180, 270, 360]) = [0, 𝜋/2, 𝜋, 3𝜋/2, 2𝜋]
- data (required) Input data.
- name (optional) Name of the symbol.
Draws random samples from an exponential distribution.
Samples are distributed according to an exponential distribution parametrized by lam (rate).
random_exponential(4.0, shape: [2, 2]) # => [[0.0097189, 0.08999364], [0.04146638, 0.31715935]]
- lam (default = 1.0) Lambda parameter (rate) of the exponential distribution.
- shape The shape of the output.
- dtype (default = :float32) The data type of the output in case this can’t be inferred.
- ctx (optional) Device context (default is the current context). Only used for imperative calls.
- name (optional) Name of the symbol.
Draws random samples from a gamma distribution.
Samples are distributed according to a gamma distribution parametrized by alpha (shape) and beta (scale).
random_gamma(9.0, 0.5, shape: [2, 2]) # => [[6.2806954, 6.1658335], [4.5625057, 6.479337]]
- alpha (default = 1.0) Alpha parameter (shape) of the gamma distribution.
- beta (default = 1.0) Beta parameter (scale) of the gamma distribution.
- shape The shape of the output.
- dtype (default = :float32) The data type of the output in case this can’t be inferred.
- ctx (optional) Device context (default is the current context). Only used for imperative calls.
- name (optional) Name of the symbol.
Draws random samples from a normal (Gaussian) distribution.
Samples are distributed according to a normal distribution parametrized by loc (mean) and scale (standard deviation).
random_normal(0.0, 1.0, shape: [2, 2]) # => [[1.89171135, -1.16881478], [-1.23474145, 1.55807114]]
- loc (default = 0.0) Mean of the distribution.
- scale (default = 1.0) Standard deviation of the distribution.
- shape The shape of the output.
- dtype (default = :float32) The data type of the output in case this can’t be inferred.
- ctx (optional) Device context (default is the current context). Only used for imperative calls.
- name (optional) Name of the symbol.
Draws random samples from a Poisson distribution.
Samples are distributed according to a Poisson distribution parametrized by lam (rate). Samples will always be returned as a floating point data type.
random_poisson(4.0, shape: [2, 2]) # => [[5.0, 2.0], [4.0, 6.0]]
- lam (default = 1.0) Lambda parameter (rate) of the Poisson distribution.
- shape The shape of the output.
- dtype (default = :float32) The data type of the output in case this can’t be inferred.
- ctx (optional) Device context (default is the current context). Only used for imperative calls.
- name (optional) Name of the symbol.
Draws random samples from a discrete uniform distribution.
Samples are uniformly distributed over the half-open interval [low, high) (includes low, but excludes high).
random_randint(0, 5, shape: [2, 2]) # => [[0, 2], [3, 1]]
- low (required) Lower boundary of the output interval.
- high (required) Upper boundary of the output interval.
- shape: The shape of the output.
- dtype (default = :int32) The data type of the output.
- ctx (optional) Device context (default is the current context). Only used for imperative calls.
- name (optional) Name of the symbol.
Draws random samples from a uniform distribution.
Samples are uniformly distributed over the half-open interval [low, high) (includes low, but excludes high).
random_uniform(0.0, 1.0, shape: [2, 2]) # => [[0.60276335, 0.85794562], [0.54488319, 0.84725171]]
- low (default = 0.0) Lower bound of the distribution.
- high (default = 1.0) Upper bound of the distribution.
- shape: The shape of the output.
- dtype (default = :float32) The data type of the output in case this can’t be inferred.
- ctx (optional) Device context (default is the current context). Only used for imperative calls.
- name (optional) Name of the symbol.
Returns element-wise inverse cube-root value of the input.
rcbrt(x) = 1/cbrt(x)
Assume x is an array with the following elements:
[1, 8, -125]
rcbrt(x) = [1.0, 0.5, -0.2]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns the reciprocal of the argument, element-wise.
reciprocal(x) = 1/x
Assume x is an array with the following elements:
[-2, 1, 3, 1.6, 0.2]
reciprocal(x) = [-0.5, 1.0, 0.33333334, 0.625, 5.0]
- data (required) Input data.
- name (optional) Name of the symbol.
Computes the rectified linear activation.
- data (required) Input data.
- name (optional) Name of the symbol.
Reshapes the input array.
Returns a copy of the array with a new shape without altering any data.
Assume x is an array with the following elements:
[1, 2, 3, 4]
reshape(x, shape: [2, 2]) # => [[1, 2], [3, 4]]
Some dimensions of the shape can take special values from the set {0, -1, -2, -3, -4}. The significance of each is explained below:
- 0 copies this dimension from the input to the output shape:
zeros([2, 3, 4]).reshape([4, 0, 2]).shape # => [4, 3, 2]
zeros([2, 3, 4]).reshape([2, 0, 0]).shape # => [2, 3, 4]
- -1 infers the dimension of the output shape by using the remainder of the input dimensions, keeping the size of the new array the same as that of the input array. At most one dimension can be -1:
zeros([2, 3, 4]).reshape([6, 1, -1]).shape # => [6, 1, 4]
zeros([2, 3, 4]).reshape([3, -1, 8]).shape # => [3, 1, 8]
zeros([2, 3, 4]).reshape([-1]).shape # => [24]
- -2 copies all/the remainder of the input dimensions to the output shape:
zeros([2, 3, 4]).reshape([-2]).shape # => [2, 3, 4]
zeros([2, 3, 4]).reshape([2, -2]).shape # => [2, 3, 4]
zeros([2, 3, 4]).reshape([-2, 1, 1]).shape # => [2, 3, 4, 1, 1]
- -3 uses the product of two consecutive dimensions of the input shape as the output dimension:
zeros([2, 3, 4]).reshape([-3, 4]).shape # => [6, 4]
zeros([2, 3, 4, 5]).reshape([-3, -3]).shape # => [6, 20]
zeros([2, 3, 4]).reshape([0, -3]).shape # => [2, 12]
zeros([2, 3, 4]).reshape([-3, -2]).shape # => [6, 4]
- -4 splits one dimension of the input into the two dimensions passed subsequent to -4 (which can contain -1):
zeros([2, 3, 4]).reshape([-4, 1, 2, -2]).shape # => [1, 2, 3, 4]
zeros([2, 3, 4]).reshape([2, -4, -1, 3, -2]).shape # => [2, 1, 3, 4]
- data (required) Input data.
- shape: The target shape.
- reverse (optional, default false) If true then the special values are inferred from right to left.
- name (optional) Name of the symbol.
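The special values described above can be traced with a small shape-inference model. The following is a plain-Python sketch, independent of MXNet: the function name infer_reshape is hypothetical, it only computes the output shape, and it does not model the reverse option.

```python
from math import prod


def infer_reshape(in_shape, target):
    """Resolve the special values 0, -1, -2, -3 and -4 in a target shape.

    Hypothetical helper for illustration; not part of the MXNet API.
    """
    out = []
    i = 0            # position in the input shape
    j = 0            # position in the target shape
    infer_at = None  # position in `out` of the single -1 placeholder
    while j < len(target):
        d = target[j]
        if d == 0:        # copy this dimension from the input
            out.append(in_shape[i])
            i += 1
        elif d == -1:     # infer later from the remaining number of elements
            infer_at = len(out)
            out.append(1)
            i += 1
        elif d == -2:     # copy all remaining input dimensions
            out.extend(in_shape[i:])
            i = len(in_shape)
        elif d == -3:     # merge two consecutive input dimensions
            out.append(in_shape[i] * in_shape[i + 1])
            i += 2
        elif d == -4:     # split one input dimension into the next two values
            d1, d2 = target[j + 1], target[j + 2]
            if d1 == -1:
                d1 = in_shape[i] // d2
            if d2 == -1:
                d2 = in_shape[i] // d1
            out.extend([d1, d2])
            i += 1
            j += 2
        else:             # ordinary positive dimension
            out.append(d)
            i += 1
        j += 1
    if infer_at is not None:
        out[infer_at] = prod(in_shape) // prod(out)
    return out


print(infer_reshape([2, 3, 4], [2, -4, -1, 3, -2]))
```

Running the function on the shapes listed above reproduces the documented results, which makes it a convenient way to check a target shape before building a graph.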
Reshape some or all dimensions of lhs to have the same shape as some or all dimensions of rhs.
Returns a view of the lhs array with a new shape without altering any data.
Assume x and y are arrays with the following elements:
[1, 2, 3, 4, 5, 6] # x
[[0, -4], [3, 2], [2, 2]] # y
reshape_like(x, y) # => [[1, 2], [3, 4], [5, 6]]
Returns element-wise rounded value to the nearest integer.
- For input N.5 rint returns N while round returns N+1.
- For input -N.5 both rint and round return -N-1.
Assume x is an array with the following elements:
[-2.1, -1.9, 1.5, 1.9, 2.1]
rint(x) = [-2.0, -2.0, 1.0, 2.0, 2.0]
- data (required) Input data.
- name (optional) Name of the symbol.
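The tie-breaking rules listed for rint and round can be written out as scalar functions. This is a plain-Python sketch of the rules as stated in this section, not the library's implementation; the names rint_sketch and round_sketch are hypothetical.

```python
import math


def rint_sketch(x):
    # Ties (N.5) go to the lower integer: 1.5 -> 1.0, -1.5 -> -2.0.
    return float(math.ceil(x - 0.5))


def round_sketch(x):
    # Ties go away from zero: 1.5 -> 2.0, -1.5 -> -2.0.
    return math.copysign(math.floor(abs(x) + 0.5), x)


xs = [-2.1, -1.9, 1.5, 1.9, 2.1]
print([rint_sketch(x) for x in xs])
print([round_sketch(x) for x in xs])
```

The two functions differ only on positive half-integers, which is the single case where the documented outputs of rint and round disagree.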
Returns element-wise rounded value to the nearest integer.
Assume x is an array with the following elements:
[-2.1, -1.9, 1.5, 1.9, 2.1]
round(x) = [-2.0, -2.0, 2.0, 2.0, 2.0]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns element-wise inverse square-root value of the input.
rsqrt(x) = 1/sqrt(x)
Assume x is an array with the following elements:
[4, 9, 16]
rsqrt(x) = [0.5, 0.33333, 0.25]
- data (required) Input data.
- name (optional) Name of the symbol.
Draws concurrent samples from exponential distributions.
Samples are drawn from multiple exponential distributions with parameters lam (rate).
The parameters of the distributions are provided as an input array. Let [s] be the shape of the input array, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input array, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input value at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input array.
Assume lam is an array with the following elements:
[1.0, 8.5]
sample_exponential(lam) # => [0.51837951, 0.09994757]
sample_exponential(lam, shape: [2]) # => [[0.51837951, 0.19866663], [0.09994757, 0.50447971]]
- lam: Lambda parameters (rates) of the exponential distributions.
- shape: Shape to be sampled from each random distribution.
- dtype (default = :float32) The data type of the output in case this can’t be inferred.
- name (optional) Name of the symbol.
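The shape rule shared by the sample_* operators (the output shape is the input shape followed by the requested per-distribution sample shape) can be illustrated with nested lists. This is a plain-Python sketch using the standard random module rather than MXNet; the names sample_exponential_sketch and shape_of are hypothetical.

```python
import random


def sample_exponential_sketch(lam, shape=()):
    """lam: nested list of rates with shape [s]; shape: sample shape [t]."""

    def draw(rate, dims):
        # Build an array of shape `dims` filled with draws for one rate.
        if not dims:
            return random.expovariate(rate)
        return [draw(rate, dims[1:]) for _ in range(dims[0])]

    def walk(node):
        # Visit every rate in the input, preserving its nesting.
        if isinstance(node, list):
            return [walk(child) for child in node]
        return draw(node, list(shape))

    return walk(lam)


def shape_of(array):
    dims = []
    while isinstance(array, list):
        dims.append(len(array))
        array = array[0]
    return dims


print(shape_of(sample_exponential_sketch([1.0, 8.5])))
print(shape_of(sample_exponential_sketch([1.0, 8.5], shape=[2])))
```

With no shape argument the result keeps the shape of the input, and each extra entry in shape appends one trailing dimension.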
Draws random samples from gamma distributions.
Samples are drawn from multiple gamma distributions with parameters alpha (shape) and beta (scale).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Assume alpha and beta are arrays with the following elements:
[0.0, 2.5] # alpha
[1.0, 0.7] # beta
sample_gamma(alpha, beta) # => [0.0, 2.25797319]
sample_gamma(alpha, beta, shape: [2]) # => [[0.0, 0.0], [2.25797319, 1.70734084]]
- alpha: Alpha parameters (shapes) of the distributions.
- beta: Beta parameters (scales) of the distributions.
- shape: Shape to be sampled from each random distribution.
- dtype (default = :float32) The data type of the output in case this can’t be inferred.
- name (optional) Name of the symbol.
Draws random samples from multinomial distributions.
Samples are drawn from multiple multinomial distributions. Note that the input distribution must be normalized (data must sum to 1 along its last axis).
The input data is an n-dimensional array whose last dimension has length k, where k is the number of possible outcomes of each multinomial distribution. This operator will draw shape samples from each distribution. If shape is empty, one sample will be drawn from each distribution.
If get_prob is true, a second array containing the log likelihood of the drawn samples will also be returned. This is usually used for reinforcement learning, where you can provide the reward as the head gradient for this array to estimate the gradient.
probs = [[0.0, 0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1, 0.0]]
sample_multinomial(probs) # => [3, 0]
sample_multinomial(probs, shape: [2]) # => [[4, 2], [0, 0]]
sample_multinomial(probs, get_prob: true) # => [2, 1], [0.2, 0.3]
- data: Distribution probabilities. Must sum to one on the last axis.
- get_prob (default = false) Whether to also return the log probabilities of sampled results. This is usually used for differentiating through stochastic variables, e.g. in reinforcement learning.
- shape: Shape to be sampled from each random distribution.
- dtype (default = :float32) The data type of the output in case this can’t be inferred.
- name (optional) Name of the symbol.
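To make the get_prob behaviour concrete, the sketch below draws one outcome per row of probabilities and, on request, also returns the log likelihood of each drawn outcome. It is a plain-Python model built on the standard random module, not the MXNet operator; the name sample_multinomial_sketch is hypothetical and the shape argument is not modelled.

```python
import math
import random


def sample_multinomial_sketch(probs, get_prob=False):
    """probs: list of rows, each row summing to 1."""
    samples = []
    log_probs = []
    for row in probs:
        # Pick one outcome index according to the row's probabilities.
        k = random.choices(range(len(row)), weights=row)[0]
        samples.append(k)
        log_probs.append(math.log(row[k]))
    if get_prob:
        return samples, log_probs
    return samples


probs = [[0.0, 0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1, 0.0]]
samples, log_probs = sample_multinomial_sketch(probs, get_prob=True)
print(samples, log_probs)
```

The second return value is what a policy-gradient style update would weight by the reward, as the description above suggests.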
Draws concurrent samples from normal (Gaussian) distributions.
Samples are drawn from multiple normal distributions with parameters mu (mean) and sigma (standard deviation).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Assume mu and sigma are arrays with the following elements:
[0.0, 2.5] # mu
[1.0, 3.7] # sigma
sample_normal(mu, sigma) # => [-0.56410581, 0.95934606]
sample_normal(mu, sigma, shape: [2]) # => [[-0.56410581, 0.2928229 ], [0.95934606, 4.48287058]]
- mu: Means of the distributions.
- sigma: Standard deviations of the distributions.
- shape: Shape to be sampled from each random distribution.
- dtype (default = :float32) The data type of the output in case this can’t be inferred.
- name (optional) Name of the symbol.
Draws concurrent samples from Poisson distributions.
Samples are drawn from multiple Poisson distributions with parameters lam (rate). Samples will always be returned as a floating point data type.
The parameters of the distributions are provided as an input array. Let [s] be the shape of the input array, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input array, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input value at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input array.
Assume lam is an array with the following elements:
[1.0, 8.5]
sample_poisson(lam) # => [0.0, 13.0]
sample_poisson(lam, shape: [2]) # => [[0.0, 4.0], [13.0, 8.0]]
- lam: Lambda parameters (rates) of the Poisson distributions.
- shape: Shape to be sampled from each random distribution.
- dtype (default = :float32) The data type of the output in case this can’t be inferred.
- name (optional) Name of the symbol.
Draws concurrent samples from uniform distributions.
Samples are drawn from multiple uniform distributions on the intervals given by [low, high).
The parameters of the distributions are provided as input arrays. Let [s] be the shape of the input arrays, n be the dimension of [s], [t] be the shape specified as the parameter of the operator, and m be the dimension of [t]. Then the output will be a (n+m)-dimensional array with shape [s]x[t].
For any valid n-dimensional index i with respect to the input arrays, output[i] will be an m-dimensional array that holds randomly drawn samples from the distribution which is parameterized by the input values at index i. If the shape parameter of the operator is not set, then one sample will be drawn per distribution and the output array has the same shape as the input arrays.
Assume low and high are arrays with the following elements:
[0.0, 2.5] # low
[1.0, 3.7] # high
sample_uniform(low, high) # => [0.40451524, 3.18687344]
sample_uniform(low, high, shape: [2]) # => [[0.40451524, 0.18017688], [3.18687344, 3.68352246]]
- low: Lower bounds of the distributions.
- high: Upper bounds of the distributions.
- shape: Shape to be sampled from each random distribution.
- dtype (default = :float32) The data type of the output in case this can’t be inferred.
- name (optional) Name of the symbol.
Saves symbol to a JSON file.
- fname: The name of the file.
- symbol: Symbol to save.
Momentum update function for Stochastic Gradient Descent (SGD) optimizer.
Momentum update has better convergence rates on neural networks.
- weight (required) Weights.
- grad (required) Gradients.
- mom (required) Momentum.
- lr (required) Learning rate.
- momentum (optional, default = 0) The decay rate of momentum estimates at each epoch.
- wd (optional, default = 0) Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight.
- rescale_grad (optional, default = 1.0) Rescale gradient to grad = rescale_grad * grad.
- clip_gradient (optional, default = -1.0) Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off.
- lazy_update (optional, default = true) If true, lazy updates are applied if gradient's stype is row_sparse.
- name (optional) Name of the symbol.
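This section does not spell out the momentum update rule, so the following plain-Python sketch assumes the common formulation v = momentum * v - lr * (grad + wd * weight) followed by weight = weight + v, with the gradient rescaled and clipped first as the parameters above describe. The function name sgd_mom_update_sketch is hypothetical, and sparse (lazy) updates are not modelled.

```python
def sgd_mom_update_sketch(weight, grad, mom, lr, momentum=0.0, wd=0.0,
                          rescale_grad=1.0, clip_gradient=-1.0):
    """Return (new_weight, new_mom) for flat lists of floats."""
    new_weight = []
    new_mom = []
    for w, g, v in zip(weight, grad, mom):
        g = rescale_grad * g
        if clip_gradient > 0:
            # Clip to [-clip_gradient, clip_gradient]; disabled when <= 0.
            g = max(-clip_gradient, min(clip_gradient, g))
        # Assumed rule: decay the old velocity, then step along the gradient.
        v = momentum * v - lr * (g + wd * w)
        new_mom.append(v)
        new_weight.append(w + v)
    return new_weight, new_mom


w, m = sgd_mom_update_sketch([1.0], [0.5], [0.0], lr=0.1, momentum=0.9)
w, m = sgd_mom_update_sketch(w, [0.5], m, lr=0.1, momentum=0.9)
print(w, m)
```

Calling the function repeatedly with the returned momentum shows how the velocity accumulates across steps when the gradient keeps the same sign.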
Update function for Stochastic Gradient Descent (SGD) optimizer.
SGD updates the weights using:
weight = weight - learning_rate * (gradient + wd * weight)
- weight (required) Weights.
- grad (required) Gradients.
- lr (required) Learning rate.
- wd (optional, default = 0) Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight.
- rescale_grad (optional, default = 1.0) Rescale gradient to grad = rescale_grad * grad.
- clip_gradient (optional, default = -1.0) Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off.
- lazy_update (optional, default = true) If true, lazy updates are applied if gradient's stype is row_sparse.
- name (optional) Name of the symbol.
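The update rule weight = weight - learning_rate * (gradient + wd * weight) can be traced on plain lists. This Python sketch assumes that the gradient is first multiplied by rescale_grad and then clipped, following the parameter descriptions above; the function name sgd_update_sketch is hypothetical and sparse (lazy) updates are not modelled.

```python
def sgd_update_sketch(weight, grad, lr, wd=0.0,
                      rescale_grad=1.0, clip_gradient=-1.0):
    """Return the updated weights for flat lists of floats."""
    updated = []
    for w, g in zip(weight, grad):
        g = rescale_grad * g
        if clip_gradient > 0:
            # Clip to [-clip_gradient, clip_gradient]; disabled when <= 0.
            g = max(-clip_gradient, min(clip_gradient, g))
        updated.append(w - lr * (g + wd * w))
    return updated


print(sgd_update_sketch([1.0, -2.0], [0.5, 0.5], lr=0.1, wd=0.1))
```

The wd term pulls each weight toward zero in proportion to its magnitude, which is the regularization effect described for the wd parameter.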
Returns a 1-D array containing the shape of the data.
Assume x is an array with the following elements:
[[1, 2, 3, 4], [5, 6, 7, 8]]
shape_array(x) = [2, 4]
- data (required) Input data.
- name (optional) Name of the symbol.
Randomly shuffles the elements.
Shuffles the array along the first axis. The order of the elements in each subarray does not change. For example, if a 2-D array is given, the order of the rows randomly changes, but the order of the elements in each row does not change.
- data (required) Input data.
- name (optional) Name of the symbol.
Computes the sigmoid activation.
- data (required) Input data.
- name (optional) Name of the symbol.
Returns the element-wise sign of the input.
Assume x is an array with the following elements:
[-2, 0, 3]
sign(x) # => [-1, 0, 1]
- data (required) Input data.
- name (optional) Name of the symbol.
Computes the element-wise sine of the input array.
The input should be in radians (2𝜋 radians equals 360 degrees).
sin([0, 𝜋/4, 𝜋/2]) = [0, 0.707, 1]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns the hyperbolic sine of the input array, computed element-wise.
sinh(x) = (exp(x) - exp(-x)) / 2
- data (required) Input data.
- name (optional) Name of the symbol.
Returns a 1-D array containing the size of the data.
Assume x is an array with the following elements:
[[1, 2, 3, 4], [5, 6, 7, 8]]
size_array(x) = [8]
- data (required) Input data.
- name (optional) Name of the symbol.
Slices a region of the array.
This function returns a sliced array between the indices given by begin and end with the corresponding step.
For an input array of shape=[d_0, d_1, ..., d_n-1], a slice operation with begin=[b_0, b_1, ..., b_m-1], end=[e_0, e_1, ..., e_m-1], and step=[s_0, s_1, ..., s_m-1], where m <= n, results in an array with the shape (|e_0-b_0|/|s_0|, ..., |e_m-1-b_m-1|/|s_m-1|, d_m, ..., d_n-1).
The resulting array's k-th dimension contains elements from the k-th dimension of the input array starting from index b_k (inclusive) with step s_k until reaching e_k (exclusive).
If the k-th elements are nil in the sequence of begin, end, and step, the following rule will be used to set default values: if s_k is nil, set s_k = 1. If s_k > 0, set b_k = 0, e_k = d_k; else, set b_k = d_k-1, e_k = -1.
- data (required) Input data.
- begin (required) Beginning indices for the slice operation, supports negative indices.
- end (required) Ending indices for the slice operation, supports negative indices.
- step (optional) Step for the slice operation, supports negative values.
- name (optional) Name of the symbol.
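The default-value rule for nil entries can be modelled for a single axis. The sketch below is plain Python with None standing in for nil; the name slice_indices_sketch is hypothetical, and explicit negative begin or end values are deliberately not normalized here, so only the defaults and non-negative indices are covered.

```python
def slice_indices_sketch(dim, begin=None, end=None, step=None):
    """Return the indices selected along one axis of length `dim`."""
    if step is None:
        step = 1
    if step > 0:
        # Forward slice: default to the whole axis, front to back.
        begin = 0 if begin is None else begin
        end = dim if end is None else end
    else:
        # Backward slice: start at the last element, stop before index 0.
        begin = dim - 1 if begin is None else begin
        end = -1 if end is None else end
    return list(range(begin, end, step))


print(slice_indices_sketch(4))
print(slice_indices_sketch(4, step=-1))
print(slice_indices_sketch(5, begin=1, end=4, step=2))
```

The length of the returned list is the size of the corresponding output dimension.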
Slices along a given axis.
Returns an array slice along a given axis starting from the begin index to the end index.
Assume x is an array with the following elements:
[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
slice_axis(x, axis: 1, begin: 0, end: 2) # => [[1, 2], [5, 6], [9, 10]]
- data (required) Input data.
- axis (required) Axis along which to slice. Supports negative indexes.
- begin (required) The beginning index along the axis to be sliced. Supports negative indexes.
- end (required) The ending index along the axis to be sliced. Supports negative indexes.
- name (optional) Name of the symbol.
Slices like the shape of another array.
This function is similar to .slice, however, the begin values are always 0 and the end values of specific axes are inferred from the second input shape_like.
Given a value of shape_like of shape=[d_0, d_1, ..., d_n-1] and default empty axes, .slice_like performs the following operation:
out = slice(input, begin: [0, 0, ..., 0], end: [d_0, d_1, ..., d_n-1])
When axes is present, it is used to specify which axes are being sliced.
It is allowed to have first and second inputs with different dimensions, however, you have to make sure axes are specified and do not exceed the dimension limits.
For example, given an input a with shape=[2, 3, 4, 5] and an input b with shape=[1, 2, 3], the following is not allowed because the number of dimensions of a is 4 and the number of dimensions of b is 3:
out = slice_like(a, b)
The following is allowed in this situation:
out = slice_like(a, b, axes: [0, 2])
Assume x and y are arrays with the following elements:
[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]] # x
[[0, 0, 0], [0, 0, 0]] # y
slice_like(x, y) = [[1, 2, 3], [5, 6, 7]]
slice_like(x, y, axes: [0, 1]) = [[1, 2, 3], [5, 6, 7]]
slice_like(x, y, axes: [0]) = [[1, 2, 3, 4], [5, 6, 7, 8]]
slice_like(x, y, axes: [-1]) = [[1, 2, 3], [5, 6, 7], [9, 10, 11]]
- data (required) Input data.
- shape_like: Input to shape like.
- axes: List of axes on which input data will be sliced according to the corresponding size of the second input. By default it will slice on all axes. Negative axes are supported.
- name (optional) Name of the symbol.
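The effect of axes on the result can be checked by computing output shapes alone. This plain-Python sketch is not the MXNet operator: the name slice_like_shape is hypothetical, and it assumes negative axes are counted from the end of the first input's shape.

```python
def slice_like_shape(data_shape, like_shape, axes=None):
    """Return the shape produced by slicing data like `like_shape`."""
    out = list(data_shape)
    if axes is None:
        # Without axes, every axis is sliced, so the ranks must match.
        if len(data_shape) != len(like_shape):
            raise ValueError("axes must be given when ranks differ")
        axes = range(len(data_shape))
    for axis in axes:
        if axis < 0:
            axis += len(data_shape)  # assumption: relative to the first input
        out[axis] = like_shape[axis]
    return out


print(slice_like_shape([3, 4], [2, 3]))
print(slice_like_shape([2, 3, 4, 5], [1, 2, 3], axes=[0, 2]))
```

Axes that are not listed keep the size of the first input, which is why the rank mismatch is acceptable once axes is given.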
Applies the softmax function.
The resulting array contains elements in the range (0, 1) and the elements along the given axis sum up to 1.
Assume x is an array with the following elements:
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
softmax(x, axis: 0) # => [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
softmax(x, axis: 1) # => [[0.3333, 0.3333, 0.3333], [0.3333, 0.3333, 0.3333]]
- data (required) Input data.
- axis (optional, default = -1) The axis along which to compute softmax.
- temperature (optional, default = 1.0) Temperature parameter in softmax.
- dtype (optional) Type of the output in case this can't be inferred. Defaults to the same type as the input if not defined.
- name (optional) Name of the symbol.
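For a single row, softmax with a temperature can be sketched in plain Python. The sketch assumes the temperature divides the inputs before exponentiation, which is the usual definition; the name softmax_sketch is hypothetical and the axis argument is not modelled.

```python
import math


def softmax_sketch(row, temperature=1.0):
    scaled = [v / temperature for v in row]
    peak = max(scaled)
    # Subtracting the maximum avoids overflow and leaves the result unchanged.
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


print(softmax_sketch([1.0, 1.0, 1.0]))
print(softmax_sketch([1.0, 2.0], temperature=10.0))
```

Higher temperatures flatten the distribution toward uniform, while values below 1.0 sharpen it around the largest input.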
Returns a sorted copy of an input array along the given axis.
Assume x is an array with the following elements:
[[1, 4], [3, 1]]
sort(x) = [[1, 4], [1, 3]]
sort(x, axis: 0) = [[1, 1], [3, 4]]
sort(x, axis: nil) = [1, 1, 3, 4]
sort(x, is_ascend: false) = [[4, 1], [3, 1]]
- data (required) Input data.
- axis (optional, default = -1) The axis along which to sort the input tensor. If omitted, the last axis is used. If nil, the flattened array is used.
- is_ascend (optional, default = true) Whether to sort in ascending or descending order.
- name (optional) Name of the symbol.
Returns element-wise square-root value of the input.
Assume x is an array with the following elements:
[4, 9, 16]
sqrt(x) # => [2, 3, 4]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns element-wise squared value of the input.
Assume x is an array with the following elements:
[2, 3, 4]
square(x) # => [4, 9, 16]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns element-wise difference of the input arrays.
Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.
Equivalent to lhs - rhs.
Computes the sum of array elements over given axes.
Assume x is an array with the following elements:
[[[1, 2], [2, 3], [1, 3]],
[[1, 4], [4, 3], [5, 2]],
[[7, 1], [7, 2], [7, 3]]]
sum(x, axis: 1) # => [[4, 8], [10, 9], [21, 6]]
sum(x, axis: [1, 2]) # => [12, 19, 27]
- data (required) Input data.
- axis (optional) The axis or axes along which to perform the reduction. axis: [] or axis: nil will compute over all elements into a scalar array with shape [1]. If axis is an Int, a reduction is performed on a particular axis. If axis is an array of Int, a reduction is performed on all the axes specified in the array. If exclude is true, reduction will be performed on the axes that are not in axis instead. Negative values mean indexing from right to left.
- keepdims (optional, default = false) If this is set to true, the reduced axes are left in the result as dimensions with size one.
- exclude (optional, default = false) Whether to perform the reduction on the axes that are not in axis instead.
- name (optional) Name of the symbol.
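How axis, keepdims and exclude interact is easiest to see on output shapes. The following plain-Python sketch computes only the shape of the reduction result under the rules described above; the name reduced_shape is hypothetical, and exclude is applied only when axis is given.

```python
def reduced_shape(shape, axis=None, keepdims=False, exclude=False):
    n = len(shape)
    if axis is None or axis == []:
        reduce_axes = set(range(n))           # reduce over every axis
    else:
        axes = [axis] if isinstance(axis, int) else list(axis)
        reduce_axes = {a % n for a in axes}   # negative values index from the right
        if exclude:
            reduce_axes = set(range(n)) - reduce_axes
    if keepdims:
        # Reduced axes stay in place with size one.
        return [1 if i in reduce_axes else d for i, d in enumerate(shape)]
    out = [d for i, d in enumerate(shape) if i not in reduce_axes]
    return out or [1]                         # full reduction yields shape [1]


print(reduced_shape([3, 3, 2], axis=1))
print(reduced_shape([3, 3, 2], axis=[1, 2]))
print(reduced_shape([3, 3, 2], axis=0, exclude=True))
```

The first two calls correspond to the shapes of the two results in the example above.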
Takes elements from an input array along the given axis.
This function slices the input array along a particular axis with the provided indices.
Given data tensor of rank r >= 1, and indices tensor of rank q, gather entries of the axis dimension of data (by default outer-most one as axis=0) indexed by indices, and concatenate them in an output tensor of rank q + (r - 1).
Assume x and i are arrays with the following elements:
[[1, 2], [3, 4], [5, 6]] # x
[[0, 1], [1, 2]] # i
# get rows 0 and 1, then 1 and 2, along axis 0
take(x, i) # => [[[1, 2], [3, 4]], [[3, 4], [5, 6]]]
- a (required) The input array.
- indices (required) The indices of the values to be extracted.
- axis (optional, default = 0) The axis of input array to be taken. For input tensor of rank r, it could be in the range of [-r, r-1].
- mode (optional, default = :clip) Specifies how out-of-bound indices behave. :clip means to clip to the range: indices that are too large are replaced by the index that addresses the last element along an axis. :wrap means to wrap around.
- name (optional) Name of the symbol.
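The two out-of-bound modes can be compared with a small model that takes rows along axis 0. This is a plain-Python sketch on nested lists, not the MXNet operator; the name take_sketch is hypothetical and other axes are not modelled.

```python
def take_sketch(rows, indices, mode="clip"):
    n = len(rows)

    def pick(i):
        if mode == "wrap":
            i %= n                       # wrap around the axis length
        else:
            i = max(0, min(n - 1, i))    # clip into [0, n - 1]
        return rows[i]

    def walk(idx):
        # The output has the nesting of `indices`, with rows at the leaves.
        if isinstance(idx, list):
            return [walk(j) for j in idx]
        return pick(idx)

    return walk(indices)


x = [[1, 2], [3, 4], [5, 6]]
print(take_sketch(x, [[0, 1], [1, 2]]))
print(take_sketch(x, [4], mode="clip"), take_sketch(x, [4], mode="wrap"))
```

The result nests the selected rows inside the structure of the index array, matching the rank q + (r - 1) stated above.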
Computes the element-wise tangent of the input array.
The input should be in radians (2𝜋 radians equals 360 degrees).
tan([0, 𝜋/4, 𝜋/2]) = [0, 1, -∞]
- data (required) Input data.
- name (optional) Name of the symbol.
Returns the hyperbolic tangent of the input array, computed element-wise.
tanh(x) = sinh(x) / cosh(x)
- data (required) Input data.
- name (optional) Name of the symbol.
Repeats the array multiple times.
Assume x is an array with the following elements:
[[1, 2], [3, 4]]
If reps has length d and the input array has n dimensions, there are three cases:
- n=d. Repeat i-th dimension of the input reps[i] times:
tile(x, reps: [2, 3]) = [[1, 2, 1, 2, 1, 2],
[3, 4, 3, 4, 3, 4],
[1, 2, 1, 2, 1, 2],
[3, 4, 3, 4, 3, 4]]
- n>d. reps is promoted to length n by prepending 1's. For an input shape [2, 3], reps: [2] is treated as [1, 2]:
tile(x, reps: [2]) = [[1, 2, 1, 2],
[3, 4, 3, 4]]
- n<d. The input is promoted to be d-dimensional by prepending new axes. A shape [2, 2] array is promoted to [1, 2, 2] for 3-D replication:
tile(x, reps: [2, 2, 3]) = [[[1, 2, 1, 2, 1, 2],
[3, 4, 3, 4, 3, 4],
[1, 2, 1, 2, 1, 2],
[3, 4, 3, 4, 3, 4]],
[[1, 2, 1, 2, 1, 2],
[3, 4, 3, 4, 3, 4],
[1, 2, 1, 2, 1, 2],
[3, 4, 3, 4, 3, 4]]]
- data (required) Input data.
- reps: The number of times to repeat the input array. Each element of reps must be a positive integer.
- name (optional) Name of the symbol.
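The three promotion cases reduce to padding the shorter of the two lists with leading ones and multiplying element-wise. The plain-Python sketch below computes only the output shape; the name tiled_shape is hypothetical.

```python
def tiled_shape(shape, reps):
    n, d = len(shape), len(reps)
    # Pad whichever list is shorter with leading ones ([1] * negative is []).
    shape = [1] * (d - n) + list(shape)
    reps = [1] * (n - d) + list(reps)
    return [s * r for s, r in zip(shape, reps)]


print(tiled_shape([2, 2], [2, 3]))
print(tiled_shape([2, 2], [2]))
print(tiled_shape([2, 2], [2, 2, 3]))
```

The three calls correspond to the n=d, n>d and n<d examples above and give the shapes of the arrays shown there.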
Returns the top k elements in an input array along the given axis.
Assume x is an array with the following elements:
[[0.3, 0.2, 0.4], [0.1, 0.3, 0.2]]
topk(x) = [[2.0], [1.0]]
topk(x, ret_typ: :value, k: 2) = [[0.4, 0.3], [0.3, 0.2]]
topk(x, ret_typ: :value, k: 2, is_ascend: true) = [[0.2, 0.3], [0.1, 0.2]]
topk(x, axis: 0, k: 2) = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
- data (required) Input data.
- axis (optional, default = -1) Axis along which to choose the top k indices. If omitted, the last axis is used. If nil, the flattened array is used.
- k (optional, default = 1) Number of top elements to select. It should always be smaller than or equal to the number of elements along the given axis.
- ret_typ (optional, default = :indices) The return type. :value means to return the top k values, :indices means to return the indices of the top k values, :mask means to return a mask array containing 0 and 1 (1 means the top k value), and :both means to return a list of both values and indices of the top k elements.
- is_ascend (optional, default = false) Whether to choose k largest or k smallest elements. Top k largest elements will be chosen if set to false.
- dtype (optional, default = :float32) The data type of the output indices when ret_typ is :indices.
- name (optional) Name of the symbol.
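The return types can be compared on a single row. The sketch below is plain Python rather than the MXNet operator; the name topk_row is hypothetical, indices are returned as floats to mirror the default dtype shown in the examples, and tie-breaking follows Python's stable sort, which is an assumption.

```python
def topk_row(row, k=1, ret_typ="indices", is_ascend=False):
    # Positions of the k largest (or k smallest) values, best first.
    order = sorted(range(len(row)), key=lambda i: row[i],
                   reverse=not is_ascend)[:k]
    values = [row[i] for i in order]
    indices = [float(i) for i in order]
    if ret_typ == "value":
        return values
    if ret_typ == "mask":
        chosen = set(order)
        return [1 if i in chosen else 0 for i in range(len(row))]
    if ret_typ == "both":
        return values, indices
    return indices


x = [[0.3, 0.2, 0.4], [0.1, 0.3, 0.2]]
print([topk_row(r) for r in x])
print([topk_row(r, k=2, ret_typ="value") for r in x])
```

Applying the function to each row reproduces the last-axis examples listed above.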
Permutes the dimensions of an array.
Assume x and y are arrays with the following elements:
[[[1, 2], [3, 4], [5, 6], [7, 8]]] # x
[[1, 2], [3, 4]] # y
transpose(x) # => [[[1], [3], [5], [7]], [[2], [4], [6], [8]]]
transpose(x, axes: [1, 0, 2]) # => [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
transpose(y) # => [[1, 3], [2, 4]]
- data (required) Input data.
- axes (optional) Target axis order. By default the axes will be inverted.
- name (optional) Name of the symbol.
Return the element-wise truncated value of the input.
The truncated value of x is the nearest integer i which is closer to zero than x is. In short, the fractional part of the signed number x is discarded.
Assume x is an array with the following elements:
[-2.1, -1.9, 1.5, 1.9, 2.1]
trunc(x) = [-2.0, -1.0, 1.0, 1.0, 2.0]
- data (required) Input data.
- name (optional) Name of the symbol.
Creates a symbolic variable with the specified name.
- name: Variable name.
- attr: Additional attributes to set on the variable.
- shape: The shape of a variable. If specified, it may be used during the shape inference.
- dtype: The dtype for input variable. If not specified, this value will be inferred.
Returns elements, either from x or y, depending on the condition.
Given three arrays, condition, x and y, return an array with the elements from x or y, depending on whether the elements from condition are true or false. x and y must have the same shape.
If condition has the same shape as x, each element in the output array is from x if the corresponding element in condition is true and from y if false.
If condition does not have the same shape as x, it must be a 1-D array whose size is the same as the size of the first dimension of x. Each row of the output array is from x if the corresponding element from condition is true and from y if false.
Note: all non-zero values are interpreted as true
Assume x, y and condition are arrays with the following elements:
[[1, 2], [3, 4]] # x
[[5, 6], [7, 8]] # y
[[0, 1], [-1, 0]] # condition
where(condition, x, y) = [[5, 2], [3, 8]]
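Both condition layouts described above (same shape as x, or one flag per row) can be modelled on 2-D nested lists. This is a plain-Python sketch, not the MXNet operator; the name where_sketch is hypothetical.

```python
def where_sketch(condition, x, y):
    out = []
    for flag, x_row, y_row in zip(condition, x, y):
        if isinstance(flag, list):
            # Element-wise case: condition has the same shape as x.
            out.append([a if c != 0 else b
                        for c, a, b in zip(flag, x_row, y_row)])
        else:
            # Row-wise case: one flag selects an entire row.
            out.append(list(x_row) if flag != 0 else list(y_row))
    return out


x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]
print(where_sketch([[0, 1], [-1, 0]], x, y))
print(where_sketch([1, 0], x, y))
```

Any non-zero flag, including a negative one, selects from x, in line with the note above.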
Returns an array filled with all zeros, with the given shape.
- data (required) Input data.
- shape: The shape of the array.
- dtype (default = :float32) The data type of the output array.
- ctx (optional) Device context (default is the current context). Only used for imperative calls.
- name (optional) Name of the symbol.
Returns an array of zeros with the same shape, data type and storage type as the input array.
Assume x is an array with the following elements:
[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
zeros_like(x) # => [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
- data (required) Input data.
- name (optional) Name of the symbol.
Instance Method Detail
Performs element-wise not equal to (#!=) comparison operation (without broadcasting).
Returns the result of the first array elements raised to powers from the second array (or scalar), element-wise (without broadcasting).
Performs element-wise less than or equal to (#<=) comparison operation (without broadcasting).
Performs element-wise greater than or equal to (#>=) comparison operation (without broadcasting).
Gets the attribute for specified key.
This function only works for non-grouped symbols.
data = MXNet::Symbol.var("data", attr: {mood: "angry"})
data.attr("mood") # => "angry"
- key: The key corresponding to the desired attribute.
Recursively gets all attributes from the symbol and its children.
There is a key in the returned hash for every child with a non-empty set of attributes. For each symbol, the name of the symbol is its key in the hash and the corresponding value is that symbol's attribute list.
a = MXNet::Symbol.var("a", attr: {"a1" => "a2"})
b = MXNet::Symbol.var("b", attr: {"b1" => "b2"})
c = a + b
c.attr_dict # => {"a" => {"a1" => "a2"}, "b" => {"b1" => "b2"}}
Binds the current symbol to an executor and returns the executor.
First, declare the computation and then bind to the data to evaluate. This function returns an executor which provides a method for evaluation.
a = MXNet::Symbol.var("a")
b = MXNet::Symbol.var("b")
c = a + b # => "<Symbol broadcast_add>"
e = c.bind(args: {"a" => MXNet::NDArray.ones([2, 3]), "b" => MXNet::NDArray.ones([2, 3])}, ctx: MXNet.cpu)
e.forward.first # => [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]
# <NDArray 2x3 float32 cpu(0)>
- args (array or Hash(String, MXNet::NDArray), default []) Input arguments.
  - If the input type is an array, the order should be the same as the order returned by #list_arguments.
  - If the input type is Hash(String, MXNet::NDArray), the arguments map to those returned by #list_arguments.
- ctx (default is current context) The device context the executor is to evaluate on.
Convenience fluent method for .broadcast_greater_equal
Evaluates a symbol given arguments.
The #eval method combines a call to #bind (which returns an executor) with a call to Executor#forward. For the common use case, where you might repeatedly evaluate with the same arguments, #eval is slow. In that case, you should call #bind once and then repeatedly call Executor#forward. This function allows simpler syntax for less cumbersome introspection.
Returns an array of MXNet::NDArray corresponding to the values taken by each symbol when evaluated on the given arguments. When called on a single symbol (not a group), the result will be an array with one element.
a = MXNet::Symbol.var("a")
b = MXNet::Symbol.var("b")
c = a + b # => "<Symbol broadcast_add>"
c.eval(a: MXNet::NDArray.ones([2, 3]), b: MXNet::NDArray.ones([2, 3])) # => [<NDArray 2x3 float32 @cpu(0)>]
c.eval(MXNet::NDArray.ones([2, 3]), MXNet::NDArray.ones([2, 3])) # => [<NDArray 2x3 float32 @cpu(0)>]
- ctx (default is current context) The device context the executor is to evaluate on.
- ndargs: Input arguments. All the arguments must be provided.
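The trade-off between #eval and #bind can be sketched side by side. This is an illustrative sketch only, assuming the shard is loaded with `require "mxnet"` (an assumption); it uses only the calls shown in the examples on this page, and the loop count is arbitrary.

```crystal
require "mxnet" # assumed shard entry point

a = MXNet::Symbol.var("a")
b = MXNet::Symbol.var("b")
c = a + b

x = MXNet::NDArray.ones([2, 3])
y = MXNet::NDArray.ones([2, 3])

# One-off inspection: #eval binds and runs forward in a single call.
puts c.eval(a: x, b: y).first

# Repeated evaluation: bind once, then reuse the same executor.
e = c.bind(args: {"a" => x, "b" => y}, ctx: MXNet.cpu)
3.times do
  puts e.forward.first
end
```

The second form pays the binding cost once, which is why it is the better fit when the same arguments are evaluated many times.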
Infers the dtypes of all arguments and all outputs, given the known dtypes of some arguments.
This function takes the known dtypes of arguments either
positionally or by name. It returns a tuple of nil values if
there is not enough information to deduce the missing dtypes.
Inconsistencies in the known dtypes will cause an error to be raised.
a = MXNet::Symbol.var("a")
b = MXNet::Symbol.var("b")
c = a + b
arg_types, out_types, aux_types = c.infer_dtype({"a" => :float32})
arg_types # => [:float32, :float32]
out_types # => [:float32]
aux_types # => []
- args (Array(::Symbol | Nil) or Hash(String, ::Symbol | Nil)) Dtypes of known arguments. Unknown dtypes can be marked as nil.
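The positional form of args can be sketched as follows. This is a minimal illustration, assuming the shard is loaded with `require "mxnet"` (an assumption); the array is matched against the order returned by #list_arguments, with nil marking the unknown dtype.

```crystal
require "mxnet" # assumed shard entry point

a = MXNet::Symbol.var("a")
b = MXNet::Symbol.var("b")
c = a + b

# Positional form: one entry per argument, in #list_arguments order.
# The dtype of "a" is known; the dtype of "b" is left as nil.
known = [:float32, nil]

arg_types, out_types, aux_types = c.infer_dtype(known)
```

One would expect this call to correspond to the by-name example above, since both supply the same information about "a".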
Infers the dtypes partially.
This function works the same way as #infer_dtype, except that
this function can return partial results.
In the following example, information about "b" is not
available. So, #infer_dtype will return a tuple of nil
values but this method will return partial values.
a = MXNet::Symbol.var("a")
b = MXNet::Symbol::Ops._cast(MXNet::Symbol.var("b"), dtype: :int32)
c = a + b
arg_types, out_types, aux_types = c.infer_dtype_partial([:int32])
arg_types # => [:int32, nil]
out_types # => [:int32]
aux_types # => []
Infers the shapes of all arguments and all outputs, given the known shapes of some arguments.
This function takes the known shapes of arguments either
positionally or by name. It returns a tuple of nil values if
there is not enough information to deduce the missing shapes.
Inconsistencies in the known shapes will cause an error to be raised.
a = MXNet::Symbol.var("a")
b = MXNet::Symbol.var("b")
c = a + b
arg_shapes, out_shapes, aux_shapes = c.infer_shape([nil, [3, 3]])
arg_shapes # => [[3, 3], [3, 3]]
out_shapes # => [[3, 3]]
aux_shapes # => []
- args (Array(Array(Int32) | Nil) or Hash(String, Array(Int32) | Nil)) Shapes of known arguments. Unknown shapes can be marked as nil.
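The by-name form of args can be sketched as follows. This is a minimal illustration, assuming the shard is loaded with `require "mxnet"` (an assumption); the hash literal includes an explicit nil entry so that its type is exactly Hash(String, Array(Int32) | Nil), as documented above.

```crystal
require "mxnet" # assumed shard entry point

a = MXNet::Symbol.var("a")
b = MXNet::Symbol.var("b")
c = a + b

# By-name form: keys are argument names, nil marks an unknown shape.
known = {"a" => nil, "b" => [3, 3]}

arg_shapes, out_shapes, aux_shapes = c.infer_shape(known)
```

Naming the arguments is less fragile than the positional form when the order returned by #list_arguments is not obvious from the graph.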
Infers the shapes partially.
This function works the same way as #infer_shape, except that
this function can return partial results.
In the following example, information about "b" is not
available. So, #infer_shape will return a tuple of nil
values but this method will return partial values.
a = MXNet::Symbol.fully_connected(MXNet::Symbol.var("a"), nil, nil, num_hidden: 128)
b = MXNet::Symbol.fully_connected(MXNet::Symbol.var("b"), nil, nil, num_hidden: 128)
c = a + b
arg_shapes, out_shapes, aux_shapes = c.infer_shape_partial([[10, 64]])
arg_shapes # => [[10, 64], [128, 64], [128], [], [], []]
out_shapes # => [[10, 128]]
aux_shapes # => []
Lists all the arguments of the symbol.
a = MXNet::Symbol.var("a")
b = MXNet::Symbol.var("b")
c = a * b
c.list_arguments # => ["a", "b"]
Gets all attributes.
data = MXNet::Symbol.var("data", attr: {"mood" => "angry"})
data.list_attr # => {"mood" => "angry"}
Lists all the auxiliary states of the symbol.
a = MXNet::Symbol.var("a")
b = MXNet::Symbol.var("b")
c = a + b
c.list_auxiliary_states # => []
Auxiliary states are special states of symbols that do not
correspond to an argument, and are not updated by gradient
descent. Common examples of auxiliary states include the
moving_mean and moving_variance in BatchNorm. Most
operators do not have auxiliary states.
Lists all the outputs of the symbol.
a = MXNet::Symbol.var("a")
b = MXNet::Symbol.var("b")
c = a + b
c.list_outputs # => ["_plus12_output"]
Gets name of the symbol.
This function only works for a non-grouped symbol. It returns
nil for a grouped symbol.