Weight Initializers

A weight initializer is an instance of Initializer that destructively edits (that is, overwrites in place) the contents of a numpy.ndarray or cupy.ndarray. Typically, weight initializers are passed to the __init__ of a Link, which uses them to initialize its weights and biases.

Base class

class chainer.initializer.Initializer(dtype=None)

Initializes array.

It initializes the given array.

Variables: dtype – Data type specifier. It is used for type checking in the __call__ method.
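
As a rough illustration of what "destructively edits" means, the sketch below mimics the calling convention in plain NumPy: an object is called with an array and overwrites it in place. The class name FillWithSevens and the exact form of its dtype check are hypothetical and chosen for illustration; this is not Chainer's implementation.

```python
import numpy as np


class FillWithSevens:
    """Hypothetical initializer-like object (not part of Chainer)."""

    def __init__(self, dtype=None):
        self.dtype = dtype

    def __call__(self, array):
        # Optional type check, mirroring the role of dtype described above.
        if self.dtype is not None and array.dtype != np.dtype(self.dtype):
            raise TypeError("unexpected dtype: {}".format(array.dtype))
        # Overwrite the caller's array in place; nothing is returned.
        array[...] = 7


w = np.empty((2, 3), dtype=np.float32)
FillWithSevens(dtype=np.float32)(w)
```

After the call, w itself holds the new values; no new array is created.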

Concrete initializers

class chainer.initializers.Identity(scale=1.0, dtype=None)

Initializes array with the identity matrix.

It initializes the given array with a constant multiple of the identity matrix. Note that the array passed in must be a 2D square matrix.

Variables: scale (scalar) – A constant by which the identity matrix is multiplied.
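
The behavior described above can be sketched in NumPy as follows. The helper name fill_identity is hypothetical; it only imitates the documented effect (a square 2D array overwritten with scale times the identity) and is not the library's code.

```python
import numpy as np


def fill_identity(array, scale=1.0):
    """Overwrite a 2D square array with scale * I (illustrative helper)."""
    if array.ndim != 2 or array.shape[0] != array.shape[1]:
        raise ValueError("array must be a 2D square matrix")
    array[...] = scale * np.eye(array.shape[0], dtype=array.dtype)


w = np.empty((3, 3), dtype=np.float32)
fill_identity(w, scale=2.0)
```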

class chainer.initializers.Constant(fill_value, dtype=None)

Initializes array with constant value.

Variables:
  • fill_value (scalar or numpy.ndarray or cupy.ndarray) – A constant to be assigned to the initialized array. Broadcasting is allowed in this assignment.
  • dtype – Data type specifier.
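
To make the broadcasting remark concrete, here is a small NumPy sketch. It assumes the assignment behaves like NumPy's own array[...] = fill_value, which is an assumption about the implementation rather than something stated above.

```python
import numpy as np

w = np.empty((2, 3), dtype=np.float32)

# A scalar fill_value is broadcast to every element.
w[...] = 0.5
all_half = bool((w == 0.5).all())

# A 1D fill_value of length 3 is broadcast across both rows.
fill_value = np.array([1.0, 2.0, 3.0])
w[...] = fill_value
```

A fill_value whose shape cannot be broadcast to the target shape (for example, length 2 against shape (2, 3)) would raise an error under NumPy's rules.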

chainer.initializers.Zero(dtype=None)

Returns an initializer that fills an array with zeros.

Parameters: dtype – Data type specifier.
Returns: An initializer; calling it on a numpy.ndarray or cupy.ndarray overwrites that array with zeros.
Return type: Initializer

chainer.initializers.One(dtype=None)

Returns an initializer that fills an array with ones.

Parameters: dtype – Data type specifier.
Returns: An initializer; calling it on a numpy.ndarray or cupy.ndarray overwrites that array with ones.
Return type: Initializer

class chainer.initializers.Normal(scale=0.05, dtype=None)

Initializes array with a normal distribution.

Each element of the array is initialized with a value drawn independently from a Gaussian distribution with mean 0 and standard deviation scale.

Parameters:
  • scale (float) – Standard deviation of Gaussian distribution.
  • dtype – Data type specifier.

class chainer.initializers.GlorotNormal(scale=1.0, dtype=None)

Initializes array with a scaled Gaussian distribution.

Each element of the array is initialized with a value drawn independently from a Gaussian distribution with mean 0 and standard deviation \(scale \times \sqrt{\frac{2}{fan_{in} + fan_{out}}}\), where \(fan_{in}\) and \(fan_{out}\) are the numbers of input and output units, respectively.

Reference: Glorot & Bengio, AISTATS 2010

Parameters:
  • scale (float) – A constant that determines the scale of the standard deviation.
  • dtype – Data type specifier.
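
The formula above can be sketched in NumPy as below. The helper glorot_normal_std is hypothetical, and it assumes a 2D weight whose shape is (output units, input units); how the fans are derived from other shapes is not specified in this section.

```python
import numpy as np


def glorot_normal_std(shape, scale=1.0):
    """Standard deviation from the formula above for a 2D weight shape.

    Assumption: shape is (fan_out, fan_in), i.e. (output units, input units).
    """
    fan_out, fan_in = shape
    return scale * np.sqrt(2.0 / (fan_in + fan_out))


shape = (100, 300)
std = glorot_normal_std(shape)  # sqrt(2 / 400)

rng = np.random.default_rng(0)
w = np.empty(shape, dtype=np.float32)
# Fill in place with zero-mean Gaussian samples of that standard deviation.
w[...] = rng.normal(loc=0.0, scale=std, size=shape)
```

Because the denominator sums both fans, the spread shrinks as either the input or the output side of the layer grows.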

class chainer.initializers.HeNormal(scale=1.0, dtype=None)

Initializes array with a scaled Gaussian distribution.

Each element of the array is initialized with a value drawn independently from a Gaussian distribution with mean 0 and standard deviation \(scale \times \sqrt{\frac{2}{fan_{in}}}\), where \(fan_{in}\) is the number of input units.

Reference: He et al., http://arxiv.org/abs/1502.01852

Parameters:
  • scale (float) – A constant that determines the scale of the standard deviation.
  • dtype – Data type specifier.
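
As a worked example of the formula, take the default scale = 1 and a layer with \(fan_{in} = 512\) input units (a value chosen here purely for illustration). Writing \(\sigma\) for the resulting standard deviation:

```latex
\[
\sigma = scale \times \sqrt{\frac{2}{fan_{in}}}
       = 1 \times \sqrt{\frac{2}{512}}
       = \sqrt{\frac{1}{256}}
       = \frac{1}{16}
       = 0.0625
\]
```

Doubling scale to 2 doubles the standard deviation to 0.125, while quadrupling \(fan_{in}\) to 2048 halves it to 0.03125.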

class chainer.initializers.Orthogonal(scale=1.1, dtype=None)

Initializes array with an orthogonal system.

This initializer first makes a matrix of the same shape as the array to be initialized, whose elements are drawn independently from a standard Gaussian distribution. Next, it applies Singular Value Decomposition (SVD) to the matrix. Then, it initializes the array with one of the two resulting orthogonal matrices, depending on the shape of the input array. Finally, the array is multiplied by the constant scale.

If the input array has more than two dimensions, it is treated as a matrix by flattening all axes except the first into a single axis.

The number of vectors that make up the orthogonal system (i.e. the first element of the shape of the array) must be less than or equal to the dimension of each vector (i.e. the second element of the shape of the array).

Variables:
  • scale (float) – A constant by which the array is multiplied.
  • dtype – Data type specifier.

Reference: Saxe et al., http://arxiv.org/abs/1312.6120
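
The four steps above can be sketched in NumPy as follows. The function orthogonal_fill and its rng argument are hypothetical, and the rule used to pick between the two SVD factors (take whichever has the target shape) is an assumption made for this sketch, not a statement about the library's code.

```python
import numpy as np


def orthogonal_fill(array, scale=1.1, rng=None):
    """Overwrite array with scaled orthonormal rows (illustrative sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    rows = array.shape[0]
    cols = array.size // rows  # flatten all axes except the first
    if rows > cols:
        raise ValueError("need rows <= flattened columns")
    # Step 1: Gaussian matrix of the (flattened) target shape.
    a = rng.standard_normal((rows, cols))
    # Step 2: reduced SVD; u is (rows, rows), vt is (rows, cols).
    u, _, vt = np.linalg.svd(a, full_matrices=False)
    # Step 3: pick whichever factor has the target (rows, cols) shape.
    q = u if u.shape == (rows, cols) else vt
    # Step 4: scale and write back in place.
    array[...] = scale * q.reshape(array.shape)


w = np.empty((4, 2, 3), dtype=np.float64)
orthogonal_fill(w, scale=1.1, rng=np.random.default_rng(0))
flat = w.reshape(4, -1)
gram = flat @ flat.T  # close to scale**2 times the 4x4 identity
```

The Gram matrix check reflects the shape constraint stated above: four mutually orthogonal row vectors can only exist if each has at least four components.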

class chainer.initializers.Uniform(scale=0.05, dtype=None)

Initializes array with a scaled uniform distribution.

Each element of the array is initialized with a value drawn independently from the uniform distribution on \([-scale, scale]\).

Variables:
  • scale (float) – A constant that determines the scale of the uniform distribution.
  • dtype – Data type specifier.

class chainer.initializers.LeCunUniform(scale=1.0, dtype=None)

Initializes array with a scaled uniform distribution.

Each element of the array is initialized with a value drawn independently from the uniform distribution on \([-s, s]\), where \(s = scale \times \sqrt{\frac{3}{fan_{in}}}\). Here, \(fan_{in}\) is the number of input units.

Reference: LeCun 98, Efficient Backprop http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf

Variables:
  • scale (float) – A constant that determines the scale of the uniform distribution.
  • dtype – Data type specifier.
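
A minimal NumPy sketch of the bound and the sampling step follows. The helper lecun_uniform_bound is hypothetical, and the weight is assumed to be 2D with shape (output units, input units), so that fan_in is the second element of the shape.

```python
import numpy as np


def lecun_uniform_bound(fan_in, scale=1.0):
    """Half-width s of the interval [-s, s] from the formula above."""
    return scale * np.sqrt(3.0 / fan_in)


# Assumption: a 2D weight of shape (output units, input units).
shape = (50, 300)
s = lecun_uniform_bound(fan_in=shape[1])  # sqrt(3 / 300)

rng = np.random.default_rng(0)
w = np.empty(shape, dtype=np.float32)
# Fill in place with samples from the uniform distribution on [-s, s].
w[...] = rng.uniform(low=-s, high=s, size=shape)
```

Since a uniform distribution on \([-s, s]\) has variance \(s^2 / 3\), this choice of s gives each weight a variance of \(scale^2 / fan_{in}\).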

class chainer.initializers.GlorotUniform(scale=1.0, dtype=None)

Initializes array with a scaled uniform distribution.

Each element of the array is initialized with a value drawn independently from the uniform distribution on \([-s, s]\), where \(s = scale \times \sqrt{\frac{6}{fan_{in} + fan_{out}}}\). Here, \(fan_{in}\) and \(fan_{out}\) are the numbers of input and output units, respectively.

Variables:
  • scale (float) – A constant that determines the scale of the uniform distribution.
  • dtype – Data type specifier.
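
One way to read this bound: a uniform distribution on \([-s, s]\) has variance \(s^2 / 3\), so the choice of s above yields the same standard deviation as the one documented for GlorotNormal. The short derivation below is added for illustration and uses only the formulas given in this section.

```latex
\[
\mathrm{Var}\bigl(U[-s, s]\bigr) = \frac{s^2}{3}
  = \frac{scale^2}{3} \times \frac{6}{fan_{in} + fan_{out}}
  = scale^2 \times \frac{2}{fan_{in} + fan_{out}},
\qquad
\sqrt{\mathrm{Var}\bigl(U[-s, s]\bigr)} = scale \times \sqrt{\frac{2}{fan_{in} + fan_{out}}}
\]
```

The same argument relates HeUniform to HeNormal, since \(\frac{1}{3} \times \frac{6}{fan_{in}} = \frac{2}{fan_{in}}\).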

class chainer.initializers.HeUniform(scale=1.0, dtype=None)

Initializes array with a scaled uniform distribution.

Each element of the array is initialized with a value drawn independently from the uniform distribution on \([-s, s]\), where \(s = scale \times \sqrt{\frac{6}{fan_{in}}}\). Here, \(fan_{in}\) is the number of input units.

Variables:
  • scale (float) – A constant that determines the scale of the uniform distribution.
  • dtype – Data type specifier.

Helper function

chainer.init_weight(weights, initializer, scale=1.0)

Helper function for initialization of the weight tensor.

This function accepts several types of initializer, prepares the appropriate chainer.Initializer if necessary, and performs the initialization.

Parameters:
  • weights (numpy.ndarray or cupy.ndarray) – Weight tensor to be initialized.
  • initializer – The value used to initialize the data. May be None (in which case HeNormal is used as an initializer), a scalar to set all values to, a numpy.ndarray to be assigned, or a callable that takes a numpy.ndarray or cupy.ndarray and edits its values in place.
  • scale (scalar) – A constant to multiply initializer by.
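
The dispatch over initializer types described above can be imitated in plain NumPy as in the sketch below. The function init_weight_sketch is hypothetical. In particular, applying scale as a final in-place multiplication and taking fan_in from the second element of the shape in the None branch are assumptions made for this sketch, not a description of how the library does it.

```python
import numpy as np


def init_weight_sketch(weights, initializer, scale=1.0):
    """NumPy-only imitation of the dispatch described above (hypothetical)."""
    if initializer is None:
        # Stand-in for the documented HeNormal default;
        # assumes a 2D weight with fan_in = weights.shape[1].
        std = np.sqrt(2.0 / weights.shape[1])
        weights[...] = np.random.normal(0.0, std, weights.shape)
    elif np.isscalar(initializer):
        weights[...] = initializer  # set all values to the scalar
    elif isinstance(initializer, np.ndarray):
        weights[...] = initializer  # assign (with broadcasting)
    elif callable(initializer):
        initializer(weights)  # callable edits the array in place
    else:
        raise TypeError("unsupported initializer: {!r}".format(initializer))
    # Assumption: scale is applied as a final in-place multiplication.
    weights *= scale


w_scalar = np.empty((2, 3), dtype=np.float32)
init_weight_sketch(w_scalar, 0.5, scale=2.0)

w_callable = np.empty((2, 3), dtype=np.float32)
init_weight_sketch(w_callable, lambda a: a.fill(3.0))

w_default = np.empty((4, 8), dtype=np.float32)
init_weight_sketch(w_default, None)
```

In this sketch, w_scalar ends up filled with 0.5 × 2.0 and w_callable with the value written by the callable, while w_default receives random Gaussian values.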