CuPy Overview

CuPy is an implementation of a NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions that operate on it. It supports a subset of the numpy.ndarray interface that is sufficient for Chainer.

The following is a brief overview of the supported subset of the NumPy interface:

  • Basic indexing (indexing by ints, slices, newaxes, and Ellipsis)
  • Element types (dtypes): bool_, (u)int{8, 16, 32, 64}, float{16, 32, 64}
  • Most of the array creation routines
  • Reshaping and transposition
  • All operators with broadcasting
  • All universal functions (a.k.a. ufuncs) for elementwise operations, except those for complex numbers
  • Dot product functions (except einsum) using cuBLAS
  • Reduction along axes (sum, max, argmax, etc.)
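The sketch below exercises a few items from this list: basic indexing, broadcasting operators, a reduction along an axis, and a dot product. It is illustrative only. The `xp` alias and the NumPy fallback are assumptions added here so the code also runs on a machine without a usable GPU; they are not part of CuPy itself.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)  # touching the device fails early when no usable GPU exists
except Exception:
    xp = np  # assumption: fall back to NumPy so the sketch still runs on CPU

a = xp.arange(12, dtype=xp.float32).reshape(3, 4)

# Basic indexing: ints, slices, newaxis, and Ellipsis.
first_row = a[0]                    # shape (4,)
last_col = a[..., -1]               # shape (3,)
col_vec = a[:, 1][:, xp.newaxis]    # shape (3, 1)

# Operators broadcast: (3, 4) minus (3, 1) gives (3, 4).
centered = a - col_vec

# Reduction along an axis, and a matrix product of a with its transpose.
row_sums = a.sum(axis=1)
gram = xp.dot(a, a.T)

print(centered.shape, row_sums.tolist(), gram.shape)
```

Because the same source runs against either module, code written this way can be checked on the CPU first and then moved to the GPU.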

CuPy also includes the following features for performance:

  • Customizable memory allocator, and a simple memory pool as an example
  • User-defined elementwise kernels
  • User-defined reduction kernels
  • cuDNN utilities
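As a rough illustration of a user-defined elementwise kernel, the sketch below computes a squared difference. It assumes the `cupy.ElementwiseKernel` class with the argument order (input parameters, output parameters, operation, name) found in recent CuPy releases; the version this overview describes may expose it differently. The `squared_diff` helper and the NumPy fallback branch are hypothetical additions so the example runs without a GPU.

```python
import numpy as np

try:
    import cupy as cp
    cp.zeros(1)  # fails early when no usable GPU is present
except Exception:
    cp = None  # assumption: no GPU available, use the CPU reference below


def squared_diff(x, y):
    """Return (x - y)**2 elementwise for float32 host arrays."""
    if cp is None:
        # CPU reference, used only so the sketch runs without a GPU.
        return (x - y) ** 2
    kernel = cp.ElementwiseKernel(
        'float32 x, float32 y',   # input parameters
        'float32 z',              # output parameter
        'z = (x - y) * (x - y)',  # body executed once per element
        'squared_diff')           # kernel name
    # Copy inputs to the device, run the kernel, copy the result back.
    return cp.asnumpy(kernel(cp.asarray(x), cp.asarray(y)))


x = np.array([1, 2, 3], dtype=np.float32)
y = np.array([3, 2, 0], dtype=np.float32)
out = squared_diff(x, y)
print(out)
```

Writing the operation as a single kernel avoids the temporary array that `(x - y) ** 2` would otherwise allocate between the subtraction and the power.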

CuPy uses on-the-fly kernel synthesis: when a kernel call is required, it compiles kernel code optimized for the shapes and dtypes of the given arguments, sends it to the GPU device, and executes the kernel. The compiled code is cached in the $(HOME)/.cupy/kernel_cache directory (this cache path can be overridden by setting the CUPY_CACHE_DIR environment variable). Compilation may make the first call of a kernel slow, but this slowdown disappears from the second execution onward because the compiled code is loaded from the cache. CuPy also caches the kernel code sent to the GPU device within the process, which reduces the kernel transfer time on further calls.
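The cache path lookup described above can be sketched as follows. `kernel_cache_dir` is a hypothetical helper written for this illustration and is not CuPy's own code; it only mirrors the stated rule that CUPY_CACHE_DIR, when set, takes precedence over the default directory under the home directory.

```python
import os


def kernel_cache_dir(environ=None):
    """Hypothetical helper: resolve the kernel cache directory.

    Uses CUPY_CACHE_DIR when it is set, otherwise the default
    ~/.cupy/kernel_cache location.
    """
    env = os.environ if environ is None else environ
    default = os.path.join(os.path.expanduser("~"), ".cupy", "kernel_cache")
    return env.get("CUPY_CACHE_DIR", default)


print(kernel_cache_dir({}))
print(kernel_cache_dir({"CUPY_CACHE_DIR": "/tmp/my_cupy_cache"}))
```

To redirect the cache in practice, setting the variable in the shell before the Python process starts is the safe choice, since it is then guaranteed to be visible whenever CuPy reads it.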

A list of supported attributes, properties, and methods of ndarray

Memory layout

base ctypes itemsize flags nbytes shape size strides

Data type

dtype

Other attributes

T

Array conversion

tolist() tofile() dump() dumps() astype() copy() view() fill()

Shape manipulation

reshape() transpose() swapaxes() ravel() squeeze()

Item selection and manipulation

take() diagonal()

Calculation

max() argmax() min() argmin() clip() trace() sum() mean() var() std() prod() dot()

Arithmetic and comparison operations

__lt__() __le__() __gt__() __ge__() __eq__() __ne__() __nonzero__() __neg__() __pos__() __abs__() __invert__() __add__() __sub__() __mul__() __div__() __truediv__() __floordiv__() __mod__() __divmod__() __pow__() __lshift__() __rshift__() __and__() __or__() __xor__() __iadd__() __isub__() __imul__() __idiv__() __itruediv__() __ifloordiv__() __imod__() __ipow__() __ilshift__() __irshift__() __iand__() __ior__() __ixor__()

Special methods

__copy__() __deepcopy__() __reduce__() __array__() __len__() __getitem__() __setitem__() __int__() __long__() __float__() __oct__() __hex__() __repr__() __str__()

Memory transfer

get() set()
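A minimal sketch of the two transfer methods: get() copies a device array to a new NumPy array on the host, and set() overwrites an existing device array with the contents of a host array of the same shape and dtype. The `HAVE_GPU` flag and the CPU branch are assumptions added so the example runs on a machine without a GPU.

```python
import numpy as np

try:
    import cupy as cp
    cp.zeros(1)  # fails early when no usable GPU is present
    HAVE_GPU = True
except Exception:
    cp = None
    HAVE_GPU = False

host = np.arange(6, dtype=np.float32).reshape(2, 3)

if HAVE_GPU:
    dev = cp.asarray(host)   # host -> device copy
    dev *= 2                 # computed on the GPU, in place
    result = dev.get()       # device -> host, returns a numpy.ndarray
    dev.set(host)            # host -> device into the existing buffer
    restored = cp.asnumpy(dev)
else:
    # CPU stand-in so the sketch still produces the same values.
    result = host * 2
    restored = host.copy()

print(result)
print(restored)
```

Each transfer crosses the host-device boundary, so it is usually worth keeping intermediate results on the device and calling get() only for the final values.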

A list of supported routines of cupy module

Array creation routines

empty() empty_like() eye() identity() ones() ones_like() zeros() zeros_like() full() full_like()

array() asarray() ascontiguousarray() copy()

arange() linspace()

diag() diagflat()

Array manipulation routines

copyto()

reshape() ravel()

rollaxis() swapaxes() transpose()

atleast_1d() atleast_2d() atleast_3d() broadcast broadcast_arrays() broadcast_to() expand_dims() squeeze()

column_stack() concatenate() dstack() hstack() vstack()

array_split() dsplit() hsplit() split() vsplit()

roll()

Binary operations

bitwise_and bitwise_or bitwise_xor invert left_shift right_shift

Indexing routines

take() diagonal()

Input and output

load() save() savez() savez_compressed()

array_repr() array_str()

Linear algebra

dot() vdot() inner() outer() tensordot()

trace()
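The routines in this group follow their NumPy counterparts. The sketch below is illustrative; the `xp` alias and the NumPy fallback are assumptions added so it also runs without a GPU.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)  # fails early when no usable GPU is present
except Exception:
    xp = np  # assumption: fall back to NumPy on a CPU-only machine

a = xp.arange(6, dtype=xp.float32).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
v = xp.array([1, 0, 2], dtype=xp.float32)

mat_vec = xp.dot(a, v)                             # matrix-vector product
outer = xp.outer(v, v)                             # shape (3, 3)
# Contract the second axis of both operands, equivalent to dot(a, a.T).
contracted = xp.tensordot(a, a, axes=([1], [1]))
tr = xp.trace(contracted)                          # sum of the diagonal

print(mat_vec.tolist(), contracted.tolist(), float(tr))
```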

Logic functions

isfinite isinf isnan

logical_and logical_or logical_not logical_xor

greater greater_equal less less_equal equal not_equal

Mathematical functions

sin cos tan arcsin arccos arctan hypot arctan2 deg2rad rad2deg degrees radians

sinh cosh tanh arcsinh arccosh arctanh

rint floor ceil trunc

sum() prod()

exp expm1 exp2 log log10 log2 log1p logaddexp logaddexp2

signbit copysign ldexp frexp nextafter

add reciprocal negative multiply divide power subtract true_divide floor_divide fmod mod modf remainder

clip() sqrt square absolute sign maximum minimum fmax fmin

Sorting, searching, and counting

argmax() argmin() count_nonzero() where()

Statistics

amin() amax()

mean() var() std()

bincount()

Other

asnumpy()