1 Tensor
torch.Tensor is a multi-dimensional matrix containing elements of a single data type.
Torch defines the following CPU and GPU tensor types:
| Data type | dtype | CPU tensor | GPU tensor |
|---|---|---|---|
| 32-bit floating point | torch.float32 or torch.float | torch.FloatTensor | torch.cuda.FloatTensor |
| 64-bit floating point | torch.float64 or torch.double | torch.DoubleTensor | torch.cuda.DoubleTensor |
| 16-bit floating point [1] | torch.float16 or torch.half | torch.HalfTensor | torch.cuda.HalfTensor |
| 16-bit floating point [2] | torch.bfloat16 | torch.BFloat16Tensor | torch.cuda.BFloat16Tensor |
| 32-bit complex | torch.complex32 or torch.chalf | | |
| 64-bit complex | torch.complex64 or torch.cfloat | | |
| 128-bit complex | torch.complex128 or torch.cdouble | | |
| 8-bit integer (unsigned) | torch.uint8 | torch.ByteTensor | torch.cuda.ByteTensor |
| 8-bit integer (signed) | torch.int8 | torch.CharTensor | torch.cuda.CharTensor |
| 16-bit integer (signed) | torch.int16 or torch.short | torch.ShortTensor | torch.cuda.ShortTensor |
| 32-bit integer (signed) | torch.int32 or torch.int | torch.IntTensor | torch.cuda.IntTensor |
| 64-bit integer (signed) | torch.int64 or torch.long | torch.LongTensor | torch.cuda.LongTensor |
| Boolean | torch.bool | torch.BoolTensor | torch.cuda.BoolTensor |
| quantized 8-bit integer (unsigned) | torch.quint8 | torch.ByteTensor | / |
| quantized 8-bit integer (signed) | torch.qint8 | torch.CharTensor | / |
| quantized 32-bit integer (signed) | torch.qint32 | torch.IntTensor | / |
| quantized 4-bit integer (unsigned) [3] | torch.quint4x2 | torch.ByteTensor | / |
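As a quick illustration of how a dtype is chosen at creation time and converted afterwards (a minimal sketch; the shapes and values are arbitrary):
import torch
x = torch.zeros(2, 3, dtype=torch.int16)  # specify the dtype at creation
print(x.dtype)                            # torch.int16
y = x.to(torch.float32)                   # convert to another dtype
print(y.dtype)                            # torch.float32
print(x.float().dtype)                    # shorthand conversion, also torch.float32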
Creation
A tensor can be constructed from a Python list or sequence:
torch.FloatTensor([[1, 2, 3], [4, 5, 6]])
Out[0]:
tensor([[1., 2., 3.],
[4., 5., 6.]])
A new tensor can also be built from an optional size and data. If no arguments are given, an empty tensor is returned. If a numpy.ndarray, torch.Tensor or torch.Storage is given, a tensor with the same data is returned. If a Python sequence is given, a new tensor is created from a copy of that sequence.
# Interfaces: an empty tensor can be constructed by specifying its size
class torch.Tensor
class torch.Tensor(*sizes)
class torch.Tensor(size)
class torch.Tensor(sequence)
class torch.Tensor(ndarray)
class torch.Tensor(tensor)
class torch.Tensor(storage)
# Instantiation
torch.IntTensor(2, 4).zero_()
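For reference, newer PyTorch code usually prefers the factory functions (torch.tensor, torch.zeros, torch.empty, ...) with an explicit dtype argument over the legacy type-specific constructors; a small sketch of an equivalent call:
import torch
a = torch.IntTensor(2, 4).zero_()          # legacy constructor style
b = torch.zeros(2, 4, dtype=torch.int32)   # equivalent factory-function style (recommended)
print(torch.equal(a, b))                   # True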
Python indexing and slicing can be used to read and modify the contents of a tensor:
x = torch.FloatTensor([[1, 2, 3], [4, 5, 6]])
x[1][2]
Out[0]: tensor(6.)
x[0][1] = 8
x
Out[1]:
tensor([[1., 8., 3.],
[4., 5., 6.]])
Every tensor has a corresponding torch.Storage that holds its data. The Tensor class provides a multi-dimensional, strided view of that storage and defines numeric operations on it.
Operations that mutate a tensor in place are marked with a trailing underscore. For example, torch.FloatTensor.abs_() computes the absolute value in place and returns the modified tensor, while torch.FloatTensor.abs() computes the result in a new tensor.
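A minimal sketch of the difference (values are arbitrary):
import torch
t = torch.FloatTensor([-1.0, 2.0, -3.0])
out = t.abs()   # out-of-place: returns a new tensor, t is unchanged
print(t)        # tensor([-1.,  2., -3.])
t.abs_()        # in-place: modifies t itself and returns it
print(t)        # tensor([1., 2., 3.])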
Key attributes and methods
| Attribute / method | Description |
|---|---|
| Tensor.new_tensor | Returns a new Tensor with data as the tensor data. |
| Tensor.new_full | Returns a Tensor of size size filled with fill_value. |
| Tensor.new_empty | Returns a Tensor of size size filled with uninitialized data. |
| Tensor.new_ones | Returns a Tensor of size size filled with 1. |
| Tensor.new_zeros | Returns a Tensor of size size filled with 0. |
| Tensor.is_cuda | Is True if the Tensor is stored on the GPU, False otherwise. |
| Tensor.is_quantized | Is True if the Tensor is quantized, False otherwise. |
| Tensor.is_meta | Is True if the Tensor is a meta tensor, False otherwise. |
| Tensor.device | Is the torch.device where this Tensor is. |
| Tensor.grad | This attribute is None by default and becomes a Tensor the first time a call to backward() computes gradients for self. |
| Tensor.ndim | Alias for dim() |
| Tensor.real | Returns a new tensor containing real values of the self tensor for a complex-valued input tensor. |
| Tensor.imag | Returns a new tensor containing imaginary values of the self tensor. |
| Tensor.abs | See torch.abs() |
| Tensor.abs_ | In-place version of abs() |
| Tensor.absolute | Alias for abs() |
| Tensor.absolute_ | In-place version of absolute(); alias for abs_() |
| Tensor.acos | See torch.acos() |
| Tensor.acos_ | In-place version of acos() |
| Tensor.arccos | See torch.arccos() |
| Tensor.arccos_ | In-place version of arccos() |
| Tensor.add | Add a scalar or tensor to self tensor. |
| Tensor.add_ | In-place version of add() |
| Tensor.addbmm | See torch.addbmm() |
| Tensor.addbmm_ | In-place version of addbmm() |
| Tensor.addcdiv | See torch.addcdiv() |
| Tensor.addcdiv_ | In-place version of addcdiv() |
| Tensor.addcmul | See torch.addcmul() |
| Tensor.addcmul_ | In-place version of addcmul() |
| Tensor.addmm | See torch.addmm() |
| Tensor.addmm_ | In-place version of addmm() |
| Tensor.sspaddmm | See torch.sspaddmm() |
| Tensor.addmv | See torch.addmv() |
| Tensor.addmv_ | In-place version of addmv() |
| Tensor.addr | See torch.addr() |
| Tensor.addr_ | In-place version of addr() |
| Tensor.adjoint | Alias for mH |
| Tensor.allclose | See torch.allclose() |
| Tensor.amax | See torch.amax() |
| Tensor.amin | See torch.amin() |
| Tensor.aminmax | See torch.aminmax() |
| Tensor.angle | See torch.angle() |
| Tensor.apply_ | Applies the function callable to each element in the tensor, replacing each element with the value returned by callable. |
| Tensor.argmax | See torch.argmax() |
| Tensor.argmin | See torch.argmin() |
| Tensor.argsort | See torch.argsort() |
| Tensor.argwhere | See torch.argwhere() |
| Tensor.asin | See torch.asin() |
| Tensor.asin_ | In-place version of asin() |
| Tensor.arcsin | See torch.arcsin() |
| Tensor.arcsin_ | In-place version of arcsin() |
| Tensor.as_strided | See torch.as_strided() |
| Tensor.atan | See torch.atan() |
| Tensor.atan_ | In-place version of atan() |
| Tensor.arctan | See torch.arctan() |
| Tensor.arctan_ | In-place version of arctan() |
| Tensor.atan2 | See torch.atan2() |
| Tensor.atan2_ | In-place version of atan2() |
| Tensor.arctan2 | See torch.arctan2() |
| Tensor.arctan2_ | Alias for atan2_() |
| Tensor.all | See torch.all() |
| Tensor.any | See torch.any() |
| Tensor.backward | Computes the gradient of current tensor w.r.t. graph leaves. |
| Tensor.baddbmm | See torch.baddbmm() |
| Tensor.baddbmm_ | In-place version of baddbmm() |
| Tensor.bernoulli | Returns a result tensor where each result[i] is independently sampled from Bernoulli(self[i]). |
| Tensor.bernoulli_ | Fills each location of self with an independent sample from Bernoulli(p). |
| Tensor.bfloat16 | self.bfloat16() is equivalent to self.to(torch.bfloat16). |
| Tensor.bincount | See torch.bincount() |
| Tensor.bitwise_not | See torch.bitwise_not() |
| Tensor.bitwise_not_ | In-place version of bitwise_not() |
| Tensor.bitwise_and | See torch.bitwise_and() |
| Tensor.bitwise_and_ | In-place version of bitwise_and() |
| Tensor.bitwise_or | See torch.bitwise_or() |
| Tensor.bitwise_or_ | In-place version of bitwise_or() |
| Tensor.bitwise_xor | See torch.bitwise_xor() |
| Tensor.bitwise_xor_ | In-place version of bitwise_xor() |
| Tensor.bitwise_left_shift | See torch.bitwise_left_shift() |
| Tensor.bitwise_left_shift_ | In-place version of bitwise_left_shift() |
| Tensor.bitwise_right_shift | See torch.bitwise_right_shift() |
| Tensor.bitwise_right_shift_ | In-place version of bitwise_right_shift() |
| Tensor.bmm | See torch.bmm() |
| Tensor.bool | self.bool() is equivalent to self.to(torch.bool). |
| Tensor.byte | self.byte() is equivalent to self.to(torch.uint8). |
| Tensor.broadcast_to | See torch.broadcast_to(). |
| Tensor.cauchy_ | Fills the tensor with numbers drawn from the Cauchy distribution. |
| Tensor.ceil | See torch.ceil() |
| Tensor.ceil_ | In-place version of ceil() |
| Tensor.char | self.char() is equivalent to self.to(torch.int8). |
| Tensor.cholesky | See torch.cholesky() |
| Tensor.cholesky_inverse | See torch.cholesky_inverse() |
| Tensor.cholesky_solve | See torch.cholesky_solve() |
| Tensor.chunk | See torch.chunk() |
| Tensor.clamp | See torch.clamp() |
| Tensor.clamp_ | In-place version of clamp() |
| Tensor.clip | Alias for clamp(). |
| Tensor.clip_ | Alias for clamp_(). |
| Tensor.clone | See torch.clone() |
| Tensor.contiguous | Returns a contiguous in memory tensor containing the same data as self tensor. |
| Tensor.copy_ | Copies the elements from src into self tensor and returns self. |
| Tensor.conj | See torch.conj() |
| Tensor.conj_physical | See torch.conj_physical() |
| Tensor.conj_physical_ | In-place version of conj_physical() |
| Tensor.resolve_conj | See torch.resolve_conj() |
| Tensor.resolve_neg | See torch.resolve_neg() |
| Tensor.copysign | See torch.copysign() |
| Tensor.copysign_ | In-place version of copysign() |
| Tensor.cos | See torch.cos() |
| Tensor.cos_ | In-place version of cos() |
| Tensor.cosh | See torch.cosh() |
| Tensor.cosh_ | In-place version of cosh() |
| Tensor.corrcoef | See torch.corrcoef() |
| Tensor.count_nonzero | See torch.count_nonzero() |
| Tensor.cov | See torch.cov() |
| Tensor.acosh | See torch.acosh() |
| Tensor.acosh_ | In-place version of acosh() |
| Tensor.arccosh | Alias for acosh() |
| Tensor.arccosh_ | Alias for acosh_() |
| Tensor.cpu | Returns a copy of this object in CPU memory. |
| Tensor.cross | See torch.cross() |
| Tensor.cuda | Returns a copy of this object in CUDA memory. |
| Tensor.logcumsumexp | See torch.logcumsumexp() |
| Tensor.cummax | See torch.cummax() |
| Tensor.cummin | See torch.cummin() |
| Tensor.cumprod | See torch.cumprod() |
| Tensor.cumprod_ | In-place version of cumprod() |
| Tensor.cumsum | See torch.cumsum() |
| Tensor.cumsum_ | In-place version of cumsum() |
| Tensor.chalf | self.chalf() is equivalent to self.to(torch.complex32). |
| Tensor.cfloat | self.cfloat() is equivalent to self.to(torch.complex64). |
| Tensor.cdouble | self.cdouble() is equivalent to self.to(torch.complex128). |
| Tensor.data_ptr | Returns the address of the first element of self tensor. |
| Tensor.deg2rad | See torch.deg2rad() |
| Tensor.dequantize | Given a quantized Tensor, dequantize it and return the dequantized float Tensor. |
| Tensor.det | See torch.det() |
| Tensor.dense_dim | Return the number of dense dimensions in a sparse tensor self. |
| Tensor.detach | Returns a new Tensor, detached from the current graph. |
| Tensor.detach_ | Detaches the Tensor from the graph that created it, making it a leaf. |
| Tensor.diag | See torch.diag() |
| Tensor.diag_embed | See torch.diag_embed() |
| Tensor.diagflat | See torch.diagflat() |
| Tensor.diagonal | See torch.diagonal() |
| Tensor.diagonal_scatter | See torch.diagonal_scatter() |
| Tensor.fill_diagonal_ | Fill the main diagonal of a tensor that has at least 2-dimensions. |
| Tensor.fmax | See torch.fmax() |
| Tensor.fmin | See torch.fmin() |
| Tensor.diff | See torch.diff() |
| Tensor.digamma | See torch.digamma() |
| Tensor.digamma_ | In-place version of digamma() |
| Tensor.dim | Returns the number of dimensions of self tensor. |
| Tensor.dist | See torch.dist() |
| Tensor.div | See torch.div() |
| Tensor.div_ | In-place version of div() |
| Tensor.divide | See torch.divide() |
| Tensor.divide_ | In-place version of divide() |
| Tensor.dot | See torch.dot() |
| Tensor.double | self.double() is equivalent to self.to(torch.float64). |
| Tensor.dsplit | See torch.dsplit() |
| Tensor.element_size | Returns the size in bytes of an individual element. |
| Tensor.eq | See torch.eq() |
| Tensor.eq_ | In-place version of eq() |
| Tensor.equal | See torch.equal() |
| Tensor.erf | See torch.erf() |
| Tensor.erf_ | In-place version of erf() |
| Tensor.erfc | See torch.erfc() |
| Tensor.erfc_ | In-place version of erfc() |
| Tensor.erfinv | See torch.erfinv() |
| Tensor.erfinv_ | In-place version of erfinv() |
| Tensor.exp | See torch.exp() |
| Tensor.exp_ | In-place version of exp() |
| Tensor.expm1 | See torch.expm1() |
| Tensor.expm1_ | In-place version of expm1() |
| Tensor.expand | Returns a new view of the self tensor with singleton dimensions expanded to a larger size. |
| Tensor.expand_as | Expand this tensor to the same size as other. |
| Tensor.exponential_ | Fills self tensor with elements drawn from the exponential distribution. |
| Tensor.fix | See torch.fix(). |
| Tensor.fix_ | In-place version of fix() |
| Tensor.fill_ | Fills self tensor with the specified value. |
| Tensor.flatten | See torch.flatten() |
| Tensor.flip | See torch.flip() |
| Tensor.fliplr | See torch.fliplr() |
| Tensor.flipud | See torch.flipud() |
| Tensor.float | self.float() is equivalent to self.to(torch.float32). |
| Tensor.float_power | See torch.float_power() |
| Tensor.float_power_ | In-place version of float_power() |
| Tensor.floor | See torch.floor() |
| Tensor.floor_ | In-place version of floor() |
| Tensor.floor_divide | See torch.floor_divide() |
| Tensor.floor_divide_ | In-place version of floor_divide() |
| Tensor.fmod | See torch.fmod() |
| Tensor.fmod_ | In-place version of fmod() |
| Tensor.frac | See torch.frac() |
| Tensor.frac_ | In-place version of frac() |
| Tensor.frexp | See torch.frexp() |
| Tensor.gather | See torch.gather() |
| Tensor.gcd | See torch.gcd() |
| Tensor.gcd_ | In-place version of gcd() |
| Tensor.ge | See torch.ge(). |
| Tensor.ge_ | In-place version of ge(). |
| Tensor.greater_equal | See torch.greater_equal(). |
| Tensor.greater_equal_ | In-place version of greater_equal(). |
| Tensor.geometric_ | Fills self tensor with elements drawn from the geometric distribution. |
| Tensor.geqrf | See torch.geqrf() |
| Tensor.ger | See torch.ger() |
| Tensor.get_device | For CUDA tensors, this function returns the device ordinal of the GPU on which the tensor resides. |
| Tensor.gt | See torch.gt(). |
| Tensor.gt_ | In-place version of gt(). |
| Tensor.greater | See torch.greater(). |
| Tensor.greater_ | In-place version of greater(). |
| Tensor.half | self.half() is equivalent to self.to(torch.float16). |
| Tensor.hardshrink | See torch.nn.functional.hardshrink() |
| Tensor.heaviside | See torch.heaviside() |
| Tensor.histc | See torch.histc() |
| Tensor.histogram | See torch.histogram() |
| Tensor.hsplit | See torch.hsplit() |
| Tensor.hypot | See torch.hypot() |
| Tensor.hypot_ | In-place version of hypot() |
| Tensor.i0 | See torch.i0() |
| Tensor.i0_ | In-place version of i0() |
| Tensor.igamma | See torch.igamma() |
| Tensor.igamma_ | In-place version of igamma() |
| Tensor.igammac | See torch.igammac() |
| Tensor.igammac_ | In-place version of igammac() |
| Tensor.index_add_ | Accumulate the elements of alpha times source into the self tensor by adding to the indices in the order given in index. |
| Tensor.index_add | Out-of-place version of torch.Tensor.index_add_(). |
| Tensor.index_copy_ | Copies the elements of tensor into the self tensor by selecting the indices in the order given in index. |
| Tensor.index_copy | Out-of-place version of torch.Tensor.index_copy_(). |
| Tensor.index_fill_ | Fills the elements of the self tensor with value value by selecting the indices in the order given in index. |
| Tensor.index_fill | Out-of-place version of torch.Tensor.index_fill_(). |
| Tensor.index_put_ | Puts values from the tensor values into the tensor self using the indices specified in indices (which is a tuple of Tensors). |
| Tensor.index_put | Out-place version of index_put_(). |
| Tensor.index_reduce_ | Accumulate the elements of source into the self tensor by accumulating to the indices in the order given in index using the reduction given by the reduce argument. |
| Tensor.index_reduce | Out-of-place version of torch.Tensor.index_reduce_(). |
| Tensor.index_select | See torch.index_select() |
| Tensor.indices | Return the indices tensor of a sparse COO tensor. |
| Tensor.inner | See torch.inner(). |
| Tensor.int | self.int() is equivalent to self.to(torch.int32). |
| Tensor.int_repr | Given a quantized Tensor, self.int_repr() returns a CPU Tensor with uint8_t as data type that stores the underlying uint8_t values of the given Tensor. |
| Tensor.inverse | See torch.inverse() |
| Tensor.isclose | See torch.isclose() |
| Tensor.isfinite | See torch.isfinite() |
| Tensor.isinf | See torch.isinf() |
| Tensor.isposinf | See torch.isposinf() |
| Tensor.isneginf | See torch.isneginf() |
| Tensor.isnan | See torch.isnan() |
| Tensor.is_contiguous | Returns True if self tensor is contiguous in memory in the order specified by memory format. |
| Tensor.is_complex | Returns True if the data type of self is a complex data type. |
| Tensor.is_conj | Returns True if the conjugate bit of self is set to true. |
| Tensor.is_floating_point | Returns True if the data type of self is a floating point data type. |
| Tensor.is_inference | See torch.is_inference() |
| Tensor.is_leaf | All Tensors that have requires_grad which is False will be leaf Tensors by convention. |
| Tensor.is_pinned | Returns true if this tensor resides in pinned memory. |
| Tensor.is_set_to | Returns True if both tensors are pointing to the exact same memory (same storage, offset, size and stride). |
| Tensor.is_shared | Checks if tensor is in shared memory. |
| Tensor.is_signed | Returns True if the data type of self is a signed data type. |
| Tensor.is_sparse | Is True if the Tensor uses sparse storage layout, False otherwise. |
| Tensor.istft | See torch.istft() |
| Tensor.isreal | See torch.isreal() |
| Tensor.item | Returns the value of this tensor as a standard Python number. |
| Tensor.kthvalue | See torch.kthvalue() |
| Tensor.lcm | See torch.lcm() |
| Tensor.lcm_ | In-place version of lcm() |
| Tensor.ldexp | See torch.ldexp() |
| Tensor.ldexp_ | In-place version of ldexp() |
| Tensor.le | See torch.le(). |
| Tensor.le_ | In-place version of le(). |
| Tensor.less_equal | See torch.less_equal(). |
| Tensor.less_equal_ | In-place version of less_equal(). |
| Tensor.lerp | See torch.lerp() |
| Tensor.lerp_ | In-place version of lerp() |
| Tensor.lgamma | See torch.lgamma() |
| Tensor.lgamma_ | In-place version of lgamma() |
| Tensor.log | See torch.log() |
| Tensor.log_ | In-place version of log() |
| Tensor.logdet | See torch.logdet() |
| Tensor.log10 | See torch.log10() |
| Tensor.log10_ | In-place version of log10() |
| Tensor.log1p | See torch.log1p() |
| Tensor.log1p_ | In-place version of log1p() |
| Tensor.log2 | See torch.log2() |
| Tensor.log2_ | In-place version of log2() |
| Tensor.log_normal_ | Fills self tensor with numbers sampled from the log-normal distribution parameterized by the given mean μ and standard deviation σ. |
| Tensor.logaddexp | See torch.logaddexp() |
| Tensor.logaddexp2 | See torch.logaddexp2() |
| Tensor.logsumexp | See torch.logsumexp() |
| Tensor.logical_and | See torch.logical_and() |
| Tensor.logical_and_ | In-place version of logical_and() |
| Tensor.logical_not | See torch.logical_not() |
| Tensor.logical_not_ | In-place version of logical_not() |
| Tensor.logical_or | See torch.logical_or() |
| Tensor.logical_or_ | In-place version of logical_or() |
| Tensor.logical_xor | See torch.logical_xor() |
| Tensor.logical_xor_ | In-place version of logical_xor() |
| Tensor.logit | See torch.logit() |
| Tensor.logit_ | In-place version of logit() |
| Tensor.long | self.long() is equivalent to self.to(torch.int64). |
| Tensor.lt | See torch.lt(). |
| Tensor.lt_ | In-place version of lt(). |
| Tensor.less | Alias for lt() |
| Tensor.less_ | In-place version of less(). |
| Tensor.lu | See torch.lu() |
| Tensor.lu_solve | See torch.lu_solve() |
| Tensor.as_subclass | Makes a cls instance with the same data pointer as self. |
| Tensor.map_ | Applies callable for each element in self tensor and the given tensor and stores the results in self tensor. |
| Tensor.masked_scatter_ | Copies elements from source into self tensor at positions where the mask is True. |
| Tensor.masked_scatter | Out-of-place version of torch.Tensor.masked_scatter_() |
| Tensor.masked_fill_ | Fills elements of self tensor with value where mask is True. |
| Tensor.masked_fill | Out-of-place version of torch.Tensor.masked_fill_() |
| Tensor.masked_select | See torch.masked_select() |
| Tensor.matmul | See torch.matmul() |
| Tensor.matrix_power | NOTE: matrix_power() is deprecated, use torch.linalg.matrix_power() instead. |
| Tensor.matrix_exp | See torch.matrix_exp() |
| Tensor.max | See torch.max() |
| Tensor.maximum | See torch.maximum() |
| Tensor.mean | See torch.mean() |
| Tensor.nanmean | See torch.nanmean() |
| Tensor.median | See torch.median() |
| Tensor.nanmedian | See torch.nanmedian() |
| Tensor.min | See torch.min() |
| Tensor.minimum | See torch.minimum() |
| Tensor.mm | See torch.mm() |
| Tensor.smm | See torch.smm() |
| Tensor.mode | See torch.mode() |
| Tensor.movedim | See torch.movedim() |
| Tensor.moveaxis | See torch.moveaxis() |
| Tensor.msort | See torch.msort() |
| Tensor.mul | See torch.mul(). |
| Tensor.mul_ | In-place version of mul(). |
| Tensor.multiply | See torch.multiply(). |
| Tensor.multiply_ | In-place version of multiply(). |
| Tensor.multinomial | See torch.multinomial() |
| Tensor.mv | See torch.mv() |
| Tensor.mvlgamma | See torch.mvlgamma() |
| Tensor.mvlgamma_ | In-place version of mvlgamma() |
| Tensor.nansum | See torch.nansum() |
| Tensor.narrow | See torch.narrow() |
| Tensor.narrow_copy | See torch.narrow_copy(). |
| Tensor.ndimension | Alias for dim() |
| Tensor.nan_to_num | See torch.nan_to_num(). |
| Tensor.nan_to_num_ | In-place version of nan_to_num(). |
| Tensor.ne | See torch.ne(). |
| Tensor.ne_ | In-place version of ne(). |
| Tensor.not_equal | See torch.not_equal(). |
| Tensor.not_equal_ | In-place version of not_equal(). |
| Tensor.neg | See torch.neg() |
| Tensor.neg_ | In-place version of neg() |
| Tensor.negative | See torch.negative() |
| Tensor.negative_ | In-place version of negative() |
| Tensor.nelement | Alias for numel() |
| Tensor.nextafter | See torch.nextafter() |
| Tensor.nextafter_ | In-place version of nextafter() |
| Tensor.nonzero | See torch.nonzero() |
| Tensor.norm | See torch.norm() |
| Tensor.normal_ | Fills self tensor with elements sampled from the normal distribution parameterized by mean and std. |
| Tensor.numel | See torch.numel() |
| Tensor.numpy | Returns the tensor as a NumPy ndarray. |
| Tensor.orgqr | See torch.orgqr() |
| Tensor.ormqr | See torch.ormqr() |
| Tensor.outer | See torch.outer(). |
| Tensor.permute | See torch.permute() |
| Tensor.pin_memory | Copies the tensor to pinned memory, if it's not already pinned. |
| Tensor.pinverse | See torch.pinverse() |
| Tensor.polygamma | See torch.polygamma() |
| Tensor.polygamma_ | In-place version of polygamma() |
| Tensor.positive | See torch.positive() |
| Tensor.pow | See torch.pow() |
| Tensor.pow_ | In-place version of pow() |
| Tensor.prod | See torch.prod() |
| Tensor.put_ | Copies the elements from source into the positions specified by index. |
| Tensor.qr | See torch.qr() |
| Tensor.qscheme | Returns the quantization scheme of a given QTensor. |
| Tensor.quantile | See torch.quantile() |
| Tensor.nanquantile | See torch.nanquantile() |
| Tensor.q_scale | Given a Tensor quantized by linear (affine) quantization, returns the scale of the underlying quantizer. |
| Tensor.q_zero_point | Given a Tensor quantized by linear (affine) quantization, returns the zero_point of the underlying quantizer. |
| Tensor.q_per_channel_scales | Given a Tensor quantized by linear (affine) per-channel quantization, returns a Tensor of scales of the underlying quantizer. |
| Tensor.q_per_channel_zero_points | Given a Tensor quantized by linear (affine) per-channel quantization, returns a tensor of zero_points of the underlying quantizer. |
| Tensor.q_per_channel_axis | Given a Tensor quantized by linear (affine) per-channel quantization, returns the index of dimension on which per-channel quantization is applied. |
| Tensor.rad2deg | See torch.rad2deg() |
| Tensor.random_ | Fills self tensor with numbers sampled from the discrete uniform distribution over [from, to - 1]. |
| Tensor.ravel | See torch.ravel() |
| Tensor.reciprocal | See torch.reciprocal() |
| Tensor.reciprocal_ | In-place version of reciprocal() |
| Tensor.record_stream | Ensures that the tensor memory is not reused for another tensor until all current work queued on stream are complete. |
| Tensor.register_hook | Registers a backward hook. |
| Tensor.remainder | See torch.remainder() |
| Tensor.remainder_ | In-place version of remainder() |
| Tensor.renorm | See torch.renorm() |
| Tensor.renorm_ | In-place version of renorm() |
| Tensor.repeat | Repeats this tensor along the specified dimensions. |
| Tensor.repeat_interleave | See torch.repeat_interleave(). |
| Tensor.requires_grad | Is True if gradients need to be computed for this Tensor, False otherwise. |
| Tensor.requires_grad_ | Change if autograd should record operations on this tensor: sets this tensor's requires_grad attribute in-place. |
| Tensor.reshape | Returns a tensor with the same data and number of elements as self but with the specified shape. |
| Tensor.reshape_as | Returns this tensor as the same shape as other. |
| Tensor.resize_ | Resizes self tensor to the specified size. |
| Tensor.resize_as_ | Resizes the self tensor to be the same size as the specified tensor. |
| Tensor.retain_grad | Enables this Tensor to have their grad populated during backward(). |
| Tensor.retains_grad | Is True if this Tensor is non-leaf and its grad is enabled to be populated during backward(), False otherwise. |
| Tensor.roll | See torch.roll() |
| Tensor.rot90 | See torch.rot90() |
| Tensor.round | See torch.round() |
| Tensor.round_ | In-place version of round() |
| Tensor.rsqrt | See torch.rsqrt() |
| Tensor.rsqrt_ | In-place version of rsqrt() |
| Tensor.scatter | Out-of-place version of torch.Tensor.scatter_() |
| Tensor.scatter_ | Writes all values from the tensor src into self at the indices specified in the index tensor. |
| Tensor.scatter_add_ | Adds all values from the tensor src into self at the indices specified in the index tensor in a similar fashion as scatter_(). |
| Tensor.scatter_add | Out-of-place version of torch.Tensor.scatter_add_() |
| Tensor.scatter_reduce_ | Reduces all values from the src tensor to the indices specified in the index tensor in the self tensor using the applied reduction defined via the reduce argument ("sum", "prod", "mean", "amax", "amin"). |
| Tensor.scatter_reduce | Out-of-place version of torch.Tensor.scatter_reduce_() |
| Tensor.select | See torch.select() |
| Tensor.select_scatter | See torch.select_scatter() |
| Tensor.set_ | Sets the underlying storage, size, and strides. |
| Tensor.share_memory_ | Moves the underlying storage to shared memory. |
| Tensor.short | self.short() is equivalent to self.to(torch.int16). |
| Tensor.sigmoid | See torch.sigmoid() |
| Tensor.sigmoid_ | In-place version of sigmoid() |
| Tensor.sign | See torch.sign() |
| Tensor.sign_ | In-place version of sign() |
| Tensor.signbit | See torch.signbit() |
| Tensor.sgn | See torch.sgn() |
| Tensor.sgn_ | In-place version of sgn() |
| Tensor.sin | See torch.sin() |
| Tensor.sin_ | In-place version of sin() |
| Tensor.sinc | See torch.sinc() |
| Tensor.sinc_ | In-place version of sinc() |
| Tensor.sinh | See torch.sinh() |
| Tensor.sinh_ | In-place version of sinh() |
| Tensor.asinh | See torch.asinh() |
| Tensor.asinh_ | In-place version of asinh() |
| Tensor.arcsinh | See torch.arcsinh() |
| Tensor.arcsinh_ | In-place version of arcsinh() |
| Tensor.size | Returns the size of the self tensor. |
| Tensor.slogdet | See torch.slogdet() |
| Tensor.slice_scatter | See torch.slice_scatter() |
| Tensor.sort | See torch.sort() |
| Tensor.split | See torch.split() |
| Tensor.sparse_mask | Returns a new sparse tensor with values from a strided tensor self filtered by the indices of the sparse tensor mask. |
| Tensor.sparse_dim | Return the number of sparse dimensions in a sparse tensor self. |
| Tensor.sqrt | See torch.sqrt() |
| Tensor.sqrt_ | In-place version of sqrt() |
| Tensor.square | See torch.square() |
| Tensor.square_ | In-place version of square() |
| Tensor.squeeze | See torch.squeeze() |
| Tensor.squeeze_ | In-place version of squeeze() |
| Tensor.std | See torch.std() |
| Tensor.stft | See torch.stft() |
| Tensor.storage | Returns the underlying storage. |
| Tensor.storage_offset | Returns self tensor's offset in the underlying storage in terms of number of storage elements (not bytes). |
| Tensor.storage_type | Returns the type of the underlying storage. |
| Tensor.stride | Returns the stride of self tensor. |
| Tensor.sub | See torch.sub(). |
| Tensor.sub_ | In-place version of sub() |
| Tensor.subtract | See torch.subtract(). |
| Tensor.subtract_ | In-place version of subtract(). |
| Tensor.sum | See torch.sum() |
| Tensor.sum_to_size | Sum this tensor to size. |
| Tensor.svd | See torch.svd() |
| Tensor.swapaxes | See torch.swapaxes() |
| Tensor.swapdims | See torch.swapdims() |
| Tensor.symeig | See torch.symeig() |
| Tensor.t | See torch.t() |
| Tensor.t_ | In-place version of t() |
| Tensor.tensor_split | See torch.tensor_split() |
| Tensor.tile | See torch.tile() |
| Tensor.to | Performs Tensor dtype and/or device conversion. |
| Tensor.to_mkldnn | Returns a copy of the tensor in torch.mkldnn layout. |
| Tensor.take | See torch.take() |
| Tensor.take_along_dim | See torch.take_along_dim() |
| Tensor.tan | See torch.tan() |
| Tensor.tan_ | In-place version of tan() |
| Tensor.tanh | See torch.tanh() |
| Tensor.tanh_ | In-place version of tanh() |
| Tensor.atanh | See torch.atanh() |
| Tensor.atanh_ | In-place version of atanh() |
| Tensor.arctanh | See torch.arctanh() |
| Tensor.arctanh_ | In-place version of arctanh() |
| Tensor.tolist | Returns the tensor as a (nested) list. |
| Tensor.topk | See torch.topk() |
| Tensor.to_dense | Creates a strided copy of self if self is not a strided tensor, otherwise returns self. |
| Tensor.to_sparse | Returns a sparse copy of the tensor. |
| Tensor.to_sparse_csr | Convert a tensor to compressed row storage format (CSR). |
| Tensor.to_sparse_csc | Convert a tensor to compressed column storage (CSC) format. |
| Tensor.to_sparse_bsr | Convert a CSR tensor to a block sparse row (BSR) storage format of given blocksize. |
| Tensor.to_sparse_bsc | Convert a CSR tensor to a block sparse column (BSC) storage format of given blocksize. |
| Tensor.trace | See torch.trace() |
| Tensor.transpose | See torch.transpose() |
| Tensor.transpose_ | In-place version of transpose() |
| Tensor.triangular_solve | See torch.triangular_solve() |
| Tensor.tril | See torch.tril() |
| Tensor.tril_ | In-place version of tril() |
| Tensor.triu | See torch.triu() |
| Tensor.triu_ | In-place version of triu() |
| Tensor.true_divide | See torch.true_divide() |
| Tensor.true_divide_ | In-place version of true_divide() |
| Tensor.trunc | See torch.trunc() |
| Tensor.trunc_ | In-place version of trunc() |
| Tensor.type | Returns the type if dtype is not provided, else casts this object to the specified type. |
| Tensor.type_as | Returns this tensor cast to the type of the given tensor. |
| Tensor.unbind | See torch.unbind() |
| Tensor.unflatten | See torch.unflatten(). |
| Tensor.unfold | Returns a view of the original tensor which contains all slices of size size from self tensor in the dimension dimension. |
| Tensor.uniform_ | Fills self tensor with numbers sampled from the continuous uniform distribution. |
| Tensor.unique | Returns the unique elements of the input tensor. |
| Tensor.unique_consecutive | Eliminates all but the first element from every consecutive group of equivalent elements. |
| Tensor.unsqueeze | See torch.unsqueeze() |
| Tensor.unsqueeze_ | In-place version of unsqueeze() |
| Tensor.values | Return the values tensor of a sparse COO tensor. |
| Tensor.var | See torch.var() |
| Tensor.vdot | See torch.vdot() |
| Tensor.view | Returns a new tensor with the same data as the self tensor but of a different shape. |
| Tensor.view_as | View this tensor as the same size as other. |
| Tensor.vsplit | See torch.vsplit() |
| Tensor.where | self.where(condition, y) is equivalent to torch.where(condition, self, y). |
| Tensor.xlogy | See torch.xlogy() |
| Tensor.xlogy_ | In-place version of xlogy() |
| Tensor.zero_ | Fills self tensor with zeros. |
1.1 Storage
In PyTorch a tensor object is split into two parts: the header (Tensor) and the storage (Storage).
The header holds metadata such as the tensor's shape (size), stride and data type, while the actual data is laid out as a contiguous one-dimensional array in the storage, managed by a torch.Storage instance.
Note: a storage is always one-dimensional; the actual data of a tensor of any dimensionality lives in a 1-D storage.
Getting a tensor's storage:
a = torch.tensor([[1.0, 4.0],[2.0, 1.0],[3.0, 5.0]])
a.storage()
Out[0]:
1.0
4.0
2.0
1.0
3.0
5.0
[torch.storage.TypedStorage(dtype=torch.float32, device=cpu) of size 6]
a.storage()[2] = 9
id(a.storage())
Out[1]: 1343354913168
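Because the data lives in the storage, several tensors can be views over one and the same storage; a small sketch, reusing the tensor a from above:
b = a.t()                                                  # transpose: a new header over the same storage
print(b.storage().data_ptr() == a.storage().data_ptr())   # True, the data is shared
print(a.stride(), b.stride())                              # same storage read with different strides
b[0, 2] = 7.0                                              # modifying the view also changes a
print(a)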
2 Examples
2.1 Image classification
2.1.1 Loading data in PyTorch
Loading data in PyTorch involves Dataset and DataLoader.
- Dataset provides a way to fetch each sample together with its label, and tells us how many samples there are in total (a minimal custom Dataset sketch follows this list).
- DataLoader packs the samples batch by batch and feeds them to the network in the form it needs.
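A minimal custom Dataset sketch (the directory layout and class name here are illustrative assumptions, not part of the original text):
import os
from PIL import Image
from torch.utils.data import Dataset

class MyData(Dataset):
    """Assumes a folder of images where the folder name is the label."""
    def __init__(self, root_dir, label_dir):
        self.root_dir = root_dir
        self.label_dir = label_dir
        self.path = os.path.join(root_dir, label_dir)
        self.img_names = os.listdir(self.path)
    def __getitem__(self, idx):
        img = Image.open(os.path.join(self.path, self.img_names[idx]))
        label = self.label_dir
        return img, label
    def __len__(self):
        return len(self.img_names)

# usage (hypothetical paths)
# dataset = MyData("Data/FirstTypeData/train", "bees")
# img, label = dataset[0]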
2.1.2 TensorBoard
import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
# prepare the test dataset
test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor())
# with batch_size=64 the DataLoader takes img0, target0 = dataset[0] ... img63, target63 = dataset[63],
# stacks them and returns them together as one batch
test_loader = DataLoader(dataset=test_data, batch_size=64, shuffle=True, num_workers=0, drop_last=False)
# use a for loop to iterate over the batches packed by the DataLoader
writer = SummaryWriter("logs")
step = 0
for data in test_loader:
    imgs, targets = data  # imgs.shape is [64, 3, 32, 32]: 64 three-channel 32x32 images; targets holds the 64 labels
    writer.add_images("test_data", imgs, step)
    step = step + 1
writer.close()
2.1.3 Transforms
① Think of transforms as a toolbox; the classes inside it (ToTensor, Resize, ...) are the individual tools.
② Transforms takes images in particular formats, runs them through these tools, and produces the result we want.
from torchvision import transforms
from PIL import Image
img_path = "Data/FirstTypeData/val/bees/10870992_eebeeb3a12.jpg"
img = Image.open(img_path)
tensor_trans = transforms.ToTensor()  # create an instance of the transforms.ToTensor class
tensor_img = tensor_trans(img)        # calling the instance invokes its __call__ magic method
print(tensor_img)
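Several tools are often chained together with transforms.Compose; a small sketch, assuming the same img as above:
from torchvision import transforms
trans_compose = transforms.Compose([
    transforms.Resize((32, 32)),   # resize the PIL image to 32x32
    transforms.ToTensor(),         # then convert it to a tensor
])
img_resized = trans_compose(img)
print(img_resized.shape)           # torch.Size([3, 32, 32])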
2.1.4 torchvision datasets
① torchvision ships many datasets; once we name the dataset and pass a few parameters in code, it can download the data by itself.
② The CIFAR-10 dataset contains 60000 32x32 color images in 10 classes: 50000 training images and 10000 test images.
import torchvision
train_set = torchvision.datasets.CIFAR10(root="./dataset", train=True, download=True)   # root is the (relative) path where the dataset is stored
test_set = torchvision.datasets.CIFAR10(root="./dataset", train=False, download=True)   # train=True gives the training set, train=False the test set
print(test_set[0])               # the 3 in the output is the target
print(test_set.classes)          # the classes in the test dataset
img, target = test_set[0]        # get the image and the target separately
print(img)
print(target)
print(test_set.classes[target])  # the class that target 3 corresponds to
img.show()
2.1.5 Loss functions
① A loss function measures the gap between the actual output and the target.
② It also provides the basis (gradients) for updating the network's outputs.
L1Loss
import torch
from torch.nn import L1Loss
inputs = torch.tensor([1, 2, 3], dtype=torch.float32)
targets = torch.tensor([1, 2, 5], dtype=torch.float32)
inputs = torch.reshape(inputs, (1, 1, 1, 3))
targets = torch.reshape(targets, (1, 1, 1, 3))
loss = L1Loss()  # reduction defaults to 'mean'
result = loss(inputs, targets)
print(result)
MSE loss
import torch
from torch import nn
inputs = torch.tensor([1, 2, 3], dtype=torch.float32)
targets = torch.tensor([1, 2, 5], dtype=torch.float32)
inputs = torch.reshape(inputs, (1, 1, 1, 3))
targets = torch.reshape(targets, (1, 1, 1, 3))
loss_mse = nn.MSELoss()
result_mse = loss_mse(inputs, targets)
print(result_mse)
Cross-entropy loss
import torch
from torch import nn
x = torch.tensor([0.1, 0.2, 0.3])
y = torch.tensor([1])
x = torch.reshape(x, (1, 3))  # batch_size of 1, three classes
loss_cross = nn.CrossEntropyLoss()
result_cross = loss_cross(x, y)
print(result_cross)
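nn.CrossEntropyLoss applies log-softmax to the raw scores and then takes the negative log-likelihood of the target class; a small sketch verifying this for the x and y above:
import torch.nn.functional as F
log_probs = F.log_softmax(x, dim=1)   # log-softmax over the 3 class scores
manual = -log_probs[0, y.item()]      # negative log-probability of class 1
print(manual)                         # matches result_cross above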
2.1.6 Optimizers
① Calling backward() on the loss runs back-propagation and computes the gradients of the parameters we want to tune; the optimizer then uses those gradients to adjust the parameters and drive the overall error down.
② The gradients must be zeroed between steps; otherwise they accumulate across iterations (see the small sketch below).
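A minimal sketch of why zeroing matters (toy tensor, not from the original text):
import torch
w = torch.tensor([1.0], requires_grad=True)
for i in range(2):
    loss = (w * 2).sum()
    loss.backward()
    print(w.grad)   # without zeroing: tensor([2.]) then tensor([4.]) -- the gradients accumulate
w.grad.zero_()      # this is what optimizer.zero_grad() does for every parameter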
loss = nn.CrossEntropyLoss()  # cross-entropy loss
tudui = Tudui()
optim = torch.optim.SGD(tudui.parameters(), lr=0.01)  # stochastic gradient descent optimizer
for data in dataloader:
    imgs, targets = data
    outputs = tudui(imgs)
    result_loss = loss(outputs, targets)  # gap between the actual output and the target
    optim.zero_grad()       # zero the gradients
    result_loss.backward()  # back-propagate and compute the gradients of the loss
    optim.step()            # use the gradients to tune the network parameters
    print(result_loss)      # the data is only seen once (one epoch), so the loss does not drop much
Learning-rate scheduling for the network
import torch
import torchvision
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
dataset = torchvision.datasets.CIFAR10("./dataset",train=False,transform=torchvision.transforms.ToTensor(),download=True)
dataloader = DataLoader(dataset, batch_size=64,drop_last=True)
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )
    def forward(self, x):
        x = self.model1(x)
        return x
loss = nn.CrossEntropyLoss()  # cross-entropy loss
tudui = Tudui()
optim = torch.optim.SGD(tudui.parameters(), lr=0.01)  # stochastic gradient descent optimizer
scheduler = torch.optim.lr_scheduler.StepLR(optim, step_size=5, gamma=0.1)  # every step_size scheduler steps, the learning rate is multiplied by 0.1
for epoch in range(20):
    running_loss = 0.0
    for data in dataloader:
        imgs, targets = data
        outputs = tudui(imgs)
        result_loss = loss(outputs, targets)  # gap between the actual output and the target
        optim.zero_grad()       # zero the gradients
        result_loss.backward()  # back-propagate and compute the gradients of the loss
        optim.step()            # use the gradients to tune the network parameters
        running_loss = running_loss + result_loss
    scheduler.step()     # step the scheduler once per epoch; the learning rate quickly becomes very small, so after 20 epochs the loss barely moves
    print(running_loss)  # sum of the losses over this epoch
2.1.7 Using and modifying existing models
Adding to a model
import torchvision
from torch import nn
dataset = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(), download=True)
vgg16_true = torchvision.models.vgg16(pretrained=True)  # downloads the convolution/pooling parameters, which were pretrained on ImageNet
vgg16_true.add_module('add_linear', nn.Linear(1000, 10))  # append a linear layer after VGG16 so the output matches the 10 classes of CIFAR-10
print(vgg16_true)
Modifying a model
import torchvision
from torch import nn
vgg16_false = torchvision.models.vgg16(pretrained=False)  # no pretrained parameters
print(vgg16_false)
vgg16_false.classifier[6] = nn.Linear(4096, 10)
print(vgg16_false)
2.1.8 Saving and loading models
Model structure + parameters
import torchvision
import torch
vgg16 = torchvision.models.vgg16(pretrained=False)
torch.save(vgg16, "./model/vgg16_method1.pth")  # method 1: save the model structure + parameters
print(vgg16)
model = torch.load("./model/vgg16_method1.pth")  # loading for method 1
print(model)
Parameters only (officially recommended), without the model structure
import torchvision
import torch
vgg16 = torchvision.models.vgg16(pretrained=False)
torch.save(vgg16.state_dict(), "./model/vgg16_method2.pth")  # method 2: save only the parameters (officially recommended)
print(vgg16)
state_dict = torch.load("./model/vgg16_method2.pth")  # this loads an OrderedDict of parameters, not a model
vgg16.load_state_dict(state_dict)                     # build the structure first, then load the parameters into it
print(vgg16)
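One caveat with method 1 (a hedged note, not from the original text): torch.load unpickles the model, so the model class must be importable where the file is loaded; for a custom model this typically looks like:
# save.py
# class Tudui(nn.Module): ...             # custom model definition
# tudui = Tudui()
# torch.save(tudui, "tudui_method1.pth")
#
# load.py -- works only if Tudui is also defined or imported here,
# e.g. `from save import Tudui`; otherwise torch.load raises an AttributeError
# model = torch.load("tudui_method1.pth")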
2.1.9 Freezing model parameters
During training we may want to freeze part of the model and only update the remaining parameters. There are two ways to achieve this:
- set requires_grad = False on the parameters that should not be updated;
- pass only the parameters that should be updated to the optimizer when it is created.
The best practice is to combine both: also pass only the requires_grad=True parameters to the optimizer, which uses a little less memory and is slightly more efficient.
import torch
import torch.nn as nn
import torch.optim as optim
# define a simple network
class Net(nn.Module):
    def __init__(self, num_class=3):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, num_class)
    def forward(self, x):
        return self.fc2(self.fc1(x))
model = Net()
# freeze the parameters of the fc1 layer
for name, param in model.named_parameters():
    if "fc1" in name:
        param.requires_grad = False
loss_fn = nn.CrossEntropyLoss()
# pass only the requires_grad = True parameters to the optimizer
optimizer = optim.SGD(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-2)
print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)
model.train()
for epoch in range(10):
    x = torch.randn((3, 8))
    label = torch.randint(0, 3, [3]).long()
    output = model(x)
    loss = loss_fn(output, label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print("model.fc1.weight", model.fc1.weight)  # unchanged: fc1 was frozen
print("model.fc2.weight", model.fc2.weight)  # updated
2.1.10 Training workflow
Loading the dataset with DataLoader
import torchvision
from torch import nn
from torch.utils.data import DataLoader
# prepare the datasets
train_data = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(), download=True)
test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
# dataset lengths
train_data_size = len(train_data)
test_data_size = len(test_data)
# if train_data_size=10 this prints: Length of the training set: 10
print("Length of the training set: {}".format(train_data_size))
print("Length of the test set: {}".format(test_data_size))
# use DataLoader to load the datasets (pass the dataset itself, not its length)
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)
Checking that the network is correct
import torch
from torch import nn
# build the network
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = nn.Sequential(
            nn.Conv2d(3, 32, 5, 1, 2),  # in_channels 3, out_channels 32, kernel 5x5, stride 1, padding 2
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Flatten(),               # flattens to 64*4*4 features
            nn.Linear(64*4*4, 64),
            nn.Linear(64, 10)
        )
    def forward(self, x):
        x = self.model1(x)
        return x
if __name__ == '__main__':
    tudui = Tudui()
    input = torch.ones((64, 3, 32, 32))
    output = tudui(input)
    print(output.shape)  # check that the output has the expected shape
Training the network
① The difference between model.train() and model.eval() mainly concerns the Batch Normalization and Dropout layers.
② If the model contains BN (Batch Normalization) or Dropout layers, call model.train() during training: BN then uses the mean and variance of each batch, and Dropout randomly keeps a subset of connections when updating the parameters.
③ model.eval() disables Batch Normalization and Dropout training behaviour: BN uses the running mean and variance accumulated over the whole training data (so they stay fixed during testing), and Dropout uses all connections instead of randomly dropping neurons.
④ After training, call model.eval() before running the model on test samples; otherwise merely feeding data through the model changes the BN running statistics even when no training step is taken (see the small sketch below). This is a consequence of the BN and Dropout layers.
⑤ For one-class classification, where the training and test distributions differ, this point deserves particular attention.
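A minimal sketch illustrating point ④ (a toy BatchNorm layer, not from the original text):
import torch
from torch import nn
bn = nn.BatchNorm1d(3)
x = torch.randn(8, 3) + 5.0    # data whose mean is far from 0
print(bn.running_mean)         # tensor([0., 0., 0.])
bn.train()
bn(x)                          # a forward pass in train mode updates the running stats
print(bn.running_mean)         # no longer zeros, even though no backward/step was taken
bn.eval()
bn(x)                          # in eval mode the running stats stay fixed
print(bn.running_mean)         # unchanged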
import torchvision
import torch
from torch import nn
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
# `from model import *` would pull in everything from model.py; here the model is written inline instead
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = nn.Sequential(
            nn.Conv2d(3, 32, 5, 1, 2),  # in_channels 3, out_channels 32, kernel 5x5, stride 1, padding 2
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Flatten(),               # flattens to 64*4*4 features
            nn.Linear(64*4*4, 64),
            nn.Linear(64, 10)
        )
    def forward(self, x):
        x = self.model1(x)
        return x
# prepare the datasets
train_data = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(), download=True)
test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
# dataset lengths
train_data_size = len(train_data)
test_data_size = len(test_data)
# if train_data_size=10 this prints: Length of the training set: 10
print("Length of the training set: {}".format(train_data_size))
print("Length of the test set: {}".format(test_data_size))
# use DataLoader to load the datasets
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)
# create the network model
tudui = Tudui()
# loss function
loss_fn = nn.CrossEntropyLoss()  # cross-entropy; fn is short for function
# optimizer
learning = 0.01  # 1e-2 means 0.01
optimizer = torch.optim.SGD(tudui.parameters(), learning)  # stochastic gradient descent optimizer
# set up some bookkeeping
# number of training steps
total_train_step = 0
# number of test evaluations
total_test_step = 0
# number of epochs
epoch = 10
# add TensorBoard
writer = SummaryWriter("logs")
for i in range(epoch):
    print("----- Epoch {} starts -----".format(i+1))
    # training phase
    tudui.train()  # enables the training behaviour of dropout / batchnorm layers if the network has them
    for data in train_dataloader:
        imgs, targets = data
        outputs = tudui(imgs)
        loss = loss_fn(outputs, targets)  # gap between the actual output and the target
        # let the optimizer tune the model
        optimizer.zero_grad()  # zero the gradients
        loss.backward()        # back-propagate and compute the gradients of the loss
        optimizer.step()       # use the gradients to tune the network parameters
        total_train_step = total_train_step + 1
        if total_train_step % 100 == 0:
            print("Training step: {}, loss: {}".format(total_train_step, loss.item()))  # .item() extracts the Python number from the loss tensor
            writer.add_scalar("train_loss", loss.item(), total_train_step)
    # test phase (after every epoch, check the loss on the test set)
    tudui.eval()  # disables the training behaviour of dropout / batchnorm layers if the network has them
    total_test_loss = 0
    total_accuracy = 0
    with torch.no_grad():  # no gradients needed for evaluation
        for data in test_dataloader:  # take batches from the test set
            imgs, targets = data
            outputs = tudui(imgs)
            loss = loss_fn(outputs, targets)                 # loss of this batch only
            total_test_loss = total_test_loss + loss.item()  # accumulate the total loss
            accuracy = (outputs.argmax(1) == targets).sum()
            total_accuracy = total_accuracy + accuracy
    print("Loss on the whole test set: {}".format(total_test_loss))
    print("Accuracy on the whole test set: {}".format(total_accuracy/test_data_size))
    writer.add_scalar("test_loss", total_test_loss, total_test_step)
    writer.add_scalar("test_accuracy", total_accuracy/test_data_size, total_test_step)
    total_test_step = total_test_step + 1
    torch.save(tudui, "./model/tudui_{}.pth".format(i))  # save the model after every epoch
    # torch.save(tudui.state_dict(), "tudui_{}.pth".format(i))  # saving method 2
    print("Model saved")
writer.close()
2.2 Transfer learning
2.2.1 Inspecting models and parameters
import torch
class MyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = torch.nn.Sequential(
            torch.nn.Linear(3, 4),
            torch.nn.Linear(4, 3),
        )
        self.layer2 = torch.nn.Linear(3, 6)
        self.layer3 = torch.nn.Sequential(
            torch.nn.Linear(6, 7),
            torch.nn.Linear(7, 5),
        )
    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        return x
net = MyModel()
print(net)
MyModel(
(layer1): Sequential(
(0): Linear(in_features=3, out_features=4, bias=True)
(1): Linear(in_features=4, out_features=3, bias=True)
)
(layer2): Linear(in_features=3, out_features=6, bias=True)
(layer3): Sequential(
(0): Linear(in_features=6, out_features=7, bias=True)
(1): Linear(in_features=7, out_features=5, bias=True)
)
)
Inspecting the parameters
for layer in net.modules():
    print(type(layer))  # print the type of every module
    # print(layer)
<class '__main__.MyModel'>
<class 'torch.nn.modules.container.Sequential'>
<class 'torch.nn.modules.linear.Linear'>
<class 'torch.nn.modules.linear.Linear'>
<class 'torch.nn.modules.linear.Linear'>
<class 'torch.nn.modules.container.Sequential'>
<class 'torch.nn.modules.linear.Linear'>
<class 'torch.nn.modules.linear.Linear'>
for param in net.parameters():
    print(param.shape)  # print the shape of every parameter
torch.Size([4, 3])
torch.Size([4])
torch.Size([3, 4])
torch.Size([3])
torch.Size([6, 3])
torch.Size([6])
torch.Size([7, 6])
torch.Size([7])
torch.Size([5, 7])
torch.Size([5])
for name, param in net.named_parameters():
    print(name, param.shape)  # finer detail: names plus shapes
layer1.0.weight torch.Size([4, 3])
layer1.0.bias torch.Size([4])
layer1.1.weight torch.Size([3, 4])
layer1.1.bias torch.Size([3])
layer2.weight torch.Size([6, 3])
layer2.bias torch.Size([6])
layer3.0.weight torch.Size([7, 6])
layer3.0.bias torch.Size([7])
layer3.1.weight torch.Size([5, 7])
layer3.1.bias torch.Size([5])
for key, value in net.state_dict().items():  # parameter names and parameters
    print(key, value.shape)
layer1.0.weight torch.Size([4, 3])
layer1.0.bias torch.Size([4])
layer1.1.weight torch.Size([3, 4])
layer1.1.bias torch.Size([3])
layer2.weight torch.Size([6, 3])
layer2.bias torch.Size([6])
layer3.0.weight torch.Size([7, 6])
layer3.0.bias torch.Size([7])
layer3.1.weight torch.Size([5, 7])
layer3.1.bias torch.Size([5])
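A small related sketch (not from the original text): the total number of trainable parameters can be counted from parameters():
total = sum(p.numel() for p in net.parameters() if p.requires_grad)
print(total)  # 144 for the MyModel above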
2.2.2 Saving and loading models
# 1. load the model structure + parameters
net = torch.load("resnet50.pth")
# 2. with an existing model definition, load pretrained parameters
from torchvision import models
resnet50 = models.resnet50(weights=None)
resnet50.load_state_dict(torch.load("resnet58_weight.pth"))
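For completeness, the corresponding saving calls (a small sketch mirroring section 2.1.8):
torch.save(resnet50, "resnet50.pth")                      # 1. save the model structure + parameters
torch.save(resnet50.state_dict(), "resnet50_weight.pth")  # 2. save only the parameters (recommended)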
2.2.3 Modifying a network
from torch import nn
from torchvision import models
alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
print(alexnet)
AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Dropout(p=0.5, inplace=False)
(4): Linear(in_features=4096, out_features=4096, bias=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=4096, out_features=1000, bias=True)
)
)
Modifying the network structure
# 1. ----- delete the last layer of the network -----
# del alexnet.classifier
del alexnet.classifier[6]
print(alexnet)
AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Dropout(p=0.5, inplace=False)
(4): Linear(in_features=4096, out_features=4096, bias=True)
(5): ReLU(inplace=True)
)
)
# 2. ----- delete the last several layers of the network -----
alexnet.classifier = alexnet.classifier[:-2]
print(alexnet)
AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Dropout(p=0.5, inplace=False)
)
)
# 3. ----- modify a particular layer -----
alexnet.classifier[3] = nn.Linear(in_features=4096, out_features=1024)  # index 3 is the last remaining layer of the truncated classifier (see the output below)
print(alexnet)
AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=4096, out_features=1024, bias=True)
)
)
# 4. ----- append layers to the network, one at a time -----
alexnet.classifier.add_module('4', nn.ReLU(inplace=True))
alexnet.classifier.add_module('5', nn.Linear(in_features=1024, out_features=20))
print(alexnet)
AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=4096, out_features=1024, bias=True)
(4): ReLU(inplace=True)
(5): Linear(in_features=1024, out_features=20, bias=True)
)
)
2.2.4 Freezing parameters
# Task 1:
# 1. use model A as the backbone and modify it into model B
# 2. load model A's pretrained parameters into model B
resnet_modified = resnet50()  # resnet50 here is the modified model class (model B), defined elsewhere
new_weights_dict = resnet_modified.state_dict()
resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
weights_dict = resnet.state_dict()
for k in weights_dict.keys():
    if k in new_weights_dict.keys() and not k.startswith('fc'):
        new_weights_dict[k] = weights_dict[k]
resnet_modified.load_state_dict(new_weights_dict)
# resnet_modified.load_state_dict(new_weights_dict, strict=False)
# Task 2:
# freeze the pretrained parameters and train only the new layers
params = []
train_layer = ['layer5', 'conv_end', 'bn_end']
for name, param in resnet_modified.named_parameters():
    if any(name.startswith(prefix) for prefix in train_layer):
        print(name)
        params.append(param)
    else:
        param.requires_grad = False
optimizer = torch.optim.SGD(params, lr=0.001, momentum=0.9, weight_decay=5e-4)