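A Comprehensive Summary of PyTorch Activation Functions

To make the activation functions in PyTorch easier to study and compare, this post collects the ones available in the latest PyTorch release at the time of writing. It covers each function's formula, plot, and usage; for full details, see the official documentation.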

1. ELU

2. Hardshrink

3. Hardsigmoid

4. Hardtanh

5. Hardswish

6. LeakyReLU

7. LogSigmoid

8. PReLU

9. ReLU

10. ReLU6

11. RReLU

12. SELU

13. CELU

14. GELU

15. Sigmoid

16. SiLU

17. Mish

18. Softplus

19. Softshrink

20. Softsign

21. Tanh

22. Tanhshrink

23. Threshold

24. GLU

25. Softmin

26. Softmax

27. LogSoftmax

28. Others

1. ELU

Formula:

$ELU(x)=\left\{\begin{matrix} x, & x>0\\ \alpha\ast \left ( exp\left ( x \right )-1 \right ), & x\leqslant 0 \end{matrix}\right.$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.ELU()  # alpha defaults to 1.0
input = torch.randn(2)
output = m(input)
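As a quick sanity check, the piecewise formula above can be written out by hand and compared with the module. This is only a minimal sketch: the input values are arbitrary, and `manual` and `builtin` are illustrative names.

```python
import torch
import torch.nn as nn

alpha = 1.0  # matches the default alpha of nn.ELU
x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

# Piecewise definition: x for x > 0, alpha * (exp(x) - 1) otherwise.
manual = torch.where(x > 0, x, alpha * (torch.exp(x) - 1))
builtin = nn.ELU(alpha=alpha)(x)

print(manual)
print(builtin)
```

The two tensors should agree up to floating-point rounding.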

2. Hardshrink

Formula:

$HardShrink(x)=\left\{\begin{matrix} x, & x>\lambda \\ x, & x<-\lambda \\ 0, & otherwise \end{matrix}\right.$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.Hardshrink()  # lambd defaults to 0.5
input = torch.randn(2)
output = m(input)

3. Hardsigmoid

Formula:

$Hardsigmoid(x)=\left\{\begin{matrix} 0, & x\leq -3 \\ 1, & x\geq +3 \\ \frac{x}{6}+\frac{1}{2}, & otherwise \end{matrix}\right.$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.Hardsigmoid()
input = torch.randn(2)
output = m(input)

4. Hardtanh

Formula:

$Hardtanh(x)=\left\{\begin{matrix} max\_val, & x>max\_val \\ min\_val, & x<min\_val \\ x, & otherwise \end{matrix}\right.$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.Hardtanh(-2, 2)  # min_val=-2, max_val=2 (defaults are -1 and 1)
input = torch.randn(2)
output = m(input)
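The Hardtanh formula is simply a clamp to the interval [min_val, max_val]. The sketch below, using arbitrary evenly spaced inputs, compares the module with `torch.clamp`; the variable names are illustrative.

```python
import torch
import torch.nn as nn

x = torch.linspace(-4, 4, steps=9)  # -4, -3, ..., 4

y = nn.Hardtanh(min_val=-2.0, max_val=2.0)(x)
clamped = torch.clamp(x, min=-2.0, max=2.0)

print(y)
print(clamped)
```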

5. Hardswish

Formula:

$Hardswish(x)=\left\{\begin{matrix} 0, & x\leq -3 \\ x, & x\geq +3 \\ \frac{x\cdot \left ( x+3 \right )}{6}, & otherwise \end{matrix}\right.$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.Hardswish()
input = torch.randn(2)
output = m(input)
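Both Hardsigmoid and Hardswish can be expressed through ReLU6, which makes the relationship between the two piecewise formulas easier to see: Hardsigmoid(x) = ReLU6(x + 3) / 6 and Hardswish(x) = x * Hardsigmoid(x). The following is a minimal sketch of that identity on arbitrary inputs, not a description of how PyTorch implements these modules internally.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.linspace(-5, 5, steps=11)

# Hand-written versions built from ReLU6.
hardsigmoid_manual = F.relu6(x + 3) / 6
hardswish_manual = x * hardsigmoid_manual

hardsigmoid_builtin = nn.Hardsigmoid()(x)
hardswish_builtin = nn.Hardswish()(x)

print(hardswish_manual)
print(hardswish_builtin)
```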

6. LeakyReLU

Formula:

$LeakyReLU(x)=\left\{\begin{matrix} x, & x\geq0\\ negative\_slope\times x, & otherwise \end{matrix}\right.$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.LeakyReLU(0.1)  # negative_slope=0.1 (default is 0.01)
input = torch.randn(2)
output = m(input)

7. LogSigmoid

Formula:

$LogSigmoid\left ( x \right )=log\left ( \frac{1}{1+exp(-x)} \right )$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.LogSigmoid()
input = torch.randn(2)
output = m(input)

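8. PReLU

Formula:

$PReLU(x)=\left\{\begin{matrix} x, & x\geq0 \\ ax, & otherwise \end{matrix}\right.$

Here a is a learnable parameter.

Plot:

Example:

import torch
import torch.nn as nn

m = nn.PReLU()  # a single learnable slope, initialised to 0.25
input = torch.randn(2)
output = m(input)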
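To make "learnable" concrete, the sketch below inspects the slope stored in `m.weight` and shows that it receives a gradient. The input values are arbitrary, and `slope` and `slope_grad` are illustrative names.

```python
import torch
import torch.nn as nn

m = nn.PReLU()  # one shared slope a, initialised to 0.25
x = torch.tensor([-2.0, 1.0])

y = m(x)            # -2.0 * 0.25 = -0.5 for the negative input, 1.0 unchanged
y.sum().backward()  # only the negative input contributes to d(sum)/da

slope = m.weight.item()
slope_grad = m.weight.grad.item()  # equals the negative input, i.e. -2.0

print(y, slope, slope_grad)
```

An optimizer would update `m.weight` together with the other model parameters.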

9. ReLU

Formula:

$ReLU(x)=\left\{\begin{matrix} x, & x\geq0 \\ 0, & otherwise \end{matrix}\right.$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.ReLU()
input = torch.randn(2)
output = m(input)

10. ReLU6

Formula:

$ReLU6(x)=min(max(0,x),6)$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.ReLU6()
input = torch.randn(2)
output = m(input)

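11. RReLU

Formula:

$RReLU(x)=\left\{\begin{matrix} x, & x\geq0 \\ ax, & otherwise \end{matrix}\right.$

Here a is sampled at random from the uniform distribution U(lower, upper) during training; in evaluation mode it is fixed at (lower + upper) / 2.

Plot:

Example:

import torch
import torch.nn as nn

m = nn.RReLU(0.1, 0.3)  # lower=0.1, upper=0.3
input = torch.randn(2)
output = m(input)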
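The random slope is easiest to see by feeding the same negative value many times and switching between training and evaluation mode. This is a minimal sketch; the tensor size of 1000 is an arbitrary choice.

```python
import torch
import torch.nn as nn

m = nn.RReLU(lower=0.1, upper=0.3)
x = torch.full((1000,), -1.0)

m.train()
y_train = m(x)  # each element gets its own slope drawn from U(0.1, 0.3)

m.eval()
y_eval = m(x)   # deterministic slope (0.1 + 0.3) / 2 = 0.2

print(y_train.min(), y_train.max())
print(y_eval[:3])
```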

12. SELU

Formula:

$SELU(x)=scale\ast (max(0,x)+min(0,\alpha\ast (exp(x)-1)))$

Here $\alpha$ = 1.6732632423543772848170429916717 and scale = 1.0507009873554804934193349852946.

Plot:

Example:

import torch
import torch.nn as nn

m = nn.SELU()
input = torch.randn(2)
output = m(input)

13. CELU

Formula:

$CELU\left (x \right )=max(0,x)+min(0,\alpha \ast (exp(x/\alpha)-1))$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.CELU()  # alpha defaults to 1.0
input = torch.randn(2)
output = m(input)
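CELU differs from ELU only in dividing x by alpha inside the exponential, so the two coincide when alpha is 1 and diverge otherwise. The sketch below uses alpha = 2.0, an arbitrary value chosen to make the difference visible.

```python
import torch
import torch.nn as nn

alpha = 2.0
x = torch.linspace(-3, 3, steps=7)

# CELU written out by hand: note exp(x / alpha), not exp(x).
manual = torch.clamp(x, min=0) + torch.clamp(alpha * (torch.exp(x / alpha) - 1), max=0)

celu = nn.CELU(alpha=alpha)(x)
elu = nn.ELU(alpha=alpha)(x)

print(celu)
print(elu)
```

For negative inputs the CELU and ELU outputs differ; for non-negative inputs both return x.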

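14. GELU

Formula:

$GELU(x)=x \ast \Phi(x)$

Here $\Phi(x)$ is the cumulative distribution function of the standard Gaussian distribution. When the module is constructed with approximate='tanh', GELU is instead estimated as:

$GELU(x)=0.5 \ast x \ast (1+Tanh(\sqrt{2/\pi} \ast (x+0.044715 \ast x^{3})))$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.GELU()  # exact form; nn.GELU(approximate='tanh') uses the estimate
input = torch.randn(2)
output = m(input)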
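The sketch below compares the exact Gaussian-CDF form of GELU with the tanh formula. It assumes a PyTorch version in which `nn.GELU` accepts the `approximate` argument (1.12 or later); the hand-written expressions and variable names are for illustration only.

```python
import math

import torch
import torch.nn as nn

x = torch.linspace(-3, 3, steps=13)

# Exact form: x * Phi(x), with Phi written via the error function.
manual_exact = 0.5 * x * (1 + torch.erf(x / math.sqrt(2)))
exact = nn.GELU()(x)

# Tanh estimate.
manual_tanh = 0.5 * x * (1 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * x**3)))
tanh_approx = nn.GELU(approximate="tanh")(x)

max_gap = (exact - tanh_approx).abs().max().item()
print(max_gap)
```

The gap between the two variants is small but not zero, so the two settings are not interchangeable when exact reproducibility matters.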

15. Sigmoid

Formula:

$Sigmoid(x)=\sigma (x)=\frac{1}{1+exp(-x)}$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.Sigmoid()
input = torch.randn(2)
output = m(input)

16. SiLU

Formula:

$SiLU(x)=x \ast \sigma (x)=x \ast \frac{1}{1+exp(-x)}$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.SiLU()
input = torch.randn(2)
output = m(input)

17. Mish

Formula:

$Mish(x)=x \ast Tanh(Softplus(x))$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.Mish()
input = torch.randn(2)
output = m(input)
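SiLU and Mish are both "x times a smooth gate" functions, and each formula translates directly into one line of tensor code. The sketch below checks the hand-written versions against the modules on arbitrary inputs; the `*_manual` names are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.linspace(-3, 3, steps=7)

silu_manual = x * torch.sigmoid(x)           # x * sigma(x)
mish_manual = x * torch.tanh(F.softplus(x))  # x * tanh(softplus(x))

silu_builtin = nn.SiLU()(x)
mish_builtin = nn.Mish()(x)

print(silu_builtin)
print(mish_builtin)
```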

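18. Softplus

Formula:

$Softplus(x)=\frac{1}{\beta} \ast log(1+exp(\beta \ast x))$

For numerical stability, the implementation reverts to the linear function (output = input) when $input \times \beta >threshold$.

Plot:

Example:

import torch
import torch.nn as nn

m = nn.Softplus()  # beta defaults to 1, threshold to 20
input = torch.randn(2)
output = m(input)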
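The threshold rule matters for large inputs, where evaluating the formula literally overflows. The sketch below contrasts the module with a naive implementation in float32; the input 100.0 is an arbitrary value large enough to overflow `exp`.

```python
import torch
import torch.nn as nn

m = nn.Softplus(beta=1.0, threshold=20.0)
x = torch.tensor([-1.0, 0.0, 1.0, 100.0])

y = m(x)                              # 100.0 > threshold, so the output is just 100.0
naive = torch.log(1 + torch.exp(x))   # exp(100.0) overflows float32 to inf

print(y)
print(naive)
```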

19. Softshrink

Formula:

$Softshrink(x)=\left\{\begin{matrix} x-\lambda, & x>\lambda\\ x+\lambda, & x<-\lambda\\ 0, & otherwise \end{matrix}\right.$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.Softshrink()  # lambd defaults to 0.5
input = torch.randn(2)
output = m(input)
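Softshrink and Hardshrink zero out the same interval [-lambda, lambda] but treat values outside it differently: Hardshrink passes them through unchanged, while Softshrink moves them toward zero by lambda. A minimal sketch with hand-picked inputs:

```python
import torch
import torch.nn as nn

x = torch.tensor([-1.5, -0.5, 0.0, 0.25, 1.0])

hard = nn.Hardshrink(lambd=0.5)(x)  # keeps -1.5 and 1.0 as they are
soft = nn.Softshrink(lambd=0.5)(x)  # shifts them to -1.0 and 0.5

print(hard)
print(soft)
```

The boundary value -0.5 is mapped to zero by both functions, because the inequalities in the formulas are strict.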

20. Softsign

Formula:

$SoftSign(x)=\frac{x}{1+\left | x \right |}$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.Softsign()
input = torch.randn(2)
output = m(input)

21. Tanh

Formula:

$Tanh(x)=tanh(x)=\frac{exp(x)-exp(-x)}{exp(x)+exp(-x)}$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.Tanh()
input = torch.randn(2)
output = m(input)

22. Tanhshrink

Formula:

$Tanhshrink(x)=x-tanh(x)$

Plot:

Example:

import torch
import torch.nn as nn

m = nn.Tanhshrink()
input = torch.randn(2)
output = m(input)

23. Threshold

Formula:

$y=\left\{\begin{matrix} x, & x>threshold \\ value, & otherwise \end{matrix}\right.$

Example:

import torch
import torch.nn as nn

m = nn.Threshold(0.1, 20)  # threshold=0.1, value=20
input = torch.randn(2)
output = m(input)
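With fixed inputs the replacement rule is easy to read off: anything not strictly greater than the threshold is replaced by `value`. The sketch below also shows that ReLU is the special case threshold = 0, value = 0. The input values are arbitrary.

```python
import torch
import torch.nn as nn

x = torch.tensor([-1.0, 0.05, 0.5, 3.0])

y = nn.Threshold(threshold=0.1, value=20.0)(x)  # -1.0 and 0.05 become 20.0

# ReLU expressed as a Threshold module.
relu_like = nn.Threshold(threshold=0.0, value=0.0)(x)
relu = nn.ReLU()(x)

print(y)
print(relu_like)
```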

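24. GLU

Formula:

$GLU(a,b)=a\otimes \sigma(b)$

Here a is the first half of the input along the split dimension (dim=-1 by default), b is the second half, and $\otimes$ denotes the element-wise product.

Example:

import torch
import torch.nn as nn

m = nn.GLU()  # splits the last dimension in half
input = torch.randn(4, 2)
output = m(input)  # shape (4, 1)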
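The halving can be reproduced by hand with `chunk`, which also makes the change in output shape explicit. A minimal sketch, with the input shape (4, 6) chosen arbitrarily:

```python
import torch
import torch.nn as nn

x = torch.randn(4, 6)

# Split the last dimension into two halves of width 3.
a, b = x.chunk(2, dim=-1)
manual = a * torch.sigmoid(b)

builtin = nn.GLU(dim=-1)(x)

print(builtin.shape)  # the split dimension is halved: 6 -> 3
```

The size of the split dimension must be even, since it is divided into two equal halves.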

25. Softmin

Formula:

$Softmin(x_i)=\frac{exp(-x_i)}{\sum_j exp(-x_j)}$

Example:

import torch
import torch.nn as nn

m = nn.Softmin(dim=1)  # normalise along dim 1
input = torch.randn(2, 3)
output = m(input)

26. Softmax

Formula:

$Softmax(x_i)=\frac{exp(x_i)}{\sum_j exp(x_j)}$

Example:

import torch
import torch.nn as nn

m = nn.Softmax(dim=1)  # each row of the output sums to 1
input = torch.randn(2, 3)
output = m(input)
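Two properties follow directly from the Softmax and Softmin formulas: the outputs along `dim` sum to 1, and Softmin(x) equals Softmax(-x). The sketch below checks both on random input; the variable names are illustrative.

```python
import torch
import torch.nn as nn

x = torch.randn(2, 3)

p = nn.Softmax(dim=1)(x)
q = nn.Softmin(dim=1)(x)

row_sums = p.sum(dim=1)                # one value per row, each close to 1
softmax_of_neg = nn.Softmax(dim=1)(-x)  # should match Softmin(x)

print(row_sums)
print(q)
```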

27. LogSoftmax

Formula:

$LogSoftmax(x_i)=log\begin{pmatrix} \frac{exp(x_i)}{\sum_j exp(x_j)} \end{pmatrix}$

Example:

import torch
import torch.nn as nn

m = nn.LogSoftmax(dim=1)
input = torch.randn(2, 3)
output = m(input)
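Mathematically LogSoftmax is just log(Softmax(x)), but computing it in two steps can lose precision. The sketch below uses a deliberately extreme, hand-picked input to show the difference in float32.

```python
import torch
import torch.nn as nn

x = torch.tensor([[0.0, 200.0]])

stable = nn.LogSoftmax(dim=1)(x)

# Two-step version: the small probability underflows to 0 before the log.
naive = torch.log(nn.Softmax(dim=1)(x))

print(stable)
print(naive)
```

Here the two-step version yields negative infinity for the first entry, while the fused module returns a finite value.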

28. Others

MultiheadAttention, Softmax2d, and AdaptiveLogSoftmaxWithLoss are somewhat more complex and are not covered here; see the official documentation for details.