feed_forward
Origin: Sashi Novitasari
Modification: Heli Qi
Affiliation: NAIST
Date: 2022.07
PositionwiseFeedForward
Bases: Module
Position-wise feed-forward layer. Projects the output vectors of the multi-head attention layer to fdfwd_dim and then back to d_model.
Source code in speechain/module/transformer/feed_forward.py
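To make the projection described above concrete, below is a minimal PyTorch sketch of a position-wise feed-forward layer (first projection to fdfwd_dim, activation, dropout, then projection back to d_model). It illustrates the technique only; the class name and exact structure are illustrative, not the speechain implementation.

```python
import torch
from torch import nn


class PositionwiseFeedForwardSketch(nn.Module):
    """Illustrative sketch of a position-wise feed-forward layer (not the speechain source)."""

    def __init__(self, d_model: int = 512, fdfwd_dim: int = 2048,
                 fdfwd_activation: str = 'ReLU', dropout: float = 0.1):
        super().__init__()
        # First projection: d_model -> fdfwd_dim
        self.in_layer = nn.Linear(d_model, fdfwd_dim)
        # Activation looked up by name from torch.nn (e.g. 'ReLU', 'GELU')
        self.activation = getattr(nn, fdfwd_activation)()
        # Dropout after the first feed-forward layer, as described in module_init()
        self.dropout = nn.Dropout(dropout)
        # Second projection: fdfwd_dim -> d_model
        self.out_layer = nn.Linear(fdfwd_dim, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_maxlen, d_model); the same transform is applied at every position
        return self.out_layer(self.dropout(self.activation(self.in_layer(x))))
```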
forward(x)
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | (batch, seq_maxlen, d_model) | *required* |
Returns:
Source code in speechain/module/transformer/feed_forward.py
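As a rough illustration of the forward() contract, the sketch class shown earlier can be exercised as follows; the point is that the position-wise transform preserves the (batch, seq_maxlen, d_model) shape, not the specific numbers used here.

```python
# Hypothetical usage of the illustrative sketch class defined above
ffn = PositionwiseFeedForwardSketch(d_model=512, fdfwd_dim=2048)
x = torch.randn(8, 100, 512)   # (batch, seq_maxlen, d_model)
y = ffn(x)
assert y.shape == x.shape       # the position-wise transform keeps the input shape
```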
module_init(d_model=512, fdfwd_dim=2048, fdfwd_type='linear', fdfwd_activation='ReLU', fdfwd_args={}, dropout=0.1)
Initializes the position-wise feed-forward layer.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `d_model` | `int` | The dimension of the hidden feature vector in each Transformer layer. | `512` |
| `fdfwd_dim` | `int` | The `out_features` of the first feed-forward layer and the `in_features` of the second feed-forward layer. | `2048` |
| `fdfwd_type` | `str` | The type of the feed-forward layers: 'linear' means Linear layers while 'conv' means Conv1d layers. | `'linear'` |
| `fdfwd_activation` | `str` | The name of the activation function of the feed-forward layers. Should be the name of an activation class in `torch.nn`. | `'ReLU'` |
| `fdfwd_kernel` | `int` | The kernel size of the Conv1d feed-forward layers. This argument is not effective if `fdfwd_type == 'linear'`. | *required* |
| `dropout` | `float` | The dropout rate of the Dropout layer after the first feed-forward layer. | `0.1` |
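To illustrate the difference between `fdfwd_type='linear'` and `'conv'`, here is a hedged sketch of the Conv1d variant. Conv1d expects (batch, channels, length), so the feature tensor is transposed before and after the convolutions; the `fdfwd_kernel` argument mirrors the kernel-size parameter documented above, but the exact way speechain passes it (e.g. through `fdfwd_args`) may differ, and the class name is illustrative only.

```python
import torch
from torch import nn


class ConvFeedForwardSketch(nn.Module):
    """Illustrative Conv1d-based position-wise feed-forward layer (not the speechain source)."""

    def __init__(self, d_model: int = 512, fdfwd_dim: int = 2048, fdfwd_kernel: int = 3,
                 fdfwd_activation: str = 'ReLU', dropout: float = 0.1):
        super().__init__()
        # 'same'-style padding keeps seq_maxlen unchanged for odd kernel sizes
        padding = (fdfwd_kernel - 1) // 2
        self.in_layer = nn.Conv1d(d_model, fdfwd_dim, kernel_size=fdfwd_kernel, padding=padding)
        self.activation = getattr(nn, fdfwd_activation)()
        self.dropout = nn.Dropout(dropout)
        self.out_layer = nn.Conv1d(fdfwd_dim, d_model, kernel_size=fdfwd_kernel, padding=padding)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (batch, seq_maxlen, d_model) -> (batch, d_model, seq_maxlen) for Conv1d
        x = x.transpose(1, 2)
        x = self.out_layer(self.dropout(self.activation(self.in_layer(x))))
        # back to (batch, seq_maxlen, d_model)
        return x.transpose(1, 2)
```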