linear
Author: Sashi Novitasari Affiliation: NAIST (-2022) Date: 2022.08
Author: Heli Qi Affiliation: NAIST Date: 2022.09
LinearPrenet
Bases: Module
The Linear prenet, usually used before the Transformer TTS decoder. This prenet is made up of one or more Linear blocks, each of which is composed of the components below:

1. (mandatory) a Linear layer
2. (optional) an activation function
3. (optional) a Dropout layer
Reference
Neural Speech Synthesis with Transformer Network https://ojs.aaai.org/index.php/AAAI/article/view/4642/4520
Source code in speechain/module/prenet/linear.py
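The source listing itself is not reproduced on this page. As a minimal sketch of the block structure described above, assuming standard PyTorch and using illustrative names only (this is not the actual speechain code):

```python
# Not the actual speechain implementation: a rough sketch of a stack of
# Linear -> optional activation -> optional Dropout blocks.
import torch
from torch import nn

class SimpleLinearPrenet(nn.Module):  # illustrative name
    def __init__(self, feat_dim=80, lnr_dims=(256, 256),
                 lnr_activation="ReLU", lnr_dropout=None, zero_centered=False):
        super().__init__()
        layers, in_dim = [], feat_dim
        for i, out_dim in enumerate(lnr_dims):
            layers.append(nn.Linear(in_dim, out_dim))        # 1. mandatory Linear layer
            last = (i == len(lnr_dims) - 1)
            if lnr_activation is not None and not (zero_centered and last):
                layers.append(getattr(nn, lnr_activation)())  # 2. optional activation
            if lnr_dropout is not None:
                layers.append(nn.Dropout(lnr_dropout))        # 3. optional Dropout layer
            in_dim = out_dim
        self.net = nn.Sequential(*layers)

    def forward(self, feat, feat_len):
        # feat: (batch, feat_maxlen, feat_dim); feat_len is returned unchanged
        return self.net(feat), feat_len
```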
forward(feat, feat_len)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `feat` | Tensor | (batch, feat_maxlen, feat_dim) The input feature tensors. | required |
| `feat_len` | Tensor | (batch,) The length of each feature tensor. `feat_len` is not used in this forward function, but it is kept as an argument for compatibility with the other prenet classes. | required |

Returns:

| Name | Description |
|---|---|
| `feat`, `feat_len` | The embedded feature vectors with their lengths. |
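A hypothetical call, assuming an already constructed prenet instance named `prenet` and 80-dimensional mel-spectrogram inputs (shapes, names, and values here are illustrative, not taken from speechain):

```python
import torch

feat = torch.randn(4, 100, 80)              # (batch, feat_maxlen, feat_dim)
feat_len = torch.tensor([100, 92, 87, 60])  # valid length of each utterance
emb, emb_len = prenet(feat, feat_len)       # emb: (4, 100, lnr_dims[-1]); emb_len equals feat_len
```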
Source code in speechain/module/prenet/linear.py
module_init(feat_dim=None, lnr_dims=[256, 256], lnr_activation='ReLU', lnr_dropout=None, zero_centered=False)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `feat_dim` | int | The dimension of the input acoustic feature tensors. Used for calculating the in_features of the first Linear layer. | None |
| `lnr_dims` | int or List[int] | The out_features of each Linear layer. The first value in the List is the out_features of the first Linear layer. | [256, 256] |
| `lnr_activation` | str | The type of the activation function after each Linear layer. None means no activation function is needed. | 'ReLU' |
| `lnr_dropout` | float or List[float] | The p rate of the Dropout layer after each Linear layer. | None |
| `zero_centered` | bool | Whether the output of this module is centered at 0. If the specified activation function changes the centroid of the output distribution (e.g. ReLU and LeakyReLU), the activation function is not attached to the final Linear layer when zero_centered is set to True. | False |
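For illustration, keyword arguments matching the table above might be collected as follows; how they actually reach `module_init` depends on the speechain `Module` base class and configuration files, which are not shown here. The non-default values (`feat_dim=80`, `lnr_dropout=0.5`) are assumptions, not speechain defaults:

```python
# Illustrative module_init arguments (values are assumptions for the example)
prenet_conf = dict(
    feat_dim=80,            # e.g. 80-dim mel-spectrogram features
    lnr_dims=[256, 256],    # two Linear layers, 256 units each
    lnr_activation="ReLU",
    lnr_dropout=0.5,        # Dropout after each Linear layer
    zero_centered=True,     # skip ReLU on the final layer so outputs stay zero-centered
)
```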