Module
Module inherits torch.nn.Module and is the base class for all Module objects in this toolkit.
The neural network part of each Model object in this toolkit is built from multiple Module objects arranged in a nested structure.
Below is the nested Module tree of an encoder-decoder ASR model:
ASR (Model)
    ---> ASREncoder (Module)
        ---> Speech2MelSpec (Module)
            ---> Speech2LinearSpec (Module)
            ---> LinearSpec2MelSpec (Module)
        ---> Conv2dPrenet (Module)
            ---> LinearPrenet (Module)
        ---> TransformerEncoder (Module)
            ---> PositionalEncoding (Module)
            ---> MultiHeadedAttention (Module)
            ---> PositionwiseFeedForward (Module)
    ---> ASRDecoder (Module)
        ---> EmbedPrenet (Module)
        ---> TransformerDecoder (Module)
            ---> PositionalEncoding (Module)
            ---> MultiHeadedAttention (Module)
            ---> PositionwiseFeedForward (Module)
        ---> TokenPostnet (Module)
Each Module object exposes two interface functions: module_init() for module initialization and forward() for output calculation.
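For instance, a minimal custom Module could look like the sketch below (the class name, its hidden_size argument, and the single linear layer are invented purely for illustration; only module_init() and forward() are overridden):

```python
import torch
from speechain.module.abs import Module

class ToyLinearModule(Module):
    # Hypothetical Module subclass, written only for illustration.
    def module_init(self, hidden_size: int = 256):
        # self.input_size is filled in by Module.__init__() before module_init() is called
        self.linear = torch.nn.Linear(self.input_size, hidden_size)
        # tell the following Module what dimension it will receive
        self.output_size = hidden_size

    def forward(self, feat: torch.Tensor):
        return torch.relu(self.linear(feat))
```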
Table of Contents
- Module Library
- API Document
Module Library
/speechain
    /module
        /abs.py                # Abstract class of Module. Base of all Module implementations.
        /frontend              # Acoustic feature extraction frontend modules
            /speech2linear.py  # Module implementation of the speech-to-linear frontend. Transforms input speech waveforms into linear spectrograms.
            /linear2mel.py     # Module implementation of the linear-to-mel frontend. Transforms input linear spectrograms into log-mel spectrograms.
            /speech2mel.py     # Module implementation of the speech-to-mel frontend. Transforms input speech waveforms into log-mel spectrograms.
            /delta_feat.py     # Module implementation of the delta frontend. Mainly used for ASR training when taking the first and second derivatives of the log-mel spectrogram.
        /norm                  # Normalization modules
            /feat_norm.py      # Module implementation of per-channel feature normalization.
        /augment               # Data augmentation modules
            /specaug.py        # Module implementation of SpecAugment. Mainly used for ASR training.
        /encoder               # Model encoder modules
            /asr.py            # Module implementation of ASR encoders. Used for ASR model construction.
            /tts.py            # Module implementation of TTS encoders. Used for TTS model construction.
        /decoder               # Model decoder modules
            /asr.py            # Module implementation of ASR autoregressive decoders. Used for autoregressive ASR model construction.
            /tts.py            # Module implementation of TTS autoregressive decoders. Used for autoregressive TTS model construction.
        /prenet                # Model prenet modules in front of encoders and decoders
            /conv1d.py         # Module implementation of the 1D convolutional prenet.
            /conv2d.py         # Module implementation of the 2D convolutional prenet.
            /embed.py          # Module implementation of the token embedding prenet.
            /linear.py         # Module implementation of the stacked linear prenet.
            /spk_embed.py      # Module implementation of the speaker embedding prenet.
        /postnet               # Model postnet modules behind encoders and decoders
            /conv1d.py         # Module implementation of the 1D convolutional postnet.
            /token.py          # Module implementation of the token prediction postnet.
        /transformer           # Transformer-related modules
            /encoder.py        # Module implementation of Transformer encoder layers. Used for encoder construction of ASR and TTS models.
            /decoder.py        # Module implementation of Transformer autoregressive decoder layers. Used for decoder construction of autoregressive ASR and TTS models.
            /pos_enc.py        # Module implementation of positional encoding layers.
            /attention.py      # Module implementation of multi-head attention layers.
            /feed_forward.py   # Module implementation of point-wise feed-forward layers.
👆Back to the table of contents
API Document
Non-overridable backbone functions:
1. speechain.module.abs.Module.__init__
Overridable interface functions:
1. speechain.module.abs.Module.module_init
2. speechain.module.abs.Module.forward
3. speechain.module.abs.Module.recover
4. speechain.module.abs.Module.reset_parameters
5. speechain.module.abs.Module.get_recordable_para
speechain.module.abs.Module.__init__(self, input_size, distributed, **module_conf)
- Description:
  This initialization function is shared by all Module subclasses. There are two built-in member variables: input_size and output_size. input_size is the last dimension of the input tensor, while output_size is the last dimension of the output tensor.
  These two member variables serve as the socket and plug used to communicate with the front and back Module objects in a Model object. You can use self.input_size in your module_init() implementation to initialize your module and assign the output data dimension to self.output_size.
  Note: Using these two member variables is not mandatory, but it is a convenient way to initialize your module.
- Arguments:
  - input_size: int = None
    The last dimension of the tensor coming from the front Module object. If not given, this argument is None.
  - distributed: bool = False
    Whether the Model object that this Module object belongs to is distributed across multiple GPUs.
  - **module_conf:
    The arguments used by module_init() for your customized Module initialization.
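For instance, a parent Module could use these two members to wire its sub-modules together. The sketch below reuses the hypothetical ToyLinearModule from the introduction; all names and sizes are invented for illustration:

```python
import torch
from speechain.module.abs import Module

class ToyEncoder(Module):
    # Hypothetical parent Module; ToyLinearModule is the toy class defined
    # in the sketch at the top of this page.
    def module_init(self, hidden_size: int = 256, proj_size: int = 80):
        # the sub-module's socket is this Module's own input dimension
        self.prenet = ToyLinearModule(input_size=self.input_size,
                                      hidden_size=hidden_size)
        # the sub-module's plug (output_size) feeds the next member
        self.proj = torch.nn.Linear(self.prenet.output_size, proj_size)
        # the last member decides this Module's output dimension
        self.output_size = proj_size

    def forward(self, feat: torch.Tensor):
        return self.proj(self.prenet(feat))
```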
speechain.module.abs.Module.module_init(self, **module_conf)
- Description:
  Abstract interface function for the customized initialization of each Module subclass. This interface function must be overridden by your implementation.
- Arguments:
  - **module_conf:
    The arguments used for customized Module initialization. For more details, please refer to the docstring of your target Module subclass.
speechain.module.abs.Module.forward(self, **kwargs)
- Description:
  This abstract interface function is the customized implementation of torch.nn.Module.forward() used during model forward calculation. This interface function must be overridden by your implementation.
- Arguments:
  - **kwargs:
    The input arguments for module forward calculation. For more details, please refer to the docstring of forward() of your target Module subclass.
- Return:
  Module forward calculation results. For more details, please refer to the docstring of forward() of your target Module subclass.
speechain.module.abs.Module.recover(self, **kwargs)
- Description:
  This abstract interface function recovers the module forward calculation results back to the input data. It can be considered the reverse process of forward(). This interface function does not have to be overridden.
- Arguments:
  - **kwargs:
    The forward calculation results to be recovered. For more details, please refer to the docstring of recover() of your target Module subclass.
- Return:
  The recovered data, or closely recovered data (sometimes forward() is not fully invertible). For more details, please refer to the docstring of recover() of your target Module subclass.
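As a toy illustration of a forward()/recover() pair (the module below and its scale argument are invented for this example, not taken from the toolkit):

```python
import torch
from speechain.module.abs import Module

class ToyScale(Module):
    # Hypothetical Module whose recover() is the exact inverse of forward().
    def module_init(self, scale: float = 2.0):
        self.scale = scale
        # the feature dimension is unchanged by this module
        self.output_size = self.input_size

    def forward(self, feat: torch.Tensor):
        return feat * self.scale

    def recover(self, feat: torch.Tensor):
        # reverse process of forward(); fully recoverable in this toy case
        return feat / self.scale
```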
speechain.module.abs.Module.reset_parameters(self)
- Description:
  This abstract interface function initializes the customized parameters of the Module subclass, if any. Some Module subclasses have customized parameters with specific initialization functions. If your Module implementation has customized parameters that you want to initialize yourself, put the initialization logic in this interface function. This interface function does not have to be overridden.
  Note: Don't forget to add self.default_init_modules.append(YourModule) in model_init() of your Model.
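A sketch of how this could look, assuming a hypothetical module whose convolution weights should receive Xavier initialization (the class, the sizes, and the Xavier choice are all illustrative):

```python
import torch
from speechain.module.abs import Module

class ToyConvModule(Module):
    # Hypothetical Module with its own parameter initialization scheme.
    def module_init(self, out_channels: int = 64):
        self.conv = torch.nn.Conv1d(self.input_size, out_channels, kernel_size=3)
        self.output_size = out_channels

    def forward(self, feat: torch.Tensor):
        # Conv1d expects (batch, channel, time), so transpose around the convolution
        return self.conv(feat.transpose(1, 2)).transpose(1, 2)

    def reset_parameters(self):
        # customized initialization for this module's parameters
        torch.nn.init.xavier_uniform_(self.conv.weight)
        torch.nn.init.zeros_(self.conv.bias)

# and, inside model_init() of your Model (as the note above says):
#     self.default_init_modules.append(ToyConvModule)
```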
speechain.module.abs.Module.get_recordable_para(self)
- Description:
  This function returns the parameters of the module that you want to record as part of the step information. If you want to record the values of your module's customized parameters:
  - When it is a leaf (no Module members) in the nested Module tree of the model, please override this function and return the parameter values in a Dict. For an example, you can refer to ${SPEECHAIN_ROOT}/speechain/module/transformer/pos_enc.py.
  - When it is a non-leaf (with Module members) in the nested Module tree of the model, please follow the sketch below.
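A sketch of that pattern (the class and its temperature member are invented for illustration; the loop mirrors the default non-leaf behavior described in the Return section below):

```python
from speechain.module.abs import Module

class ToyParentModule(Module):
    # Hypothetical non-leaf Module; 'temperature' stands in for a real
    # recordable parameter of your own module.
    def get_recordable_para(self):
        output = {}
        # gather the recordable parameters of all member Module objects
        for name, member in self.named_children():
            if isinstance(member, Module):
                output[name] = member.get_recordable_para()
        # attach this module's own recordable value
        output['temperature'] = self.temperature
        return output
```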
- Return: Dict or None
For the leaf module, the default implementation returns None;
For the non-leaf module, the default implementation returns a Dict containing names and recordable parameters of its member modules.