The following article implements the Multivariate LSTM-FCN architecture in PyTorch. For a review of other algorithms that can be used in time series classification, check my previous review article.
Network Architecture
LSTM block
The LSTM block is composed mainly of an LSTM (alternatively Attention LSTM) layer, followed by a Dropout layer.
A dimension shuffle layer is used at the beginning of this block in case the number of time steps N (the sequence length of the LSTM layer) is greater than the number of variables M.
This trick improves the efficiency of the LSTM layer, since it will only require M time steps to process N variables each, instead of N time steps to process M variables each when no shuffle is applied.
In PyTorch, the LSTM block looks like the following:
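This is a minimal sketch; the class and argument names are illustrative, and the hidden size of 128 and dropout rate of 0.8 are assumed defaults rather than values taken from the original code.

```python
import torch.nn as nn

class LSTMBlock(nn.Module):
    # LSTM followed by dropout, with the dimension shuffle applied
    # when N (time steps) > M (variables).
    def __init__(self, num_variables, num_timesteps, hidden_size=128, dropout=0.8):
        super().__init__()
        self.shuffle = num_timesteps > num_variables
        # After the shuffle, each of the M "time steps" carries N features.
        input_size = num_timesteps if self.shuffle else num_variables
        self.lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                            batch_first=True)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # x: (batch, num_timesteps, num_variables)
        if self.shuffle:
            # LSTM now runs M time steps over N variables each.
            x = x.transpose(1, 2)
        out, _ = self.lstm(x)
        out = out[:, -1, :]  # last hidden state summarizes the sequence
        return self.dropout(out)
```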
FCN block
The core component of the fully convolutional block is a convolutional block that contains a 1D convolutional layer, followed by batch normalization and a ReLU activation.
In PyTorch, a convolutional block looks like the following:
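A minimal sketch of this block; the class name and the padding="same" choice (which keeps the sequence length unchanged) are my assumptions:

```python
import torch.nn as nn

class ConvBlock(nn.Module):
    # Conv1d -> BatchNorm -> ReLU
    def __init__(self, in_channels, out_channels, kernel_size):
        super().__init__()
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size,
                              padding="same")
        self.bn = nn.BatchNorm1d(out_channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        # x: (batch, channels, num_timesteps)
        return self.relu(self.bn(self.conv(x)))
```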
The fully convolutional block contains three of these convolutional blocks, used as a feature extractor. Then it uses a global average pooling layer to generate channel-wise statistics.
In PyTorch, an FCN block would look like:
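A sketch reusing the ConvBlock above; the 128-256-128 filter counts and 8-5-3 kernel sizes follow the configuration commonly used for LSTM-FCN and should be treated as assumptions here:

```python
import torch.nn as nn

class FCNBlock(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        # Three convolutional blocks acting as a feature extractor.
        self.conv1 = ConvBlock(in_channels, 128, kernel_size=8)
        self.conv2 = ConvBlock(128, 256, kernel_size=5)
        self.conv3 = ConvBlock(256, 128, kernel_size=3)
        # Global average pooling generates channel-wise statistics.
        self.gap = nn.AdaptiveAvgPool1d(1)

    def forward(self, x):
        # x: (batch, num_variables, num_timesteps)
        x = self.conv3(self.conv2(self.conv1(x)))
        return self.gap(x).squeeze(-1)  # (batch, 128)
```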
LSTM-FCN
Finally, the previous blocks are put together to construct the LSTM-FCN architecture: the outputs of the two blocks are concatenated and passed through a softmax activation to generate the final output.
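A sketch combining the two blocks above, under the same assumed names and sizes:

```python
import torch
import torch.nn as nn

class LSTMFCN(nn.Module):
    def __init__(self, num_variables, num_timesteps, num_classes):
        super().__init__()
        self.lstm_block = LSTMBlock(num_variables, num_timesteps, hidden_size=128)
        self.fcn_block = FCNBlock(num_variables)
        # Classifier over the concatenated LSTM (128) and FCN (128) features.
        self.fc = nn.Linear(128 + 128, num_classes)

    def forward(self, x):
        # x: (batch, num_timesteps, num_variables)
        lstm_out = self.lstm_block(x)
        fcn_out = self.fcn_block(x.transpose(1, 2))  # FCN expects channels first
        out = torch.cat([lstm_out, fcn_out], dim=1)
        return torch.softmax(self.fc(out), dim=1)
```

Note that when training with nn.CrossEntropyLoss, one would return the raw logits self.fc(out) instead of applying the softmax explicitly.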
Training
Part II discusses the training setup of the LSTM-FCN architecture using different datasets.