Context and modularization in recurrent neural networks for temporal sequence learning (Contexto e modularização em redes neurais recorrentes para aprendizagem de seqüências temporais)
Date
2017-11-15
Publisher
Biblioteca Digital de Teses e Dissertações da USP
Universidade de São Paulo
Escola de Engenharia de São Carlos
Abstract
Description
This work presents a modular neural system that processes spatial and temporal context information separately for the task of reproducing temporal sequences. The development of the system drew on recurrent neural networks, stochastic models, modular neural systems, and context information processing. Three models with distinct approaches to learning temporal sequences were then studied: a partially recurrent neural network, an example of a modular neural system, and a stochastic model based on hidden Markov model theory. Building on these studies and models, this research proposes a system formed by two distinct successive modules. A feedforward network (the spatial context estimator module) processes the spatial context, identifying the sequence to be reproduced and supplying a context prototype to the second module. The second module is a partially recurrent network (the temporal sequence reproduction module) that learns the temporal context information and reproduces at its outputs the sequence identified by the first module. To this end, this master's thesis applies the Gibbs distribution at the output of the spatial context module so that it provides spatial context probabilities, indicating the module's degree of certainty and allowing special procedures to be used in cases of doubt. The neural system was tested on sets containing open and closed trajectories with different levels of ambiguity and complexity. Two distinct situations were evaluated: (a) the system's ability to reproduce trajectories from trained initial points; and (b) the system's ability to generalize, reproducing trajectories from initial or final points in untrained situations.
Situation (b) is a difficult problem for neural networks because of the lack of temporal context, which is essential for sequence reproduction. Experiments compared the performance of the proposed modular system with that of a partially recurrent network operating alone and with a modular neural system (TOTEM). The results suggest that the proposed system generalized significantly better, with no deterioration in its ability to reproduce trained sequences. These results were obtained with a simpler system than TOTEM.
This work presents a new modular neural system that handles spatial and temporal context information separately during temporal sequence processing. Given the initial and final states of a sequence, the neural system can reproduce the whole sequence linking these points. The proposed model draws on concepts from recurrent neural networks, stochastic models, modular neural systems, and context information processing. Three models based on distinct approaches to learning temporal sequences were particularly important in this work: a partially recurrent neural network, a modular neural system, and a stochastic model based on hidden Markov model theory. This master's thesis presents a new modular neural system composed of two supervised neural networks. The first module, a feedforward neural network (the spatial context estimator), identifies the sequence to be reproduced and provides a spatial context prototype to the second module. The second module is a partially recurrent neural network that reproduces the sequence identified by the first. Moreover, the first module applies the Gibbs distribution to the spatial context estimator outputs to quantify the uncertainty of the sequence identification. With these probability values, special procedures may be used whenever doubt arises. The proposed system was evaluated in different domains containing open and closed sequences, with levels of complexity that vary with the space dimension and the ambiguity of the trained trajectories. The system was evaluated on its ability to reproduce a sequence whenever versions of the initial and final points are provided. A version may be exactly the points seen during training or points trained as intermediate states. The latter is considered a difficult task for recurrent neural networks because of the lack of temporal context information.
Experiments compared the performance of the proposed modular neural system with that of a recurrent neural network alone and of a modular neural system (a model called TOTEM) for sequence reproduction. The results suggest that the proposed modular neural system generalizes significantly better than the recurrent neural network, without deteriorating its ability to reproduce sequences starting from trained situations. The neural system may reproduce the results of TOTEM with a simpler topology.
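The abstract does not state the exact form of the Gibbs distribution used at the spatial context estimator outputs. The sketch below assumes the standard Gibbs (softmax) form, p_i = exp(a_i / T) / sum_j exp(a_j / T), and shows how the resulting probabilities could flag a doubtful identification. The function names, the `temperature` parameter, and the `doubt_threshold` value are hypothetical and are not taken from the thesis.

```python
import math


def gibbs_probabilities(activations, temperature=1.0):
    """Turn raw output activations into probabilities (assumed softmax form)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [a / temperature for a in activations]
    peak = max(scaled)  # subtracting the maximum keeps exp() from overflowing
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def identify_sequence(activations, doubt_threshold=0.6, temperature=1.0):
    """Return (index of most probable sequence, its probability, doubt flag).

    doubt_threshold is an illustrative value: when the winning probability
    falls below it, the caller would switch to a special procedure.
    """
    probs = gibbs_probabilities(activations, temperature)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best], probs[best] < doubt_threshold


if __name__ == "__main__":
    # One output clearly dominates: the identification is not in doubt.
    print(identify_sequence([4.0, 0.5, 0.2]))
    # Outputs are close together: the identification is flagged as doubtful.
    print(identify_sequence([1.0, 0.9, 0.8]))
```

In this sketch a clearly dominant output yields a high winning probability, while near-equal outputs spread the probability mass and raise the doubt flag, which is the situation in which the abstract says special procedures may be used.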
Keywords
Context, Context information, Modular neural systems, Recurrent neural networks, Temporal sequences