Neural Edit Operations for Biological Sequences
Satoshi Koide · Keisuke Kawano · Takuro Kutsuna

Tue Dec 04 07:45 AM -- 09:45 AM (PST) @ Room 517 AB #125

The evolution of biological sequences, such as proteins or DNA, is driven by three basic edit operations: substitution, insertion, and deletion. Motivated by the recent progress of neural network models for biological tasks, we implement two neural network architectures that can handle such edit operations. The first proposal is edit-invariant neural networks, based on differentiable Needleman-Wunsch algorithms. The second is the use of deep CNNs with concatenations. Our analysis shows that CNNs can recognize star-free regular expressions, and that deeper CNNs can recognize more complex regular expressions, including the insertion/deletion of characters. The experimental results for the protein secondary structure prediction task suggest the importance of insertion/deletion. The test accuracy on the widely used CB513 dataset is 71.5%, which is 1.2 points better than the current best result for non-ensemble models.
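
To make the idea of a differentiable Needleman-Wunsch score concrete, the following is a minimal sketch, not the authors' implementation. It assumes one common way of smoothing the alignment recurrence: replacing the hard maximum with a log-sum-exp soft maximum controlled by a temperature `gamma`. The function names (`smooth_max`, `nw_score`), the scoring values (`match`, `mismatch`, `gap`), and the linear gap penalty are all illustrative assumptions that do not come from the abstract.

```python
import math


def smooth_max(values, gamma):
    """Soft maximum: gamma * log(sum(exp(v / gamma))).

    Falls back to the ordinary max when gamma == 0 and approaches it
    as gamma -> 0. Subtracting the max first keeps exp() from overflowing.
    """
    m = max(values)
    if gamma == 0:
        return m
    return m + gamma * math.log(sum(math.exp((v - m) / gamma) for v in values))


def nw_score(x, y, match=1.0, mismatch=-1.0, gap=1.0, gamma=0.0):
    """Global alignment score of sequences x and y (assumed linear gap penalty).

    With gamma == 0 this is the classic Needleman-Wunsch dynamic program.
    With gamma > 0 every max is replaced by a soft maximum, so the score
    becomes a smooth function of the substitution and gap scores.
    """
    n, m = len(x), len(y)
    F = [[0.0] * (m + 1) for _ in range(n + 1)]
    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, n + 1):
        F[i][0] = -gap * i
    for j in range(1, m + 1):
        F[0][j] = -gap * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = smooth_max(
                [
                    F[i - 1][j - 1] + s,  # substitution or match
                    F[i - 1][j] - gap,    # deletion (gap in y)
                    F[i][j - 1] - gap,    # insertion (gap in x)
                ],
                gamma,
            )
    return F[n][m]


if __name__ == "__main__":
    print(nw_score("GATTACA", "GATTACA"))            # identical sequences
    print(nw_score("GATTACA", "GATACA"))             # one deleted character
    print(nw_score("GATTACA", "GATACA", gamma=0.1))  # smoothed score
```

Under these assumed scores, the hard (`gamma=0`) score is 7.0 for the identical pair and 5.0 for the pair that differs by one deletion (six matches minus one gap). The smoothed score is never lower than the hard score and converges to it as `gamma` shrinks. In a neural network, the fixed `match`/`mismatch` values would typically be replaced by learned similarity scores, which is where differentiability matters.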

Author Information

Satoshi Koide (Toyota Central R&D Labs., Inc.)
Keisuke Kawano (Toyota Central R&D Labs., Inc.)
Takuro Kutsuna (Toyota Central R&D Labs., Inc.)
