CapsNet for protein post-translational modification (PTM) site prediction. It is implemented with the deep learning library Keras 2.1.1 and the TensorFlow backend.
Training and testing data
The 10-fold cross-validation training and testing data for each PTM used in the paper are in the folder all_PTM_raw_data. For each fold X, each subfolder contains the annotated training sequences (metazoa_sequence_annotated_training_X.fasta), from which sequences with more than 30% sequence identity to the testing set were removed; the corresponding annotated testing sequences (metazoa_sequences_cross_testing_annotated_X.fasta); and the same testing sequences without annotation (metazoa_sequences_cross_testing_X.fasta).
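The file naming scheme above can be expressed as a small helper. This is only a sketch: the function name `fold_files` and the dictionary keys are hypothetical, and because the README does not state how folds are numbered or what the subfolders are called, both are left as parameters.

```python
import os


def fold_files(ptm_dir, fold):
    """Return the three FASTA paths for one cross-validation fold.

    ptm_dir: a PTM subfolder inside all_PTM_raw_data (name chosen by the caller).
    fold:    the value that replaces X in the file names (numbering not assumed).
    """
    return {
        # Annotated training sequences, already filtered at 30% identity
        "train": os.path.join(
            ptm_dir, "metazoa_sequence_annotated_training_%s.fasta" % fold),
        # Testing sequences with PTM annotation
        "test_annotated": os.path.join(
            ptm_dir, "metazoa_sequences_cross_testing_annotated_%s.fasta" % fold),
        # The same testing sequences without annotation
        "test_unannotated": os.path.join(
            ptm_dir, "metazoa_sequences_cross_testing_%s.fasta" % fold),
    }
```

Calling `fold_files(some_subfolder, X)` for each fold gives the paths to pass to training and prediction, where `some_subfolder` is a placeholder for one of the PTM subfolders.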
Installation has been tested in Linux and Mac OS X with Python 2.7.
Since the package is written in Python 2.7, Python 2.7 with the pip tool must be installed first. The package depends on numpy, scipy, pandas, h5py and keras (version 2.1.1). Install these packages with pip before running the code.
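After installing the packages with pip (for Keras, the version pin would be written as `keras==2.1.1`), a quick import check can confirm that nothing is missing. The following is a minimal sketch, not part of the package; `missing_dependencies` is a hypothetical helper name.

```python
import importlib

# Dependencies listed in this README
REQUIRED = ["numpy", "scipy", "pandas", "h5py", "keras"]


def missing_dependencies(names=REQUIRED):
    """Return the subset of names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The package is absent, or one of its own imports is
            missing.append(name)
    return missing
```

An empty list from `missing_dependencies()` means every listed package can be imported. The installed Keras version can then be compared against 2.1.1 through `keras.__version__`.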
This code targets the TensorFlow backend, so the Keras backend must be set to TensorFlow. If you have run Keras at least once, you will find the Keras configuration file at $HOME/.keras/keras.json. If it isn't there, you can create it. In that file, set the "backend" field to "tensorflow".
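The backend change can also be scripted. The sketch below assumes the usual Keras 2 default fields for a newly created keras.json (an assumption, not taken from this repository) and keeps any other settings already present in the file. `set_keras_backend` is a hypothetical helper name.

```python
import json
import os

# Usual Keras 2 defaults, used only when the file does not exist yet (assumption)
DEFAULTS = {
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow",
}


def set_keras_backend(config_path, backend="tensorflow"):
    """Create or update a Keras config file so that it selects the given backend."""
    config = dict(DEFAULTS)
    if os.path.isfile(config_path):
        # Keep whatever the user already configured
        with open(config_path) as handle:
            config.update(json.load(handle))
    config["backend"] = backend

    directory = os.path.dirname(config_path)
    if directory and not os.path.isdir(directory):
        os.makedirs(directory)
    with open(config_path, "w") as handle:
        json.dump(config, handle, indent=4)
    return config
```

For the location named above, the call would be `set_keras_backend(os.path.join(os.path.expanduser("~"), ".keras", "keras.json"))`.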
If you want to use a GPU, you also need to install CUDA and cuDNN; refer to their websites for instructions. Running on a CPU is suitable only for prediction, not for training.
For custom training:
python train_models.py -input [custom training data in fasta format] -output-prefix [prefix of pre-trained model] -residue-types [custom specified residue types]
Custom prediction from custom general models and custom kinase-specific models:
python predict.py -input [custom prediction data in fasta format] -model-prefix [prefix of pre-trained model] -output [custom specified prefix for the prediction results]
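When training and prediction are driven from another script, the two command lines above can be assembled as argument lists. This is a sketch under stated assumptions: the helper names are hypothetical, only the flags shown in this README are used, and the residue types are passed through as an opaque string because their format is not described here.

```python
def train_command(fasta, output_prefix, residue_types, python="python"):
    """Build the argument list for train_models.py."""
    return [python, "train_models.py",
            "-input", fasta,
            "-output-prefix", output_prefix,
            "-residue-types", residue_types]


def predict_command(fasta, model_prefix, output, python="python"):
    """Build the argument list for predict.py."""
    return [python, "predict.py",
            "-input", fasta,
            "-model-prefix", model_prefix,
            "-output", output]
```

Either list can be passed to `subprocess.check_call` from the repository root. The value given as `output_prefix` during training is the one to reuse as `model_prefix` for prediction.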
Please cite the following paper if you use this code: Duolin Wang, Yanchun Liang, Dong Xu*. Capsule Network for Protein Post-translational Modification Site Prediction. Bioinformatics, 2018.