A Method for Extracting Optimal Sequence Related to Biological Activity

Summary "This software can extract the optimal sequence relating to biological activity, if we have the data composed of sequences and corresponding activity, regardless of definition of characteristic sequence (cited from the original site). "
Data type DNA-motif


Summary Aln is a program for aligning a pair of nucleotide or amino acid sequences or alignments. Aln can even align a nucleotide sequence and a single or a group of protein sequences. This can be used to predict eukaryotic gene structures (protein-coding exons) based on sequence homology with known protein sequences.
Data type "DNA, amino acid sequences"


Summary ALNGG detects protein-coding genes by comparing the genomes of two species.
Data type "DNA-sequence, Genome"


Summary "ASIAN is a tool for automatically inferring the relationships between objects from data that include redundant information, e.g. expression profiles measured for a large number of genes under various conditions. The tool combines cluster analysis, regression analysis, and graphical Gaussian modeling. By inputting your raw data, you can obtain several relationships between objects: the correlation, the grouping, the group number, and the network graph."
Data type Gene expression profile


Summary "COSMOS detects somatic structural variations (SVs) from whole-genome short-read sequences. It is also applicable to de novo SV detection in a family trio."
Data type DNA-sequence


Summary Discovering combinatorial regulation is key to understanding complex gene regulation machinery. Combining this script (chip2lamp) with the statistical analysis method LAMP makes it possible to find statistically significant combinations by integrating ChIP-seq and RNA-seq data. It accepts MACS1/2 results (from the ChIP-seq peak caller) and Cuffdiff results (from RNA-seq).
Data type "ChIP-sequence, RNA-sequence"


Summary "DNemulator is a package for simulating DNA sequencing errors, polymorphisms, cytosine methylation and bisulfite conversion."
Data type DNA-sequence
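
As an illustration of one of the effects listed above, bisulfite conversion turns unmethylated cytosines into bases that are read as thymine, while methylated cytosines stay unchanged. The following is a minimal sketch of that idea only; it is not DNemulator's code or interface, and the function name `bisulfite_convert` and its arguments are hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Return the sequence as read after idealized bisulfite conversion.

    Assumption: conversion is complete and error-free, and
    methylated_positions holds 0-based indices of protected cytosines.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and index not in methylated else base
        for index, base in enumerate(sequence)
    )


if __name__ == "__main__":
    # The cytosine at index 1 is methylated, so it is left as C.
    print(bisulfite_convert("ACGCC", {1}))
```

A real simulator would also model incomplete conversion, strand, and sequencing errors, which this sketch deliberately omits.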


Summary GUPPY is a program that visualizes annotation data for genetic sequences in a graphical layout.
Data type DNA-sequence


Summary "GeneDecoder is a gene-finding technology for eukaryotes, based on hidden Markov models (HMMs). Using dynamic programming and statistical models trained on annotated genome sequences, the algorithm divides the input nucleic acid sequence into meaningful segments."
Data type "DNA-sequence, Eukaryote gene"


Summary "H-InvDB Enrichment Analysis Tool (HEAT) is a data-mining tool for automatically identifying features specific to a given human gene set. HEAT searches for H-InvDB annotations that are significantly enriched in a user-defined gene set, as compared with the entire set of H-InvDB representative transcripts. This technique is called Gene Set Enrichment Analysis (GSEA), and is widely used in analyzing the results of microarray experiments. HEAT uses Fisher's exact test for its statistical tests."
Data type Annotation
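
The enrichment test described above can be sketched as a one-sided Fisher's exact test, which sums hypergeometric probabilities over outcomes at least as extreme as the observed one. This is a generic illustration, not HEAT's implementation; the function name `enrichment_p_value` and its parameters are assumptions made for the example.

```python
from math import comb


def enrichment_p_value(hits_in_set, set_size, hits_in_background, background_size):
    """One-sided Fisher's exact test for over-representation.

    Returns the probability of seeing at least `hits_in_set` annotated
    genes when `set_size` genes are drawn without replacement from a
    background of `background_size` genes, of which `hits_in_background`
    carry the annotation.
    """
    not_annotated = background_size - hits_in_background
    max_hits = min(set_size, hits_in_background)
    # Number of ways to draw exactly k annotated genes, for each k in the tail.
    tail = sum(
        comb(hits_in_background, k) * comb(not_annotated, set_size - k)
        for k in range(hits_in_set, max_hits + 1)
    )
    return tail / comb(background_size, set_size)


if __name__ == "__main__":
    # Hypothetical numbers: 8 of 20 selected genes carry an annotation
    # that 50 of 1000 background genes carry.
    print(enrichment_p_value(8, 20, 50, 1000))
```

In practice a tool testing many annotations at once would also correct the resulting p-values for multiple testing; that step is left out here.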


Summary "LAMPLINK detects statistically significant epistatic interactions of two or more SNPs from GWAS data. It can be used in the same way as the widely used GWAS analysis software PLINK, but has additional options for detecting epistatic interactions with LAMP, a multiple testing procedure for discovering combinatorial effects."
Data type DNA-sequence


Summary "LAST is software for comparing and aligning sequences, typically DNA or protein sequences. LAST is similar to BLAST, but it copes better with very large amounts of sequence data. It can also report probabilities for every pair of aligned letters, indicating the reliability of each pairing."
Data type "DNA-sequence, RNA, protein, user-defined alphabet"


Summary "MAFFT is a multiple sequence alignment program for unix-like operating systems. It offers a range of multiple alignment methods, L-INS-i (accurate; for alignment of
Data type DNA-sequence alignment


Summary "Motif Distribution Viewer (MDV) is a web tool for visualizing the distribution of various motifs around transcription start sites (TSS) on a user-defined set of promoter sequences. The tool can be used on the original site or downloaded for local use (cited from the original site)."
Data type DNA-motif


Summary "This software is a framework for predicting true reads from next-generation sequencing data. It generates several key features: observed count, estimated true count, log-likelihood with entropy penalty, log-likelihood ratio, expectation matching score, and specific correction coefficient."
Data type DNA-sequence NGS


Summary PHMMTS (Pair Hidden Markov Models on Tree Structures) aligns a sequence of unknown secondary structure to a sequence of known secondary structure (cited from the original site).
Data type RNA


Summary "PMID-Extractor allows a user to obtain PubMed IDs (PMIDs) from PDF or text files of journal papers, using the Digital Object Identifiers (DOIs, http://en.wikipedia.org/wiki/Digital_object_identifier) or text information (e.g. titles) on the first page of each file. PMIDs are required to specify the user's interests when starting to use PubMedScan, a paper recommender; supplying them is the main use of PMID-Extractor."
Data type Journal


Summary "PRRN is a multiple sequence alignment program based on a doubly nested randomized iterative method. PRRN accepts either nucleotide or protein sequences. At each iterative step, PRRN repeatedly applies pairwise group-to-group alignment to improve the overall weighted sum-of-pairs score, where the pair weights correct for uneven representation of the sequences to be aligned. The strategies of PRRN work most effectively for refining a crude alignment obtained by other, more rapid methods, e.g. progressive alignment. (Summarized from the original site)"
Data type "DNA, amino acid sequences"

Prediction program

Summary "This gene prediction program searches the database for candidate deletable regions: regions that contain no essential genes or synthetic lethal genes and whose deletion is not accompanied by adverse genetic traits such as delayed growth or the emergence of deletion mutations. (cited from the project reports)"
Data type DNA-sequence


Summary RECOUNT is software for estimating the true count of Solexa reads based on a probabilistic model. RECOUNT takes the reads and the quality scores provided by Solexa as its input. A typical application of this software is transcriptome or metagenomic expression analysis (cited from the original site).
Data type DNA-sequence, Next Generation Sequencing Data

SCARNA Local Multiple

Summary "SCARNA_LM (SCARNA Local Multiple) is a local multiple aligner for RNA sequences. It is based on a discriminative pairwise alignment model that incorporates secondary structure features in the form of base pairing probabilities calculated by Rfold, and it uses an efficient local multiple alignment construction procedure proposed by Phuong et al. for local multiple alignment of protein sequences (cited from the original site)."
Data type RNA alignment


Summary "We developed SNP-system, which is unified under a common interface, and released it on the web. The website ( http://www.h-invitational.jp/snps/ ) was closed, and on 22 March 2013 the archive opened in the MEDALS Archive."
Data type DNA-sequence


Summary Spaln is a stand-alone program that maps and aligns a set of cDNA or protein sequences onto a whole genomic sequence in a single job.
Data type Comparative genomics


Summary SlideSort is a fast and exact method that finds all similar pairs in a string pool in terms of edit distance (cited from the original site).
Data type DNA-sequence, Protein-sequence
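
To make the problem statement concrete, the sketch below finds every pair of strings within a given edit distance by brute force. It is a naive quadratic baseline, not the SlideSort algorithm, and it only illustrates what "all similar pairs in terms of edit distance" means; the function names and the `max_distance` parameter are assumptions for the example.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance via row-by-row dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def similar_pairs(strings, max_distance):
    """Return index pairs (i, j), i < j, whose edit distance is at most max_distance."""
    return [
        (i, j)
        for (i, s), (j, t) in combinations(enumerate(strings), 2)
        if edit_distance(s, t) <= max_distance
    ]


if __name__ == "__main__":
    pool = ["ACGT", "ACGA", "TTTT"]
    print(similar_pairs(pool, 1))
```

The baseline compares every pair, so its cost grows quadratically with the pool size; avoiding that cost is the reason a dedicated method is needed for large string pools.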

Web page checker

Summary "A web page checker. This tool can monitor web pages, group web pages, highlight changes in a page, and send email reports."
Data type Web contents


Summary "FASTA Perl Loop, pronounced ""fast apple"", is a tool for processing multifasta data. Its companion program fastqpl, pronounced ""fast Q-ple"", handles fastq format data."
Data type DNA-sequence


Summary Paraclu finds clusters in data attached to sequences (cited from the original site).
Data type DNA-sequence


Summary "The seg suite provides tools for manipulating segments and alignments. It uses a format called ""seg"". This program converts segments or alignments from various formats to seg."
Data type "DNA-sequence, RNA-sequence"


Summary "tantan is a tool for finding cryptic repeats (low complexity and short-period tandem repeats) in DNA, RNA, and protein sequences. The aim of tantan is to prevent false predictions when searching for homologous regions between two sequences. You can get it from the archive page (cited from the original site)."
Data type "DNA-sequence, RNA, Amino acid"