MEDALS METI Life science integrated database portal site

Acacia EST database

Summary	The database contains 6253 EST sequences of Acacia, an economic plant. The database has a BLAST search function.
Data type	DNA-sequence EST

ARCHAIC

Summary	The database provides DNA sequences of archaebacterial genomes and their annotation. The aim of ARCHAIC is to analyze archaebacterial genomic DNA sequences (including originally determined ones) in order to understand the overall organization of the genomes and to compare different species. Original informatics methods were developed for identifying genes, pseudogenes, and operons, essentially on the basis of a statistical analysis of transcription and translation signals. Data for three archaea are available (Pyrococcus sp. OT3, Thermoplasma volcanium GSS1, Archaeoglobus fulgidus).
Data type	DNA-sequence, genome

ASTRA

Summary	ASTRA is a database that classifies alternative splicing patterns and alternative transcriptional initiation patterns in human, mouse, D. melanogaster, C. elegans, A. thaliana, and O. sativa. (cited from the original site)
Data type	DNA-sequence

CGH Data Base

Summary	CGH Data Base is a database of array CGH (comparative genomic hybridization) analysis data for various cancer cell lines. Products of the NEDO project are included in CGH Data Base.
Data type	Disease-cancer

Database for genetic engineering of microalgae

Summary	Useful information for genetic engineering of microalgae is collected in the database. The information is categorized as follows: 1. host-vector (structure of vector, promoter, research institute, papers); 2. gene transfer technique (organism, method, papers); 3. production of useful materials (product, host, transferred genes, papers); 4. inhibitor-tolerant mutants (organism, inhibitor, mutant gene, papers); 5. synthetic transposon; 6. selective marker. (from original site)
Data type	DNA-regration-region

Database of genomes and transcriptional regulations for filamentous fungi

Summary	The filamentous fungus Aspergillus oryzae plays an important role in the Japanese sake and fermentation industry, and many industrial enzymes are produced by this organism. We hope this database will provide you with useful information for your research and industrial applications.
Data type	DNA-sequence

Distribution of Human cDNA Clones

Summary	You can search for human full-length cDNA clones that are distributed by NBRC (NITE Biological Resource Center). Clone name, keyword, and BLAST searches are available.
Data type	DNA-sequence

DoBISCUIT

Summary	Secondary metabolites such as those produced by bacteria have pharmacologically important activities and attract attention as lead compounds and/or candidates for drug development. Although many INSDC entries describe either complete or partial sequences of secondary metabolite biosynthetic gene clusters, the gene descriptions in these entries vary widely. DoBISCUIT enables easy access to comprehensive information on biosynthetic gene clusters and browsing of standardized, up-to-date gene descriptions. DoBISCUIT will thus serve as a useful reference for the analysis of secondary metabolite synthesis.
Data type	Annotation | secondary metabolite

DOGAN

Summary	DOGAN provides microbial sequence data, gene information (ORFs), gene maps, motifs, the papers on which gene predictions are based, and proteome analysis results. The information is published so that annotations can be verified, and re-annotation is carried out over time. The site also offers a BLAST homology search service and accepts orders for DNA clones.
Data type	DNA-sequence, Proteome

Evola

Summary	Evola (Evolutionary annotation database) is a database providing ortholog information for H-InvDB human genes. Evola contains orthologs among human and 14 vertebrates (chimpanzee, macaque, mouse, rat, dog, horse, cow, opossum, chicken, zebrafish, medaka, Tetraodon, and fugu). Evolutionary biological information such as protein multiple alignments, phylogenetic trees, transcript variants, and the degree of natural selection (dN/dS) is provided.
Data type	Comparative genomics

FLJ Human cDNA Database

Summary	The FLJ human cDNA database was constructed as a human cDNA sequence analysis database focused on mRNA varieties caused by variations in transcription start site (TSS) and splicing. The number of human genes was estimated to be 20-25 thousand. However, the number of human mRNA varieties was predicted to be about 100 thousand. The varieties are thought to be caused by variations in TSS and splicing. In our previous human cDNA project, about 30 thousand FLJ human full-length sequenced cDNAs were deposited in DDBJ/GenBank/EMBL, and we obtained about 1.4 million 5'-end sequences (5'-ESTs) of FLJ full-length cDNAs from about 100 kinds of cDNA libraries of human tissues and cells constructed by the oligo-capping method. (cited from original site)
Data type	DNA-sequence full length cDNA

FuLoJa

Summary	FuLoJa includes Lotus japonicus full-length cDNA sequences analysed with InterPro. (cited from original site)
Data type	DNA-motif

G-compass

Summary	G-compass was designed as a tool for the study of comparative genomics. It provides data on evolutionarily conserved genomic regions and orthologous genes between human and 12 vertebrates (chimpanzee, rhesus monkey, mouse, rat, dog, cow, horse, opossum, chicken, zebrafish, medaka, and Tetraodon). Information on ultraconserved elements (UCE) and copy number variable regions (CNV) is provided. Sliding window analysis and dot plot analysis are also implemented.
Data type	Comparative genomics

GenoBase

Summary	The aim of GenoBase is to comprehensively understand the living-cell system of Escherichia coli K-12 (W3110). GenoBase is a public repository for sequence information, proteome, transcriptome, bioinformatics, and literature-based knowledge concerning E. coli. The results of the NEDO project are included.
Data type	Annotation

GGDB

Summary	Glycogenes include genes associated with glycan synthesis, such as glycosyltransferases, sugar nucleotide synthases, sugar-nucleotide transporters, and sulfotransferases. In the "Construction of GlycoGene Library Project" (April 2001 - March 2004), we collected and compiled data on such glycogenes as the GlycoGene Database (GGDB), which is the first database to store information on substrate specificity.
Data type	DNA-sequence Substrates, Expression

H-InvDB

Summary	H-Invitational Database (H-InvDB) is an integrated database of human genes and transcripts. By extensive analyses of all human transcripts, we provide curated annotations of human genes and transcripts that include gene structures, alternative splicing isoforms, protein functions, etc.
Data type	RNA human full-length cDNA, mRNA

HGPD

Summary	HGPD is a unique database that stores information on a set of human Gateway entry clones together with protein expression data, and helps the user search the Gateway entry clones. In the full-length human cDNA sequencing project (FLJ project at NEDO), nucleotide sequences of approximately 30000 human cDNA clones have been analyzed. (cited from paper http://nar.oxfordjournals.org/cgi/content/abstract/gkn872?ijkey=zKpNqhZH6jrUuzi&keytype=ref)
Data type	DNA-sequence

Human-Gene diversity Of Life-style related Diseases (H-GOLD)

Summary	Genetic polymorphism information relating to model diseases. This site provides two databases and seven (downloadable) analysis tools. On 22 June 2012, the archive of the GDBS data set opened in LSDB Archive.
Data type	microsatellites, SNP

Integrated Cancer Genome Database

Summary	In order to identify genes responsible for drug sensitivities and side effects of anti-cancer drugs, the project genotyped about 3,000 SNPs in candidate genes in breast cancer patients by the Invader method. The database stores genotype frequencies of these SNPs and their relation to drug sensitivities and side effects. The candidate genes include 512 genes related to pharmacokinetics, DNA repair, apoptosis, cell cycle control, angiogenesis, and inflammation. The genotype frequency data are downloadable.
Data type	-

JSNP DATABASE

Summary	The standard allele frequencies of 786 Japanese SNPs have been registered since Release 9. The NEDO project dataset is included in JSNP.
Data type	single nucleotide polymorphism (SNP) allele frequency

LEGENDA

Summary	Legenda is a system for finding articles in which any pair of gene names, diseases, and substrates co-occur in the MEDLINE abstract. Co-occurrence of terms of the same type (e.g., genes) can also be searched. Legenda has its own gene name dictionary.
Data type	Journal, Gene, Disease, Substrate

MCG CNV Database

Summary	The MCG CNV Database provides copy number variants (CNV) and loss of heterozygosity (LOH) detected through microarray analyses in healthy Japanese individuals using our in-house BAC-based arrays, so-called MCG arrays, and SNP arrays (Illumina HumanOmniExpress BeadChip). The MCG CNV Database shows the incidence of CNV and LOH in the healthy Japanese population and can assist in estimating the pathogenicity of CNV or LOH detected in subjects whose pathogenesis may involve cryptic genomic aberrations. (cited from original site)
Data type	DNA-polymol array

Microorganism database system

Summary	This database allows users to search for microorganisms and to view various biological information about them. This database is available only in Japanese.
Data type	Journal old species name (if any), phylogenetic position, type strain, isolation origin, chemical components as chemical taxonomic index, substrate availability, energy gain type, accession number of 16S rDNA in DDBJ/EMBL/GenBank, and so on.

MiFuP

Summary	MiFuP is a database of functional potentials deduced from microbial genomes. You can easily search for microbes with potential functions of interest to you. In the function search, you can search for functional potentials in your own microbial genome sequences or CDS sequences. The associated MiFuP wiki is an informational website about microbial functions (e.g., detailed description of each function, underlying mechanisms, representative microbes, industrial use). The MiFuP wiki is in Japanese only.
Data type	DNA-sequence | Amino acid sequence | Annotation

MiFuP Safety

Summary	MiFuP Safety is a database that searches the genome information of microorganisms for genes related to harmfulness and estimates the harmfulness of those microorganisms. When you input a nucleotide sequence of a microbial genome or an amino acid sequence, it detects gene regions related to toxicity (toxin production, drug resistance, etc.) in the sequence, so that you can estimate whether the target microorganism has a harmful function.
Data type	DNA-sequence | Amino acid sequence | Annotation

NBRC culture catalogue search

Summary	Microorganisms (15,400 clones) held by NBRC of NITE can be searched. There are four ways to search: by NBRC number, by keyword, by scientific name or the number assigned by other institutions, and by homology search. The results provide the scientific name, history, agency numbers, culture conditions, and paper information. In addition, the prokaryotic (16S rDNA) and eukaryotic (28S rDNA) sequences registered to confirm the identity of the resources can be searched using a query sequence. (Own translation)
Data type	DNA-sequence

PMPj-Blast

Summary	A database of nucleotide sequences of ESTs, cDNAs, and oligo DNA microarray probes for Lotus japonicus and other species, obtained from the PMPj. (cited from original site)
Data type	DNA-sequence, EST-sequence

RAvariome

Summary	RAvariome is a human genetic variant database for rheumatoid arthritis (RA), an autoimmune inflammatory disease. RAvariome provides literature-curated data and "HOT variants", the variants found to be significant and reproducible across different studies. Moreover, based on the confirmed associations, we developed a genetic risk prediction tool that provides the relative risk for RA and is intended for use by clinicians, researchers, and the general public.
Data type	Genetic/Genomic variants

SAHG

Summary	SAHG is a comprehensive database that presents the protein structures encoded in the human genome. All of the open reading frames encoding proteins in the human genome are subjected to protein structure prediction. The development of this database is supported by JST (Japan Science and Technology Agency). (cited from original site)
Data type	Protein-structure

SEVENS

Summary	The SEVENS database includes G-protein coupled receptor (GPCR) genes with seven transmembrane helices, identified with high accuracy from the complete genomes of 32 eukaryotes by a pipeline integrating software such as a gene finder, a sequence alignment tool, a motif and domain assignment tool, and a TMH predictor.
Data type	Protein-sequence amino acid sequence

The database of full-length cDNA and EST sequences of Lotus

Summary	The database contains 98838 EST and 3488 full-length cDNA sequences of Lotus japonicus, a model legume plant. Both public (from DDBJ) and closed datasets can be downloaded from the site.
Data type	DNA-sequence

TOT-DB

Summary	TOT-DB (The Theileria orientalis Genome Annotation Database) is a database of genome information for the parasitic protist Theileria orientalis. Genes annotated by integrating expression data with gene predictions from several software tools are displayed on the G-integra genome browser. TOT-DB also offers a BLAST homology search service and downloads of the genome sequences and gene annotation of T. orientalis.
Data type	Comparative genomics

TraP

Summary	Metabolic pathways such as glycolysis, the TCA cycle, amino acid biosynthesis, and nucleic acid biosynthesis were compared among 7 archaea. The database makes it possible to highlight the differences between archaea. (cited from the original site)
Data type	Metabolite repression

UCSC GenomeBrowser for Functional RNA

Summary	UCSC GenomeBrowser for Functional RNA is a UCSC Genome Browser mirror with large additional custom tracks specifically associated with non-coding elements. It also includes several functional enhancements such as a presentation of a common secondary structure prediction at any given genomic window <500 bp.
Data type	Comparative genomics

VarySysDB

Summary	This is a system to search, display, and download our research results on human polymorphism based on publicly available data and annotations of transcripts presented by H-InvDB. It provides information about single nucleotide polymorphisms (SNPs), deletion-insertion polymorphisms (DIPs), short tandem repeats (STRs), single amino acid repeats (SARs), structural variation (or copy number variations: CNVs), and their relations to the genome, transcripts, and functional domains.
Data type	-
