More information

Project name Life science database integration project
Area Genome Informatics
Purpose This project aims to facilitate research and development in both industry and academia.
Introduction The project develops a portal site that provides information on life science databases and analysis tools developed either by projects directly sponsored by METI or by institutes sponsored by METI. The second aim is to develop an integrated system for using the various METI-related outcomes, based on H-Invitational DB (a database of all human genes) that was developed in the "Genome Information Integration Project (GIIP)".
Keyword database | analysis tool | product | portal | integration
Started fiscal year 2008-2010
Project head Takashi Gojobori
Institute of the head Center for Information Biology and DNA Data Bank of Japan (CIB-DDBJ), National Institute of Genetics (NIG) | Biomedicinal Information Research Center (BIRC), National Institute of Advanced Industrial Science and Technology (AIST)
Budget (million yen) 70 (2008)
Representative Institute of the commission Japan Biological Informatics Consortium (JBiC)
Companies Japan Biological Informatics Consortium (JBiC) | Biomedicinal Information Research Center (BIRC), National Institute of Advanced Industrial Science and Technology (AIST) | HITACHI Government & Public Corporation System Engineering, Ltd. | HITACHI Software Engineering Co., Ltd. | DYNACOM CO., Ltd.
Published papers (PubMed IDs) None
Patent (Japan, overseas) -
Archives
The report on "METI life science database project" 2008-2011. Japanese.

Product (Database, Tool)

Evola

Summary Evola (Evolutionary annotation database) is a database providing ortholog information for H-InvDB human genes. Evola contains orthologs between human genes and the genes of 13 other species (chimpanzee, macaque, mouse, rat, dog, horse, cow, opossum, chicken, zebrafish, medaka, Tetraodon, and fugu). Viewers for sequence alignments and phylogenetic trees, transcript variants (Locus maps), and natural selection (dN/dS view) are implemented. A viewer for duplicate gene families is also available.
Data type Comparative genomics

G-compass

Summary G-compass was designed as a tool for the study of comparative genomics. It provides data on evolutionarily conserved genomic regions and orthologous genes between human and 12 vertebrates (chimpanzee, rhesus monkey, mouse, rat, dog, cow, horse, opossum, chicken, zebrafish, medaka and Tetraodon). Information on ultraconserved elements (UCE) and copy number variable regions (CNV) is also provided. Sliding window analysis and dot plot analysis are also implemented.
Data type Comparative genomics
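
As a rough illustration of what a sliding window analysis over a pairwise alignment can look like, the sketch below reports the fraction of identical positions in each window. This is a generic example under stated assumptions: the function name, the use of "-" as the gap character, and the identity measure are all hypothetical, and the statistic that G-compass actually computes is not described here.

```python
def sliding_window_identity(seq_a, seq_b, window, step=1):
    """Return (start, fraction identical) for each window of an alignment.

    Assumes seq_a and seq_b are already aligned (equal length) and that
    "-" marks a gap; gap columns never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        end = start + window
        matches = sum(
            1
            for x, y in zip(seq_a[start:end], seq_b[start:end])
            if x == y and x != "-"
        )
        results.append((start, matches / window))
    return results
```

Calling `sliding_window_identity("ACGTAC", "ACGAAC", window=3, step=3)` scores two non-overlapping windows, the first fully identical and the second with one mismatch.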

H-ANGEL

Summary H-ANGEL is a resource which provides information on human gene expression. H-ANGEL displays expression patterns of transcriptional products generated by the H-Invitational project in practical tissue categories based on tissue-specific expression data from several experimental platforms. H-ANGEL also displays information for the expression of different genes at their corresponding physical positions in the human genome. This information is linked to the corresponding transcript or locus annotation data stored in H-InvDB.
Data type Gene expression

H-DBAS

Summary H-DBAS is a database of human alternative splicing (AS) based on H-InvDB. H-DBAS offers human AS variants identified from H-Inv full-length cDNAs and published human mRNA datasets. Analysis data such as AS patterns, AS events affecting protein function, and AS comparison with mouse are included. H-DBAS allows users to find AS variants with various search keys on the Advanced Search page, and the results are displayed visually in the Java-based AS Viewer.
Data type Annotation

H-Exp

Summary H-Exp is a database of human tissue-specific expression profile data, integrated and coordinated with H-InvDB. The system enables (1) fast searching, sorting, and viewing by gene cluster or isoform, (2) comparison of expression patterns by gene cluster or isoform, and (3) display of detailed gene expression information and related data.
Data type Gene expression

H-InvDB

Summary H-Invitational Database (H-InvDB) is an integrated database of human genes and transcripts. By extensive analyses of all human transcripts, we provide curated annotations of human genes and transcripts that include gene structures, alternative splicing isoforms, protein functions, etc.
Data type RNA human full-length cDNA, mRNA

Hyperlink Management System

Summary Hyperlink Management System is an original tool for setting hyperlinks to major databases related to human genes and proteins worldwide. For details, please refer to the document here (http://staff.aist.go.jp/t.imanishi/about_hms.html).
Data type Data identifiers (IDs)

ID Converter System

Summary ID Converter System is an original tool for converting IDs used in major databases related to human genes and proteins worldwide. For details, please refer to the document here (http://staff.aist.go.jp/t.imanishi/about_hms.html).
Data type Data identifiers (IDs)

LEGENDA

Summary Legenda is a system for finding articles in which any pair of gene names, diseases, and substrates co-occur in a MEDLINE abstract. Co-occurrences of terms of the same type (e.g. two genes) can also be searched. Legenda has its own gene name dictionary.
Data type Journal, Gene, Disease, Substrate
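
The co-occurrence search described for Legenda can be pictured with a minimal sketch like the one below. It is not Legenda's implementation: the dictionary contents, the abstracts, the article IDs, and the naive substring matching are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical term dictionary: lowercase surface form -> (type, canonical name).
TERM_DICTIONARY = {
    "tp53": ("gene", "TP53"),
    "brca1": ("gene", "BRCA1"),
    "breast cancer": ("disease", "breast cancer"),
}

def find_terms(abstract):
    """Return the set of (type, name) dictionary terms found in an abstract.

    Uses naive case-insensitive substring matching; a real system would
    need proper tokenization and synonym handling.
    """
    text = abstract.lower()
    return {entry for surface, entry in TERM_DICTIONARY.items() if surface in text}

def cooccurrence_index(abstracts):
    """Map each unordered pair of terms to the set of article IDs mentioning both."""
    index = {}
    for article_id, abstract in abstracts.items():
        for pair in combinations(sorted(find_terms(abstract)), 2):
            index.setdefault(pair, set()).add(article_id)
    return index

# Made-up abstracts keyed by made-up article IDs.
ABSTRACTS = {
    "A1": "BRCA1 mutations are associated with breast cancer.",
    "A2": "TP53 and BRCA1 cooperate in DNA repair.",
    "A3": "TP53 is frequently mutated in breast cancer.",
}
INDEX = cooccurrence_index(ABSTRACTS)
```

Because pairs are built from all term types together, the same index answers both mixed-type queries (a gene with a disease) and same-type queries (two genes).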

MEDALS

Summary MEDALS, the outcome of this project, is this portal site. It provides information on life science databases and analysis tools developed either by projects directly sponsored by METI (Ministry of Economy, Trade and Industry) or by institutes sponsored by METI. The name of the site, "MEDALS", stands for METI Database portal for Life Science. It opened on October 29, 2008.
Data type metadata

PMID-Extractor

Summary PMID-Extractor allows a user to obtain PubMed IDs (PMIDs) from PDF or text files of journal papers on hand, using Digital Object Identifiers (DOIs, http://en.wikipedia.org/wiki/Digital_object_identifier) or text information (e.g. titles) on the first page of each file. PubMedScan (http://medals.jp/pubmedscan/), a paper recommender, requires PMIDs to specify the user's interests; supplying them is the main use of PMID-Extractor.
Data type Journal
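
The first step of such a workflow, finding a DOI in the text of a paper's first page, might look like the sketch below. The regular expression is a commonly used approximation of DOI syntax rather than PMID-Extractor's actual rule, the sample text and its DOI are made up, and the later DOI-to-PMID lookup is not shown.

```python
import re

# Approximate DOI pattern: "10.", a 4-9 digit registrant code, "/", then a
# suffix running up to the next whitespace, quote, or angle bracket.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")

def extract_dois(first_page_text):
    """Return DOIs found in the text, in order of appearance, without duplicates."""
    dois = []
    for match in DOI_PATTERN.findall(first_page_text):
        # Drop trailing sentence punctuation; this can clip the rare DOI
        # that legitimately ends in one of these characters.
        doi = match.rstrip(".,;)")
        if doi not in dois:
            dois.append(doi)
    return dois

# Made-up first-page text containing a made-up DOI.
SAMPLE_PAGE = "Example Journal 1 (2008). doi:10.1234/example.2008.001. All rights reserved."
```

When no DOI is present the function returns an empty list, which is where a fallback on title text, as the summary mentions, would come in.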

PPI view

Summary The PPI view displays H-InvDB human protein-protein interaction (PPI) information. PPI data were collected from five major public PPI databases (BIND, DIP, MINT, HPRD, IntAct) and integrated into a non-redundant PPI dataset. As a result, we obtained 32,198 human PPIs involving 9,268 proteins (as of H-InvDB version 5.0). The PPI view displays proteins that interact with the user's query proteins (or gene products), and provides links to the H-InvDB locus view and cDNA view, which guide you to the gene locations and the detailed gene functional annotations of these interacting proteins, respectively. PPI view: http://www.h-invitational.jp/hinv/ppi/ PPI view sample: http://h-invitational.jp/hinv/ppi/ppi_view.cgi?hip=HIP000084307
Data type Protein-Proteome
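
Integrating several PPI sources into a non-redundant dataset essentially means treating an interaction as an unordered pair and keeping each pair once. The sketch below shows that idea only; it assumes the protein identifiers have already been mapped to a common scheme, and the function name, source names, and identifiers are hypothetical rather than taken from H-InvDB.

```python
def merge_ppis(sources):
    """Merge interaction lists into a non-redundant set of unordered pairs.

    `sources` maps a database name to an iterable of (protein_a, protein_b).
    Returns a dict mapping each frozenset pair to the set of database names
    that report it.
    """
    merged = {}
    for db_name, pairs in sources.items():
        for protein_a, protein_b in pairs:
            # A-B and B-A are the same interaction, so the key is unordered.
            key = frozenset((protein_a, protein_b))
            merged.setdefault(key, set()).add(db_name)
    return merged

# Made-up source names and identifiers, for illustration only.
SOURCES = {
    "db1": [("P1", "P2"), ("P2", "P3")],
    "db2": [("P2", "P1"), ("P3", "P4")],
}
MERGED = merge_ppis(SOURCES)
PROTEINS = set().union(*MERGED)  # every protein appearing in any interaction
```

Keeping the set of reporting databases per pair preserves provenance, so an interaction seen in several sources can still be told apart from one seen in a single source.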

PubMedScan

Summary PubMedScan is an automatic paper recommendation system for newly published PubMed papers that are related to your interests. This system has the following features: 1) queries are given as a list of PMIDs to search for related articles rather than as keywords or phrases; 2) related papers are reported daily by e-mail and stored in the system; 3) the similarity score between two articles is currently computed by the PubMed "Related articles" service; and 4) the system can be used either through the Web (ver 2.0) or as software for local installation (ver 1.1). The local version can be installed on a UNIX-like machine such as Linux or Macintosh. PHP, MySQL, Apache and Perl modules are required. Because the system is keyword-free, it effectively complements keyword search when surveying papers. To collect the PMIDs of interest easily from your PDF or text files, we also provide the software "PMID-Extractor".
Data type Articles registered in PubMed

VarySysDB

Summary This is a system to search, display, and download our research results on human polymorphism based on publicly available data and annotations of transcripts presented by H-InvDB. It provides information about single nucleotide polymorphisms (SNPs), deletion-insertion polymorphisms (DIPs), short tandem repeats (STRs), single amino acid repeats (SARs), structural variation (or copy number variations: CNVs), and their relations to the genome, transcripts, and functional domains.
Data type

Related project