Gene Catalogs
|
KEGG GENES is a collection of gene catalogs for all complete genomes and some partial genomes generated from publicly available resources, mostly NCBI RefSeq.
These catalogs are subject to SSDB computation and manual KO assignment (gene annotation).
DGENES, for draft genomes (eukaryotes only), and EGENES, for EST consensus contigs (mostly plants), supplement the repertoire of KEGG organisms; both are given automatic KO assignment, with GENES used as the reference data set.
|
Organism Codes
|
The three-letter KEGG organism codes (with prefix "d" for draft genomes and "e" for EST contigs) can be used to enter the organism-specific view of KEGG.
The NCBI taxonomy id and the five-letter Swiss-Prot code are also defined as aliases in KEGG.
Note, however, that the relationship is not one-to-one; one taxonomy id or one Swiss-Prot code may correspond to multiple KEGG organisms.
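Because the alias relationship is one-to-many, a lookup by taxonomy id or Swiss-Prot code should return a list of KEGG organism codes rather than a single code. The following is a minimal sketch of such an index, not KEGG's own implementation. The records, the function names `build_alias_index` and `resolve`, and the sample codes are all hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict

# Illustrative placeholder records, NOT taken from KEGG:
# (KEGG organism code, NCBI taxonomy id, Swiss-Prot code)
ORGANISMS = [
    ("aaa", "1001", "ORGAA"),
    ("aab", "1001", "ORGAA"),  # shares both aliases with "aaa"
    ("bbb", "2002", "ORGBB"),
]


def build_alias_index(records):
    """Map every alias (taxonomy id or Swiss-Prot code) to a list of KEGG codes."""
    index = defaultdict(list)
    for code, taxonomy_id, swissprot_code in records:
        index[taxonomy_id].append(code)
        index[swissprot_code].append(code)
    return index


def resolve(alias, index):
    """Return all KEGG organism codes for an alias (empty list if unknown)."""
    return list(index.get(alias, []))


index = build_alias_index(ORGANISMS)
print(resolve("1001", index))   # two organisms share this taxonomy id
print(resolve("ORGBB", index))  # a single organism
```

Returning a list in every case keeps the caller from silently picking one organism when an alias is ambiguous.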
|
KEGG Orthology (KO) System
|
The KEGG Orthology (KO) system is a classification system of orthologous genes, including orthologous relationships of paralogous gene groups.
It is also the basis for drawing KEGG PATHWAY maps and creating the genes/proteins category of KEGG BRITE.
Thus, genes in a genome can be automatically mapped to KEGG pathways and BRITE hierarchies once they are assigned KO identifiers (K numbers).
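The mapping described here is a two-step lookup: gene to K number, then K number to pathway. The sketch below illustrates the idea under simple assumptions. The dictionaries, the gene identifiers, and the K number to pathway associations are hypothetical placeholders rather than KEGG data, and `map_genes_to_pathways` is a name invented for this example.

```python
# Illustrative placeholder data, NOT taken from KEGG.
# A value of None marks a gene with no KO assignment.
gene_to_ko = {
    "org:gene1": "K00001",
    "org:gene2": "K00002",
    "org:gene3": None,
}

# Hypothetical associations between K numbers and pathway map identifiers.
ko_to_pathways = {
    "K00001": ["map00010", "map00071"],
    "K00002": ["map00010"],
}


def map_genes_to_pathways(gene_to_ko, ko_to_pathways):
    """Group genes by pathway, going through their assigned K numbers."""
    pathways = {}
    for gene, ko in gene_to_ko.items():
        if ko is None:
            continue  # genes without a K number cannot be mapped
        for pathway in ko_to_pathways.get(ko, []):
            pathways.setdefault(pathway, []).append(gene)
    return pathways


for pathway, genes in map_genes_to_pathways(gene_to_ko, ko_to_pathways).items():
    print(pathway, genes)
```

The same pattern would apply to BRITE hierarchies by swapping the second dictionary for a K number to hierarchy table.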
|
Automatic Annotation
|
The annotation of KEGG GENES involves assignment of KO identifiers (K numbers).
Internally, this is done using the KAAS automatic annotation program and the GFIT manual annotation tool, both of which are based on the SSDB database.
The BLAST version of the KAAS program is made publicly available.
|
Automatic KO assignment
KAAS - automatic annotation (KO assignment) and pathway reconstruction
|
Gene Name Conversion
|
KEGG GENES can be retrieved by giving identifiers of outside databases, such as NCBI-GeneID (Entrez Gene ID), NCBI-gi, and UniProt accession numbers.
Cross-reference lists are available at the FTP site.
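As a rough illustration of how a cross-reference list could be used for identifier conversion, the sketch below loads a two-column, tab-delimited table into a lookup dictionary. The file layout (outside identifier, then KEGG GENES identifier), the database prefixes, and all sample identifiers are assumptions made for this example; the actual format of the lists on the FTP site may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in an ASSUMED layout: outside id <TAB> KEGG gene id.
# All identifiers are placeholders, not real database entries.
SAMPLE = (
    "ncbi-geneid:111\torg:gene1\n"
    "ncbi-geneid:222\torg:gene2\n"
    "uniprot:P00001\torg:gene1\n"
)


def load_xrefs(handle):
    """Read tab-delimited cross-references into {outside id: [KEGG gene ids]}."""
    table = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        outside_id, kegg_id = row
        table[outside_id].append(kegg_id)
    return table


xrefs = load_xrefs(io.StringIO(SAMPLE))
print(xrefs.get("uniprot:P00001", []))
```

Storing a list per outside identifier allows for the case where one outside entry corresponds to more than one KEGG gene.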
|
Last updated: November 8, 2007