genehopper.de

Genehopper

Similarities

Homology

The similarity of paralogous genes is measured with the identity of the amino acid sequences. The identities were fetched from Ensembl and computed with CLUSTAL W, a multiple sequence alignment tool. Similarities of non-paralogous genes are set to zero.

Caution: the sequence identity is an asymmetric measure, i.e. the identity of sequence a aligned to sequence b can differ from the identity of b aligned to a.
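
The asymmetry can be illustrated with a minimal sketch. It assumes that the identity is the number of identical aligned residues divided by the ungapped length of the first sequence; this normalisation, the function name and the toy alignment are assumptions made for illustration and are not taken from Ensembl or CLUSTAL W.

```python
def percent_identity(aligned_query: str, aligned_target: str) -> float:
    """Percent of the query's residues that are identical in the alignment.

    Assumption: normalise by the ungapped length of the query, which is
    what makes the measure asymmetric.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both sequences carry the same residue (not a gap).
    matches = sum(
        1
        for q, t in zip(aligned_query, aligned_target)
        if q == t and q != "-"
    )
    query_length = sum(1 for q in aligned_query if q != "-")
    if query_length == 0:
        raise ValueError("query contains no residues")
    return 100.0 * matches / query_length


# Hypothetical toy alignment; "-" marks a gap.
a = "MKT-LLV"
b = "MKTAL--"
print(round(percent_identity(a, b), 1))  # 66.7 (4 matches / 6 residues in a)
print(round(percent_identity(b, a), 1))  # 80.0 (4 matches / 5 residues in b)
```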

Homology in rodents

Measures how similar two human genes are in their sequence identity to orthologs in various species (mouse, rat, macaque, fruit fly, dog, guinea pig, pig, rabbit, worm, cow).

The similarity between two genes is calculated from the correlation of two one-dimensional vectors consisting of the numbers of orthologs and the maximum protein identities for every species.
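
A minimal sketch of such a vector correlation follows. The text does not say which correlation coefficient is used, so Pearson's coefficient is an assumption here; the vector layout (ortholog count followed by maximum identity, per species) and all numbers are hypothetical.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equally long vectors (0.0 if one is constant)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical profiles for three species:
# [orthologs_sp1, max_identity_sp1, orthologs_sp2, max_identity_sp2, ...]
gene_a = [1, 92.0, 1, 90.5, 2, 45.0]
gene_b = [1, 88.0, 1, 87.0, 3, 40.0]
similarity = pearson(gene_a, gene_b)
```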

GO cellular component

This similarity is based on Gene Ontology (GO) terms. GO terms as well as their associations to genes were fetched from Ensembl.

In the first step, the similarity between two GO terms was computed with the R implementation (Bioconductor package GOSemSim) of Resnik's information-content-based measure.

In the second step, the best-match average combination method was applied to the output of the first step in order to assign a similarity to each pair of term sets.
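
The second step can be sketched as follows. This is a Python illustration, not the GOSemSim code: it assumes the term-to-term Resnik similarities from the first step are already available, and it takes best-match average to mean the average of all row and column maxima of the term-by-term similarity matrix. The term names and scores are hypothetical placeholders.

```python
from typing import Callable, Sequence


def best_match_average(
    terms_a: Sequence[str],
    terms_b: Sequence[str],
    term_sim: Callable[[str, str], float],
) -> float:
    """Average of each term's best match in the other term set."""
    if not terms_a or not terms_b:
        return 0.0
    # Best partner in B for every term of A, and vice versa.
    best_for_a = [max(term_sim(a, b) for b in terms_b) for a in terms_a]
    best_for_b = [max(term_sim(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_for_a) + sum(best_for_b)) / (len(terms_a) + len(terms_b))


# Hypothetical precomputed term similarities (output of the first step).
scores = {
    ("t1", "t3"): 0.8,
    ("t1", "t4"): 0.2,
    ("t2", "t3"): 0.1,
    ("t2", "t4"): 0.5,
}


def lookup(a: str, b: str) -> float:
    return scores[(a, b)]


print(round(best_match_average(["t1", "t2"], ["t3", "t4"], lookup), 2))  # 0.65
```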

GO biological process

This similarity is based on Gene Ontology (GO) terms. GO terms as well as their associations to genes were fetched from Ensembl.

In the first step, the similarity between two GO terms was computed with the R implementation (Bioconductor package GOSemSim) of Resnik's information-content-based measure.

In the second step, the best-match average combination method was applied to the output of the first step in order to assign a similarity to each pair of term sets.

GO molecular function

This similarity is based on Gene Ontology (GO) terms. GO terms as well as their associations to genes were fetched from Ensembl.

In the first step, the similarity between two GO terms was computed with the R implementation (Bioconductor package GOSemSim) of Resnik's information-content-based measure.

In the second step, the best-match average combination method was applied to the output of the first step in order to assign a similarity to each pair of term sets.

InterPro protein domains

InterPro identifiers were fetched from Ensembl. The sets of identifiers (one set per gene) were compared with the cosine measure.
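
For two sets, the cosine measure is commonly written as the size of the intersection divided by the square root of the product of the set sizes. The sketch below assumes this set form; the identifiers are hypothetical placeholders, not real database entries.

```python
from math import sqrt


def cosine_set_similarity(a: set[str], b: set[str]) -> float:
    """Cosine similarity of two sets: |A ∩ B| / sqrt(|A| * |B|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


# Hypothetical identifier sets for two genes.
gene_a = {"ID_A", "ID_B", "ID_C"}
gene_b = {"ID_B", "ID_C", "ID_D", "ID_E"}
print(round(cosine_set_similarity(gene_a, gene_b), 3))  # 0.577
```

The same function applies to any pair of identifier sets, for example sets of citation identifiers.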

Swiss-Prot protein features

A binary vector was assigned to each protein-coding gene from the following features:

Transmem
Signal
Lipid
Transcription factor
NHR
NOR
Ion Channel
GPCR
Enzyme
Kinase
Protease
Phosphatase
PDE
Disease
Monogenetic Disease
OMIM
Cytoplasm
Golgi
Membrane
Mitochondrion
Nucleus
Secreted
Ubiquitome
Epigenome

These features were fetched from the UniProt/Swiss-Prot database. The similarity between two vectors was computed with the cosine measure.
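
As an illustration, the binary vectors and their cosine similarity could be built as in the sketch below. The feature order follows the list above; the two example annotation sets are hypothetical and do not describe real genes.

```python
from math import sqrt

FEATURES = [
    "Transmem", "Signal", "Lipid", "Transcription factor", "NHR", "NOR",
    "Ion Channel", "GPCR", "Enzyme", "Kinase", "Protease", "Phosphatase",
    "PDE", "Disease", "Monogenetic Disease", "OMIM", "Cytoplasm", "Golgi",
    "Membrane", "Mitochondrion", "Nucleus", "Secreted", "Ubiquitome",
    "Epigenome",
]


def feature_vector(annotated: set[str]) -> list[int]:
    """1 where the gene carries the feature, 0 otherwise."""
    return [1 if feature in annotated else 0 for feature in FEATURES]


def cosine(u: list[int], v: list[int]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical annotations for two genes.
gene_x = feature_vector({"Transcription factor", "Nucleus", "Disease"})
gene_y = feature_vector({"Nucleus", "Disease", "Kinase", "Cytoplasm"})
print(round(cosine(gene_x, gene_y), 3))  # 0.577
```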

Variant-related publications

Citations were fetched from the Ensembl Variation database. Ensembl itself derived the citations from dbSNP submissions and from text mining performed by EPMC and UCSC.

The cosine measure was used to compute similarities between sets of citations.

Caution: Only publications on genetic variants were considered.

PubTator gene-related publications

Citations were fetched from PubTator. The cosine measure was used to compute similarities between sets of citations.

Tissue-specific gene expression

The tissue-specific gene expression dataset comes from The Human Protein Atlas and is based on RNA-seq data from 32 tissues.

The similarity measure is the square (r²) of Spearman's rank correlation coefficient.
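
A minimal, dependency-free sketch of this measure: rank both expression profiles (ties receive their average rank), compute the Pearson correlation of the ranks, and square it. The function names and expression values are hypothetical. Note that squaring makes a strongly negative rank correlation score as high as a strongly positive one.

```python
from math import sqrt


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return 0.0 if var_x == 0 or var_y == 0 else cov / sqrt(var_x * var_y)


def squared_spearman(x: list[float], y: list[float]) -> float:
    """Square of Spearman's rank correlation between two profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    rho = pearson(average_ranks(x), average_ranks(y))
    return rho * rho


# Hypothetical expression values of two genes across four tissues.
gene_a = [1.0, 5.0, 20.0, 100.0]
gene_b = [2.0, 3.0, 50.0, 40.0]
print(round(squared_spearman(gene_a, gene_b), 2))  # 0.64 (rho = 0.8)
```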

Cell line-specific gene expression

The cell line-specific gene expression dataset comes from The Human Protein Atlas and is based on RNA-seq data from 44 cell lines.

The similarity measure is the square (r²) of Spearman's rank correlation coefficient.

BrainSpan gene expression

The BrainSpan dataset consists of gene expression measurements across different brain tissues/regions, ages and individuals.

BrainSpan gene expression in adults

The BrainSpan dataset consists of gene expression measurements across different brain tissues/regions, ages and individuals. Only samples from individuals older than 13 years were included.

StringDB experimentally validated interactions

This similarity reflects the evidence dimension "Experimental/Biochemical Data" in StringDB.

Genomic distance

The genomic distance similarity is based on GRCh37/hg19. It equals 0 if the distance is > 1 Mb or if the two genes are located on different chromosomes. Otherwise the similarity is calculated using the formula -1*((distance/1 Mb)-1), i.e. 1 - distance/1 Mb, so that a distance of 0 gives a similarity of 1.
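
The rule above can be sketched in a few lines. The sketch assumes each gene is represented by a chromosome name and a single base-pair coordinate, and that genes on different chromosomes get a similarity of 0; how the distance between two gene intervals is actually measured is not stated in the text. The coordinates in the example are hypothetical.

```python
MEGABASE = 1_000_000  # 1 Mb in base pairs


def genomic_distance_similarity(
    chrom_a: str, pos_a: int, chrom_b: str, pos_b: int
) -> float:
    """Linear decay from 1 (distance 0) to 0 (distance 1 Mb or more)."""
    if chrom_a != chrom_b:
        return 0.0
    distance = abs(pos_a - pos_b)
    if distance > MEGABASE:
        return 0.0
    # Formula from the text: -1 * ((distance / 1 Mb) - 1)
    return -1 * ((distance / MEGABASE) - 1)


# Hypothetical coordinates, 250 kb apart on the same chromosome.
print(genomic_distance_similarity("17", 7_500_000, "17", 7_750_000))  # 0.75
```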

HGNC gene symbol

Gene symbols were fetched from the HGNC database and compared by the prefix distance measure.

Instructions on how to compute the prefix distance between two strings can be found here.