In silico Analysis of Single Nucleotide Polymorphisms (SNPs) in Human VCAM-1 gene

Tyseer Alabid; Anwaar A.Y. Kordofani; Bakhieta Atalla; Hisham N. Altayb; Afra AbdEhamid Fadla; Marwa Mohamed Osman; Mohamed Ahmad Salih; Bahaeldin K. Elamin

doi:https://doi.org/10.47739/2576-1102/1004

Research Article | Open Access | Volume 1 | Issue 1

Tyseer Alabid^1*, Anwaar A.Y. Kordofani^2, Bakhieta Atalla^3, Hisham N. Altayb^4, Afra AbdEhamid Fadla^5, Marwa Mohamed Osman^5, Mohamed Ahmad Salih^6, Bahaeldin K. Elamin^7,8

^1. Department of Haematology, University of Khartoum, Sudan
^2. Department of Pathology, University of Khartoum, Sudan
^3. Department of Pediatrics, University of Bahr El Gazal, Sudan
^4. Department of Microbiology, Sudan University of Science and Technology, Sudan
^5. Department of Biotechnology, Africa City of Technology, Sudan
^6. Department of Biotechnology, Biotechnology Park, Sudan
^7. Department of Microbiology and Parasitology, University of Bisha, Saudi Arabia
^8. Department of Microbiology, University of Khartoum, Sudan

Corresponding Authors

Tyseer Alabid, Madani Street, Faculty of Medical Laboratory Sciences, University of Khartoum, Soba, Khartoum, P.O. Box: 7699, Sudan, Fax: 00249183434199; Tel: 00249912906551

Abstract

Homozygosity for the hemoglobin S (Hb S) mutation, as well as a number of heterozygous states involving one gene encoding Hb S, leads to shortened red blood cell survival and hemolysis in sickle cell disease. Variants of the VCAM1 gene could be informative genetic modifiers of phenotypic differences in SS disease. In this work, we analyzed the genetic variation that can alter the expression and function of the VCAM-1 gene using computational methods. Out of the total 1450 SNPs in Homo sapiens, 254 were missense, 255 were non-synonymous SNPs (nsSNPs), 207 were synonymous SNPs, 53 were in the 3′ untranslated region (3′UTR) and 23 were in the 5′ untranslated region. Of the nsSNPs, 67 (26.7%) were found to be damaging by both the SIFT and PolyPhen servers among the 255 nsSNPs investigated. In I-Mutant 3.0, 100% of the submitted nsSNPs changed protein stability with related free energy. There were 51 functional SNPs in the 3′UTR; 39% (20/51) of them disrupted a conserved miRNA site and 23 derived alleles created a new miRNA site. Structural and functional analysis of the nsSNPs was also performed with Project HOPE software. The UTRscan tool predicted that one SNP (rs529512508) lies in a binding site that may affect the regulation of protein translation. Based on this work, we propose that the most deleterious SNPs (rs529512508, rs41287272, rs200949878, rs368917883, rs138479988, rs143330690) may be important candidates for the cause of different types of human diseases involving the VCAM1 gene.

Citation

Alabid T, Kordofani AAY, Atalla B, Altayb HN, Fadla AA, et al. (2016) In silico Analysis of Single Nucleotide Polymorphisms (SNPs) in Human VCAM-1 gene. J Bioinform, Genomics, Proteomics 1(1): 1004.

Keywords

•   SCA
•   VCAM1 gene
•   SNP
•   In silico analysis

ABBREVIATIONS

3′UTR: 3′ Untranslated Region; ACS: Acute Chest Syndrome; BP: Base Pair; CVA: Cerebrovascular Accidents; dbSNP: Single Nucleotide Polymorphism Database; DNA: Deoxyribonucleic Acid; Hb: Hemoglobin; MBE: Musashi Binding Element; NCBI: National Center for Biotechnology Information; nsSNP: Non-Synonymous Single Nucleotide Polymorphism; PSIC: Position-Specific Independent Counts; RI: Reliability Index; RS: Recognition Site; SCA: Sickle Cell Anemia; SCD: Sickle Cell Disease; SIFT: Sorting Intolerant from Tolerant; SNP: Single Nucleotide Polymorphism; SVM: Support Vector Machine; VCAM: Vascular Cell Adhesion Molecule; VLA: Very Late Antigen

INTRODUCTION

Single nucleotide polymorphisms (SNPs) are single-base changes in coding or non-coding DNA sequence and occur about every 200–300 bp in the human genome [1]. So far, 5,000,000 SNPs have been identified in the coding regions of the human genome and are responsible for genetic variation in the population [2]. Among all SNPs, non-synonymous SNPs (nsSNPs) occur in the exonic part of the genome and often lead to a change in the amino acid residues of the gene product.

Homozygosity for the hemoglobin S (Hb S) mutation, as well as a number of heterozygous states involving one gene encoding Hb S, leads to shortened red blood cell survival and hemolysis in Sickle Cell Disease (SCD) [3].

Vaso-occlusive crisis is a common, painful complication of SCD in adolescents and adults and can lead to cerebrovascular accidents (CVA, or stroke). Genetic risk factors that can be used to predict the risk of stroke are therefore highly desirable in research.

Currently, controversy exists among geneticists as to the best means of identifying the genetic determinants of multigenic or complex traits such as stroke in SS disease [4].

Therefore, recent studies encourage applying SNPs to genetic association studies. Detecting DNA variants that contribute to increased susceptibility to human diseases is currently of great interest, and multifactorial sickle cell disease is a prominent subject of such work.

Human VCAM-1 is a single-copy gene at chromosome 1p32-p31 and contains nine exons spanning ~22.8 kb [5]. As a result of alternative splicing of exon 5, two different VCAM-1 transcripts have been described [6]. Elevated expression of VCAM-1 on endothelial cells depends on the presence of cytokines and mediates leukocyte accumulation in inflamed tissues [7].

Variants of the VCAM1 gene could be informative genetic modifiers of phenotypic differences in SS disease [8]. The predominant receptor for VCAM-1 is the integrin α4β1 [9].

Genetic polymorphisms of VCAM-1 have been implicated in susceptibility to a number of degenerative and inflammatory diseases [10,11]. One focus of association studies is the association between SNPs and disease development. There are millions of SNPs in the human genome, which makes it difficult to plan costly population-based genotyping that targets the SNPs most likely to affect phenotypic functions and ultimately contribute to disease development. SNP markers are preferred for disease association studies because of their high abundance throughout the human genome. Therefore, to explore possible associations between genetic mutation and phenotypic variation, algorithms such as SIFT and PolyPhen-2 were used to prioritize high-risk non-synonymous single nucleotide polymorphisms (nsSNPs) in coding regions that are likely to affect the function and structure of the protein.

The aim of this study was to analyze the genetic variation that can alter the expression and function of the VCAM-1 gene using computational methods.

MATERIALS AND METHODS

We obtained information on VCAM1 SNPs from the National Center for Biotechnology Information (NCBI) SNP database in January 2016. We analyzed these SNPs using computational software as follows. To predict damaging amino acid substitutions we used SIFT, and to predict functional modification we used PolyPhen-2. The nsSNPs predicted to be damaging by both servers were then submitted to the I-Mutant and PHD-SNP tools. The protein sequence in FASTA format was obtained from UniProt via the ExPASy database, and for protein modeling we used both Chimera 1.8 and Project HOPE. To predict the effect on miRNA binding, we analyzed SNPs in the 3′ untranslated region (3′UTR) using PolymiRTS.

GeneMANIA

GeneMANIA is an online tool that predicts relationships between a set of input genes and other genes using a very large set of functional association data. Association data include protein and genetic interactions, pathways, co-expression, co-localization and protein domain similarity [12]. Co-expression (gene expression data): two genes are linked if their expression levels are similar across conditions in a gene expression study. Genetic interaction (genetic interaction data): two genes are functionally associated if the effects of perturbing one gene were found to be modified by perturbations to a second gene. Physical interaction (protein-protein interaction data): two gene products are linked if they were found to interact in a protein-protein interaction study [12].

Functional SNPs in UTR found by UTRscan

The 5′ and 3′ UTRs are reported to be involved in various biological processes such as post-transcriptional regulatory pathways, mRNA stability and translational efficiency [13]. We used the UTRscan server (http://utrdb.ba.itb.cnr.it/) [14], which searches a submitted sequence for any of the patterns collected in UTRsite. UTRsite is a collection of functional sequence patterns located in 5′ or 3′ UTR sequences. If the different allelic sequences of a UTR SNP are found to match different functional pattern(s), the SNP is predicted to have functional significance. The internet resources for UTR analysis are UTRdb and UTRsite. UTRdb contains UTR sequences from eukaryotic mRNAs annotated with functional sequence patterns whose biological activity has been proven experimentally.
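The allele-comparison rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the UTRscan implementation: the motif names and regular expressions below are placeholders invented for the example and are not actual UTRsite pattern definitions, and the sequences are toy inputs.

```python
import re

# Hypothetical motif set: names and regular expressions are placeholders,
# NOT the real UTRsite pattern definitions.
TOY_MOTIFS = {
    "toy_element_1": re.compile(r"AUAGU"),
    "toy_element_2": re.compile(r"UUUUUAU"),
}


def motifs_present(utr_seq, motifs=TOY_MOTIFS):
    """Return the names of all motifs that match the UTR sequence."""
    seq = utr_seq.upper().replace("T", "U")  # accept DNA or RNA alphabet
    return {name for name, pattern in motifs.items() if pattern.search(seq)}


def snp_changes_motifs(ref_seq, alt_seq, motifs=TOY_MOTIFS):
    """Flag a UTR SNP when its two allelic sequences match different motif sets."""
    return motifs_present(ref_seq, motifs) != motifs_present(alt_seq, motifs)


# Toy sequences: the A>G change destroys the only match to toy_element_1.
print(snp_changes_motifs("GGCAUAGUCC", "GGCAUGGUCC"))
# Toy sequences: the change lies outside every motif, so nothing is flagged.
print(snp_changes_motifs("CCCCAUAGUCC", "CCGCAUAGUCC"))
```

A set comparison is used so that a SNP is flagged both when an allele loses a pattern and when it gains one.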

PolymiRTS

PolymiRTS is another database server, designed specifically for the analysis of the 3′UTR. We used this server to identify SNPs that may alter miRNA target sites. All SNPs located within the 3′UTR were selected separately and submitted to the program [15]. (Available at: http://compbio.uthsc.edu/miRSNP/)

Sorting Intolerant from Tolerant (SIFT)

SIFT is an online bioinformatics tool used to detect deleterious nsSNPs. SIFT is a multistep procedure that (1) searches for similar sequences, (2) chooses closely related sequences that may share similar function with the query sequence, (3) obtains the alignment of these chosen sequences, and (4) calculates normalized probabilities for all possible substitutions from the alignment. Positions with normalized probabilities less than 0.05 are predicted to be deleterious; those greater than or equal to 0.05 are predicted to be tolerated [14]. (Available at: http://sift.jcvi.org/)
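The final classification step can be written as a one-line threshold rule. The sketch below is a minimal illustration of the 0.05 cutoff stated above; the function name and the example scores are assumptions made for the example and are not results from this study.

```python
SIFT_CUTOFF = 0.05  # threshold stated in the text


def sift_call(score):
    """Label a SIFT normalized probability (tolerance index) in [0, 1]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT score must lie in [0, 1]")
    # Strictly below the cutoff is deleterious; the cutoff itself is tolerated.
    return "DELETERIOUS" if score < SIFT_CUTOFF else "TOLERATED"


# Hypothetical scores for illustration only.
example_scores = {"variant_A": 0.00, "variant_B": 0.05, "variant_C": 0.32}
for name, score in example_scores.items():
    print(name, score, sift_call(score))
```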

PolyPhen-2

PolyPhen-2 is an online bioinformatics program that automatically predicts the consequence of an amino acid change on the structure and function of a protein. The prediction is based on a number of features comprising the sequence, phylogenetic and structural information characterizing the substitution. The program searches for 3D protein structures, multiple alignments of homologous sequences and amino acid contact information in several protein structure databases, then calculates Position-Specific Independent Counts (PSIC) scores for each of the two variants and computes the difference between the two scores. The higher the PSIC score difference, the higher the functional impact a particular amino acid substitution is likely to have. PolyPhen scores were assigned as probably damaging (2.00 or more), possibly damaging (1.40–1.90), potentially damaging (1.0–1.50) or benign (0.00–0.90). PolyPhen accepts input in the form of SNPs or protein sequences [16]. (Available at: http://genetics.bwh.harvard.edu/pph)

I-Mutant version 3.0

I-Mutant is an online support vector machine (SVM) tool based on the ProTherm database that evaluates nsSNP-induced changes in protein stability for a single-site mutation, starting from the protein structure or from the protein sequence. I-Mutant estimates the free energy change value (DDG, i.e. ΔΔG) by calculating the unfolding Gibbs free energy value (DG, i.e. ΔG) for the wild-type protein and subtracting it from that of the mutant protein (DDG = DG mutant − DG wild type). It also predicts the sign (increase or decrease) of DDG, along with a reliability index for the result (RI: 0–10, where 0 is the lowest reliability and 10 is the highest). DDG < 0 corresponds to a decrease in protein stability, whereas DDG > 0 corresponds to an increase. According to the ternary classification system (SVM3), a large decrease in protein stability corresponds to DDG < −0.5 and a large increase corresponds to DDG > 0.5, while DDG values between −0.5 and 0.5 correspond to relatively neutral protein stability. The pH was set to 7 and the temperature to 25 °C for all submissions [17]. (Available at: http://gpcr2.biocomp.unibo.it/cgi/predictors/IMutant3.0/I-Mutant3.0.cgi)
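As a minimal sketch of the arithmetic and the two classification schemes just described (DDG = DG mutant − DG wild type; binary sign, and the ternary SVM3 bands with cutoffs at −0.5 and 0.5), the following Python functions may help. The function names and the DG inputs are hypothetical, and the handling of DDG exactly equal to 0 is an assumption, since the text does not cover that case.

```python
def ddg(dg_mutant, dg_wild_type):
    """Free energy change: DDG = DG(mutant) - DG(wild type)."""
    return dg_mutant - dg_wild_type


def sign_label(ddg_value):
    """Binary call: DDG < 0 is a decrease in stability, DDG > 0 an increase."""
    if ddg_value < 0:
        return "Decrease"
    if ddg_value > 0:
        return "Increase"
    return "No change"  # assumption: the text does not define DDG == 0


def svm3_label(ddg_value):
    """Ternary (SVM3) call with cutoffs at -0.5 and 0.5."""
    if ddg_value < -0.5:
        return "Large Decrease"
    if ddg_value > 0.5:
        return "Large Increase"
    return "Neutral"


# Hypothetical DG values, for illustration only.
value = ddg(dg_mutant=-1.2, dg_wild_type=-0.4)
print(round(value, 2), sign_label(value), svm3_label(value))
```

Note that a mutation can be a "Decrease" under the binary scheme and still fall in the "Neutral" band of the ternary scheme, for example a DDG of −0.33.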

PHD-SNP

PHD-SNP is an online Support Vector Machine (SVM) based classifier optimized to predict whether a given single point protein mutation can be classified as disease-related or as a neutral polymorphism [17]. (Available at: http://snps.biofold.org/phd-snp/phd-snp.html)

Protein Modelling

Project HOPE: Project HOPE is an easy-to-use web server that analyses the structural effects of a mutation of interest. HOPE provides 3D structural visualization of mutated proteins and derives its results from UniProt and DAS prediction servers. The input to Project HOPE is the protein sequence and the selected mutant variant. The server reports the structural variation between mutant and wild-type residues [15]. (Available at: http://www.cmbi.ru.nl/hope/home)

Chimera: Chimera is software produced by the University of California, San Francisco, used in this step to generate the mutated 3D models of the VCAM-1 protein. The outcome is a graphic model depicting the mutation [18]. (Available at: http://www.cgl.ucsf.edu/chimera/)

RESULTS AND DISCUSSION

Investigating the desired gene using dbSNP/NCBI

The VCAM1 gene was investigated in dbSNP/NCBI (http://www.ncbi.nlm.nih.gov/snp). We found that this gene contains a total of 5709 SNPs, 1450 of them in Homo sapiens; of these, 254 were missense, 255 were non-synonymous SNPs (nsSNPs), 207 were synonymous SNPs, 53 were in the 3′ untranslated region and 23 were in the 5′ untranslated region.

GeneMANIA: functions and interactions of VCAM1 with functionally similar genes. The VCAM1 gene has many vital functions. It plays an important role in cell adhesion molecule binding, filopodium, microvillus, leukocyte cell-cell adhesion and membrane docking (Figure 1). The genes that are co-expressed with or physically interact with VCAM1 are shown in Figures 2 and 3. The genes co-expressed with VCAM1 in the network are described in Table 1. The functions of the VCAM1 network genes and their occurrence in the network and genome are listed in Table 2.

Functional SNPs in UTR by UTRscan server

We used the UTRscan server to identify functionally significant SNPs in the untranslated regions. One SNP in the 3′ UTR, rs529512508, was predicted to be of functional significance by this server. The Musashi binding element (MBE), located in the Mos 3′ untranslated region (UTR), is an important binding element that lies within a 24 nt region originally termed a polyadenylation response element (PRE), and this sequence was found to be critical for temporal regulation [17]. In mammalian somatic cells, Musashi appears to facilitate repression of MBE-containing mRNAs, suggesting context-dependent regulation of Musashi translational control. An MBE is present at one 3′ UTR SNP (rs529512508), which was therefore considered to be of functional significance and may be damaging in the VCAM1 gene. This is an important outcome of this work, as there are no reports in the literature so far relating the deleterious nature of this SNP to the 3′ untranslated region of the VCAM1 gene.

Prediction of SNPs in the 3′UTR

SNPs in the 3′UTR of the VCAM1 gene were submitted to the PolymiRTS server. There were 51 functional SNPs predicted in the 3′UTR; 20 alleles disrupted a conserved miRNA site (function class D) and 23 alleles created a new miRNA site (function class C) (Table 3). The SNP rs41287272 had the highest frequency of function class D. It has two alleles, G and C. The G allele (function class D) had 5 miRSites in which the derived allele disrupts a conserved miRNA site, and the C allele (function class C) had 2 miRSites in which the derived allele creates a new miRNA site (Table 3).

Prediction of damaging amino acid substitutions using SIFT (v5.1) and of functional modification using PolyPhen-2 (Polymorphism Phenotyping v2)

SNPs lying in the coding region were analyzed with SIFT and PolyPhen. First, we submitted a batch of 207 nsSNPs (rs IDs) to the SIFT server; 102 SNPs were predicted to be deleterious. The resulting deleterious nsSNPs were then submitted to the PolyPhen server as query sequences in FASTA format. Of these, 67 SNPs were also predicted to be damaging by PolyPhen (see Appendix II): 47 were probably damaging and 20 were possibly damaging. The most deleterious and damaging nsSNPs predicted by both SIFT (tolerance index = 0) and PolyPhen (PSIC score range 0.999–1) are listed in Table 4.
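The filtering funnel reported in this paragraph can be checked with simple arithmetic. The short Python sketch below uses only the counts stated above (207 submitted, 102 deleterious by SIFT, 67 also damaging by PolyPhen, split into 47 probably and 20 possibly damaging); the variable and function names are illustrative choices.

```python
submitted = 207          # nsSNPs submitted to SIFT
sift_deleterious = 102   # predicted deleterious by SIFT
both_damaging = 67       # also predicted damaging by PolyPhen
probably_damaging = 47
possibly_damaging = 20


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The two PolyPhen categories should add up to the consensus set.
assert probably_damaging + possibly_damaging == both_damaging

print("SIFT deleterious / submitted:", pct(sift_deleterious, submitted))      # 49.3
print("PolyPhen damaging / SIFT deleterious:", pct(both_damaging, sift_deleterious))  # 65.7
print("Consensus damaging / submitted:", pct(both_damaging, submitted))       # 32.4
```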

Prediction of change in stability due to mutation using I-Mutant 3.0 server

We submitted the 67 nsSNPs that were predicted to be deleterious and damaging by both SIFT and PolyPhen to the I-Mutant 3.0 server. The results predicted that all of these mutations in the VCAM1 gene change protein stability with related free energy. We listed 7 mutations (Y113C, Y611C, Y504C, G295V, G354R, G524E and T61I) that were predicted to decrease the stability of the protein (Table 4).

Association of nsSNPs to disease using PHD-SNP software

We submitted the 67 nsSNPs that were predicted to be deleterious and damaging by both SIFT and PolyPhen to PHD-SNP. Of the 7 most deleterious and damaging nsSNPs predicted by both SIFT and PolyPhen, 2 mutations, G354R (rs200949878) and G524E (rs368917883), were predicted to be disease-related, while 5 mutations (Y113C, G295V, Y611C, Y504C and T61I) were predicted to be neutral polymorphisms (Table 4).

Amino acid properties according to Project HOPE software

We submitted the 7 most deleterious and damaging nsSNPs predicted by both SIFT and PolyPhen to Project HOPE. In three SNPs (rs62638734, rs199733528 and rs143330690), the mutant residues (isoleucine, tyrosine and valine, respectively) were bigger than the wild-type residues (threonine, histidine and glycine, respectively) at positions 61, 91 and 295, respectively, but in rs138479988 the mutant residue was smaller than the wild-type residue (tyrosine to cysteine at position 113). Furthermore, the mutant residues were more hydrophobic than the wild-type residues in these four SNPs. The wild-type residues in these four SNPs were highly conserved, although a few other residue types had been observed at these positions. The mutant residues in rs62638734 and rs138479988 were located near highly conserved positions and caused loss of hydrogen bonds in the core of the protein, thereby disturbing correct folding. According to conservation scores, the mutations were possibly not damaging to the protein in rs62638734 and rs199733528 and probably damaging in rs138479988 and rs143330690 (Table 5). The amino acid properties of rs200949878, rs368917883 and rs375517524 could not be predicted by Project HOPE, so Chimera was used instead.

Chimera

Chimera was used to visualize the PDB file for rs200949878, rs368917883 and rs375517524, locate the position of the mutated residue and replace it with the new amino acid. This showed the structural change in the nsSNP protein and the clash points of the new residues with other atoms after energy minimization (Table 5).

Different types of human diseases were previously linked to variants of the VCAM1 gene. In particular, variants rs1041163, rs3978598 and rs3783613 have been associated with small vessel stroke [19], leukocytosis and protection against 11.9% of stroke risk [20], respectively. The rs1409419 allele C was associated with stroke events, while the rs1409419 allele T was found to be protective [21]. In addition, rs3176867 was linked to benzene hematotoxicity [22] and rs3176879 was associated with decreased CFU-GEMM progenitor cell colony formation (p = 0.041) in 29 benzene-exposed workers [23].

Another study showed that rs3783605 is located in an ETS2 binding site and may therefore be involved in the pathogenesis of VCAM-1-associated diseases including asthma, atherosclerotic lesions, multiple sclerosis, thromboembolic diseases, multiple myeloma, insulin-dependent diabetes mellitus and breast cancer [24-28]. This association might be due to the effect of this variation on promoter activity and VCAM-1 expression [29].

None of these reported SNPs was predicted by our work to be among the most damaging or disease-related SNPs, whereas other SNPs that have not been reported before were predicted to be harmful in the VCAM1 gene. We propose that the 7 most deleterious SNPs may be involved in the pathogenesis of VCAM-1-associated diseases.

Table 1: Description of genes co-expressed with the VCAM1 gene network.

Symbol	Description	Co-expression
ITGAD	integrin, alpha D	YES
CCL17	chemokine (C-C motif) ligand 17	YES
MSN	moesin	YES
SEC61B	Sec61 beta subunit	NO
SPARC	secreted protein, acidic, cysteine-rich (osteonectin)	YES
CYBB	cytochrome b-245, beta polypeptide	YES
NOX3	NADPH oxidase 3	NO
NOX1	NADPH oxidase 1	NO
CCL22	chemokine (C-C motif) ligand 22	NO
NCF1	neutrophil cytosolic factor 1	NO
CYBA	cytochrome b-245, alpha polypeptide	NO
NCF4	neutrophil cytosolic factor 4, 40kDa	NO
PTPRA	protein tyrosine phosphatase, receptor type, A	NO
AOAH	acyloxyacyl hydrolase (neutrophil)	YES
RELB	v-rel avian reticuloendotheliosis viral oncogene homolog B	YES
TLN1	talin 1	NO
LGALS3	lectin, galactoside-binding, soluble, 3	NO
ITGB7	integrin, beta 7	NO
EZR	ezrin	NO
CD5L	CD5 molecule-like	YES

Table 2: Functions of the VCAM1 network genes and their occurrence in the network and genome.

Feature	FDR	Genes in network	Genes in genome
Oxidoreductase activity, acting on NAD(P)H, oxygen as acceptor	9.81E-06	4	12
superoxide anion generation	2.35E-05	4	17
respiratory burst	4.37E-05	4	22
Oxidoreductase complex	4.37E-05	5	68
antigen processing and presentation of exogenous peptide antigen via MHC class I, TAP-dependent	5.77E-05	5	75
antigen processing and presentation of exogenous peptide antigen via MHC class I	5.87E-05	5	78
superoxide metabolic process	1.00E-04	4	32
antigen processing and presentation of peptide antigen via MHC class I	1.13E-04	5	94
phagosome maturation	2.66E-04	4	43
phagocytic vesicle	8.14E-04	4	58
antigen processing and presentation of exogenous peptide antigen	1.41E-03	5	166
antigen processing and presentation of exogenous antigen	1.50E-03	5	171
endocytic vesicle	1.55E-03	5	175
antigen processing and presentation of peptide antigen	1.56E-03	5	178
membrane docking	1.61E-03	3	20
oxidoreductase activity, acting on NAD(P)H	2.61E-03	4	87
interaction with host	2.82E-03	4	90
antigen processing and presentation	3.00E-03	5	216
reactive oxygen species metabolic process	3.00E-03	4	94
leukocyte cell-cell adhesion	6.26E-03	3	34
microvillus	6.52E-03	3	35
filopodium	8.66E-03	3	39
inflammatory response	9.02E-03	5	282
cell adhesion molecule binding	1.79E-02	3	51
electron carrier activity	3.74E-02	3	66
focal adhesion	6.89E-02	3	82
cell-substrate adherens junction	7.13E-02	3	84
cell-substrate junction	7.90E-02	3	88
FDR: False discovery rate is greater than or equal to the probability that this is a false positive

Table 3: SNPs predicted by PolymiRTS to disrupt a conserved miRNA site or create a new miRNA binding site.

Location	dbSNP ID	Variant TYPE	Wobble base pair	Ancestral Allele	Allele	miR ID	Conservation	miRSite	Function Class
101203871	rs201567943	SNP	Y	A	A	hsa-miR-3121-3p	12	gacaCTATTTAtc	D
101203871	rs201567943	SNP	Y	A	G	hsa-miR-6733-3p	7	GACACTGtttatc	C
101203881	rs3783617	SNP	Y	A	G	hsa-miR-507	2	atctGTGCAAAtc	D
					G	hsa-miR-557	2	atctGTGCAAAtc	D
					C	hsa-miR-4529-3p	2	atctGTCCAAAtc	C
					C	hsa-miR-7850-5p	2	atctGTCCAAAtc	C
101203884	rs201794763	SNP	Y	A	A	hsa-miR-507	2	tGTGCAAAtcctt	D
						hsa-miR-541-5p	5	tgtgcaAATCCTT	D
						hsa-miR-557	2	tGTGCAAAtcctt	D
					G	hsa-miR-3130-3p	2	tGTGCAGAtcctt	C
					G	hsa-miR-4793-3p	2	tGTGCAGAtcctt	C
101203949	rs41287272	SNP	N	G	G	hsa-miR-181a-5p	8	cTGAATGTAttga	D
						hsa-miR-181b-5p	8	cTGAATGTAttga	D
						hsa-miR-181c-5p	8	cTGAATGTAttga	D
						hsa-miR-181d-5p	8	cTGAATGTAttga	D
						hsa-miR-4262	8	cTGAATGTAttga	D
					C	hsa-miR-376a-5p	9	ctGAATCTAttga	C
					C	hsa-miR-6826-5p	21	ctgaatCTATTGA	C
101204118	rs185143902	SNP	N	T	T	hsa-miR-183-5p	9	aaattaTGCCATA	D
					C	hsa-miR-652-3p	9	aaattaCGCCATA	C
					C	hsa-miR-8074	9	aaattaCGCCATA	C
101204126	rs3783618	SNP	N	C	C	hsa-miR-31-3p	6	cCATAGCAagatt	D
					A	hsa-miR-4482-3p	6	ccATAGAAAgatt	C
					A	hsa-miR-7159-3p	6	cCATAGAAAgatt	C
101204167	rs3783619	SNP	Y	A	A	hsa-miR-302c-5p	2	atTGTTAAAataa	D
					A	hsa-miR-552-5p	3	attGTTAAAAtaa	D
					G	hsa-miR-1305	3	attGTTGAAAtaa	C
						hsa-miR-205-3p	21	attgtTGAAATAa	C
						hsa-miR-3065-5p	2	aTTGTTGAaataa	C
						hsa-miR-421	2	atTGTTGAAataa	C
						hsa-miR-7159-5p	2	aTTGTTGAAataa	C
101204189	rs17123276	SNP	N	C	C	hsa-miR-633	2	cttggACTATTAt	D
101204246	rs189902207	SNP	N	C	G	hsa-miR-6787-3p	7	AGCTGAGctttgt	C
101204246	rs189902207	SNP	N	C	G	hsa-miR-6854-5p	5	agCTGAGCTttgt	C
101204415	rs182563271	SNP	Y	A	G	hsa-miR-140-3p	3	caacCTGTGGTAt	C
101204463	rs3783620	SNP	Y	G	G	hsa-miR-3605-3p	10	tggtACGGAGAtg	D
101204463	rs3783620	SNP	Y	G	G	hsa-miR-3678-5p	11	tgGTACGGAgatg	D
101204495	rs3783621	SNP	N	C	C	hsa-miR-4294	2	cAGACTCCtgtgc	D
					C	hsa-miR-4433-3p	2	cagACTCCTGtgc	D
					A	hsa-miR-199a-3p	2	cagACTACTGtgc	C
						hsa-miR-199b-3p	2	cagACTACTGtgc	C
						hsa-miR-3129-5p	2	cagACTACTGtgc	C
101204545	rs3194002	SNP	N	C	T	hsa-miR-3613-3p	12	tatttTTTTTTGt	C
Function Class D: The derived allele disrupts a conserved miRNA site (ancestral allele with support >= 2); Function Class C: The derived allele creates a new miRNA site. For more column label descriptions see Appendix I.
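The per-allele tally reported in the text for rs41287272 (5 class D sites for the G allele, 2 class C sites for the C allele) can be reproduced from the Table 3 rows. The Python sketch below transcribes only the rs41287272 rows and assumes that a blank dbSNP ID or allele cell means "same as the row above", which is an interpretation of the table layout rather than something the table states.

```python
from collections import Counter

# rs41287272 rows from Table 3 as (dbSNP ID, allele, miR ID, function class).
# Empty strings stand for cells left blank in the table.
rows = [
    ("rs41287272", "G", "hsa-miR-181a-5p", "D"),
    ("", "", "hsa-miR-181b-5p", "D"),
    ("", "", "hsa-miR-181c-5p", "D"),
    ("", "", "hsa-miR-181d-5p", "D"),
    ("", "", "hsa-miR-4262", "D"),
    ("", "C", "hsa-miR-376a-5p", "C"),
    ("", "C", "hsa-miR-6826-5p", "C"),
]


def fill_down(table_rows):
    """Copy the last seen SNP ID and allele into blank cells (assumed layout)."""
    filled, snp, allele = [], "", ""
    for rs_id, al, mir, func_class in table_rows:
        snp = rs_id or snp
        allele = al or allele
        filled.append((snp, allele, mir, func_class))
    return filled


# Count miRNA sites per (SNP, allele, function class).
counts = Counter((snp, allele, fc) for snp, allele, _mir, fc in fill_down(rows))
for key, n in sorted(counts.items()):
    print(key, n)
```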

Table 4: Non-synonymous SNPs predicted with the SIFT, PolyPhen, I-Mutant and PHD-SNP programs; SNPs were chosen with PSIC scores in the range 0.999–1 and a tolerance index equal to 0.

SNP	Protein ID	Amino Acid Change	SIFT Prediction	SIFT Tolerance Index	PolyPhen Prediction	PolyPhen PSIC Score	I-Mutant SVM2 Prediction Effect	I-Mutant RI	I-Mutant DDG Value Prediction	PHD-SNP Prediction	PHD-SNP RI
rs138479988	ENSP00000294728	Y113C	DELETERIOUS	0	probably damaging	1	Decrease	1	-0.82	Neutral	3
rs143330690	ENSP00000359137	G295V	DELETERIOUS	0	probably damaging	1	Decrease	7	-0.68	Neutral	6
rs200949878	ENSP00000359133	G354R	DELETERIOUS	0	probably damaging	1	Decrease	1	-0.33	Disease	0
rs368917883	ENSP00000359137	G524E	DELETERIOUS	0	probably damaging	1	Decrease	4	-0.77	Disease	8
rs375517524	ENSP00000304611	Y611C	DELETERIOUS	0	probably damaging	1	Decrease	4	-1.1	Neutral	2
rs375517524	ENSP00000359133	Y504C	DELETERIOUS	0	probably damaging	1	Decrease	4	-1.1	Neutral	4
rs62638734	ENSP00000294728	T61I	DELETERIOUS	0	probably damaging	1	Decrease	0	-0.39	Neutral	4
SIFT Tolerance Index: ranges from 0 to 1; the amino acid substitution is predicted deleterious if the score is <= 0.05 and tolerated if the score is > 0.05. PolyPhen-2 result: PROBABLY DAMAGING (more confident prediction) / POSSIBLY DAMAGING (less confident prediction). PSIC SD: Position-Specific Independent Counts score difference (score >= 0.5). I-Mutant: the neutral mutation class is -0.5 <= DDG <= 0.5, while changes < -0.5 are classified as Large Decrease and changes > 0.5 as Large Increase. I-Mutant and PHD-SNP RI (Reliability Index): 0–10, where 0 is the lowest reliability and 10 is the highest reliability.
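For readers who want to work with Table 4 programmatically, the sketch below transcribes the SNP ID, amino acid change, PHD-SNP prediction and PHD-SNP reliability index from the table and selects the mutations called disease-related. The record type and field names are illustrative choices, not part of any of the tools used.

```python
from typing import NamedTuple


class Row(NamedTuple):
    snp: str
    change: str
    phd_prediction: str
    phd_ri: int  # PHD-SNP reliability index, 0 (lowest) to 10 (highest)


# Values transcribed from Table 4.
TABLE4 = [
    Row("rs138479988", "Y113C", "Neutral", 3),
    Row("rs143330690", "G295V", "Neutral", 6),
    Row("rs200949878", "G354R", "Disease", 0),
    Row("rs368917883", "G524E", "Disease", 8),
    Row("rs375517524", "Y611C", "Neutral", 2),
    Row("rs375517524", "Y504C", "Neutral", 4),
    Row("rs62638734", "T61I", "Neutral", 4),
]

disease = [r for r in TABLE4 if r.phd_prediction == "Disease"]
neutral = [r for r in TABLE4 if r.phd_prediction == "Neutral"]

for r in disease:
    print(r.snp, r.change, "RI =", r.phd_ri)
print(len(disease), "disease-related,", len(neutral), "neutral")
```

Keeping the reliability index alongside each call makes it easy to see that the two disease-related predictions carry very different reliability in the table (RI 0 for G354R versus RI 8 for G524E).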

CONCLUSION

The functional and structural impact of SNPs in the VCAM1 gene was studied using computational prediction tools. Out of the total 1450 SNPs in Homo sapiens, 254 were missense, 255 were non-synonymous SNPs (nsSNPs), 207 were synonymous SNPs, 53 were in the 3′ untranslated region and 23 were in the 5′ untranslated region.

To make effective use of genetic diagnosis, the predicted harmful SNPs in the VCAM1 gene should be well known and available to diagnostic services and molecular biology laboratories. This would support accurate diagnosis of the associated diseases and, in turn, successful intervention, which depends on finding the cause or causes of a problem. Based on this work, we predict that the SNPs rs529512508, rs41287272, rs200949878, rs368917883, rs138479988 and rs143330690 are important candidates for the cause of different types of human diseases involving the VCAM1 gene.

ACKNOWLEDGEMENTS

This study was supported by the University of Khartoum, Faculty of Medical Laboratory Sciences, Department of Haematology, as part of the PhD grant of the student Tyseer Alabid.