Molecular modelling analyses of the C-type lectin domain in human aggrecan
Models, Molecular
0301 basic medicine
Extracellular Matrix Proteins
Protein Structure, Secondary
03 medical and health sciences
Mannose-Binding Lectins
Chondroitin Sulfate Proteoglycans
Lectins
Humans
Computer Simulation
Lectins, C-Type
Proteoglycans
Aggrecans
Carrier Proteins
E-Selectin
Software
DOI:
10.1042/bst024099s
Publication Date:
2015-08-11T13:24:04Z
AUTHORS (2)
ABSTRACT
Aggrecan is the major proteoglycan of the extracellular matrix in cartilage. In the aggregated form, it plays a key role along with collagen in the maintenance of the tensile and elastic properties of cartilage [1]. The C-terminal region G3 of a group of proteoglycans that includes aggrecan, versican, neurocan, and brevican contains a carbohydrate recognition domain (CRD). The proteoglycan CRD forms Group I of the Ca2+-dependent (C-type) animal lectin superfamily [2]. To date, crystal structures of the Group III and Group IV CRDs (mannose-binding protein (MBP) and E-selectin) have been reported [3,4]. The two structures are highly similar. However, the CRDs of Group I exhibit distinct sequence differences from those of Groups III and IV. It is not possible to conclude that the Group I CRDs are typical C-type lectins on the basis of sequence similarity alone, and there is no evidence to show whether the protein structures of the Group I CRDs can be correlated with those in Groups III and IV. Analysis of the carbohydrate binding specificity of the G3 CRD in aggrecan shows that it preferentially binds galactose [5,6], unlike the Group III and Group IV CRDs, which bind mannose and the sialyl Lewis x tetrasaccharide respectively. We have performed structural analyses in order to compare Group I with Groups III and IV. These will indicate the extent to which the Group I sequences are compatible with the known crystal structures, and may provide information on the carbohydrate specificity of the Group I CRDs. A total of 129 CRD sequences were extracted from Release 15.0 (February 15, 1995) of ENTREZ, the CD-ROM document retrieval system. The sequences were aligned using the multiple alignment program MULTAL with a range of fixed and variable gap penalties (see [7,8] and references therein for details of the structural analyses).
Final refinement of the alignment was carried out by hand to maximise the occurrence of conserved or chemically similar residues and to minimise gaps. Even though the 129 sequences yielded a satisfactory alignment, residue conservation in the CRD superfamily is not high. The consensus length of the CRD alignment is 136 residues. Only 32% (44) of these residues were conserved or conservatively replaced in at least half of the 129 sequences, only 19% were conserved in at least 70% of sequences, and only 7% were conserved in more than 90% of sequences. The sequences can be classified into two groups: the 'short' CRDs, which include Groups III and IV and have 4 conserved Cys residues, and the 'long' CRDs, which include Group I and contain at least 6 conserved Cys residues. The secondary structures observed in five Group III and IV CRDs were analysed using DSSP and visualised using INSIGHT II (Biosym Technologies plc). A total of 2 α-helices and 7 β-strands were consistently identified by DSSP, accounting for 15% α-helix and 21-26% β-strand. The observed structures were compared with averaged secondary structure predictions using the GOR I, GOR III, Chou-Fasman, PHD and SAPIENS methods. The averaged predictions from the sequence alignment for all 129 sequences, and from those for the long and short CRD sequences, showed good agreement with the observed secondary structures of MBP and E-selectin. Both α-helices were identified, together with six of the seven β-strands. Interestingly, the β-strand that was not predicted corresponds to the region of the Ca2+ binding site in the two crystal structures. This evidence suggests that the CRDs from Group I show a high degree of structural similarity with those of Groups III and IV. In fold recognition analyses, sequence threading was performed for all 129 CRD sequences against a library of 254 known protein folds by use of the THREADER program.
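The conservation figures quoted above (the share of alignment positions conserved in at least 50%, 70% or 90% of sequences) can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function names and the toy alignment are hypothetical, and it counts identical residues only, whereas the abstract also counts conservative replacements.

```python
from collections import Counter


def column_conservation(alignment):
    """For each alignment column, return the fraction of all sequences
    that carry the most common residue (gap characters '-' are never
    counted as a residue, but stay in the denominator)."""
    n_seqs = len(alignment)
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    fractions = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        fractions.append(top / n_seqs)
    return fractions


def fraction_of_columns_conserved(alignment, threshold):
    """Share of columns whose top residue occurs in at least
    `threshold` (0..1) of the sequences."""
    fractions = column_conservation(alignment)
    return sum(f >= threshold for f in fractions) / len(fractions)


# Hypothetical four-sequence toy alignment, for illustration only.
toy_alignment = ["CAKW", "CAEW", "CGKW", "C-RF"]

for threshold in (0.5, 0.7, 0.9):
    share = fraction_of_columns_conserved(toy_alignment, threshold)
    print(f"conserved in >= {threshold:.0%} of sequences: {share:.0%} of columns")
```

Extending the count to conservative replacements would require grouping residues by chemical class before counting, which is a design choice the abstract does not specify.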
The averaged pairwise energy scores from the use of 17 Group III and 13 Group IV sequences had significantly high mean Z-scores of -3.2 and -3.0 respectively when compared with the MBP fold. By comparison, the 9 Group I sequences gave a weaker mean Z-score of -1.9 with the MBP fold, which is at the threshold of structural similarity with this fold. This was consistently observed with all the long CRD sequences in THREADER. A high Z-score for the CD69 CRD sequence, which had been reported in the course of molecular graphics modelling of the Group V CRD [9], was not observed. Although the secondary structure predictions favour a close structural similarity between Groups I, III and IV, the THREADER scores show that this is not guaranteed. In order to complete the secondary structure and fold recognition analyses, molecular graphics homology modelling of the G3 CRD structure was performed using the INSIGHT II and HOMOLOGY programs. The G3 CRD sequence was readily overlaid with those of MBP and E-selectin on the basis of the multiple sequence alignment. Insertions or deletions all occurred at the protein surface and were 1-3 residues in size. A G3 model was readily generated, and is consistent with the possible similarity of the Group I structure with those of Groups III and IV. Regions of interest in G3 include the Ca2+ coordination site and the galactose binding site. Residues EPN of the Ca2+ coordination site in the Group III/IV crystal structures could be readily altered to QPD in G3. The residues NRQKD which follow immediately after EPN in E-selectin become NFFAAG in G3, and these are notably more hydrophobic, with an extra residue, when compared with E-selectin and MBP. In summary, evidence from protein structure analyses suggests that the Group I, III and IV CRDs have similar protein folds. The modelling may indicate possible reasons for the differences in carbohydrate specificity between the three CRDs.
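To make the Z-scores quoted above concrete, the sketch below shows the conventional way such a score can be computed: the threading energy of a sequence on one fold is expressed in standard deviations from the mean energy over the whole fold library. This is an assumption about the general technique, not a description of THREADER's internals; the function name, fold names and energy values are hypothetical. Under this convention a more negative Z-score means a more favourable fit.

```python
from statistics import mean, pstdev


def z_scores(energies):
    """Map each fold name to (energy - library mean) / library std. dev.

    `energies` is a dict of fold name -> threading energy for ONE query
    sequence against every fold in the library (hypothetical input)."""
    values = list(energies.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation over the library
    if sigma == 0:
        raise ValueError("all energies are identical; Z-score is undefined")
    return {fold: (energy - mu) / sigma for fold, energy in energies.items()}


# Hypothetical energies for one query sequence against an 8-fold library.
toy_energies = {
    "fold_1": 2.0, "fold_2": 4.0, "fold_3": 4.0, "fold_4": 4.0,
    "fold_5": 5.0, "fold_6": 5.0, "fold_7": 7.0, "fold_8": 9.0,
}

scores = z_scores(toy_energies)
best_fold = min(scores, key=scores.get)  # lowest (most negative) Z-score
print(best_fold, round(scores[best_fold], 2))
```

A group mean such as those reported for Groups I, III and IV would then be the average of the per-sequence Z-scores obtained against the fold of interest.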