Kainov YA, Aushev VN, Naumenko SA, Tchevkina EM, Bazykin GA (2016) Complex Selection on Human Polyadenylation Signals Revealed by Polymorphism and Divergence Data. Genome Biol Evol 8: 1971-9

Polyadenylation is a step of mRNA processing which is crucial for its expression and stability. The major polyadenylation signal (PAS) represents a nucleotide hexamer that adheres to the AATAAA consensus sequence. More than half of human genes have multiple cleavage and polyadenylation sites, resulting in a great diversity of transcripts differing in function, stability, and translational activity. Here, we use available whole-genome human polymorphism data together with data on interspecies divergence to study the patterns of selection acting on PAS hexamers. Common variants of PAS hexamers are depleted of single nucleotide polymorphisms (SNPs), and SNPs within PAS hexamers have a reduced derived allele frequency (DAF) and increased conservation, indicating prevalent negative selection; at the same time, the SNPs that "improve" the PAS (i.e., those leading to higher cleavage efficiency) have increased DAF, compared to those that "impair" it. SNPs are rarer at the PAS of "unique" polyadenylation sites (one site per gene) and, among alternative polyadenylation sites, at distal PAS and at exonic PAS. Similar trends were observed in DAFs and divergence between species of placental mammals. Thus, selection permits PAS mutations mainly at redundant and/or weakly functional PAS. Nevertheless, a fraction of the SNPs at PAS hexamers likely affect gene functions; in particular, some of the observed SNPs are associated with disease.

Pubmed: 27324920