Depletion of Shine-Dalgarno sequences within bacterial coding regions is expression dependent
Efficient and accurate protein synthesis is crucial for organismal survival in competitive environments. Translation efficiency (the number of proteins translated from a single mRNA in a given time period) is the combined result of differential translation initiation, elongation, and termination rates. Previous research identified the Shine-Dalgarno (SD) sequence as a modulator of translation initiation in bacterial genes, while codon usage bias is frequently implicated as a primary determinant of elongation rate variation. Recent studies have suggested that SD sequences within coding sequences may slow translation elongation, but this claim remains controversial. Here, we present a metric to quantify the prevalence of SD sequences in coding regions. We analyze hundreds of bacterial genomes and find that the coding sequences of highly expressed genes systematically contain fewer SD sequences than expected, yielding a robust correlation between the normalized occurrence of SD sites and protein abundance across a range of bacterial taxa. We further show that depletion of SD sequences within ribosomal protein genes is correlated with organismal growth rates, supporting the hypothesis of strong selection against the presence of these sequences in coding regions and suggesting their association with translation efficiency in bacteria.
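To make the idea of "fewer SD sequences than expected" concrete, the following is a minimal illustrative sketch, not the metric used in this work, whose definition is not given in the abstract. It assumes that SD-like sites can be approximated by exact matches to two fragments of the canonical AGGAGG motif, and that the expected count can be estimated from a hypothetical null model in which synonymous codons are shuffled within the gene. The names `SD_MOTIFS`, `count_sd_sites`, `synonymous_shuffle`, and `sd_depletion_ratio` are invented for illustration.

```python
import random
from collections import defaultdict
from statistics import mean

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumption: SD-like sites are approximated by exact matches to these
# fragments of the canonical AGGAGG motif. This is an illustrative choice.
SD_MOTIFS = ("AGGAG", "GGAGG")


def count_sd_sites(seq, motifs=SD_MOTIFS):
    """Count start positions at which any SD-like motif begins."""
    return sum(any(seq.startswith(m, i) for m in motifs) for i in range(len(seq)))


def synonymous_shuffle(seq, rng):
    """Permute codons among positions that encode the same amino acid.

    The protein sequence and the codon usage of the gene are preserved.
    Assumes len(seq) is a multiple of 3 and contains only A, C, G, T.
    """
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    positions_by_aa = defaultdict(list)
    for idx, codon in enumerate(codons):
        positions_by_aa[CODON_TO_AA[codon]].append(idx)
    shuffled = list(codons)
    for positions in positions_by_aa.values():
        pool = [codons[i] for i in positions]
        rng.shuffle(pool)
        for i, codon in zip(positions, pool):
            shuffled[i] = codon
    return "".join(shuffled)


def sd_depletion_ratio(seq, n_shuffles=200, seed=0):
    """Observed SD-like site count divided by the mean count under the null."""
    seq = seq.upper()
    seq = seq[: len(seq) - len(seq) % 3]  # drop any trailing partial codon
    rng = random.Random(seed)
    observed = count_sd_sites(seq)
    expected = mean(
        count_sd_sites(synonymous_shuffle(seq, rng)) for _ in range(n_shuffles)
    )
    # Undefined when the null model never produces an SD-like site.
    return observed / expected if expected > 0 else float("nan")
```

Under these assumptions, a ratio below 1 would indicate fewer SD-like sites than the shuffled null predicts, and a ratio above 1 would indicate enrichment. Analyses of this kind often define SD-like sites by predicted hybridization to the anti-SD sequence of the 16S rRNA rather than by exact motif matching, so the motif list here should be read as a simplification.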