Introduction

The RNA World Hypothesis posits that RNA was the first major biological molecule of life (Gesteland et al. 2006; Gilbert 1986; Joyce 1996; Orgel 1986). This hypothesis hinges on the ability of ribonucleotides to form polymers long enough to perform catalytic functions including self-replication and to preserve genetic information. These functions are highly dependent not only upon the length of the polymers, but also on their sequences, which determine their suitability for storing and replicating information and their ability to form stable, secondary structures that govern their molecular and chemical interactions with each other and with their environment (Kraut et al. 2011; Scott et al. 2013). To date, studies of abiotic RNA polymerization generally have focused on routes to polymerization and the length of the polymerization products (Huang and Ferris 2006; Costanzo et al. 2009; Costanzo et al. 2012; Ferris and Ertem 1993; Ferris 2005, 2006; Horowitz et al. 2010; Hud et al. 2007; Monnard et al. 2003; Monnard 2009; Morasch et al. 2014). Far fewer reports have examined the selectivity of the reaction toward incorporation of some nucleotides over others in the growing polymer in nucleotide mixtures (one example is the polymerization of imidazole-activated nucleotides in ice eutectic phases (Monnard et al. 2003)). Such information is an essential step toward understanding the chemical evolution of RNA on early Earth.

In the present work, we investigate the nucleotide selectivity of abiotic polymerization using the well-established reaction of imidazole-activated ribonucleotides in the presence of homoionic sodium montmorillonite clay. Montmorillonite clay is a phyllosilicate that plausibly was present on early Earth and has been found on Mars (Cuadros and Michalski 2013; Joshi et al. 2015). We first performed reactions of individual, activated nucleotides with one or more unactivated nucleotides. Incorporation of an unactivated nucleotide terminates the polymerization, allowing us to study selectivity of the growing homopolymer toward incorporation of particular heteronucleotides over others. Next, we investigated polymerization in mixtures of multiple, pre-activated nucleotides. Finally, polymerization reactions were performed using in situ imidazole activation in the polymerization reaction itself (Burcar et al. 2015), thereby avoiding the need for preactivation of the nucleotides via organic (non-aqueous) synthesis and providing a more realistic prebiotic scenario. Additionally, the in situ approach avoids complications associated with the relative stability of the different preactivated nucleotides during storage prior to their use in the polymerization reactions, thereby leveling the playing field and providing a more accurate view of their relative reactivities during the concurrent activation and polymerization reactions. Nucleotides used in this work included adenosine-5′-monophosphate (AMP), cytidine-5′-monophosphate (CMP), guanosine-5′-monophosphate (GMP), and uridine-5′-monophosphate (UMP), which are found in modern RNA, and inosine-5′-monophosphate (IMP), which also has attracted interest in studies of prebiotic chemistry (Koslov and Orgel 1999). The extent of polymerization and the nucleotide composition of the polymerization products were determined using Matrix Assisted Laser Desorption Ionization Mass Spectrometry (MALDI MS).

Materials and Methods

Materials

The stainless steel AnchorChip var. /387 and the low molecular weight (LMW) mass spectrometry calibration standard (comprising 4-mer, 5-mer, 7-mer, 9-mer, and 11-mer oligonucleotides) were obtained from Bruker Daltonics (Billerica, MA, USA). The 2,4,6-trihydroxyacetophenone (THAP), ammonium citrate dibasic (AHC), disodium adenosine-5′-monophosphate (AMP), disodium guanosine-5′-monophosphate (GMP), disodium uridine-5′-monophosphate (UMP), disodium cytidine-5′-monophosphate (CMP), disodium inosine-5′-monophosphate (IMP), Dowex 50WX8 hydrogen form cation exchange resin, RNase-free sodium chloride, acetonitrile, and imidazole were obtained from Sigma-Aldrich (St. Louis, MO, USA). C18 ZipTips (Ref ZTC18S960) were purchased from Millipore (Billerica, MA, USA). Montmorillonite clay (MMC) in the form of Volclay SPV-200 was a gift from The American Colloid Company (Arlington Heights, IL, USA). Ultrapure deionized water (18 MΩ·cm) was used for all solution preparations and experiments.

Nucleotide Activation

Imidazole was attached to the 5′ phosphate of the nucleotides in the free acid form according to published procedures (Joshi et al. 2013). Only AMP was commercially available as the free acid. The disodium forms of GMP, CMP, UMP, and IMP were converted to the free acid form by passing the salt through a Dowex 50WX8 cation exchange resin. GMP sometimes formed a gel when added to the column, due to the tendency of GMP to self-assemble through Hoogsteen hydrogen bonding into G-tetrad structures (Chantot et al. 1971; Gellert et al. 1962; Pieraccini et al. 2003). This effect produced a thick, white gel layer along the walls of the column and on top of the resin, and the solution could no longer move through the resin even under added pressure. In such cases, the column was tilted back and forth, disturbing both the resin and the gel phase, to allow the solution to flow through the column again. The activated products are referred to as ImpX, where X = A, C, G, U or I for AMP, CMP, GMP, UMP, and IMP, respectively. All of the activated nucleotides were stored in a desiccator at 2–6 °C.

Over the course of numerous experiments we found that the ImpX products varied in their stability, in the order ImpA > ImpG, ImpI > ImpC, ImpU. ImpA was still active after 6 months of storage. ImpG and ImpI lasted 1–2 weeks before larger amounts were needed to drive the reaction, and were completely degraded within 1–2 months. ImpC and ImpU rarely lasted more than a week, and their overall activity tended to be lower than that of the other nucleotides. To minimize the effects of these instabilities, the purines generally were used within a week of their preparation (or longer for ImpA), while the pyrimidines were used within a day of preparation.

Polymerization Reactions

Solutions for the polymerization reactions were prepared by dissolving the appropriate amounts of ImpX and/or XMP in aqueous 0.1 M NaCl to a total nucleotide concentration of 15 mM. Each reaction was performed with and without catalytic clay. For the reactions with clay, 100 μL of the nucleotide solution was added to 5 mg of homoionic Na-montmorillonite clay that was prepared according to the Banin process (Banin et al. 1985). In all cases, the solution was vortexed to form a homogeneous mixture, followed by 2–3 days of continuous agitation at room temperature for the polymerization reaction. The solution was then centrifuged for 15 min to precipitate the clay. The supernatant was de-salted using 0.6 μL C18 ZipTips (Millipore), strictly following the manufacturer's guidelines. Each polymerization reaction was performed at least 3 times.
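The quantities in this protocol can be checked with a few lines of arithmetic. The sketch below uses the stated 15 mM total concentration, 100 μL volume and 5 mg of clay; the equal split among components is an assumption made for illustration only, since the mixing ratios are not specified here, and the function names are hypothetical.

```python
# Illustrative arithmetic only. The equimolar split is an ASSUMPTION;
# the text gives only the total nucleotide concentration.
TOTAL_CONC_MM = 15.0   # total nucleotide concentration (mM), from the text
VOLUME_UL = 100.0      # volume added to the clay (uL), from the text
CLAY_MG = 5.0          # mass of Na-montmorillonite (mg), from the text


def reaction_amounts(components):
    """Return {name: (concentration in mM, amount in nmol)} for a mixture,
    assuming the total is split equally among the components."""
    if not components:
        raise ValueError("at least one component is required")
    conc_each = TOTAL_CONC_MM / len(components)
    # mM * uL = nmol, because 1 mM equals 1 nmol per uL
    return {name: (conc_each, conc_each * VOLUME_UL) for name in components}


def loading_nmol_per_mg():
    """Total nucleotide loading per mg of clay in the clay-containing runs."""
    return TOTAL_CONC_MM * VOLUME_UL / CLAY_MG


if __name__ == "__main__":
    print(reaction_amounts(["ImpA", "GMP"]))
    print(loading_nmol_per_mg(), "nmol nucleotide per mg clay")
```

Under these assumptions a two-component mixture contains 750 nmol of each nucleotide, and every clay-containing reaction carries 300 nmol of total nucleotide per mg of clay.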

MALDI-TOF MS Analysis of the Polymerization Products

MALDI-TOF MS analysis for reactions of pre-activated nucleotides was performed on either a Bruker Autoflex II or a Bruker Autoflex III (Bruker Daltonics), while the reactions using in situ activation were analyzed on a Bruker Autoflex Speed. The MALDI matrix was prepared on the day of the analysis by dissolving 2 mg THAP in 200 μL of a solution containing 12 mg/mL AHC in aqueous 50% acetonitrile. AHC served as a matrix dopant to reduce salt adduct formation. Equal volumes of matrix and analyte were mixed, and 2 μL of the mixture was plated on the AnchorChip and allowed to dry.

All analyses were performed using negative ionization. The Autoflex II settings were reflectron mode, ion source 1, 19.00 kV; ion source 2, 16.85 kV; lens, 8.5 kV; reflector, 20.0 kV; pulsed ion ext., 80–400 ns; relative laser desorption between 40 and 50%. The Autoflex III settings were reflectron mode, ion source 1, 19.00 kV; ion source 2, 16.85 kV; lens, 8.76 kV; reflector, 20.00 kV; pulsed ion ext., 80 ns; relative laser desorption between 30 and 40%. The Autoflex Speed settings in linear mode were ion source 1, 19.50 kV; ion source 2, 18.40 kV; lens, 7.80 kV; pulsed ion ext., 130 ns; relative laser desorption between 60 and 80%. The settings in reflectron mode were: ion source 1, 19.00 kV; ion source 2, 16.70 kV; lens, 8.00 kV; reflector, 21.00 kV; reflector 2, 9.60 kV; pulsed ion ext., 130 ns; relative laser desorption between 70 and 80%. For all analyses, 4 sets of 500 laser shots each were collected and summed, for a total of 2000 laser shots per sample. Spectra were smoothed using the Savitzky-Golay polynomial regression algorithm with a width of 0.2 m/z, and a baseline subtraction was performed on all spectra.
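The spectral processing steps (summing shot sets, Savitzky-Golay smoothing, baseline subtraction) can be sketched in plain Python. This is not the instrument software's implementation: the 5-point window, the treatment of the end points and the minimum-based baseline are all assumptions chosen to keep the example short, whereas the analysis here used a smoothing width of 0.2 m/z and an unspecified baseline method.

```python
# Standard 5-point quadratic/cubic Savitzky-Golay smoothing weights;
# they sum to 35, which is the normalisation factor.
SG5 = (-3, 12, 17, 12, -3)


def sum_shots(shot_sets):
    """Sum several equal-length intensity lists point by point
    (for example, 4 sets of 500 laser shots each)."""
    return [sum(values) for values in zip(*shot_sets)]


def savgol5(intensities):
    """Smooth with a 5-point Savitzky-Golay filter. The two points at each
    end are left unchanged because the window does not fit there
    (an ASSUMED edge treatment)."""
    out = list(intensities)
    for i in range(2, len(intensities) - 2):
        window = intensities[i - 2:i + 3]
        out[i] = sum(c * y for c, y in zip(SG5, window)) / 35.0
    return out


def subtract_baseline(intensities):
    """Crude baseline subtraction (ASSUMED method): shift the spectrum
    so that its minimum intensity becomes zero."""
    floor = min(intensities)
    return [y - floor for y in intensities]


if __name__ == "__main__":
    shots = [[10, 12, 30, 13, 11, 10, 9]] * 4
    spectrum = subtract_baseline(savgol5(sum_shots(shots)))
    print(spectrum)
```

A useful property of this filter is that it reproduces any quadratic signal exactly, so it reduces noise while distorting peak shapes less than a simple moving average of the same width.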

The MALDI MS spectra were examined from ~m/z 900 forward for detection and identification of polymerization products. Lower m/z regions contain complicated mixtures of monomer and dimer peaks, MALDI matrix peaks and other small molecular species that make assignments difficult. Peak assignment was performed as previously described (Burcar et al. 2013). The LMW oligonucleotide standard was used as an external calibrant and was run prior to all analyses. Once external calibration was performed, the homopolymeric products for each polymer were used as internal calibrants as long as their m/z values were within ±0.5 m/z of their theoretical values. A mass difference between peaks that matches, within ±0.5 m/z, the value expected for a given nucleotide indicates the addition of a monomer of that nucleotide to the growing polymer. It should be noted that only singly charged nucleotide species are thought to be detected by MALDI-TOF MS (Zagorevskii et al. 2006; Costanzo et al. 2012; Burcar et al. 2013).
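The assignment logic can be illustrated with a short sketch that computes theoretical m/z values for linear oligomers and lists the base compositions that fall within ±0.5 m/z of an observed peak. The monoisotopic monomer masses below were calculated from the elemental formulas of the free acids, and the singly deprotonated [M-H]- ion is assumed because negative ionization was used; the function names and the choice of monoisotopic rather than average masses are assumptions, not the published assignment procedure.

```python
from itertools import combinations_with_replacement

# Monoisotopic masses (Da) of the free-acid nucleoside 5'-monophosphates,
# calculated from elemental formulas (ASSUMED mass convention).
MONOMER_MASS = {
    "A": 347.0631,  # AMP, C10H14N5O7P
    "C": 323.0519,  # CMP, C9H14N3O8P
    "G": 363.0580,  # GMP, C10H14N5O8P
    "U": 324.0359,  # UMP, C9H13N2O9P
    "I": 348.0471,  # IMP, C10H13N4O8P
}
WATER = 18.0106   # H2O
PROTON = 1.0073   # mass removed to form [M-H]-


def linear_mz(composition):
    """[M-H]- m/z of a linear oligomer, e.g. {"A": 2, "G": 1}."""
    n = sum(composition.values())
    mass = sum(MONOMER_MASS[base] * k for base, k in composition.items())
    mass -= (n - 1) * WATER  # one water lost per phosphodiester bond
    return mass - PROTON


def assign_peak(observed_mz, bases, length, tol=0.5):
    """List every base composition of the given length whose theoretical
    m/z lies within tol of the observed value."""
    matches = []
    for combo in combinations_with_replacement(sorted(bases), length):
        comp = {base: combo.count(base) for base in set(combo)}
        if abs(linear_mz(comp) - observed_mz) <= tol:
            matches.append(comp)
    return matches


if __name__ == "__main__":
    print(round(linear_mz({"A": 3}), 2))   # A trimer
    print(assign_peak(1020.2, "ACGU", 3))  # candidate compositions
```

With these masses, replacing one A by one I shifts a peak by only about 0.98 m/z, and replacing one C by one U shifts it by a similar amount, so such pairs sit close to the ±0.5 m/z tolerance and can be hard to tell apart.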

Results

Reactions of a Single ImpX or XMP

In the absence of montmorillonite clay (MMC), polymerization was absent or negligible (up to trimers were occasionally observed for the purines AMP, GMP and IMP). No polymerization products were detected for unactivated XMP in the absence or presence of MMC. In the presence of MMC, all of the individual ImpXs yielded polymers. The extent of polymerization as detected by MALDI MS depended upon the instrument used as well as batch-to-batch variations in the ImpX and MMC preparations. In general, over the course of these experiments we observed a tendency to detect longer polymers, on average, for the purines relative to the pyrimidines. Peaks 18 m/z below the linear polymer peaks are attributed to circular polymers, which form when the linear strand loses a water molecule upon condensation. Peaks at +18 m/z, or multiples of 18, relative to the linear polymer peaks are attributed to aggregates of linear polymers or aggregates of linear and circular polymers (Burcar et al. 2013).
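The ±18 m/z relationships described above lend themselves to a simple classification rule. The sketch below labels a peak by its offset from a known linear polymer peak; the ±0.5 m/z tolerance follows the Methods, while the function name, the cap on multiples of 18 and the use of the monoisotopic mass of water are assumptions for illustration.

```python
WATER = 18.0106  # monoisotopic mass of H2O (Da)


def classify_offset(observed_mz, linear_mz, tol=0.5, max_multiple=3):
    """Label a peak by its offset from the linear polymer peak.

    -18        -> circular polymer (one extra water lost on ring closure)
    +18 * k    -> aggregate peak, k = 1..max_multiple (hypothetical cap)
    """
    delta = observed_mz - linear_mz
    if abs(delta) <= tol:
        return "linear"
    if abs(delta + WATER) <= tol:
        return "cyclic"
    for k in range(1, max_multiple + 1):
        if abs(delta - k * WATER) <= tol:
            return "aggregate (+%d x 18)" % k
    return "unassigned"


if __name__ == "__main__":
    trimer = 1004.16  # example linear peak position, for illustration
    for mz in (1004.16, 986.15, 1040.18, 1010.0):
        print(mz, classify_offset(mz, trimer))
```

A rule of this kind only proposes labels; peaks that land near an offset by coincidence would still need to be checked against the rest of the spectrum.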

Reactions of Mixtures of Individual ImpX with One or More XMP

Polymerization reactions were performed for each of the ImpX with one or more unactivated nucleotides (XMP). Note that ImpI, but not unactivated IMP, was included in these studies. No polymerization products were detected for ImpX/XMP mixtures in the absence of MMC. For ImpX/XMP reactions containing MMC, the addition of XMP instead of ImpX to a growing strand terminated the polymerization, because XMP lacks the imidazole group needed for further nucleotide addition (Supplementary Materials Fig. S1). As a result, the termination reactions in the presence of XMP generally decreased the extent of polymerization of the ImpX.

MALDI mass spectra for the ImpA/XMP reactions are shown in Figs. 1, 2, 3 and 4, including the full spectra from trimer forward (Fig. 1) and specific spectral regions on an expanded m/z scale to better illustrate the peak assignments (Figs. 2, 3 and 4). Analogous figures for the other ImpX are shown in Supplementary Materials Figs S2-S13. For ImpA/XMP reactions, the presence of XMP decreases the extent of polymerization (Fig. 1). The homopolymerization of ImpA was highly favored and XMP incorporation was observed only for GMP (Figs. 2, 3 and 4). When only GMP was present, the (n-1)A,1G heteropolymer peak was slightly more intense than the nA homopolymer peak (Figs. 2 and 3), but decreased relative to the nA homopolymer peak as the number of different XMP in the mixture increased (Fig. 4).

Fig. 1

MALDI MS spectra of reaction products for mixtures of ImpA with each of the individual XMP, showing full spectrum from trimers forward. Boxed area shows pentamer region that is shown on an expanded scale in Fig. 2

Fig. 2

MALDI MS spectra of reaction products in the pentamer region for mixtures of ImpA with each of the individual XMP

Fig. 3

MALDI MS spectra of reaction products in the trimer region for mixtures of ImpA with each of the individual XMP

Fig. 4

MALDI MS spectra of reaction products in the tetramer region for mixtures of ImpA with multiple XMP. (a) ImpA, CMP, GMP. (b) ImpA, CMP, GMP, UMP. (c) ImpA, AMP, CMP, GMP, UMP

For ImpG/XMP reactions, the extent of polymerization significantly increased in the presence of AMP and decreased very slightly in the presence of CMP or UMP (Supplementary Materials Fig. S2). Terminal incorporation of all XMP was detected, but to a much greater extent with AMP than CMP or UMP (Supplementary Material Fig. S3). In contrast to ImpA/GMP, the nG,1A heteropolymer peak increased relative to the polyG homopolymer peak as the number of different XMP in the mixture increased, and heteropolymerization with CMP or UMP was suppressed in the presence of AMP and with increasing numbers of XMP in the mixture (Supplementary Materials Fig. S4).

For ImpI/XMP reactions, the extent of polymerization was not significantly affected by the presence of AMP and decreased slightly in the presence of the other XMP (Supplementary Materials Fig. S5). Incorporation of all of the XMP except UMP was observed in both the tetramer region (Supplementary Materials Fig. S6) and the pentamer region (not shown). The extent of incorporation was much greater for AMP than for the other XMP. As shown in Supplementary Materials Fig. S7, the presence of multiple XMPs completely suppresses incorporation of pyrimidines into polyI and greatly reduces incorporation of AMP, while incorporation of GMP is unaffected by the presence of other XMP.

For both ImpC and ImpU reactions, the presence of XMPs greatly reduced the extent of polymerization (Supplementary Materials Figs S8 (ImpC) and S11 (ImpU)). Incorporation of AMP or GMP was so strongly favored that little or no homopolymerization was detected (Supplementary Materials Figs S9 and S10 (ImpC) and S12 and S13 (ImpU)).

These results indicate that, weighing the competition between ImpX homopolymerization and incorporation of XMP, homopolymerization is favored over XMP incorporation in the order ImpA > ImpI > ImpG >> ImpC, ImpU. Among the different XMP, terminal incorporation into the growing ImpX homopolymers occurs to a much greater extent for the purines (AMP, GMP) compared to the pyrimidines (CMP and UMP). The presence of multiple XMPs tends to decrease less favored reactions relative to the most favored reaction. Since the reactions were not performed using unactivated IMP, these experiments did not inform us about selectivity toward IMP addition.

Reactions of Mixtures of ImpX

Having established through the ImpX/XMP polymerization experiments that there is selectivity among the different XMP for terminal incorporation into the growing homopolymers, we proceeded to perform polymerization reactions of mixtures of ImpX. All possible combinations of 2, 3, 4 and 5 of the five ImpX were examined, although not all provided reproducible or significant results. Since separation of molecules in MALDI MS is based on m/z, only the base composition, but not the sequence, of the polymerization products could be obtained. As discussed above, assignments of A vs. I or C vs. U in the polymerization products of mixtures containing both could not always be made since these nucleotides differ by only 1 m/z.
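The experimental design space described here is small enough to enumerate directly. The following sketch lists every mixture of 2 to 5 of the five ImpX and flags those that contain a pair differing by only about 1 m/z (A with I, or C with U); the identifiers are hypothetical and the listing says nothing about which mixtures actually gave reproducible results.

```python
from itertools import combinations

IMPX = ("ImpA", "ImpC", "ImpG", "ImpI", "ImpU")

# Pairs whose monomers differ by only about 1 m/z and may be ambiguous.
AMBIGUOUS_PAIRS = (frozenset({"ImpA", "ImpI"}), frozenset({"ImpC", "ImpU"}))


def mixtures(size):
    """All distinct mixtures containing `size` of the five ImpX."""
    return list(combinations(IMPX, size))


def has_ambiguous_pair(mixture):
    """True if the mixture contains both members of an ambiguous pair."""
    present = set(mixture)
    return any(pair <= present for pair in AMBIGUOUS_PAIRS)


if __name__ == "__main__":
    for size in range(2, 6):
        all_mix = mixtures(size)
        flagged = [m for m in all_mix if has_ambiguous_pair(m)]
        print(size, len(all_mix), "mixtures,", len(flagged), "with an ambiguous pair")
```

There are 10, 10, 5 and 1 mixtures of sizes 2, 3, 4 and 5, respectively, and every mixture of four or more components necessarily contains at least one ambiguous pair.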

As was observed for the ImpX/XMP termination reactions, the purines dominated over the pyrimidines in the reaction products and, among the purines, ImpA dominated over ImpG and ImpI. For mixtures of two ImpX, trimer products generally favored homopolymerization of the dominant nucleotide over heteropolymerization (Fig. 5, Supplementary Materials Fig. S14), although to only a small extent for ImpA with ImpG or ImpU (Supplementary Materials Fig. S14). For longer products of two ImpX reactions (shown for tetramers in Figs. 6 and 7 (top)), the heteropolymer peak for (n-1)Y,1Z, where Y is the dominant nucleotide and Z is the second-most dominant nucleotide, became at least as intense as the nY homopolymer peak for all ImpX, with the exception of ImpA with the purines (see for example Supplementary Materials Fig. S14 (middle) vs. Fig. 6 (top)). This is consistent with the results of the ImpX/XMP reactions, in which the tendency of ImpA to favor homopolymerization over pyrimidine incorporation increased with polymer length, while ImpG tended to favor increasing incorporation of pyrimidines with increasing polymer length. The reaction of the two pyrimidines was uninformative, yielding only a single peak in the MALDI mass spectrum (not shown); however, the pyrimidines behaved similarly to each other in mixtures of one or the other with a purine.

Fig. 5

MALDI MS spectra of reaction products in the trimer region for mixtures of two ImpX. (a) ImpG, ImpU. (b) ImpG, ImpC

Fig. 6

MALDI MS spectra of reaction products in the tetramer region for mixtures of two ImpX

Fig. 7

MALDI MS spectra of reaction products in the tetramer region for mixtures of multiple ImpX

Spectra in the tetramer region of mixtures containing three ImpX are shown in Fig. 7. The absence of ImpA in a mixture allowed for a much greater diversity of reaction products (see ImpA/ImpG/ImpC vs. ImpI/ImpG/ImpC in Fig. 7). Heteropolymeric tetramers containing three different XMP were detected for the ImpG/ImpI/ImpC and ImpG/ImpA/ImpC mixtures: all combinations of G,I,C tetramers (i.e., 2G,1I,1C, 1G,2I,1C and 1G,1I,2C) were observed in the former, while only 2A,1G,1C was observed in the latter, again illustrating the greater diversity of products in the absence of ImpA.

Polymerization Reactions Using In Situ Activation of XMP

To more closely approximate early Earth conditions in which nucleotides would be activated in situ in an aqueous environment rather than pre-activated in dry, organic solvents, an in situ method of imidazole activation previously developed in our group (Burcar et al. 2015) was employed. The reaction scheme is shown in Fig. 8. The instrument used to analyze the polymerization products of the in situ experiments provided better resolution than the instruments used in the other experiments and improved our ability to distinguish AMP from IMP, and CMP from UMP. The MALDI MS spectra of the in situ activation reaction products generally were more complex than those for the reactions of preactivated ImpX due to greater contributions from reaction side products, as previously reported (Burcar et al. 2015). These side products are likely due to the presence of reagents and metals in the in situ reactions that would have been removed from the pre-activated ImpX prior to use. While the side products have not yet been identified, they could be explained by inclusion of metals, such as aluminum or magnesium that are present in the MMC, in polymer aggregates.

Fig. 8

Reaction scheme for in situ activation of XMP

Consistent with the previous report of in situ XMP activation (Burcar et al. 2015), polymerization was detected for all XMP in solutions of the individual nucleotides (Supplementary Materials Fig. S15). The extent of polymerization among the different nucleotides for in situ activation decreased in the order A, I > G, C, U, which is similar to the previous report that showed A > G, U > C.

MALDI MS spectra for in situ activation in the polymerization reactions for all ten combinations of two XMP are shown over the full spectral range in Supplementary Materials Figs S16 and S17, and on an expanded m/z axis in the trimer region in Figs. 9, 10 and 11. All homopolymers and heteropolymer combinations were observed (with the caveat that there remained some ambiguity in AMP vs. IMP and UMP vs. CMP assignments), with the exception of the GMP/AMP mixture, for which the polyG homopolymer was not detected. The polymerization products generally favored nucleotides in decreasing order of A > I >> G > C, U. In the trimer region, with the exception of AMP, for which the homopolymer was always the most intense peak in the spectrum, the (n-1)Y,1Z heteropolymer peak was at least as intense as the nY homopolymer peak of the dominant nucleotide (Y). This is in contrast to the preactivated ImpX mixtures, for which nY tended to dominate in the trimers and dominance of (n-1)Y,1Z over nY was observed only for tetramers and beyond.

Fig. 9

MALDI MS spectra of reaction products in trimer region for mixtures of two in situ-activated XMP

Fig. 10

MALDI MS spectra of reaction products in trimer region for mixtures of two in situ-activated XMP

Fig. 11

MALDI MS spectra of reaction products in trimer region for mixtures of two in situ-activated XMP

MALDI MS spectra in the trimer region for all combinations of three XMP are shown in Supplementary Materials Figs S18–S20. With the exception of the CMP/UMP/IMP mixture, which gave no significant peaks, all homopolymers and most of the heteropolymer combinations of two XMP generally were detected, again with the caveat of occasional ambiguity of AMP vs. IMP and UMP vs. CMP assignments. The trends were the same as for the mixtures of two XMP. In addition, heteropolymers containing all three nucleotides in the mixture were observed for AMP/CMP/GMP, AMP/CMP/IMP and AMP/GMP/UMP mixtures.

MALDI MS spectra for all combinations of four XMP (Fig. 12) and for the mixture of all five XMP (Fig. 13) continued the same trends and exhibited a general decrease in the extent of polymerization and peak intensity with increasing numbers of different XMP in the mixtures. The IMP/GMP/CMP/UMP mixture gave no significant peaks, which highlights the decrease in extent of polymerization in the absence of AMP. No polymers containing more than three different XMP were observed, and the only polymer containing three different XMP was 1A,1C,1G, which was detected in the AMP/CMP/GMP/UMP and AMP/CMP/GMP/IMP/UMP mixtures. This supports the decrease in the diversity of polymerization products as the number of nucleotides in the mixture increases.

Fig. 12

MALDI MS spectra of reaction products in trimer region for mixtures of four in situ-activated XMP

Fig. 13

MALDI MS spectra of reaction products in trimer region for a mixture of five in situ-activated XMP

Discussion

While variables not considered here, such as nucleotide activation mechanisms, catalytic processes, physicochemical microenvironments, mineralogy and nucleotide availability, would undoubtedly contribute to the composition of RNA polymerization products on early Earth, the results of the present studies suggest that the nucleotides themselves exhibit selectivity that helps to direct the polymerization reactions toward certain base compositions. The termination reactions established that nucleotide addition to a growing polymer strand is a selective process generally favoring addition of purines over pyrimidines, and especially favoring AMP over the other nucleotides.

The reactions of mixtures of preactivated ImpX or in situ activated XMP showed that the extent of polymerization is greatest in the presence of ImpA or AMP, respectively. The pyrimidines generally were less reactive than the purines, and heteropolymers containing both pyrimidines and purines tended to favor those with higher purine content. When ImpA/AMP was present it tended to dominate the polymerization products. In general, the heteropolymer peaks increased relative to the dominant homopolymer peak with increasing polymer length, except for mixtures of ImpA with either of the purines.

The pre-activated purines, which had significantly longer shelf-lives than the pre-activated pyrimidines, were the more reactive species. ImpA, which had the longest shelf-life, was also the dominant nucleotide in polymerization reactions, biasing reactions toward high AMP content and suppressing non-AMP containing products. ImpG and ImpI were less dominant, yielding to ImpA when present, and producing more diverse reaction products with each other and with pyrimidines in the absence of ImpA. ImpC and ImpU, the least stable of the pre-activated nucleotides, were strongly biased toward heteropolymerization with purines over homopolymerization or heteropolymerization with each other. These results suggest that selectivity toward purines, especially AMP, over pyrimidines in the polymerization reactions is related to the relative stabilities of their activated forms.

The mechanism that confers greater stability to the activated purines, especially ImpA, is unclear. It may be that the lowest energy structure of the activated nucleotides is one in which the nucleobase overlaps with the imidazole. In these cases, the double-ringed purine nucleobases have larger pi systems available for pi-pi stacking with the imidazole group compared to single-ringed pyrimidines. It is also possible that the high degree of pi-pi stacking among purines could impart greater stability.

Another consideration is that MMC might play a role in the selectivity of the polymerization reactions. The use of other minerals that were likely to have been available on early Earth could indicate if such a mineral-dependent bias does exist. The methods of activation and/or catalysis should also be considered: other polymerization methods that do not use imidazole or MMC have been investigated for the polymerization of nucleotides (Costanzo et al. 2009, 2012; Horowitz et al. 2010; Hud et al. 2007; Monnard et al. 2003; Monnard 2009; Morasch et al. 2014) and these methods might result in different selectivity toward inclusion of nucleotides. For example, the polymerization of ImpX mixtures performed in ice eutectic phases (Monnard et al. 2003) showed less preference toward purines, with decreasing activity of A > G, U > C. It is interesting that this is the same order previously reported for polymerization of individual, in situ activated nucleotides (Burcar et al. 2015) and similar to the results for mixtures of in situ activated nucleotides in this work. It is possible that performing the reactions either with in situ nucleotide activation in the polymerization reaction or at the low temperature of the ice eutectic phase affects the kinetics of the polymerization reaction and competing reactions such as nucleotide deactivation in a way that levels the effective activity of the nucleotides.

The abiotic polymerization reactions in this work do not account for the eventual selection of AMP, CMP, GMP and UMP for modern RNA, as evidenced by the inclusion of non-canonical IMP in the products and the lower reactivity of the pyrimidines. The inclusion of other chemicals and processes may elucidate a pathway by which the canonical nucleotides were selected. For example, the RNA World scenario does not preclude the participation of peptides or other biopolymers that could also exert selection pressures in RNA polymerization.