Do antisense transcripts have different names than their sense strand transcripts?


I want to find which genes in the human genome could be targeted by a complementary transcript acting through antisense inhibition. Are cis-NATs (naturally occurring antisense transcripts) considered distinct transcripts and named differently, or do they have the same name as the sense transcript?


If the anti-sense transcript is correctly annotated and in the databases (a very very big if), then it will have a different name. For example, the mouse Msx1 anti-sense transcript has the RefSeq accession NR_027920 while the sense transcript of the same gene is NM_010835.

In general, each transcript that is transcribed from a given locus has a different accession. The same is true for alternatively spliced transcripts: if a given gene has 4 alternatively spliced transcripts, each of them will have its own unique accession. The human insulin receptor gene, for example, has two protein-coding transcripts, each with its own accession: NM_000208 and NM_001079817.
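As a quick way to tell sense mRNAs from non-coding (often antisense) records among the accessions above, the RefSeq prefix can be inspected. The following is a minimal sketch; the helper name `refseq_molecule_type` is made up for illustration, and only the four most common RNA prefixes are covered.

```python
# Minimal sketch: map a RefSeq accession prefix to its molecule type.
# The function name is hypothetical; only common RNA prefixes are listed.
REFSEQ_PREFIXES = {
    "NM_": "curated protein-coding mRNA",
    "NR_": "curated non-coding RNA",
    "XM_": "predicted (model) mRNA",
    "XR_": "predicted (model) non-coding RNA",
}

def refseq_molecule_type(accession: str) -> str:
    """Return a description of the molecule type implied by the prefix."""
    prefix = accession.strip()[:3].upper()
    return REFSEQ_PREFIXES.get(prefix, "unknown prefix")

# Accessions quoted in the answer above.
for acc in ["NM_010835", "NR_027920", "NM_000208", "NM_001079817"]:
    print(acc, "->", refseq_molecule_type(acc))
```

The prefix only says whether a record is coding or non-coding; whether an NR_ record is antisense to a given gene still has to be checked from its genomic coordinates and strand.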

You may be interested in a recent analysis of cis-NATs in 10 species[1] and also in NATsDB: Natural Antisense Transcripts DataBase[2].

  1. Osato N, Suzuki Y, Ikeo K, Gojobori T. 2007. Transcriptional Interferences in cis Natural Antisense Transcripts of Humans and Mice. Genetics, 176(2): 1299-1306, doi:10.1534/genetics.106.069484.
  2. Li JT, Zhang Y, Kong L, Liu QR, Wei L. 2008. Trans-natural antisense transcripts including noncoding RNAs in 10 species: implications for expression regulation. Nucleic Acids Research, 36(15):4833-44, doi:10.1093/nar/gkn470.

Conserved expression of natural antisense transcripts in mammals

Recent studies have found thousands of natural antisense transcripts originating from the same genomic loci as protein-coding genes but from the opposite strand. It is unclear whether the majority of antisense transcripts are functional or merely transcriptional noise.

Results

Using the Affymetrix Exon array with a modified cDNA synthesis protocol that enables genome-wide detection of antisense transcription, we conducted large-scale expression analysis of antisense transcripts in nine corresponding tissues from human, mouse and rat. We detected thousands of antisense transcripts, some of which show tissue-specific expression that could be subjected to further study for their potential function in the corresponding tissues/organs. The expression patterns of many antisense transcripts are conserved across species, suggesting selective pressure on these transcripts. When compared to protein-coding genes, antisense transcripts show a lesser degree of expression conservation. We also found a positive correlation between the sense and antisense expression across tissues.
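The sense/antisense correlation across tissues mentioned above can be illustrated with a plain Pearson coefficient. This is only a sketch: the excerpt does not say which correlation measure the authors used, and the expression values below are invented for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)

# Hypothetical log2 expression of one sense/antisense pair in nine tissues
# (illustrative numbers, not data from the study).
sense = [8.1, 7.4, 9.0, 6.2, 7.9, 8.8, 6.5, 7.1, 8.3]
antisense = [5.0, 4.6, 5.9, 3.8, 4.9, 5.5, 4.1, 4.4, 5.2]

r = pearson(sense, antisense)
print(f"sense/antisense correlation across tissues: r = {r:.2f}")
```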

Conclusion

Our results suggest that natural antisense transcripts are subjected to selective pressure but to a lesser degree compared to sense transcripts in mammals.


Regulatory roles of natural antisense transcripts

Although their classification and nomenclature are still in the early stages, the number of functionally annotated lncRNAs continues to expand rapidly. Indeed, several subclasses have already been broadly characterized, including NATs [8], large intergenic ncRNAs (lincRNAs) [9], and totally and partially intronic non-coding transcripts (TINs and PINs) [10]. Although lincRNAs have also come to prominence in recent years, particularly due to their association with various types of cancer (table 1), a comprehensive description of the expanding fields of these and other types of lncRNAs is beyond the scope of this short review. Instead we will focus on the therapeutic potential of NATs, arguably the best characterized of the lncRNAs. In addition to the information presented here, we encourage the reader to refer to the following excellent articles, which further describe different themes within the diverse world of lncRNAs [11-14].

Table 1

Recent examples of regulatory lncRNAs with potential roles in disease

Class | Gene | Disease | Reference
NATs | MALAT1 | Cancer | [61]
NATs | ATXN8OS | Spinocerebellar ataxia | [62]
NATs | ASFMR1 | Fragile X syndrome | [63]
NATs | NAT-RAD18 | Alzheimer's disease | [42]
NATs | PINK1-AS | Parkinson's disease, diabetes | [34]
NATs | ANRIL | Cancer, cardiovascular disease, diabetes | [39]
NATs | BACE1-AS | Alzheimer's disease | [18]
NATs | NPPA-AS | Cardiovascular disease | [35]
NATs | HTTAS | Huntington's disease | [33]
NATs | p15AS | Cancer | [41]
LincRNA | HOTAIR | Cancer | [64]
LincRNA | LincRNA-p21 | Cancer | [65]
LincRNA | LincRNA-EPS | Anemia | [66]
LincRNA | LSINCT5 | Cancer | [67]
LincRNA | CUDR | Cancer | [68]

NATs are lncRNAs that are transcribed from the opposite DNA strand to sense (protein-coding) transcripts and overlap in part with the sense RNA, sense promoter or other regulatory regions [2]. They can originate from coding or non-coding DNA, including genic, intergenic or intronic sequences, and display a number of similarities to mRNA, such as 5′ capping, 3′ polyadenylation and alternative splicing [8]. The most common form of regulation seen for NATs is the pairing of one of these non-coding antisense RNAs with a partner sense transcript [2]. Results from the FANTOM3 project revealed at least 1000 sense-antisense transcript pairs that were well conserved between mouse and human, as well as many thousands considered to be non-conserved [15]. In 2010, Faghihi and colleagues examined the targeted knockdown of 797 evolutionarily conserved NATs and found evidence of regulatory roles for a number of sense-antisense pairs [16]. Indeed, NATs have now been identified for a broad range of mammalian genes involved in diverse processes such as cell growth and differentiation, development, metabolism, cardiovascular function and synaptic plasticity (tables 1 and 2). Despite being processed in much the same way as mRNAs, NATs do not all display the same characteristics. For example, they can be polyadenylated or non-polyadenylated, and can be localized within the nucleus or the cytoplasm. While they have generally been shown to be less abundant than sense transcripts [17], the expression level of a given antisense transcript can also vary based on cell type [18, 19]. Using a custom ncRNA array, Clark and colleagues very recently demonstrated that the half-lives of cis-acting antisense transcripts can range from three to ten hours, although the majority of those examined clustered at about five to six hours [20]. The stability of NATs also appears to be influenced by their cellular localization, with nuclear transcripts shown to be more unstable [20].

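Table 2

Functional validation of natural antisense regulation in vitro.

Regulation | Sense | Designated Antisense | Species | Reference
Concordant | PINK1 | PINK1-AS | Mouse, human | [34]
Concordant | BACE1 | BACE1-AS | Mouse, human | [18]
Concordant | iNOS | iNOS-AS | Rat | [28]
Concordant | p53 | Wrap53 | Human | [57]
Concordant | HAS2 | HAS2-AS | Human | [69]
Concordant | WDR83 | DHPS | Human | [70]
Discordant | Msx-1 | Msx-1 As | Mouse | [71]
Discordant | HIF1-a | aHIF1 | Human | [31]
Discordant | Crx | CrxOS | Mouse | [72]
Discordant | D5D | Reverse D5D | Human | [73]
Discordant | Rad18 | Nat-Rad18 | Rat | [42]
Discordant | p15 | p15AS | Human | [41]
Discordant | BOK | BOK-AS | Human | [74]
Discordant | NPPA | NPPA-AS | Human | [35]
Discordant | SLC39A5 | ANKRD52 | Human | [16]
Discordant | CCPG1 | CCPG1-AS | Human | [16]
Discordant | RAPGEF3 | RAPGEF3-AS | Human | [16]
Discordant | Tie-1 | hAS-2 | Human | [75]
Discordant | BDNF | BDNF-AS | Mouse, human | [19]
Discordant | GDNF | GDNF-AS | Human | [19]
Discordant | EPHB2 | EPHB2-AS | Human | [19]
Discordant | Tbca13 | Tbca16 | Human | [76]
Discordant | Tspo | Tspo-Nat | Mouse | [77]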

So far, the regulatory roles of various forms of lncRNAs have generally been demonstrated through large-scale molecular screens, in which their activities have been disrupted using RNA interference technology. These studies have demonstrated that endogenous lncRNAs can act by repressing or promoting the expression of their target genes. This regulation has been seen to occur either in cis, whereby the lncRNA originates within the same genetic locus as its target sense transcript, or in trans, whereby the lncRNA is derived from a chromosomal location distant from the gene on which it acts [16, 21]. Although they demonstrate a high degree of target specificity, the manner in which NATs regulate their corresponding paired sense transcripts is thought to be quite diverse, involving various suggested transcriptional and post-transcriptional mechanisms (for review see [8]). For example, a number of them have now been shown to interact with the promoter regions of their corresponding sense strand in cis and influence DNA methylation, or act as scaffolds for the targeted recruitment of histone-modifying complexes that dictate its chromatin state [22]. Indeed, the presence of RNA-binding motifs in many chromatin-modifying enzymes suggests that some NATs that are in low abundance within the nucleus can act locally to orchestrate histone modifications and thus mediate gene silencing [23]. So far, the best-characterized example of this is the recruitment of the gene-silencing polycomb repressive complex 2 (PRC2), which is known to induce trimethylation of the lysine 27 residue on histone H3 (H3K27me3), a mark of transcriptionally silent chromatin. RNA immunoprecipitation (RIP) combined with directional RNA sequencing revealed that the PRC2 complex associates with over 9000 RNAs in mouse embryonic stem cells, and that almost 3000 of these RNAs are NATs. In fact, PRC2 recruitment by one such NAT, the X (inactive)-specific transcript (Xist), can lead to the inactivation of the entire X-chromosome in humans through heterochromatinization [24], which can also have important implications for various X-linked diseases [25].

Alternatively, sequence complementarity of cis-acting NATs means that they can also regulate the expression of their paired sense RNAs through the formation of RNA duplexes, both in the nucleus and the cytoplasm. The formation of RNA-RNA duplexes between NATs and their partner sense transcript in the nucleus can lead to differential RNA splicing or reduced cellular availability of the sense RNA through nuclear retention [26, 27]. Furthermore RNA duplex formation in the cytoplasm has been shown to influence sense transcript stability through alterations in secondary structure or by masking other regulatory binding sites [18, 28]. RNA duplexes within the cytoplasm have also been shown to inhibit sense RNA translation between initiation and elongation stages [29].

Given this complex level of regulation, it is likely that altered cellular NAT expression could also contribute to a number of pathological states. This is supported by observations that some antisense transcripts, such as BACE1-AS, HSR and HIF1-AS, can be influenced by various cellular stressors [18, 30-32]. Furthermore, NATs have now been identified for genes involved in various neurological conditions, such as Alzheimer's [18], Huntington's [33], and Parkinson's disease [34], as well as cardiovascular [35] and metabolic disorders [34]. Although it currently remains to be seen whether some of these transcripts are actually differentially expressed in patient populations, as we will discuss later, functional validation of these and other similar NATs may lead to the identification of novel drug targets to combat disease progression (table 1).

While a number of mutations have been reported in various ncRNA loci [3, 36, 37], there is so far very limited evidence from human genetic studies to show that mutations or single nucleotide polymorphisms (SNPs) within genes encoding NATs can affect their expression levels or give rise to disease phenotypes. However, genome-wide association studies have demonstrated that the intergenic region encoding ANRIL, a large antisense transcript that mediates the transcriptional regulation of the tumor suppressor genes INK4b/ARF/INK4a at the INK4a locus [38], is associated with increased susceptibility to cancer, type 2 diabetes and coronary disease [39]. Indeed, SNPs within the ANRIL-encoding region that are linked with increased susceptibility to atherosclerosis are associated with reduced ANRIL expression levels in purified peripheral blood T-cells. On the other hand, ANRIL is seen to be over-expressed in prostate cancer cells, inducing the silencing of the tumor suppressor genes INK4b/ARF/INK4a through the recruitment of PRC1 [38]. A further report has shown that ANRIL is also required for the PRC2-mediated silencing of the P15/INK4B tumor suppressor gene, and that disrupting its interaction with the PRC2 complex can inhibit cell proliferation [40]. Increased expression of a NAT for another tumor suppressor gene, p15 (p15AS), was also observed in leukemia patients, which correlated with a marked decrease in p15 sense levels [41]. Here, the investigators showed that p15AS expression increased dimethylation of histone 3 lysine 9 (H3K9) and reduced dimethylation of histone 3 lysine 4 (H3K4) at the p15 promoter, resulting in persistent transcriptional silencing. While the cause of this p15AS upregulation is unknown, this is a clear example of how a differentially expressed NAT could function as a potential molecular marker for disease, as well as a novel therapeutic target.


Results

Prediction of Arabidopsis trans-NAT pairs

To identify trans-NATs in Arabidopsis, we first collected sequences of all Arabidopsis annotated genes and full-length cDNA transcripts, and grouped them into clusters according to their genomic locations. Here, a transcript cluster represented a group of all transcripts derived from the same gene or genomic locus. A genome-wide trans-NAT screen was carried out by searching for transcript cluster pairs sharing sequence complementarity with each other using the NCBI BLAST program. Two transcripts were considered as a trans-NAT pair if they have partial or perfect sequence complementary regions that could form RNA-RNA duplexes, and either the total length of all putative duplex regions of the two transcripts is longer than 50% of the length of the shorter transcript of the pair ('high-coverage' category) or the length of the longest putative duplex region of the two transcripts is greater than 100 nucleotides (nt; '100 nt' category). After removing previously reported cis-NATs and pairs formed by transcripts derived from annotated transposons and pseudogenes, a total of 1,320 trans-NAT pairs were identified within the Arabidopsis genome (Additional data file 1). Among them, 368 trans-NAT pairs belonged to the 'high-coverage' category, whilst the remaining 952 pairs were from the '100 nt' category (Table 1). The average length of the double-stranded pairing region of the 'high-coverage' trans-NAT pairs is 571 nt, with a range between 75 and 2,628 nt. For the '100 nt' trans-NAT pairs, the average pairing length is 258 nt, with a range between 100 and 1,621 nt.
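The two length criteria above can be sketched as a small classifier. This is an illustrative sketch, not the authors' pipeline: the function name, the input format (a list of duplex-region lengths taken from BLAST hits) and the choice to test the 'high-coverage' rule first are assumptions.

```python
def classify_trans_nat(duplex_lengths, len_a, len_b):
    """Assign a candidate pair to a trans-NAT category, or None.

    duplex_lengths: lengths (nt) of all putative duplex regions of the pair.
    len_a, len_b:   full lengths (nt) of the two transcripts.
    Assumption: the 'high-coverage' rule is checked before the '100 nt' rule.
    """
    if not duplex_lengths:
        return None
    shorter = min(len_a, len_b)
    # Total duplex length longer than 50% of the shorter transcript.
    if sum(duplex_lengths) > 0.5 * shorter:
        return "high-coverage"
    # Longest single duplex region greater than 100 nt.
    if max(duplex_lengths) > 100:
        return "100 nt"
    return None

# Hypothetical candidate pairs (lengths invented for illustration).
print(classify_trans_nat([300, 150], 800, 1200))   # total 450 nt vs. 400 nt cutoff
print(classify_trans_nat([120], 1500, 2000))       # one 120 nt duplex
print(classify_trans_nat([60], 900, 1000))         # too short for either rule
```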

RNA molecules are known to assume various three-dimensional structures to execute their biological functions or to interact with other molecules. To investigate whether two transcripts of a putative trans-NAT pair could indeed form a double-stranded RNA duplex, we used a hybrid program [23, 24] to inspect the melting structure of each trans-NAT pair in silico. The results show that the two transcripts of all predicted trans-NAT pairs in the high-coverage category and about 90% of the pairs in the 100 nt category could hybridize to each other and have extended duplex regions in their lowest energy melting forms, at least based on the in silico RNA hybridization model (see Materials and methods). Some trans-NAT pairs even had a double-stranded pairing region extending beyond the predicted area based on BLAST results (Figure 1).

Figure 1. Annealed structure of a trans-NAT pair (At4g19270::At1g56530). The annealed structure of the two transcripts was predicted by the hybrid program. Transcript At4g19270 is shown as the upper strand from 5' to 3', whilst transcript At1g56530 is shown as the lower strand from 3' to 5'. The paired region obtained from the BLAST search is shown in red.

Expression analysis of trans-NATs

Among the 1,320 trans-NAT pairs, 658 pairs were formed by two transcript clusters both of which had matching full-length cDNAs, 444 pairs had full-length cDNA support for one transcript, and the remaining 218 pairs were identified solely by comparing annotated gene sequences (Table 1).

For an RNA molecule to function as a trans-NAT, it has to co-exist with its sense transcript in the same cell in order to form a double-stranded RNA duplex. To check the possibility of co-expression of the putative trans-NAT pairs, we used the Arabidopsis public MPSS database to examine the expression profiles of transcripts in different tissues or under different growth conditions. The Arabidopsis public MPSS database contains 17 nt and 20 nt long expressed sequence tags of Arabidopsis transcripts from 17 different tissues or plants grown under different conditions. In this study, we first mapped all 17 nt and 20 nt MPSS tags to the Arabidopsis genome, and selected for further analysis only those tags that could be uniquely mapped to transcripts forming trans-NAT pairs. About 16% of trans-NAT pairs in the 'high-coverage' category and 28% of trans-NAT pairs in the '100 nt' category had corresponding MPSS tags for both transcripts, and another 32% and 45% of trans-NAT pairs in the 'high-coverage' and the '100 nt' categories, respectively, had MPSS tags for one transcript (Table 2). For those trans-NAT pairs in which both transcripts had matching MPSS data, more than 85% were co-expressed in at least one tissue (Table 2), suggesting that the two transcripts of these trans-NAT pairs had the opportunity to form double-stranded RNA duplexes in vivo. The expression patterns of two trans-NAT pairs derived from the MPSS data are shown in Table 3 as examples. We note that, in most cases, the sense and antisense transcripts of a trans-NAT pair had comparable expression levels when expressed in the same tissue. No significant tissue bias was observed in the expression of trans-NAT pairs when comparing MPSS data from the 17 different libraries.
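The "co-expressed in at least one tissue" test described above can be sketched as follows. The dictionary layout, library names, tag counts and the `min_count` threshold are all assumptions made for illustration; the excerpt does not state the detection threshold that was used.

```python
def coexpressed_tissues(sense, antisense, min_count=1):
    """Return the libraries in which both transcripts of a pair are detected.

    sense, antisense: dicts mapping library name -> MPSS tag count.
    min_count: hypothetical detection threshold (not given in the text).
    """
    shared = sense.keys() & antisense.keys()
    return sorted(
        lib for lib in shared
        if sense[lib] >= min_count and antisense[lib] >= min_count
    )

# Hypothetical tag counts for one trans-NAT pair in three libraries.
sense_counts = {"root": 12, "leaf": 0, "flower": 5}
antisense_counts = {"root": 9, "leaf": 4, "flower": 0}

shared_libs = coexpressed_tissues(sense_counts, antisense_counts)
print("co-expressed in:", shared_libs)
print("pair is co-expressed in at least one tissue:", bool(shared_libs))
```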

To further investigate the potential of putative trans-NAT pairs to form double-stranded RNA duplexes at the single cell level, we inspected the expression pattern of each trans-NAT pair in Arabidopsis root cells using publicly available in situ hybridization data (AREX database) [25]. Since the AREX database contains information only for annotated Arabidopsis genes, only 658 putative trans-NAT pairs for which both transcripts derived from annotated genes could be compared by this analysis. Among the 355 trans-NAT pairs with in situ hybridization data for both transcripts, mRNAs of both transcripts of 237 pairs (67%) were found in the same cell with comparable expression levels (Table 4), suggesting that the sense and antisense transcripts of these pairs have the opportunity to interact with each other in Arabidopsis root cells. Whether sense and antisense transcripts in the same cell might be present in different cellular compartments awaits future experimental investigations. A complete list of the 355 trans-NAT pairs with available in situ hybridization data is provided in Additional data file 2.

Functions of trans-NAT pairs

We used the Arabidopsis function assignment from the Gene Ontology (GO) consortium to analyze the biological functions of trans-NATs and observed a modest functional category bias. Transcripts from function classes with catalytic activity, signal transducer activity and transporter activity were slightly over-represented (Figure 2). Chi-square test results showed that the difference between transcripts of trans-NAT pairs versus those from the whole genome had a p value < 0.01 in all the above categories, indicating that the difference was statistically significant. A detailed gene function analysis using FuncAssociate [26] revealed that transcripts from several gene families or functional groups were over-represented in trans-NAT pairs, including transcripts of UDP-glycosyltransferase genes, and gene transcripts involved in cell wall biosynthesis, protein ubiquitination and responses to auxin stimulus (Table 5). By contrast, no enrichment in any specific gene family was found among transcripts of cis-NAT pairs (data not shown).
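A chi-square comparison of this kind can be sketched for a single GO category with a 2x2 table. The counts below are invented, since the excerpt does not give the underlying numbers, and the sketch uses the uncorrected Pearson statistic; the authors' exact test setup is not described here.

```python
# Chi-square critical value for 1 degree of freedom at p = 0.01.
CRITICAL_P01_DF1 = 6.635

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for
    the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom

# Hypothetical counts for one category (not taken from the paper):
# rows = trans-NAT transcripts / other transcripts,
# columns = in category / not in category.
stat = chi_square_2x2(420, 1580, 4200, 22800)
print(f"chi-square = {stat:.1f}; significant at p < 0.01: {stat > CRITICAL_P01_DF1}")
```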

Functional analysis of trans-NATs using GO. The percent of Arabidopsis annotated genes and genes involved in trans-NAT pairs in each functional category are shown.

Evolutionary conservation of trans-NAT pairs

To study the possible phylogenetic conservation of trans-NATs in higher plants, we performed an in silico search for trans-NAT pairs in poplar and rice and compared them with those from Arabidopsis. For about 60% of Arabidopsis trans-NAT pairs, homologs of at least one transcript involved in the pair also have trans-NAT partners in either poplar or rice (Table 6). For the majority of these Arabidopsis trans-NAT pairs, only one transcript retained a trans-NAT relationship in poplar or rice, but with new partners. Even for the small proportion of Arabidopsis trans-NAT pairs in which both transcripts retained trans-NAT relationships in poplar or rice, the sense and antisense transcripts of the same trans-NAT pair tended to have new pairing partners; only one trans-NAT pair remained the same in poplar and rice as in Arabidopsis.

Networks formed by cis- and trans-NAT pairs

Unlike cis-NAT pairs, of which one sense transcript usually has only one antisense partner, one-to-many relationships are commonly seen in trans-NATs. There were also cases in which one transcript formed different double-stranded RNA duplexes with different transcripts derived from the same gene as a result of alternative splicing. Among all transcript clusters involved in trans-NAT pairs, 425 from both the high-coverage category and the 100 nt category can form multiple trans-NAT pairs with other transcripts (Figure 3).

Pairing relationship of transcript clusters in trans-NAT pairs.

Comparison with previously reported Arabidopsis cis-NAT data revealed that 430 transcripts on the trans-NAT list also had cis-NATs [7], indicating that antisense transcripts might form complex regulatory networks in Arabidopsis. UDP-glucosyl transferase family proteins are important enzymes catalyzing the transfer of sugars [27]. The Arabidopsis genome contains about 115 genes encoding UDP-glucosyl transferase family proteins. Transcripts of 44 UDP-glucosyl transferase genes have one or more pairing trans-NATs, among which 5 also have putative cis-NATs. Transcripts of another 13 UDP-glucosyl transferase genes have pairing cis-NATs only. We analyzed NAT pairs formed by transcripts of UDP-glucosyl transferase gene family members in detail using the yEd software [28] to uncover possible regulatory networks formed by antisense transcripts (Figure 4). Our results showed that antisense transcripts could potentially regulate the UDP-glucosyl transferase family transcripts in various ways. Some transcripts could form antisense pairs with transcripts of UDP-glucosyl transferase family members in both a cis- and trans-manner. Phylogenetic analysis of UDP-glucosyl transferase gene member transcripts indicated that closely related transcripts (from the same clade of the phylogenetic tree) tended to be regulated by the same trans-antisense transcript (Figure 4, Additional data file 3). Such a complex pairing network was also observed amongst transcripts of several other gene families (data not shown).

Networks of cis- and trans-NAT pairs formed by transcripts encoding UDP-glucosyl transferase family proteins in A. thaliana. Green ellipses represent UDP-glucosyl transferase transcripts involved in trans-NAT pairs only; red ellipses represent UDP-glucosyl transferase transcripts involved in cis-NAT pairs only; yellow ellipses represent UDP-glucosyl transferase transcripts involved in both cis- and trans-NAT pairs. Transcripts from other protein families are shown as blue ellipses. Directed lines represent the pairing relationship of two transcripts, with arrows pointing to UDP-glucosyl transferase transcripts.

Potential roles of trans-NATs in inducing gene silencing

It has been shown that double-stranded RNA duplexes could be digested by Dicer to produce small interfering RNAs (reviewed in [29]). Since trans-NAT pairs also have long extended double-stranded regions, we asked whether some, if not all, of them could regulate each other's expression via the RNA interference pathway. To test this hypothesis, we first mapped all available Arabidopsis small RNAs from the public Arabidopsis MPSS database to the Arabidopsis genome [30], and searched for those siRNAs that could presumably be generated by trans-NAT pairs. We were able to identify a total of 148 siRNAs that were putatively derived from the RNA-RNA duplex region of 171 trans-NAT pairs (Table 7). Among them, 110 siRNAs could be generated by more than one trans-NAT pair. Comparison of siRNA density (matched siRNA number versus sequence length) between the pairing and non-pairing regions of the 171 trans-NAT pairs revealed that the siRNA density in duplex regions is 1.75 times higher than that in single-strand regions (14 siRNA per 1,000 nt versus 8 siRNA per 1,000 nt). SiRNAs generated from the duplex region of a trans-NAT pair could anneal to the antisense transcript and prime the synthesis of double-strand RNAs through RNA-dependent RNA polymerase (RDRP), thereby generating more siRNAs from sequences 5' to the original duplex region. For this reason, only sequences from the 3' end of the duplex region to the 3' end of the transcript that could not produce RDRP-generated siRNAs were considered in the siRNA density analysis.
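The density comparison above is a simple per-length normalization. The sketch below reproduces the reported ratio from the two densities quoted in the text; the helper name is hypothetical, and the raw siRNA counts and region lengths behind those densities are not given in the excerpt.

```python
def sirna_density(n_sirna, region_length_nt):
    """Matched siRNAs per 1,000 nt of sequence."""
    if region_length_nt <= 0:
        raise ValueError("region length must be positive")
    return 1000.0 * n_sirna / region_length_nt

# Densities reported in the text, expressed per 1,000 nt.
duplex_density = sirna_density(14, 1000)         # pairing (duplex) regions
single_strand_density = sirna_density(8, 1000)   # non-pairing regions

ratio = duplex_density / single_strand_density
print(f"duplex vs. single-strand siRNA density ratio: {ratio:.2f}")
```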

Expression profile comparison of the trans-NAT specific siRNAs between the Arabidopsis wild-type and RNA-dependent RNA polymerase 2 (rdr2) loss-of-function mutant [31] showed that, out of the 148 siRNAs, only 1 was found in the rdr2 mutant. This result suggests that at least some siRNAs generated by trans-NATs are RDR2-dependent.

Because a large proportion of the 171 siRNA-related trans-NAT pairs were formed by putative transcripts from genes annotated as encoding hypothetical proteins, we asked whether some of these genes are uncharacterized transposable elements. To address this question, we extracted the corresponding genomic regions of genes involved in the 171 trans-NAT pairs, and used RepeatMasker to examine the homology of these sequences with known transposable elements. The results showed that 101 trans-NAT pairs had at least one transcript whose corresponding genomic region displayed high homology to transposable elements listed in the Repbase, indicating that these genes might be derived from transposons.

Trans-NATs and alternative splicing

Another reported function of trans-NATs is to alter the splicing pattern of their corresponding sense transcripts by base pairing, thereby masking certain splicing sites [10, 11]. To explore the potential roles of Arabidopsis trans-NATs in regulating alternative splicing, we compared the proportion of genes with alternative splicing in our predicted trans-NAT pairs with that of all genes in the Arabidopsis genome. A previous study using full-length cDNAs showed that about 11.59% of Arabidopsis transcription units had alternative splicing events [32]. For the 658 predicted trans-NAT pairs that had corresponding annotated genes for both transcripts, 127 pairs had one transcript with known alternatively spliced gene products, and another 3 pairs had alternatively spliced forms for both transcripts (Table 8). These data show that Arabidopsis trans-NAT pairs have a much higher proportion of alternative splicing events (19.76%) compared to all transcription units in the genome (11.59%), suggesting that some trans-NATs might function in regulating alternative splicing in Arabidopsis. Furthermore, among these 130 trans-NAT pairs, about 60% had antisense pairing regions that overlapped with alternatively spliced exons, suggesting that the binding of antisense transcripts to the pre-mRNA of their sense partners could cause the exclusion of the pairing region from the resulting mature sense mRNAs.
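The 19.76% figure follows directly from the pair counts given above. A short check of the arithmetic, with variable names chosen only for illustration:

```python
# Counts taken from the text.
pairs_with_annotation = 658   # pairs with annotated genes for both transcripts
pairs_one_spliced = 127       # one transcript alternatively spliced
pairs_both_spliced = 3        # both transcripts alternatively spliced
genome_pct = 11.59            # reported genome-wide percentage

spliced_pairs = pairs_one_spliced + pairs_both_spliced
trans_nat_pct = 100.0 * spliced_pairs / pairs_with_annotation
fold = trans_nat_pct / genome_pct

print(f"{spliced_pairs} of {pairs_with_annotation} pairs = {trans_nat_pct:.2f}%")
print(f"about {fold:.1f}-fold the genome-wide proportion")
```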


AU-rich element (ARE). A region in an RNA transcript with frequent A and U nucleotides, such as AUUUA, that targets the RNA for degradation.

Dicer. An RNase III family endonuclease that processes dsRNA and pre-miRNAs into siRNAs and miRNAs, respectively.

ENCODE (ENCyclopedia Of DNA Elements). A publicly funded project that aims to find functional elements in the human genome.

MicroRNA (miRNA). Small (20–25 nucleotides) ssRNA that is thought to regulate the expression of other genes, either through inhibiting protein translation or through degrading a target mRNA transcript, by a process that is similar to RNAi.

Piwi-interacting RNA (piRNA). A small (25–35 nucleotides) RNA species that is processed from precursor ssRNA, independently of Dicer, and forms a complex with the Piwi protein. piRNAs are probably involved in transposon silencing and stem cell function.

RISC (RNA-induced silencing complex). A multi-protein complex that incorporates one strand of siRNA and uses it to recognize complementary target mRNA for degradation.

RITS (RNA-induced transcriptional silencing complex). A multi-protein complex (for example, in fission yeast) that incorporates short RNA molecules, such as siRNAs, and triggers downregulation of transcription of a particular gene or genomic region. This is usually accomplished by the modification of histone tails, which target the genomic region for heterochromatin formation.

Small interfering RNA (siRNA). Short (21–23 nucleotides) RNA molecule that is processed from a long dsRNA. siRNAs are functional components of the RISC, and they typically target mRNAs by binding perfectly complementary sequences in the mRNA and causing their degradation.

Transcriptional unit. A group of expressed sequence tags or mRNAs, usually with alternative splice patterns, that share exonic overlap of at least 1 nucleotide and are in the same chromosomal orientation.


RESULTS

Promoters subject to antisense transcription show increased levels of chromatin remodelling enzymes and histone chaperones.

To address how antisense transcription impacts on the canonical features around the sense promoter, we defined all genes based on the level of nascent antisense transcription in a 300 bp window immediately downstream of the sense promoter, excluding any overlapping, convergent genes (Figure 1A). This allowed us to examine how sense promoters subject to high levels of antisense transcription differ from those with lower levels.

We divided 5183 S. cerevisiae genes into five classes based on the number of NET-seq reads in the 300 bp window. This gave us a class of 1024 genes (20%) with high levels of antisense transcription (≥28 reads, median 70 reads), a class of 1240 genes (24%) with little or no antisense transcription (0 or 1 read), and three intermediate classes (Figure 1A). We then assessed whether genes in the class with the highest antisense transcription were enriched for specific factors when compared to those in the lowest, utilizing an extensive analysis in which the levels of 202 promoter factors were determined genome-wide (10). Strikingly, we found that the promoters of genes subject to high levels of antisense transcription were significantly enriched for factors involved in modulating the chromatin environment (Table 1; Supplementary Table S1). To assess whether these enriched factors were a unique feature of genes subject to antisense transcription, or just transcribed genes in general, we compared the group with highest antisense transcription (median 70 antisense reads, 578 sense reads, n = 1024) with another group which had no or low levels of antisense transcription (0 or 1 read) but higher levels of sense transcription (median 948 reads, n = 1240) within the same 300 bp window (Supplementary Table S2). Components of the ISW1, ISW2 and INO80 remodelling complexes and the FACT histone chaperone (28) were specifically enriched at the sense promoters of genes subject to high antisense transcription compared to gene promoters with high levels of sense transcription (Table 1). Thus, antisense transcription is likely to be associated with specific chromatin remodelling enzymes and changes in promoter chromatin structure.
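The class assignment described above can be sketched from the two thresholds that the text states. The boundaries between the three intermediate classes are not given in this excerpt, so the sketch collapses them into a single "intermediate" label; the function name is hypothetical.

```python
def antisense_class(reads_in_window: int) -> str:
    """Classify a gene by antisense NET-seq reads in the 300 bp window."""
    if reads_in_window < 0:
        raise ValueError("read count cannot be negative")
    if reads_in_window >= 28:
        return "high"          # >= 28 reads
    if reads_in_window <= 1:
        return "low"           # 0 or 1 read
    return "intermediate"      # three classes in the text, merged here

# Class sizes reported in the text, as fractions of the 5183 genes.
total_genes = 5183
high_fraction = 1024 / total_genes
low_fraction = 1240 / total_genes
print(f"high: {high_fraction:.1%}, low: {low_fraction:.1%}")

for reads in (0, 1, 2, 27, 28, 70):
    print(reads, "->", antisense_class(reads))
```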

Antisense transcribed genes are enriched for distinct promoter-bound transcription related proteins

Enriched factor | Complex | Function | P-value (a)
Isw1 | Isw1a, Isw1b | Chromatin remodelling enzyme | 4.2 × 10⁻⁹
Ino80 | INO80 | Chromatin remodelling enzyme | 4.9 × 10⁻⁹
Pob3 | FACT | Histone chaperone | 2.5 × 10⁻⁷
Rsc9 | RSC1, RSC2, RSCa | Chromatin remodelling enzyme | 3.2 × 10⁻⁷
Swi3 | SWI-SNF | Chromatin remodelling enzyme | 4.9 × 10⁻⁷
Rpd3 | RPD3 | Histone chaperone and lysine deacetylase | 2.9 × 10⁻⁶
Rpb7 | Pol II | Recruitment of 3′-end processing factors | 3.8 × 10⁻⁶
Itc1 | Isw2 | Component of chromatin remodelling complex | 1.2 × 10⁻⁵
Spt3 | SAGA, SLIK | Negative acting subunit of the SAGA and SAGA-like transcriptional regulatory complexes | 1.9 × 10⁻⁵
Ctk1 | CTK | CTD phosphorylation, regulates mRNA 3′ end processing | 2.1 × 10⁻⁵
Spn1 (Iws1) | SPT6 interactor | Interacts with RNAP II, TBP and chromatin remodelling factors | 2.2 × 10⁻⁵
Spt6 | SPT6 | Histone chaperone | 2.9 × 10⁻⁵
Rpo21 | Pol II | Largest Pol II subunit | 3.8 × 10⁻⁵
Htb2 | nucleosome | Core histone | 6.7 × 10⁻⁵
Rpb2 | Pol II | Second largest Pol II subunit | 8.8 × 10⁻⁵

a Factors are ranked in order of P-values determined using the Wilcoxon rank sum test, and were considered enriched if their levels were higher in those genes with high antisense compared to low antisense and if they had a P-value less than 0.0001. Factors in blue were significantly enriched in genes with antisense transcription compared to those with sense transcription but no antisense transcription (see Supplementary Table S2).
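
The enrichment criterion in the footnote can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used for Table 1: the P-value uses the normal approximation to the Wilcoxon rank sum statistic with a tie correction and no continuity correction, "higher levels" is interpreted as a higher median, and all function names are our own.

```python
import math
from bisect import bisect_left, bisect_right
from statistics import median


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    ordered = sorted(values)
    return [(bisect_left(ordered, v) + bisect_right(ordered, v) + 1) / 2
            for v in values]


def rank_sum_p_value(x, y):
    """Two-sided Wilcoxon rank sum P-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    # U statistic for the first sample, derived from its rank sum.
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return 1.0  # every value is identical, so there is no evidence of a shift
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))


def is_enriched(high_antisense, low_antisense, alpha=1e-4):
    """Apply the footnote's rule: higher in the high-antisense class and P < 0.0001.

    Comparing medians is an ASSUMPTION about how 'higher' was judged.
    """
    return (median(high_antisense) > median(low_antisense)
            and rank_sum_p_value(high_antisense, low_antisense) < alpha)
```

The normal approximation is reasonable for classes of the size discussed here (over a thousand genes each); for small samples an exact test would be the safer choice.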

Antisense transcription is associated with a change in chromatin architecture at the sense promoter

We asked how genes might differ in the positioning, dynamics, and occupancy of their promoter-bound nucleosomes. Using a genome-wide map of nucleosome occupancy ( 29), we found that genes subject to high levels of antisense transcription showed elevated nucleosome occupancy across their promoters (p = 2.3 × 10⁻²⁷; Figure 1B). The NDR between the −1 and +1 nucleosomes was also shorter in genes with high antisense transcription (Figure 1B; a median length of 75 bp versus 114 bp for the highest and lowest classes respectively, p = 1 × 10⁻¹⁶). The increase in nucleosome occupancy occurred in a stepwise fashion between the five gene classes, suggesting that antisense transcription can exert changes in the chromatin even at low levels and in ≈75% of genes. Critically, we found that there is only a very weak correlation between sense and antisense transcription within the 300 bp window (Figure 1C, Spearman's correlation coefficient = −0.07). Furthermore, changes in sense transcription were not found to be inversely correlated with changes in antisense transcription genome-wide using NET-seq experiments conducted in cells shifted from glucose- to galactose-containing medium, despite there being a >3-fold change in sense transcription for 1078 genes (20% of the genome) ( 8) (Figure 1D, rs = +0.06). Thus the increase in nucleosome occupancy associated with increasing antisense transcription is likely to be independent of sense transcription.
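
The Spearman correlation used to compare sense and antisense transcription can be sketched in a few lines. This is a minimal illustration, assuming the per-gene read counts in the 300 bp window are supplied as two equal-length lists; the function names are our own and this is not the code used to produce Figure 1C.

```python
import math
from bisect import bisect_left, bisect_right


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    ordered = sorted(values)
    return [(bisect_left(ordered, v) + bisect_right(ordered, v) + 1) / 2
            for v in values]


def spearman(sense_reads, antisense_reads):
    """Spearman coefficient: the Pearson correlation of the two rank vectors."""
    if len(sense_reads) != len(antisense_reads) or len(sense_reads) < 2:
        raise ValueError("need two equal-length samples with at least 2 genes")
    rx, ry = average_ranks(sense_reads), average_ranks(antisense_reads)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)
```

Because the coefficient depends only on ranks, it is insensitive to the heavy right tail typical of read counts, which makes it a natural choice when a few highly transcribed genes would otherwise dominate a Pearson correlation.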

Sense and antisense transcription are associated with distinct patterns of nucleosome occupancy at the sense promoter

Next, we assessed how nascent sense transcription in the same 300 bp window influences nucleosome occupancy and NDR size at the sense promoter in the presence of varying levels of antisense transcription (Figure 2A–D). First, we asked how the levels of sense or antisense transcription in the 300 bp window correlated with nucleosome occupancy at varying positions relative to the TSS (Figure 2A). We found that sense transcription was negatively correlated with nucleosome occupancy immediately upstream of the TSS, and positively correlated immediately downstream, in the region corresponding to the +1 nucleosome, supporting a model in which sense transcription positions the +1 nucleosome ( 30). In contrast, antisense transcription correlated positively with nucleosome occupancy over the promoter, directly upstream of the TSS, and in the first 1 kb of the transcribed region. Genes with sense transcription but no antisense transcription tended towards an open promoter chromatin structure, with low nucleosome occupancy and a large NDR (126 bp), while genes with high antisense transcription and low sense transcription had a closed promoter chromatin structure with high nucleosome occupancy across the promoter and a small NDR (39 bp) (Figure 2B and C). We confirmed that the increased nucleosome occupancy associated with antisense transcription is independent of sense transcription by comparing genes with similar levels of sense transcription but varying levels of antisense transcription, and found that nucleosome occupancy still differed markedly (Figure 2B and C). This suggests three distinct promoter chromatin architectures, defined by genes enriched primarily for sense transcription (Figure 2D left panel), primarily for antisense transcription (middle panel), or genes with varying levels of both sense and antisense transcription (right panel). 
These architectures are highly reminiscent of the distinct classes of condition-specific promoter chromatin configurations associated with different degrees of gene expression noise ( 26). Thus, sense and antisense transcription are associated with distinct properties of the chromatin organization surrounding the sense TSS. Although sense and antisense transcription are inversely associated with nucleosome occupancy, the very weak correlation between sense and antisense transcription genome-wide suggests that they are largely independent processes, where one does not regulate the levels of the other.

(A) Correlation between nucleosome occupancy and the two types of transcription. Shown is the Spearman correlation coefficient calculated for sense transcription (red) and, separately, antisense transcription (blue) in the 300 bp window with nucleosome occupancy at varying positions around the TSS. Grey rectangles represent the approximate positions of the −1 (left) and +1 (right) nucleosomes. (B) Median number of sense and antisense reads in the gene groups shown in panel (C), together with their median NDR size. (C) Average nucleosome occupancy in genes with varying median numbers of sense and antisense reads in the same window, with colours referring to the gene groups described in panel (B). (D) Promoters can be considered to belong to one of three different classes under a given set of conditions, each with distinct chromatin features. Triangles represent transcription factors, ovals chromatin remodelling enzymes. See Supplementary Figures S1 and S2.

Nascent sense and antisense transcription are associated with distinct patterns of histone modification

We examined the patterns of histone modifications in genes with varying levels of antisense transcription in the 300 bp window, and identified numerous modifications that showed substantial differences in genes subject to high antisense transcription compared to those with low levels (Figure 3A). These modifications could be broadly classed into three groups: those that were higher across the gene body with high antisense transcription, those that were lower, and those that were more evenly spread. We also assessed how the correlations between chromatin modifications and antisense transcription compared to those correlations between modifications and sense transcription (Figure 3B and C). Genome-wide associations remained the same when the gene class with highest antisense transcription was divided into three subgroups (Supplementary Figure S1) or when regulated TATA-box containing genes ( 25) were excluded from the analysis (Supplementary Figure S2), indicating a continuum of effects on all gene types, associated with the level of antisense transcription. Although this analysis is based on sense and antisense transcription in the 300 bp window, we show the lack of correlation between sense and antisense transcription extends well into the gene body (Figure 3D). The small negative correlation towards the 3′ region suggests that the initiation of antisense transcription might lead to premature termination of sense transcription, thus affecting steady-state transcript levels.

Sense and antisense transcription have distinct associations with a variety of histone modifications. (A) The average levels of seven different histone modifications around the TSS (0) in the five gene classes described in Figure 1A, subject to varying antisense transcription. (B) The correlation coefficient between the levels of seven different histone modifications in 10 bp windows around the TSS (0) and, separately, the number of sense and antisense NET-seq reads in the 300 bp window described in Figure 1A. (C) The average levels of seven different histone modifications around the TSS (0) in the five gene classes described in Figure 1A, subject to high antisense transcription (blue), and high sense transcription (red) in the 300 bp window. (D) Spearman correlation coefficient between sense and antisense transcription across genes. (E) The average levels of TBP, TFIIB, H2A.Z and sense-transcribing RNAPII. TBP and TFIIB do not spread into the body of the gene with increasing antisense transcription as H2A.Z does. (F) Correlation coefficient between level of RNAPII CTD Ser2ph and sense and antisense transcription. Where shown, P-values were determined using the Wilcoxon rank sum test, comparing the genes in the highest class to the lowest class near the TSS. See Supplementary Figure S3.

Acetylation at histone H3 was elevated in genes subject to high antisense transcription (Figure 3A–C; Supplementary Figure S3A–C). Though H3 acetylation was also higher in genes with high sense transcription, these differences were primarily at the promoter, whereas antisense transcription-associated changes occurred both at the promoter and across the gene body.

Levels of H3K4me3 were significantly decreased at the promoter and early transcribed region of genes subject to antisense transcription (Figure 3A–C), but were increased further downstream, suggesting that the mark is redistributed by antisense transcription ( 31) (Supplementary Figure S3D). Levels of the variant histone Htz1 (H2A.Z) showed a similar redistribution from the promoter into the body of genes subject to high antisense transcription (Figure 3E), i.e. they were more evenly spread. However, TFIIB and TBP remain associated at the promoter, demonstrating that the redistribution we observe is not a result of cryptic promoters in the transcribed region (Figure 3E).

H2BK123ub, H3K36me3 and H3K79me3 are lower in genes subject to high antisense transcription genome-wide, mainly across the gene body (Figure 3A–C; Supplementary Figure S3E and F). This is contrary to what one might expect for genes possessing two overlapping, convergent transcription units, given that H3K36me3 is found primarily at the 3′ end of genes ( 16, 19), implying that antisense transcription may be inherently different from sense transcription. Indeed, we observed little correlation between antisense transcription and RNAPII CTD serine 2 phosphorylation ( 32) (Figure 3F), required for deposition of H3K36me3 by Set2, suggesting that sense and antisense-transcribing RNAPII complexes may themselves differ. Intriguingly, H2BK123ub, H3K36me3 and H3K79me3 are associated with the stabilization of nucleosome structures ( 16, 20–21, 33–34). Moreover, despite H3K36me3 not showing a causal relationship with histone turnover ( 13), the histone modifications most strongly associated with antisense transcription were also those most strongly correlated with histone H3 turnover genome-wide, namely H3K79me3 and H3K36me3 (negatively correlated with both) and H3K4ac, H3K56ac and H3K79me2 (positively correlated with both) (Figure 4A), leading us to ask if antisense transcription correlates with histone turnover.

Antisense transcription is associated with increased histone turnover. (A) The Spearman's correlation coefficients obtained by correlating histone turnover or antisense transcription genome-wide with the levels of thirteen different histone modifications within the same probed regions. (B and C) Boxplots showing the distributions of histone turnover rates at the −1 to +4 nucleosomes of genes with (B) varying levels of antisense transcription, coloured as in Figure 1A, with the three intermediate classes combined into a single group, or (C) sense transcription in the 300 bp window. Median levels of sense and antisense transcription in each group are given in red and blue respectively. The top and bottom of the grey box indicate the median value of all probes overlapping a −1 nucleosome or histone turnover genome-wide, respectively. (D) Histone turnover data for selected regions in G1 arrested cells from ( 12) and NET-seq data ( 8) showing strand-specific transcription for TUB2 (B) and HMS2 (C). Green represents regions with low histone turnover and red indicates regions of high histone turnover. TUB2 and HMS2 are subject to similar levels of sense transcription but show very different levels of antisense transcription. See Supplementary Figure S4.

Promoters and gene bodies subject to antisense transcription exhibit increased histone turnover rates

To assess histone turnover rates, we utilized a genome-wide map in which the rate of displacement of Myc-tagged H3 by Flag-tagged H3 was measured following induction of Flag-H3 expression ( 12). Histone turnover was significantly higher between nucleosomes −1 and +4 of genes with high antisense transcription in the 300 bp window compared to those with no/low antisense transcription in the 300 bp window (Figure 4B; p = 9.6 × 10⁻⁷, 1.7 × 10⁻²², 1.9 × 10⁻²⁴, 5.4 × 10⁻²¹ and 4.9 × 10⁻²⁵ respectively). These trends remained when the H3K36 methyltransferase SET2 was deleted ( 16) (Supplementary Figure S4), consistent with no major causal role for H3K36 methylation in the association between antisense transcription and histone turnover.

To confirm that increased histone turnover is a feature of antisense transcription, and not transcription more generally, we assessed how histone turnover differed for genes with varying levels of sense transcription (Figure 4C). Despite large changes in sense transcription between groups, histone turnover did not vary substantially at the promoter, although turnover rates were high. This lack of association between levels of sense transcription and histone turnover at the promoter agrees with previous findings ( 12). Sense transcription, however, did show the expected relationship at nucleosomes +3 and +4, with higher turnover associated with higher levels of transcription.

The dramatic influence of antisense transcription on histone turnover can be seen at HMS2 and TUB2, genes with similar levels of sense transcription but very different antisense transcription (Figure 4D). Taken together, our results suggest that although histone turnover is a feature of the canonical eukaryotic promoter, it is not correlated with the level of sense transcription, but rather with the level of antisense transcription. This increased turnover could be a consequence of the increased levels of histone chaperones and/or chromatin remodelling enzymes, which have also been shown to direct turnover ( 35, 36).

Decreasing antisense transcription over GAL1 increases levels of H3 and H3 acetylation but reduces levels of H3K36me3

We hypothesised that antisense transcription might itself be responsible for changing histone levels and modifications at the sense promoter and across the gene body, and sought to validate this experimentally at the GAL1 gene. In both glucose and galactose, GAL1 has sense and antisense (CUT445) transcripts, although in glucose the sense transcript (SUT013) originates from a promoter in GAL10 ( 22) (Figure 5 and Supplementary Figure S5). Previously, we showed that inserting the terminator sequence of ADH1 (ADH1T) into the ORF of GAL1 resulted in a redefinition of the transcription unit: the GAL1 sense transcripts (GAL1 and SUT013) terminated at ADH1T, while the antisense transcript initiated from it ( 9) (Figure 5A and B). The shortening of the antisense transcripts during galactose induction results from the production of distinct transcript isoforms (Figure 5B) ( 8, 37). Mapping of these transcripts in glucose revealed that a major antisense transcript end site was 128 bp upstream of the sense TSS, well into the GAL1–10 promoter, suggesting that it might be able to modulate sense promoter structure (Figure 5C). Mapping nascent transcripts using NET-seq confirmed the presence of antisense transcription extending across the complete GAL1–10 promoter at the start of induction (60 min in galactose) and when sense transcription is highly induced (180 min in galactose) (Figure 5D).

Experimentally varying antisense transcription at GAL1 alters chromatin modifications. (A) Schematics showing transcripts at the native GAL locus in glucose medium (top) or after insertion of the ADH1 terminator (T) +757 bp into GAL1 in glucose (middle) or galactose (bottom). (B) Northern blot probing for the GAL1 sense and antisense transcripts during an induction time-course (times in min). rRNA in the ethidium bromide-stained gel is shown as a loading control. The control is GAL1:ADH1T and the TATA mutant has 4 bp of a TATA-like sequence scrambled (see panel C). Antisense transcription into the GAL1 promoter from the insert can be disrupted by mutation of a TATA-like sequence within ADH1T. (C) GAL1:ADH1 sense/antisense RNA mapping by RL-PCR (with decapping and dephosphorylation) at the 3′ region of the construct with ADH1T inserted into GAL1. Also shown is the position of the TATA-like sequence, as well as its sequence following mutation (changes marked in red). 3′ end sites were confirmed using 3′ RACE. Primer sequences are available on request. (D) NET-seq data showing nascent transcripts on the Watson (W) and Crick (C) strand of DNA in a GAL1:ADH1T strain after 60 or 180 min in galactose medium. (E and F) ChIP experiments showing the levels of histone H3 or various histone modifications at GAL1 in the GAL1-ADH1T strain, both with and without mutation of the TATA-like sequence. Chromatin was prepared from strains grown in glucose as described previously ( 27). (E) In the strain with the intact TATA-like sequence, a peak of H3K4me3 is observed corresponding approximately to the site from which the antisense transcript initiates. Following mutation of the TATA-like sequence this peak is lost, consistent with the observed abrogation of antisense transcription, n = 2, error bars are SEM. (F) H3 levels, histone lysine acetylation for H3K9, K14, K18 and K23 and histone H3 lysine trimethylation for K36 normalised to H3 levels.

We identified a sequence within ADH1T resembling a TATA-box (TATAAAAA) (Figure 5C), and hypothesised that it might be necessary for initiation of the antisense transcript. Strikingly, although downstream of the antisense TSS, mutation of the TATA-box (AAATAAAT) reduced levels of the antisense transcript while sense transcript levels were unaffected (Figure 5B), in keeping with the lack of correlation between sense and antisense transcription (Figure 1C). We used chromatin immunoprecipitation (ChIP) to show reduced levels of H3K4me3 around the TSS for the antisense transcript in the TATA-box mutant strain (Figure 5E), confirming reduced use of the transcription unit. The ability to reduce antisense transcription at GAL1 provided an experimental system in which the effects of antisense transcription on chromatin could be observed.

The levels of histone and histone modifications were measured using ChIP in glucose-containing medium, specifically H3, a series of H3 acetylation marks (K9,14,18,23ac) and H3K36me3 across both GAL1 constructs (Figure 5F). ChIP experiments were performed in the presence of SUT013, which in glucose is transcribed at similar levels to the GAL1 antisense transcript (Figure 5D). Based on the genome-wide analysis, it was predicted that H3 and H3 acetylation levels would fall following ablation of antisense transcription, and that H3K36me3 would rise. In agreement with our predictions, levels of H3 and H3 acetylation (normalised to levels of H3) did fall, while levels of H3-normalised H3K36me3 increased, following mutation of the TATA-like sequence and loss of the antisense transcript (Figure 5F). These changes are consistent with a reduced histone turnover/deposition in the absence of antisense transcription, and suggest that antisense transcription is associated with broad changes in chromatin.

Changes in antisense transcription following deletion of SET2, RCO1, EAF3 or SET1 are associated with a corresponding change in nucleosome occupancy and histone modifications

Thus far, we have shown associations between antisense transcription and chromatin features across different genes in a wild-type background. We hypothesised that similar associations would be observed when comparing the same genes between different yeast strains, i.e. by investigating how antisense transcription changes upon mutation, and whether these changes are mirrored by the changes in the chromatin features discussed above.

To this end, we utilised available NET-seq data obtained in yeast mutants known to have altered levels of antisense transcription, namely SET2, RCO1, EAF3 and SET1 deletion strains ( 7), and grouped genes on the basis of whether antisense transcription over the 300 bp window increased, decreased or did not change. From the 5183 genes that made up the five classes presented in Figure 1A, we selected three new gene groups based on how they changed classification in the mutant strain compared to wild-type. An ‘increased’ gene group was selected in which genes went up by at least two classes (for example, from class two to class four, in Figure 1A). A second group was selected in which genes dropped by at least two classes (the ‘decreased’ group). The third group contained genes that did not change classification (the ‘unchanged’ group). The numbers of genes in each group for each mutant strain are shown by the left hand panels in Figure 6. Note especially that the three groups are different in each of the four mutants (see corresponding n values by the left hand panels), as antisense transcription was differentially altered between them—i.e. a gene classed as ‘increased’ in one mutant could well be classed as ‘decreased’ or ‘unchanged’ in another. Genome-wide levels of sense expression were generally similar between all four mutants and wild-type, save for eaf3Δ, in which the average level of sense transcription at a gene was three-quarters that of wild-type. Crucially, and in support of our earlier analyses, the correlation between the change in sense and antisense transcription at the 300 bp window in the deletion mutants relative to wild-type was similarly small for all four strains (Figure 6)—i.e. changes in antisense transcription were not associated with increased or decreased sense transcription.
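
The grouping rule described above can be written out explicitly. The sketch below assumes each gene has already been assigned a class from 1 to 5 in both wild-type and mutant; the function names and the gene identifiers in the usage note are hypothetical. Genes that move by exactly one class fall into none of the three groups.

```python
from collections import defaultdict
from typing import Optional


def change_group(wild_type_class: int, mutant_class: int) -> Optional[str]:
    """Label a gene by how its antisense class changes in the mutant."""
    delta = mutant_class - wild_type_class
    if delta >= 2:
        return "increased"   # up by at least two classes
    if delta <= -2:
        return "decreased"   # down by at least two classes
    if delta == 0:
        return "unchanged"   # same class in both strains
    return None              # moved by one class: excluded from all groups


def group_genes(wild_type: dict, mutant: dict) -> dict:
    """Collect gene names into the 'increased', 'decreased' and 'unchanged' groups."""
    groups = defaultdict(list)
    for gene, wt_class in wild_type.items():
        label = change_group(wt_class, mutant[gene])
        if label is not None:
            groups[label].append(gene)
    return dict(groups)
```

For example, with wild-type classes `{"g1": 2, "g2": 5, "g3": 3, "g4": 3}` and mutant classes `{"g1": 4, "g2": 3, "g3": 3, "g4": 4}`, `group_genes` returns `{"increased": ["g1"], "decreased": ["g2"], "unchanged": ["g3"]}`, and `g4` is left out. Requiring a shift of two classes rather than one guards against genes that sit near a class boundary being scored as changed.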

Figure 6. Changes in antisense transcription levels following mutation are concomitant with expected changes in nucleosome occupancy and chromatin modification. (A–D) The average change in levels of nucleosome occupancy, H3K14ac, H3K36me3 and H4ac in (A) set2Δ, (B) rco1Δ, (C) eaf3Δ and (D) set1Δ, when compared to wild-type. Three gene groups were analysed, which differed in terms of how the level of antisense transcription in the 300 bp window changed following mutation. A positive value indicates that the particular level has gone up in the mutant strain relative to wild-type. Black vertical lines indicate P-values between gene groups at the points indicated, determined using the Wilcoxon rank sum test (*P < 0.05, **P < 0.01, ***P < 0.001). The grey vertical box represents the region typically occupied by the +1 nucleosome. Levels of H3K14ac and H3K36me3 were normalised to levels of nucleosome occupancy as discussed in Materials and Methods. The panels on the far right show scatterplots comparing the changes in sense transcription in the 300 bp window with the changes in antisense transcription for each of the four mutants relative to wild-type. Included are all 5183 genes as discussed in the text. Shown are the Spearman rank correlation coefficients (r). NET-seq data were obtained from (7); nucleosome occupancy, H3K14ac and H3K36me3 data were from (38); and H4ac data were from (39).

We considered how levels of nucleosome occupancy and histone modifications compared between these three distinct groups. We obtained levels of nucleosome occupancy, H3K14ac and H3K36me3 from ChIP-seq data (H3K36me3 not shown for set2Δ, which lacks this modification, or rco1Δ) (38). H4ac levels were from ChIP-chip data (available for set2Δ and rco1Δ strains only) (39). We predicted that genes whose antisense transcription increased following mutation would tend towards a greater increase in nucleosome occupancy and H3K14ac compared to those genes in which antisense transcription decreased, while H3K36me3 would tend towards a greater decrease. The changes in the mutant compared to the wild-type strains are displayed as difference plots (Figure 6A–D). Superimposed on the difference plots for the unchanged gene group are the difference plots for the gene groups with increased or decreased antisense transcription. Strikingly, the changes observed agreed with our earlier analysis (Figures 1–4). For example, in the set2Δ strain, an increase or decrease in antisense transcription changed nucleosome occupancy at the promoter and in the gene body as predicted. Changes in the levels of antisense transcription were generally found to have reciprocal effects on acetylation and H3K36me3 in the mutant strains (Figure 6D). We note, however, that the changes in chromatin are found in different regions of genes in the different mutant strains. For the set1 deletion strain, for example, we observed changes in nucleosome occupancy and H3K36me3 at the promoter, but not over the gene body. By contrast, changes in nucleosome occupancy were observed at both the sense promoter and the gene body in set2Δ, rco1Δ and eaf3Δ, while H3K36me3 changed at both regions in eaf3Δ. This suggests that Set1 might be required for mediating some of the antisense transcription-associated changes seen at the gene body in the other mutant strains.

This analysis supports our hypothesis that antisense transcription is associated with a unique chromatin environment compared to sense transcription. Taken together with the earlier bioinformatics and experimental validation, our data support a model in which increasing antisense transcription levels lead to an increase in nucleosome occupancy and histone acetylation, and to a decrease in levels of H3K36me3.
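For readers who want to reproduce the kind of Spearman rank correlation reported for the sense versus antisense changes, the following is a minimal pure-Python sketch. It is not the authors' code: the function names are hypothetical, ties are handled by average ranks (one common convention), and inputs with zero variance are not handled.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Applied to per-gene changes in sense and antisense transcription (mutant minus wild-type), a coefficient near zero would correspond to the weak association described above.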


Experimental procedures

Antisense transcript characterization

Ustilago maydis antisense transcripts were identified by Ho et al. (2007) and Morrison et al. (2012) through the analysis of 5′ ESTs represented in U. maydis cDNA libraries created from germinating (T11) and dormant (TDO) teliospores (Sacadura and Saville, 2003; Ho et al., 2007), haploid cells grown in complete medium (HCM), carbon starvation medium (MMC) or nitrogen starvation medium (MMN; Ho et al., 2007), forced diploids grown filamentously (D12; Nugent et al., 2004) or filamentous dikaryotic mycelia (DIK; Morrison et al., 2012).

Clones representing antisense transcripts were selected for 3′ sequencing. Plasmid DNA isolation followed the procedure outlined in Nugent et al. (2004). Sequencing from the 3′ end of the antisense transcript was performed using primers M13_R or dT19V (Table S5). Nucleotide sequences were determined using Big Dye Terminator v3.1 chemistry (Applied Biosystems) and the reaction products were analysed using an ABI PRISM 3730 DNA Analyser (Applied Biosystems).

New ESTs were edited to remove contaminating vector sequences as indicated in Morrison et al. (2012), except that all sequences were manually inspected for the presence of a poly(A) tail. The 5′ and 3′ ESTs representing antisense transcripts were aligned using blastn to the U. maydis chromosome assembly file ‘Umaydis_contigs.fas’ (last modified 8 September 2011) or the U. maydis open reading frame (ORF) file ‘Umaydis_valid_orf.fas’ (last modified 24 May 2011). These FASTA (.fas) files were downloaded from the Munich Information Centre for Protein Sequences (MIPS) U. maydis database (MUMDB; Mewes et al., 2008). All 5′ and 3′ ESTs from clones representing antisense transcripts were confirmed to contain the antisense strand of a MUMDB gene. Additionally, in aligning the pair of ESTs representing the 5′ and 3′ ends of an antisense transcript to the U. maydis genome, or U. maydis ORFs, characteristics of antisense transcripts were determined, including: the length of the antisense transcript, the type and length of overlap (with respect to the ORF), and whether the antisense transcript contains an intron. Antisense transcripts containing an intron were confirmed manually and the splice site nucleotides were recorded. Putative ORFs within full-length NATs were detected and used to predict their corresponding amino acid sequence using Geneious Pro v5.6.5 (Biomatters). The resulting putative proteins were used in blastp searches (conducted in October 2012) against the NCBI non-redundant protein database with an expect threshold of E < 1e-5. For each protein, only the best blastp hit was reported. Furthermore, putative proteins were inspected for N-terminal secretion signals using SignalP v4.0 (Petersen et al., 2011), TargetP v1.1 (Emanuelsson et al., 2000), and ProtComp v9.0 (Softberry). Only those proteins predicted to have an extracellular location by all three programs were recorded.
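As an illustration of how the length and type of overlap with respect to an ORF could be derived from aligned coordinates, here is a small Python sketch. It is not the authors' procedure: the function name, the coordinate convention (1-based, inclusive, on the forward strand) and the four overlap categories are assumptions made for this example, since the text does not define the categories it used.

```python
def overlap_with_orf(orf_start, orf_end, orf_strand, as_start, as_end):
    """Return (overlap_length, overlap_type) for an antisense transcript vs an ORF.

    Coordinates are 1-based, inclusive, on the forward strand, with start <= end.
    `orf_strand` is "+" or "-"; the antisense transcript is on the other strand.
    """
    left = max(orf_start, as_start)
    right = min(orf_end, as_end)
    length = max(0, right - left + 1)
    if length == 0:
        return 0, "none"
    covers_left = as_start <= orf_start
    covers_right = as_end >= orf_end
    if covers_left and covers_right:
        kind = "spans entire ORF"
    elif not covers_left and not covers_right:
        kind = "internal to ORF"
    else:
        # Which ORF end is covered depends on the strand the ORF lies on.
        covers_5prime = covers_left if orf_strand == "+" else covers_right
        kind = "ORF 5' end" if covers_5prime else "ORF 3' end"
    return length, kind
```

For example, an antisense transcript at 400 to 700 against a forward-strand ORF at 100 to 500 overlaps the last 101 nucleotides of the ORF and is reported as covering the ORF 3′ end.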

Plasmid and strain constructions

All procedures were carried out as suggested by the manufacturers, unless otherwise stated. Plasmids utilized in this study are listed in Table S6. The U. maydis expression vector pCM768 is a non-integrating vector that contains a U. maydis autonomously replicating sequence (ARS), a hygromycin B resistance (HygR) cassette, and the U. maydis glyceraldehyde-3-phosphate dehydrogenase promoter (Pumgapd), which facilitates the expression of genes inserted in the multiple cloning site (Kojic and Holloman, 2000). To express antisense transcripts from an autonomously replicating vector, a region of the genome corresponding to an antisense transcript (identified through full-length cDNA analysis) was amplified by PCR using primers which introduce restriction endonuclease recognition sequences at their 5′ ends (Table S5). The PCR fragment and pCM768 were digested with restriction endonucleases and purified. Linearized pCM768 was treated with Antarctic Phosphatase (New England BioLabs) and ligated to the purified digested PCR fragment. The resulting U. maydis antisense transcript expression vector was transformed into Subcloning Efficiency DH5α Competent Cells (Invitrogen). Six U. maydis antisense transcript expression vectors were constructed using this approach (Table S6). The correct nucleotide sequence and orientation for the region of the genome encoding the antisense transcript was confirmed by sequencing. All sequencing reactions were performed using primer pgapd_79_F (Table S5). FB1 or FB2 protoplasts were transformed with the antisense expression vector, using a modification of the Yee (1998) protocol, previously described in Morrison et al. (2012). Genomic DNA was isolated from putative transformants using the protocol outlined by Hoffman and Winston (1987). Successful U. maydis transformants were confirmed using PCR. PCRs were performed to amplify a region of the HygR cassette using primers pCM768_Hyg_F and pCM768_Hyg_R, or the Pumgapd-driven antisense transcript, using the primer pgapd_79_F and the reverse primer used to clone the genomic region corresponding to the antisense transcript (Table S5). Furthermore, antisense transcript expression was confirmed via strand-specific semi-quantitative RT-PCR (see transcript expression analysis, below). The U. maydis FB1- and FB2-derived strains created for antisense transcript expression analyses are listed in Table 4. In each case, several strains were identified.

Strain                        Relevant genotype               Source
Ustilago maydis
518                           a2 b2                           Holliday (1961)
521^a                         a1 b1                           Holliday (1961)
d132                          a1/a2 b1/b2                     Kronstad and Leong (1989)
FB1                           a1 b1                           Banuett and Herskowitz (1989)
FB2                           a2 b2                           Banuett and Herskowitz (1989)
FBD12                         a1/a2 b1/b2                     Banuett and Herskowitz (1989)
SG200                         a1 mfa2 bE1bW2                  Kämper et al. (2006)
FB1 [pCM768]                  a1 b1 [pCM768]                  This work
FB2 [pCM768]                  a2 b2 [pCM768]                  This work
FB2 [pCMas-um02114]           a2 b2 [pCMas-um02114]           This work
FB1 [pCMas-um02125]           a1 b1 [pCMas-um02125]           This work
FB1 [pCMas-um02150]           a1 b1 [pCMas-um02150]           This work
FB1 [pCMas-um02151]           a1 b1 [pCMas-um02151]           This work
FB2 [pCMas-um10002]           a2 b2 [pCMas-um10002]           This work
FB2 [pCMas-um12232]           a2 b2 [pCMas-um12232]           This work
SG200Δum02151                 a1 mfa2 bE1bW2 Δum02151         This work
SG200ΔPas-um02151             a1 mfa2 bE1bW2 ΔPas-um02151     This work
SG200ΔncRNA1                  a1 mfa2 bE1bW2 ΔncRNA1          Morrison et al. (2012)
Ustilago hordei
Uh 4857-4 (alias Uh364)^a     MAT-1                           Linning et al. (2004)
Uh 4857-5 (alias Uh365)       MAT-2                           Linning et al. (2004)
^a Reference strain used for genome sequencing.

Ustilago maydis deletion strains were created by replacing the locus of interest with a HygR cassette following a PCR-based approach (Kämper, 2004). To delete um02151, 1.1 kb fragments flanking um02151 were amplified by PCR using primers um02151_Left_F and um02151_Left_R_SfiI to amplify the left flank, and primers um02151_Right_F_SfiI and um02151_Right_R to amplify the right flank (Table S5). To delete putative regulatory elements for as-um02151, a 0.5 kb region upstream from as-um02151 (Pas-um02151) was targeted for deletion. Primers pA_Left_F and pA_Left_R_SfiI, and pA_Right_F_SfiI and pA_Right_R were used to amplify 1.1 kb flanking regions to Pas-um02151 (Table S5). PCR fragments were digested with SfiI and ligated to the HygR cassette as described in Morrison et al. (2012). The 4.8 kb ligation products were amplified using primers um02151_Nested_F and um02151_Nested_R, or pA_Nested_F and pA_Nested_R (Table S5), and cloned into pCR-XL-TOPO (Invitrogen), yielding pCR-XL-TOPO-Δum02151 and pCR-XL-TOPO-ΔPas-um02151 (Table S6). These vectors, carrying the deletion constructs, were transformed into One Shot Chemically Competent TOP10 Escherichia coli cells (Invitrogen). Nucleotide sequences were verified by sequencing using primers pMF1_HygOUT_F and pMF1_HygOUT_R. The um02151 or Pas-um02151 deletion constructs were amplified by PCR using primers um02151_Nested_F and um02151_Nested_R, or pA_Nested_F and pA_Nested_R, and then purified using the QIAquick Gel Purification kit (Qiagen) prior to U. maydis transformation. U. maydis SG200 protoplasts were transformed. To identify putative transformants, PCRs were performed to amplify a region containing part of the HygR cassette and an area of the flanking um02151 or Pas-um02151 locus, outside the area of integration. For these screens, PCR was conducted using primer pairs um02151_Left_F and pMF1_HygOUT_F, and pMF1_HygOUT_R and um02151_Right_R, or primer pairs pA_Left_F and pMF1_HygOUT_F, and pMF1_HygOUT_R and pA_Right_R (Table S5). Several strains carrying deletions to um02151 or Pas-um02151 were identified by PCR.

Fungal strains, growth conditions and production of U. maydis cell types

Ustilago maydis and U. hordei strains used in this study are listed in Table 4. For RT-PCRs conducted to support EST data, U. maydis haploid strains FB1 and FB2 were grown in complete (CM) or minus nitrogen (MN) medium as described in Ho et al. (2010), filamentous growth of FBD12 diploids or FB1×FB2 dikaryons was induced as described in Morrison et al. (2012), and teliospores were harvested from mature tumours of corn (Z. mays L. ‘Golden Bantam’) infected with 518×521. Cob infection, teliospore harvesting and teliospore storage were performed as described in Morrison et al. (2012). For all other experiments, U. maydis haploid cells were grown overnight in YEPS medium (1% w/v yeast extract, 2% w/v peptone, 2% w/v sucrose; 250 r.p.m., 28°C). U. maydis teliospores were induced to germinate by modifying the protocol described in Zahiri et al. (2005). Differences include: germination was induced by shaking in 250 ml Erlenmeyer flasks containing 10 ml of YEPS medium, supplemented with 160 μg ml−1 streptomycin sulphate (90 r.p.m., 28°C). The germinating teliospores were pelleted 16, 18, 24 or 40 h post induction of germination (PIG) by centrifugation (400 g, 5 min, RT).

Ustilago hordei haploid strains Uh364 and Uh365, and barley (Hordeum vulgare L. ‘Odessa’) heads infected with Uh364×Uh365 were obtained from Guus Bakkeren (Agriculture & Agri-Food Canada, Pacific Agri-Food Research Centre, BC, Canada). Haploid cells were grown for ∼40 h in YEPS medium (250 r.p.m., 22°C). Mixtures of U. hordei teliospores, collected from field samples across Manitoba and eastern Saskatchewan in 2007 or 2009, were provided by James Menzies (Agriculture & Agri-Food Canada, Cereal Research Centre, MB, Canada). Plant growth conditions and pathogenesis assays were performed following the protocols described in Morrison et al. (2012).

RNA extraction, S1 nuclease trimming and reverse transcription

RNA extractions for RT-PCRs conducted to support EST data were carried out on pelleted haploid cells grown in CM or MN, or filamentous diploids or dikaryons harvested from plates. The tissue from haploid or mycelial cell types was frozen in liquid nitrogen and homogenized using a mortar and pestle. Vacuum-dried teliospores were disrupted following the protocol described in Zahiri et al. (2005). For all other experiments, U. maydis haploid cells, U. maydis germinating teliospores, or U. hordei haploid cells were pelleted by centrifugation. These cells, or U. hordei teliospores (collected from field samples), were resuspended in TRIzol reagent (Invitrogen), and transferred into 2 ml screw-cap tubes containing Lysing Matrix C (MP Biomedicals). Cells were disrupted as described in Zahiri et al. (2005). Alternatively, U. hordei-infected barley heads were ground in liquid nitrogen immediately prior to RNA extraction and resuspended in TRIzol reagent (Invitrogen). Following cell disruption, RNA was isolated using TRIzol reagent (Invitrogen) according to the manufacturer's protocol. RNA samples were precipitated, treated with DNase I, screened for genomic DNA contamination and assessed for quality as described in Morrison et al. (2012).

For first-strand synthesis reactions, 200 ng of DNase I-treated total RNA was used as template. These reactions were primed with oligo-d(T)16, a sense-specific primer, an antisense-specific primer, a tagged strand-specific primer or water (to account for false-priming). Primers were designed for sense- or antisense-specific first-strand synthesis reactions using Primer3 (Rozen and Skaletsky, 2000) following the protocol outlined in Ho et al. (2010). The strand-specific primer sequences are listed in Table S5. All first-strand synthesis reactions were carried out using the TaqMan Gold RT-PCR kit (Applied Biosystems) with the conditions outlined in Ho et al. (2010). Following first-strand synthesis, cDNA was diluted fourfold (1:3) with DEPC-treated water.

To detect dsRNA, 2.5 μg of DNase I-treated total RNA was incubated in S1 nuclease digestion reactions at 37°C for 30 min. In separate reactions, the final concentration of S1 Nuclease (Invitrogen) was 0, 0.01, 0.1 or 1 U μl−1. Next, dsRNA was phenol/chloroform extracted, precipitated with NH4Ac/Ethanol/GlycoBlue Coprecipitant (Invitrogen) at −20°C for 60 min, and resuspended in 15 μl DEPC-treated H2O. Two microlitres of S1-trimmed RNA was used as template in first-strand synthesis reactions.

Transcript expression analysis

All primers used in RT-PCRs to analyse transcript expression were designed using Primer3 (Rozen and Skaletsky, 2000) and are listed in Table S5. Two microlitres of diluted cDNA was used as template for each RT-PCR. All RT-PCRs were performed using Amplitaq Gold DNA polymerase (Applied Biosystems) and 30 or 35 cycles. RT-PCRs with 30 cycles were considered to be semi-quantitative because equal amounts of total RNA were used as template for each first-strand synthesis reaction, and equal volumes of cDNA were used as template for each RT-PCR. One quarter of the final product mixture was electrophoretically separated on an agarose gel (1× TAE), and visualized by ethidium bromide staining.

TaqMan minor groove binder (MGB) probes were designed using Primer Express v2.0 (Applied Biosystems) and are listed in Table S5. Two microlitres of diluted cDNA was used as template for each quantitative PCR (RT-qPCR). All RT-qPCRs were performed using the TaqMan Universal PCR Master Mix (Applied Biosystems). Data were collected and analysed on an ABI PRISM 7900HT using Sequence Detection System Version 2.1 (Applied Biosystems). Relative transcript levels for um02151 were measured for eight independent FB1 [pCMas-um02151] strains, compared with an average of four independent FB1 [pCM768] strains using the ΔΔCT method. Three technical replicates for each sample were performed and umgapd was used as an internal standard.
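The ΔΔCT calculation mentioned above can be sketched as follows. This is a generic illustration rather than the authors' analysis: it assumes roughly 100% amplification efficiency (one doubling per cycle), the function name is hypothetical, and the CT values are invented for the example. The target would be um02151, the internal standard umgapd, and the control ΔCT values those of the empty-vector FB1 [pCM768] strains.

```python
from statistics import mean


def relative_level(ct_target, ct_reference, control_delta_cts):
    """Fold change of a target transcript by the delta-delta-CT method.

    ct_target / ct_reference: technical-replicate CT values for one sample.
    control_delta_cts: delta-CT values (target minus reference) of the control strains.
    Assumes a doubling of product per PCR cycle.
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    delta_delta_ct = delta_ct - mean(control_delta_cts)
    return 2 ** (-delta_delta_ct)


# Invented CT values: three technical replicates for one strain,
# and four control-strain delta-CT values.
fold = relative_level([26.0, 26.0, 26.0], [20.0, 20.0, 20.0], [5.0, 5.0, 5.0, 5.0])
```

With these invented numbers the sample ΔCT is 6 and the control mean ΔCT is 5, so ΔΔCT is 1 and the target is estimated at half the control level.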

RNA ligase-mediated rapid amplification of cDNA ends (RLM-RACE)

RLM-RACE was employed to identify the 5′ and 3′ ends of as-um02150, as-um02151 and as-uhor_03676. Ten micrograms of DNase I-treated total RNA from U. maydis (518×521) or U. hordei (Uh364×Uh365) teliospores was processed and reverse transcribed using the FirstChoice RLM-RACE kit (Ambion). The resulting cDNAs, containing 5′ or 3′ adaptors, were used as template for RACE. Successive PCRs utilizing gene-specific outer primers, followed by gene-specific inner primers (Table S5), were performed to yield 5′ or 3′ RACE products. All PCRs were conducted using HotStar HiFidelity DNA Polymerase (Qiagen). Amplified fragments were cloned into the pDrive Cloning Vector (Qiagen) and transformed into E. coli (Qiagen EZ Competent Cells). Transformants were plated on LB agar plates containing 100 μg ml−1 ampicillin, 50 μM isopropyl β-D-thiogalactopyranoside (IPTG) and 80 μg ml−1 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside (X-gal), which enabled blue/white screening of recombinant colonies. Following overnight growth at 37°C, 24 individual E. coli colonies from each transformation were inoculated into 96-well microtitre plates containing LB medium supplemented with 100 μg ml−1 ampicillin, and grown overnight at 250 r.p.m., 37°C. Portions of each bacterial culture were aliquoted into a separate 96-well microtitre plate and stored frozen at −80°C in 15% glycerol. The remaining bacterial cells were harvested by centrifugation and plasmid DNA was isolated using the protocol outlined in Nugent et al. (2004). Nucleotide sequences were determined using the primer M13_F (Table S5).

Protein extraction and analysis

Proteins were extracted from haploid cells grown in 15 ml YEPS medium, or YEPS medium supplemented with 250 μg ml−1 hygromycin B (Bioshop Canada). Cells were pelleted by centrifugation (5250 g, 5 min, 4°C), and washed once in 1 ml ice-cold protein extraction buffer [10 mM KCl, 5 mM MgCl2·6H2O, 400 mM sucrose, 100 mM Tris-HCl pH 8.1, 10% v/v glycerine, 0.007% v/v β-mercaptoethanol, 1% v/v Protease Inhibitor Cocktail (Sigma-Aldrich)]. Cells were resuspended in 300 μl ice-cold protein extraction buffer, transferred to 2 ml screw-cap tubes containing acid-washed glass beads (425–600 μm, Sigma-Aldrich), and disrupted as described in Zahiri et al. (2005). Cell debris was pelleted by centrifugation (20 800 g, 5 min, 4°C), the supernatant was transferred to a 1.5 ml microcentrifuge tube, and the total protein extract was stored at −20°C.

Protein concentrations were estimated using the Quick Start Bradford Protein Assay (Bio-Rad). The standard microplate protocol was followed. A standard curve was created using a bovine γ-globulin dilution series. For each sample, the absorbance at 595 nm was measured in triplicate, using a NanoDrop 8000 Spectrophotometer (Thermo Scientific). Total protein extracts were diluted to ∼1000 μg ml−1 and these concentrations were confirmed using the Bradford assay.
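Reading a concentration off a standard curve can be sketched as a least-squares line fit followed by inversion. This is a generic illustration, not the authors' calculation: it assumes the standards fall within an approximately linear range of the assay, the function names are hypothetical, and all absorbance and concentration values are invented.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


def concentration(absorbances, slope, intercept):
    """Estimate concentration from the mean of replicate absorbance readings."""
    mean_abs = sum(absorbances) / len(absorbances)
    return (mean_abs - intercept) / slope


# Invented standard series (ug/ml) and A595 readings, for illustration only.
standards = [0.0, 250.0, 500.0, 1000.0]
readings = [0.10, 0.20, 0.30, 0.50]
slope, intercept = fit_line(standards, readings)

# Triplicate readings of a hypothetical sample.
estimate = concentration([0.30, 0.30, 0.30], slope, intercept)
```

Dividing the estimated concentration by the target concentration then gives the dilution factor needed to bring an extract to the working concentration.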

To visualize the total protein extract, SDS-PAGE was performed, followed by Coomassie staining. Briefly, equal parts of the diluted total protein extract and Laemmli Sample Buffer (supplemented with 5% v/v β-mercaptoethanol) were combined, and this mixture was boiled for 5 min. Approximately 10 μg total protein extract was loaded into the wells of a 12% Mini-PROTEAN TGX precast polyacrylamide gel (Bio-Rad). Gels were stained with Coomassie Staining Solution (Sambrook and Russell, 2001), confirming that equivalent amounts of total protein were present in the ∼1000 μg ml−1 dilutions.

For detection of Um02151, a Western blot was performed. Following SDS-PAGE, proteins were transferred electrophoretically onto a 0.2 μm PVDF membrane (Bio-Rad). Subsequently, the membrane was stained with Ponceau S (Sigma-Aldrich) to verify equal protein loading in each well. Prior to blocking, the Ponceau S stain was washed off the membrane using distilled water. The primary antibody (1:1000 dilution) used to probe the membrane was a rabbit affinity-purified polyclonal antibody against the Um02151 peptide APGKTKEDTLESLRC (GenScript). Binding of the primary antibody to the membrane was detected using a secondary antibody (1:3000 dilution) consisting of goat anti-rabbit immunoglobulin G, conjugated to alkaline phosphatase (Bio-Rad), in conjunction with the Immun-Blot Assay kit (Bio-Rad). An image of the Western blot was taken using the Geliance 600 Imaging System (Perkin Elmer). To estimate Um02151 levels, the pixel volume for each sample was determined using GeneTools software (Perkin Elmer). An average pixel volume was calculated for three biological replicates of the empty vector transformant, FB1[pCM768]. Using this average as a reference, the relative pixel volume for each sample was determined. The Western blot was repeated and the relative pixel volumes were separately calculated. The ‘fold change, relative to controls’ reported for each sample in Fig. 7B is an average of the values obtained from the two technical replicates.
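The normalisation described above, in which each sample's pixel volume is expressed relative to the mean of the empty-vector controls and then averaged over the two blots, can be sketched as follows. The function names, sample labels and pixel volumes are hypothetical and chosen only to make the arithmetic visible.

```python
def relative_pixel_volumes(sample_volumes, control_volumes):
    """Express each sample's pixel volume relative to the mean control volume."""
    reference = sum(control_volumes) / len(control_volumes)
    return {name: volume / reference for name, volume in sample_volumes.items()}


def mean_fold_change(blot_a, blot_b):
    """Average the relative values obtained from two repeats of the blot."""
    return {name: (blot_a[name] + blot_b[name]) / 2 for name in blot_a}


# Invented pixel volumes: three control replicates and one sample per blot.
blot1 = relative_pixel_volumes({"strain_1": 50.0}, [100.0, 110.0, 90.0])
blot2 = relative_pixel_volumes({"strain_1": 70.0}, [100.0, 100.0, 100.0])
fold_change = mean_fold_change(blot1, blot2)
```

Computing the relative values separately for each blot before averaging, as the text describes, keeps differences in exposure or transfer efficiency between blots from distorting the result.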


CONCLUSION

Our results paint the most detailed picture of the global regulation of cis-NATs in plants so far. While we could show that cis-NAT pairs tend to have more anticorrelated expression patterns than nonoverlapping neighboring transcripts, we found that pronounced anticorrelation across many samples can only be found in a small subset of cis-NATs. Along these lines, we found that discrete cis-NAT pairs show anticorrelated expression in different experiments, suggesting that independent transcriptional regulation of both members of a pair has a strong influence on cis-NAT expression. The negative correlation of cis-NATs was also observed in a cell type-specific data set, indicating that cis-NATs affect each other's expression in individual cells. The observation that small RNA loci, representing mainly siRNAs, were underrepresented in cis-NATs, along with the fact that mutations in the RNA silencing machinery did not have a significant effect on cis-NAT expression, confirms this notion and complements previous suggestions that small RNAs and RNA interference are important for only a subset of cis-NATs (Lu et al., 2005).

However, there is at least one known example in which small RNAs derived from cis-NATs have been shown to be important in mutually antagonistic expression, namely, the SRO5 and P5CDH pair of cis-NATs involved in Arabidopsis salt tolerance (Borsani et al., 2005). When exposed to salt stress, SRO5 message is induced, leading to formation of small RNAs and activation of an RNA-silencing pathway that ultimately leads to down-regulation of the P5CDH transcript. As pointed out before, no small RNA MPSS tag from wild-type tissue maps to the overlapping region of the two transcripts, consistent with the inducible nature of this particular siRNA. Borsani et al. (2005) have also suggested that microarrays are imperfect for assessing mutually antagonistic effects if 3′ products are largely stable. Indeed, SRO5 and P5CDH are only weakly anticorrelated in our data sets and are not significantly different from nonoverlapping transcripts. Nevertheless, the significant shift in correlation coefficients of cis-NATs toward negative values when compared with nonoverlapping transcripts indicates that coordinated expression of cis-NATs can be detected by microarrays, even if the mechanism by which this is achieved is still unclear. The strongly anticorrelated cis-NATs will be attractive targets for further mechanistic studies.
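The comparison of correlation coefficients between cis-NAT pairs and nonoverlapping neighbors can be illustrated with a minimal Python sketch. It assumes a Pearson correlation computed across samples, which the text above does not specify; the gene names and expression values are hypothetical, and no significance test is included.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def pair_correlations(expression, pairs):
    """Correlate the two members of each transcript pair across all samples."""
    return [pearson(expression[a], expression[b]) for a, b in pairs]


# Hypothetical expression values across four samples (arbitrary units).
expression = {
    "sense_1": [1.0, 2.0, 3.0, 4.0],
    "antisense_1": [4.0, 3.0, 2.0, 1.0],
    "gene_A": [1.0, 2.0, 3.0, 4.0],
    "gene_B": [2.0, 4.0, 6.0, 8.0],
}

cis_nat_r = pair_correlations(expression, [("sense_1", "antisense_1")])
neighbor_r = pair_correlations(expression, [("gene_A", "gene_B")])
print(cis_nat_r, neighbor_r)
```

With real data, the two lists of coefficients would each contain many pairs, and a shift of the cis-NAT distribution toward negative values relative to the neighbor distribution would be assessed with a statistical test of the analyst's choosing.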


There are three distinct sets of sources for the annotation of genes and transcripts in Vega. These are shown as different tracks and have different colour schemes.

  • Havana Core Genes have been annotated in depth to identify alternative transcripts, and are present for all species. They have been annotated by the Havana group at the WTSI.
  • LoF genes show the consequences of variations in sequence on the functional properties of human transcripts.

For each of these sets, genes and transcripts are classified as shown below.

