Topic Introduction

Genomics Methods for Xenopus Embryos and Tissues

  Gert Jan C. Veenstra (3,4)
  1. The Francis Crick Institute, London NW1 1AT, United Kingdom
  2. Department of Developmental and Cell Biology, University of California, Irvine, California 92697
  3. Radboud University, Department of Molecular Developmental Biology, 6525GA Nijmegen, The Netherlands
  4. Correspondence: drmikegilchrist{at}gmail.com; kwcho{at}uci.edu; g.veenstra{at}science.ru.nl

Abstract

High-throughput sequencing methods have created exciting opportunities to explore the regulatory landscape of the entire genome. Here we introduce methods to characterize the genomic locations of bound proteins, open chromatin, and sites of DNA–DNA contact in Xenopus embryos. These methods include chromatin immunoprecipitation followed by sequencing (ChIP-seq), a combination of DNase I digestion and sequencing (DNase-seq), the assay for transposase-accessible chromatin and sequencing (ATAC-seq), and the use of proximity-based DNA ligation followed by sequencing (Hi-C).

OVERVIEW

The epigenetic state of chromatin regulates gene expression—and hence cellular differentiation—by controlling the access of transcription factors to DNA. Here we introduce methods to explore the locations of bound proteins, open chromatin, and sites of DNA–DNA contact at the whole-genome level in Xenopus embryos and tissues. These methods of data generation are often used in conjunction with gene expression analyses, which we discuss in Introduction: Transcriptomics and Proteomics Methods for Xenopus Embryos and Tissues (Gilchrist et al. 2019).

Chromatin immunoprecipitation (ChIP) is one of the most direct ways to identify the sites of interaction between the genome and DNA-binding proteins (typically transcription factors and histones). In its most straightforward application, ChIP relies on an initial cross-linking step in intact animals or dissected tissues, followed by fragmentation of the DNA by sonication. Complexes containing DNA fragments bound to the protein of interest are immunoprecipitated with an antibody that recognizes the protein. After the cross-links are reversed, the DNA fragments are recovered and subjected to direct high-throughput sequencing (hence the abbreviation ChIP-seq).

Mapping regions of relatively open chromatin is valuable for identifying regulatory elements, including promoters and enhancers. Even when bound by proteins such as transcription factors, the DNA of the regulatory element remains accessible, rendering these regions relatively sensitive to cleavage by DNase I or Tn5 transposase. Two methods, DNase-seq (see Protocol: DNase-seq: A High-Resolution Technique for Mapping Active Gene Regulatory Elements across the Genome from Mammalian Cells [Song and Crawford 2010]) and the assay for transposase-accessible chromatin (ATAC) and sequencing (ATAC-seq) (Buenrostro et al. 2013), exploit this feature to map regions of accessible DNA. In these methods, DNA fragments cut at both ends are either directly amplified (as in ATAC-seq) or amplified after purification (as in DNase-seq) and then sequenced. These approaches make it possible to discover important new enhancers without a priori knowledge of the location and identity of bound transcription factors.
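
As a rough illustration of how cut sites translate into an accessibility signal, the following Python sketch counts mapped cut positions in fixed-size windows and reports windows that reach a minimum count. The function names, the window size, the threshold, and the toy coordinates are all hypothetical and chosen for illustration; published analyses generally rely on dedicated peak callers and statistical models rather than a fixed cutoff.

```python
from collections import Counter


def windowed_cut_counts(cut_sites, window):
    """Count cut sites per fixed-size genomic window.

    cut_sites: iterable of (chromosome, position) tuples, one per mapped cut.
    Returns a Counter keyed by (chromosome, window_index).
    """
    counts = Counter()
    for chrom, pos in cut_sites:
        counts[(chrom, pos // window)] += 1
    return counts


def accessible_windows(counts, min_cuts):
    """Return the sorted windows whose cut count is at least min_cuts."""
    return sorted(key for key, n in counts.items() if n >= min_cuts)


# Toy cut sites (hypothetical coordinates, not real data)
toy_cuts = [
    ("chr1", 105),
    ("chr1", 130),
    ("chr1", 180),
    ("chr1", 950),
    ("chr2", 20),
]
counts = windowed_cut_counts(toy_cuts, window=100)
open_windows = accessible_windows(counts, min_cuts=3)
print(open_windows)
```

Here only the window covering positions 100 to 199 on the toy chromosome "chr1" collects enough cuts to be reported.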

In a rather different approach to genomic data generation, we can determine the distribution of DNA–DNA contacts within and between chromosomes to understand how chromosomal DNA is folded in the nucleus. Cross-linking is used to form bridges at the contact points, and after DNA fragmentation, the biotinylated loose ends of DNA fragments in the same complex are ligated. After removal of the cross-links and further digestion, the joined fragments are isolated and sequenced from both ends to identify the different genomic regions that were in contact. This global and relatively unbiased variant of the chromosome conformation capture methods is referred to as Hi-C. The sequence data contain information on the DNA looping structures understood to regulate transcription, and are thus rather different from, but highly complementary to, the data generated by ChIP-seq, DNase-seq, and ATAC-seq experiments.
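
The idea that each sequenced fragment pair reports one contact between two genomic regions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis used in any of the cited protocols: the function name, the bin size, and the toy coordinates are hypothetical, and real Hi-C pipelines also filter, deduplicate, and normalize the read pairs.

```python
from collections import Counter


def contact_matrix(pairs, bin_size):
    """Count read pairs per pair of genomic bins (a sparse contact matrix).

    pairs: iterable of (chrom1, pos1, chrom2, pos2) tuples, one per read pair.
    Returns a Counter keyed by an ordered pair of (chromosome, bin_index)
    tuples, so that a contact A-B and a contact B-A share the same key.
    """
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in pairs:
        a = (chrom1, pos1 // bin_size)
        b = (chrom2, pos2 // bin_size)
        key = (a, b) if a <= b else (b, a)
        counts[key] += 1
    return counts


# Toy read pairs (hypothetical coordinates, not real data)
toy_pairs = [
    ("chr1", 1_200, "chr1", 48_000),   # contact within chr1
    ("chr1", 47_500, "chr1", 1_900),   # same two bins, ends in reverse order
    ("chr1", 3_000, "chr2", 10_500),   # contact between chromosomes
]
matrix = contact_matrix(toy_pairs, bin_size=10_000)
for (bin_a, bin_b), n in sorted(matrix.items()):
    print(bin_a, bin_b, n)
```

Storing only the observed bin pairs keeps the matrix sparse, which matters because most pairs of bins in a genome are never observed in contact.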

In general, these experimental methods are not very efficient, requiring relatively large numbers of cells to provide a robust signal. The Xenopus system is therefore ideal for these types of applications because of the ability to collect large numbers of synchronously developing embryos after in vitro fertilization. Below we introduce ChIP-seq, DNase-seq, ATAC-seq, and Hi-C protocols that have been developed specifically for use in Xenopus embryos and tissues and provide examples of their applications.

The Xenopus community is fortunate in having two well-assembled genomes, that of the diploid X. tropicalis (Hellsten et al. 2010) and that of the allotetraploid X. laevis (Session et al. 2016). This is important, as the sequence fragments generated by these methods generally map to intergenic and intragenic noncoding (intronic) regions, and incomplete assemblies will cause loss of potentially valuable data. The diploid genome of X. tropicalis makes high-throughput genomic data simpler to interpret than data from the larger, partly duplicated genome of X. laevis. However, the larger X. laevis embryos may be preferred for ease of experimental manipulation.

PROTOCOLS

Two ChIP-seq protocols, Protocol: Mapping Chromatin Features of Xenopus Embryos (Gentsch and Smith 2019) and Protocol: ChIP-Sequencing in Xenopus Embryos (Hontelez et al. 2019), offer slightly different approaches to prepare enriched nuclei and yolk-depleted embryo lysate. In particular, ChIP with cleared lysates—as described in the latter protocol—requires less starting material. ChIP-seq has been widely used in the Xenopus community to understand the targets of developmentally important transcription factors—for example, Smad2/3 (Yoon et al. 2011); Foxh1 (Chiu et al. 2014); T-Box family proteins (Gentsch et al. 2013); β-catenin (Nakamura et al. 2016); Otx2, Lim1/Lhx1, and Gsc (Yasuoka et al. 2014); Vegt and Otx1 in early embryos (Paraiso et al. 2019); and Prdm12 in developing inhibitory neurons important for vertebrate locomotion (Thélie et al. 2015). ChIP-seq has also been used to identify promoters and compare them across species (van Heeringen et al. 2011) or to look at the dynamic changes in genome-wide distribution of histone modifications and RNA polymerase II occupancy (Akkers et al. 2009; Hontelez et al. 2015).

Open chromatin is generally considered a prerequisite for binding of transcription factors, and two protocols can be used to study the distribution of open chromatin regions over the Xenopus genome: One uses DNase-seq (Protocol: DNase-seq to Study Chromatin Accessibility in Early Xenopus tropicalis Embryos [Cho et al. 2019]), and the other uses ATAC-seq (Protocol: Assay for Transposase-Accessible Chromatin-Sequencing Using Xenopus Embryos [Bright and Veenstra 2019]). Reads from these protocols are mapped to the genome and generally produce peaks or otherwise-delineated small regions of DNA. These regions can then be analyzed for enrichment in DNA-binding motifs to predict which transcription factors may be active in particular developmental stages or tissues. If enough sequence reads are mapped, specific transcription factor footprints can be resolved at single-base resolution to determine how those factors contact the DNA (Boyle et al. 2011; Neph et al. 2012; Buenrostro et al. 2013).
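
The motif-enrichment step mentioned above can be illustrated with a deliberately simple Python sketch that compares how often a motif occurs in peak sequences and in background sequences. Everything here is an assumption made for illustration: the function names, the pseudocount, the arbitrary 6-mer, and the toy sequences are hypothetical. Real analyses typically score position weight matrices and apply a statistical test rather than matching an exact string.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def fraction_with_motif(seqs, motif):
    """Fraction of sequences containing the motif on either strand."""
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = sum(1 for s in seqs if motif in s.upper() or rc in s.upper())
    return hits / len(seqs)


def motif_enrichment(peak_seqs, background_seqs, motif, pseudocount=0.01):
    """Ratio of motif frequency in peaks to that in background sequences.

    The pseudocount avoids division by zero when the motif is absent
    from the background set.
    """
    fg = fraction_with_motif(peak_seqs, motif)
    bg = fraction_with_motif(background_seqs, motif)
    return (fg + pseudocount) / (bg + pseudocount)


# Toy sequences (hypothetical, not real data); the motif is an arbitrary 6-mer
peaks = ["CCAGGTGTAA", "TTACACCTGG", "GGGGCCCCAA"]
background = ["ACGTACGTAC", "CCAGGTGTCC", "TTTTTTTTTT", "GCGCGCGCGC"]
ratio = motif_enrichment(peaks, background, "AGGTGT")
print(round(ratio, 2))
```

A ratio well above 1 suggests that the motif is over-represented in the accessible regions, although with real data the background set must be chosen carefully (for example, matched for length and GC content) for the comparison to be meaningful.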

Most chromosome conformation capture methods use defined viewpoints (genomic regions of interest for which interactions are determined) to build dense maps of DNA–DNA contact information (for review, see Nicoletti et al. 2018). However, it is also possible to perform an unbiased, genome-wide analysis to study the three-dimensional organization of chromosomes in an organism's cells under physiological conditions. This Hi-C approach is described for Xenopus in Protocol: Generating a Three-Dimensional Genome from Xenopus with Hi-C (Quigley and Heinz 2019). A related method, tethered conformation capture (Kalhor et al. 2011), has previously been used in Xenopus to investigate the relationship between Foxj1 binding and chromatin loops in multiciliated cells (Quigley and Kintner 2017).

FUTURE CONSIDERATIONS

The major change sweeping through the world of biological data generation is the switch in emphasis from the bulk analysis of whole organisms or tissues to the analysis of hundreds to many thousands of individual cells from an embryo or tissue sample. Advances have been most rapid in the field of single-cell transcriptomics, for example, with the recent launch of the Human Cell Atlas (Rozenblatt-Rosen et al. 2017), which aims to create comprehensive reference maps of cell types in all major tissues. Recently, a major single-cell transcriptomics survey was conducted in Xenopus specimens spanning the blastula stage to the tailbud stage; this enabled the characterization of cell types from the earliest pluripotent cells to the well-differentiated cells of early organogenesis (Briggs et al. 2018). Although not all mRNAs in each cell are currently detected, single-cell RNA sequencing (scRNA-seq) can help us define the set of cell types in a population of cells, determine the evolution of cell lineages during development, and better understand the regulatory relationships between genes. Inevitably, the very large data sets that are generated introduce challenges for the development and application of computational methods, although modern approaches such as deep learning may be well suited to data of this scale.

Genomics techniques will not be far behind. Already, a number of approaches to single-cell genomics analysis are being developed along the lines of the multicellular methods, although the relatively small amounts of DNA in each sample will require highly efficient experimental methods and some rethinking of the downstream analysis. New approaches include the use of the Tn5 transposase for efficient library preparation, as in the CUT&Tag method (Kaya-Okur et al. 2019) for epigenomic profiling of low cell numbers and single cells. An ATAC-seq method has been developed for single-cell work (Chen et al. 2018) and used to resolve dynamic changes in the chromatin landscape and to uncover the cis-regulatory programs of Drosophila germ layer formation (Cusanovich et al. 2018). Last, in addition to Hi-C, Capture Hi-C (Jäger et al. 2015) has been developed to enrich Hi-C libraries for specific fragments, such as those containing promoters, and can provide evidence for dynamic interactions between promoters and enhancers.

The availability of genome assemblies for two closely related Xenopus species (see above) provides opportunities for new insights in comparative biology. In addition, expressed sequences from both species, and from different strains of each, are a rich potential source of genetic variation data. Analysis of such data is useful for many purposes, such as the design of morpholinos and CRISPR guide RNAs when using outbred populations. Genetic variation is also key to analyses of quantitative trait loci (QTL) and genome evolution. Although analysis of variation in whole-genome sequencing data is outside the scope of this introduction, genomic variation in X. laevis and X. tropicalis has already been analyzed in several reports (Elurbe et al. 2017; Savova et al. 2017; Mitros et al. 2019). Further studies will likely address genetic variation and its associated developmental, gene-regulatory, and phenotypic traits, both within populations and between closely related species.

Future genome-wide studies in Xenopus using the techniques described above will lead to the discovery of new mechanisms and to a better understanding of the processes controlling transcription and gene activation. We look forward to seeing the Xenopus community employing this next generation of methods in this most tractable of model systems.

Footnotes

  • From the Xenopus collection, edited by Hazel L. Sive.

