Encyclopedia Britannica

How are mutations passed to offspring?

An individual offspring inherits mutations only when mutations are present in parental egg or sperm cells (germinal mutations). All of the offspring’s cells will carry the mutated DNA, which often confers some serious malfunction, as in the case of a human genetic disease such as cystic fibrosis.

Why does mutation occur?

Mutations in DNA occur for different reasons. For example, environmental factors, such as exposure to ultraviolet radiation or certain chemicals, can induce changes in the DNA sequence. Mutations can also occur because of hereditary factors.

What are mutation hotspots?

Mutation hotspots (or mutational hotspots) are segments of DNA that are especially prone to genetic alteration. The increased susceptibility of these areas of DNA to mutation is attributed to interactions between mutation-inducing factors, the structure and function of the DNA sequence, and enzymes involved in DNA repair, replication, and modification.

mutation, an alteration in the genetic material (the genome) of a cell of a living organism or of a virus that is more or less permanent and that can be transmitted to the cell’s or the virus’s descendants. (The genomes of organisms are all composed of DNA, whereas viral genomes can be of DNA or RNA; see heredity: The physical basis of heredity.) Mutation in the DNA of a body cell of a multicellular organism (somatic mutation) may be transmitted to descendant cells by DNA replication and hence result in a sector or patch of cells having abnormal function, an example being cancer. Mutations in egg or sperm cells (germinal mutations) may result in an individual offspring all of whose cells carry the mutation, which often confers some serious malfunction, as in the case of a human genetic disease such as cystic fibrosis. Mutations result either from accidents during the normal chemical transactions of DNA, often during replication, or from exposure to high-energy electromagnetic radiation (e.g., ultraviolet light or X-rays) or particle radiation or to highly reactive chemicals in the environment. Because mutations are random changes, they are expected to be mostly deleterious, but some may be beneficial in certain environments. In general, mutation is the main source of genetic variation, which is the raw material for evolution by natural selection.

The genome is composed of one to several long molecules of DNA, and mutation can occur potentially anywhere on these molecules at any time. The most serious changes take place in the functional units of DNA, the genes. A mutated form of a gene is called a mutant allele. A gene is typically composed of a regulatory region, which is responsible for turning the gene’s transcription on and off at the appropriate times during development, and a coding region, which carries the genetic code for the structure of a functional molecule, generally a protein. A protein is a chain of usually several hundred amino acids. Cells make 20 common amino acids, and it is the unique number and sequence of these that give a protein its specific function. Each amino acid is encoded by one or more three-base-pair sequences, called codons, built from the four possible base pairs in the DNA (A–T, T–A, G–C, and C–G, the individual letters referring to the four nitrogenous bases adenine, thymine, guanine, and cytosine). Hence, a mutation that changes DNA sequence can change amino acid sequence and in this way potentially reduce or inactivate a protein’s function. A change in the DNA sequence of a gene’s regulatory region can adversely affect the timing and availability of the gene’s protein and also lead to serious cellular malfunction. On the other hand, many mutations are silent, showing no obvious effect at the functional level. Some silent mutations are in the DNA between genes, or they are of a type that results in no significant amino acid changes.

Mutations are of several types. Changes within genes are called point mutations. The simplest kinds are changes to single base pairs, called base-pair substitutions. Many of these substitute an incorrect amino acid in the corresponding position in the encoded protein, and of these a large proportion result in altered protein function. Some base-pair substitutions produce a stop codon. Normally, when a stop codon occurs at the end of a gene, it stops protein synthesis, but, when it occurs in an abnormal position, it can result in a truncated and nonfunctional protein. Another type of simple change, the deletion or insertion of single base pairs, generally has a profound effect on the protein because the protein’s synthesis, which is carried out by the reading of triplet codons in a linear fashion from one end of the gene to the other, is thrown off. This change leads to a frameshift in reading the gene such that all amino acids are incorrect from the mutation onward. More-complex combinations of base substitutions, insertions, and deletions can also be observed in some mutant genes.
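The effect of a base-pair substitution on the encoded protein can be illustrated with a short sketch. It assumes the standard genetic code, reads only the coding strand of the DNA from left to right, and uses a made-up 15-base gene; the names `translate` and `substitute` are illustrative and do not come from any particular library.

```python
from itertools import product

# Standard genetic code, written for the DNA coding strand.
# Codons are enumerated in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read triplet codons in order and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def substitute(dna, position, base):
    """Return a copy of dna with the base at a 0-based position replaced."""
    return dna[:position] + base + dna[position + 1:]


# Hypothetical gene: ATG GAA AAA TGG TAA
normal = "ATGGAAAAATGGTAA"
missense = substitute(normal, 4, "T")   # GAA becomes GTA: one amino acid differs
nonsense = substitute(normal, 6, "T")   # AAA becomes TAA: a premature stop codon

print(translate(normal))
print(translate(missense))
print(translate(nonsense))
```

The first substitution changes a single amino acid, while the second introduces a stop codon in an abnormal position and shortens the protein, matching the two outcomes described above.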

Mutations that span more than one gene are called chromosomal mutations because they affect the structure, function, and inheritance of whole DNA molecules (microscopically visible in a coiled state as chromosomes). Often these chromosome mutations result from one or more coincident breaks in the DNA molecules of the genome (possibly from exposure to energetic radiation), followed in some cases by faulty rejoining. Some outcomes are large-scale deletions, duplications, inversions, and translocations. In a diploid species (a species, such as human beings, that has a double set of chromosomes in the nucleus of each cell), deletions and duplications alter gene balance and often result in abnormality. Inversions and translocations involve no loss or gain and are functionally normal unless a break occurs within a gene. However, at meiosis (the specialized nuclear divisions that take place during the production of gametes, i.e., eggs and sperm), faulty pairing of an inverted or translocated chromosome set with a normal set can result in gametes and hence progeny with duplications and deletions.
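The four large-scale outcomes named above can be sketched by treating a chromosome as an ordered list of gene labels. This is a deliberately simplified model: the gene names, break positions, and function names are hypothetical, and breaks are assumed to fall between genes rather than within them.

```python
def delete(chromosome, start, end):
    """Remove the genes between two breaks."""
    return chromosome[:start] + chromosome[end:]


def duplicate(chromosome, start, end):
    """Repeat the segment between two breaks immediately after itself."""
    return chromosome[:end] + chromosome[start:end] + chromosome[end:]


def invert(chromosome, start, end):
    """Rejoin the segment between two breaks in reverse order."""
    return chromosome[:start] + chromosome[start:end][::-1] + chromosome[end:]


def translocate(chromosome_a, chromosome_b, break_a, break_b):
    """Exchange the ends of two chromosomes (a reciprocal translocation)."""
    return (chromosome_a[:break_a] + chromosome_b[break_b:],
            chromosome_b[:break_b] + chromosome_a[break_a:])


first = ["A", "B", "C", "D", "E"]
second = ["V", "W", "X"]

print(delete(first, 1, 3))
print(duplicate(first, 1, 3))
print(invert(first, 1, 3))
print(translocate(first, second, 3, 1))
```

In this model, deletion and duplication change how many copies of a gene are present, whereas inversion and translocation only rearrange genes, which is the distinction the paragraph draws about gene balance.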

Loss or gain of whole chromosomes results in a condition called aneuploidy. One familiar result of aneuploidy is Down syndrome, a chromosomal disorder in which humans are born with an extra chromosome 21 (and hence bear three copies of that chromosome instead of the usual two). Another type of chromosome mutation is the gain or loss of whole chromosome sets. Gain of sets results in polyploidy, that is, the presence of three, four, or more chromosome sets instead of the usual two. Polyploidy has been a significant force in the evolution of new species of plants and animals. (See also evolution: Polyploidy.)
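The difference between aneuploidy and polyploidy comes down to whether single chromosomes or whole sets are gained or lost. A minimal sketch, assuming a diploid species and a dictionary that maps each chromosome type to its copy number (the labels and the function name `classify_karyotype` are illustrative):

```python
def classify_karyotype(copies, normal=2):
    """Classify a cell by how many copies of each chromosome type it carries."""
    counts = set(copies.values())
    if counts == {normal}:
        return "normal"
    if len(counts) == 1:
        # Every chromosome type has the same unusual copy number,
        # so whole sets were gained or lost.
        return "polyploid" if counts.pop() > normal else "reduced chromosome sets"
    # Only some chromosome types differ from the usual number.
    return "aneuploid"


# Hypothetical human cell: 22 numbered chromosome types plus the sex chromosomes.
human = {str(number): 2 for number in range(1, 23)}
human["sex"] = 2

trisomy_21 = dict(human)
trisomy_21["21"] = 3                      # one extra chromosome 21

triploid = {name: 3 for name in human}    # one extra copy of every chromosome

print(classify_karyotype(human), sum(human.values()))
print(classify_karyotype(trisomy_21), sum(trisomy_21.values()))
print(classify_karyotype(triploid), sum(triploid.values()))
```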

Most genomes contain mobile DNA elements that move from one location to another. The movement of these elements can cause mutation, either because the element arrives in some crucial location, such as within a gene, or because it promotes large-scale chromosome mutations via recombination between pairs of mobile elements in different locations.

At the level of whole populations of organisms, mutation can be viewed as a constantly dripping faucet introducing mutant alleles into the population, a concept described as mutational pressure. The rate of mutation differs for different genes and organisms. In RNA viruses, such as the human immunodeficiency virus (HIV; see AIDS), replication of the genome takes place within the host cell using a mechanism that is prone to error. Hence, mutation rates in such viruses are high. In general, however, the fate of individual mutant alleles is never certain. Most are eliminated by chance. In some cases a mutant allele can increase in frequency by chance, and then individuals expressing the allele can be subject to selection, either positive or negative. Hence, for any one gene the frequency of a mutant allele in a population is determined by a combination of mutational pressure, selection, and chance.
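The interplay of mutational pressure, selection, and chance can be explored with a small simulation. The sketch below is not taken from the article: it assumes a simple haploid population of fixed size `n`, a one-way mutation rate `u` from the normal to the mutant allele, a relative fitness of `1 + s` for the mutant (with `s` greater than -1), and random sampling of allele copies each generation. All parameter names are illustrative.

```python
import random


def next_generation(q, u, s, n, rng):
    """Advance the mutant allele frequency q by one generation."""
    # Mutational pressure: a fraction u of normal alleles becomes mutant.
    q = q + u * (1.0 - q)
    # Selection: mutant alleles are weighted by relative fitness 1 + s.
    q = q * (1.0 + s) / (q * (1.0 + s) + (1.0 - q))
    # Chance: the next generation is a random sample of n allele copies.
    copies = sum(rng.random() < q for _ in range(n))
    return copies / n


def simulate(q0, u, s, n, generations, seed=0):
    """Return the mutant allele frequency in each generation, starting at q0."""
    rng = random.Random(seed)
    history = [q0]
    for _ in range(generations):
        history.append(next_generation(history[-1], u, s, n, rng))
    return history


# Mutant allele initially absent; it can only enter through mutation.
trajectory = simulate(q0=0.0, u=0.001, s=0.05, n=200, generations=50)
print(trajectory[-1])
```

Running the simulation with different seeds shows why the fate of an individual mutant allele is uncertain: the same mutation rate and selection strength can lead to loss in one run and spread in another.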

UC MUSEUM OF PALEONTOLOGY

Understanding Evolution

Mutations are changes in the information contained in genetic material. For most of life, this means a change in the sequence of DNA, the hereditary material of life. An organism’s DNA affects how it looks, how it behaves, its physiology — all aspects of its life. So a change in an organism’s DNA can cause changes in all aspects of its life.

Mutations are random

Mutations can be beneficial, neutral, or harmful for the organism, but mutations do not “try” to supply what the organism “needs.” In this respect, mutations are random: whether a particular mutation happens or not is unrelated to how useful that mutation would be.

Not all mutations matter to evolution

Since all cells in our body contain DNA, there are lots of places for mutations to occur; however, not all mutations matter for evolution. Somatic mutations occur in non-reproductive cells and so won’t be passed on to offspring.

For example, the yellow color on half of a petal on this red tulip was caused by a somatic mutation. The seeds of the tulip do not carry the mutation. Cancer is also caused by somatic mutations that cause a particular cell lineage (e.g., in the breast or brain) to multiply out of control. Such mutations affect the individual carrying them but are not passed directly on to offspring.

A tulip with one petal that is half solid red and half solid yellow.

The only mutations that matter for the evolution of life’s diversity are those that can be passed on to offspring. These occur in reproductive cells like eggs and sperm and are called germline mutations.

Reviewed and updated June, 2020.

ENCYCLOPEDIC ENTRY

Adaptation and survival.

An adaptation is a mutation, or genetic change, that helps an organism, such as a plant or animal, survive in its environment.

An adaptation is a mutation, or genetic change, that helps an organism, such as a plant or animal, survive in its environment. Due to the helpful nature of the mutation, it is passed down from one generation to the next. As more and more organisms inherit the mutation, the mutation becomes a typical part of the species. The mutation has become an adaptation.

Structural and Behavioral Adaptations

An adaptation can be structural, meaning it is a physical part of the organism. An adaptation can also be behavioral, affecting the way an organism acts.

An example of a structural adaptation is the way some plants have adapted to life in the desert. Deserts are dry, hot places. Plants called succulents have adapted to this climate by storing water in their thick stems and leaves.

Animal migration is an example of a behavioral adaptation. Gray whales migrate thousands of miles every year as they swim from the cold Arctic Ocean to the warm waters off the coast of Mexico. Gray whale calves are born in the warm water, and then travel in groups called pods to the nutrient-rich waters of the Arctic.

Some adaptations are called exaptations. An exaptation is an adaptation developed for one purpose, but used for another. Feathers were probably adaptations for keeping the animal warm that were later used for flight, making feathers an exaptation for flying.

Some adaptations, on the other hand, become useless. These adaptations are vestigial: remaining but functionless. Whales and dolphins have vestigial leg bones, the remains of an adaptation (legs) that their ancestors used to walk.

Habitat

Adaptations usually develop in response to a change in the organisms’ habitat. A famous example of an animal adapting to a change in its environment is the English peppered moth. Prior to the 19th century, the most common type of this moth was cream-colored with darker spots. Few peppered moths displayed a mutation of being gray or black.

As the Industrial Revolution changed the environment, the appearance of the peppered moth changed. The darker-colored moths, which were rare, began to thrive in the urban atmosphere. Their sooty color blended in with the trees stained by industrial pollution. Birds couldn’t see the dark moths, so they ate the cream-colored moths instead. The cream-colored moths began to make a comeback after the United Kingdom passed laws that limited air pollution.

Speciation

Sometimes, an organism develops an adaptation or set of adaptations that create an entirely new species. This process is known as speciation. The physical isolation or specialization of a species can lead to speciation.

The wide variety of marsupials in Oceania is an example of how organisms adapt to an isolated habitat. Marsupials, mammals that carry their young in pouches, arrived in Oceania before the land split with Asia. Placental mammals, animals that carry their young in the mother’s womb, came to dominate every other continent, but not Oceania. There, marsupials faced no competition. Koalas, for instance, adapted to feed on eucalyptus trees, which are native to Australia. The extinct Tasmanian tiger was a carnivorous marsupial and adapted to the niche filled by big cats like tigers on other continents. Marsupials in Oceania are an example of adaptive radiation, a type of speciation in which species develop to fill a variety of empty ecological niches.

The cichlid fish found in Africa’s Lake Malawi exhibit another type of speciation, sympatric speciation. Sympatric speciation is the opposite of physical isolation. It happens when species share the same habitat. Adaptations have allowed hundreds of varieties of cichlids to live in Lake Malawi. Each species of cichlid has a unique, specialized diet: One type of cichlid may eat only insects, another may eat only algae, another may feed only on other fish.

Coadaptation

Organisms sometimes adapt to and with other organisms. This is called coadaptation. Certain flowers have adapted their nectar to appeal to the hummingbirds’ nutritional needs. Hummingbirds have adapted long, thin beaks to extract the nectar from certain flowers. In this relationship, the hummingbird gets food, while the plant’s pollen is distributed. The coadaptation is beneficial to both organisms.

Mimicry is another type of coadaptation. With mimicry, one organism has adapted to resemble another. The harmless king snake (sometimes called a milk snake) has adapted a color pattern that resembles the deadly coral snake. This mimicry keeps predators away from the king snake. The mimic octopus has behavioral as well as structural adaptations. This species of octopus can mimic the look and movements of animals such as sea stars, crabs, jellyfish, and shrimp.

Coadaptation can also limit an organism’s ability to adapt to new changes in its habitat. This can lead to co-extinction. In Southern England, the large blue butterfly adapted to eat red ants. When human development reduced the red ants’ habitat, the local extinction of the red ant led to the local extinction of the large blue butterfly.

Vestigial Adaptations

Vestigial organs are adaptations that have become useless. In humans, vestigial organs include the appendix, thought to be left over from when the human diet was primarily vegetation; the coccyx, a vestigial tail; and gill slits that are found in human embryos, though embryos never breathe through them.

Last updated: October 19, 2023

ENCYCLOPEDIC ENTRY

A mutation is a change in the sequence of genetic letters, called bases, within a molecule of DNA.

Biology, Genetics

York Grown Strawberries

Mutations occur throughout the natural world, and fuel the process of natural selection. In cultivated crops like strawberries, farmers may help this process along by selectively growing plants with mutations that make the fruits more resilient against di

Photograph by Jim Richardson

A mutation is a change in the structure of a gene, the unit of heredity. Genes are made of deoxyribonucleic acid (DNA), a long molecule composed of building blocks called nucleotides. Each nucleotide is built around one of four different subunits called bases. These bases are known as guanine, cytosine, adenine, and thymine. A gene carries information in the sequence of its nucleotides, just as a sentence carries information in the sequence of its letters.

One type of mutation is a change in a base. This is called a point mutation and it is like changing one letter in a word. Most genes carry instructions for making proteins. When a base is changed in a gene, different results are possible, depending on which base is changed and what it is changed into. The gene may produce an altered protein, it may produce no protein, or it may produce the usual protein. Most mutations are not harmful, but some can be. A harmful mutation can result in a genetic disorder or even cancer.

Another kind of mutation is a chromosomal mutation. Chromosomes, located in the cell nucleus, are tiny threadlike structures that carry genes. A chromosome consists of a molecule of DNA together with proteins. Sometimes, a long segment of DNA is inserted into a chromosome, deleted from a chromosome, flipped around within a chromosome, duplicated, or moved from one chromosome to another. Such changes are usually very harmful.

One example of a chromosomal mutation is a condition called Down syndrome. In each cell, humans normally have forty-six chromosomes, consisting of two copies of the twenty-three kinds of chromosomes. Down syndrome usually results from the presence of one extra copy of a particular chromosome, or an extra portion of that chromosome. The presence of that extra chromosome leads to problems with certain organs of the body, such as the heart. It can also lead to leukemia—a cancer of the blood-forming cells—and produce mental disabilities. Many people with Down syndrome also have distinct facial features.

Mutations can be inherited or acquired during a person's lifetime. Mutations that an individual inherits from their parents are called hereditary mutations. They are present in all body cells and can be passed down to new generations. Acquired mutations occur during an individual’s life. If an acquired mutation occurs in an egg or sperm cell, it can be passed down to the individual’s offspring. Once an acquired mutation is passed down, it is a hereditary mutation. Acquired mutations are not passed down if they occur in the somatic cells, meaning body cells other than sperm cells and egg cells. Some acquired mutations occur spontaneously and randomly in genes. Other mutations are caused by environmental factors, such as exposure to certain chemicals or radiation.

Mutations occur throughout the natural world. Some mutations are beneficial and increase the possibility that an organism will thrive and pass on its genes to the next generation. When mutations improve survival or reproduction, the process of natural selection will cause the mutation to become more common over time. When mutations are harmful, they become less common over time. Therefore, mutation is a force that helps drive evolution.

Last updated: October 19, 2023

Biology LibreTexts

19.2A: Genetic Variation

Genetic variation is a measure of the variation that exists in the genetic makeup of individuals within a population.

Learning Objectives

  • Assess the ways in which genetic variance affects the evolution of populations

Key Points

  • Genetic variation is an important force in evolution as it allows natural selection to increase or decrease frequency of alleles already in the population.
  • Genetic variation can be caused by mutation (which can create entirely new alleles in a population), random mating, random fertilization, and recombination between homologous chromosomes during meiosis (which reshuffles alleles within an organism’s offspring).
  • Genetic variation is advantageous to a population because it enables some individuals to adapt to the environment while maintaining the survival of the population.

Key Terms

  • genetic diversity: the level of biodiversity; refers to the total number of genetic characteristics in the genetic makeup of a species
  • crossing over: the exchange of genetic material between homologous chromosomes that results in recombinant chromosomes
  • phenotypic variation: variation (due to underlying heritable genetic variation); a fundamental prerequisite for evolution by natural selection
  • genetic variation: variation in alleles of genes that occurs both within and among populations

Genetic Variation

Genetic variation is a measure of the genetic differences that exist within a population. The genetic variation of an entire species is often called genetic diversity. Genetic variations are the differences in DNA segments or genes between individuals, and each variation of a gene is called an allele. For example, a population with many different alleles at a single chromosome locus has a high amount of genetic variation. Genetic variation is essential for natural selection because natural selection can only increase or decrease frequency of alleles that already exist in the population.
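One way to put a number on the variation at a single locus is to tally allele frequencies and compute the expected heterozygosity, that is, one minus the sum of the squared allele frequencies. This statistic is a common summary chosen here for illustration and is not named in the text above; the allele labels and sample populations are made up.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Return the fraction of sampled allele copies belonging to each allele."""
    counts = Counter(alleles)
    total = len(alleles)
    return {allele: count / total for allele, count in counts.items()}


def expected_heterozygosity(alleles):
    """Chance that two allele copies drawn at random (with replacement) differ."""
    frequencies = allele_frequencies(alleles)
    return 1.0 - sum(f * f for f in frequencies.values())


# Hypothetical samples of eight allele copies at one locus.
uniform_population = ["A"] * 8
varied_population = ["A", "A", "B", "B", "C", "C", "D", "D"]

print(expected_heterozygosity(uniform_population))
print(expected_heterozygosity(varied_population))
```

A population carrying a single allele scores zero, and the score rises as more alleles are present at similar frequencies, which matches the statement that many different alleles at a locus mean high genetic variation.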

Genetic variation is caused by:

  • mutation
  • random mating between organisms
  • random fertilization
  • crossing over (or recombination) between chromatids of homologous chromosomes during meiosis

The last three of these factors reshuffle alleles within a population, giving offspring combinations which differ from their parents and from others.
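Crossing over, one of the reshuffling factors listed above, can be sketched by representing each chromatid as a list of alleles and swapping everything beyond a chosen point. The allele labels and the single fixed crossover point are assumptions made for illustration; real crossovers occur at variable positions and may happen more than once per chromosome.

```python
def crossover(chromatid_a, chromatid_b, point):
    """Exchange the segments of two homologous chromatids beyond a crossover point."""
    if len(chromatid_a) != len(chromatid_b):
        raise ValueError("homologous chromatids must carry the same loci")
    recombinant_a = chromatid_a[:point] + chromatid_b[point:]
    recombinant_b = chromatid_b[:point] + chromatid_a[point:]
    return recombinant_a, recombinant_b


# Hypothetical alleles at four loci, one chromatid from each parent.
maternal = ["A", "B", "C", "D"]
paternal = ["a", "b", "c", "d"]

print(crossover(maternal, paternal, 2))
```

No new alleles appear in the recombinant chromatids; the existing ones are simply combined in an arrangement that neither parent carried.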

Evolution and Adaptation to the Environment

Variation allows some individuals within a population to adapt to the changing environment. Because natural selection acts directly only on phenotypes, more genetic variation within a population usually enables more phenotypic variation. Some new alleles increase an organism’s ability to survive and reproduce, which then ensures the survival of the allele in the population. Other new alleles may be immediately detrimental (such as a malformed oxygen-carrying protein) and organisms carrying these new mutations will die out. Neutral alleles are neither selected for nor against and usually remain in the population. Genetic variation is advantageous because it enables some individuals and, therefore, a population, to survive despite a changing environment.

Geographic Variation

Some species display geographic variation as well as variation within a population. Geographic variation, or the distinctions in the genetic makeup of different populations, often occurs when populations are geographically separated by environmental barriers or when they are under selection pressures from a different environment. One example of geographic variation is a cline: a graded change in a character along a geographic axis.

Sources of Genetic Variation

Gene duplication, mutation, or other processes can produce new genes and alleles and increase genetic variation. New genetic variation can arise in each generation of a population, so a population with rapid reproduction rates will probably have high genetic variation. Existing genes can also be arranged in new ways by chromosomal crossing over and recombination in sexual reproduction. Overall, the main sources of genetic variation are the formation of new alleles, the altering of gene number or position, rapid reproduction, and sexual reproduction.

Genetic Mutations (AQA A Level Biology)

Revision note.

Alistair

Biology & Environmental Systems and Societies

Gene Mutations

  • A gene mutation is a change in the sequence of base pairs in a DNA molecule that may result in an altered polypeptide
  • Errors in the DNA often occur during DNA replication
  • As the DNA base sequence determines the sequence of amino acids that make up a protein, mutations in a gene can sometimes lead to a change in the polypeptide that the gene codes for
  • Most mutations do not alter the polypeptide or only alter it slightly so that its structure or function is not changed (as the genetic code is degenerate)
  • Mutations in the DNA base sequence can occur due to the insertion, deletion or substitution of a nucleotide or due to the inversion, duplication or translocation of a section of a gene

Insertion of nucleotides  

  • A mutation that occurs when a nucleotide (with a new base) is randomly inserted into the DNA sequence is known as an insertion mutation
  • An insertion mutation changes the amino acid that would have been coded for by the triplet in which it occurs
  • This is because every group of three bases in a DNA sequence codes for an amino acid
  • An insertion mutation also has a knock-on effect by changing the triplets (groups of three bases) further on in the DNA sequence
  • This is sometimes known as a frameshift mutation
  • This may dramatically change the amino acid sequence produced from this gene and therefore the ability of the polypeptide to function

An example of an insertion mutation

Deletion of nucleotides  

  • A mutation that occurs when a nucleotide (and therefore its base) is randomly deleted from the DNA sequence is known as a deletion mutation
  • Like an insertion mutation, a deletion mutation changes the amino acid that would have been coded for
  • Like an insertion mutation, a deletion mutation also has a knock-on effect by changing the groups of three bases further on in the DNA sequence
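
The knock-on effect of an insertion or deletion can be illustrated with a short Python sketch. It assumes the standard genetic code, reads the coding strand of DNA in triplets from the first base, and uses an invented 18-base sequence; the function names are illustrative and do not come from any library.

```python
# Standard genetic code, built from the conventional TCAG ordering.
# "*" marks a stop codon. Amino acids use one-letter codes.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna):
    """Translate coding-strand DNA in triplets, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_base(dna, position, base):
    """Return a copy of dna with one base inserted before the given index."""
    return dna[:position] + base + dna[position:]


def delete_base(dna, position):
    """Return a copy of dna with the base at the given index removed."""
    return dna[:position] + dna[position + 1:]


# Hypothetical gene fragment: Met-Ala-Phe-Gly-Lys followed by a stop codon.
original = "ATGGCCTTTGGAAAATGA"
with_insertion = insert_base(original, 3, "T")
with_deletion = delete_base(original, 3)

print("original :", translate(original))
print("insertion:", translate(with_insertion))
print("deletion :", translate(with_deletion))
```

In both mutated sequences every triplet after the first is read differently, which is the frameshift described above. In the insertion case the original stop codon is also no longer read in frame.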

Substitution of nucleotides  

  • A mutation that occurs when a base in the DNA sequence is randomly swapped for a different base is known as a substitution mutation
  • Unlike an insertion or deletion mutation, a substitution mutation will only change the amino acid for the triplet (group of three bases) in which the mutation occurs; it will not have a knock-on effect
  • Substitution mutations can take three forms:
  • Silent mutations – the mutation does not alter the amino acid sequence of the polypeptide (this is because certain codons may code for the same amino acid as the genetic code is degenerate)
  • Missense mutations – the mutation alters a single amino acid in the polypeptide chain (sickle cell anaemia is an example of a disease caused by a single substitution mutation changing a single amino acid in the sequence)
  • Nonsense mutations – the mutation creates a premature stop codon (signal for the cell to stop translation of the mRNA molecule into an amino acid sequence), causing the polypeptide chain produced to be incomplete and therefore affecting the final protein structure and function (cystic fibrosis is an example of a disease that can be caused by a nonsense mutation, although this is not the only cause)

An example of a substitution mutation
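
The three outcomes of a substitution can be told apart by comparing the amino acid coded for before and after the change. The sketch below is one possible way to do that in Python, assuming the standard genetic code and coding-strand DNA codons; `classify_substitution` is a hypothetical helper name, and the sketch assumes the original codon is not itself a stop codon.

```python
# Standard genetic code, built from the conventional TCAG ordering.
# "*" marks a stop codon. Amino acids use one-letter codes.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def classify_substitution(codon, position, new_base):
    """Classify a single-base substitution within one codon.

    position is the 0-based index (0, 1 or 2) of the base that is swapped.
    Assumes the original codon codes for an amino acid, not a stop signal.
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    before = CODON_TABLE[codon]
    after = CODON_TABLE[mutated]
    if after == before:
        return "silent"      # same amino acid, polypeptide unchanged
    if after == "*":
        return "nonsense"    # premature stop codon
    return "missense"        # one amino acid replaced by another


# GTT -> GTC: both code for valine.
print(classify_substitution("GTT", 2, "C"))
# GAG -> GTG: glutamic acid becomes valine, the change seen in sickle cell anaemia.
print(classify_substitution("GAG", 1, "T"))
# CAG -> TAG: glutamine becomes a stop codon.
print(classify_substitution("CAG", 0, "T"))
```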

Inversion within a gene section

  • An inversion usually occurs during crossing over in meiosis
  • The DNA of a single gene is cut in two places
  • The cut portion is inverted 180° then rejoined at the same place within the gene
  • The result is that a large section of the gene is 'backwards' and therefore multiple amino acids are affected
  • In some cases, an entirely different protein is produced
  • If the other chromosome in the pair carries a working gene, the effect of the mutation may be lessened

Inversion mutations occur when a section of a gene is cut then resealed after 180° inversion
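
As a rough illustration, an inversion can be modelled on a sequence string. Because the two strands of DNA run in opposite directions, a section rotated by 180° reads along the original strand as the reverse complement of what was there before, and the Python sketch below makes that assumption. The sequence, the cut points and the function names are invented for the example.

```python
# Standard genetic code, built from the conventional TCAG ordering.
# "*" marks a stop codon. Amino acids use one-letter codes.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def translate(dna):
    """Translate coding-strand DNA in triplets, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def invert_segment(dna, start, end):
    """Cut dna at start and end, flip the section between them, and rejoin it."""
    segment = dna[start:end]
    flipped = "".join(COMPLEMENT[base] for base in reversed(segment))
    return dna[:start] + flipped + dna[end:]


# Hypothetical gene fragment; the section from index 3 up to index 12 is inverted.
original = "ATGGCCTTTGGAAAATGA"
inverted = invert_segment(original, 3, 12)

print("original:", original, translate(original))
print("inverted:", inverted, translate(inverted))
```

The inverted section here is a whole number of triplets, so the reading frame after it is preserved and only the amino acids inside the section can change.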

Duplication of a gene

  • A whole gene or section of a gene is duplicated so that two copies of the gene/section appear on the same chromosome
  • The original version of the gene remains intact and therefore the mutation is usually not harmful
  • Over time, the second copy can undergo mutations which enable it to develop new functions
  • Alpha, beta and gamma haemoglobin genes evolved due to duplication mutations

Duplication mutations occur when a gene is copied so that two versions of the same gene occur on the same chromosome

Translocation of a gene section

  • As in an inversion, a gene is cut in two places
  • The section of the gene that is cut out attaches to a separate gene
  • As a result, the cut gene is now non-functional because it has a section missing, and the gene that has gained the translocated section is also likely to be non-functional
  • If a section of a proto-oncogene is translocated onto a gene controlling cell division, it could boost expression and lead to tumours
  • Similarly, if a section of a tumour suppressor gene is translocated and the result is a faulty tumour suppressor gene, this could lead to the cell continuing replication when it contains faulty DNA

Translocation mutations occur when a section of a gene is cut then resealed onto another gene

A silent mutation is a change in the nucleotide sequence that results in the same amino acid sequence. This is possible because most amino acids can be coded for by more than one triplet codon sequence (up to six for some amino acids). Silent mutations are most often a change in the third base in the codon, rather than the first or second. For example, valine is coded for by four different triplet codon sequences (GUU, GUC, GUA and GUG); therefore, as long as the first two nucleotides in the codon are guanine and uracil, the amino acid valine will be inserted into the polypeptide.
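
The valine example can be checked against the standard genetic code with a few lines of Python. This is only a sketch: it uses mRNA codons (with U instead of T), one-letter amino acid codes (V for valine), and helper names chosen for the example.

```python
# Standard genetic code for mRNA codons, built from the conventional UCAG ordering.
# "*" marks a stop codon. Amino acids use one-letter codes.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def codons_for(amino_acid):
    """List every codon that codes for the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def third_base_changes_are_silent(codon):
    """True if swapping the third base for any other base keeps the same amino acid."""
    original = CODON_TABLE[codon]
    return all(CODON_TABLE[codon[:2] + base] == original for base in BASES)


print(codons_for("V"))
print(third_base_changes_are_silent("GUU"))  # valine
print(third_base_changes_are_silent("GAU"))  # aspartic acid, for contrast
```

Not every codon behaves like valine's: for aspartic acid (GAU), changing the third base to A or G gives glutamic acid instead, so only some third-base substitutions are silent.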

Author: Alistair

Alistair graduated from Oxford University with a degree in Biological Sciences. He has taught GCSE/IGCSE Biology, as well as Biology and Environmental Systems & Societies for the International Baccalaureate Diploma Programme. While teaching in Oxford, Alistair completed his MA Education as Head of Department for Environmental Systems & Societies. Alistair has continued to pursue his interests in ecology and environmental science, recently gaining an MSc in Wildlife Biology & Conservation with Edinburgh Napier University.

Modern Synthetic Theory of Evolution

The Modern Synthetic Theory of Evolution (also called the Modern Synthesis) merges the concept of Darwinian evolution with Mendelian genetics, resulting in a unified theory of evolution. This theory is also referred to as the Neo-Darwinian theory and was developed by a number of evolutionary biologists, including T. Dobzhansky, J.B.S. Haldane, R.A. Fisher, Sewall Wright, G.L. Stebbins and Ernst Mayr.

It describes the evolution of life in terms of genetic changes occurring in a population that lead to the formation of new species. It also describes the genetic (Mendelian) population, the gene pool and gene frequency. The major concepts under this theory include genetic variation, reproductive and geographical isolation, and natural selection.

The Modern Synthetic Theory of Evolution changed how evolution and its processes are conceived. The theory gave a new definition of evolution as “the changes occurring in the allele frequencies within the populations,” which emphasizes the genetic basis of evolution.
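
To make the definition concrete, allele frequencies can be calculated from genotype counts and compared between generations. The Python sketch below assumes a single gene with two alleles (A and a) in a diploid population; the genotype counts are invented for illustration.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q), the frequencies of alleles A and a, from genotype counts.

    Each diploid individual carries two alleles, so the total number of
    alleles is twice the number of individuals.
    """
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    p = (2 * n_AA + n_Aa) / total_alleles
    q = (2 * n_aa + n_Aa) / total_alleles
    return p, q


# Hypothetical counts of AA, Aa and aa individuals in two generations.
p1, q1 = allele_frequencies(36, 48, 16)
p2, q2 = allele_frequencies(49, 42, 9)

print("generation 1: p =", p1, "q =", q1)
print("generation 2: p =", p2, "q =", q2)
print("change in frequency of A:", round(p2 - p1, 3))
```

Under the definition quoted above, a non-zero change in p between the two generations counts as evolution, whatever its cause.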

Factors of Modern Synthetic Theory of Evolution

The factors that contribute to the change in allele frequency of a population are as follows:

Genetic Recombination

Recombination is a process in which new combinations of alleles are formed. Genetic recombination occurs during sexual reproduction at the time of gamete formation. The exchange of genetic material between non-sister chromatids during meiosis is called crossing over. It leads to recombination and is one of the causes of the genetic variability present within a population.

Mutation

Mutations are sudden heritable changes in the genetic material that may have a phenotypic effect. Gene mutations alter the base sequence of a gene, whereas chromosomal mutations alter chromosome structure or number, e.g. deletion, inversion, duplication, translocation, aneuploidy, polyploidy, etc. Mutation produces a variety of changes, many of which may be harmful. Many of the mutant forms of genes are recessive and are expressed only in the homozygous condition. Advantageous mutations may be favoured by natural selection, so that small changes gradually accumulate over time. In this way, mutations cause variation in a population.

Genetic Drift and Gene Flow

Any random change in the gene/allele frequency of a population from one generation to the next is referred to as genetic drift. It occurs due to chance events and is more prominent in a small population. Gene flow is the movement of alleles between populations through the immigration or emigration of individuals. Repeated migration changes the allele/gene frequencies of the populations involved.
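
Both processes can be sketched in Python. The drift simulation below assumes a simple random-sampling model in which each generation's alleles are drawn at random from the previous generation's frequencies, and the gene flow function assumes that a fixed fraction m of the population is replaced by migrants each generation. These are simplifying assumptions made for the example, and the population sizes, seed and parameter names are illustrative.

```python
import random


def simulate_drift(pop_size, p0, generations, rng):
    """Track the frequency of one allele under random sampling alone.

    pop_size is the number of diploid individuals, so each generation
    consists of 2 * pop_size allele copies drawn with probability p.
    """
    n_alleles = 2 * pop_size
    p = p0
    history = [p]
    for _ in range(generations):
        copies = sum(1 for _ in range(n_alleles) if rng.random() < p)
        p = copies / n_alleles
        history.append(p)
    return history


def apply_gene_flow(p_resident, p_migrant, m):
    """Allele frequency after a fraction m of the population is replaced by migrants."""
    return (1 - m) * p_resident + m * p_migrant


# Same starting frequency and number of generations, different population sizes.
small = simulate_drift(pop_size=10, p0=0.5, generations=50, rng=random.Random(1))
large = simulate_drift(pop_size=1000, p0=0.5, generations=50, rng=random.Random(1))
print("small population, final frequency:", small[-1])
print("large population, final frequency:", large[-1])

# One generation of migration from a population where the allele is more common.
print("after gene flow:", round(apply_gene_flow(0.2, 0.8, 0.1), 3))
```

Running the drift simulation with several different seeds typically shows the small population wandering further from 0.5, and sometimes losing or fixing the allele entirely, while the large population stays close to its starting frequency.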

Natural Selection

Organisms that are better adapted to the environment are selected by nature. Natural selection changes the frequency of genes from one generation to the next through differential reproduction.

Isolation

Isolation is one of the significant factors in the synthetic theory of evolution. It prevents interbreeding between related organisms, which is known as reproductive isolation.

In addition to these factors, other factors such as hybridization between two species increase the genetic variability of the population.

Mutation and premating isolation

R. C. Woodruff & J. N. Thompson Jr.

Part of the book series: Contemporary Issues in Genetics and Evolution (CIGE, volume 9)

While premating isolation might be traceable to different genetic mechanisms in different species, evidence supports the idea that as few as one or two genes may often be sufficient to initiate isolation. Thus, new mutation can theoretically play a key role in the process. But it has long been thought that a new isolation mutation would fail, because there would be no other individuals for the isolation-mutation-carrier to mate with. We now realize that premeiotic mutations are very common and will yield a cluster of progeny carrying the same new mutant allele. In this paper, we discuss the evidence for genetically simple premating isolation barriers and the role that clusters of an isolation mutation may play in initiating allopatric, and even sympatric, species divisions.

Barton, N.H. & S. Rouhani, 1987. The frequency of shifts between alternative equilibria. J. Theoret. Biol. 125: 397–418.

Article   CAS   Google Scholar  

Barton, N.H. & S. Rouhani, 1991. The probability of fixation of a new karyotype in a continuous population. Evolution 45: 499–517.

Article   Google Scholar  

Barton, N.H. & R.D. Keightley, 2002. Understanding quantitative genetic variation. Nat. Rev. Genet. 3: 11–21.

Article   PubMed   CAS   Google Scholar  

Bell, M.A., 2001. Lateral plate evolution in the threespine stickleback: getting nowhere fast. Genetica 112-113: 445–461.

Bengtsson, B.O. & W.F. Bodmer, 1976. On the increase of chromosome mutations under random mating. Theoret. Popul. Biol. 9: 260–281.

Bleiweiss, R., 2001. Mimicry on the QT(L): genetics of speciation in Mimulus . Evolution 55: 1706–1709.

PubMed   CAS   Google Scholar  

Bone, E. & A. Farres, 2001. Trends and rates of microevolution in plants. Genetica 112-113: 165–182.

Bradshaw, W.E. & C.M. Holzapfel, 2001. Genetic shift in photoperiodic response correlated with global warming. Proc. Natl. Acad. Sci. USA, Early Edition.

Google Scholar  

Bradshaw Jr. H.D., S.M. Wilbert, K.G. Otto & D.W. Schemske, 1995. Genetic mapping of floral traits associated with reproductive isolation in monkeyflowers ( Mimulus ). Nature 376: 762–765.

Bradshaw Jr. H.D., K.G. Otto, B.E. Frewen, J.K. McKay & D.W. Schemske, 1998. Quantitative trait loci affecting differences in floral morphology between two species of monkeyflower ( Mimulus ). Genetics 149: 367–382.

Bridges C.B., 1919. The developmental stages at which mutations occur in the germ tract. Proc. Soc. Exp. Biol. Med. 17: 1–2.

Buerkle, C.A., R.J. Morris, M.A. Asmussen & L.H. Rieseberg, 2000. The likelihood of homoploid hybrid speciation. Heredity 84:441–451.

Article   PubMed   Google Scholar  

Bush, G.L., 1994. Sympatric speciation in animals: new wine in old bottles. Trends Ecol. Evol. 9: 285–288.

Bush, G.L. & J.L. Smith, 1998. The genetics and ecology of sympatric speciation: a case study. Res. Popul. Ecol. 40: 175–187.

Butlin, R.K., 1996. Co-ordination of the sexual signalling system and the genetic basis of differentiation between populations in the brown planthopper, Nilaparvata lugens . Heredity 77: 369–377.

Carroll, R.L., 2000. Towards a new evolutionary synthesis. Trends Ecol. Evol. 15: 27–32.

Carson, H.L., 1982. Evolution of Drosophila on the newer Hawaiian volcanoes. Heredity 48: 3–25.

Carson, H.L. & A.R. Templeton, 1984. Genetic revolution in relation to speciation phenomena: the founding of new populations. Ann. Rev. Ecol. Syst. 15: 97–131.

Castle, W.E., 1905. The mutation theory of organic evolution: from the standpoint of animal breeding. Science 21: 521–525.

Castle, W.E., 1929. A mosaic (intense-dilute) coat pattern in the rabbit. J. Exp. Zool. 52: 471–480.

Charlesworth, B. & D. Charlesworth, 1973. Selection of new inversions in multi-locus genetic systems. Genet. Res. 21: 167–183.

Charlesworth, D. & B. Charlesworth, 1975a. Theoretical genetics of Batesian mimicry. I. Single-locus models. J. Theoret. Biol. 55: 283–303.

Charlesworth, D. & B. Charlesworth, 1975b. Theoretical genetics of Batesian mimicry. II. Evolution of supergenes. J. Theoret. Biol. 55: 305–324.

Charlesworth, B. & D.B. Smith, 1982. A computer model of speciation by founder effects. Genet. Res. 39: 227–236.

Chesser, R.K. & R.J. Baker, 1986. On factors affecting the fixation of chromosomal rearrangements and neutral genes: computer simulations. Evolution 40: 625–632.

Cook, L.M., R.L.H. Dennis & G.S. Mani, 1999. Melanic morph frequency in the peppered moth in the Manchester area. Proc. R. Soc. Lond. B 266: 293–297.

Coyne, J.A. & H.A. Orr, 1999. The evolutionary genetics of speciation, pp. 1–36 in Evolution of Biological Diversity, edited by A.E. Magurran & R.M. May. Oxford University Press, Oxford.

Coyne, J.A., N.H. Barton & M. Turelli, 1997. A critique of Se-wall Wright’s shifting balance theory of evolution. Evolution 51: 643–671.

Craig, T.P., J.D. Horner & J.K. Itami, 2001. Genetics, experience, and host-plant preference in Eurosta solidaginis :implications for host shifts and speciation. Evolution 55: 773–782.

Crawford, D.J., T.F. Stuessy, D.W. Haines, M.B. Cosner, M. Silva & P. Lopez, 1992. Allozyme diversity within and divergence among four species of Robinsonia (Asteraceae: Senecioneae), a genus endemic to the Juan Fernandez Islands, Chile. Am. J. Bot. 79: 962–966.

Crozier, R.H., B. Kaufman, M.E. Carew & Y.C. Crozier, 1999. Mutability of microsatellites developed for the ant Camponotus consobrinus . Mol. Evol. 8:271–276.

CAS   Google Scholar  

Danley, P.D. & T.D. Kocher, 2001. Speciation in rapidly diverging systems: lessons form Lake Malawi. Mol. Ecol. 10: 1075–1086.

de Belle, J.S., A.J. Hilliker & M.B. Sokolowski, 1989. Genetic localization of foraging (for) :A major gene for larval behavior in Drosophila melanogaster . Genetics 123: 157–163.

PubMed   Google Scholar  

Dickinson, H. & J. Antonovics, 1973. Theoretical considerations of sympatric divergence. Am. Nat. 107: 256–274.

Dobzhansky, Th. 1937. Genetics and the Origin of Species. Columbia University Press, New York.

Dobzhansky, Th. & S. Wright, 1941. Genetics of natural populations. V. Relations between mutation rate and accumulation of lethals in populations of Drosophila pseudoobscura . Genetics 26:23–51.

Doi, M., M. Matsuda, M. Tomaru, H. Matsubayashi & Y. Oguma, 2001. A locus for female discrimination behavior causing sexual isolation in Drosophila . Proc. Nat. Acad. Sci. USA 98: 6714–6719.

Drake, J.W., B. Charlesworth, D. Charlesworth & J.F. Crow, 1998. Rates of spontaneous mutation. Genetics 148: 1667–1686.

Drost, J.B. & W.R. Lee, 1998. The developmental basis for germline mosaicism in mouse and Drosophila melanogaster, pp. 421–443 in Mutation and Evolution, edited by R.C. Woodruff & J.N. Thompson, Jr. Kluwer Academic Press, Dordrecht.

Chapter   Google Scholar  

Ebstein, R.P., O. Novick, R. Umansky, B. Priel, Y. Osher, D. Blaine, E.R. Bennett, L. Nemanov, M. Katz & R.H. Belmaker, 1996. Dopamine D4 receptor (D4DR) exon III polymorphism associated with the human personality trait of novelty seeking. Nat. Genet. 12: 78–80.

Falconer, D.S. & T.F.C. Mackay, 1996. Introduction to Quantitative Genetics. Longman Group Limited, Essex.

Feder, J.L., 1998. The apple maggot fly, Rhagoletis pomonella, pp. 130–144 in Endless Forms: Species and Speciation, edited by D.J. Howard & S.H. Berlocher. Oxford University Press, Oxford.

Fisher, R.A., 1930a. The Genetical Theory of Natural Selection. Clarendon Press, Oxford.

Fisher, R.A., 1930b. Note on a tricolour (mosaic) mouse. J. Genet. 23:77–81.

Fitzsimmons, N.N., 1998. Single paternity of clutches and sperm storage in the promiscuous green turtle ( Chelonia mydas ). Mol. Ecol. 7: 575–584.

Flichak, K.E., J.B. Roethele & J.L. Feder, 2000. Natural selection and sympatric divergence in the apple maggot Rhagoletis pomonella . Nature 407: 739–742.

Gallez, G.P. & L.D. Gottlieb, 1982. Genetic evidence for the hybrid origin of the diploid plant Stephanomeria diegensis . Evolution 36: 1158–1167.

Gehring, W.J. & K. Ikeo, 1999. Pax 6 mastering eye morphogenesis and eye development. Trends Genet. 15: 371–377.

Gertsch, W.J. & S.B. Peck, 1992. The pholcid spiders of the Galapagos Islands, Ecuador (Araneae: Pholcidae). Can. J. Zool. 70: 1185–1199.

Gittenberger, E., 1988. Sympatric speciation in snails; a largely neglected model. Evolution 42: 826–828.

Gottlieb, L.D., 1984. Genetics and morphological evolution in plants. Am. Nat. 123: 681–709.

Grant, V., 1981. Plant Speciation. Columbia University Press, New York.

Grant, B.R. & P.R. Grant, 1989. Evolutionary Dynamics of a Natural Population: The Large Cactus Finch of the Galapagos. The University of Chicago Press, Chicago.

Grant, P.R. & B.R. Grant, 1997. Genetics and the origin of bird species. Proc. Natl. Acad. Sci. USA 94: 7768–7775.

Grant, B.S., D.F. Owen & C.A. Clarke, 1996. Parallel rise and fall of melanic peppered moths in America and Britain. J. Hered. 87: 351–357.

Haag, E.S. & J.R. True, 2001. From mutants to mechanisms? Assessing the candidate gene paradigm in evolutionary biology. Evolution 55: 1077–1084.

Haidane, J.B.S., 1932. The Causes of Evolution. Longmans & Green, London.

Hall, J.G., 1988. Somatic mosaicism: observations related to clinical genetics. Am. J. Hum. Genet. 43: 355–363.

Hedrick, P.W., 1981. The establishment of chromosomal variants. Evolution 35: 322–332.

Hellberg, M.E., D.P Balch & K. Roy, 2001. Climate-driven range expansion and morphological evolution in a marine gastropod. Science 292: 1707–1710.

Hendry, C.S., 1994. Singing and cryptic speciation in insects. Trends Ecol. Evol. 9: 388–392.

Hendry, A.P., 2001. Adaptive divergence and the evolution of reproductive isolation in the wild: an empirical demonstration using introduced sockeye salmon. Genetica 112-113: 515–534.

Hendry, A.P. & M.T. Kinnison, 1999. Perspective: the pace of modern life: measuring rates of contemporary microevolution. Evolution 53: 1637–1653.

Hendry, A.P, J.K. Wenburg, P. Bentzen, E.C Volk & T.P. Quinn, 2000. Rapid evolution of reproductive isolation in the wild: evidence from introduced salmon. Science 290: 516–518.

Higgie, M., S. Chenoweth & M.W. Blows, 2000. Natural selection and the reinforcement of mate recognition. Science 290: 519–521.

Hollander, W.F., 1944. Mosaic effects in domestic birds. Quart. Rev. Biol. 19: 285–307.

Hollander, W.F., 1975. Sectorial mosaics in the domestic pigeon: 25 more years. J. Hered. 66: 197–202.

Hori, M., 1993. Frequency-dependent natural selection in the handedness of scale-eating cichlid fish. Science 260: 216–219.

Huai, H. & R.C. Woodruff, 1998. Clusters of new identical mutants and the fate of underdominant mutations, pp. 489–505 in Mutation and Evolution, edited by R.C. Woodruff & J.N. Thompson Jr. Kluwer Academic Publishers, Dordrecht.

Huey, R.B., G.W. Gilchrist, M.L. Carson, D. Berrigan & L. Serra, 2000. Rapid evolution of a geographic cline in size in an introduced fly. Science 287: 308–309.

Johnson, P.A., F.C. Hoppensteadt, J.J. Smith & G.L. Bush, 1996. Conditions for sympatric speciation: a diploid model incorporating habitat fidelity and non-habitat assortative mating. Evol. Ecol. 10: 187–205.

Jones, C.D., 1998. The genetic basis of Drosophila sechellia’s resistance to host plant toxin. Genetics 149: 1899–1908.

Jones, C.D., 2001. The genetic basis of larval resistance to a host plant toxin in Drosophila sechellia . Genet. Res. 78: 225–233.

Jones, A.G. & J.C. Avise, 2001. Mating systems and sexual selection in male-pregnant pipefishes and seahorses: insights from microsatellite-based studies on maternity. J. Hered. 92: 150–158.

Jones, A.G., G. Rosenqvist, A. Berglund & J.C. Avise, 1999. Clustered microsatellite mutations in the pipefish Syngnathus typhle . Genetics 152: 1057–1063.

Juberthie, C., 1988. Paleoenvironment and speciation in the cave beetle complex Speonomus delarouzeei (Coleoptera, Bathysci-inae). Int. J. Speleol. 17: 31–50.

Juberthie-Jupeau, L., 1988. Mating behaviour and barriers to hybridization in the cave beetle of the Speonomus delarouzeei complex (Coleoptera, Catopidae, Bathysciinae). Int. J. Speleol. 17:51–63.

Kawecki, T.J., 1996. Sympatric speciation driven by beneficial mutations. Proc. R. Soc. Lond. B 263: 1515–1520.

King, M., 1993. Species Evolution: The Role of Chromosome Change. Cambridge University Press, Cambridge.

King, R.B. & R. Lawson, 1997. Microevolution in island water snakes. BioScience 47: 279–286.

Kinnison, M.T. & A.P. Hendry, 2001. The pace of modern life. II. From rates of contemporary microevolution to pattern and process. Genetica 112-113: 145–164.

Kirkpatrick, M. & V. Ravigne, 2002. Speciation by natural selection: models and experiments. Am. Nat. 159: S22–S35.

Kirkpatrick, M. & M.R. Servedio, 1999. The reinforcement of mating preferences on a island. Genetics 151:865–884.

Krieger, M.J.B. & K.G. Ross, 2001. Identification of a major gene regulating complex social behaviour. Science 295: 328–332.

Lande, R., 1979. Effective deme sizes during long-term evolution estimated from rates of chromosome rearrangement. Evolution 33:234–251.

Lande, R., 1985. The fixation of chromosomal rearrangements in a subdivided population with local extinction and colonization. Heredity 54: 323–332.

Lefebvre, L., S. Viville, S.C. Barton, F. Ishino, E.B. Keverne & M.A. Surani, 1998. Abnormal maternal behaviour and growth retardation associated with loss of the imprinted gene Mest . Nat. Genet. 20: 163–169.

Lessios, H.A., 1998. The first stage of speciation as seen in organisms separated by the Isthmus of Panama, pp. 186–201 in Endless Forms: Species and Speciation, edited by D.J. Howard & S.H. Berlocher. Oxford University Press, Oxford.

Lewis, E.B., 1992. Clusters of master control genes regulate the development of higher organisms. J. Am. Med. Assoc. 267: 1524–1531.

Li, L.-L., E.B. Keverne, S.A. Aparicio, F. Ishino, S.C. Barton & M.A. Surani, 1999. Regulation of maternal behavior and offspring growth by paternally expressed Peg3 . Science 284: 330–333.

Li, W.-H., S. Boissiinot, Y. Tan, S.-K. Shyue & D. Hewett-Emmett, 2000. Evolutionary genetics of primate color vision. Evol. Biol. 32: 151–178.

Lofstedt, C., 1990. Population variation and genetic control of pher-omone communication systems in moths. Entomol. Exp. Appl. 54: 199–218.

Losos, J.B., T.W. Schoener, K.I. Warheit & D. Creer, 2001. Experimental studies of adaptive differentiation in Bahamian Anolis lizards. Genetica 112-113: 399–415.

Lynch, M. & B. Walsh, 1998. Genetics and Analysis of Quantitative Traits. Sinauer Associates, Inc., Sunderland, Massachusetts.

Mackay, T.F.C., 2001. Quantitative trait loci in Drosophila . Nat. Rev. Genet. 2: 11–20.

Macnair, M.R., 1987. Heavy metal tolerance in plants: a model evolutionary system. Trends Ecol. Evol. 2: 354–359.

Macnair, M.R., 1989. The potential for rapid speciation in plants. Genome 31: 203–210.

Marshall, D.C. & J.R. Cooley, 2000. Reproductive character displacement and speciation in periodical cicadas, with description of anew species, 13 year Magicicada neotredecim . Evolution 54: 1313–1325.

Maynard Smith, J., 1983. The genetics of stasis and punctuation. Annu. Rev. Genet. 17: 11–25.

Mayr, E., 1963. Animal Species and Evolution. The Belknap Press of Harvard University Press, Cambridge, Mass.

McCarty, E.M., M.A. Asmussen & W.W. Anderson, 1995. A theoretical assessment of recombinational speciation. Heredity 74: 502–509.

McKenzie, J.A. & P. Batterham, 1995. The genetic, molecular and phenotypic consequences of selection for insecticide resistance. Trends Ecol. Evol. 9: 166–169.

Mitchell-Olds, T., 1995. The molecular basis of quantitative genetic variation in natural populations. Trends Ecol. Evol. 10: 324–325.

Miyatake, T. & T. Shimizu, 1999. Genetic correlations between life-history and behavioral traits can cause reproductive isolation. Evolution 53: 201–208.

Morrow, J., L. Scott, B. Congdon, D. Yeates, M. Frommer & J. Sved, 2000. Close genetic similarity between two sympatric species of tephritid fruit fly reproductively isolated by mating time. Evolution 54: 899–910.

Muller, H.J., 1949. Redintegration of the symposium on genetics, paleontology, and evolution, pp. 421–445 in Genetics, Paleontology and Evolution, edited by G.L. Lepsen, G.G. Simpson & E. Mayr. Princeton University Press, Princeton, NJ.

Muller, H.J., I.I. Oster, & S. Zimmering, 1963. Are chronic and acute gamma irradiation equally mutagenic in Drosophila? pp. 275–311 in Repair from Genetic Radiation Damage and Differential Radiosensitivity in Germ Cells, edited by F.H. Sobels. Pergamon Press, Oxford.

Nachman, M.W. & J.B. Searle, 1995. Why is the house mouse karyotype so variable? Trends Ecol. Evol. 10: 397–402.

Nei, M., 1976. Mathematical models of speciation and genetic distance, pp. 723–765 in Population Genetics and Ecology, edited by S. Karlin & E. Nevo. Academic Press, New York.

Neel, J.V., 1998. Genetic studies at the atomic bomb Casualty Commission-Radiation Effects Research Foundation: 1946-1997. Proc. Natl. Acad. Sci. USA 95: 5432–5436.

Nelson, R.J., G.E. Demas, P.L. Huang, M.C. Fishman, V.L. Dawson, T.M. Dawson & S.H. Synder, 1995. Behavioural abnormalities in male mice lacking neuronal nitric oxide synthase. Nature 378: 383–386.

Nijhout, H.F., 1994. Developmental perspectives on evolution of butterfly mimicry. BioScience 44: 148–157.

Orr, H.A., 1991. Is single-gene speciation possible? Evolution 45: 764–769.

Orr, H.A., 2001. The genetics of species differences. Trends Ecol. Evol. 16: 343–350.

Orr, H.A. & J.A. Coyne, 1992. The genetics of adaptations: a reassessment. Am. Nat. 140: 725–742.

Osborne, K.A., A. Robichon, E. Burgess, S. Butland, R.A. Shaw, A. Coulthard, H.S. Pereira, R.J. Greenspan & M.B. Sokolowski, 1997. Natural behavior polymorphism due to a cGMP-dependent protein kinase of Drosophila . Science 277: 834–836.

Otte, D., 1989. Speciation in Hawaiian crickets, pp. 482–526, in Speciation and its Consequences, edited by D. Otte & J.A. Endler. Sinauer Associates, Inc., Sunderland, Massachusetts.

Peters, G.B., 1982. The recurrence of chromosome fusion in inter-population hybrids of the grasshopper Atractomorpha similis . Chromosoma 85:323–347.

Pialek, J., H.C. Hauffe, K.M. Rodriguez-Clark & J.B. Searle, 2001. Raciation and speciation in house mice from the Alps: the role of chromosomes. Mol. Ecol. 10: 613–625.

Porter, C.A. & J.W. Sites, 1987. Evolution of Sceloporus gram-micus complex (Sauria: Iguanidae) in central Mexico. II. Studies on rates of nondisjunction and the occurrence of spontaneous chromosomal mutations. Genetica 75: 131–144.

Prowell, D.P., 1998. Sex linkage and speciation in Lepidoptera, pp. 309–319 in Endless Forms: Species and Speciation, edited by D.J. Howard & S.H. Berlocher. Oxford University Press, Oxford.

Quinn, T.P., M.T. Kinnison & M.J. Unwin, 2001. Evolution of chinook salmon ( Oncorhynchus tshawytscha )populations in New Zealand: pattern, rate, and process. Genetica 112-113: 493–513.

Raymond, M., C. Berticat, M. Weill, N. Pasteur & C. Chevillon, 2001. Insecticide resistance in the mosquito Culex pipiens: what have we learned about adaptation? Genetica 112-113: 287–296.

Rice, W.R. & E.E. Hostert, 1993. Laboratory experiments on speciation: what have we learned in 40 years. Evolution 47: 1637–1653.

Rieseberg, L.H., 2001. Chromosomal rearrangements and speciation. Trends Ecol. Evol. 16: 351–358.

Rieseberg, L.H., R. Carter & S. Zona, 1990. Molecular tests of the hypothesized hybrid origin of two diploid Helianthus species (Asteraceae). Evolution 44: 1498–1511.

Ritchie, M.G. & S.D.F. Phillips, 1998. The genetics of sexual isolation, pp. 291–308, in Endless Forms: Species and Speciation. edited by D.J. Howard & S.H. Berlocher Oxford University Press, Oxford.

R’Kha, S., P. Capy & J.R. David, 1991. Host-plant specialization in the Drosophila melanogaster species complex: a physiological, behavioral, and genetical analysis. Proc. Natl. Acad. Sci. USA 88: 1835–1839.

Roelofs, W., T. Glover, X.-H. Tang, I. Sreng, P. Robbins, C. Eckenrode, C. Lofstedt, B.S. Hansson & B.O. Bengtsson, 1987. Sex pheromone production and perception in European corn borer moths is determined by both autosomal and sex-linked genes. Proc. Natl. Acad. Sci. USA 84: 7585–7589.

Rogina, B., R.A. Reenan, S.P. Nilsen & S.L. Helfand, 2000. Extended life-span conferred by cotransporter gene mutations in Drosophila . Science 290: 2137–2140.

Ross, K.G. & L. Keller, 1998. Genetic control of social organization in an ant. Proc. Natl. Acad. Sci. USA 95: 14232–14237.

Roundtree, D.B. & H.F. Nijhout, 1995. Genetic control of a seasonal morph in Precis coenia (Lepidoptera: Nymphalidae). J. Insect Physiol. 41: 1141–1145.

Rundle, H.D., L. Nagel, J.W. Boughman & D. Schluter, 2000. Natural selection and parallel speciation in sympatric sticklebacks. Science 287: 306–307.

Russell, L.B., 1964. Genetic and functional mosaicism in the mouse, pp. 153–181 in The Role of Chromosomes in Development, edited by M. Locke. Academic Press, New York.

Sato, A., H. Tichy, C. O’hUigin, P.R. Grant, B.R. Grant & J. Klein, 2001. On the origin of Darwin’s finches. Mol. Biol. Evol. 18: 299–311.

Schalet, A., 1986. The distribution of and complementation relationships between spontaneous X-linked recessive lethal mutations recovered from crossing long-term laboratory stocks of Drosophila melanogaster . Mutat. Res. 163: 115–144.

Schemske, D.W. & H.D. Bradshaw, 1999. Pollinator preference and the evolution of floral traits in monkeyflowers ( Mimulus ). Proc. Natl. Acad. Sci. USA 96: 11910–11915.

Schilthuizen, M., 2001. Frogs, Flies and Dandelions: The Making of Species. Oxford University Press, Oxford.

Schliewen, U., K. Rassmann, M. Markmann, J. Markert, T. Kochers & D. Tautz, 2001. Genetic and ecological divergence of a monophyletic cichlid species pair under fully sympatric conditions in Lake Ejagham, Cameroon. Mol. Ecol. 10: 1471–1488.

Schluter, D., 1998. Ecological causes of speciation, pp. 114–129 in Endless Forms: Species and Speciation, edited by D.J. Howard & S.H. Berlocher. Oxford University Press, Oxford.

Schubart, C.D., R. Diesel & S.B. Hedges, 1998. Rapid evolution to terrestrial life in Jamaican crabs. Nature 393: 363–365.

Searle, J.B., 1998. Speciation, chromosomes, and genomes. Gen. Res. 8: 1–3.

Selby, P.B., 1998. Major impacts of gonadal mosaicism on hereditary risk estimation, origin of hereditary diseases, and evolution, pp. 454–462 in Mutation and Evolution, edited by R.C. Woodruff & J.N. Thompson Jr. Kluwer Academic Publishers, Dordrecht.

Sezer, M. & R.K. Butlin, 1998. The genetic basis of oviposition preference difference between sympatric host races of the brown planthopper ( Nilaparvata lugens ). Proc. R. Soc. Lond. B 265: 2399–2405.

Shaw, D.D., P. Wilkinson & D.J. Coates, 1983. Increased chromosomal mutation rate after hybridization between two subspecies of grasshopper. Science 220: 1165–1167.

Sheppard, P.M., 1954. Evolution in bisexually reproducing organisms, pp. 201–218 in Evolution as a Process, edited by J. Huxley, A.C. Hardy & E.B. Ford. George Allen & Unwin Ltd., London.

Simpson, G.G., 1944. Tempo and Mode in Evolution. Columbia University Press, New York.

Slatkin, M., 1982. Pleiotropy and parapatric speciation. Evolution 36: 263–270.

Smith, T.B., 1993. Disruptive selection and the genetic basis of bill size polymorphism in the African finch Pyrenestes . Nature 363: 618–620.

Soltis, D.E. & P.S. Soltis, 1999. Polyploidy: recurrent formation and genome evolution. Trends Ecol. Evol. 14: 348–352.

Spirito, F., 1998. The role of chromosomal change in speciation, pp. 320–329 in Endless Forms: Species and Speciation, edited by D.J. Howard & S.H. Berlocher. Oxford University Press, Oxford.

Stanley, S.M., 1998. Macroevolution: Pattern and Process. The Johns Hopkins University Press, Baltimore, MD.

Swanson, W.J. & V.D. Vacquier, 1998. Concerted evolution in an egg receptor for a rapidly evolving abalone sperm protein. Science 281: 710–712.

Tabashnik, B.E., Y.-B. Liu, N. Finson, L. Masson & D.G. Heckel, 1997. One gene in diamondback moth confers resistance to four Bacillus thuringiensis toxins. Proc. Natl. Acad. Sci. USA 94: 1640–1644.

Tauber, C.A. & M.J. Tauber, 1987. Inheritance of seasonal cycles in Chrysoperla (Insecta: Neuroptera). Genet. Res. 49: 215–223.

Taylor, M. & R. Feyereisen, 1996. Molecular biology and evolution of resistance to toxicants. Mol. Biol. Evol. 13: 719–734.

Templeton, A.R., 1977. Analysis of head shape differences between two interfertile species of Hawaiian Drosophila . Evolution 31: 630–641.

Templeton, A.R., 1980. The theory of speciation via the founder principle. Genetics 94: 1011–1038.

Templeton, A.R., 1996. Experimental evidence for the genetic transilience model of speciation. Evolution 50: 909–915.

Thoday, J.M. & J.N. Thompson Jr., 1976. The number of segregating genes implied by continuous variation. Genetica 46: 335–344.

Thompson, J.N. Jr., 1975. Quantitative variation and gene number. Nature 259: 665–668.

Thompson, J.N., 1998. Rapid evolution as an ecological process. Trends Ecol. Evol. 13: 329–332.

Thompson, J.N. Jr. & J.M. Thoday, 1979. Quantitative Genetic Variation. Academic Press, New York.

Thompson, J.N. Jr., R.C. Woodruff & Haiying Huai, 1998. Mutation rate: a simple concept has become complex. Environ. Mol. Mutagen. 32: 292–300.

Turelli, M., N.H. Barton & J.A. Coyne, 2001. Theory and speciation. Trends Ecol. Evol. 16: 330–343.

Turner, G.J., 1999. Explosive speciation of African cichlid fishes, pp. 113–129 in Evolution of Biological Diversity, edited by A.E. Magurran & R.M. May. Oxford University Press, Oxford.

Ullerich, F.-H., 1996. Inheritance patterns of new genetic markers and occurrence of spontaneous mosaicism in the monogenic blowfly Chrysomya rufifacies (Diptera: Calliphoridae). Mol. Gen. Genet. 253: 232–241.

Val, F.C., 1977. Genetic analysis of the morphological differences between two interfertile species of Hawaiian Drosophila . Evolution 31: 611–629.

van Batenburg, F.H.D. & E. Gittenberger, 1996. Ease of fixation of a change in coiling: computer experiments on chirality in snails. Heredity 76: 278–286.

van Dam, N.M., J.D. Hare & E. Elle, 1999. Inheritance and distribution of trichome phenotypes in Datura wrightii . Heredity 90: 220–227.

Via, S., 2001. Sympatric speciation in animals: the ugly duckling grows up. Trends Ecol. Evol. 16: 381–390.

Via, S. & D.J. Hawthorne, 1998. The genetics of speciation, pp. 352–364 in Endless Forms: Species and Speciation, edited by D.J. Howard & S.H. Berlocher. Oxford University Press, Oxford.

Vorontsov, N.N. & E.A. Lyapunova, 1989. Two ways of speciation, pp. 221–245 in Evolutionary Biology of Transient Unstable Populations, edited by A. Fontdevila. Springer-Verlag, Berlin.

Voss, S.R. & H.B. Shaffer, 1997. Adaptive evolution via a major gene effect: paedomorphosis in the Mexican axolotl. Proc. Natl. Acad. Sci. USA 94: 14185–14189.

Walsh, J.B., 1982. Rate of accumulation of reproductive isolation by chromosomal rearrangements. Am. Nat. 120: 510–532.

Wayne, M.L., J.B. Hackett, C.L. Dilda, S.V. Nuzhdin, E.G. Pasyukova & T.F.C. Mackay, 2001. Quantitative trait locus mapping of fitness-related traits in Drosophila melanogaster . Genet. Res. 77: 107–116.

Waxman, D. & J.R. Peck, 1998. Pleiotropy and the preservation of perfection. Science 279: 1210–1213.

Wheeler, D.A., C.P. Kyriacou, M.L. Greenacre, Q. Yu, J.E. Rutila, M. Rosbash & J.C. Hall, 1991. Molecular transfer of a species-specific behavior from Drosophila simulans to Drosophila melanogaster . Science 251: 1082–1085.

White, M.J.D., 1978. Modes of Speciation. Freeman, San Francisco, CA.

White, S. & J. Doebley, 1998. Of genes and genomes and the origin of maize. Trends Genet. 14: 327–332.

Williams, S.T., N. Knowlton, L.A. Weigt & J.A. Jara, 2001. Evidence for three major clades within the snapping shrimp genus Alpheus inferred from nuclear and mitochondrial gene sequence data. Mol. Phylogenet. Evol. 20: 375–389.

Wilson, A.B., K. Noack-Kunnmann & A. Meyer, 2000. Incipient speciation in sympatric Nicaraguan crater lake cichlid fishes: sexual selection versus ecological diversification. Proc. R. Soc. Lond. B 267: 2133–2141.

Woodruff, R.C. & J.N. Thompson Jr., 1992. Have premeiotic clusters of mutation been overlooked in evolutionary theory? J. Evol. Biol. 5: 457–464.

Woodruff, R.C., H. Huai & J.N. Thompson Jr., 1996. Clusters of identical new mutation in the evolutionary landscape. Genetica 98: 149–160.

Wright, S., 1941. On the probability of fixation of reciprocal translocations. Am. Nat. 75: 513–522.

Wright, S., 1982. Character change, speciation, and the higher taxa. Evolution 36: 427–443.

Wright, S. & O.N. Eaton, 1926. Mutational mosaic coat patterns of the guinea pig. Genetics 11: 333–351.

Yang, H.-P., A.Y. Tanikawa & A.S. Kondrashov, 2001. Molecular nature of 11 spontaneous de novo mutations in Drosophila melanogaster . Genetics 157: 1285–1292.

Yokoyama, S., R.B. Radlwimmer & N.S. Blow, 2000. Ultraviolet pigments in birds evolved from violet pigments by a single amino acid change. Proc. Natl. Acad. Sci. USA 97: 7366–7371.

Zimmerman, E.C., 1960. Possible evidence of rapid evolution in Hawaiian moths. Evolution 14: 137–138.

Author information

Authors and affiliations.

Department of Biological Sciences, Bowling Green State University, Bowling Green, OH, 43403, USA

R. C. Woodruff

Department of Zoology, University of Oklahoma, Norman, OK, 73019, USA

J. N. Thompson Jr.

Copyright information

© 2002 Springer Science+Business Media Dordrecht

About this chapter

Woodruff, R.C., Thompson, J.N. (2002). Mutation and premating isolation. In: Etges, W.J., Noor, M.A.F. (eds) Genetics of Mate Choice: From Sexual Selection to Sexual Isolation. Contemporary Issues in Genetics and Evolution, vol 9. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0265-3_18

DOI : https://doi.org/10.1007/978-94-010-0265-3_18

Publisher Name : Springer, Dordrecht

Print ISBN : 978-94-010-3958-1

Online ISBN : 978-94-010-0265-3

eBook Packages : Springer Book Archive


The Impact of Social Isolation Essay (Critical Writing)

Social isolation has become a normal state of living for much of the world’s population in recent years. For many people, confinement has been a highly stressful situation that triggered mental health issues, while for others it became an opportunity to learn and grow. The article reviewed here explores the impact of social isolation on a person, describing its effects and the coping mechanisms that can reduce its influence. This essay will summarize the article, adding relevant scientific evidence to support its credibility.

Peterson, the author of the article, focused on the challenges the world faced under the pandemic’s social isolation restrictions, inviting Emilie Kossick, the manager of the Canadian Institute of Public Safety Research and Treatment, to talk about the topic from a doctoral perspective. The expert stated that social isolation was a significant problem long before the lockdowns, giving examples such as the conditions astronauts face and the loneliness of older adults in care facilities, and explained that it did not receive much attention from researchers until COVID-19 (Peterson, 2021). For that reason, much research is now available that can help people endure the effects of limited communication.

Among the physiological consequences of long-term isolation, Kossick points to impaired sleep patterns and personality changes, in particular the development of anxiety and depression. Indeed, multiple studies confirm the risk of declining mental health, especially among children and adolescents, who are constantly in need of socialization (Loades et al., 2020). The expert attributes this impact to decreased brain activity in areas responsible for social skills and emotions. Therefore, social isolation negatively affects brain function, causing multiple mental shifts.

Long-term confinement affects more than social functions: there is also a high possibility of developing chronic diseases as a result of low physical activity and the lack of socialization. Kossick offers the theory that the higher likelihood of stroke, dementia, and heart disease under isolation is attributable to humans having evolved as social creatures (Peterson, 2021). People continually interact with others, whether they want to or not, and sudden deprivation of such socialization radically affects mental and physical health.

Since the brain is not accustomed to functioning without interaction, people experience unpleasant consequences. Studies of quarantined participants confirmed that isolation evokes frustration, disrupts routine, and causes boredom (Brooks et al., 2020). Therefore, the author’s advice is to create a coping strategy to reduce the negative impact of isolation. She proposes planning one’s day consistently, including hobbies and physical activities in daily life, and keeping in touch with friends and family (Peterson, 2021). Even though self-isolation is challenging for everyone during the pandemic, as Kossick stated in the article, it is necessary to learn from such an experience.

The article by Peterson is of utmost importance for all people who have experienced a decline in their mental or physical health during the pandemic. It explains the most essential factors that influence a person’s state and why such changes happen. Moreover, the author offers strategies that can facilitate coping with isolation and reduce its negative impact. Although articles on the pandemic’s evolution are frequently encountered, they rarely focus on the condition of healthy individuals. Mental and physical health must become a general priority during the lockdown, and to this end, Peterson’s article helps explain the symptoms people might experience and suggests mechanisms to remain sane.

Brooks, S. K., Webster, R. K., Smith, L. E., Woodland, L., Wessely, S., Greenberg, N., & Rubin, G. J. (2020). The psychological impact of quarantine and how to reduce it: Rapid review of the evidence. The Lancet , 395 (10227), 912–920. Web.

Loades, M. E., Chatburn, E., Higson-Sweeney, N., Reynolds, S., Shafran, R., Brigden, A., Linney, C., McManus, M. N., Borwick, C., & Crawley, E. (2020). Rapid systematic review: The impact of social isolation and loneliness on the mental health of children and adolescents in the context of COVID-19. Journal of the American Academy of Child & Adolescent Psychiatry , 59 (11). Web.

Peterson, J. (2021). Researchers with Sask. roots explore the impacts of long-term isolation . MSN. Web.

IvyPanda. (2022, June 29). The Impact of Social Isolation. https://ivypanda.com/essays/the-impact-of-social-isolation/

Biology Discussion

Essay on Genetic Variation | Evolution | Species | Biology

Here is an essay on ‘Genetic Variation’ for class 9, 10, 11 and 12. Find paragraphs, long and short essays on ‘Genetic Variation’ especially written for school and college students.

Essay on Genetic Variation

Essay # 1. Meaning of Genetic Variation:

Evolution requires genetic variation. If there were no dark moths, the population could not have evolved from mostly light to mostly dark. For evolution to continue, there must be mechanisms that create or increase genetic variation and mechanisms that decrease it. Mutation is a change in a gene. These changes are the source of new genetic variation. Natural selection operates on this variation.

Genetic variation has two components: allelic diversity and non-random associations of alleles. Alleles are different versions of the same gene. For example, humans can have A, B or O alleles that determine one aspect of their blood type. Most animals, including humans, are diploid: they carry two alleles at every locus, one inherited from their mother and one inherited from their father.

A locus is the location of a gene on a chromosome. Humans can be AA, AB, AO, BB, BO or OO at the blood group locus. If the two alleles at a locus are the same type (for instance, two A alleles), the individual is called homozygous. An individual with two different alleles at a locus (for example, an AB individual) is called heterozygous. At any locus there can be many different alleles in a population, more alleles than any single organism can possess. For example, no single human can have an A, a B and an O allele.

Considerable variation is present in natural populations. At 45 percent of loci in plants there is more than one allele in the gene pool. Any given plant is likely to be heterozygous at about 15 percent of its loci. Levels of genetic variation in animals range from roughly 15% of loci having more than one allele (polymorphic) in birds, to over 50% of loci being polymorphic in insects.

Mammals and reptiles are polymorphic at about 20% of their loci; amphibians and fish are polymorphic at around 30% of their loci. In most populations, there are enough loci and enough different alleles that every individual, identical twins excepted, has a unique combination of alleles.

Linkage disequilibrium is a measure of association between alleles of two different genes. If two alleles are found together in organisms more often than would be expected by chance, the alleles are in linkage disequilibrium. If there are two loci in an organism (A and B) and two alleles at each of these loci (A1, A2, B1 and B2), linkage disequilibrium (D) is calculated as D = f(A1B1) × f(A2B2) − f(A1B2) × f(A2B1), where f(X) is the frequency of X in the population.

D varies between -1/4 and 1/4; the greater the deviation from zero, the greater the linkage. The sign is simply a consequence of how the alleles are numbered. Linkage disequilibrium can be the result of physical proximity of the genes. Or, it can be maintained by natural selection if some combinations of alleles work better as a team.
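
The formula for D can be checked with a short calculation. The sketch below is illustrative only: the function name and the example haplotype frequencies are assumptions chosen for this example, not values from the essay.

```python
def linkage_disequilibrium(f_a1b1, f_a1b2, f_a2b1, f_a2b2):
    """Return D = f(A1B1)*f(A2B2) - f(A1B2)*f(A2B1).

    The four arguments are the population frequencies of the four
    possible haplotypes and must sum to 1.
    """
    total = f_a1b1 + f_a1b2 + f_a2b1 + f_a2b2
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    return f_a1b1 * f_a2b2 - f_a1b2 * f_a2b1


# Hypothetical case 1: alleles at the two loci are independent, so D is zero.
d_independent = linkage_disequilibrium(0.25, 0.25, 0.25, 0.25)

# Hypothetical case 2: A1 always occurs with B1 and A2 with B2,
# which gives the largest possible D of 1/4.
d_complete = linkage_disequilibrium(0.5, 0.0, 0.0, 0.5)

print(d_independent, d_complete)
```

Swapping the labels of the alleles at one locus turns the second case into D = -1/4, which is why the sign carries no biological meaning.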

Natural selection maintains the linkage disequilibrium between color and pattern alleles in Papilio memnon. In this butterfly species, there is a gene that determines wing morphology. One allele at this locus leads to a butterfly that has a tail; the other allele codes for an untailed butterfly. There is another gene that determines whether the wing is brightly or darkly colored.

There are thus four possible types of butterflies: brightly colored butterflies with and without tails, and dark butterflies with and without tails. All four can be produced when the butterflies are brought into the lab and bred. However, only two of these types are found in the wild: brightly colored butterflies with tails and darkly colored butterflies without tails.

The non-random association is maintained by natural selection. Bright, tailed butterflies mimic the pattern of an unpalatable species. The dark morph is cryptic. The other two combinations are neither mimetic nor cryptic and are quickly eaten by birds.

Assortative mating causes a non-random distribution of alleles at a single locus. Under random mating, if there are two alleles (A and a) at a locus with frequencies p and q, the frequencies of the three possible genotypes (AA, Aa and aa) will be p², 2pq and q², respectively. For example, if the frequency of A is 0.9 and the frequency of a is 0.1, the frequencies of AA, Aa and aa individuals will be 0.81, 0.18 and 0.01. This distribution is called the Hardy-Weinberg equilibrium.
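
The worked example above can be reproduced in a few lines. This is a minimal sketch; the function name is an assumption made for illustration.

```python
def hardy_weinberg(p):
    """Genotype frequencies expected under random mating.

    p is the frequency of allele A; the frequency of allele a is q = 1 - p.
    """
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


# The essay's example: frequency of A is 0.9, frequency of a is 0.1.
freqs = hardy_weinberg(0.9)
for genotype, frequency in freqs.items():
    print(genotype, round(frequency, 2))
```

The three frequencies always sum to 1, because p² + 2pq + q² = (p + q)².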

Non-random mating results in a deviation from the Hardy-Weinberg distribution. Humans mate assortatively according to race; we are more likely to mate with someone of our own race than with someone of another. In populations that mate this way, fewer heterozygotes are found than would be predicted under random mating.

A decrease in heterozygotes can be the result of mate choice, or simply the result of population subdivision. Most organisms have a limited dispersal capability, so their mate will be chosen from the local population.
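
How subdivision alone lowers the number of heterozygotes can be sketched numerically. The example assumes two equal-sized subpopulations that each mate at random internally; the allele frequencies and function names are hypothetical, chosen only to make the deficit visible.

```python
def heterozygosity(p):
    """Expected heterozygote frequency 2pq under random mating."""
    return 2 * p * (1 - p)


def observed_and_expected(subpop_freqs):
    """Compare heterozygosity in a subdivided population with the
    value predicted if the whole population mated at random.

    subpop_freqs lists the frequency of allele A in each subpopulation;
    the subpopulations are assumed to be equal in size.
    """
    n = len(subpop_freqs)
    # Each subpopulation mates at random internally, so average their 2pq.
    observed = sum(heterozygosity(p) for p in subpop_freqs) / n
    # Pooling everything gives the mean allele frequency.
    pooled_p = sum(subpop_freqs) / n
    expected = heterozygosity(pooled_p)
    return observed, expected


observed, expected = observed_and_expected([0.9, 0.1])
print(observed, expected)
```

With these assumed frequencies the pooled population shows far fewer heterozygotes than random mating would predict, even though no individual is choosing its mate by genotype.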

For evolution to continue, there must be mechanisms that create or increase genetic variation and mechanisms that decrease it. The mechanisms of evolution are mutation, natural selection, genetic drift, recombination and gene flow.

Essay # 2. Mechanism of Evolution for Genetic Variation:

Genetic Drift:

Allele frequencies can change due to chance alone. This is called genetic drift. Drift is a binomial sampling error of the gene pool. What this means is that the alleles that form the next generation’s gene pool are a sample of the alleles from the current generation. When sampled from a population, the frequency of alleles differs slightly due to chance alone.

Alleles can increase or decrease in frequency due to drift. The average expected change in allele frequency is zero, since increasing or decreasing in frequency is equally probable. A small percentage of alleles may continually change frequency in a single direction for several generations just as flipping a fair coin may, on occasion, result in a string of heads or tails. A very few new mutant alleles can drift to fixation in this manner.
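
Drift as binomial sampling can be simulated directly. The sketch below assumes a simple model with a constant number of diploid individuals and no selection or mutation; the function names, population size and seed are illustrative choices, not part of the essay.

```python
import random


def next_generation_frequency(p, n_individuals, rng):
    """Sample 2N allele copies at random from the current gene pool.

    Each copy is allele A with probability p, so the count of A copies
    follows a binomial distribution.
    """
    copies = 2 * n_individuals
    count_a = sum(1 for _ in range(copies) if rng.random() < p)
    return count_a / copies


def simulate_drift(p0, n_individuals, generations, seed=0):
    """Return the allele frequency in every generation, starting at p0."""
    rng = random.Random(seed)  # seeded so a run can be repeated exactly
    trajectory = [p0]
    for _ in range(generations):
        trajectory.append(
            next_generation_frequency(trajectory[-1], n_individuals, rng)
        )
    return trajectory


trajectory = simulate_drift(p0=0.5, n_individuals=20, generations=30, seed=1)
print(trajectory[-1])
```

Running the simulation with different seeds shows the frequency wandering up in some runs and down in others. Once it reaches 0 or 1 it stays there, which corresponds to loss or fixation of the allele.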

In small populations, the variance in the rate of change of allele frequencies is greater than in large populations. However, the overall rate of genetic drift (measured in substitutions per generation) is independent of population size. If the mutation rate is constant, large and small populations lose alleles to drift at the same rate.

This is because large populations will have more alleles in the gene pool, but they will lose them more slowly. Smaller populations will have fewer alleles, but these will quickly cycle through. This assumes that mutation is constantly adding new alleles to the gene pool and selection is not operating on any of these alleles.
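
The independence from population size can be made concrete with the standard neutral-theory argument, which the essay does not spell out and which is assumed here: in a diploid population of N individuals, 2N × μ new neutral mutations arise each generation, and each has a 1/(2N) chance of eventually drifting to fixation. The function name and the mutation rate below are hypothetical.

```python
from fractions import Fraction


def neutral_substitution_rate(n_individuals, mutation_rate):
    """Substitutions per generation for neutral alleles.

    Assumes a diploid population: 2N gene copies, each mutating at
    mutation_rate, and each new neutral mutation fixing with
    probability 1/(2N).
    """
    new_mutations = 2 * n_individuals * mutation_rate
    fixation_probability = Fraction(1, 2 * n_individuals)
    return new_mutations * fixation_probability


mu = Fraction(1, 1_000_000)  # assumed mutation rate per gene copy per generation
rate_small = neutral_substitution_rate(100, mu)
rate_large = neutral_substitution_rate(1_000_000, mu)
print(rate_small, rate_large)
```

The population size cancels out, so both rates equal the mutation rate. Exact fractions are used so that the cancellation is not blurred by floating-point rounding.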

Sharp drops in population size can change allele frequencies substantially. When a population crashes, the alleles in the surviving sample may not be representative of the pre-crash gene pool. A change of this kind in the gene pool is called the founder effect when it affects small populations of organisms that invade a new territory (founders).

Many biologists feel the genetic changes brought about by founder effects may contribute to isolated populations developing reproductive isolation from their parent populations. In sufficiently small populations, genetic drift can counteract selection. Mildly deleterious alleles may drift to fixation.

Wright and Fisher disagreed on the importance of drift. Fisher thought populations were sufficiently large that drift could be neglected. Wright argued that populations were often divided into smaller subpopulations. Drift could cause allele frequency differences between subpopulations if gene flow was small enough.

If a subpopulation was small enough, the population could even drift through fitness valleys in the adaptive landscape. Then, the subpopulation could climb a larger fitness hill. Gene flow out of this subpopulation could contribute to the population as a whole adapting. This is Wright’s Shifting Balance theory of evolution.

Both natural selection and genetic drift decrease genetic variation. If they were the only mechanisms of evolution, populations would eventually become homogeneous and further evolution would be impossible. There are, however, mechanisms that replace variation depleted by selection and drift.

Recombination:

Each chromosome in our sperm or egg cells is a mixture of genes from our mother and our father. Recombination can be thought of as gene shuffling. Most organisms have linear chromosomes and their genes lie at specific locations (loci) along them. Bacteria have circular chromosomes.

In most sexually reproducing organisms, there are two of each chromosome type in every cell. For instance, in humans every chromosome is paired, one inherited from the mother, the other inherited from the father. When an organism produces gametes, the gametes end up with only one of each chromosome per cell. Haploid gametes are produced from diploid cells by a process called meiosis.

In meiosis, homologous chromosomes line up. The DNA of the chromosome is broken on both chromosomes in several places and rejoined with the other strand. Later, the two homologous chromosomes are split into two separate cells that divide and become gametes. But, because of recombination, both of the chromosomes are a mix of alleles from the mother and father.

Recombination creates new combinations of alleles. Alleles that arose at different times and different places can be brought together. Recombination can occur not only between genes, but within genes as well. Recombination within a gene can form a new allele. Recombination is a mechanism of evolution because it adds new alleles and combinations of alleles to the gene pool.

Gene Flow:

New organisms may enter a population by migration from another population. If they mate within the population, they can bring new alleles to the local gene pool. This is called gene flow. In some closely related species, fertile hybrids can result from interspecific matings. These hybrids can vector genes from species to species.

Gene flow between more distantly related species occurs infrequently. This is called horizontal transfer. One interesting case of this involves genetic elements called P elements. Margaret Kidwell found that P elements were transferred from some species in the Drosophila willistoni group to Drosophila melanogaster.

These two species of fruit flies are distantly related and hybrids do not form. Their ranges do, however, overlap. The P elements were vectored into D. melanogaster via a parasitic mite that targets both these species. This mite punctures the exoskeleton of the flies and feeds on the “juices”.

Material, including DNA, from one fly can be transferred to another when the mite feeds. Since P elements actively move in the genome (they are themselves parasites of DNA), one incorporated itself into the genome of a melanogaster fly and subsequently spread through the species. Laboratory stocks of melanogaster caught prior to the 1940s lack P elements. All natural populations today harbor them.

Essay # 3. Evolution within a Lineage for Genetic Variation:

Evolution is a change in the gene pool of a population over time; it can occur due to several factors. Three mechanisms add new alleles to the gene pool: mutation, recombination and gene flow. Two mechanisms remove alleles: genetic drift and natural selection. Drift removes alleles randomly from the gene pool. Selection removes deleterious alleles from the gene pool. The amount of genetic variation found in a population is the balance between the actions of these mechanisms.

Natural selection can also increase the frequency of an allele. Selection that weeds out harmful alleles is called negative selection. Selection that increases the frequency of helpful alleles is called positive, or sometimes positive Darwinian, selection. A new allele can also drift to high frequency. But, since the change in frequency of an allele each generation is random, nobody speaks of positive or negative drift.

Except in rare cases of high gene flow, new alleles enter the gene pool as a single copy. Most new alleles added to the gene pool are lost almost immediately due to drift or selection; only a small percent ever reach a high frequency in the population. Even most moderately beneficial alleles are lost due to drift when they appear. But, a mutation can reappear numerous times.
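
Why a single new copy is so easily lost can be estimated with a short calculation. The sketch assumes the same random-sampling model used for drift, with a neutral allele present as one copy among 2N; the function name is hypothetical.

```python
import math


def loss_probability_one_generation(n_individuals):
    """Chance that a single-copy neutral allele leaves no copies in the
    next generation.

    Each of the 2N copies in the next generation is drawn at random from
    the current gene pool, so each draw misses the new allele with
    probability 1 - 1/(2N).
    """
    copies = 2 * n_individuals
    return (1 - 1 / copies) ** copies


loss = loss_probability_one_generation(1000)
print(round(loss, 3), round(math.exp(-1), 3))
```

For any reasonably large population the value is close to 1/e, so under these assumptions more than a third of new alleles vanish in the very first generation, before selection has much opportunity to act on them.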

The fate of any new allele depends a great deal on the organism it appears in. This allele will be linked to the other alleles near it for many generations. A mutant allele can increase in frequency simply because it is linked to a beneficial allele at a nearby locus. This can occur even if the mutant allele is deleterious, although it must not be so deleterious as to offset the benefit of the other allele.

Likewise a potentially beneficial new allele can be eliminated from the gene pool because it was linked to deleterious alleles when it first arose. An allele “riding on the coat tails” of a beneficial allele is called a hitchhiker. Eventually, recombination will bring the two loci to linkage equilibrium. But, the more closely linked two alleles are, the longer the hitchhiking will last.
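
The link between tight linkage and long-lasting hitchhiking follows from the standard textbook result that, with random mating, linkage disequilibrium shrinks by a factor of (1 − r) each generation, where r is the recombination fraction between the two loci. That result is assumed here rather than taken from the essay, and the function name and numbers are illustrative.

```python
def ld_after(d0, recombination_fraction, generations):
    """Linkage disequilibrium remaining after a number of generations.

    Uses D_t = D_0 * (1 - r)**t, which assumes random mating and no
    selection, drift or migration.
    """
    return d0 * (1 - recombination_fraction) ** generations


# Hypothetical comparison: the same starting association, ten generations later.
tightly_linked = ld_after(0.25, 0.01, 10)   # loci close together
loosely_linked = ld_after(0.25, 0.5, 10)    # loci far apart or on different chromosomes
print(tightly_linked, loosely_linked)
```

With r = 0.5 the association halves every generation and is almost gone after ten, while with r = 0.01 most of it remains, so a hitchhiking allele stays tied to its neighbor for much longer.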

The effects of selection and drift are coupled. Drift is intensified as selection pressures increase. This is because increased selection (i.e. a greater difference in reproductive success among organisms in a population) reduces the effective population size, the number of individuals contributing alleles to the next generation.

Adaptation is brought about by cumulative natural selection, the repeated sifting of mutations by natural selection. Small changes, favored by selection, can be the stepping-stone to further changes. The summation of large numbers of these changes is macroevolution.

  • Open access
  • Published: 12 June 2024

Strand-resolved mutagenicity of DNA damage and repair

  • Craig J. Anderson 1 ,
  • Lana Talmane 1 ,
  • Juliet Luft   ORCID: orcid.org/0000-0003-2458-9377 1 ,
  • John Connelly 1 , 2 , 3 , 4 ,
  • Michael D. Nicholson   ORCID: orcid.org/0000-0003-2813-8979 5 ,
  • Jan C. Verburg   ORCID: orcid.org/0000-0002-3012-3015 1 ,
  • Oriol Pich 6 ,
  • Susan Campbell 1 ,
  • Marco Giaisi 7 ,
  • Pei-Chi Wei 7 ,
  • Vasavi Sundaram 8 ,
  • Frances Connor   ORCID: orcid.org/0000-0003-2858-9411 9 ,
  • Paul A. Ginno 10 ,
  • Takayo Sasaki 11 ,
  • David M. Gilbert   ORCID: orcid.org/0000-0001-8087-9737 11 ,
  • Liver Cancer Evolution Consortium ,
  • Núria López-Bigas   ORCID: orcid.org/0000-0003-4925-8988 6 , 12 , 13 , 14 ,
  • Colin A. Semple 1 ,
  • Duncan T. Odom   ORCID: orcid.org/0000-0001-6201-5599 9 , 10 ,
  • Sarah J. Aitken   ORCID: orcid.org/0000-0002-1897-4140 2 , 9 , 15 , 16 &
  • Martin S. Taylor   ORCID: orcid.org/0000-0001-7656-330X 1  

Nature (2024)


  • Cancer genomics
  • DNA adducts
  • Genome informatics
  • Nucleotide excision repair
  • Translesion synthesis

DNA base damage is a major source of oncogenic mutations 1 . Such damage can produce strand-phased mutation patterns and multiallelic variation through the process of lesion segregation 2 . Here we exploited these properties to reveal how strand-asymmetric processes, such as replication and transcription, shape DNA damage and repair. Despite distinct mechanisms of leading and lagging strand replication 3 , 4 , we observe identical fidelity and damage tolerance for both strands. For small alkylation adducts of DNA, our results support a model in which the same translesion polymerase is recruited on-the-fly to both replication strands, starkly contrasting the strand asymmetric tolerance of bulky UV-induced adducts 5 . The accumulation of multiple distinct mutations at the site of persistent lesions provides the means to quantify the relative efficiency of repair processes genome wide and at single-base resolution. At multiple scales, we show DNA damage-induced mutations are largely shaped by the influence of DNA accessibility on repair efficiency, rather than gradients of DNA damage. Finally, we reveal specific genomic conditions that can actively drive oncogenic mutagenesis by corrupting the fidelity of nucleotide excision repair. These results provide insight into how strand-asymmetric mechanisms underlie the formation, tolerance and repair of DNA damage, thereby shaping cancer genome evolution.


There is an elegant symmetry to the structure and replication of DNA, in which the two strands separate and each acts as a template for the synthesis of new daughter strands. Despite this holistic symmetry, many activities of DNA are strand asymmetric: (1) during replication, different enzymes mainly synthesize the leading and lagging strands 3 , 4 , 6 , 7 , (2) RNA transcription uses only one strand of the DNA as a template 8 , (3) one side of the DNA double helix is more associated with transcription factors 9 , and (4) alternating strands of DNA face towards or away from the nucleosome core 10 , 11 . These processes can each impart strand asymmetric mutational patterns that reflect the cumulative DNA transactions of the cells in which the mutations accrued 1 , 9 , 10 , 12 , 13 .

Cancer genomes are the result of diverse mutational processes 1 , 14 , often accumulated over decades, making it challenging to identify and subsequently interpret their relative roles in generating spatial and temporal mutational asymmetries. The relative contribution of DNA damage, surveillance and repair processes to observed patterns of mutational asymmetry remains poorly understood, although mapping of DNA damage 15 , 16 , 17 , 18 and repair intermediates 19 , 20 have provided key insights.

To understand the mechanistic asymmetries of DNA damage and repair on a genome-wide basis, we have exploited an established mouse model of liver carcinogenesis 21 , 22 , in which mutations are induced through a single DNA-damaging exposure to diethylnitrosamine (DEN; an alkylating agent that is bioactivated by the hepatocyte-expressed enzyme Cyp2e1). The exposure results in mutagenic DNA base damage, referred to as DNA lesions, that are inherited and resolved as mutations in subsequent cell cycles 2 . This phenomenon of lesion segregation, in which damaged lesion-containing strands segregate into separate daughter cells, results in pronounced, chromosome-scale mutational asymmetry. In a clonally expanded cell population, such as a tumour, this asymmetry can identify which damaged DNA strand was inherited by the ancestor of each tumour (Fig. 1a ). Using this approach, we can determine the lesion-containing strand for approximately 50% of the autosomal genome and the entire X chromosome for each tumour 2 (Extended Data Fig. 1 ). We analysed data from 237 clonally distinct tumours from 98 mice and could resolve the lesion strand for over 7 million base substitution mutations (Fig. 1b ). Most (more than 75%) of the mutations are from T nucleotides on the lesion strand (Fig. 1c ), consistent with previous analyses of DEN-induced tumours 2 , 22 , and biochemical evidence of frequent mutagenic alkylation adducts on thymine 23 .

figure 1

a , Schematic of DNA lesion segregation 2 . Mutagen exposure induces lesions (red triangles) on both DNA strands (forward in blue; reverse in gold). Lesions that persist until replication serve as a reduced fidelity template. The two sister chromatids segregate into distinct daughter cells, so new mutations are not shared between daughter cells of the first division. Lesions that persist for multiple cell generations can generate multiallelic variation through repeated replication over the lesion (in italic). b , Summary of tumour generation and mutations called from whole-genome sequencing (WGS; Methods). c , Lesion strand resolved mutation spectra of all tumours ( n  = 237), representing the relative frequency of strand-specific single-base substitutions and their sequence context (192 categories). d , During the first DNA replication after DNA damage, template lesions (red triangles) are encountered by both the extending leading and the lagging strands. e , The relative enrichment (RE) of liver-expressed genes in the plus versus minus orientation (RE = (plus − minus)/(plus + minus)) across 21 quantile bins of replication fork directionality (RFD) bias ( x axis). f , Mutation rates ( y axis) for the whole genome (gold) stratified into 21 quantile bins of replication strand bias (RSB; x axis) show a higher mutation rate for the lagging strand than the leading strand replication on a lesion-containing template. This effect is enhanced in expressed genes (tan) and negligible in non-genic regions (orange). Whiskers show 95% bootstrap confidence intervals.

The range of mutagenic alkylation adducts generated by activated DEN overlaps those from tobacco smoke exposure, unavoidable endogenous mutagens and alkylating chemotherapeutics such as temozolomide 23 , 24 , 25 . More generally, the mechanism of lesion segregation, which the strand-resolved analysis relies on, appears to be a ubiquitous property of base-damaging mutagens 2 . Here we newly exploit these strand-resolved lesions as a powerful tool to quantify how mitotic replication, transcription and DNA–protein binding mechanistically shape DNA damage, genome repair and mutagenesis.

The mutational symmetry of replication

These well-powered and experimentally controlled in vivo data provide a unique opportunity to evaluate whether DNA damage on the template for leading strand replication results in the same rate and spectrum of mutations as on the lagging strand template. There are several reasons why they might differ. First, leading and lagging strand replication use distinct replicative enzymes 3 , 4 , 6 , 7 , which may differ in how they handle unrepaired damage on the DNA template strand. Second, it is unknown whether the leading and lagging strand polymerases recruit different translesion polymerases, which could generate distinct error profiles. Third, substantially longer replication gaps are expected on the leading strand, if there is polymerase stalling 26 . Consequently, leading and lagging strands are thought to differ in their lesion bypass 5 and post-replicative gap filling 27 , 28 .

On the basis of hepatocyte-derived measures of replication fork directionality (using Repli-seq and OK-seq, see Methods; Extended Data Fig. 2) and patterns of mutation asymmetry, we inferred whether the lesion-containing strand preferentially templated the leading or lagging replication strand (Fig. 1d). This was separately resolved for each genomic locus on a per tumour basis. Our initial analysis demonstrated a significantly higher mutation rate for lagging strand synthesis over a lesion-containing template (Pearson’s correlation coefficient cor = −0.86, P = 3.2 × 10⁻⁹; Fig. 1f). However, gene orientation, and thus the directionality of transcription, also correlates with replication direction 29 , 30 and DEN lesions are subject to transcription-coupled repair (TCR) 2 . We therefore measured transcriptome-wide gene expression in the mouse liver on postnatal day 15 (P15), corresponding to the timing of DEN mutagenesis. This confirmed that the direction of transcription is strongly biased to match replication fork movement, and the effect is disproportionally evident in regions of extreme replication bias (Fig. 1e).
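The relative enrichment statistic used for gene orientation in Fig. 1e, RE = (plus − minus)/(plus + minus), can be sketched in a few lines of Python. This is an illustration only: the function name, the bin labels and the gene counts are hypothetical and are not taken from the study's data or analysis code.

```python
def relative_enrichment(plus: int, minus: int) -> float:
    """RE = (plus - minus) / (plus + minus), bounded between -1 and +1."""
    total = plus + minus
    if total == 0:
        raise ValueError("bin contains no genes")
    return (plus - minus) / total


# Hypothetical counts of expressed genes per replication-bias bin:
# (genes in plus orientation, genes in minus orientation).
bins = {
    "low_RFD_bin": (30, 70),
    "mid_RFD_bin": (50, 50),
    "high_RFD_bin": (70, 30),
}

re_by_bin = {name: relative_enrichment(p, m) for name, (p, m) in bins.items()}

for name, value in re_by_bin.items():
    print(f"{name}: RE = {value:+.2f}")
```

A value of 0 indicates no orientation bias within a bin, whereas values approaching −1 or +1 indicate that nearly all genes in the bin share one orientation.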

To disentangle the effects of transcription from replication, we measured mutation rates, jointly stratifying the genome by transcription state, replication strand bias, replication timing and genic annotation (Fig. 1f and Extended Data Fig. 3). Although transcribed regions exhibit a strong correlation of mutation rate with replication strand bias (Pearson’s cor = −0.86, P = 3.1 × 10⁻⁷), genome-wide multivariate regression shows that the strongest independent effect on the DEN-induced mutation rate is transcription over the lesion-containing strand (P < 1 × 10⁻³⁰⁰), followed by replication time (P = 6 × 10⁻¹⁶²). As mismatch repair is biased towards earlier replicating genomic regions 31 , it may be partially responsible for correcting some mismatch–lesion heteroduplexes. We considered genic and non-genic regions of the genome across 21 quantiles of replication timing and found that, although there is a correlation between mutation rate and replication time supportive of mismatch repair, its role is minor relative to TCR (Extended Data Fig. 4). Replication strand bias has the smallest effect on mutation rate of tested measures (Extended Data Fig. 3j). Outside of genic regions, the correlation of replication strand bias with mutation rate is negligible (Fig. 1f and Extended Data Fig. 3j). This unexpected consistency in the rate of mutations generated by replication over alkyl lesions points to a shared mechanism of lesion bypass for the leading and lagging strands, possibly involving recruitment of the same translesion polymerases.

Strand-resolved collateral mutagenesis

It has been proposed that when translesion polymerases replicate across damaged bases, they can generate proximal tracts of low-fidelity synthesis 32 , 33 , 34 . In bacteria and yeast, this mechanism produces clusters of mutations 35 , 36 and such collateral mutagenesis has recently been reported in vertebrates 37 . Consistent with these models, we found that mutations within 10 nt of each other are significantly elevated over permuted expectation (two-sided Fisher’s test, odds ratio 11.9, P < 2.2 × 10⁻¹⁶). This enrichment is most pronounced at 1–2 nt spacing, decreases after one DNA helical turn (approximately 10 nt) and decays to background within 20 nt (Fig. 2a and Extended Data Fig. 5). These short clusters are overwhelmingly isolated pairs of mutations (98% pairs, 2% trios) phased on the same chromosome (Extended Data Fig. 5e).
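To make the notion of closely spaced mutations concrete, the sketch below groups mutation positions on one chromosome into clusters and tallies pairs and trios. It is a simplified illustration, not the study's method: the chaining rule (consecutive mutations no more than 10 nt apart), the function name and the example positions are all assumptions, and the comparison against a permuted expectation is not reproduced here.

```python
from collections import Counter


def find_clusters(positions, max_gap=10):
    """Chain sorted positions whose neighbouring gaps are <= max_gap.

    Returns only groups containing two or more mutations.
    """
    clusters, current = [], []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the current group.
        if current and pos - current[-1] > max_gap:
            if len(current) > 1:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) > 1:
        clusters.append(current)
    return clusters


# Hypothetical mutation coordinates from a single tumour and chromosome.
positions = [100, 102, 500, 1200, 1207, 1209, 5000]

clusters = find_clusters(positions)
size_counts = Counter(len(c) for c in clusters)

print(clusters)      # groups of nearby mutations
print(size_counts)   # number of pairs, trios, ...
```

In a real analysis the observed cluster counts would then be compared with counts obtained after permuting mutations between tumours, as described in the text.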

figure 2

a , Closely spaced mutations (brown) occur more frequently than expected based on permutation of mutations between tumours (pink; bootstrap 95% CI is shaded, too small to visualize). b , Residual mutation signature (after subtracting expected mutations) for cluster upstream mutations. Cluster orientation by the lesion-containing strand (red dashed line; Methods). c , Residual signature of downstream cluster mutations, plotted as per b . d , Schematic illustrating mutagenic translesion synthesis (TLS) (yellow circle) and collateral mutagenesis (brown circle). e , Substitutions are highly clustered downstream of 1 bp deletions. The inset shows the density plot for 10,000 random permutations of lesion strand assignment (grey) compared with the observed level of upstream/downstream bias. Only clusters where the substitution could be definitively assigned to an upstream or downstream location were considered. Two-sided P values were empirically derived from the permutations. nt, nucleotide. f , Single-base insertions are also clustered with substitutions, but biased to upstream of the insertion; plotted as per e . g , One-base pair deletions with a downstream substitution within 10 bp (left panel) show significant bias towards deletion of T (rather than A) from the lesion-containing strand compared with the rate genome wide (centre panel, two-sided Fisher’s exact test odds ratio = 16.5, P = 1.04 × 10⁻¹⁶). Downstream substitutions are also highly distorted from the genome-wide profile (two-sided Chi-squared test P = 8.5 × 10⁻⁴⁶). By contrast, insertion mutations and their proximal substitutions resemble the genome-wide profiles, with the notable additional contribution from the G→T substitutions (*) that also associate with both substitution and 1 bp deletion clusters. h , The rate of mutation clusters is not correlated with replication strand bias; consistently, approximately 0.8% of substitution mutations are found in clusters spanning 10 nt or fewer, indicating a similar rate of TLS for both the leading and the lagging strands.

We oriented the clusters by their lesion-containing strand, and designated the first mutation site to be replicated over on the lesion-containing template as the upstream (5′) mutation and subsequent mutations were designated downstream (3′). Upstream mutations showed a mutation spectrum closely resembling the tumours as a whole (Fig. 2b and Extended Data Fig. 5a,b,i ), indicating that it represents a typical lesion-templated substitution.

By contrast, downstream mutations have distinct mutation spectra (Extended Data Fig. 5c ). Those located more than two nucleotides downstream show a strong preference for G→T substitutions (Fig. 2c and Extended Data Fig. 5h,l–n ). As mutations are called relative to the lesion-containing template strand, this indicates the preferential misincorporation of A nucleotides opposite a template G nucleotide, thus newly revealing the intrinsic error profile of an extending translesion polymerase. Mutation pairs with closer spacing (2 nt or fewer) exhibit somewhat divergent mutation signatures (Extended Data Fig. 5h,j,k ), probably reflecting both sequence-composition constraints and processes such as the transition between alternate translesion polymerases (Fig. 2d ).
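The inference from a strand-resolved substitution to the nucleotide that the polymerase inserted follows from base complementarity, and can be sketched as below. The function name and return structure are hypothetical; the only inputs are the reference and mutant bases as called on the lesion-containing template strand.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def misincorporation(template_ref: str, template_alt: str) -> dict:
    """Infer what was inserted opposite a template base.

    A substitution ref -> alt called on the template strand means the
    newly synthesized strand carries the complement of alt at that site,
    where faithful synthesis would have placed the complement of ref.
    """
    if template_ref == template_alt:
        raise ValueError("ref and alt are identical; no substitution")
    return {
        "template_base": template_ref,
        "expected_incorporation": COMPLEMENT[template_ref],
        "observed_incorporation": COMPLEMENT[template_alt],
    }


# G->T on the lesion-containing template strand.
result = misincorporation("G", "T")
print(result)
```

For the G→T case, the expected incorporation opposite template G is C, and the observed incorporation is A, matching the interpretation given in the text.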

Extending these observations of collateral translesion mutagenesis, we found significant clustering of insertion and deletion mutations with base substitutions (insertion/deletion mutation within 100 bp of a substitution, two-sided Fisher’s test odds ratio 103, P < 2.2 × 10⁻¹⁶ compared with permuted expectation; Fig. 2e,f and Extended Data Fig. 6a–i). Single-base deletions preferentially remove T nucleotides from the lesion strand both genome wide and in mutation clusters (Fig. 2g; two-sided Fisher’s test odds ratio 16.5, P = 1.04 × 10⁻¹⁶), which indicates a base-skipping mode of lesion bypass. These single-base deletions are associated with downstream substitutions within 10 nt that include the G→T substitutions already identified as a signature of collateral translesion mutagenesis, but more prominently a distinct substitution signature of A→C on the lesion strand (Fig. 2g). In contrast to deletions, nucleotide insertions are clustered downstream of typical DEN adduct-induced base substitutions, pointing to collateral insertion mutagenesis by translesion polymerases (Fig. 2g and Extended Data Fig. 6h,i).

Three lines of evidence support a model in which the same translesion polymerases are recruited with equal efficiency and processivity to both the leading and the lagging strands. First, the leading and lagging strands have essentially identical relative rates of mutation clusters (Fig. 2h). Second, the mutation spectra of the downstream mutations are the same (Extended Data Fig. 5o). Third, the length distribution of clusters matches between leading strand-biased and lagging strand-biased regions (no significant difference in size distribution by Kolmogorov–Smirnov test, P = 0.15, despite more than 98% power to detect a difference in the distribution of cluster lengths of 4% or more; Extended Data Fig. 5p,q).

Having established the replicative symmetry of damage-induced mutagenesis and determined the relative contributions of replication and transcription on mutation rate, we next looked in detail at the pronounced strand-specific effects of transcription on DNA repair and mutagenesis.

Multiallelism reveals repair kinetics

Using liver RNA sequencing data (P15 mice), we found that nascent transcription estimates provide a better correlation with mutation rate than steady-state transcript levels (Extended Data Fig. 7a–d ), as expected 8 . Increased transcription decreases the mutation rate for template strand lesions up to an expression level of ten nascent transcripts per million (Fig. 3a,b ). Beyond this, the mutation rate plateaus and is not further reduced by additional transcription, suggesting that the remaining mutagenic lesions are largely invisible to TCR (Extended Data Fig. 7c,d ).

figure 3

a , DNA lesions (red triangles) on the transcription template strand can cause RNA polymerase to stall and trigger transcription-coupled NER. Cells that inherit the template strand of active genes have a depletion of mutations through the gene body. b , Mutation rate ( y axis) for individual genes relative to their nascent transcription rate ( x axis) estimated from intronic reads. Mutation rates for each gene ( n = 3,392) are calculated separately for template (orange) and non-template (black) strand lesions. The curves show best-fit splines. Genes are grouped into six expression strata (used in subsequent analyses), indicated by the density distribution (top). TPM, transcripts per million. c , Mutation rates for genes grouped into expression strata (1–6; top axis), calculated separately for template strand lesions (orange) and non-template strand lesions (black). Whiskers indicate 95% bootstrap confidence intervals (too small to resolve). Labels indicate data used in subsequent mutation spectra panels ( d , e ). d , Despite similar mutation rates, the mutation spectrum differs between non-template lesion stratum 6 (nl6) and template lesion stratum 2 (tl2). e , Permutation testing confirms that the mutation spectra differ between the transcription template and the non-template strand, even when overall mutation rates are similar. Comparison of tl2 and nl6 mutation spectra (red) and after gene-level permutation of categories. n = 10⁵ permutations (grey). f , Lesions (red triangles) that persist for multiple cell generations can generate multiallelic variation through repeated replication over the lesion. g , Lesions rapidly removed by NER persist for fewer cell cycles, generating less multiallelic variation. h , The multiallelic rate ( y axis) for template strand lesions (orange) is reduced with increasing transcription ( x axis). The same is apparent for non-template lesions (black), indicating that enhanced repair of non-template lesions is also associated with greater transcription. Whiskers show bootstrap 95% confidence intervals.

Unexpectedly, the non-template strands of genic regions also showed a modest reduction in mutation rate with increased transcription (Fig. 3c ), but the resulting mutation signature differs from that on the template strand. This discordance suggests that cryptic antisense transcription is not responsible (Fig. 3d,e and Extended Data Fig. 7e–j ) and that there is either (1) enhanced (non-TCR) surveillance of lesions on the non-template strand or (2) generally reduced alkylation damage to transcriptionally active regions.

We used another insight from lesion segregation to disentangle patterns of differential damage from differential repair. As DNA lesions from DEN treatment, as with all other tested mutagens 2 , can persist for multiple cell cycles, each round of replication could incorporate a different incorrectly paired nucleotide opposite a persistent lesion. This results in multiallelic variation: multiple alleles at the same genomic position within a tumour 2 (Figs. 1a and  3f ). Lesions in efficiently repaired regions will persist for fewer generations and therefore have fewer opportunities to generate multiallelic variation, so are expected to exhibit lower multiallelic rate (the fraction of mutations with multiallelic variation) than less efficiently repaired regions (Fig. 3g ). By contrast, differential rates of damage, although influencing overall mutation rate, do not systematically distort the persistence of an individual lesion, so would have no influence on rates of multiallelic variation.
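The multiallelic rate defined above (the fraction of mutations with multiallelic variation) can be illustrated with a short sketch. The input representation, one set of observed alternate alleles per mutated site, and the two example regions are assumptions made for illustration; they are not data from the study.

```python
def multiallelic_rate(alt_alleles_per_site):
    """Fraction of mutated sites carrying more than one alternate allele.

    Each element is the set of distinct non-reference alleles observed
    at one mutated genomic position within a tumour.
    """
    mutated = [alts for alts in alt_alleles_per_site if len(alts) >= 1]
    if not mutated:
        raise ValueError("no mutated sites supplied")
    multiallelic = sum(1 for alts in mutated if len(alts) > 1)
    return multiallelic / len(mutated)


# Hypothetical regions: lesions persisting longer give more multiallelic sites.
slowly_repaired = [{"A", "C"}, {"A"}, {"C", "G"}, {"A"}]
efficiently_repaired = [{"A"}, {"C"}, {"A"}, {"A", "G"}]

slow_rate = multiallelic_rate(slowly_repaired)
fast_rate = multiallelic_rate(efficiently_repaired)

print(f"slowly repaired region: {slow_rate:.2f}")
print(f"efficiently repaired region: {fast_rate:.2f}")
```

Because the rate is normalized by the number of mutated sites, a region that simply receives more damage, without any change in lesion persistence, would have more mutations but the same multiallelic rate, which is the property the argument relies on.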

Whether mutation suppression on the non-template strand is caused by enhanced repair or reduced damage can now be established through the comparison of multiallelic variation rates. For lesions on the template strand, multiallelic rate decreases with increased transcription (Fig. 3h ), reflecting the progressive removal of lesions across multiple cell cycles by TCR, as expected. The multiallelic rate for non-template strand lesions is also reduced with greater transcription (Fig. 3h ), revealing enhanced repair rather than decreased damage. Combined with the distinct repair signature of the two strands (Fig. 3d,e and Extended Data Fig. 7j ), this demonstrates that in expressed genes, there is transcription-associated repair activity of the non-template strand, in addition to the template strand-specific TCR. We speculate that this may reflect enhanced global nucleotide excision repair (NER) surveillance in the more open chromatin of transcriptionally active genes.

Steric influences on damage and repair

Transcription-associated repair of non-template lesions (Fig. 3h ) highlights the importance of DNA accessibility for repair of DNA damage. Although it is well established that mutation rate is correlated with nucleosome positioning and transcription factor binding 7 , 9 , 10 , 11 , 38 , our lesion strand resolved measures of mutation and multiallelic rate provide an opportunity to deconvolve the contributions of differential damage from repair in these genomic contexts.

We quantified the DNA accessibility landscape of the genome using ATAC-seq (in the P15 mouse liver; Methods), and annotated it using experimentally defined transcription factor binding (including chromatin immunoprecipitation followed by sequencing (ChIP–seq) mapping of CTCF binding in the P15 mouse liver; Methods) and pre-existing maps of nucleosome positioning 39 . In all contexts, we found that greater DNA accessibility corresponds to both reduced mutation rate and reduced rate of multiallelic variation, implicating the efficient repair of accessible DNA as a major determinant of damage-induced mutation rate (Fig. 4a,b ). Indeed, the 10 bp periodicity of mutations in nucleosome-wrapped DNA, as previously seen for other mutagens 11 , 40 , is recapitulated by the multiallelic rate variation that we identified (Extended Data Fig. 8a–c ).

figure 4

a , Nucleosome occupancy shapes the mutational landscape 56 , 57 , with higher mutation rates (21 bp sliding window) over the nucleosomes (for example, x  = 0), and lower rates in more-accessible linker regions (accessibility measured by ATAC-seq from P15 mouse liver, in purple with scale on the right axis and larger values corresponding to greater accessibility). Mutation and multiallelic rates are shown with shaded 95% bootstrap confidence intervals (also in subsequent panels). b , High rates of multiallelic variation are found at sites of low accessibility and high mutation rate, indicating that high rates of mutation represent slow repair. c , The rate of A→N mutations is the inverse of the overall mutation profile, with high rates of A→N corresponding to accessible regions and rapid repair. d , Mutation rates are dramatically elevated at CTCF-binding sites (21 bp sliding window, in black; single-base resolution, in red). e , High accessibility at CTCF sites again corresponds to low multiallelic variation and low mutation rates ( d ), with the exception of the mutation hotspot (red arrow), which does not show a corresponding increase in multiallelism, indicating that higher rates of damage cause these hotspots. f , Mutations of A→N closely track DNA accessibility.

Sequence-specific binding proteins, such as transcription factors and CTCF, interact with DNA more transiently than nucleosomes 41 . We found reduced mutation rates and multiallelic variation adjacent to and across their binding sites compared with genome-wide averages (Extended Data Fig. 8h–j ), suggesting that transient binding is not a strong impediment to repair processes. High information content nucleotides in sequence-specific binding motifs show exceptionally reduced mutation rates that are not accompanied by corresponding decreases in multiallelic variation (Extended Data Fig. 8i,j ). This discordance is consistent with reduced damage (rather than enhanced repair) in these sites. Given the close contacts made between the bases and proteins in these motifs, it raises the possibility that binding proteins offer some protection from lesion formation. Uniquely, the CTCF-binding footprint contains specific sites that exhibit pronounced, lesion strand-specific elevations of mutation rate that are not accompanied by increased multiallelic variation (Fig. 4d,e and Extended Data Fig. 8e–g ). This suggests that in this case, the elevated mutation is due to elevated DNA damage, rather than primarily a consequence of suppressed repair.

We identified an anomalous enrichment of apparent A→N mutations in genomic loci that showed highly efficient repair for other nucleotides (Fig. 4c,f ). These accessible loci include those adjacent to CTCF and transcription factor-binding sites and linker DNA between nucleosomes (Fig. 4 and Extended Data Figs. 8d and  9 ). This enrichment of A→N mutations extends into sequence-specific binding sites (Extended Data Figs. 8c and  9e,f ). A possible explanation for the enrichment of A→N mutations is that, in some circumstances, the activity of NER is itself mutagenic.

Nucleotide excision repair is mutagenic

We propose a mechanistic model for mutagenic NER, arising when two lesions occur in close proximity, but on opposite strands of the DNA duplex. Repair of one lesion, which entails excision of an approximately 26 nt single-stranded segment containing the lesion 42 , 43 , would leave a single-stranded gap containing the second lesion on the opposite strand; resynthesis using this as a template would necessitate replication over that remaining lesion (Fig. 5a ). As a result, nucleotide misincorporation opposite a T lesion in the single-stranded gap would be erroneously interpreted as a mutation from an A lesion (Fig. 5a ) when phasing lesion segregation. We subsequently refer to this mechanism as translesion resynthesis-induced mutagenesis (TRIM), or NER-TRIM specifically in the context of NER.
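The strand logic of this model can be traced in a small sketch. It is a toy illustration under stated assumptions: the excised segment is treated as 26 nt centred on the repaired lesion (the centring is a simplification, not taken from the text), and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
EXCISION_LENGTH = 26  # approximate excised segment length given in the text


def within_excision_gap(repaired_pos, opposite_pos, length=EXCISION_LENGTH):
    """True if the opposite-strand lesion lies inside the excision gap.

    Assumes, for simplicity, a gap centred on the repaired lesion.
    """
    return abs(opposite_pos - repaired_pos) <= length // 2


def apparent_mutation(opposite_lesion_base, inserted_base):
    """Mutation as it would be called on the repaired (phased) strand.

    Resynthesis copies the opposite strand, so the repaired strand's
    reference base is the complement of the opposite-strand lesion base.
    Returns None when resynthesis was faithful.
    """
    ref = COMPLEMENT[opposite_lesion_base]
    if inserted_base == ref:
        return None
    return (ref, inserted_base)


# A T lesion on the opposite strand, 8 nt from the repaired lesion,
# with G inserted opposite it during repair resynthesis.
in_gap = within_excision_gap(repaired_pos=1000, opposite_pos=1008)
call = apparent_mutation("T", "G")

print(in_gap, call)
```

In this example the misincorporation opposite a T lesion is recorded as an A→G change on the phased strand, which is how an A→N mutation can appear within an otherwise T→N segment.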

figure 5

a , Mechanism of NER translesion resynthesis-induced mutagenesis (NER-TRIM). Lesion-containing single-stranded DNA is excised and consequently a residual lesion in close proximity on the opposite strand would be used as a low-fidelity template for repair synthesis. This creates isolated mutations with opposite strand asymmetry to the genomic locality (for example, A→N within a T→N segment). Most lesion-induced mutations are not shared between daughter lineages, whereas those from NER-TRIM can be shared (black arrow). b , The rate of A→N mutations on the genic template strand increases with gene expression, mirroring the decrease in mutations from other bases due to TCR. The relative difference ( y axis) in mutation rate for each nucleotide is (obs − exp)/(obs + exp); exp is the mutation rate for that nucleotide in non-expressed genes, and obs is the rate observed in the body of genes with the indicated expression level ( x axis). Rates shown for lesions on the transcription template strand, with 95% confidence interval (shaded areas) from 100 bootstrap samples of genes. c , Schematic illustrating the generation of a mutationally symmetric tumour through the survival of both post-mutagenesis daughter genomes. NER-TRIM mutations in symmetric tumours will be characterized by abnormally high VAF as they will be shared by both contributing genomes (Extended Data Fig. 10b ). d , Contingency table illustrating the enrichment of mutations with high VAF (0.995–1.0 quantile) in highly expressed genes of mutationally symmetric tumours ( n  = 8) compared with asymmetric tumours ( n  = 237). Statistical significance by two-tailed Fisher’s exact test. e , Symmetric tumours are highly enriched for high VAF mutations in highly expressed genes. Odds ratios ( y axis) are as in d , for VAF quantile bins of 0.005 ( x axis). The black arrow shows the odds ratio calculated in d .

As NER-TRIM requires lesions on both DNA strands, mutagenic NER can only occur when both lesion-containing strands are duplexed, for example, in the first cell generation following DEN mutagenesis; NER-TRIM would not occur in daughter cells with only one lesion-containing strand per duplex. It follows that regions with the highest — and thus fastest — repair rates are most likely to experience NER-TRIM. This prediction is consistent with our observation of local enrichment of apparent A-lesion mutations in accessible regions with otherwise low rates of mutations and low multiallelic variation (Fig. 4c,f ).

Local gradients in repair efficiency are also expected to lead to enrichment of NER-TRIM. The most efficient repair that we observed is transcription-coupled NER, in which there is a steep gradient of repair efficiency between the template and non-template strands. There is a pronounced increase in the rate of apparent A→N mutations on the template strand of expressed genes, whose sigmoidal profile closely mirrors the decrease in T→N mutations on the same strand (Fig. 5b ). The saturation of repair at higher expression levels is reflected in a corresponding saturation of NER-TRIM, demonstrating that the rate of template strand A→N mutations is not simply dependent on transcription, but on TCR.

Similar local gradients of repair can also explain the elevated rate of A→N mutations in CTCF and transcription factor-binding sites (Extended Data Fig. 9e,f ), where nucleotides adjacent to the binding site are more accessible than those within the binding site. High-efficiency repair of the accessible DNA would result in an excision gap that extends into the binding site, where a more protected lesion then serves as a template for repair resynthesis.

The TRIM origin of twin sister tumours

A subset of tumours in our dataset provided an opportunity to directly test further predictions of this NER-TRIM model and demonstrated a remarkable propensity for NER-TRIM mutagenesis to drive oncogenic transformation. Of the complete set of DEN-induced tumours 2 , 2% (8 of 371) exhibited the same mutation spectra as other tumours but completely lacked the mutational asymmetry of lesion segregation (Extended Data Fig. 10a ). This pattern is expected to result from the persistence of mutations derived from lesions on both strands (Fig. 5c and Extended Data Fig. 10b ). On the basis of extensive genomic and histological evidence (Extended Data Fig. 10c–h ), we conclude that these eight mutationally symmetrical tumours are each made up of two diploid sister clones derived from both daughters of a mutagenized cell.

Lesion segregation predicts that mutations will be independent and not shared between sister clones (Fig. 1a ). However, mutations arising from NER-TRIM are expected to be shared between sister clones (Fig. 5a ). The variant allele frequency (VAF) of a somatic mutation is proportional to the fraction of cells in the tumour that contain the mutation. Consequently, we expect the VAF of shared mutations derived from NER-TRIM to be approximately twice that of mutations found in only one of the two daughter cell lineages. Owing to the absence of mutational asymmetry in these eight tumours, it is not possible to define which individual mutations arose from NER-TRIM. However, as we have shown that NER-TRIM is enriched in highly expressed genes, we tested whether high VAF mutations were biased to those regions in the symmetrical tumours ( n  = 8) compared with the asymmetric tumours ( n  = 237). Our results demonstrated a pronounced and significant enrichment, as we predicted, both in aggregate (odds ratio 2.84, two-tailed Fisher’s test P  = 8.7 × 10 −113 ; Fig. 5d ) and individually for each tumour (Fig. 5e ), confirming expectations of the NER-TRIM model.

Finally, we note that in the symmetrical sister-clone tumours, the oncogenic driver mutations in the MAPK pathway that typify these DEN-induced tumours 2 , 22 are all significantly biased to the highest VAF mutations, in contrast to the driver mutations in the asymmetric tumours ( P  = 3.61 × 10 −5 two-tailed Wilcoxon rank-sum test, Bonferroni corrected; Extended Data Fig. 10i–y ). This suggests that driver mutations in the symmetrical tumours arose through NER-TRIM and may explain the co-evolution of both sister clones in a single tumour.

In damaged DNA, most mutations arise from replication bypass of unrepaired lesions, which can result in chromosome-scale mutational asymmetry 2 . We leveraged this discovery to explore the mechanisms of mutagenesis and repair in vivo at high resolution, with single-base, single-strand specificity. The persistence of DNA lesions for multiple cell generations leads to the generation of multiallelic variation, its quantification providing insight into repair kinetics that allowed us to discriminate the relative contributions of initial damage from subsequent repair in shaping mutation rate patterns.

It has long been expected that the asymmetry of leading and lagging strand replication would lead to asymmetric replication fidelity on damaged DNA 27 , 28 , 44 , 45 , and analysis of UV-induced mutation patterns supports that expectation 5 , 12 . However, our system, with over 7.2 × 10 6 lesion strand-resolved mutations and cell-type-matched measures of replication strand bias, means we are uniquely powered to question the generality of this model. Contrary to expectation, we found a remarkable symmetry of mutation rate for leading and lagging strand replication. Matched patterns of collateral mutagenesis — proximal downstream mutations thought to arise from continued synthesis by translesion (TLS) polymerases 37 — point to the recruitment of identical TLS polymerases for the bypass of small alkylation adducts on both replication strands.

Our deeper exploration of mutation clusters demonstrates spatial shifts in mutation signature 3 bp downstream of nucleotides misincorporated opposite damaged bases, supporting a model for the hand-off between TLS polymerases 46 , 47 . We also provide evidence of competition between TLS polymerases. Single-base deletions, like base substitutions, are strongly strand asymmetric. This implicates the skipping of damaged template bases (−1 frameshifting), which in vitro studies show is common for some of the TLS polymerases such as polymerase-κ 48 . These skipping versus low-fidelity incorporation mechanisms of lesion bypass are associated with highly distinct signatures of downstream collateral mutations, arguing that the alternate outcomes reflect the recruitment of distinct combinations of TLS polymerases. The contrast in mutation asymmetry that we found between replication over UV and DEN damage suggests at least two available strategies of mutagenic translesion bypass in mammalian cells: for example, re-priming followed by gap-filling 49 , leading to replication strand asymmetric mutagenesis, versus on-the-fly bypass 28 , which results in replication strand symmetric mutagenesis. The balance between these strategies probably varies between different types of damage.

Although we found that replication strand biases do not influence the rate of mutations from alkylation damage, both transcription and DNA accessibility have large effects. To better understand how these other features of the genome influence mutation rates, we analysed multiallelic variation as a powerful means to infer the relative kinetics of repair, and disentangle differential damage from differential repair across the genome. This reveals the transcription-associated repair of genic non-template strands, in addition to the well-established TCR of the template strand 8 . Beyond the effects of transcription, the mutational landscape of damaged genomes closely tracks DNA accessibility. This pattern is mirrored by the rate of multiallelic variation, thus providing in vivo evidence that more efficient repair of accessible DNA, rather than differential DNA damage, is primarily responsible for shaping the distribution of damage-induced mutations.

There are, however, some exceptions to the dominance of repair. We found that within transcription factor-binding sites, close contact between high-information-binding site nucleotides and sequence-specific binding proteins shows evidence of providing protection from base damage. By contrast, a subset of nucleotides specifically within CTCF-binding sites exhibit dramatically elevated mutation rates, and lesion strand phasing confirmed that it was damage induced. The identity of these sites with elevated mutation can only partially be reconciled with the structure of the CTCF–DNA interface. We speculate that this structure may be modified, for example, by interacting with cohesin, leading to bending 50 , 51 and partial melting of the DNA duplex, resulting in greater exposure of the nucleotide bases to chemical attack.

Finally, we found that genomic regions that are most efficiently repaired are also, counterintuitively, specifically prone to repair-induced mutagenesis. Building on evidence that transcription-coupled NER can be mutagenic in bacteria 52 and quiescent yeast 53 , we present multiple orthogonal analyses supporting the conclusion that TRIM occurs in vivo in mammals, although confirming the involvement of NER requires further experimental validation. We also showed that NER-TRIM is not purely dependent on transcription, but more generally results from the repair of lesions in close proximity, on opposite strands. It is therefore expected to occur when damage loads are high or closely spaced, for example, UV damage in promoters and ETS factor-binding sites 54 , 55 . Although NER-TRIM mutations represent only a small fraction of damage-induced mutations, they are specifically biased to functionally important sites: they are responsible for most driver mutations seen in symmetric tumours and, perhaps most importantly, NER-TRIM preferentially results in the misincorporation of a normal DNA base on the template strand of highly expressed genes. That incorrect normal base is not a substrate for subsequent NER and could therefore lead to efficient miscoding of a protein before genome replication, and in the case of an oncogenic mutation, potentially driving otherwise quiescent cells towards oncogenic transformation.

Our ability to resolve both mutation rate and multiallelism at single-strand, single-base resolution allows us to infer lesion longevity and thus disentangle differential DNA damage from differential repair. This powerful approach provides in vivo insights into how strand-asymmetric mechanisms underlie the formation, tolerance and repair of DNA damage, thereby shaping cancer genome evolution.

Genomic annotation

The C3H/HeJ mouse strain reference genome assembly C3H_HeJ_v1 (ref. 58 ) was used for read mapping, annotation and analysis. WGS regions with abnormal read coverage (ARC regions; 12.7% of the genome) were masked from analysis, as previously described 2 . Gene annotation was obtained from Ensembl v.91 (ref. 59 ).

Mutation asymmetry

Mutation calling and quality filtering were performed using WGS of 371 DEN-induced liver tumours from n  = 104 male C3H mice (Supplementary Table 1 ), as previously reported 2 . All mutation data were derived from sequence data in the European Nucleotide Archive (ENA) under accession PRJEB37808 , and processed files directly used as input for this work are publicly available 2 .

Genomic segmentation on mutational asymmetry was performed as previously reported 2 . Mutational strand asymmetry was scored for each genomic segment using the relative difference metric S = ( F − R )/( F + R ), where F is the rate of mutations from T on the forward (plus) strand of the reference genome and R is the rate of mutations from T on the minus strand (mutations from A on the plus strand). A mutational asymmetry score of S > 0.33 was used to identify the inheritance of forward strand lesions and S < −0.33 as the inheritance of reverse strand lesions. A rare subset of tumours (2.7%) exhibited uniform mutational symmetry (more than 99% of autosomal mutations in genomic segments with abs( S ) < 0.2); these were labelled ‘symmetric’ tumours.
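
The scoring rule above can be illustrated with a short Python sketch. The function names, the return value used when F + R = 0 and the 'unresolved' label are assumptions made for illustration and are not taken from the published pipeline.

```python
def asymmetry_score(f_rate: float, r_rate: float) -> float:
    """Relative difference S = (F - R) / (F + R).

    Returning 0.0 when both rates are zero is an assumption of this sketch.
    """
    total = f_rate + r_rate
    if total == 0:
        return 0.0
    return (f_rate - r_rate) / total


def lesion_strand(s: float, threshold: float = 0.33) -> str:
    """Classify a segment by the sign and magnitude of S."""
    if s > threshold:
        return "forward"      # forward strand lesions inherited
    if s < -threshold:
        return "reverse"      # reverse strand lesions inherited
    return "unresolved"       # hypothetical label for |S| <= threshold
```

For example, a segment with F three times larger than R gives S = 0.5 and would be assigned to forward strand lesions.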

Except where otherwise stated (within the final results section), analyses were confined to n  = 237, clonally distinct DEN-induced tumours that met the combined criteria of: (1) not labelled as symmetric, (2) tumour cellularity of more than 50%, and (3) more than 80% of substitution mutations attributed to the DEN1 signature 2 by sigFit (v.2.0) 60 .

Relative to the reference genome sequence, a plus ( P ) strand gene was transcribed using the reverse ( R ) strand as a template. So, a P strand gene in a genomic segment with R strand lesions (denoted RP orientation) is expected to be subject to TCR. A minus ( M ) strand gene with forward ( F ) strand lesions ( FM orientation) is also expected to be subject to TCR, as the retained lesions are again on the transcription template strand. Conversely, FP and RM orientation combinations will have lesions on the non-template strand for transcription. For DNA replication, we similarly refer to whether the preferential template for the leading strand contains the retained lesions or whether the preferential template for the lagging strand contains the retained lesions.
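
The four orientation combinations can be written as a small lookup. This is a sketch only; the single-letter codes follow the text (F/R for the lesion strand, P/M for the gene strand), whereas the function name is hypothetical.

```python
def lesion_on_template(lesion_strand: str, gene_strand: str) -> bool:
    """Return True when retained lesions lie on the transcription template strand.

    lesion_strand: "F" (forward) or "R" (reverse) strand lesions.
    gene_strand:   "P" (plus) or "M" (minus) strand gene.
    """
    if lesion_strand not in ("F", "R") or gene_strand not in ("P", "M"):
        raise ValueError("expected lesion_strand in F/R and gene_strand in P/M")
    # A plus strand gene is transcribed using the reverse strand as template,
    # and a minus strand gene using the forward strand.
    template = "R" if gene_strand == "P" else "F"
    return lesion_strand == template
```

RP and FM orientations return True (expected to be subject to TCR), whereas FP and RM return False.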

Mutation rates and spectra

Mutation rates were calculated as 192 category vectors representing every possible single-nucleotide substitution conditioned on the identity of both the upstream and the downstream nucleotides, each rate being the observed count of a mutation category divided by the count of the trinucleotide context in the analysed sequence. To report a single aggregate mutation rate, the three rates for each trinucleotide context were summed to give a 64 category vector and the weighted mean of that vector was reported as the mutation rate. The vector of weights was the fraction of each trinucleotide in a reference sequence, for example, the composition of the whole genome. Strand-specific mutation rates were calculated with respect to the lesion-containing strand, with both mutation calls and sequence composition reverse complemented for reverse strand lesions. Autosomal chromosomes were considered diploid and the X chromosome haploid (male mice) for the purposes of calculating mutation rates and sequence composition. For the counting of strand-specific mutations, a threshold VAF > 10% was applied to remove mutation calls from contaminating non-clonal cells.
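
A minimal Python sketch of the composition-weighted aggregate rate follows. The dictionary layout, the function name and the treatment of contexts absent from the analysed sequence (contributing a rate of zero) are assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_mutation_rate(mut_counts, context_counts, reference_counts):
    """Composition-weighted aggregate mutation rate.

    mut_counts:       {(trinucleotide, alt_base): observed mutation count}
    context_counts:   {trinucleotide: count in the analysed sequence}
    reference_counts: {trinucleotide: count in the reference sequence}
    """
    # Sum the (up to three) substitution rates for each trinucleotide context.
    per_context = defaultdict(float)
    for (context, _alt), n in mut_counts.items():
        if context_counts.get(context, 0) > 0:
            per_context[context] += n / context_counts[context]
    # Weights are the fraction of each trinucleotide in the reference sequence,
    # so the weighted sum below is the weighted mean.
    total_ref = sum(reference_counts.values())
    return sum(per_context[c] * n / total_ref for c, n in reference_counts.items())
```

Weighting by a fixed reference composition means that rates from regions with different sequence content remain directly comparable.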

Subtracted spectra plots (Fig. 2c,d ) were calculated by subtracting the counts of simulated tumour datasets from those of observed datasets and then scaling as for mutation spectra, so that the absolute area of the histogram summed to 100. Percent repair efficiency (Extended Data Fig. 7j ) was calculated as (observed/expected) × 100, where expected was the corresponding mutation rate for non-expressed genes (stratum 1, see below) averaged between the template and non-template strand. Cosine similarity was used as a relative measure of mutation signature similarity. Mutation signature deconvolution was performed using sigFit (v.2.0), with two component signatures ( K  = 2) chosen based on heuristic goodness-of-fit for integer values of K from 2 to 8, with 2,000 iterations each. Final K  = 2 deconvolution used 40,000 iterations.

The expected number of mutations at each position of the analysed transcription factor-binding site (Supplementary Table 2 ) and nucleosome regions was calculated as a sum of genome-wide rates (mutations per base pair) for that particular trinucleotide context from each tumour that had this region classified as either forward or reverse segment. The genome-wide rate for each tumour was calculated by dividing the number of mutations in a particular trinucleotide context (that fall within genomic space phased to have inherited either a forward or a reverse lesion-containing strand) by the total count of that trinucleotide in that genomic space; this was done separately for forward and reverse segments.

Excess mutations per Mb were calculated as (observed_{i,n} − expected_{i,n}) × 10^6 / count_i, where i is the relative position within the region, count_i is the total number of regions with a non-‘N’ nucleotide at position i, and n is the specific mutation context (for example, mutation from A). Mutation enrichment was calculated as (observed_{i,n} − expected_{i,n}) / (observed_{i,n} + expected_{i,n}). Rolling mean values were plotted using windows of 51 bp and 21 bp for nucleosome-centred and CTCF-centred plots, respectively. On the basis of bootstrap sampling of the analysed regions, 95% confidence intervals were calculated.
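
The two per-position statistics can be sketched in Python as below. The function names and the value returned when observed + expected = 0 are assumptions of this sketch.

```python
def excess_per_mb(observed: float, expected: float, count: int) -> float:
    """Excess mutations per Mb at one relative position.

    count is the number of regions with a non-N nucleotide at that position.
    """
    return (observed - expected) * 1e6 / count


def mutation_enrichment(observed: float, expected: float) -> float:
    """Relative difference (obs - exp) / (obs + exp), bounded between -1 and 1."""
    total = observed + expected
    if total == 0:
        return 0.0  # assumption: no signal when nothing is observed or expected
    return (observed - expected) / total
```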

Multiallelic mutation rates

Aligned reads spanning genomic positions of somatic mutations were re-genotyped using SAMtools mpileup (v1.9) 61 . Genotypes supported by 2 or more reads with a nucleotide quality score of 20 or more were reported, considering sites with two alleles as biallelic and those with three or four alleles as multiallelic. For a defined set of mutations, the background composition is the count of mutations in each of the 64 possible trinucleotide contexts. The count of multiallelic mutations in each of those 64 categories was divided by the corresponding background mutation count and the weighted average of those ratios is reported as the multiallelic rate. As for mutation rates, the vector of weights was the fraction of each trinucleotide in a reference sequence, for example, the composition of the whole genome.
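
The allele-counting rule can be sketched as follows, assuming the pileup at a site has already been parsed into (base, quality) pairs. The input format, the function name and the 'other' label are hypothetical and are not part of SAMtools.

```python
from collections import Counter


def classify_site(base_calls, min_reads: int = 2, min_qual: int = 20) -> str:
    """Classify a site from (base, base_quality) pairs observed in the reads.

    An allele is counted when at least min_reads reads support it with a
    base quality of min_qual or more.
    """
    counts = Counter(base for base, qual in base_calls if qual >= min_qual)
    alleles = [base for base, n in counts.items() if n >= min_reads]
    if len(alleles) >= 3:
        return "multiallelic"
    if len(alleles) == 2:
        return "biallelic"
    return "other"  # hypothetical label for sites with fewer than two alleles
```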

Replication time

We generated early–late Repli-seq as previously described 62 for two mouse hepatocellular carcinoma-derived cell lines (Hep-74.3a and Hepa1-6, obtained from biohippo and the American Type Culture Collection, respectively, and tested for mycoplasma at source), matching for the study cell type 63 . Furthermore, the tumour from which the Hep-74.3a cell line was derived was induced by a single intraperitoneal injection of DEN at P15 into a C3H/He mouse 64 , thus closely matching the DEN-induced tumours in our study. For each cell line, two ENCODE-style biological replicates were generated with individual BrdU labelling and fluorescence-activated cell sorting (FACS) into early and late S-phase fractions for Repli-seq Illumina sequencing library preparation 62 . Sequencing was performed on Illumina NextSeq550 using a Mid-Output v2.5 kit generating 75 bp paired-end reads, producing a total of 1.2 × 10^8 read pairs (Hep-74.3a), and Illumina NovaSeq with an S1 flowcell generating 50 bp paired-end reads, producing a total of 3.9 × 10^7 read pairs (Hepa1-6). Sequencing reads were mapped using Bowtie2 (v2.4.5) to the C3H_HeJ_v1 reference genome. SAMtools (v1.15.1) was used for alignment quality filtering (-bSq 20), matepair annotation (fixmate -m) and deduplication (markdup -r -s). After confirming concordance, replicates were aggregated and read coverage was calculated for 10 kb consecutive windows with local smoothing: 50 kb windows with a step-length of 10 kb using the central 10 kb window coordinates using bedtools (v2.30.0) multicov. Windowed read counts were normalized to aggregate library size (tags per million, separately for early ( E ) and late ( L )) and replication time was taken as the relative enrichment ( E − L )/( E + L ). For replication time analysis, genomic regions were categorized into 21 quantile bins of replication time relative enrichment, and the median value for each bin was used in quantile-based visualization and regression analysis.
As the Hep-74.3a cell line is better matched for both strain and treatment, these Repli-seq data were used throughout the paper. The results were replicated with matched analyses of the Hepa1-6 Repli-seq data (Extended Data Figs. 2a and 3h–j ).
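
The replication time metric can be illustrated with a short sketch. It assumes that smoothed windowed read counts for the early and late fractions are already available as equal-length lists; the function name and the use of None for windows without coverage are assumptions.

```python
def replication_time(early_counts, late_counts):
    """Relative enrichment (E - L) / (E + L) per window.

    Counts are first normalized to tags per million within each fraction.
    """
    e_total, l_total = sum(early_counts), sum(late_counts)
    if e_total == 0 or l_total == 0:
        raise ValueError("each fraction needs at least one read")
    values = []
    for e, l in zip(early_counts, late_counts):
        e_tpm = e * 1e6 / e_total
        l_tpm = l * 1e6 / l_total
        denom = e_tpm + l_tpm
        # Windows with no reads in either fraction are left undefined.
        values.append((e_tpm - l_tpm) / denom if denom else None)
    return values
```

Values near 1 indicate early-replicating windows and values near −1 indicate late-replicating windows.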

Repli-seq data are available at the ENA at EMBL-EBI under accessions PRJEB72349 (Hep-74.3a) and PRJEB67994 (Hepa1-6).

Replication strand bias

Replication fork directionality (RFD) is a relative difference metric that scales from 1 to −1. RFD values > 0 indicate a consensus rightward progressing replication fork, whereas RFD < 0 indicates a consensus leftward progressing fork. RFD can be directly measured at 1 kb resolution from Okazaki fragment sequencing (OK-seq) 65 , but such data have only been obtained from cultured cells that can be prepared in large quantities with a high fraction in S phase. Alternatively, RFD has been inferred from Repli-seq data, where RFD is calculated as the derivative of the change in replication time along the genome 12 , 66 , but this approach has lower spatial resolution and is dependent on ad hoc filtering. Here we intersected cell-type-matched Repli-seq RFD with higher resolution OK-seq to ensure high-resolution tissue-matched RFD and to remove the need for ad hoc filtering. Replication time was converted to Repli-seq RFD by taking the average of the difference in replication time of the adjacent upstream and downstream windows.

OK-seq data from mouse activated primary splenic B cells 65 were aligned to the C3H_HeJ_v1 reference genome using Bowtie2 (v2.4.5) 67 , quantified using bedtools multicov and RFD calculated as the relative enrichment of reverse ( R ) versus forward ( F ) read coverage (RFD = ( R  −  F )/( R  +  F )) 68 . This OK-seq RFD (OK-RFD) metric was calculated for 10 kb consecutive windows to match Repli-seq RFD analysis. Both OK-RFD and Repli-seq RFD measures were categorized into 21 quantile bins. Subsequent mutation rate analysis used OK-RFD quantile classification but was restricted to those that differed from the corresponding Repli-seq RFD by less than 19% of the category range (four bins). Other OK-seq and Repli-seq datasets (Supplementary Table 3 ) were processed as outlined above, aligning to the GRCh37 reference genome in the case of human-derived sequences. For comparisons between Repli-seq RFD and high-resolution OK-RFD (Extended Data Fig. 2 ), OK-RFD was calculated as above but in 1 kb consecutive windows and smoothed (R loess function), with the span parameter set to encompass 25 windows.

For each DEN-induced tumour, we identified all RFD segments that were completely contained within lesion segregation mutational asymmetry segments (as defined above) with | S | > 0.33. For these segments, we resolved the lesion-containing strand to the template of either the leading or lagging replication strand. A forward strand mutation asymmetry (lesions on the forward strand, S  > 0.33) and rightward progressing replication fork (RFD > 0) was consensus lagging strand replication over the lesions (Fig. 1e ). Similarly S  < −0.33 and RFD < 0 was also lagging strand replication over lesions. Consensus leading strand replication over lesions is indicated by S  > 0.33, RFD < 0; or S  < −0.33, RFD > 0. For the purposes of visualization and the aggregation of equivalent data for increased statistical power, a single replication strand bias (RSB) metric was defined by consistently orienting the strandedness of analyses such that the lesion-containing strand is the reverse strand (compare Extended Data Fig. 3d and 3f ). Consequently, new replication and transcription will proceed left to right as the forward strand over a damaged template strand in all RSB figures.
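
The assignment of lesions to the leading or lagging strand template follows from the signs of S and RFD, as in this minimal sketch. The function name and the 'unresolved' label are assumptions.

```python
def replication_over_lesions(s: float, rfd: float, threshold: float = 0.33) -> str:
    """Resolve whether lesions are replicated by the leading or lagging strand.

    s:   mutational asymmetry score (S > 0 means forward strand lesions).
    rfd: replication fork directionality (RFD > 0 means rightward fork).
    """
    if abs(s) <= threshold or rfd == 0:
        return "unresolved"  # hypothetical label for ambiguous segments
    forward_lesions = s > 0
    rightward_fork = rfd > 0
    # Forward lesions with a rightward fork, or reverse lesions with a
    # leftward fork, correspond to lagging strand replication over lesions.
    return "lagging" if forward_lesions == rightward_fork else "leading"
```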

Gene expression

Paired-end, stranded total RNA-seq from unexposed P15 C3H male mouse livers ( n  = 4, matching the developmental time of mutagenesis) were aligned, annotated and quantified previously 2 . All transcriptome data used were derived from sequence data in Array Express under accession E-MTAB-8518 and are publicly available 2 .

The transcription strand of RNA-seq reads was resolved using read-end and mapping orientation using SAMtools (v.1.7.0) and read pairs exclusively mapping within annotated exons were identified using bedtools intersect (v2.29.2) 69 . Intronic read pairs were defined as those mapping within a genic span, derived from a sense strand transcript and not in the exonic set.

For genes with multiple annotated transcript isoforms, the sum of transcripts per million (TPM) over the isoforms was taken as the expression measure (mature transcript, steady state), although similar results — with the same conclusions — were obtained if the maximum for any one isoform was used. Nascent transcription was quantified by counting read pairs with a mapping quality of more than 10 overlapping intronic regions (defined as intronic in all annotated transcript isoforms of the gene) using bedtools multicov (v2.29.2). The read count was normalized to reads per kilobase of analysed intron for each gene in each sequence library, and then normalized to TPM for each library. The final nascent transcript expression estimate per gene was taken as the mean of nascent TPM over replicate libraries. Nascent transcription estimates could be generated for 85% ( n  = 17,304) of protein-coding genes.
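
The nascent transcription estimate can be sketched as two steps: per-library TPM from intronic read pairs, then a mean across replicate libraries. The dictionary inputs and function names are hypothetical, and the restriction of the mean to genes present in every library is an assumption.

```python
def nascent_tpm(intron_reads, intron_bp):
    """Per-library TPM from intronic read-pair counts.

    intron_reads: {gene: intronic read-pair count}
    intron_bp:    {gene: analysed intron length in bp}
    """
    # Reads per kilobase of analysed intron.
    rpk = {g: intron_reads.get(g, 0) / (bp / 1000)
           for g, bp in intron_bp.items() if bp > 0}
    total = sum(rpk.values())
    if total == 0:
        raise ValueError("library contains no intronic reads")
    # Rescale so that values sum to one million within the library.
    return {g: v * 1e6 / total for g, v in rpk.items()}


def mean_over_libraries(per_library):
    """Mean nascent TPM per gene across replicate libraries."""
    genes = set.intersection(*(set(lib) for lib in per_library))
    return {g: sum(lib[g] for lib in per_library) / len(per_library)
            for g in genes}
```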

Gene-based analyses of mutation rates used the genomic extent of the most highly expressed transcript isoform (the primary transcript) based on P15 C3H mouse liver gene expression. Overlapping genes, defined by primary transcript coordinates, were hierarchically excluded from analysis. Starting with the most expressed gene, any overlapping less-expressed genes were excluded. For the plotting of per-gene, per-strand mutation rates (Fig. 3b and Extended Data Fig. 7b–d ), only genes spanning more than 2 million nucleotides of strand-resolved tumour genome in aggregate were shown ( n  = 3,392 genes) to minimize stochastic noise from genes with little power individually to accurately estimate mutation rates. Analyses of aggregating rates by expression bin included all genes within the bin.
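
One reading of the hierarchical exclusion is a greedy pass in descending order of expression, sketched below. The tuple layout, half-open coordinates and the choice to test overlap only against genes already retained are assumptions of this sketch.

```python
def exclude_overlapping(genes):
    """Greedy hierarchical exclusion of overlapping genes.

    genes: list of (name, chrom, start, end, expression) tuples with
           half-open [start, end) primary transcript coordinates.
    Returns the names of retained genes, most expressed first.
    """
    kept = []
    for gene in sorted(genes, key=lambda g: g[4], reverse=True):
        _, chrom, start, end, _ = gene
        # Retain the gene only if it overlaps no more highly expressed kept gene.
        if all(chrom != k[1] or end <= k[2] or start >= k[3] for k in kept):
            kept.append(gene)
    return [g[0] for g in kept]
```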

Genes with similar estimates of nascent expression were aggregated for analysis of TCR. The sigmoidal distribution relating nascent transcription rate to mutation rate (Fig. 3b ) was segmented using linear regression models in the R package Segmented (v1.3-3) 70 . This defined n = 4,649 genes with zero or low detected nascent expression (less than 0.287 TPM) in which reduced mutation rates associated with TCR are essentially undetectable; these are subsequently referred to as stratum 1 genes (light blue in plots). Genes expressed at a greater rate than the upper segmentation threshold (more than 3.73 TPM) do not show a further decrease in mutation rate with increased expression; these n = 7,176 highly expressed genes were defined as stratum 6 (bright red in plots). The n = 4,005 genes with intermediate expression (0.287–3.73 TPM) exhibited a log-linear relationship between expression and mutation rate. These were quantile split into strata 2–5, each containing approximately 1,000 genes.
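
The assignment of genes to the six strata can be sketched in Python as follows. The thresholds are those given above; the function name, the handling of values exactly on a threshold and the use of statistics.quantiles for the quartile split are assumptions.

```python
import bisect
import statistics


def assign_strata(nascent_tpm, low: float = 0.287, high: float = 3.73):
    """Assign each gene to expression stratum 1 to 6.

    nascent_tpm: {gene: nascent TPM}. Requires at least two genes with
    intermediate expression so that quartile cut points can be computed.
    """
    mid_values = [v for v in nascent_tpm.values() if low <= v <= high]
    cuts = statistics.quantiles(mid_values, n=4)  # three quartile cut points
    strata = {}
    for gene, tpm in nascent_tpm.items():
        if tpm < low:
            strata[gene] = 1
        elif tpm > high:
            strata[gene] = 6
        else:
            strata[gene] = 2 + bisect.bisect_left(cuts, tpm)
    return strata
```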

Genomic intersection and bootstrapping

The intersection and subsetting of genomic intervals were performed using bedtools intersect (v2.30.0). For the removal of genic subregions, overlapping genes were merged (bedtools merge), the regions extended 5 kb upstream and downstream (bedtools slop) and removed from pre-defined intervals using bedtools subtract. Genomic window coordinates were defined using bedtools makewindows. Bootstrap analysis, for example, in mutation rate calculations, resampled genomic intervals that met the selection criteria (for example, RFD category 1, non-genic, minus strand lesions) with replacement to the same total count, within the same tumour.
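
The interval resampling can be sketched as below for a mutation rate. The input layout, the function name, the fixed random seed and the percentile method used to read off the interval are all assumptions; the text specifies only resampling with replacement to the same total count within a tumour.

```python
import random


def bootstrap_rate_ci(intervals, n_boot: int = 100, alpha: float = 0.05, seed: int = 0):
    """Percentile bootstrap interval for a mutation rate.

    intervals: list of (mutation_count, analysed_bases) tuples for the
               intervals of one tumour that meet the selection criteria.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample intervals with replacement to the same total count.
        sample = rng.choices(intervals, k=len(intervals))
        mutations = sum(m for m, _ in sample)
        bases = sum(b for _, b in sample)
        estimates.append(mutations / bases)
    estimates.sort()
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper
```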

Multivariate regression analysis was performed using the lm function of R. The reference genome was partitioned into consecutive 10 kb windows, and composition-corrected mutation rates were calculated for each window in aggregate across tumours, separately for forward strand and reverse strand lesions. Windows in a tumour with an unresolved lesion strand or containing lesion strand transitions were excluded. The fraction of nucleotides within a window overlapping genomic extents expressed at more than 1 TPM was calculated separately for template and non-template strand lesions. Replication time and RSB were both annotated for 10 kb windows by overlap with larger-scale replication time and RSB measures described above, taking the consensus measure (most nucleotide span) for the 10 kb window as the value for regression analysis. The fraction of window nucleotides annotated as genic but excluding regions identified as expressed genes was also included as a predictor variable (residual genic). The relative enrichment measures RSB and replication time were bounded (−1,1), whereas other parameters were fractions bounded (0,1). To ensure equal scaling for regression analysis, RSB and replication time were rescaled to the (0,1) range as f = 1 − (1 − r )/2, where r is the relative enrichment metric and f is the rescaled fractional range. Regression models were constructed with mutation rate as the outcome variable and other variables as independent predictor variables.
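
The rescaling is equivalent to f = (1 + r)/2, so r = −1 maps to 0, r = 0 to 0.5 and r = 1 to 1. A one-function sketch (the function name is hypothetical):

```python
def rescale_to_unit(r: float) -> float:
    """Rescale a relative enrichment metric from (-1, 1) to (0, 1)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("relative enrichment must lie between -1 and 1")
    return 1.0 - (1.0 - r) / 2.0
```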

Substitution mutation clusters

For each nucleotide substitution mutation, the closest adjacent mutation was found. Null expectations of mutation spacing were generated by sampling mutation positions from other tumours without replacement, to generate an identical number of proxy mutations for each tumour. Initial analysis of mutation spacing indicated strong enrichment of mutations spaced less than 11 nt apart and evidence of enrichment to 100 nt spacing. Mutation clusters were defined as chains of mutations within the same tumour spaced less than X nucleotides from adjacent mutations, with X  = 11, X  = 101 or X  = 201 depending on analysis as indicated. Over 97% of X  = 101 mutation clusters (29,307 of 30,028) contained only two mutations, 721 clusters contained three mutations and no larger clusters were identified. Of X  = 101 clusters from proxy-tumour mutations, 100% contained only two mutations.
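
The chaining rule for cluster definition can be sketched as follows for the mutation positions of one chromosome in one tumour. The function name and the list-of-lists return format are assumptions.

```python
def chain_clusters(positions, x: int = 101):
    """Group mutations into chains spaced less than x nucleotides apart.

    positions: mutation coordinates on one chromosome of one tumour.
    Returns a list of clusters, each a sorted list of two or more positions.
    """
    clusters, current = [], []
    for pos in sorted(positions):
        if current and pos - current[-1] < x:
            current.append(pos)          # extend the current chain
        else:
            if len(current) > 1:
                clusters.append(current)  # close a chain of two or more
            current = [pos]
    if len(current) > 1:
        clusters.append(current)
    return clusters
```

Because membership depends only on the distance to the adjacent mutation, a chain can span more than x nucleotides in total.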

For each mutation cluster, if it was located within a lesion segregation mutation asymmetry segment, we annotated the mutations within the cluster with respect to the inferred lesion-containing strand. For a genomic segment containing reverse strand lesions, the leftmost mutation site would be the first used as a template for an extending DNA polymerase (as DNA synthesis extends 5′→3′), and the rightmost mutation site replicated over subsequently. These orientations are reversed for a genomic segment containing forward strand lesions. The first replicated-over mutation site for each cluster was annotated distinctly from subsequent sites in the cluster.

Pairs of mutations were phased to the same chromosome by co-occurrence in the same sequencing read. Sequencing reads were extracted from genomic alignments using SAMtools mpileup (v1.7) where they overlapped both genomic positions of a pair of mutations called from the same tumour and separated by 75 nt or fewer. Any sequencing read supporting the called mutant allele with a phred-scaled quality score ≥ 20 at both mutation positions was taken as support for those mutations occurring on the same chromosome.

Mutation clusters were resolved to preferential leading or lagging strand replication based on RSB measures as defined above. Only the more extreme RSB windows (quantiles 1, 2, 20 and 21; |RSB| > 0.51) were considered for comparisons of leading versus lagging strand asymmetry, so that any strand differences were not swamped by regions with low levels of replicative asymmetry. Clusters were defined with X = 101 as above, resulting in n = 2,791 leading strand and n = 3,289 lagging strand clusters, the difference in count attributable to TCR correlating with leading strand replication (Fig. 1f ). Cluster length distributions were compared using a two-sample, two-sided Kolmogorov–Smirnov test (ks.test function in R). To estimate statistical power for detecting differences in cluster size distribution between leading and lagging strands, we simulated distorted length distributions. The lagging strand length distribution vector was partitioned into clusters of length 10 or less (short) or more than 10 (long) and randomly sampled with replacement to produce a vector of length matching the leading strand vector. Biased sampling between the short and long cluster bins was controlled by parameter d . An undistorted sample of the original distribution would be d = 0, whereas 10% of short clusters sampled from the long bin instead of the short bin would be d = 0.1. Two-sample, two-sided Kolmogorov–Smirnov tests comparing the original to the distorted sample distribution were applied to 100 bootstraps for each tested value of d (0–0.1 in increments of 0.0005), recording nominal significant difference at P < 0.05. The percentage of bootstraps supporting nominal significance is the power to detect significance at the tested value of d .

Indel–substitution mutation clusters

Insertion and deletion (indel) mutations were filtered as previously described for base substitutions 2 . For clustering analysis, we only considered indel mutations in lesion strand-resolved autosomal regions where at least three reads support precisely the called mutation. We identified the closest upstream or downstream substitution to each insertion or deletion, called within the same tumour. Null expectation datasets were generated by sampling substitution mutations between tumours as described for substitution mutation clustering above; 100 of these permuted datasets were generated for each tumour. Enrichment of clustering was evaluated by two-sided Fisher’s exact test (fisher.test function in R) considering the observed count of indels with a substitution within 100 bp versus the count of indels without a substitution within 100 bp, as compared with the same values estimated from the average of permuted datasets.

For a pair of sequences that differ by a single substitution and a single indel, there can be multiple equally optimal alignments. We identified all cases where there was a substitution mutation within 100 nt of the indel. For each of these, the ancestral and derived sequences were constructed by editing the mutations into the reference genome sequence, and they were oriented to represent the forward strand being newly synthesized over a lesion-containing template (that is, reverse complemented if the reference genome forward strand was the lesion-containing strand). We considered all possible gap placements within those alignments of more than 200 bp (2 × 100 flanks + indel length) between ancestral and derived sequence. All alignments that had a single indel-length gap and one substitution were kept, with multiple solutions fractionally weighted; for example, four equally scoring alignment solutions would each be scored 1/4 = 0.25, whereas an alignment with just one solution would score 1/1 = 1. The distance between indel and substitution, and the identity of the substituted, inserted or deleted bases, were recorded for each weighted solution. Observed indel–substitution clusters were further filtered to ensure at least two sequence reads supported the existence of both the indel and the substitution in the same read (SAMtools v1.7.0 mpileup), confirming that the mutations occur on the same copy of the same chromosome. This filtering was not possible for the permuted data and thus makes our estimate of mutation clustering in the observed data conservative.
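
The fractional weighting of equally scoring alignments can be illustrated with a short sketch that tallies indel-to-substitution distances. The input layout (one list of candidate distances per cluster) and the function name are assumptions.

```python
from collections import defaultdict


def weighted_distance_tally(clusters):
    """Tally indel-substitution distances with fractional weights.

    clusters: list of non-empty lists; each inner list holds the distance
              implied by every equally scoring alignment of one cluster.
    """
    tally = defaultdict(float)
    for solutions in clusters:
        weight = 1.0 / len(solutions)  # e.g. four solutions score 0.25 each
        for distance in solutions:
            tally[distance] += weight
    return dict(tally)
```

Each cluster contributes a total weight of 1 regardless of how many alignments it admits, so ambiguous clusters do not dominate the distribution.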

To consider whether substitutions were preferentially located upstream or downstream of the indel with respect to synthesis over the lesion strand, we considered both the full set of indel–substitution mutation clusters and additionally the subset where all equally scoring alignments placed the substitution on a single side of the indel. To generate a null expectation, for each of these datasets, the annotation of the lesion strand was randomly permuted; the distribution of biases from 10,000 permuted datasets was used to derive an empirical P value for each considered set of indel–substitution clusters.
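A permutation test of this kind might look like the sketch below. Several details are assumptions made for illustration: the encoding of strand and side as +1/−1, the use of the upstream fraction as the bias statistic, the two-sided comparison against 0.5 and the add-one correction in the empirical P value. The toy data are invented.

```python
import random

def upstream_fraction(sides, strands):
    """Fraction of clusters whose substitution lies upstream of the indel
    after orienting reference coordinates by the lesion strand (+1 or -1)."""
    oriented = [side * strand for side, strand in zip(sides, strands)]
    return sum(1 for o in oriented if o < 0) / len(oriented)

def empirical_p(sides, strands, n_perm=10_000, seed=0):
    """Empirical two-sided P value from permuting lesion strand labels."""
    rng = random.Random(seed)
    observed = abs(upstream_fraction(sides, strands) - 0.5)
    shuffled = list(strands)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the lesion strand annotation
        if abs(upstream_fraction(sides, shuffled) - 0.5) >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero (assumption).
    return (extreme + 1) / (n_perm + 1)

# Toy data: 100 clusters, 80 of which have the substitution upstream.
strands = [1, -1] * 50
sides = [-s for s in strands[:80]] + [s for s in strands[80:]]
observed_bias = upstream_fraction(sides, strands)
p_value = empirical_p(sides, strands)
```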

Transcription-coupled repair

Annotated genes (Ensembl v91) were partitioned into six expression strata based on P15 liver RNA-seq (see above). For each tumour, genes were identified that were wholly contained within a mutation asymmetry segment. Using the annotated transcriptional orientation of the gene and mutational asymmetry of the tumour, each of these genes was categorized as either template strand lesion or non-template strand lesion.
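The categorization step can be sketched as a small function. The dictionary field names are hypothetical; the logic relies only on the fact that the annotated gene strand is the coding (non-template) strand.

```python
def classify_gene(gene, segment):
    """Return the lesion-strand category for a gene, or None if the gene is
    not wholly contained within the mutation asymmetry segment.

    gene:    dict with chrom, start, end, strand ('+' or '-')
    segment: dict with chrom, start, end, lesion_strand ('+' or '-')
    """
    contained = (gene["chrom"] == segment["chrom"]
                 and segment["start"] <= gene["start"]
                 and gene["end"] <= segment["end"])
    if not contained:
        return None
    # The annotated strand is the coding (non-template) strand; RNA
    # polymerase reads the opposite strand as its template.
    if gene["strand"] == segment["lesion_strand"]:
        return "non-template strand lesion"
    return "template strand lesion"

# Hypothetical gene and asymmetry segment.
segment = {"chrom": "1", "start": 1_000_000, "end": 5_000_000, "lesion_strand": "-"}
gene = {"chrom": "1", "start": 2_000_000, "end": 2_050_000, "strand": "+"}
category = classify_gene(gene, segment)
```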

Mouse colony management

Animal experimentation was carried out in accordance with the Animals (Scientific Procedures) Act 1986 (UK) and with the approval of the Cancer Research UK Cambridge Institute Animal Welfare and Ethical Review Body (AWERB). Animals were maintained using standard husbandry: mice were group housed in Tecniplast GM500 IVC cages with a 12–12-h light–dark cycle and ad libitum access to water, food (LabDiet 5058) and environmental enrichments. Ethical approval, tumour size limits, sample size choice, randomization and blinding for the tumour samples have been previously reported 2 . At least three biological replicates were included for ATAC-seq and ChIP–seq experiments.

Liver samples from P15 mice (matching the developmental time of mutagenesis) were isolated and flash frozen. ATAC-seq was performed as previously described 71 , with minor modifications to the nuclear isolation steps (in step 1, 1 ml of 1× homogenizer buffer was used instead of 2 ml; in step 4, douncing was performed with 30 strokes instead of 20). Pooled libraries were sequenced on a NovaSeq6000 (Illumina) to produce paired-end 50 bp reads, according to the manufacturer’s instructions. Experiments were performed with three biological replicates.

ATAC-seq data processing and analysis

ATAC-seq data processing was performed using a Snakemake pipeline (v6.1.1) 72 . Adaptor sequences were removed using cutadapt (v2.6) 73 . Reads were aligned to the reference genome (Ensembl v91: C3H_HeJ_v1 (ref. 59 )) using BWA (v0.7.17) 74 . Data from multiple lanes were merged before deduplication; duplicates were marked using Picard (v2.23.8) 75 . Reads overlapping ARC regions were removed using SAMtools (v1.9). Reads aligning to mitochondrial DNA were excluded from further analysis. Read positions aligning to forward and reverse strands were offset by +5 bp and −4 bp, respectively, to represent the middle of the transposition event, as previously described 76 . ATAC-seq peaks were called using MACS2 (v2.1.2) 77 on pooled data containing all replicates. Single-nucleotide-resolution chromatin accessibility was measured and plotted as coverage of ATAC-seq ‘tags’ (Tn5 insertion sites, adjusted to represent the middle of the transposition event, as described above).
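The strand-specific offset described above can be sketched as follows. The function name and the 0-based, half-open coordinate convention are assumptions for illustration; only the +5 bp and −4 bp offsets come from the text.

```python
from collections import Counter

def tn5_insertion_site(start, end, strand, fwd_offset=5, rev_offset=-4):
    """Offset-adjusted position of a read's 5' end.

    Assumes 0-based, half-open [start, end) alignment coordinates, so the
    5' end of a reverse-strand read is at end - 1.
    """
    if strand == "+":
        return start + fwd_offset
    if strand == "-":
        return (end - 1) + rev_offset
    raise ValueError(f"unknown strand: {strand!r}")

# Hypothetical alignments: (start, end, strand).
reads = [(100, 150, "+"), (100, 150, "-"), (96, 146, "-"), (100, 140, "+")]

# Tag coverage: number of adjusted insertion sites at each position.
tag_coverage = Counter(tn5_insertion_site(*read) for read in reads)
```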

ATAC-seq data are available from Array Express at EMBL-EBI under accession E-MTAB-11780 .

Nucleosome positioning analysis

We used nucleosome positions determined through chemical profiling of mouse embryonic stem cells 39 using a nucleosome centre positioning score to signify the prevalence of nucleosome dyads for a given genomic position. We transferred genome coordinates from mm9 to mm10 using UCSC liftover 78 , before using halLiftover (v2.1) to derive expanded C3H-specific coordinates, considering only unique non-overlapping and syntenic positions. The top 4 million dyad positions were selected based on the nucleosome centre positioning score.

The positions and span of the major groove (either facing out or into the histones relative to the dyad) were calculated with the centre of the major groove facing inwards, repeating every ±10.3 bp away from the dyad position and spanning 5.15 bp (ref. 10 ).
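The groove geometry can be sketched numerically. Two assumptions are made here that the text does not state explicitly: that the dyad itself is the centre of an inward-facing major groove, and that spans are only enumerated within ±73 bp of the dyad (roughly the nucleosome core). The period and span values are those given above.

```python
def inward_major_groove_spans(dyad, period=10.3, span=5.15, max_offset=73):
    """Return sorted (start, end) intervals of inward-facing major grooves
    centred every `period` bp on either side of the dyad."""
    spans = []
    k = 0
    while k * period <= max_offset:
        for sign in ((1,) if k == 0 else (1, -1)):
            centre = dyad + sign * k * period
            spans.append((centre - span / 2, centre + span / 2))
        k += 1
    return sorted(spans)

def faces_inward(position, dyad):
    """True if a genomic position falls inside an inward-facing span."""
    return any(lo <= position <= hi
               for lo, hi in inward_major_groove_spans(dyad))

# Hypothetical dyad coordinate.
spans = inward_major_groove_spans(dyad=1000)
```

Because the span is half the period, positions alternate between inward-facing and outward-facing windows of equal width moving away from the dyad.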

CTCF ChIP–seq

Livers from P15 mice (matching the developmental time of mutagenesis) were perfused in situ with PBS and then dissected, minced, cross-linked using 1% formaldehyde solution for 20 min, quenched for 10 min with 250 mM glycine, washed twice with ice-cold PBS and then stored as tissue pellets at –80 °C. Tissues were homogenized using a dounce tissue grinder, washed twice with PBS and lysed according to published protocols 79 . Chromatin was sonicated to an average fragment length of 300 bp using a Misonix tip sonicator 3000. To negate batch effects and allow multiple ChIP experiments to be performed using the same tissue, we pooled ten livers for each experiment; 0.5 g of washed homogenized tissue was used for each ChIP, using 20 μg CTCF antibody (rabbit polyclonal; 07-729, lot 2517762, Merck Millipore). Library preparation was performed using immunoprecipitated DNA or input DNA (maximum 50 ng) as previously described 80 with the ThruPLEX DNA-Seq library preparation protocol (Rubicon Genomics). Libraries were quantified by qPCR (Kapa Biosystems), and fragment size was determined using a 2100 Bioanalyzer (Agilent). Pooled libraries were initially sequenced on a MiSeq (Illumina) to ensure balanced pooling, followed by deeper sequencing on a HiSeq4000 (Illumina) to produce paired-end 150 bp reads, according to the manufacturer’s instructions; only HiSeq libraries were used for downstream analyses. Experiments were performed with five biological replicates.

To identify ChIP–seq-positive regions, we trimmed the HiSeq sequencing reads to 50 bp and then aligned them using BWA (v0.7.17) with default parameters. Uniquely mapping reads were selected for further analysis. Peaks were identified for each ChIP library and input control using MACS2 (v2.1.2) callpeak with default parameters, and all peaks with a q < 0.05 were included in downstream analyses. Input libraries were used to filter spurious peaks associated with a high-input signal using the GreyListChIP R package 81 . Biologically reproducible peaks were identified by merging ChIP–seq peaks defined as above from individual replicates and selecting those that overlapped with two or more individual replicate peaks.
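The selection of biologically reproducible peaks can be sketched as a merge-and-count step. This is an illustration only: it assumes half-open coordinates and counts support as the number of distinct replicates contributing to a merged region, which is one reading of the criterion above. The peak coordinates are invented.

```python
def reproducible_peaks(replicates, min_support=2):
    """Merge overlapping peaks across replicates and keep merged regions
    supported by at least `min_support` distinct replicates.

    replicates: list (one entry per replicate) of (chrom, start, end) peaks.
    """
    tagged = sorted((chrom, start, end, rep)
                    for rep, peaks in enumerate(replicates)
                    for chrom, start, end in peaks)
    merged = []  # each entry: [chrom, start, end, set of replicate indices]
    for chrom, start, end, rep in tagged:
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start < last[2]:
            last[2] = max(last[2], end)  # extend to the outermost end
            last[3].add(rep)
        else:
            merged.append([chrom, start, end, {rep}])
    return [(c, s, e) for c, s, e, reps in merged if len(reps) >= min_support]

# Hypothetical peak calls from three replicates.
replicate_peaks = [
    [("chr1", 100, 200), ("chr1", 500, 600)],
    [("chr1", 150, 250)],
    [("chr1", 900, 1000)],
]
reproducible = reproducible_peaks(replicate_peaks)
```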

ChIP–seq data are available from Array Express at EMBL-EBI under accession E-MTAB-11959 .

Transcription factor binding site identification and analysis

ChIP–seq data for transcription factors, apart from CTCF (see above), were obtained from Life Science Database Archive ( https://dbarchive.biosciencedbc.jp/datameta-list-e.html ) with genomic coordinates for the mm9 reference assembly. Liver-specific ChIP–seq was used whenever possible, otherwise files marked with ‘All cell types’ were used instead (Supplementary Table 2 ). Genomic coordinates were lifted to mm10 using liftOver, and then lifted to the C3H genome assembly using halLiftover (as above). Overlapping ChIP–seq regions were merged, using the outermost coordinates as the new start/end of regions. FASTA sequences of the regions were extracted using bedtools getfasta (v2.27.1) and used together with non-redundant vertebrate position weight matrices from JASPAR 82 to run FIMO (MEME suite) 83 with default parameters to detect motifs within ChIP–seq peaks. Those motifs were then filtered based on an overlap with ATAC-seq peaks (defined above) to ensure that the analysed set was within open chromatin regions of P15 C3H mouse livers. For CTCF-binding site analysis, in-house generated ChIP–seq data (described above) was used. For wider flank (1 kb) analysis, all motifs (JASPAR matrix profile MA0139.1) within the peaks were retained regardless of ATAC-seq intersection, allowing multiple motifs per ChIP–seq peak.

For high-resolution CTCF and transcription factor-binding site analysis (Extended Data Fig. 8 ), only one highest-scoring motif per ChIP–seq peak was retained. Similarly, for aggregate transcription factor analysis, only one highest-scoring motif per ChIP–seq peak was retained if it overlapped with an ATAC-seq peak. A total of 129 transcription factors were analysed, selected on the basis of ChIP–seq and position weight matrix availability and RNA-seq support for transcription factor expression (1 TPM or more) in the P15 mouse liver 2 . In all the analyses, ‘bit score’ refers to the information content of the whole position. Within the motif, only mutations with the reference nucleotide matching the consensus nucleotide from the position weight matrix were retained. In the flanks, mutations from all reference nucleotides were used.
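The motif selection and per-position information content can be sketched as below. The record layout and example values are hypothetical, and the information content uses the standard definition for DNA (2 bits minus the Shannon entropy of the column) without any small-sample correction.

```python
from math import log2

def information_content(column):
    """Information content (bits) of one matrix column, given a dict of
    A/C/G/T counts or frequencies."""
    total = sum(column.values())
    entropy = -sum((n / total) * log2(n / total)
                   for n in column.values() if n > 0)
    return 2.0 - entropy

def consensus_base(column):
    """Most frequent nucleotide in a matrix column."""
    return max(column, key=column.get)

def best_motif_per_peak(hits):
    """Keep only the highest-scoring motif hit for each ChIP-seq peak."""
    best = {}
    for hit in hits:
        kept = best.get(hit["peak"])
        if kept is None or hit["score"] > kept["score"]:
            best[hit["peak"]] = hit
    return best

# Hypothetical motif hits within two peaks.
hits = [
    {"peak": "peak1", "start": 120, "score": 11.2},
    {"peak": "peak1", "start": 340, "score": 15.7},
    {"peak": "peak2", "start": 80, "score": 9.4},
]
best = best_motif_per_peak(hits)
column = {"A": 1, "C": 17, "G": 1, "T": 1}
```

A mutation inside the motif would then be retained only if its reference nucleotide equals consensus_base(column) for that motif position.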

CTCF structural analysis

High-resolution crystal structures for CTCF zinc fingers complexed with binding site DNA were obtained from the Protein Data Bank (PDB; 5YEL , 5T0U and 5UND ) 84 , 85 . As no single structure contains all 11 CTCF zinc fingers, a composite structure was compiled through alignment using PyMOL (v2.5.2) 86 align function. The PDB 5UND A chain 406–556 was aligned to the PDB 5T0U A chain (root mean square deviation of 1.06 Å); then the PDB 5YEL A chain was aligned to the PDB 5UND chain A (root mean square deviation of 1.3 Å). The composite image (Extended Data Fig. 8d ) then shows the PDB 5T0U A chain 289–405, PDB 5UND A chain 406–488 and PDB 5YEL A chain 489–556, which collectively spans CTCF zinc fingers 2–11 inclusive. The bound DNA strands comprise the PDB 5YEL F chain 1–24, PDB 5T0U C chain 7–23, PDB 5T0U B chain 1–18 and PDB 5YEL E chain 5–26.

Protein–DNA contact distance measurements were performed using the Protein Contacts Atlas 87 . Non-covalent interatomic contacts of 3 Å or less between CTCF protein and DNA were considered close contacts. Close contacts of atoms within phosphate groups or deoxyribose were considered backbone, and other DNA contacts were annotated as base contacts. Close base contacts involving atoms expected to acquire DEN-induced mutagenic adducts 23 or structurally equivalent positions in other bases (purines: N6 and O6; pyrimidines: O4, N4 and O2) were annotated as lesion site contacts. Distance measurements were taken separately for each structure (rather than from the composite) and excluded PDB 5T0U nucleotide contacts upstream of binding motif position +1 where this structure substantially deviates from PDB 5YEL . PDB 5T0U is truncated at zinc finger 7, whereas PDB 5YEL extends to zinc finger 11 and makes additional base-specific contacts absent from PDB 5T0U . Close backbone, base and lesion site contacts were reported if the distance threshold criteria were met in any of the three considered structures, although concordance was high in the overlapping regions.
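The contact annotation rules can be expressed as a small classifier. This sketch assumes standard PDB naming for DNA residues (DA, DC, DG, DT) and atoms (phosphate atoms P, OP1, OP2; sugar atoms carrying a prime), and the function and label names are hypothetical. It is not the Protein Contacts Atlas interface.

```python
PHOSPHATE_ATOMS = {"P", "OP1", "OP2"}
PURINES = {"DA", "DG"}
PYRIMIDINES = {"DC", "DT"}
# Atoms expected to acquire adducts, or structurally equivalent positions.
LESION_SITE_ATOMS = {"purine": {"N6", "O6"}, "pyrimidine": {"O4", "N4", "O2"}}

def classify_contact(residue, atom, distance, cutoff=3.0):
    """Label a protein-DNA contact as 'backbone', 'base' or 'lesion site',
    or return None if it is not a close contact."""
    if distance > cutoff:
        return None
    # Phosphate atoms and deoxyribose atoms (names ending in a prime).
    if atom in PHOSPHATE_ATOMS or atom.endswith("'"):
        return "backbone"
    if residue in PURINES:
        base_class = "purine"
    elif residue in PYRIMIDINES:
        base_class = "pyrimidine"
    else:
        raise ValueError(f"unexpected DNA residue: {residue!r}")
    if atom in LESION_SITE_ATOMS[base_class]:
        return "lesion site"
    return "base"

# Hypothetical contacts: (residue, atom, distance in angstroms).
contacts = [("DG", "O6", 2.8), ("DT", "O4'", 2.9), ("DA", "N7", 2.9), ("DC", "N4", 3.5)]
labels = [classify_contact(*contact) for contact in contacts]
```

Note that the base atom O4 and the sugar atom O4' are distinguished only by the prime, which is why the backbone check runs first.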

Histology and image analysis

Digitized histology images of DEN-induced tumours 2 were obtained from Biostudies (accession S-BSST383 ).

Whole-slide images of tumours that met inclusion criteria (cellularity of more than 50% and DEN1 signature of more than 80%) were annotated in QuPath (v0.2.2) 88 using the polygon tool to include neoplastic tissue and exclude adjacent parenchyma, cyst cavities, processing artefacts and white space. For tumours with multiple transections, only a single whole-slide image was used. Annotations were reviewed for quality by a histopathologist (S.J.A.). Using Groovy in QuPath, annotated regions were tessellated into fixed-size, non-overlapping 256 × 256 µm tiles. For segmentation of epithelioid nuclei, a pre-trained StarDist 89 model (he_heavy_augment.zip) was downloaded from https://github.com/stardist/stardist-imagej/tree/master/src/main/resources/models/2D , and an inference instance was deployed using Groovy across the tiles in QuPath, built from source with TensorFlow 90 , with a minimum detection threshold of 0.5. Python (v3.9.7) was used for downstream analyses. Data were filtered to exclude extreme outliers: tiles with 43 or fewer nuclei were excluded, as were nuclei with an area of 227.18386 µm² or more, a circularity of 0.4841 or less, or non-computable circularity. From the 245 whole-slide images ( n = 237 mutationally asymmetric tumours and n = 8 symmetric tumours), 70,414 tiles were generated, and 9,999,783 nuclei were segmented (post-filtering). To compute inter-nuclear distance, for each nucleus in a tile represented by its x – y centroid coordinates, nearest neighbours were identified using the k -dimensional tree function from the spatial module of SciPy (v1.7.1) 91 . The Euclidean distance for each nearest neighbour pair was computed using the paired distances function from the metrics module of scikit-learn (v1.0.2) 92 . The median nuclear area, median nuclei per tile and median inter-nuclear distances were compared between asymmetric and symmetric tumours using a two-tailed Wilcoxon rank-sum test.
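The inter-nuclear distance calculation can be illustrated with the standard library alone. Note the substitution: this sketch uses a brute-force search in place of the SciPy k-dimensional tree and scikit-learn functions named above, which gives the same nearest-neighbour distances but scales quadratically with the number of nuclei. The centroid coordinates are invented.

```python
from math import dist
from statistics import median

def nearest_neighbour_distances(centroids):
    """Euclidean distance from each nucleus centroid to its nearest
    neighbour within the same tile (requires at least two centroids)."""
    if len(centroids) < 2:
        raise ValueError("need at least two centroids")
    return [min(dist(p, q) for j, q in enumerate(centroids) if j != i)
            for i, p in enumerate(centroids)]

# Hypothetical x-y centroids (in micrometres) for nuclei in one tile.
tile_centroids = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0), (10.0, 5.0)]
distances = nearest_neighbour_distances(tile_centroids)
median_distance = median(distances)
```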

Symmetric versus asymmetric tumour comparison

Mutationally symmetric tumours (defined above; more than 99% of autosomal mutations in genomic segments with abs( S ) < 0.2) were filtered to the subset that met the same inclusion criteria as the other n = 237 tumours analysed in this study (more than 50% cellularity (after adjusting for the presence of two genomes) and more than 80% substitution mutations attributed to the DEN1 signature). Eight tumours met these criteria. We subsequently show that these tumours are not whole-genome duplicated, but that they contain both daughter lineages of an originally mutagenized cell (Extended Data Fig. 10b ). For each autosomal variant in a tumour, we calculated its VAF quantile position among point mutations in that tumour, using the R ecdf function 93 . The quantile positions (range 0–1) were grouped into consecutive bins of 0.005 unit span; that is, 0.995–1.0 was the rightmost bin, representing the top 0.5% of VAF values for mutations in a tumour. The mutations within a VAF quantile bin were classified as either overlapping or not overlapping with the genomic span of the most highly expressed genes (stratum 6) using the R data.table foverlaps function 94 . The counts of overlapping and non-overlapping mutations from the focal tumour were compared, using a two-tailed Fisher’s exact test, with the equivalent counts aggregated from all asymmetric tumours (excluding the focal tumour in the case of asymmetric focal tumours for the calculation of background expectation). The same analysis was performed in aggregate for all symmetric tumours ( n = 8) compared with all asymmetric tumours ( n = 237). The calculations were repeated for each of the 200 consecutive bins to demonstrate the VAF range over which high VAF mutations are preferentially enriched in highly expressed genes specifically in symmetric tumours, as predicted under NER-TRIM.
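The VAF quantile binning can be sketched in Python as a stand-in for the R ecdf step. The treatment of bin edges (each bin includes its upper boundary) is an assumption, and the VAF values are synthetic.

```python
from bisect import bisect_right

def vaf_quantile_bins(vafs, n_bins=200):
    """Assign each mutation to a quantile bin (0 .. n_bins - 1) based on its
    empirical cumulative distribution value within the tumour."""
    ordered = sorted(vafs)
    n = len(ordered)
    bins = []
    for vaf in vafs:
        rank = bisect_right(ordered, vaf)  # number of values <= vaf
        # Integer ceiling of rank * n_bins / n, shifted to a 0-based index;
        # integer arithmetic avoids floating-point edge effects.
        bins.append((rank * n_bins + n - 1) // n - 1)
    return bins

# Synthetic tumour with 1,000 distinct VAF values.
vafs = [i / 1000 for i in range(1, 1001)]
bins = vaf_quantile_bins(vafs)
top_bin_vafs = [v for v, b in zip(vafs, bins) if b == 199]
```

With 200 bins of 0.005 span, the last bin (index 199) holds the top 0.5% of VAF values in the tumour.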

Computational analysis environment

Except where otherwise noted, analysis was performed in Conda environments and choreographed with Snakemake 72 running in an LSF 965 or Univa Grid Engine batch control system (Supplementary Table 3 ). Statistical tests were performed in R (v4.0.5) using fisher.test, ks.test, cor.test and wilcox.test functions for Fisher’s exact, Kolmogorov–Smirnov, Pearson’s and Spearman’s correlation and Wilcoxon tests, respectively. Graphics were generated using R.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Raw data files for all new datasets are available from Array Express and the ENA at the EMBL-EBI. Early–late Repli-seq accession numbers from the ENA: PRJEB72349 and PRJEB67994 . ATAC-seq accession number from Array Express: E-MTAB-11780 . ChIP–seq accession number from Array Express: E-MTAB-11959 .

Code availability

The analysis pipeline including Conda and Snakemake configuration files can be obtained without restriction from the repository https://git.ecdf.ed.ac.uk/taylor-lab/lceStrandInteractions .

Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature 578 , 94–101 (2020).

Aitken, S. J. et al. Pervasive lesion segregation shapes cancer genome evolution. Nature 583 , 265–270 (2020).

Burgers, P. M. J., Gordenin, D. & Kunkel, T. A. Who is leading the replication fork, Pol ε or Pol δ? Mol. Cell 61 , 492–493 (2016).

Baris, Y., Taylor, M. R. G., Aria, V. & Yeeles, J. T. P. Fast and efficient DNA replication with purified human proteins. Nature 606 , 204–210 (2022).

Seplyarskiy, V. B. et al. Error-prone bypass of DNA lesions during lagging-strand replication is a common source of germline and cancer mutations. Nat. Genet. 51 , 36–41 (2019).

Clausen, A. R. et al. Tracking replication enzymology in vivo by genome-wide mapping of ribonucleotide incorporation. Nat. Struct. Mol. Biol. 22 , 185–191 (2015).

Reijns, M. A. M. et al. Lagging-strand replication shapes the mutational landscape of the genome. Nature 518 , 502–506 (2015).

Agapov, A., Olina, A. & Kulbachinskiy, A. RNA polymerase pausing, stalling and bypass during transcription of damaged DNA: from molecular basis to functional consequences. Nucleic Acids Res . https://doi.org/10.1093/nar/gkac174 (2022).

Afek, A. et al. DNA mismatches reveal conformational penalties in protein–DNA recognition. Nature https://doi.org/10.1038/s41586-020-2843-2 (2020).

Pich, O. et al. Somatic and germline mutation periodicity follow the orientation of the DNA minor groove around nucleosomes. Cell 175 , 1074–1087.e18 (2018).

Mao, P., Smerdon, M. J., Roberts, S. A. & Wyrick, J. J. Asymmetric repair of UV damage in nucleosomes imposes a DNA strand polarity on somatic mutations in skin cancer. Genome Res . https://doi.org/10.1101/gr.253146.119 (2019).

Haradhvala, N. J. et al. Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair. Cell https://doi.org/10.1016/j.cell.2015.12.050 (2016).

Tomkova, M., Tomek, J., Kriaucionis, S. & Schuster-Böckler, B. Mutational signature distribution varies with DNA replication timing and strand asymmetry. Genome Biol. 19 , 129 (2018).

Degasperi, A. et al. Substitution mutational signatures in whole-genome-sequenced cancers in the UK population. Science 376 , abl9283 (2022).

Hu, J., Lieb, J. D., Sancar, A. & Adar, S. Cisplatin DNA damage and repair maps of the human genome at single-nucleotide resolution. Proc. Natl Acad. Sci. USA https://doi.org/10.1073/pnas.1614430113 (2016).

Hu, J., Adebali, O., Adar, S. & Sancar, A. Dynamic maps of UV damage formation and repair for the human genome. Proc. Natl Acad. Sci. USA 114 , 6758–6763 (2017).

Mao, P. et al. Genome-wide maps of alkylation damage, repair, and mutagenesis in yeast reveal mechanisms of mutational heterogeneity. Genome Res . https://doi.org/10.1101/gr.225771.117 (2017).

Poetsch, A. R., Boulton, S. J. & Luscombe, N. M. Genomic landscape of oxidative DNA damage and repair reveals regioselective protection from mutagenesis. Genome Biol. 19 , 215 (2018).

Hu, J., Adar, S., Selby, C. P., Lieb, J. D. & Sancar, A. Genome-wide analysis of human global and transcription-coupled excision repair of UV damage at single-nucleotide resolution. Genes Dev. 29 , 948–960 (2015).

Yimit, A., Adebali, O., Sancar, A. & Jiang, Y. Differential damage and repair of DNA-adducts induced by anti-cancer drug cisplatin across mouse organs. Nat. Commun. 10 , 309 (2019).

Verna, L., Whysner, J. & Williams, G. M. N -nitrosodiethylamine mechanistic data and risk assessment: bioactivation, DNA-adduct formation, mutagenicity, and tumor initiation. Pharmacol. Ther. 71 , 57–81 (1996).

Connor, F. et al. Mutational landscape of a chemically-induced mouse model of liver cancer. J. Hepatol. 69 , 840–850 (2018).

Singer, B. In vivo formation and persistence of modified nucleosides resulting from alkylating agents. Environ. Health Perspect. 62 , 41–48 (1985).

Chen, H.-J. C., Wang, Y.-C. & Lin, W.-P. Analysis of ethylated thymidine adducts in human leukocyte DNA by stable isotope dilution nanoflow liquid chromatography–nanospray ionization tandem mass spectrometry. Anal. Chem. 84 , 2521–2527 (2012).

Fu, D., Calvo, J. A. & Samson, L. D. Balancing repair and tolerance of DNA damage caused by alkylating agents. Nat. Rev. Cancer 12 , 104–120 (2012).

Guilliam, T. A. & Yeeles, J. T. P. Reconstitution of translesion synthesis reveals a mechanism of eukaryotic DNA replication restart. Nat. Struct. Mol. Biol. 27 , 450–460 (2020).

Meneghini, R., Cordeiro-Stone, M. & Schumacher, R. I. Size and frequency of gaps in newly synthesized DNA of xeroderma pigmentosum human cells irradiated with ultraviolet light. Biophys. J. 33 , 81–92 (1981).

Hedglin, M. & Benkovic, S. J. Eukaryotic translesion DNA synthesis on the leading and lagging strands: unique detours around the same obstacle. Chem. Rev . https://doi.org/10.1021/acs.chemrev.7b00046 (2017).

Chen, Y.-H. et al. Transcription shapes DNA replication initiation and termination in human cells. Nat. Struct. Mol. Biol. 26 , 67–77 (2019).

Koyanagi, E. et al. Global landscape of replicative DNA polymerase usage in the human genome. Nat. Commun. 13 , 7221 (2022).

Supek, F. & Lehner, B. Differential DNA mismatch repair underlies mutation rate variation across the human genome. Nature 521 , 81–84 (2015).

Sale, J. E. Translesion DNA synthesis and mutagenesis in eukaryotes. Cold Spring Harb. Perspect. Biol. 5 , a012708 (2013).

Powers, K. T. & Washington, M. T. Eukaryotic translesion synthesis: choosing the right tool for the job. DNA Repair 71 , 127–134 (2018).

Lou, J. et al. Rad18 mediates specific mutational signatures and shapes the genomic landscape of carcinogen-induced tumors in vivo. NAR Cancer 3 , zcaa037 (2021).

Kochenova, O. V., Daee, D. L., Mertz, T. M. & Shcherbakova, P. V. DNA polymerase ζ-dependent lesion bypass in Saccharomyces cerevisiae is accompanied by error-prone copying of long stretches of adjacent DNA. PLoS Genet. 11 , e1005110 (2015).

Isogawa, A., Ong, J. L., Potapov, V., Fuchs, R. P. & Fujii, S. Pol V-mediated translesion synthesis elicits localized untargeted mutagenesis during post-replicative gap repair. Cell Rep. 24 , 1290–1300 (2018).

Póti, Á., Szikriszt, B., Gervai, J. Z., Chen, D. & Szüts, D. Characterisation of the spectrum and genetic dependence of collateral mutations induced by translesion DNA synthesis. PLoS Genet. 18 , e1010051 (2022).

Kaiser, V. B., Taylor, M. S. & Semple, C. A. Mutational biases drive elevated rates of substitution at regulatory sites across cancer types. PLoS Genet. 12 , e1006207 (2016).

Voong, L. N. et al. Insights into nucleosome organization in mouse embryonic stem cells through chemical mapping. Cell 167 , 1555–1570.e15 (2016).

Matsumoto, S. et al. DNA damage detection in nucleosomes involves DNA register shifting. Nature 571 , 79–84 (2019).

Siggers, T. & Gordân, R. Protein–DNA binding: complexities and multi-protein codes. Nucleic Acids Res. https://doi.org/10.1093/nar/gkt1112 (2014).

Huang, J. C., Svoboda, D. L., Reardon, J. T. & Sancar, A. Human nucleotide excision nuclease removes thymine dimers from DNA by incising the 22nd phosphodiester bond 5′ and the 6th phosphodiester bond 3′ to the photodimer. Proc. Natl Acad. Sci. USA 89 , 3664–3668 (1992).

Hu, J. et al. Genome-wide mapping of nucleotide excision repair with XR-seq. Nat. Protoc. 14 , 248–282 (2019).

Yeeles, J. T. P., Poli, J., Marians, K. J. & Pasero, P. Rescuing stalled or damaged replication forks. Cold Spring Harb. Perspect. Biol. 5 , a012815 (2013).

Gabbai, C. B., Yeeles, J. T. P. & Marians, K. J. Replisome-mediated translesion synthesis and leading strand template lesion skipping are competing bypass mechanisms. J. Biol. Chem. 289 , 32811–32823 (2014).

Cranford, M. T., Kaszubowski, J. D. & Trakselis, M. A. A hand-off of DNA between archaeal polymerases allows high-fidelity replication to resume at a discrete intermediate three bases past 8-oxoguanine. Nucleic Acids Res. 48 , 10986–10997 (2020).

Anand, J. et al. Roles of trans-lesion synthesis (TLS) DNA polymerases in tumorigenesis and cancer therapy. NAR Cancer 5 , zcad005 (2023).

Levine, R. L. et al. Translesion DNA synthesis catalyzed by human Pol η and Pol κ across 1, N 6-ethenodeoxyadenosine. J. Biol. Chem. 276 , 18717–18721 (2001).

Tirman, S. et al. Temporally distinct post-replicative repair mechanisms fill PRIMPOL-dependent ssDNA gaps in human cells. Mol. Cell 81 , 4026–4040.e8 (2021).

MacPherson, M. J. & Sadowski, P. D. The CTCF insulator protein forms an unusual DNA structure. BMC Mol. Biol. 11 , 101 (2010).

Pugacheva, E. M. et al. CTCF mediates chromatin looping via N-terminal domain-dependent cohesin retention. Proc. Natl Acad. Sci. USA 117 , 2020–2031 (2020).

Carvajal-Garcia, J., Samadpour, A. N., Hernandez Viera, A. J. & Merrikh, H. Oxidative stress drives mutagenesis through transcription-coupled repair in bacteria. Proc. Natl Acad. Sci. USA 120 , e2300761120 (2023).

Kozmin, S. G. & Jinks-Robertson, S. The mechanism of nucleotide excision repair-mediated UV-induced mutagenesis in nonproliferating cells. Genetics 193 , 803–817 (2013).

Perera, D. et al. Differential DNA repair underlies mutation hotspots at active promoters in cancer genomes. Nature 532 , 259–263 (2016).

Mao, P. et al. ETS transcription factors induce a unique UV damage signature that drives recurrent mutagenesis in melanoma. Nat. Commun. 9 , 2626 (2018).

Sasaki, S. et al. Chromatin-associated periodicity in genetic variation downstream of transcriptional start sites. Science 323 , 401–404 (2009).

Morganella, S. et al. The topography of mutational processes in breast cancer genomes. Nat. Commun. 7 , 11383 (2016).

Lilue, J. et al. Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci. Nat. Genet. 50 , 1574–1583 (2018).

Howe, K. L. et al. Ensembl 2021. Nucleic Acids Res. 49 , D884–D891 (2021).

Gori, K. & Baez-Ortega, A. sigfit: flexible Bayesian inference of mutational signatures. Preprint at bioRxiv https://doi.org/10.1101/372896 (2020).

Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25 , 2078–2079 (2009).

Marchal, C. et al. Genome-wide analysis of replication timing by next-generation sequencing with E/L Repli-seq. Nat. Protoc. 13 , 819–839 (2018).

Darlington, G. J., Bernhard, H. P., Miller, R. A. & Ruddle, F. H. Expression of liver phenotypes in cultured mouse hepatoma cells. J. Natl Cancer Inst. 64 , 809–819 (1980).

Kress, S. et al. p53 Mutations are absent from carcinogen-induced mouse liver tumors but occur in cell lines established from these tumors. Mol. Carcinog. 6 , 148–158 (1992).

Tubbs, A. et al. Dual roles of poly(dA:dT) tracts in replication initiation and fork collapse. Cell 174 , 1127–1142.e19 (2018).

Otlu, B. et al. Topography of mutational signatures in human cancer. Cell Rep. 42 , 112930 (2023).

Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9 , 357–359 (2012).

Petryk, N. et al. Replication landscape of the human genome. Nat. Commun. 7 , 10208 (2016).

Quinlan, A. R. & Hall, I. M. bedtools: A flexible suite of utilities for comparing genomic features. Bioinformatics 26 , 841–842 (2010).

Muggeo, V. M. R. Estimating regression models with unknown break-points. Stat. Med. 22 , 3055–3071 (2003).

Corces, M. R. et al. An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 14 , 959–962 (2017).

Mölder, F. et al. Sustainable data analysis with Snakemake. F1000Res. 10 , 33 (2021).

Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. https://doi.org/10.14806/ej.17.1.200 (2011).

Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25 , 1754–1760 (2009).

Broad Institute. Picard Tools. Broad Institute GitHub Repository http://broadinstitute.github.io/picard (2019).

Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10 , 1213–1218 (2013).

Zhang, Y. et al. Model-based analysis of ChIP–seq (MACS). Genome Biol. 9 , R137 (2008).

Hinrichs, A. S. et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 34 , D590–D598 (2006).

Schmidt, D. et al. ChIP–seq: using high-throughput sequencing to discover protein–DNA interactions. Methods https://doi.org/10.1016/j.ymeth.2009.03.001 (2009).

Aitken, S. J. et al. CTCF maintains regulatory homeostasis of cancer pathways. Genome Biol. 19 , 106 (2018).

Brown, G. GreyListChIP: grey lists — mask artefact regions based on ChIP inputs. Bioconductor www.bioconductor.org/packages/release/bioc/html/GreyListChIP.html (2021).

Castro-Mondragon, J. A. et al. JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 50 , D165–D173 (2022).

Grant, C. E., Bailey, T. L. & Noble, W. S. FIMO: scanning for occurrences of a given motif. Bioinformatics 27 , 1017–1018 (2011).

Yin, M. et al. Molecular mechanism of directional CTCF recognition of a diverse range of genomic sites. Cell Res. 27 , 1365–1377 (2017).

Hashimoto, H. et al. Structural basis for the versatile and methylation-dependent binding of CTCF to DNA. Mol. Cell 66 , 711–720.e3 (2017).

The PyMOL Molecular Graphics System, version 2.5.2 (Schrödinger, LLC, 2015).

Kayikci, M. et al. Visualization and analysis of non-covalent contacts using the Protein Contacts Atlas. Nat. Struct. Mol. Biol. 25 , 185–194 (2018).

Bankhead, P. et al. QuPath: open source software for digital pathology image analysis. Sci. Rep. 7 , 16878 (2017).

Schmidt, U., Weigert, M., Broaddus, C. & Myers, G. in Medical Image Computing and Computer Assisted Intervention – MICCAI 2018. Lecture Notes in Computer Science Vol. 11071 (eds Frangi, A. Schnabel, J., Davatzikos, C., Alberola-López, C. & Fichtinger, G.) 256–273 (Springer, Cham, 2018).

Abadi, M. et al. TensorFlow: Large-scale machine learning on heterogeneous systems. Zenodo https://doi.org/10.5281/zenodo.4724125 (2015).

Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17 , 261–272 (2020).

Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12 , 2825–2830 (2011).

R Core Team. R: A Language and Environment for Statistical Computing https://www.R-project.org/ (R Foundation for Statistical Computing, 2020).

Dowle, M. & Srinivasan, A. data.table: Extension of ‘data.frame’. CRAN https://CRAN.R-project.org/package=data.table (2019).

Acknowledgements

We thank P. Flicek for LCE Consortium management, computational resources and ATAC-seq support; P. Bankhead for supervision of image processing; T. Deegan, V. Seplyarskiy and M. Andrianova for informative discussions; N. Hastie, C. Ponting and W. Bickmore for comments on the manuscript; M. Roller for assistance with data curation; the CRUK Cambridge Institute Core facilities for their valuable contribution: CRUK Biological Resources (A. Mowbray), Genomics (P. Coupland) and Bioinformatics (G. Brown and M. Eldridge); Edinburgh Genomics, The University of Edinburgh for provision of sequencing services; and the European Molecular Biology Laboratory for access to computational resources. This work was supported by the MRC Human Genetics Unit core funding programme grants (MC_UU_00007/11, MC_UU_00007/16 and MC_UU_00035/2), MRC Toxicology Unit core funding (RG94521), Cancer Research UK Cambridge Institute funding (20412 and 22398) and European Molecular Biology Laboratory core funding. Support was also provided from specific research grants: PID2021-126568OB-I00 (CHEMOHEALTH) project, funded by the Spanish Ministry of Science (MCIN, AEI/10.13039/501100011033/); the Wellcome Trust (WT202878/B/16/Z); the European Research Council (615584 and 788937); Helmholtz NCT (DKFZ abteiling B270); the US NIH (R01GM083337); and the MRC equipment award (MC_PC_MR/X013677/1). Edinburgh Genomics is partly supported through core grants from the NERC (R8/H10/56), the MRC (MR/K001744/1) and the BBSRC (BB/J004243/1). J.C. was supported by a Wellcome Trust PhD Training Fellowship for Clinicians (WT223088/Z/21/Z) as part of the Edinburgh Clinical Academic Track (ECAT) programme. M.D.N. is a cross-disciplinary post-doctoral fellow supported by funding from the CRUK Brain Tumour Centre of Excellence Award (C157/A27589). O.P. 
was funded by a BIST PhD fellowship supported by the Secretariat for Universities and Research of the Ministry of Business and Knowledge of the Government of Catalonia and the Barcelona Institute of Science and Technology. V.S. was supported by an EMBL Interdisciplinary Postdoc (EIPOD) fellowship under Marie Skłodowska Curie actions COFUND (664726). P.-C.W. is supported by the ERC Starting Grant (BrainBreaks 949990) and a Helmholtz Young Investigator grant. S.J.A. received a Wellcome Trust PhD Training Fellowship for Clinicians (WT106563/Z/14/Z), a National Institute for Health and Care Research (NIHR) Clinical Lectureship and a CRUK Clinician Scientist Fellowship (RCCCSF-May23/100001).

Author information

Authors and affiliations.

Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK

Craig J. Anderson, Lana Talmane, Juliet Luft, John Connelly, Jan C. Verburg, Susan Campbell, Stuart Aitken, Ailith Ewing, Vera B. Kaiser, Colin A. Semple & Martin S. Taylor

Medical Research Council Toxicology Unit, University of Cambridge, Cambridge, UK

John Connelly, Claudia Arnedo-Pac & Sarah J. Aitken

Edinburgh Pathology, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK

John Connelly

Laboratory Medicine, NHS Lothian, Edinburgh, UK

CRUK Scotland Centre, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK

Michael D. Nicholson

Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain

Oriol Pich, Claudia Arnedo-Pac, Erika López-Arribillaga, Inés Sentís & Núria López-Bigas

Brain Mosaicism and Tumorigenesis (B400), German Cancer Research Center (DKFZ), Heidelberg, Germany

Marco Giaisi & Pei-Chi Wei

European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK

Vasavi Sundaram, Maëlle Daunesse, Paul Flicek & Elissavet Kentepozidou

Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK

Frances Connor, Ruben M. Drews, Christine Feig, Margus Lukk, Tim F. Rayner, Duncan T. Odom & Sarah J. Aitken

Division of Regulatory Genomics and Cancer Evolution (B270), German Cancer Research Center (DKFZ), Heidelberg, Germany

Paul A. Ginno & Duncan T. Odom

San Diego Biomedical Research Institute, San Diego, CA, USA

Takayo Sasaki & David M. Gilbert

Universitat Pompeu Fabra (UPF), Barcelona, Spain

Núria López-Bigas

Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain

Centro de Investigación Biomédica en Red en Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain

Department of Pathology, University of Cambridge, Cambridge, UK

  • Sarah J. Aitken

Department of Histopathology, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK


Liver Cancer Evolution Consortium

  • Stuart Aitken
  • Craig J. Anderson
  • Claudia Arnedo-Pac
  • John Connelly
  • Frances Connor
  • Maëlle Daunesse
  • Ruben M. Drews
  • Ailith Ewing
  • Christine Feig
  • Paul Flicek
  • Paul A. Ginno
  • Vera B. Kaiser
  • Elissavet Kentepozidou
  • Erika López-Arribillaga
  • Núria López-Bigas
  • Juliet Luft
  • Margus Lukk
  • Duncan T. Odom
  • Oriol Pich
  • Tim F. Rayner
  • Colin A. Semple
  • Inés Sentís
  • Vasavi Sundaram
  • Lana Talmane
  • Martin S. Taylor
  • Jan C. Verburg

Contributions

M.S.T. conceived the project and designed the analyses. F.C. and S.J.A. performed the mouse experiments and collected tissue samples. S.J.A. performed the ChIP–seq experiments. S.C. performed the ATAC-seq experiments. M.G., P.-C.W., T.S. and D.M.G. performed and supervised the Repli-seq experiments. C.J.A., L.T., J.L., M.D.N., J.C.V. and M.S.T. designed and performed the computational analysis of genomic data. O.P., V.S. and P.A.G. provided supporting genomic analyses. J.C. and S.J.A. annotated and analysed the histology images. J.C. performed the computational image analysis. N.L.-B., C.A.S., D.T.O., S.J.A. and M.S.T. led the Liver Cancer Evolution Consortium and supervised the work. Research support to D.T.O., S.J.A. and M.S.T. funded the work. D.T.O., S.J.A. and M.S.T. wrote the manuscript, with contributions from C.J.A., L.T., J.L., J.C., M.D.N. and J.C.V. All authors had the opportunity to edit the manuscript. All authors approved the final manuscript.

Corresponding authors

Correspondence to Duncan T. Odom, Sarah J. Aitken or Martin S. Taylor.

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature thanks Boris Pfander and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Exemplar tumour genome demonstrating mutation asymmetry from lesion segregation.

a , Mutational summary of one DEN induced tumour; the tumour genome represented by the shared x-axis and chromosome boundaries marked with dashed vertical lines. Mutations are called relative to the forward strand of the reference genome and shown as coloured points stratified type (C → N, T → N, A → N, G → N). Y-axis positions show the genomic distance to the next mutation of the same type and plotted on a log 10 scale. Mutations of type T → N and A → N are complements of each other and plotted on opposite sides of the asymmetry segmentation track with inverted y-axis orientations (y-axis arrows). The same for C → N versus G → N mutations. Genomic segmentation by T → N/A → N mutation asymmetry is plotted showing genomic segments where mutations have arisen from forward strand lesions (blue), reverse strand lesions (gold), or where one chromosome has forward and the other reverse strand lesions meaning that they cancel each other out (grey). Hemizygous X chromosomes are always mutationally asymmetric. The asymmetry score is calculated as S = (forward-reverse)/(forward+reverse) where forward and reverse are the sequence composition adjusted rates of T → N and A → N mutations. Both average total mutation rate and read coverage are typically uniform across the autosomal portion of the tumour genomes. b , The mutational asymmetry calculated from T → N/A → N mutations (x-axis) and C → N/G → N mutations (y-axis) in 5 Mb windows over the genome is closely correlated, consistent with the interpretation that most mutagenic adducts in these tumours are on T and C nucleotides 2 and supported by reduced mutation rates when T and C are on the transcriptional template strand (Extended Data Fig. 7 ).
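
The asymmetry score defined in panel a can be illustrated with a short Python sketch. This is not the authors' pipeline: the function names and the 0.33 classification threshold are hypothetical, chosen only for illustration, and the input rates are assumed to be already adjusted for sequence composition.

```python
def asymmetry_score(forward_rate: float, reverse_rate: float) -> float:
    """S = (forward - reverse) / (forward + reverse), bounded between -1 and 1.

    forward_rate: composition-adjusted T->N mutation rate in a genomic window.
    reverse_rate: composition-adjusted A->N mutation rate in the same window.
    """
    total = forward_rate + reverse_rate
    if total == 0:
        raise ValueError("no mutations in window; asymmetry score is undefined")
    return (forward_rate - reverse_rate) / total


def classify_segment(score: float, threshold: float = 0.33) -> str:
    """Label a segment by its asymmetry score (threshold is a hypothetical cut-off)."""
    if score > threshold:
        return "forward"    # mutations mostly from forward strand lesions
    if score < -threshold:
        return "reverse"    # mutations mostly from reverse strand lesions
    return "symmetric"      # forward and reverse contributions cancel out


if __name__ == "__main__":
    s = asymmetry_score(3.0, 1.0)
    print(s, classify_segment(s))
```

A score near +1 or -1 corresponds to the blue or gold segments described above, while a score near 0 corresponds to the grey segments.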

Extended Data Fig. 2 Quantifying replication fork directionality.

a , Replication time profile of an example 15 Mb of C3H genome chromosome 8 (x-axis, shared with panel c ). Curves show early/late (EL) replication relative enrichment (E and L read counts normalised to their respective library read depth, then relative enrichment, RE = (E − L)/(E + L)) where more positive values indicate earlier replication and more negative values indicate later replication. Replication profiles shown for a mouse embryonic stem cell line (E14TG2a, tan) and mouse hepatocyte derived cell lines (Hep-74.3a, red; Hepa1-6, brown). Blue dash line indicates the centre of a strong replication origin region (schematic) and is projected into panel c for comparison. b , Schematic illustrating two alternate strategies to generate replication fork directionality measures (RFD). Left side, E/L-Repli-seq (top) can be used to derive Repli-seq based replication fork RFD (repli-RFD; bottom). On the right side, Okazaki fragment sequencing based RFD (OK-RFD). c , Smoothed derivatives of Hep-74.3a E/L-Repli-seq data (red, panel a ) provides an RFD estimate. Comparison to OK-seq data from another differentiated cell type (pink, activated B-cells) shows overall good concordance but captures some replication profile differences between cells (grey triangle). d , Kernel density plot summarising the genome-wide correlation of B-cell derived OK-RFD (x-axis) and Hep-74.3a derived repli-RFD (y-axis), both at 10 kb resolution. Only high-concordance genomic intervals between blue stepped lines (21 quantile boundaries) were used for RFD based measures of liver tumour mutation rate. e , Validation of the E/L-Repli-seq to RFD measure in human RPE-1 cells where both OK-seq (grey) and E/L-Repli-seq (black) has been generated and used to calculate RFD. The curves are shown over a 15 Mb interval of human chromosome 8 and illustrate a high concordance of RFD profile. 
Although both traces are plotted at 10 kb resolution, the smoothing and processing required to calculate RFD from E/L-Repli-seq averages out some of the fine grained structure evident in the OK-seq derived profile. f , Kernel density plot summarising the OK-seq (x-axis) and E/L-Repli-seq (y-axis) RFD estimates for RPE-1 cells, as for panel d .
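
As a minimal sketch of the relative enrichment measure in panel a, the following Python function normalises early and late read counts to their library depths and then computes RE = (E − L)/(E + L). The function and parameter names are hypothetical, and the real analysis works on binned genome-wide data rather than single values.

```python
import math


def relative_enrichment(early_reads: float, late_reads: float,
                        early_depth: float, late_depth: float) -> float:
    """RE = (E - L) / (E + L), with E and L normalised to library read depth.

    Positive values indicate earlier replication, negative values later.
    Returns NaN when a window has no reads in either fraction.
    """
    e = early_reads / early_depth
    l = late_reads / late_depth
    if e + l == 0:
        return math.nan
    return (e - l) / (e + l)


if __name__ == "__main__":
    # Three times as many early reads as late reads at equal depth.
    print(relative_enrichment(30, 10, 1_000_000, 1_000_000))
```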

Extended Data Fig. 3 Transcription and replication time influence DNA damage induced mutation rate but replication strand bias has negligible impact.

a , Relative enrichment (RE) of early versus late replication time for 21 quantile bins of replication fork direction bias (RFD, x-axis shared with b-d ). Relative enrichment calculated as RE = (early−late)/(early+late) using the number of nucleotides annotated as early or late replicating in each of the RFD bins. b , Percent of genic nucleotides in each quantile bin, stratified as transcribed (red, >1 transcript per million (TPM) in P15 mouse liver) or non-transcribed (grey). c , Relative enrichment of strand-biassed transcription across RFD bins (RE = (forward-reverse)/(forward+reverse)) calculated using the number of nucleotides contained within the transcription strand resolved genomic span of expressed genes (panel b ). d , Mutation rate (nucleotide composition normalised) for RFD bins calculated separately for forward strand and reverse strand lesions, 95% C.I. (whiskers) from bootstrap sampling. e , Percentage of nucleotides that are transcribed (>1 TPM, P15 mouse liver) in each of the 21 quantile bins of replication strand bias (RSB, x-axis shared with f ). RSB is the RFD metric but all data oriented so that lesions would be on the reverse strand. f , Mutation rates for the 21 RSB bins. g , Mutation rates (y-axis) points and RSB bins identical to panel f , but x-axis shows the percent of nucleotides with transcription over a lesion strand template, illustrating that transcription using a lesion containing strand is the main determinant of mutation rate. Linear modelling (shaded area 95% C.I.) and extrapolation of this correlation accurately predicts the observed mutation rate in non-genic regions (orange point). h , Mutation rates (y-axis) for the whole genome (gold) stratified into 21 quantile bins of RSB (x-axis). Equivalent analysis is shown for fractions of the genome contained within expressed genes (tan) and non-genic regions (orange). This is a repeat of the analysis shown in Fig. 
1f confirming the results using Repli-seq data from a second independent hepatocyte cell line (Hepa1-6 ( h ), rather than Hep-74.3a (Fig. 1f ) that is used except where otherwise stated). i , Multivariate regression modelling based on 10 kb consecutive genomic windows finds all five tested parameters make nominally significant (right of the dashed line), independent contributions to variation in mutation rate (calculated separately for forward strand and reverse strand lesions, blue and gold, respectively). The predominant contributions are transcription over a lesion containing template strand and to a lesser extent replication time. Residual genomic annotation (annotated genes not meeting the >1 TPM threshold for expression) is notably significant, indicating sub-threshold expression contributes to reducing the mutation rate. The results are highly reproducible, independently using either Hep-74.3a and Hepa1-6 Repli-seq measures (circles and crosses, respectively). j , Multi-regression analysis considering only 10 kb segments that are >5 kb from annotated genes, demonstrates significant replication time influences on mutation rate but that replication strand bias does not significantly influence the mutation rate. Forward strand lesions (blue) and reverse strand lesions (gold) calculated separately.
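
Several panels in this figure report 95% confidence intervals from bootstrap sampling. The caption does not say what is resampled, so the sketch below makes an explicit assumption: genomic windows within a bin are resampled with replacement and a percentile interval is taken. All names and defaults are hypothetical.

```python
import random


def bootstrap_rate_ci(windows, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a mutation rate.

    windows: list of (mutation_count, analysable_nucleotides) tuples,
             one per genomic window in the bin (an assumed data layout).
    Returns (lower, upper) bounds on mutations per nucleotide.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_boot):
        # Resample windows with replacement, then recompute the pooled rate.
        sample = rng.choices(windows, k=len(windows))
        mutations = sum(m for m, _ in sample)
        nucleotides = sum(n for _, n in sample)
        rates.append(mutations / nucleotides)
    rates.sort()
    lower = rates[int((alpha / 2) * n_boot)]
    upper = rates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    example = [(3, 10_000), (1, 10_000), (4, 10_000), (0, 10_000), (2, 10_000)]
    print(bootstrap_rate_ci(example))
```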

Extended Data Fig. 4 Replication time correlates with mutation rate partly independent of transcription.

a-c , The genome was partitioned into 21 quantile bins of replication time, relative enrichment (shared x-axis, RE = (early−late)/(early+late)) a , Percent of genic nucleotides in each quantile bin, stratified as transcribed (red, >1 transcript per million (TPM) in P15 mouse liver) or non-transcribed (grey). b , Relative enrichment of strand-biassed transcription across replication time bins (RE = (forward−reverse)/(forward+reverse)) calculated using the number of nucleotides contained within the transcription strand resolved genomic span of expressed genes (panel a ). c , Mutation rates (y-axis) for the whole genome (black, 95% C.I. whiskers). A linear regression 95% C.I. shown as a corresponding shaded area. Equivalent analysis is also shown, restricted to only expressed genes (mid-grey) and non-genic regions (light-grey).

Extended Data Fig. 5 Tracts of low-fidelity replication downstream of lesion induced mutations.

a , Genome-wide mutation signature of DEN induced tumours. b , Signature of mutation cluster upstream (5′) position mutations, oriented so the lesion containing strand is the replication template. c , Signature of downstream mutations in the cluster (2.2% of clusters have two downstream mutations). d , Frequency distribution of the spacing between adjacent observed (dark-red) and simulated (pink) mutations for all tumours (n = 237). The simulated data were generated by sampling mutations across all other tumours to create proxy tumour datasets with identical mutation counts (see Methods). Main histogram shows only closest spaced mutations, inset graph shows full distribution of both observed and simulated, blue arrow indicates x-axis area expanded in main histogram. Excess clustering of observed mutations (blue arrow) accounts for only 0.8% of the total mutation burden. e , Clustered mutation pairs co-occur in the same sequencing read, confirming they are on the same DNA duplex. Expected (pink) is analogous to two heads or two tails from consecutive flips of a fair coin. f , Multiallelism is a hallmark of lesion templated mutations 2 . The multiallelic rate (y-axis, fraction of mutation sites with multiallelic variation) for simulated data (pink spots). Curve shows best-fit spline (25 degrees of freedom) for the downstream mutations. g , As for ( f ) but showing observed data (red), demonstrating a pronounced and specific depletion of multiallelic variation immediately downstream of the cluster 5′ mutation (yellow circle and arrow). h , Heatmap summarising cosine similarity between mutation clusters with different inter-mutation spacing (schematic in lower panel). Upstream (5′) cluster mutations closely match the genome wide mutation spectrum. Mutations 3 to 10 nt downstream of the 5′ mutation share a common signature. i-n , Mutation signature profiles for clustered mutations; distance from the upstream mutation (number in brown circle) relate to schematic in h . 
Mutation counts in each category indicated below the plot. o , The mutation spectrum of downstream mutations closely matches between leading and lagging strand replication (strongly RSB regions, absolute RSB > 0.2). The observed cosine similarity between mutation spectra is robustly within the range expected by random permutation of mutations between leading and lagging strands (n = 10⁵ permutations, two tailed empirical p = 0.18). p , The distribution of mutation cluster length also matches between leading (black) and lagging (red) strands (no significant difference; two sided Kolmogorov-Smirnov test p = 0.15). q , Simulations show >98% power to detect a ≥ 4% difference in the distribution of cluster lengths for strongly RSB regions of the genome.
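
The comparison in panel o, cosine similarity between two mutation spectra tested against random permutation of strand labels, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: mutations are represented as category labels, and the two-tailed empirical p-value is taken as twice the smaller tail with a +1 correction, which is one common convention the caption does not confirm.

```python
import math
import random
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("a spectrum with no mutations has no direction")
    return dot / norm


def spectrum(mutations, categories):
    """Count mutations per category, in a fixed category order."""
    counts = Counter(mutations)
    return [counts[c] for c in categories]


def permutation_test(leading, lagging, categories, n_perm=1000, seed=0):
    """Return (observed similarity, two-tailed empirical p-value)."""
    rng = random.Random(seed)
    observed = cosine_similarity(spectrum(leading, categories),
                                 spectrum(lagging, categories))
    pooled = list(leading) + list(lagging)
    n_lead = len(leading)
    lower = upper = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign mutations to strands at random
        sim = cosine_similarity(spectrum(pooled[:n_lead], categories),
                                spectrum(pooled[n_lead:], categories))
        lower += sim <= observed
        upper += sim >= observed
    p = min(1.0, 2 * (min(lower, upper) + 1) / (n_perm + 1))
    return observed, p
```

The caption reports 10⁵ permutations; the default of 1,000 here only keeps the example fast.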

Extended Data Fig. 6 DNA damage induces deletion mutations at damaged bases and collateral insertion mutagenesis.

a , A deletion or insertion mutation with a proximal substitution can often be explained by multiple equally scoring alignments. Two example sequences can be aligned with a single gap (dash) and substitution (blue line), in this case with two possible solutions. To avoid systematic biases in gap placement by alignment and mutation calling software, all equally optimal alignments are calculated, the distance between gap and substitution measured for each and count value distributed equally between possible solutions (weight). b , As ( a ) but gap and substitution position are not immediately adjacent. c , As ( a ) but demonstrating an example with seven equally scoring solutions where the substitution could be assigned to either upstream or downstream of the insertion/deletion. d , Frequency distribution of the distance between insertion or deletion (indel) mutations and their closest proximal substitution mutation (black curve), demonstrating a high degree of spatial clustering within 10 bp. The permuted expectation (pink) was calculated by measuring the distance to the nearest substitution in a permuted set of substitutions sampled from other tumours (Methods). Confidence intervals (95%, light pink) on the permuted set were calculated from 100 permuted sets of substitutions. Inset graph shows the same data plotted with the y-axis on a log 10 scale. Counts for both observed and permuted are the sums of the weighted counts for each distance as illustrated in ( a-c ). e , Schematic to show how indel and substitution mutation clusters are oriented by the lesion containing strand in subsequent plots, and that the position of the insertion or deletion is set as x = 0. The subsequent plots ( f-i ) also show cases where all optimal alignments agree on the upstream/downstream placement of the substitution relative to the indel (dark blue, e.g. panel b ) as distinct from where that assignment is ambiguous (light blue, e.g. panel c ). 
f , Substitutions are strongly clustered around 1 bp deletions and biassed towards a downstream location. Inset shows the density plot for 10,000 permutations of the observed data where the assignment of the lesion strand was randomly permuted (grey) compared with the observed level of upstream/downstream bias (calculated as bias = (down−up)/(down+up)). Two-sided p-values were empirically derived from the permutations. g , Deletions >1 bp are rarely clustered with substitutions and do not show a significant upstream/downstream bias. h , Single base insertions are clustered with substitutions and are significantly biassed to upstream of the insertion. i , Longer insertions show similar clustering trends to 1 bp insertions but do not reach statistical significance.
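
Two calculations in this caption lend themselves to a small worked example: the equal weighting of ambiguous alignments (panels a-c) and the upstream/downstream bias (panel f). The sketch below assumes each indel is represented by a list of gap-to-substitution distances, one per equally optimal alignment; that data layout and the function names are hypothetical.

```python
from collections import defaultdict


def weighted_distance_counts(events):
    """Distribute one count per indel equally across its optimal alignments.

    events: list of lists; each inner list holds the gap-to-substitution
            distance from every equally scoring alignment of one indel.
    """
    counts = defaultdict(float)
    for distances in events:
        if not distances:
            continue  # nothing to count for an indel with no alignment
        weight = 1.0 / len(distances)
        for d in distances:
            counts[d] += weight
    return dict(counts)


def directional_bias(down: float, up: float) -> float:
    """bias = (down - up) / (down + up); positive means downstream excess."""
    return (down - up) / (down + up)


if __name__ == "__main__":
    # One indel with two equally good alignments, one with a single alignment.
    print(weighted_distance_counts([[1, 2], [1]]))
    print(directional_bias(3, 1))
```

Because the weights for each indel sum to one, the totals remain comparable between observed and permuted data.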

Extended Data Fig. 7 Transcription and lesion repair have strand-specific, expression-dependent mutation signatures.

a , Mature transcript expression and nascent transcription (intron mapping RNA-seq reads) are highly correlated; one point per gene. b , As for panel a but restricted to the genes spanning in aggregate across tumours >2 million nucleotides of strand resolved tumour genome (n = 3,392). c , Mature transcript gene expression (x-axis) negatively correlates with composition normalised mutation rate (y-axis) where lesions are on the transcription template strand (one red point per gene). Red curve shows the best-fit spline (8 degrees of freedom) through the red points. Black points show gene expression measures for centile bins of gene expression. d , As for c , but x-axis shows nascent RNA estimates of transcription. P-values for panels a-d are too small to precisely calculate (p < 2.2 × 10 −16 ). e , Nucleotide order used for 192 category mutation spectra in panels f-i . Expanded segment shows the flanking nucleotide context for C → A mutations; the same ordering of flanking nucleotides is used for all mutation types. f-i , Mutation rate spectra for non-expressed (stratum 1) genes are closely matched for template ( f ) and non-template ( g ) lesion strands. For highly expressed genes (stratum 6), the mutation rate is reduced for both strands and the spectrum differs between template strand ( h ) and non-template strand ( i ) lesions. j , The profile of lesion repair efficiency differs between template strand lesions and non-template strand lesions of expressed genes. Repair efficiency is calculated as the percent change in mutation rate for a trinucleotide sequence context (n = 64 categories) relative to the average for both strands in non-expressed genes (stratum 1). The y-axis is inverted to indicate reduction in mutation rate from increased repair. Transcription coupled repair shows similar efficiency for C and T lesions on the template strand. Transcription associated repair on the non-template strand shows preferential repair of C lesions compared to T lesions. 
Mutations from apparent A lesions (and to a lesser extent G lesions) are rare and, as shown in subsequent sections, should not be evaluated as lesions on the indicated nucleotide, but are included here for completeness (y-axis values < -10 truncated).
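
The repair efficiency measure in panel j is a percent change relative to a baseline, which the following sketch makes concrete for a single trinucleotide context. The function names are hypothetical, and the example rates are invented purely to show the arithmetic.

```python
def baseline_rate(template_rate_s1: float, nontemplate_rate_s1: float) -> float:
    """Average mutation rate of both strands in non-expressed genes (stratum 1)."""
    return (template_rate_s1 + nontemplate_rate_s1) / 2


def repair_efficiency(rate: float, baseline: float) -> float:
    """Percent change in mutation rate relative to the stratum 1 baseline.

    Negative values mean a reduced mutation rate, interpreted in the
    caption as increased repair.
    """
    return 100.0 * (rate - baseline) / baseline


if __name__ == "__main__":
    base = baseline_rate(10.0, 10.0)        # illustrative rates only
    print(repair_efficiency(5.0, base))     # halved mutation rate
```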

Extended Data Fig. 8 Mutation enrichment and depletion at transcription factor binding sites (TFBS).

a , The compositionally corrected mutation rate shows helical (10 bp) periodicity over nucleosomes. Separating the mutation rates by the lesion containing strand (blue, forward; gold, reverse) reveals two partially offset periodic profiles (top panel). Orientating both strands 5′ → 3′ demonstrates that the profiles are mirror images (bottom panel). Mutation rate peaks (black) correspond to regions where the DNA major groove faces into the histones, and valleys (red) where the major groove faces outward. Mutation enrichment is shown with shaded 95% bootstrap confidence intervals (blue, gold). b , For the lesion containing strand, mutation rates are significantly higher for the peaks on the 3′ side of the nucleosome dyad than on the 5′ side (significant p-values shown, two tailed Wilcoxon tests). c , Comparing the compositionally corrected multiallelic rates shows significantly increased multiallelic variation for the 3′ peaks (significant p-values shown, two tailed Wilcoxon test), indicating the increased mutation rate results from slower repair on the 3′ side of the dyad. d , The molecular structure of the CTCF:DNA interface (top) reflects the strand specific mutation profiles of CTCF binding sites (histograms, composition corrected). A composite crystal structure of CTCF zinc fingers 2-11 (grey surface) is shown binding DNA (blue & gold strands) and close protein:DNA contacts (≤3 Å) illustrated below the structure. At nucleotide positions with close contact between CTCF and atoms thought to acquire mutagenic lesions (red circles), the corresponding strand specific mutation rates are generally lower than genome-wide expectation (y ≤ 0; excepting apparent A → N mutations considered later). Mutation rates are high (y > 0) for nucleotide positions with backbone-only contacts or no close contacts but still occluded by CTCF. 
CTCF motif position 6 exhibits an exceptionally high T → N mutation rate that cannot be readily reconciled with the structure, but the strand specificity demonstrates it is a consequence of DEN exposure. e , The profile of DNA accessibility around CTCF binding sites, defines categories of sequence (shaded areas) considered subsequently. f , Mutation rates are higher than genome-wide expectation (y = 0) for CTCF binding motif nucleotides and their close flanks. g , This is not reflected in increased rates of multiallelic variation. CTCF occluded positions (positions -5 to 3 of the CTCF motif) show the greatest elevation of mutation rate but evidence of decreased multiallelic variation. Both high information content (motif-high, bit score>0.2) and low information content (motif-low, bit-score ≤0.2) motif positions have high mutation rates. h , DNA accessibility around non-CTCF transcription factor binding sites (TFBS) as in e . i,j , In contrast to the situation for CTCF, all TFBS categories of sites have suppressed mutation rate compared to genome-wide expectation, y = 0 ( i ), and suppression of multiallelic variation ( j ) indicates enhanced repair. However, high information content motif sites (motif-high) have exceptionally reduced mutation rate not similarly reflected by multiallelic variation, suggesting there may be reduced damage in addition to efficient repair at these sites.

Extended Data Fig. 9 Lesion induced mutation patterns at DNA:protein interaction sites.

a , Excess mutations resulting from A lesions in accessible DNA (relative to the genome-wide trinucleotide mutation rate) centred on the nucleosome dyad. DNA accessibility as measured by ATAC-seq (purple; higher values mean more accessible chromatin). Excess mutations are shown with shaded 95% bootstrap confidence intervals. b-d , Relative mutation rates as a , for apparent T lesions ( b ), C lesions ( c) , and G lesions ( d ); in each case, except A → N mutations, the mutation rate is lower in accessible DNA and higher in less-accessible DNA. e , Mutation rates and multiallelic rates for sequence categories (Methods) within, and adjacent to, CTCF binding sites, stratified by the identity of the inferred lesion containing nucleotide. Point estimate (circles) and bootstrap 95% confidence intervals (whiskers) are shown for the rate difference relative to genome-wide expectation (y = 0, mutations Mb −1 for mutation rates, relative difference metric for multiallelic variation). All rates are adjusted for trinucleotide composition. Instances where the motif_lo category has too few observed or expected mutations to calculate estimates (x-axis label grey) have no data point. Where the observed level of multiallelic variation is zero (asterisk) bootstrap confidence intervals cannot be calculated. f , Mutation rates and multiallelic variation for P15 liver expressed transcription factors; plots as in ( e ).

Extended Data Fig. 10 Mutagenic nucleotide excision repair.

a , Most DEN induced tumours show pronounced mutation asymmetry across approximately 50% of their genome. Asymmetric tumours meeting inclusion criteria (mutation signature and cellularity thresholds; black) are included in the preceding analyses of this study. In addition, here we include a subset of tumours that were excluded due to the absence of mutation asymmetry (n = 8, blue). b , The mutational symmetry of these tumours could be explained if both daughters of the originally mutagenised cell persist (schematic). Mutagenic NER in the first generation of the mutagenised cell could produce mutations at the same base pair in both daughter lineages; such mutations would have approximately double the variant allele frequency (VAF) of mutations confined to one daughter lineage. Whole genome duplication in the first generation of the mutagenised cell could also produce symmetric tumours. c , Tumours with symmetric mutation patterns have a significantly higher mutation load than those with asymmetric mutations, consistent with mutations from both mutagenised strands contributing to the tumour. Statistical analysis (p = 1.1 × 10⁻⁴) by two tailed Wilcoxon rank sum test. In panels c,d,f,g,h points are individual tumours, bar is median, statistical tests are based on n = 8 symmetric and n = 237 asymmetric tumours, all reported p-values are Bonferroni corrected (n = 5 tests). d , The median VAF for mutations in symmetric tumours is approximately half that of asymmetric tumours. Statistical analysis (p = 7.67 × 10⁻⁶) by two tailed Wilcoxon rank sum test. e , Automated nuclear detection (red circles) and quantification in an exemplar hematoxylin and eosin stained tumour section (93131_N2). Original digitised magnification x200; scale bar indicated. f , Nuclear area is not significantly different between symmetric and asymmetric tumours (p = 0.215, two tailed Wilcoxon rank sum test), indicating similar DNA content and arguing against mononuclear whole-genome duplication. 
g , The density of nuclei is not significantly different between symmetric and asymmetric tumours (p = 1, two tailed Wilcoxon rank sum test), arguing against both mononuclear and possibly multi-nuclear whole genome duplication. h , Internuclear distance is not significantly different between symmetric and asymmetric tumours (p = 1, two tailed Wilcoxon rank sum test), arguing against multi-nuclear whole genome duplication. i-p , VAF frequency distributions for symmetric tumours, indicating the VAF of MAPK pathway driver mutations (red points, also in q-x ). For symmetric tumours, the driver VAFs are strongly right-biassed (i.e. high VAF). This is consistent with mutagenic NER copying the same driver mutation site into both daughter genomes of the mutagenised cell, and in turn both daughter lineages (containing either the same driver mutation, or multiallelic driver mutations at the same site) contributing to the resultant tumour. q-x , VAF frequency distributions for example asymmetric tumours. y , MAPK pathway driver mutations are biassed to the highest VAF values in symmetric tumours but not in asymmetric tumours (p = 3.61 × 10⁻⁵ two tailed Wilcoxon rank sum test, Bonferroni corrected). VAF quantile position (y-axis) indicates the fraction of mutations in a tumour that have lower VAF than the driver mutation (quantile of 1.0 indicates all other mutations in that tumour have a lower VAF). Horizontal bars indicate median VAF quantile position of the focal driver mutations. As a null expectation for comparison, one mutation was randomly selected from each of the asymmetric tumours (grey points).
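
Two quantities used in this caption, the VAF quantile position of a driver mutation (panel y) and Bonferroni correction of p-values, can be sketched in a few lines of Python. The function names are hypothetical, and capping corrected p-values at 1 is an assumption that is consistent with the reported values of p = 1.

```python
def vaf_quantile_position(driver_vaf: float, other_vafs) -> float:
    """Fraction of the other mutations in a tumour with lower VAF than the driver.

    A value of 1.0 means every other mutation has a lower VAF.
    """
    others = list(other_vafs)
    if not others:
        raise ValueError("need at least one other mutation for comparison")
    return sum(v < driver_vaf for v in others) / len(others)


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Illustrative VAFs only: the driver outranks three of four other mutations.
    print(vaf_quantile_position(0.5, [0.1, 0.2, 0.3, 0.6]))
    print(bonferroni(0.3, 5))
```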

Supplementary information

Reporting Summary

Peer Review File

Supplementary Table 1

Table of tumours sequenced containing key metadata

Supplementary Table 2

Table of ChIP-seq transcription factors and tissues of origin from ChIP-Atlas database

Supplementary Table 3

Table of key resources and software

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article.

Anderson, C.J., Talmane, L., Luft, J. et al. Strand-resolved mutagenicity of DNA damage and repair. Nature (2024). https://doi.org/10.1038/s41586-024-07490-1


Received: 10 June 2022

Accepted: 30 April 2024

Published: 12 June 2024

DOI: https://doi.org/10.1038/s41586-024-07490-1


Write an essay on isolation and mutation

Can someone mark this essay on the effects of loneliness and isolation in ACC

Starting with this extract, explore how Dickens presents the effects of loneliness and isolation in A Christmas Carol. Write about:
  • how Dickens presents the effects of loneliness and isolation in this extract
  • how Dickens presents the effects of loneliness and isolation in the novel as a whole.
[30 marks]

Throughout A Christmas Carol, the theme of loneliness and isolation haunts Scrooge from very early on until his redemption at the end. Dickens uses this recurring theme to highlight the dire consequences of chasing nothing but money, and to warn greedy upper-class Victorians of what will be forced on them if they do not turn from the path of greed and immorality.

We first see the isolation that shadows Scrooge in the way he is described in Stave 1. Initially, he is presented as a capitalist whose focus is on nothing but working and earning money, with little time for anything else, which leads him to be misanthropic and, ultimately, lonely. This comes to light when he is described as “solitary as an oyster”. The simile presents him as a sort of hermit, focused on only one belief and idea: money. One can argue that the shell is symbolic of capitalism. Capitalism has sold Scrooge a dream of money and wealth. As a result, his entire life revolves around chasing money, and his misanthropic and harsh behaviour means he shares his thoughts and feelings with no one. This shows how those who chase money often forget about other parts of life, such as maintaining relationships and socialising. Furthermore, the motif of isolation is reinforced through the harsh adjective “solitary”. It creates a semantic field of prison and confinement, which suggests that Scrooge is not only lonely but also trapped in this cell of capitalism. Dickens here could be warning his audience about the dangers of what Freud would later call the id, and of following one’s desires blindly. Scrooge can be seen as a prisoner of capitalism, materially affluent yet emotionally starved and lonely. This drastic image warns Dickens’ readers that money, no matter how important it is, can ultimately imprison you and lead to nothing but isolation and loneliness in the long term. He reinforces how the sacrifices one makes for the “master-passion” of gain are futile in the long term, and how the short-term pleasures are trivial.

Furthermore, in Stave 2, Scrooge is shown to have been neglected and isolated as a child, left behind at school during the Christmas holidays. The Ghost of Christmas Past takes him back to his school days and shows him the time when he was neglected by his friends, which creates a semantic field of isolation and loneliness. This is best highlighted through the quote “a solitary child, neglected by his friends”, with Scrooge once again being described with the adjective “solitary”. This time, however, the isolation comes not from his greed for money but from the neglect he suffered as a child from his parents. This creates a semantic field of sorrow and leads the reader to empathise with and pity young Scrooge. The adjectives “neglected” and “solitary” create an uncomfortable image of young Scrooge being left to himself, which arguably carried into his adult life and could be a cause of his misanthropic nature, as his experiences at the boarding school left him feeling unwanted and unloved. Dickens here could be criticising the Victorian education system, which paid no attention to the struggles of students and focused only on the educational side. Dickens, a writer and social critic, could be calling for more work to be put into managing these schools, to benefit the mental health of the students as well.

As the novella progresses, Scrooge and the Ghost of Christmas Past visit the house of Belle, Scrooge’s former fiancée. The environment described contrasts with Scrooge’s miserable, isolated dwelling, and loneliness is nowhere to be found there. Belle, while not materially wealthy, is full of happiness and joy, and this is emphasised by the quote “the mother and daughter laughed heartily, and enjoyed it very much; and the latter, soon beginning to mingle in the sports, got pillaged by the young brigands most ruthlessly”. Joy can be found in every nook and cranny of the house, in contrast to Scrooge’s gloomy chambers. The positive words “heartily”, “laughed” and “enjoyed” create a positive semantic field of love and belonging, which Scrooge, while wealthy, does not possess. The companionship and unity between the parents and children are heartbreaking for Scrooge, as he begs the Ghost to “Remove me”, suggesting how deeply isolation and loneliness have scarred him and shaped how he views the world. Dickens uses this to show that you don’t need money and power to be happy, but rather the comfort and presence of your loved ones. He could be showing that the lower class are far happier than the upper class because they share their pains and happiness with each other rather than worrying about earning more money.

Overall, Dickens uses the characters of Scrooge and Belle as microcosms to highlight the saddening and long-lasting impact loneliness and isolation have on people. Dickens uses their lives to show that society must come together for the welfare of others, and to encourage more social responsibility and unity, as we are all responsible for one another’s welfare, in both happiness and suffering. Dickens encourages us to focus not on monetary goals but on the higher aspects of life: love and belonging.

I would greatly appreciate it if anyone has any estimates for each of AO1, AO2 and AO3, and any tips to get better. Thanks everyone

IMAGES

  1. BC3001 Write an illustrated essay on the main techniques used for

  2. Mutation Conditions

  3. Lesson 5 Mutation

  4. 📚 Essay Sample on Genetic Mutation

  5. Tutorial Work

  6. Social Isolation Essay

COMMENTS

  1. What Are Mutations? Definition, Causes and Effects of Mutations

    Mutation means an alteration in the genes or chromosomes of a cell. This shift in the gametes may impact the development and structure of the progeny. A mutation in biology is a modification of the nucleic acid sequence of a virus, extrachromosomal DNA, or the genome of an organism. The observable traits of an organism (phenotype) may or may ...

  2. Essay on Mutation

    Essay # 1. Meaning of Mutation: Mutation of a plant or an animal means a sudden change in its hereditary make-up. Suddenly a mutated organism arises and the changed or mutated appearance is usually found to be hereditary, i.e., it breeds true. Since heredity is controlled by genes, it follows that the genes somehow change their behaviour.

  3. Mutation

    The genome is composed of one to several long molecules of DNA, and mutation can occur potentially anywhere on these molecules at any time. The most serious changes take place in the functional units of DNA, the genes. A mutated form of a gene is called a mutant allele. A gene is typically composed of a regulatory region, which is responsible for turning the gene's transcription on and off at ...

  4. Mutations

    Mutations are changes in the information contained in genetic material. For most of life, this means a change in the sequence of DNA, the hereditary material of life. An organism's DNA affects how it looks, how it behaves, its physiology — all aspects of its life. So a change in an organism's DNA can cause changes in all aspects of its life.

  5. A Mutation: A Change in the Genome of an Organism Essay

    This mutation refers to a type of genomic mutation. The protein encoded by the mutation is a receptor for norepinephrine. Get a custom Essay on A Mutation: A Change in the Genome of an Organism. This mutation is dominant, autosomal, and beneficial in its effect on the viability of individuals. Changes in the ADRB1 gene occur primarily in humans.

  6. Adaptation and Survival

    An adaptation is a mutation, or genetic change, that helps an organism, such as a plant or animal, survive in its environment. Due to the helpful nature of the mutation, it is passed down from one generation to the next. As more and more organisms inherit the mutation, the mutation becomes a typical part of the species. The mutation has become an adaptation.

  7. Speciation: The Origin of New Species

    This alternative is 'mutation-order speciation', defined as the evolution of reproductive isolation by the fixation of different advantageous mutations in separate populations experiencing ...

  8. PDF Essay: On the close relationship between speciation ...

    The main conclusion that I reach by the reflections developed in this essay is that the process of speciation is primarily driven by advantageous recessive mutations, and that those will, in turn ...

  9. Mutation

    A mutation is a change in the structure of a gene, the unit of heredity. Genes are made of deoxyribonucleic acid (DNA), a long molecule composed of building blocks called nucleotides. Each nucleotide is built around one of four different subunits called bases. These bases are known as guanine, cytosine, adenine, and thymine. A gene carries information in the sequence of its nucleotides, just ...

  10. Evidence for evolution (article)

    The evidence for evolution. In this article, we'll examine the evidence for evolution on both macro and micro scales. First, we'll look at several types of evidence (including physical and molecular features, geographical information, and fossils) that provide evidence for, and can allow us to reconstruct, macroevolutionary events.

  11. 19.2A: Genetic Variation

    Figure 19.2A.1: Low genetic diversity in the wild cheetah population: Populations of wild cheetahs have very low genetic variation. Because wild cheetahs are threatened, their species has a very low genetic diversity. This low genetic diversity means they are often susceptible to disease and often pass on lethal recessive mutations ...

  12. Genetics Essay on Mutation

    Transition mutations are more frequent than Transversion mutations, although most of the time transition mutations result in a silent mutation due to wobble of the bases (Carr, 2018). Wobble refers to the fact that some tRNA can recognize more than one codon and codons recognized by the same tRNA vary at the 3rd position, the 'wobble' base.

  13. PDF Bacterial Mutation; Types, Mechanisms and Mutant Detection ...

    Missense mutation: Missense mutations are DNA mutations which lead to changes in the amino acid sequence (one wrong codon and one wrong amino acid) of the protein product [1,4,5]. Nonsense mutation: A mutation that leads to the formation of a stop codon is called a nonsense mutation. Since these codons cause the termination of protein synthesis, a

  14. Reproductive Isolation and Its Potential Effects Essay

    Introduction. Reproductive isolation pertains to the fact that in case a population of the same species is separated into two parts and they are not able to breed with one another, genetic makeup will change according to the specific conditions that are unique to each location of the group. Eventually, the whole population of species will ...

  15. 8.1.1 Genetic Mutations

    A gene mutation is a change in the sequence of base pairs in a DNA molecule that may result in an altered polypeptide; Mutations occur continuously and spontaneously. Errors in the DNA often occur during DNA replication; As the DNA base sequence determines the sequence of amino acids that make up a protein, mutations in a gene can sometimes lead to a change in the polypeptide that the gene ...

  16. Modern Synthetic theory of Evolution

    The major concepts coming under this theory include genetic variations, reproductive and geographical isolation and natural selection. The Modern Synthetic Theory of Evolution showed a number of changes as to how the evolution and the process of evolution are conceived. The theory gave a new definition of evolution as "the changes occurring ...

  17. Write a short essay describing the roles of mutation, migration

    Write a short essay describing the roles of mutation, migration, and selection in bringing about speciation. Verified Solution This video solution was recommended by our tutors as helpful for the problem above

  18. Mutagenesis and Selection: Reflections on the In Vivo and In Vitro

    Despite the feasibility of mutation induction by using a wide variety of mutagens in plant species of interest, isolation of desired mutant in a given time will depend on the radiation facilities and treatment regime, besides the plant species (Pathirana 2011; Bado et al. 2015). Plant species that have short generation time are self-pollinating ...

  19. Mutation and premating isolation

    While premating isolation might be traceable to different genetic mechanisms in different species, evidence supports the idea that as few as one or two genes may often be sufficient to initiate isolation. Thus, new mutation can theoretically play a key role in the process. But it has long been thought that a new isolation mutation would fail,

  20. What is a Genetic Mutation? Definition & Types

    Germline mutation: A change in a gene that occurs in a parent's reproductive cells (egg or sperm) that affects the genetic makeup of their child (hereditary). Somatic mutation: A change in a gene that occurs after conception in the developing embryo that may become a baby. These occur in all cells in the developing body — except the sperm ...

  21. PDF Carrick Academy Higher Biology Essays

    Evolution Write notes on evolution under the following headings: (i) natural selection; (ii) genetic drift. Describe the evolution of a new species under the following headings. (i) Isolation and mutation (ii) Selection Protein Synthesis Write notes on (i) transcription (ii) translation (iii) post translational modifications Give an account of

  22. The Impact of Social Isolation Essay (Critical Writing)

    Social isolation has become a normal state of living for the world's population in recent years. For many people, confinement has become a highly stressful situation, triggering their mental health issues, while for others became an opportunity to learn and grow. The recent article reviewed explores the impact of social isolation on a person ...

  23. Essay on Genetic Variation

    Essay # 1. Meaning of Genetic Variation: Evolution requires genetic variation. If there were no dark moths, the population could not have evolved from mostly light to mostly dark. In order for continuing evolution there must be mechanisms to increase or create genetic variation and mechanisms to decrease it. Mutation is a change in a gene.

  24. Strand-resolved mutagenicity of DNA damage and repair

    Mutation clusters were defined as chains of mutations within the same tumour spaced less than X nucleotides from adjacent mutations, with X = 11, X = 101 or X = 201 depending on analysis as indicated.

  25. Can someone mark this essay on the effects of loneliness and isolation

    • how Dickens presents the effects of loneliness and isolation in this extract • how Dickens presents the effects of loneliness and isolation in the novel as a whole. [30 marks] Throughout A Christmas Carol, the theme of loneliness and isolation can be seen to haunt Scrooge from very early on till his redemption in the end.
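
The mutation cluster definition quoted in item 24 (chains of mutations in the same tumour spaced less than X nucleotides from adjacent mutations) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `find_mutation_clusters`, the use of plain integer coordinates on a single chromosome of a single tumour, and the requirement that a chain contain at least two mutations are all hypothetical choices, not details confirmed by the source.

```python
def find_mutation_clusters(positions, max_gap=11):
    """Group mutation positions into chains of closely spaced mutations.

    positions: genomic coordinates of mutations (assumed: one tumour,
               one chromosome).
    max_gap:   the X of the quoted definition; adjacent mutations must be
               less than max_gap nucleotides apart to share a chain.
    Returns a list of clusters, each a sorted list of two or more positions.
    """
    clusters = []
    current = []
    for pos in sorted(positions):
        if current and pos - current[-1] < max_gap:
            # Close enough to the previous mutation: extend the chain.
            current.append(pos)
        else:
            # Gap too large: close the previous chain and start a new one.
            if len(current) > 1:  # assumption: singletons are not clusters
                clusters.append(current)
            current = [pos]
    if len(current) > 1:
        clusters.append(current)
    return clusters


# Hypothetical coordinates; X = 11 as in the smallest threshold quoted.
print(find_mutation_clusters([100, 105, 112, 500, 900, 905], max_gap=11))
# [[100, 105, 112], [900, 905]]
```

Passing `max_gap=101` or `max_gap=201` would correspond to the other two thresholds mentioned in the snippet.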