Increase rate of change in coding regions?

If a sequence is under selection will it acquire more changes over time because of faster fixation than if changes were neutral? Is this true or am I missing something?

Generally speaking, there are sequences that are under purifying selection (where new mutations are often deleterious) and there are sequences that are neutral. Sequences under constant positive selection do not really exist. So, when you ask

If a sequence is under selection will it acquire more changes over time because of faster fixation than if changes were neutral?

No! If a sequence is under selection, then it is under purifying selection. As such the rate of fixation will be lower than for a neutral sequence.

If you had a sequence in which most mutations were beneficial, then that sequence would fix new mutations at a higher rate than a neutral sequence. However, such sequences don't really exist! If one were to exist, beneficial mutations would quickly fix, after which new mutations would be deleterious again.

Of course, a single beneficial mutation has a higher chance of fixation than a neutral mutation. Also, when a mutation is under selection (whether purifying or not), the expected change in heterozygosity in the population is higher than if the mutation is neutral (the expected change in allele frequency is equal to the additive genetic variance for fitness at this locus). But those are not details you are asking about.
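To put rough numbers on this argument, here is a minimal sketch using Kimura's diffusion approximation for the fixation probability of a new mutation in a diploid population. It assumes the effective population size equals the census size `n`, that the mutation starts as a single copy (frequency 1/(2n)), and that every new mutation has the same selection coefficient `s`. The function names and the example values are illustrative only.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Fixation probability of a single new mutation (Kimura's approximation).

    s: selection coefficient (negative = deleterious, positive = beneficial)
    n: diploid population size, assumed equal to the effective size
    """
    if abs(s) < 1e-12:
        return 1.0 / (2 * n)  # neutral limit: the initial frequency
    # (1 - exp(-2s)) / (1 - exp(-4ns)), written with expm1 for accuracy;
    # note that very large values of 4*n*|s| would overflow for negative s
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * n * s)


def substitution_rate(mu: float, s: float, n: int) -> float:
    """Expected substitutions per generation: 2n*mu new mutations, each fixing with P_fix."""
    return 2 * n * mu * fixation_probability(s, n)


mu, n = 1e-8, 1000  # hypothetical per-site mutation rate and population size
for s in (-0.001, 0.0, 0.001):
    print(f"s = {s:+.3f}: substitution rate = {substitution_rate(mu, s, n):.3e}")
```

Under these assumptions the neutral rate equals the mutation rate, the rate for deleterious mutations falls below it, and the rate for beneficial mutations rises above it, which matches the reasoning in the answer.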

Unusual DNA folding increases the rates of mutations

DNA sequences that can fold into shapes other than the classic double helix tend to have higher mutation rates than other regions in the human genome. New research by a team of Penn State scientists shows that the elevated mutation rate in these sequences plays a major role in determining regional variation in mutation rates across the genome. Deciphering the patterns and causes of regional variation in mutation rates is important both for understanding evolution and for predicting sites of new mutations that could lead to disease.

A paper describing the research is available online in the journal Nucleic Acids Research.

“Most of the time we think about DNA as the classic double helix; this basic form is referred to as ‘B-DNA,’” said Wilfried Guiblet, co-first author of the paper, a graduate student at Penn State at the time of research and now a postdoctoral scholar at the National Cancer Institute. “But, as much as 13% of the human genome can fold into different conformations called ‘non-B DNA.’ We wanted to explore what role, if any, this non-B DNA played in variation that we see in mutation rates among different regions of the genome.”

Non-B DNA can fold into a number of different conformations depending on the underlying DNA sequence. Examples include G-quadruplexes, Z-DNA, H-DNA, slipped strands, and various other conformations. Recent research has revealed that non-B DNA plays critical roles in cellular processes, including the replication of the genome and the transcription of DNA into RNA, and that mutations in non-B sequences are associated with genetic diseases.

“In a previous study, we showed that in the artificial system of a DNA sequencing instrument, which uses similar DNA copying processes as in the cell, error rates were higher in non-B DNA during polymerization,” said Kateryna Makova, Verne M. Willaman Chair of Life Sciences at Penn State and one of the leaders of the research team. “We think that this is because the enzyme that copies DNA during sequencing has a harder time reading through non-B DNA. Here we wanted to see if a similar phenomenon exists in living cells.”

The team compared mutation rates between B- and non-B DNA at two different timescales. To look at relatively recent changes, they used an existing database of human DNA sequences to identify individual nucleotides — letters in the DNA alphabet — that varied among humans. These “single nucleotide polymorphisms” (SNPs) represent places in the human genome where at some point in the past a mutation occurred in at least one individual. To look at more ancient changes, the team also compared the human genome sequence to the genome of the orangutan.

They also investigated multiple spatial scales along the human genome, to test whether non-B DNA influenced mutation rates at nucleotides adjacent to it and further away.

“To identify differences in mutation rates between B- and non-B DNA we used statistical tools from ‘functional data analysis’ in which we compare the data as curves rather than looking at individual data points,” said Marzia A. Cremona, co-first author of the paper, a postdoctoral researcher at Penn State at the time of the research and now an assistant professor at Université Laval in Quebec, Canada. “These methods give us the statistical power to contrast mutation rates for the various types of non-B DNA against B-DNA controls.”

For most types of non-B DNA, the team found increased mutation rates. The differences were enough that non-B DNA mutation rates impacted regional variation in their immediate surroundings. These differences also helped explain a large portion of the variation that can be seen along the genome at the scale of millions of nucleotides.

“When we look at all the known factors that influence regional variation in mutation rates across the genome, non-B DNA is the largest contributor,” said Francesca Chiaromonte, Huck Chair in Statistics for the Life Sciences at Penn State and one of the leaders of the research team. “We’ve been studying regional variation in mutation rates for a long time from a lot of different angles. The fact that non-B DNA is such a major contributor to this variation is an important discovery.”

“Our results have critical medical implications,” said Kristin Eckert, professor of pathology and biochemistry and molecular biology at Penn State College of Medicine, Penn State Cancer Institute researcher, an author on the paper, and the team’s long-time collaborator. “For example, human geneticists should consider the potential of a locus to form non-B DNA when evaluating candidate genetic variants for human genetic diseases. Our current and future research is focused on unraveling the mechanistic basis behind the elevated mutation rates at non-B DNA.”

The results also have evolutionary implications.

“We know that natural selection can impact variation in the genome, so for this study we only looked at regions of the genome that we think are not under the influence of selection,” said Yi-Fei Huang, assistant professor of biology at Penn State and one of the leaders of the research team. “This allows us to establish a baseline mutation rate for each type of non-B DNA that in the future we could potentially use to help identify signatures of natural selection in these sequences.”

Because of their increased mutation rates, non-B DNA sequences could be an important source of genetic variation, which is the ultimate source of evolutionary change.

“Mutations are usually thought to be so rare, that when we see the same mutation in different individuals, the assumption is that those individuals shared an ancestor who passed the mutation to them both,” said Makova, a Penn State Cancer Institute researcher. “But it’s possible that the mutation rate is so high in some of these non-B DNA regions that the same mutation could occur independently in several different individuals. If this is true, it would change how we think about evolution.”


Point mutations usually take place during DNA replication. DNA replication occurs when one double-stranded DNA molecule creates two single strands of DNA, each of which is a template for the creation of the complementary strand. A single point mutation can change the whole DNA sequence. Changing one purine or pyrimidine may change the amino acid that the nucleotides code for.

Point mutations may arise from spontaneous mutations that occur during DNA replication. The rate of mutation may be increased by mutagens. Mutagens can be physical, such as radiation from UV rays, X-rays or extreme heat, or chemical (molecules that misplace base pairs or disrupt the helical shape of DNA). Mutagens associated with cancers are often studied to learn about cancer and its prevention.

There are multiple ways for point mutations to occur. First, ultraviolet (UV) light and higher-frequency light are capable of ionizing electrons, which in turn can affect DNA. Second, reactive oxygen molecules with free radicals, which are a byproduct of cellular metabolism, can also be very harmful to DNA. These reactants can lead to both single-stranded DNA breaks and double-stranded DNA breaks. Third, bonds in DNA eventually degrade, which makes it harder to maintain the integrity of DNA. Finally, replication errors can lead to substitution, insertion, or deletion mutations.

Transition/transversion categorization

In 1959 Ernst Freese coined the terms "transitions" and "transversions" to categorize different types of point mutations. [2] [3] Transitions are replacement of a purine base with another purine or replacement of a pyrimidine with another pyrimidine. Transversions are replacement of a purine with a pyrimidine or vice versa. There is a systematic difference in mutation rates for transitions (Alpha) and transversions (Beta). Transition mutations are about ten times more common than transversions.
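The definition above translates directly into a small classifier. The following is a minimal sketch for DNA bases only; the function name is illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-base DNA substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("expected one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    # Transition: both bases belong to the same chemical class
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


# Tally all 12 possible substitutions by type
counts = {"transition": 0, "transversion": 0}
for ref in "ACGT":
    for alt in "ACGT":
        if ref != alt:
            counts[classify_substitution(ref, alt)] += 1
print(counts)  # {'transition': 4, 'transversion': 8}
```

Only 4 of the 12 possible substitutions are transitions, so any excess of observed transitions over transversions reflects differing mutation rates rather than the number of possible changes.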

Functional categorization

Nonsense mutations include stop-gain and start-loss. Stop-gain is a mutation that results in a premature termination codon (a stop was gained), which signals the end of translation. This interruption causes the protein to be abnormally shortened. The number of amino acids lost mediates the impact on the protein's functionality and whether it will function whatsoever. [4] Stop-loss is a mutation in the original termination codon (a stop was lost), resulting in abnormal extension of a protein's carboxyl terminus. Start-gain creates an AUG start codon upstream of the original start site. If the new AUG is near the original start site, in-frame within the processed transcript and downstream to a ribosomal binding site, it can be used to initiate translation. The likely effect is additional amino acids added to the amino terminus of the original protein. Frame-shift mutations are also possible in start-gain mutations, but typically do not affect translation of the original protein. Start-loss is a point mutation in a transcript's AUG start codon, resulting in the reduction or elimination of protein production.

Missense mutations code for a different amino acid. A missense mutation changes a codon so that a different amino acid is incorporated and an altered protein is produced, a non-synonymous change. [4] Conservative mutations result in an amino acid change in which the properties of the amino acid remain the same (e.g., hydrophobic, hydrophilic, etc.). At times, a change to one amino acid in the protein is not detrimental to the organism as a whole. Most proteins can withstand one or two point mutations before their function changes. Non-conservative mutations result in an amino acid change that has different properties than the wild type. The protein may lose its function, which can result in a disease in the organism. For example, sickle-cell disease is caused by a single point mutation (a missense mutation) in the beta-hemoglobin gene that converts a GAG codon into GUG, which encodes the amino acid valine rather than glutamic acid. The protein may also exhibit a "gain of function" or become activated, as is the case with the mutation changing a valine to glutamic acid in the BRAF gene; this leads to an activation of the RAF protein, which causes unlimited proliferative signalling in cancer cells. [5] These are both examples of a non-conservative (missense) mutation.

Silent mutations code for the same amino acid (a "synonymous substitution"). A silent mutation does not affect the functioning of the protein. A single nucleotide can change, but the new codon specifies the same amino acid, resulting in an unmutated protein. This type of change is called synonymous change since the old and new codon code for the same amino acid. This is possible because 64 codons specify only 20 amino acids. Different codons can lead to differential protein expression levels, however. [4]
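The functional categories described above can be illustrated with a short sketch that translates the reference and mutated codon and compares the results. It assumes the standard genetic code, built here from the conventional T, C, A, G ordering, and treats U as T so that RNA codons such as GUG are accepted. The function names and category labels are illustrative.

```python
from itertools import product

# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codon: str) -> str:
    """Return the one-letter amino acid for a codon ('*' marks a stop codon)."""
    return CODON_TABLE[codon.upper().replace("U", "T")]


def classify_point_mutation(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change by its effect on the encoded amino acid."""
    ref_aa, alt_aa = translate(ref_codon), translate(alt_codon)
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense (stop-gain)"
    if ref_aa == "*":
        return "stop-loss"
    return "missense"


print(classify_point_mutation("GAG", "GUG"))  # sickle-cell change Glu -> Val: missense
print(classify_point_mutation("GAA", "GAG"))  # both encode Glu: silent
print(classify_point_mutation("GGA", "UGA"))  # Gly -> stop: nonsense (stop-gain)
```

The sketch does not distinguish conservative from non-conservative missense changes, since that would require a table of amino acid properties.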

Single base pair insertions and deletions

Sometimes the term point mutation is used to describe insertions or deletions of a single base pair (which has more of an adverse effect on the synthesized protein due to the nucleotides' still being read in triplets, but in different frames: a mutation called a frameshift mutation). [4]

Point mutations that occur in non-coding sequences are most often without consequences, although there are exceptions. If the mutated base pair is in the promoter sequence of a gene, then the expression of the gene may change. Also, if the mutation occurs in the splicing site of an intron, then this may interfere with correct splicing of the transcribed pre-mRNA.

By altering just one amino acid, the entire peptide may change, thereby changing the entire protein. The new protein is called a protein variant. If the original protein functions in cellular reproduction then this single point mutation can change the entire process of cellular reproduction for this organism.

Point germline mutations can lead to beneficial as well as harmful traits or diseases. This leads to adaptations based on the environment where the organism lives. An advantageous mutation can create an advantage for that organism and lead to the trait's being passed down from generation to generation, improving and benefiting the entire population. The scientific theory of evolution is greatly dependent on point mutations in cells. The theory explains the diversity and history of living organisms on Earth. In relation to point mutations, it states that beneficial mutations allow the organism to thrive and reproduce, thereby passing its positively affected mutated genes on to the next generation. On the other hand, harmful mutations cause the organism to die or be less likely to reproduce in a phenomenon known as natural selection.

There are different short-term and long-term effects that can arise from mutations. Smaller ones would be a halting of the cell cycle at numerous points. This means that a codon coding for the amino acid glycine may be changed to a stop codon, causing the proteins that should have been produced to be deformed and unable to complete their intended tasks. Because the mutations can affect the DNA and thus the chromatin, it can prohibit mitosis from occurring due to the lack of a complete chromosome. Problems can also arise during the processes of transcription and replication of DNA. These all prohibit the cell from reproduction and thus lead to the death of the cell. Long-term effects can be a permanent changing of a chromosome, which can lead to a mutation. These mutations can be either beneficial or detrimental. Cancer is an example of how they can be detrimental. [6]

Other effects of point mutations, or single nucleotide polymorphisms in DNA, depend on the location of the mutation within the gene. For example, if the mutation occurs in the region of the gene responsible for coding, the amino acid sequence of the encoded protein may be altered, causing a change in the function, localization, or stability of the protein or protein complex. Many methods have been proposed to predict the effects of missense mutations on proteins. Machine learning algorithms train their models to distinguish known disease-associated from neutral mutations, whereas other methods do not explicitly train their models. Almost all methods, however, exploit evolutionary conservation, assuming that changes at conserved positions tend to be more deleterious. While the majority of methods provide a binary classification of effects of mutations into damaging and benign, a new level of annotation is needed to offer an explanation of why and how these mutations damage proteins. [7]

Moreover, if the mutation occurs in the region of the gene where the transcriptional machinery binds, the mutation can affect the binding of the transcription factors because the short nucleotide sequences recognized by the transcription factors will be altered. Mutations in this region can affect the efficiency of gene transcription, which in turn can alter levels of mRNA and, thus, protein levels in general.

Point mutations can have several effects on the behavior and reproduction of a protein depending on where the mutation occurs in the amino acid sequence of the protein. If the mutation occurs in the region of the gene that is responsible for coding for the protein, the amino acid may be altered. This slight change in the sequence of amino acids can cause a change in the function, activation of the protein meaning how it binds with a given enzyme, where the protein will be located within the cell, or the amount of free energy stored within the protein.

If the mutation occurs in the region of the gene where the transcriptional machinery binds, the mutation can affect the way in which transcription factors bind to the DNA. The transcriptional machinery binds through recognition of short nucleotide sequences. A mutation in this region may alter these sequences and, thus, change the way the transcription factors bind. Mutations in this region can affect the efficiency of gene transcription, which controls both the levels of mRNA and overall protein levels. [8]

Cancer

Point mutations in multiple tumor suppressor proteins cause cancer. For instance, point mutations in Adenomatous Polyposis Coli promote tumorigenesis. [9] A novel assay, Fast parallel proteolysis (FASTpp), might help swift screening of specific stability defects in individual cancer patients. [10]

Neurofibromatosis

Sickle-cell anemia

Sickle-cell anemia is caused by a point mutation in the β-globin chain of hemoglobin, causing the hydrophilic amino acid glutamic acid to be replaced with the hydrophobic amino acid valine at the sixth position.

The β-globin gene is found on the short arm of chromosome 11. The association of two wild-type α-globin subunits with two mutant β-globin subunits forms hemoglobin S (HbS). Under low-oxygen conditions (being at high altitude, for example), the absence of a polar amino acid at position six of the β-globin chain promotes the non-covalent polymerisation (aggregation) of hemoglobin, which distorts red blood cells into a sickle shape and decreases their elasticity. [14]

Hemoglobin is a protein found in red blood cells, and is responsible for the transportation of oxygen through the body. [15] There are two subunits that make up the hemoglobin protein: beta-globins and alpha-globins. [16] Beta-hemoglobin is created from the genetic information on the HBB, or "hemoglobin, beta" gene found on chromosome 11p15.5. [17] A single point mutation in this polypeptide chain, which is 147 amino acids long, results in the disease known as Sickle Cell Anemia. [18] Sickle-cell anemia is an autosomal recessive disorder that affects 1 in 500 African Americans, and is one of the most common blood disorders in the United States. [17] The single replacement of the sixth amino acid in the beta-globin, glutamic acid, with valine results in deformed red blood cells. These sickle-shaped cells cannot carry nearly as much oxygen as normal red blood cells and they get caught more easily in the capillaries, cutting off blood supply to vital organs. The single nucleotide change in the beta-globin means that even the smallest of exertions on the part of the carrier results in severe pain and even heart attack. Below is a chart depicting the first thirteen amino acids in the normal and abnormal sickle cell polypeptide chain. [18]

Sequence for normal hemoglobin
START Val His Leu Thr Pro Glu Glu Lys Ser Ala Val Thr

Sequence for sickle-cell hemoglobin
START Val His Leu Thr Pro Val Glu Lys Ser Ala Val Thr
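The two chains listed above can be compared position by position. This small sketch numbers residues from the first amino acid after START; the variable and function names are illustrative.

```python
normal = "Val His Leu Thr Pro Glu Glu Lys Ser Ala Val Thr".split()
sickle = "Val His Leu Thr Pro Val Glu Lys Ser Ala Val Thr".split()


def differences(a, b):
    """Return (position, residue_in_a, residue_in_b) for every mismatch, 1-based."""
    return [
        (pos, x, y)
        for pos, (x, y) in enumerate(zip(a, b), start=1)
        if x != y
    ]


print(differences(normal, sickle))  # [(6, 'Glu', 'Val')]
```

The only mismatch is at position six, where glutamic acid is replaced by valine.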

Tay–Sachs disease

The cause of Tay–Sachs disease is a genetic defect that is passed from parent to child. This genetic defect is located in the HEXA gene, which is found on chromosome 15.

The HEXA gene makes part of an enzyme called beta-hexosaminidase A, which plays a critical role in the nervous system. This enzyme helps break down a fatty substance called GM2 ganglioside in nerve cells. Mutations in the HEXA gene disrupt the activity of beta-hexosaminidase A, preventing the breakdown of the fatty substances. As a result, the fatty substances accumulate to deadly levels in the brain and spinal cord. The buildup of GM2 ganglioside causes progressive damage to the nerve cells. This is the cause of the signs and symptoms of Tay-Sachs disease. [19]

Color blindness

People who are colorblind have mutations in their genes that cause a loss of either red or green cones, and they therefore have a hard time distinguishing between colors. There are three kinds of cones in the human eye: red, green, and blue. Now researchers have discovered that some people with the gene mutation that causes colorblindness lose an entire set of "color" cones with no change to the clearness of their vision overall. [20]

In molecular biology, repeat-induced point mutation or RIP is a process by which DNA accumulates G:C to A:T transition mutations. Genomic evidence indicates that RIP occurs or has occurred in a variety of fungi [21] while experimental evidence indicates that RIP is active in Neurospora crassa, [22] Podospora anserina, [23] Magnaporthe grisea, [24] Leptosphaeria maculans, [25] Gibberella zeae [26] and Nectria haematococca. [27] In Neurospora crassa, sequences mutated by RIP are often methylated de novo. [22]

RIP occurs during the sexual stage in haploid nuclei after fertilization but prior to meiotic DNA replication. [22] In Neurospora crassa, repeat sequences of at least 400 base pairs in length are vulnerable to RIP. Repeats with as low as 80% nucleotide identity may also be subject to RIP. Though the exact mechanism of repeat recognition and mutagenesis are poorly understood, RIP results in repeated sequences undergoing multiple transition mutations.
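The mutational outcome of RIP can be illustrated with a toy simulation. This is only a sketch: it applies G:C to A:T transitions (read on one strand as C to T and G to A) independently at each eligible base with a hypothetical probability `rate`. It does not model how repeats are recognized, which remains poorly understood, nor any sequence-context preference.

```python
import random

# On a single strand, a G:C -> A:T transition appears as C->T or as G->A
RIP_TRANSITIONS = {"C": "T", "G": "A"}


def apply_rip(seq: str, rate: float, rng: random.Random) -> str:
    """Mutate each C or G in seq with probability `rate` (hypothetical parameter)."""
    out = []
    for base in seq.upper():
        if base in RIP_TRANSITIONS and rng.random() < rate:
            out.append(RIP_TRANSITIONS[base])
        else:
            out.append(base)
    return "".join(out)


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


rng = random.Random(0)  # fixed seed so the run is repeatable
repeat = "ATGCGCGATCCGGATTACGC"  # made-up sequence for illustration
mutated = apply_rip(repeat, rate=0.3, rng=rng)
print(gc_content(repeat), gc_content(mutated))
```

Because every change removes a G or C, GC content can only stay the same or fall, which is consistent with RIP-affected repeats becoming AT-rich.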

The RIP mutations do not seem to be limited to repeated sequences. Indeed, for example, in the phytopathogenic fungus L. maculans, RIP mutations are found in single copy regions, adjacent to the repeated elements. These regions are either non-coding regions or genes encoding small secreted proteins including avirulence genes. The degree of RIP within these single copy regions was proportional to their proximity to repetitive elements. [28]

Rep and Kistler have speculated that the presence of highly repetitive regions containing transposons, may promote mutation of resident effector genes. [29] So the presence of effector genes within such regions is suggested to promote their adaptation and diversification when exposed to strong selection pressure. [30]

As RIP mutation is traditionally observed to be restricted to repetitive regions and not single copy regions, Fudal et al. [31] suggested that leakage of RIP mutation might occur within a relatively short distance of a RIP-affected repeat. Indeed, this has been reported in N. crassa whereby leakage of RIP was detected in single copy sequences at least 930 bp from the boundary of neighbouring duplicated sequences. [32] Elucidating how repeated sequences are detected for RIP may help explain how the flanking sequences are also affected.

Mechanism

RIP causes G:C to A:T transition mutations within repeats; however, the mechanism that detects the repeated sequences is unknown. RID is the only known protein essential for RIP. It is a DNA methyltransferase-like protein that, when mutated or knocked out, results in loss of RIP. [33] Deletion of the rid homolog in Aspergillus nidulans, dmtA, results in loss of fertility [34] while deletion of the rid homolog in Ascobolus immersus, masc1, results in fertility defects and loss of methylation induced premeiotically (MIP). [35]

Consequences

RIP is believed to have evolved as a defense mechanism against transposable elements, which resemble parasites by invading and multiplying within the genome. RIP creates multiple missense and nonsense mutations in the coding sequence. This hypermutation of G-C to A-T in repetitive sequences eliminates functional gene products of the sequence (if there were any to begin with). In addition, many of the C-bearing nucleotides become methylated, thus decreasing transcription.

Use in molecular biology

Because RIP is so efficient at detecting and mutating repeats, fungal biologists often use it as a tool for mutagenesis. A second copy of a single-copy gene is first transformed into the genome. The fungus must then mate and go through its sexual cycle to activate the RIP machinery. Many different mutations within the duplicated gene are obtained from even a single fertilization event so that inactivated alleles, usually due to nonsense mutations, as well as alleles containing missense mutations can be obtained. [36]

The cellular reproduction process of meiosis was discovered by Oscar Hertwig in 1876. Mitosis was discovered several years later in 1882 by Walther Flemming.

Hertwig studied sea urchins, and noticed that each egg contained one nucleus prior to fertilization and two nuclei after. This discovery proved that one spermatozoon could fertilize an egg, and therefore proved the process of meiosis. Hermann Fol continued Hertwig's research by testing the effects of injecting several spermatozoa into an egg, and found that the process did not work with more than one spermatozoon. [37]

Flemming began his research of cell division starting in 1868. The study of cells was an increasingly popular topic in this time period. By 1873, Schneider had already begun to describe the steps of cell division. Flemming furthered this description in 1874 and 1875 as he explained the steps in more detail. He also argued with Schneider's findings that the nucleus separated into rod-like structures by suggesting that the nucleus actually separated into threads that in turn separated. Flemming concluded that cells replicate through cell division, to be more specific mitosis. [38]

Matthew Meselson and Franklin Stahl are credited with the discovery of DNA replication. Watson and Crick acknowledged that the structure of DNA did indicate that there is some form of replicating process. However, there was not a lot of research done on this aspect of DNA until after Watson and Crick. People considered all possible methods of determining the replication process of DNA, but none were successful until Meselson and Stahl. Meselson and Stahl introduced a heavy isotope into some DNA and traced its distribution. Through this experiment, Meselson and Stahl were able to prove that DNA reproduces semi-conservatively. [39]


2.1 Acoustic data collection and processing

The C-POD was used as the passive acoustic data collection instrument. This device contains an omni-directional hydrophone that records the timing of zero-crossings (accuracy to 200 ns) and the peak amplitude between zero crossings (Tregenza, Dawson, Rayment, & Verfuss, 2016). These data are then used to identify the narrow-band high-frequency (NBHF) clicks of harbor porpoises (Au, Kastelein, Rippe, & Schooneman, 1999; Macaulay, Malinka, Gillespie, & Madsen, 2020; Villadsgaard, Wahlberg, & Tougaard, 2007). Data were collected from 12 stations (average depth 44 m, range 29.0–60.0 m) that were used both as a part of the SAMBAH project (April 2011–July 2013) and the SNMP (April 2017–March 2020) (Figure 1). C-PODs were anchored with the hydrophone approximately 2 m off the bottom, and the loggers were serviced every 3–6 months for battery and SD card changes, and functionality tests. A different C-POD was deployed at the station each time for logistical reasons, and to facilitate removal of systematic bias that could be caused by different sensitivities of the individual C-PODs.

All files downloaded from the C-PODs (including both the SAMBAH and SNMP data) were cut at midnight after deployment, and midnight prior to retrieval. Custom software (CPOD.exe, v. 2.044) was used to process the data. The average and variance of the instantaneous frequency, click duration, peak amplitude, and two measures of the click envelope were saved for each click. The KERNO classifier was then used to identify click trains and label them as either porpoise (NBHF), other cetacean, boat sonar, or unclassified. A second classifier specifically developed for the Baltic Sea marine region (Hel1; Tregenza, 2014) was also applied to the data to reduce the false positive rate, by removing click trains falsely identified as originating from a porpoise. When the number of detection positive minutes was below 60 per year, the file was visually validated to ensure that there were no false detections.

2.2 Potential sources of bias

Diel patterns have been shown in the vocal behavior of harbor porpoises (Carlström, 2005; Osiecka, Jones, & Wahlberg, 2020; Todd, Pearse, Tregenza, Lepper, & Todd, 2009). As there is a theoretical possibility of changes in vocal behavior over time in response to changes in prey availability or quality, we analyzed a range of acoustic metrics to minimize the risk of the results being influenced by behavioral changes over time. The analyzed metrics were: number of encounters, number of clicks, detection positive seconds (DPS) (all presented in Appendix S1), and detection positive hours (DPH) (presented in results) in a day. Given the extremely low sighting rate, it is not possible to tag Baltic Proper harbor porpoises to collect data on vocal rate over time. However, the use of a range of acoustic metrics (number of encounters, number of clicks, DPS, and DPH, in a day), and ensuring that the results are consistent over these metrics, should minimize the influence of both changes in the acoustic behavior of the animals and diel patterns in the vocal behavior of porpoises over time. Given the low detection rate of Baltic Proper harbor porpoises (max detection of 11 DPH per day), saturation with detections did not influence the results.
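As an illustration of the DPH metric, the sketch below counts, for each calendar day, the number of distinct clock hours that contain at least one porpoise detection. It assumes detections are available as a list of timestamps; this input format and the function name are hypothetical and do not reflect the actual C-POD export format.

```python
from collections import defaultdict
from datetime import datetime


def detection_positive_hours(detections):
    """Map each date to its number of hours (0-24) with at least one detection."""
    hours_by_day = defaultdict(set)
    for timestamp in detections:
        hours_by_day[timestamp.date()].add(timestamp.hour)
    return {day: len(hours) for day, hours in hours_by_day.items()}


# Made-up timestamps: two detections share the 03:00 hour on 1 June
example = [
    datetime(2018, 6, 1, 3, 5),
    datetime(2018, 6, 1, 3, 50),
    datetime(2018, 6, 1, 22, 0),
    datetime(2018, 6, 2, 0, 1),
]
print(detection_positive_hours(example))
```

Because the metric is capped at 24 per day, it can saturate when detections are frequent; with a reported maximum of 11 DPH per day, that ceiling is not approached here.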

There were gaps in data collection over time (due to equipment failure, delayed battery changes due to inclement weather, and unexpected loss of equipment [i.e., being caught in trawl gear]) resulting in varying effort at each of the stations. Based on the results of the SAMBAH study, the Baltic Proper population is thought to congregate into a major cluster during May–October, which is when breeding takes place. During November to April the population has a more dispersed distribution pattern (Carlén et al., 2018). To account for the varying effort and seasonality, missing data were imputed using a seasonal model fitted separately for each station and only detections over the breeding season were used to calculate a yearly population index (see Section 2.3). Regression imputation with a generalized additive model (Wood, 2011) was used, assuming that the number of DPH per day $n_{jy}$ on Julian day $j$ in year $y$ follows a Poisson distribution with mean $\lambda_{jy} = \exp(s(j) + b_y)$. The analysis was repeated for each of the other acoustic metrics investigated (number of encounters, number of clicks, and DPS; see Appendix S1). Here $s(j)$ is a cyclical spline function, such that $s(0) = s(365)$, which describes the seasonal pattern common to all years, and $b_y$ is a fixed yearly effect. The model was fit in R (R Core Team, 2020) with the mgcv package using default options for automatic selection of basis dimension, providing estimates that were used to impute missing counts. Further details of the model fitting can be found in the Supporting Information. The proportion of missing days for each station and year is presented in Table S1.
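For reference, the imputation model described above can be written out compactly. The last line is an assumption: it takes the imputed value for a missing day to be the fitted mean, as is usual for regression imputation; the exact procedure is given in the Supporting Information.

```latex
\begin{align}
  n_{jy} &\sim \mathrm{Poisson}(\lambda_{jy}), \\
  \log \lambda_{jy} &= s(j) + b_y, \qquad s(0) = s(365), \\
  \hat{n}_{jy} &= \exp\bigl(\hat{s}(j) + \hat{b}_y\bigr)
    \quad \text{for days with missing data (assumed).}
\end{align}
```

The cyclic constraint on $s$ keeps the seasonal curve continuous across the turn of the year, while $b_y$ shifts the whole curve up or down for year $y$.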

Temperature affects sound propagation in seawater, so the temperature recorded by the C-PODs was examined over the years. Since temperature varies greatly seasonally, and the porpoise data were examined during May–October, the temperature data were examined over the same time period. To investigate the potential effect of temperature changes over the years on the detection rate, we used an equation from a previous study (Ainslie & McColm, 1998) that calculates how the absorption of sound (dB/km) is influenced by acoustic (frequency) and environmental (temperature, salinity, depth, acidity) factors. To do this, we assumed a frequency of 130 kHz (based on the likely frequency of harbor porpoise signals; Macaulay et al., 2020; Villadsgaard et al., 2007), a salinity of 8 ppt (given the brackish waters of the Baltic Sea), a depth of 44 m (average depth of the C-PODs), and a pH of 8.
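As a rough illustration, the simplified absorption formula of Ainslie & McColm (1998) can be sketched in Python as below. The coefficients are written as the formula is commonly quoted and should be checked against the original paper before any real use; the three temperatures in the example are arbitrary illustrative values, not measurements from the C-PODs.

```python
import math

def absorption_db_per_km(freq_khz, temp_c, salinity_ppt, depth_km, ph):
    """Seawater sound absorption (dB/km), simplified Ainslie & McColm form.

    Coefficients as commonly quoted; treat them as an assumption to verify.
    """
    f_sq = freq_khz ** 2
    # Relaxation frequencies (kHz) for boric acid and magnesium sulphate.
    f1 = 0.78 * math.sqrt(salinity_ppt / 35.0) * math.exp(temp_c / 26.0)
    f2 = 42.0 * math.exp(temp_c / 17.0)
    boric = 0.106 * (f1 * f_sq) / (f_sq + f1 ** 2) * math.exp((ph - 8.0) / 0.56)
    mgso4 = (0.52 * (1.0 + temp_c / 43.0) * (salinity_ppt / 35.0)
             * (f2 * f_sq) / (f_sq + f2 ** 2) * math.exp(-depth_km / 6.0))
    water = 0.00049 * f_sq * math.exp(-(temp_c / 27.0 + depth_km / 17.0))
    return boric + mgso4 + water

# Settings from the text: 130 kHz, salinity 8 ppt, depth 44 m, pH 8.
# The temperatures are hypothetical.
for temp in (5.0, 10.0, 15.0):
    alpha = absorption_db_per_km(130.0, temp, 8.0, 0.044, 8.0)
    print(f"T = {temp:4.1f} °C  ->  {alpha:.2f} dB/km")
```

Comparing the output across temperatures shows how sensitive absorption is to the temperature range of interest, which is the comparison the paragraph above describes.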

The Baltic Sea does not have any significant tides (a few centimeters) due to the small opening to the North Sea; therefore, this factor was not considered a potential source of bias. Additionally, characteristics of the station, such as depth and bottom type, may influence detection. However, as the same 12 stations were used over time, and trends were examined at each station, it is unlikely that these factors influenced the detection rates recorded over time.

2.3 Temporal trend analysis

To investigate how the detection rate of harbor porpoises changed over the years (2011–2019), data from three stations (1032, 1036, and 1041) with 90% of the DPH in this study (across both SAMBAH and SNMP) were selected, as they are also located in an area of high density for this population during May–October, when breeding takes place (Northern Midsea Bank, Figure 1) (Carlén et al., 2018). The selection of these stations also ensured that the detections were likely to be of animals from the Baltic Proper population, as stations closer to the proposed May–October management border further to the west (Carlén et al., 2018) are more likely to contain detections that could be from the neighboring Belt Sea population. Using only the data from May to October, a yearly population index was defined as $\mu_y$, the arithmetic mean of the (possibly partially imputed; see Section 2.2) counts, a measure of the average number of DPH per day. To investigate trends over time, log-linear regressions were fitted to the yearly indices for each of the three stations. Only five complete years of data were collected over the course of the two studies (SAMBAH: 2011, 2012; SNMP: 2017, 2018, 2019). For the purposes of this study, data were assumed to meet model assumptions (e.g., normality), even though this could not be tested with five data points.
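A log-linear trend of this kind can be sketched as an ordinary least-squares fit of the log of the yearly index against year. The following Python example is only an illustration: the index values are made up, the function name is hypothetical, and it omits the significance testing a real analysis would need.

```python
import numpy as np

def log_linear_trend(years, index):
    """Fit log(index) = a + b * year; return slope b and implied annual change."""
    years = np.asarray(years, dtype=float)
    index = np.asarray(index, dtype=float)
    # Centering the years keeps the least-squares problem well conditioned.
    slope, _intercept = np.polyfit(years - years.mean(), np.log(index), 1)
    annual_change = np.exp(slope) - 1.0   # e.g. 0.05 means +5% per year
    return slope, annual_change

# Hypothetical yearly indices (mean DPH per day) for the five sampled years.
years = [2011, 2012, 2017, 2018, 2019]
mu = [0.20, 0.22, 0.30, 0.33, 0.35]
slope, annual_change = log_linear_trend(years, mu)
print(f"slope = {slope:.4f}, annual change = {annual_change:.1%}")
```

Because the regression is on the log scale, the slope translates into a constant proportional change per year rather than a constant absolute change.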

2.4 Indicators of population trends in abundance

Under EU legislation (European Commission, 1992, 2008), all countries with harbor porpoises in their waters are required to set regional or sub-regional indicator thresholds that provide information on whether the species has achieved good environmental status for abundance. For the North Sea population of harbor porpoises and other cetacean populations within the OSPAR region, OSPAR has proposed a threshold for trends in abundance, set as a 5% change over 10 years (significance level α < .05) (CEMP, 2019). An abundance indicator is still in development for the Baltic Proper population of harbor porpoises within HELCOM. However, we calculated the power to detect a 5% change over 10 years in the Baltic Proper data (using detection rates, not abundance data) at these Swedish stations. These stations represent the area with the highest detection rates in the May–October distribution range of the Baltic Proper population (Carlén et al., 2018), and are therefore most likely to be able to detect a change. We also calculated the number of years required to have 80% power to detect a 5% change in this region, as this information may be useful for further indicator development. Although our calculations are based on detection rates (not abundance estimates), it is likely that detection rates will need to be used as an index of abundance, serving as the indicator for this population within HELCOM, over the next two to three EU reporting cycles and repeatedly thereafter. This is because obtaining enough updated estimates of absolute abundance for this population to estimate a trend is still likely to be decades away (assuming one abundance estimate every 10–12 years), and it will be even longer before such surveys can be carried out once per six-year reporting cycle.

Heat Transfer | Short/Long Answer Questions

(a) The factors that decide the rate of evaporation are:

  • Temperature
  • Surface area exposed
  • Partial pressure of liquid in the air above it.

When air is blown above the surface of a liquid, it carries away the vapor-laden air from just above the liquid. This lowers the humidity there and increases the rate of evaporation.

(b) The factors that decide the rate of evaporation are:

  • Temperature
  • Surface area exposed
  • Partial pressure of liquid in the air above it.

On increasing the surface area, the number of molecules at the surface of the liquid increases, so evaporation takes place more rapidly.

(c) The factors that decide the rate of evaporation are:

  • Temperature
  • Surface area exposed
  • Partial pressure of liquid in the air above it.

An increase in temperature increases the kinetic energy of the molecules, so they overcome the forces of attraction between them more easily and evaporate faster.

Hello students, welcome to Lido Learning's question and answer videos. Let's have a look at the interesting question in front of us. We have to give reasons for the increase in the rate of evaporation, that is, how fast the evaporation of a liquid takes place. Why is there an increase in the rate of evaporation when air is blown above the liquid (case a), when the surface area of the liquid is increased (case b), and when the temperature of the liquid is increased (case c)?

Let's first understand what we mean by evaporation. Evaporation is the process by which a liquid changes into a gas. It is a surface phenomenon, and it does not necessarily require an increase in temperature.

Let's move ahead with part a. Have a look at the wet clothes drying in the third picture, where we have placed a fan. When we place a fan near drying clothes, we see that they dry faster. Why does this happen? Let's look into that in a bit more detail. Say these are wet clothes, so there are water droplets on the surface; or say this is a container with the liquid in it. When air is blown above the surface of the liquid, it takes away the liquid-carrying air particles from the air above the liquid, resulting in a decrease in humidity. Because there is less humidity, the rate of evaporation increases, and the clothes dry faster.

Let's look at the second point, that is, the first picture. Why does evaporation increase when we increase the surface area? We know that evaporation is a surface phenomenon. On increasing the surface area, the number of molecules at the surface of the liquid increases. If I increase the surface area of this container, more molecules will be present at the surface, and since more molecules are there, evaporation will take place more rapidly. A good example: if you don't spread your clothes out for drying, do they dry faster or slower? We see that when we spread out the clothes they dry very fast. By spreading out the clothes we are actually increasing their surface area, and as a result the rate of evaporation is higher.

Let's have a look at the third point, that is, when the temperature of the liquid is increased. Say we are drying clothes on a sunny day, and we are drying clothes at night. In which of these two cases will the clothes dry faster? On a sunny day the clothes dry faster than at night. Why? Because the temperature is higher when the sun is out. The increase in temperature increases the kinetic energy of the molecules. Again, think of this as a liquid with molecules at the surface. On a bright sunny day the kinetic energy of the molecules increases; as a result, they overcome the force of attraction between themselves, escape from the surface, and form gas molecules. That is why the rate of evaporation is higher when the temperature is higher than when it is lower; for example, evaporation is faster during the day than at night.

I hope this point was clear. If you have any further questions, please post your comments below. Thank you.


Recent Rodent Duplicates

We retrieved gene duplicates from the Homolens (version 1) database of automatically inferred phylogenies constructed using Ensembl gene predictions (Penel S, Duret L, personal communication), which we queried using FamFetch (Dufayard et al. 2005). We searched for cases of recent lineage-specific gene duplication in rodents, where a single gene in rat or mouse has exactly 2 coorthologs in the second species. Homolens internal identifiers were mapped to Ensembl identifiers, which were then used to retrieve map locations. Because Ensembl contains some annotated “introns” that are frameshift corrections, for analyses of intron content, we only considered annotated introns that are ≥50 nt and flanked by coding exons.

Gene Duplication Categories

We categorized recent rodent duplicates on the basis of 2 criteria: relative location in the genome and mechanism of duplication. We designated all physically linked duplicate pairs with <5 intervening genes as “local” duplications. All other duplicates either on the same chromosome or on different chromosomes were classified as “distant.”

We classified duplicated genes on the basis of duplication mechanism by distinguishing between RNA-mediated retrotranspositions (which typically create a single-exon retrogene from a multiexon paralog) and DNA-based transpositions (which typically conserve exon–intron structure), using a rigorous set of criteria based on counts of coding exons. For pairs consisting of a single-exon gene with a multiexon paralog, we counted the introns of the latter gene that lie within the protein alignment of the 2 duplicates. If ≥2 introns were present, we inferred that duplication had occurred by retrotransposition resulting in the loss of these introns in the retrocopy. Because all detected retrogenes have a single coding exon, this set excludes retrogenes that have been incorporated into chimeric coding regions following gene-fusion events, but potentially includes cases that have acquired noncoding exons de novo following retrotransposition ( Vinckenbosch et al. 2006). If both members of a duplicate pair contained ≥2 exons, we counted introns within the alignment, and where there was evidence of not more than 1 intron loss, we inferred that duplication had occurred by DNA-based transposition. Although these strict criteria allow confident inference of the mechanism of duplication for many pairs, they leave some pairs unclassified. For example, when both members of a duplicate pair are single-exon genes, we were not able to infer the mechanism. Such unclassified pairs were not used for the analysis of the impact of duplication mechanism on rate asymmetry, but were included in the analysis of the effect of duplicate relocation.
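The two classification rules described above (by location and by mechanism) can be summarized in a short Python sketch. The `DuplicatePair` record and its field names are hypothetical conveniences for illustration, not structures from the original analysis; only the thresholds come from the text.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DuplicatePair:
    same_chromosome: bool
    intervening_genes: Optional[int]  # None when on different chromosomes
    exons_a: int                      # coding exons in duplicate A
    exons_b: int                      # coding exons in duplicate B
    introns_in_alignment: int         # introns of the multiexon copy within the protein alignment
    intron_losses: int                # inferred intron losses between two multiexon copies

def classify_location(pair: DuplicatePair) -> str:
    """'local' if physically linked with <5 intervening genes, else 'distant'."""
    if pair.same_chromosome and pair.intervening_genes is not None and pair.intervening_genes < 5:
        return "local"
    return "distant"

def classify_mechanism(pair: DuplicatePair) -> str:
    """Apply the exon-count criteria; pairs that fit neither rule stay unclassified."""
    fewest, most = sorted((pair.exons_a, pair.exons_b))
    if fewest == 1 and most >= 2:
        # Single-exon gene with a multiexon paralog: need >=2 introns lost.
        return "retrotransposition" if pair.introns_in_alignment >= 2 else "unclassified"
    if fewest >= 2:
        # Both multiexon: exon-intron structure conserved (at most 1 intron loss).
        return "dna_based" if pair.intron_losses <= 1 else "unclassified"
    return "unclassified"  # both single-exon: mechanism cannot be inferred

example = DuplicatePair(False, None, exons_a=1, exons_b=6,
                        introns_in_alignment=4, intron_losses=0)
print(classify_location(example), classify_mechanism(example))
```

Keeping "unclassified" as an explicit outcome mirrors the text: such pairs are dropped from the mechanism analysis but can still be used when only relocation matters.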

Direction of (retro)Transposition of Distant Duplicates

For distant duplicates, we established the direction of (retro)transposition to discriminate between the relocated paralog and the static paralog that remains at the ancestral locus. This was done using a framework of positional anchors consisting of unduplicated single-copy genes for which there is a 1:1:1 orthologous relationship between human, mouse, and rat. These singletons were retrieved from Homolens using FamFetch with the query topology ((mouse, rat), human) constrained so that no gene duplication has occurred since the primate–rodent split.

To establish the direction of (retro)transposition of distantly separated mouse duplicates that are coorthologs of a single rat gene, for example, we located the closest singleton anchors that bracket the rat gene ( fig. 1A). We then determined the locations of the single mouse orthologs of the rat bracketing genes. When the mouse orthologs of a pair of rat bracketing genes are linked in mouse, and themselves bracket 1 of the 2 mouse duplicates, we designated the bracketed mouse duplicate as the static copy and the other mouse duplicate as the relocated duplicate. Assignment of the direction of (retro)transposition by this method was possible for 118 of the 147 distant duplicates in mouse and for 106 of the 137 distant duplicates in rat.
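The anchor-based logic can be sketched as follows, under simplifying assumptions: "linked" is taken to mean "on the same chromosome", gene positions are single coordinates, and the mouse orthologs of the rat bracketing anchors are already known. The function and the coordinates are hypothetical; the gene names follow the figure legend.

```python
def polarize_duplicates(dup1, dup2, anchor_left, anchor_right, positions):
    """Return (static, relocated) if exactly one duplicate lies between the anchors.

    positions maps each gene name to (chromosome, coordinate) in the species
    carrying the duplicates. Returns None when the direction cannot be assigned.
    """
    chrom_left, pos_left = positions[anchor_left]
    chrom_right, pos_right = positions[anchor_right]
    if chrom_left != chrom_right:
        return None  # anchors are not linked in this species
    low, high = sorted((pos_left, pos_right))
    bracketed = [
        gene for gene in (dup1, dup2)
        if positions[gene][0] == chrom_left and low < positions[gene][1] < high
    ]
    if len(bracketed) != 1:
        return None  # neither or both duplicates sit between the anchors
    static = bracketed[0]
    relocated = dup2 if static == dup1 else dup1
    return static, relocated

# Made-up mouse coordinates: XM and ZM flank YM1, while YM2 lies elsewhere.
mouse_positions = {
    "XM": ("chr1", 100),
    "YM1": ("chr1", 150),
    "ZM": ("chr1", 200),
    "YM2": ("chr7", 500),
}
result = polarize_duplicates("YM1", "YM2", "XM", "ZM", mouse_positions)
print(result)
```

Returning `None` for ambiguous cases corresponds to the duplicates for which the direction of (retro)transposition could not be assigned.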

Fig. 1. (A) Determining the direction of transposition for distantly separated duplicates: Duplication of gene Y in the mouse lineage following mouse–rat speciation created 2 mouse duplicates (paralogs YM1 and YM2), which are coorthologs of a single rat gene (YR). To polarize the direction of transposition in mouse and thus discriminate between the static and transposed duplicates, we considered genes XR and ZR that flank gene YR in rat and have single orthologs in mouse (XM and ZM). Because both XM and ZM are found to flank YM1, this duplicate can be designated the static copy, implying that its paralog (YM2) has been relocated by retrotransposition. (B) Classification by duplication mechanism of duplicate pairs having branch-specific dS > 0.001 and dN > 0.001. A resampling strategy was applied to these 98 duplicates to separately determine the effect of duplicate relocation and retrotransposition on rate asymmetry measured by RN.


Measures of Sequence Evolution

For each sequence triplet consisting of a single-copy gene in 1 rodent species and its 2 coorthologs in the second rodent species, we aligned the Ensembl protein sequences using ClustalW (Thompson et al. 1994) and back-translated the protein alignment to create a codon-based alignment. These alignments were used as input to the program like-tri-test (Conant and Wagner 2003) to estimate branch-specific rates of synonymous (dS) and nonsynonymous divergence (dN) as well as branch-specific estimates of dN/dS.

Prevalence of Significantly Asymmetric Sequence Divergence

We examined the prevalence of significantly asymmetric sequence divergence using a likelihood-based approach. For each pair of duplicates, we tested whether a model of unconstrained evolution on the branches leading to each duplicate gave a significantly better fit to the data than a null model in which the duplicates were constrained to evolve symmetrically. We used like-tri-test (Conant and Wagner 2003) to test 3 null models representing symmetry between duplicates with respect to synonymous divergence (dS1 = dS2), nonsynonymous divergence (dN1 = dN2), and strength of selective constraint (ω1 = ω2). For each of these tests, we compared the likelihoods of the alternative models of constrained and unconstrained evolution. When twice the difference in log likelihoods exceeded 3.84 (χ² test with 1 degree of freedom, P ≤ 0.05), the null model of symmetric divergence was rejected and duplicate gene divergence was classed as asymmetric. Otherwise, the divergence of the duplicates was designated symmetric for that measure. The purpose of this analysis was to calculate the relative prevalence of asymmetry between different types of duplicate and not to determine whether sequence divergence was significantly asymmetric for an individual pair of duplicates. Therefore, we did not perform a multiple testing correction. Because this approach relies on the likelihood ratio test to assign duplicates as either symmetric or asymmetric but does not quantify the magnitude of sequence asymmetry, we did not impose the filter used in the previous section that excludes duplicates with branch-specific values of dS and dN < 0.001.
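The decision rule can be made concrete with a small Python sketch of a likelihood ratio test with 1 degree of freedom. The log-likelihood values are invented for illustration; in practice they would come from like-tri-test or a similar program. For 1 degree of freedom, the χ² tail probability can be written with the complementary error function, so no statistics library is needed.

```python
import math

CHI2_CRITICAL_1DF = 3.84  # 5% critical value of chi-square with 1 degree of freedom

def likelihood_ratio_test(lnl_null, lnl_alt):
    """Compare a constrained (symmetric) model with an unconstrained one.

    Returns the test statistic, its P value (1 df), and whether symmetry is rejected.
    """
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value, statistic > CHI2_CRITICAL_1DF

# Hypothetical log likelihoods for one duplicate pair.
stat, p, asymmetric = likelihood_ratio_test(lnl_null=-1002.5, lnl_alt=-1000.0)
print(f"2*dlnL = {stat:.2f}, P = {p:.4f}, asymmetric = {asymmetric}")
```

The same function would be applied three times per pair, once for each null model (dS, dN, and ω symmetry).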

Gene Expression Information

For each recent duplicate in mouse, we looked for evidence of expression by using the predicted transcript as a MegaBlast (Zhang et al. 2000) query to mouse expressed sequence tags (ESTs) and cDNAs. We did not study gene expression in rat duplicates because this species has much lower overall EST coverage than mouse. Starting with all hits with E < 1 × 10⁻²⁰, any ESTs with >75% of their sequence aligned with >97% nucleotide identity to only one of the 2 duplicates were assigned to that duplicate. Any ESTs matching both duplicates by these criteria were aligned to them using ClustalW. We then considered diagnostic sites at which the EST sequence shares an identical base with only one of the 2 duplicates and where all 3 sequences are well aligned (i.e., no gap occurs within 2 nt). Only if all diagnostic sites group the EST with the same duplicate did we assign the EST to that gene.

We assigned ESTs to tissues using the TissueInfo database ( Skrabanek and Campagne 2001), discarding ESTs from cancerous sources and keeping only those from normal unpooled tissues. For each tissue, we quantified a gene's expression frequency using the count of its ESTs from that tissue expressed as a fraction of all ESTs derived from that tissue. We then used the highest expression frequency for a gene among all tissues to represent its peak expression. We quantified the asymmetry in peak expression between a given pair of duplicates using the absolute (unsigned) normalized difference in peak expression, where P1 and P2 are the peak expression levels of each duplicate. For unlinked duplicates for which the direction of (retro)transposition could be determined, we also quantified the direction of change in expression using the signed normalized difference in expression peak, where Ps and Pr are the peak expression levels of the static and relocated duplicate, respectively. Similarly, we defined expression breadth (B) as the number of distinct tissues represented among the ESTs assigned to the gene. Because retrogenes are sparsely sampled with ESTs (see Results), we could not reliably quantify the expression breadth of individual retrogenes. Thus, if a retrogene is expressed ubiquitously but at a very low level its expression may appear tissue specific purely as a consequence of low EST coverage. Although this prevented us from estimating asymmetry in expression breadth for individual pairs of retroduplicates (analogous to the measures of asymmetry in expression peak, RP and SRP), we were able to test whether the expression breadth of retrogenes as a group is significantly different to that of their static progenitor paralogs (see Results).
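The expression measures can be sketched in Python as follows. Peak expression and breadth follow the definitions in the text. The exact normalization of the asymmetry measures is not given here, so dividing the difference by the sum of the two peaks, and the sign convention (static minus relocated), are assumptions made purely for illustration. All names and counts are hypothetical.

```python
def expression_profile(gene_est_counts, tissue_est_totals):
    """Return (peak expression frequency, expression breadth) for one gene.

    gene_est_counts: tissue -> number of ESTs assigned to the gene.
    tissue_est_totals: tissue -> total number of ESTs derived from that tissue.
    """
    frequencies = {
        tissue: count / tissue_est_totals[tissue]
        for tissue, count in gene_est_counts.items()
        if count > 0
    }
    peak = max(frequencies.values(), default=0.0)
    breadth = len(frequencies)  # number of distinct tissues with at least one EST
    return peak, breadth

def peak_asymmetry(p1, p2):
    """Unsigned normalized difference; ASSUMES normalization by the sum of peaks."""
    total = p1 + p2
    return abs(p1 - p2) / total if total > 0 else 0.0

def signed_peak_asymmetry(p_static, p_relocated):
    """Signed version; ASSUMES static minus relocated, normalized by the sum."""
    total = p_static + p_relocated
    return (p_static - p_relocated) / total if total > 0 else 0.0

# Made-up EST counts for one gene.
peak, breadth = expression_profile({"liver": 2, "brain": 1},
                                   {"liver": 1000, "brain": 100})
print(peak, breadth)
```

With sparse EST sampling, breadth computed this way is biased downward for weakly expressed genes, which is the caveat the text raises for retrogenes.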


Molecular Genetics: Mutations

I. Definition ‑ heritable change in DNA sequence

II. Kinds of mutations

III. Hemoglobin, Sickle Cell Anemia and Thalassemia ‑ Some examples
Hemoglobin, the oxygen-carrying protein in red blood cells, has a quaternary structure comprising four chains: 2 alpha (α) and 2 beta (β) chains. Each chain also carries a heme unit that contains an iron atom. The gene sequence of the anti‑sense strand for the β-chain of hemoglobin was provided in class. Some take-home lessons from the gene:

IV. Chromosomal mutations ‑ larger scale changes in nucleotides at the level of the chromosome. Several kinds. (not on exam)

A. Transposons ‑ first identified by B. McClintock at Cold Spring Harbor, NY (1951). She won a Nobel prize for her work. She studied kernel color in maize (corn). She discovered two families of transposons (movable elements, or jumping genes). One she called AcDs (activator/dissociator). These transposons are regions of chromosomes that can move from one place to another. Holy Mendel, Batman. This was a heretical idea at the time and virtually ignored. Ac can move by itself, but Ds only moves in the presence of Ac. A schematic representation of what happens (diagrams on overhead are a little better):

Nina Fedoroff et al. ‑ studied the molecular biology of this system. They found that Ac was 4563 nucleotides long with inverted repeats on each end. The inverted repeats are mirror-image sequences of nucleotides. There were a variety of Ds elements of varying sizes, but all were similar in nucleotide sequence to Ac. One Ds element was identical to Ac except for a missing 194-nucleotide segment. This suggested that something in that segment is necessary for movement. It was deduced to encode a protein called transposase, which cuts out elements with inverted repeat sequences and reattaches them at random. Thus, Ds is essentially a crippled version of Ac: Ac produces the transposase that frees Ds to reinsert in other places.

B. Chromosomal (pericentric) inversions. Chromosome breaks and piece reinserts itself upside down.

C. Robertsonian translocation. Chromosomes fuse end to end, especially in small acrocentric chromosomes.

V. Sources/Causes of mutations

A. Errors during replication

B. Mutagens such as chemicals (replace correct nucleotides, increase rate of error by DNA polymerase) or physical agents (such as xray, UV ‑ thymine dimers)

VI. Importance of Mutations

A. From the individual perspective, a mutation can be good, bad or indifferent. Frequently, a mutation is selected against.

B. From the perspective of a population, mutations are good because they are the raw material necessary for evolutionary change.

VII. Molecular Biology of Evolution ‑ Chimps and Humans (not on exam)

A. DNA similarities/differences can: (1) show phylogeny ‑ the more similar the DNA, the more closely related the species; and (2) indicate when species diverged (time of divergence, a sort of molecular clock).

B. The Third Chimpanzee (Jared Diamond)
In this book Diamond describes some recent studies concerning the evolution of humans. Cites a study by Sibley and Ahlquist (1984) that compared DNA of several primates.

DNA hybridization studies ‑ DNA melts (the strands separate) as the temperature increases. DNA from different species can be mixed to form hybrid double strands. The more similar the DNA, the higher the temperature it takes to melt (separate the strands), because the strands form more hydrogen bonds with their complementary nucleotides. As a rule of thumb, every 1 degree lowering of the melting point corresponds to a 1% difference in the DNA. Conclusions from Sibley/Ahlquist studies:

VIII. Mechanisms of molecular evolution: An example
This example features the evolution of two human sex hormones, β-LH (luteinizing hormone), which stimulates ovulation, and β-HCG (human chorionic gonadotropin), which is produced by the embryo and is the basis of the pregnancy test. Both hormones are proteins, each with 2 chains, an alpha and a beta. The alpha chains of both are identical and coded by chromosome 6. The beta chains of both are similar. They are coded by a series of genes on chromosome 19. There are 8 genes side by side; 7 code for HCG and one for LH. It is hypothesized that an ancestral LH gene duplicated (this occurs during meiosis when there is an unequal crossover, with one chromatid getting extra material and the other missing some). One of the duplicated LH genes was conserved and is our current LH gene. The other diverged, accumulating mutations, and became an HCG gene. This HCG gene was duplicated additional times, resulting in our current arrangement on chromosome 19. Two HCG genes have been sequenced and they differ by two nucleotides. One change is silent; the other results in a difference of one amino acid between the two protein products (beta chain). The beta LH protein is 121 amino acids long. The beta HCG is 145. The first 112 amino acids of both are the same and they have the same length piece of DNA coding for both. For more details, check out the assignment posted online.


The dataset that supports the findings of this study is archived in the Research Center for Eco-Environmental Sciences, State Key Laboratory of Urban and Regional Ecology, Chinese Academy of Sciences and available from the corresponding author upon reasonable request.


Structured Question Worked Solutions

1. Are xylem vessels living or dead structures? Give a reason.

They are dead structures as they do not contain protoplasm.

2. What are the main functions of xylem vessels?


3. State three ways in which xylem vessels are adapted to their functions.

  • The xylem does not have any cross walls and protoplasm - this enables water to move easily through the lumen
  • Lignin deposited on the walls helps strengthen the walls and prevent the vessel from collapsing
  • When bundled together, the xylem vessels provide mechanical support to the plant.
  • The absorption of water takes place through the root hairs.
  • The root hairs grow between the soil particles and are in close contact with the water surrounding them. The sap in the root hair cell is a relatively concentrated solution of sugars and various salts.
  • Thus, the sap has a lower water potential than the soil solution.
  • These two solutions are separated by the partially permeable cell surface membrane of the root hair cells. Water therefore enters the root hairs by osmosis.
  • The entry of water dilutes the sap.
  • The sap of the root hair cell becomes more diluted than that of the next cell.
  • Therefore, water passes by osmosis from the root hair cells into the other inner cells of the cortex. This process continues until the water enters the xylem vessels and moves up the plant.
  • The living cells around the xylem vessels in the root use active transport to pump the mineral salts or ions into the vessels. This lowers water potential in the xylem vessels.
  • Water therefore passes from living cells into the xylem vessels by osmosis and flows upwards.

a. Describe a pathway by which a sucrose molecule is transported from the leaf to a sink such as a fruit.
b. Describe an experiment that can demonstrate the process described in (a).
c. Suggest and explain one reason why a sucrose molecule may be transported to a particular sink and not to other sinks.

a. A pathway is the route taken by the sucrose molecule from the cells in the leaf to the fruit. The sucrose molecule moves from the mesophyll cells in the leaf to the phloem in a vascular bundle of the leaf. This is followed by moving to the phloem in a vascular bundle of the fruit and finally to the cells of the fruit.

b. step 1: cut off a complete ring of bark including the phloem and cambium from the main stem of a woody twig. This will leave the xylem exposed. Place the twig in water with the ring immersed.
step 2: prepare another twig that has a cut ring above the water level.
step 3: set up a control using an unringed twig
step 4: observe the twigs daily. Note where roots or swellings appear. Make drawings of observations.

c. Sucrose can be converted into glucose. Glucose is required for tissue respiration. Sucrose will first be transported to sinks which have higher rates of metabolic activity, such as growing points (shoots and root tips)

6. Describe how water from the capillary tube enters the shoot to reach one of the leaves.


Transpiration by the leaves creates a transpiration pull, which draws water up the xylem vessels to the leaves. Water in the container enters the shoot to replace that lost in transpiration. Capillary action, whereby water tends to move up very narrow tubes, also allows water to move through the capillary tube and through the xylem vessels in the stem to reach the leaves.