Experiment: Once the sequence of a region of genomic DNA is known, as is the case for human
chromosome 22, a PCR-based approach using custom synthetic primers might be developed for rapidly sequencing
syntenic regions of other eukaryotic genomes.
- This idea is based on the observation that exons typically are
conserved between evolutionarily distant species such as humans and mice,
even though their intronic sequences are not. The sequence of human
chromosome 22 reveals that the average genomic size for a gene is 19.2 Kbp
(median 3.7 Kbp) with a mean exon number of 5.3 (median 3.0).
Therefore, the average size of an intron is calculated to be 5.2 Kbp (median 1.2 Kbp),
a size easily amplified by PCR using existing methods. Thus, once the exons
in one genome (human for example) are predicted, a set of exon-specific
custom synthetic primers could be synthesized using this known genomic sequence.
These primers could be used first to PCR a portion of adjacent exons and their
intervening intronic or intragenic sequences, using another, related genome
(mouse, for example) as the template, and then as specific sequencing primers
off the resulting PCR products.
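As a rough illustration of the primer placement described above, the following Python sketch picks one forward and one reverse primer for each pair of adjacent exons in a known sequence. The function names, the 0-based half-open coordinates, and the choice to place the primers at the outer edges of the two exons are assumptions made for this example; a real primer-picking program would also need to weigh factors such as melting temperature and cross-species conservation.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def exon_primer_pairs(
    genome: str, exons: List[Tuple[int, int]], primer_len: int = 20
) -> List[Tuple[str, str]]:
    """Pick (forward, reverse) primers spanning each pair of adjacent exons.

    `exons` holds 0-based, half-open (start, end) coordinates on the forward
    strand, sorted by position. Assumption: the forward primer is the first
    `primer_len` bases of the upstream exon and the reverse primer is the
    reverse complement of the last `primer_len` bases of the downstream exon,
    so the product covers both exons and the intervening sequence.
    """
    pairs = []
    for (s1, e1), (s2, e2) in zip(exons, exons[1:]):
        if e1 - s1 < primer_len or e2 - s2 < primer_len:
            continue  # exon too short to hold a full-length primer
        forward = genome[s1:s1 + primer_len]
        reverse = reverse_complement(genome[e2 - primer_len:e2])
        pairs.append((forward, reverse))
    return pairs
```

Each pair would be used twice: once for PCR on the related genome, and again as sequencing primers on the resulting product.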
-
For this to be successful, the exons must be predicted with reasonable assurance of being correct, the
exon-specific primers must then be produced for the PCR off the target genomic DNA, and primer-based DNA
synthesis must follow off the corresponding PCR product.
-
This approach can be tested by PCR using human exon-specific primers with a region of known mouse genomic
DNA sequence to determine the efficiency and whether any additional rules must be developed for improved primer
picking. Target exons can be predicted by standard in silico approaches such as Blast searches of GenBank nr
and EST databases, overlaid with results from FGenesh, XGrail, Genscan, and/or other ab initio methods.
Primers will be picked using a modified version of PrimOU and synthesized on the MerMade in 96 well format at
a cost of less than $1 per 20-mer. Once the efficiency of this approach is determined using the known, completed
human and mouse sequence-based comparison, other regions of the mouse genome could be sequenced
using human-specific primers to provide additional information about the success rate of both the genomic PCR
and the subsequent sequencing, with the same human-specific primers, of the PCR-produced mouse genomic
DNA templates. All primers and PCR products will be produced and archived in 96- or 384-well bar-coded
microtiter plates for bookkeeping and easy retrieval, and stored at -70 deg C for later use if needed.
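The overlay of several lines of exon evidence mentioned above could be sketched as an interval-support count: keep only those stretches that at least a minimum number of independent sources agree on. This is a minimal Python sketch under stated assumptions, not the pipeline actually used; the function name, the source labels, the `min_support` threshold, and the 0-based half-open coordinates are all hypothetical, and each source's own intervals are assumed not to overlap one another.

```python
from typing import Dict, List, Tuple


def supported_regions(
    predictions: Dict[str, List[Tuple[int, int]]], min_support: int = 2
) -> List[Tuple[int, int]]:
    """Return regions covered by at least `min_support` evidence sources.

    `predictions` maps a source label to its predicted exon intervals,
    given as 0-based, half-open (start, end) pairs.
    """
    events = []
    for intervals in predictions.values():
        for start, end in intervals:
            events.append((start, 1))   # interval opens
            events.append((end, -1))    # interval closes
    # At equal positions, closes (-1) sort before opens (+1), so intervals
    # that merely abut are not counted as overlapping.
    events.sort()

    regions = []
    depth = 0
    region_start = 0
    for pos, delta in events:
        previous = depth
        depth += delta
        if previous < min_support <= depth:
            region_start = pos          # support threshold reached
        elif depth < min_support <= previous and pos > region_start:
            regions.append((region_start, pos))  # support dropped below threshold
    return regions
```

Raising `min_support` trades sensitivity for confidence, which matters here because a mispredicted exon yields a primer that fails on the target genome.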
-
Individual PCR'd regions that are not closed using the custom synthetic PCR primers for sequencing will be
closed by additional rounds of custom primer synthesis, using primers picked from the newly generated sequences, and subsequent sequencing off the already
produced PCR products. Any regions that are not spanned by PCR products, such as regions with introns or
intergenic regions larger than 10 Kbp, regions representing multiple gene copies (either true genes or pseudogenes), or
regions with large, low copy number repeated sequences, will require more classical BAC-based sequencing,
choosing BACs whose end sequences fall in the PCR-based sequenced regions. With ~800 genes containing
~4,240 exons representing ~40 % of human chromosome 22, approximately 8,480 primers initially will be
required to produce the exon-specific PCR products and an additional ~50,000 primers will be needed to complete
the sequence of these PCR products. The ~58,480 primers, with an average read length of 500 bases, will give
~29.2 Mbp of double stranded sequence covering approximately 90% of the entire 32.5 Mb region of the mouse
genome syntenic to human chromosome 22. Therefore, the number of shotgun sequence BACs needed will
represent ~10 % of the target syntenic regions based on the above calculations taking into account the known,
predicted exons on human chromosome 22. If the PCR-based sequencing is only 70% efficient, then the total
number of additional BAC-based shotgun sequences needed will represent 30 % of the total.
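The primer and coverage estimates above can be reproduced with a few lines of arithmetic. The Python sketch below simply restates the figures given in the text (two primers per exon, 50,000 closing primers, 500-base reads, a 32.5 Mb target, and an upper bound of $1 per primer); the function and parameter names are assumptions, and the calculation follows the paragraph in counting every read base toward coverage of the target region.

```python
def primer_budget(
    genes: int = 800,
    exons_per_gene: float = 5.3,
    primers_per_exon: int = 2,
    closing_primers: int = 50_000,
    read_length_bp: int = 500,
    target_region_bp: int = 32_500_000,
    max_cost_per_primer_usd: float = 1.0,
) -> dict:
    """Restate the primer count, sequence yield, and coverage estimate."""
    exons = round(genes * exons_per_gene)
    initial_primers = exons * primers_per_exon
    total_primers = initial_primers + closing_primers
    sequenced_bp = total_primers * read_length_bp
    coverage = sequenced_bp / target_region_bp
    return {
        "exons": exons,
        "initial_primers": initial_primers,
        "total_primers": total_primers,
        "sequenced_bp": sequenced_bp,
        "coverage_fraction": coverage,
        # Fraction of the target left for BAC-based shotgun sequencing
        "bac_fraction": 1.0 - coverage,
        # Upper bound, since primers are quoted at less than $1 each
        "primer_cost_upper_bound_usd": total_primers * max_cost_per_primer_usd,
    }
```

Changing `closing_primers` or `read_length_bp` shows how sensitive the ~90% coverage figure is to those two assumptions.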
- The cost of this exon PCR-based approach to sequencing regions of
other mammalian genomes that are syntenic to regions of the human genome
is ~one-tenth that of a mapped BAC shotgun approach, provided the cost of custom
synthetic primers is less than $1 per 20-mer.
Other Applications of this PCR-based approach to comparative sequencing:
-
If successful for human-mouse comparative sequencing, this approach will eliminate the need for highly
expensive and labor-intensive BAC target clone mapping and target clone isolation, as well as shotgun library
production, isolation and sequencing, except for those regions (between 10 and 30% of the total syntenic regions)
that have very large introns (>10 Kbp), which may be difficult to PCR, and/or regions of large repeats that are not
comparable between evolutionarily distant species.
- If individuals already have the clones for a region of high biological interest mapped in a strain of
mouse other than B6, that region could be sequenced from the non-B6 strain, and this
PCR-based approach could then be used to sequence the same region from the B6 strain. Since most mouse
strains are nearly identical, with only ~1 SNP per 500 bases, primers could be made throughout the region (from introns as well
as exons) instead of using the exon-specific primer-based approach proposed for cases where the genomic sequences
have diverged greatly.
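For the closely related strain case, primers picked throughout the region could be laid out at a regular spacing so that successive reads overlap. The Python sketch below shows one way to compute such start positions; the function name, the 100-base overlap, and the regular spacing itself are assumptions for illustration, with the 500-base read length and 20-mer primers taken from the figures used earlier.

```python
from typing import List


def tiled_primer_starts(
    region_length: int,
    read_length: int = 500,
    overlap: int = 100,
    primer_length: int = 20,
) -> List[int]:
    """Return 0-based start positions for primers tiled across a region.

    Primers are spaced so that consecutive reads share `overlap` bases.
    Only positions where a full-length primer still fits are returned.
    """
    if overlap >= read_length:
        raise ValueError("overlap must be smaller than read_length")
    step = read_length - overlap
    last_start = region_length - primer_length
    if last_start < 0:
        return []
    return list(range(0, last_start + 1, step))
```

In practice the positions would be shifted locally to avoid known SNPs and repeats rather than kept on a strict grid.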
-
Similarly, this approach of PCR followed by primer-based sequencing could also be used for other comparative
genomic sequencing studies, such as comparative primate, feline, or bovine studies.