Act88F.seq
VERSION 5:
HTML version of the Sparrow lab working sequence, prepared by JDC 20/10/94.
All HTML formatting was deliberately placed in this header region to
retain compatibility with the GCG software. To recover the working
file, save this page in ASCII format.
SEQUENCE FEATURES:
Transcription start 1420
5' intron (in UTR) 1499-2033
Translation start (ATG) 2050-2052
3' intron (in cds) 2975-3034
Translation stop (TAA) 3238-3240
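The feature coordinates above are 1-based and inclusive. As a quick
consistency check, the following Python sketch (not part of the original
file; the names FEATURES, span_length and cds_length are illustrative)
computes the coding length implied by the table. It assumes the coding
region runs from the first base of the ATG to the last base of the TAA,
with only the 3' intron removed.

```python
# Coordinates copied from the SEQUENCE FEATURES table (1-based, inclusive).
FEATURES = {
    "five_prime_intron": (1499, 2033),
    "translation_start": (2050, 2052),
    "three_prime_intron": (2975, 3034),
    "translation_stop": (3238, 3240),
}


def span_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def cds_length(features):
    """Bases from the ATG through the stop codon, minus the intron in the cds.

    Assumption: the 3' intron is the only interruption of the coding region.
    """
    cds_start = features["translation_start"][0]
    cds_end = features["translation_stop"][1]
    intron = features["three_prime_intron"]
    return span_length(cds_start, cds_end) - span_length(*intron)


length = cds_length(FEATURES)
print(length, length % 3)
```

With the table as given this comes to 1131 bases, a whole number of
codons (377 including the stop).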
EXTRA INFORMATION:
3' end of Act88F (from R. Cripps)
Insertion in 5' intron of allele A138V
3D structure information.
VERSION 4:
Revised 16/10/89 to include changes in the 5' non-coding region found by
Emma Hennessey.
The sequence from 1536 to 1580 was confirmed as correct; between 1580
and 1684 there are several changes, including the addition of 7 new bases.
There are also extra bases, TAG at positions 1970-1972 and G at 1999,
which are not shown in Fyrberg. Apart from these changes the sequence
from 1904-2114 was confirmed as correct.
In the coding region (as shown above) the sequence to aa22 was
confirmed as correct.
In addition the sequence from aa100-aa163 was confirmed as correct
except for changes in position 3 of aa150 and aa151 (no change in aa).
In addition I have confirmed the coding region sequence from the KpnI
site (2955) to the termination codon (3237) [aa305-aa375] as correct.
VERSION 3:
This contains JCS revisions to the Fyrberg version of the coding
region, based on the Fyrberg gels.
VERSION 2:
A revision of file allact88.seq made on 11.1.87
This version of the Actin88f gene sequence contains the Fyrberg et al
version of the coding sequence.
CONTAINS ALL REVISIONS BY JCS AS OF 16.9.86
ENTERED BY DRD
The sequence to the transcription start site is from Geyer and
Fyrberg (1986) Molec. Cell Biol. 6 000-000 "5'-Flanking Sequence
Required for Regulated Expression of a Muscle-Specific Drosophila
melanogaster Actin Gene".
Coding sequence is from Sanchez et al, corrected to correspond to
Fyrberg. Details of Fyrberg's version of the Canton S wild type
Actin 88F sequence are given in table 1 of Mahaffey et al (1985)
Cell 40 101-110 "The Flightless Drosophila Mutant raised has Two
Distinct Genetic Lesions Affecting Accumulation of Myofibrillar
Proteins in Flight Muscles."
Also see Karlik et al. (1984) Cell 38 711-719.
All other sequences are from Sanchez et al (1983) J Mol Biol 163
533-551 "Two Drosophila actin genes in detail"
NB Fyrberg gives nucleotide 8 as G not C,
25 (Sanchez) as G not A,
52-53 as GG not AA.
Fyrberg omits nucleotide 63 (Sanchez) (T).
This file has the Sanchez version of these nucleotides.
Act88fgene.Seq  Length: 3516  October 21, 1994 18:10  Type: N  Check: 5191  ..
1 TCTAGAATGC ACAATAGGCA AATTTAGTTA AGATATGAAT TTTTAAATAA
51 ATGGTGAGCC CAATCAATTC AGTGGTTGAA TGACTTTTCA TAAATTAAAA
101 AATAAAGATA AGAATGGTGA ACAATTCTGT TCGCAGCCAA TAACCTCTTG
151 CTCAATACAC GTGTCAATCA AGGCAATCCA AATAAAACGC TTTGGGAATG
201 CCACCAATTC ACTTCCGAGC ATCAGTTCCT ATCTTTAGCC AACCGATTCG
251 ATTATTTCAT GTGGGCAAGC AATAAAAACG TAAATAGAAG AAGTAAAAAA
301 TAATTAAATC TACATAAAGG AATAAATACA GTTCGATTGA GAAAATACAT
351 TTTCGCTCGG TCTGGCTGGC AATGGTTGGT TAATTGCACT GATAAATGGT
401 CGGCACGGTG ATTTCGCAAC TTCGGGATTG CATCGGCGCC GCAATGCAAA
451 GTGCAGCAGC ATTCTGTAGA ATGCGATTGC AAATGTGGAT GCAGCTTCCT
501 CGAGCACCGC GCGCGGAGAT CTGATCAACC TTGCGTGTTG ATTTATCGGT
551 GCCGCTCTGC TTGGCGCGTC TATTTTAGAT TCGCCTCGCT GCGTGCCCGT
601 TGAAATGTCC CATTCTCCCA GTCCCTGCCG CGGATGCCAA TTGTCTTGCG
651 TCGGTCCTTC TAAGGTCCGT TTCTATTTTC CGAAGCTCTC AGCACCGAAT
701 GAGTCGTCCG CCGCAGCAGT CGCCCATTGG CAGCAGGATT GGGACAGAGA
751 TGGGGACGGA GATGGGGCTA ATTGGCCGCT CGAGAGTGCT GATTGCCGTT
801 TAGGTGGCCC ATACACCGCT ATCACGCACC TCTGCTAATC ACTCGGCTAT
851 GGCGTTCTCT TATCTTTCGA GAGCTTTCTC TCTCCTGGCA CTCCCTAGAA
901 ATAATGAATA GGGTCCTAAG ATTGATAGCT TACTTCCATC ATATATTGTC
951 AATTAATTAA ATATTTCAGG ATTAAAATAT GAAACGAATT GAACATAAAG
1001 TTTCTACTAC ATAGTTATTT AAGCTGTTAT ATGTTATGAG ACCATTTTCT
1051 CAGGATTTGT ACCTACTAAC AATGTGAAAA AAATATAAAA TTGTCATATT
1101 TTCGCAGTTT GGAAATTCCC TCGTTTATTG AATTTATTGG TAATCTTAAT
1151 AAATGATTCT ATGCTTTATT AAGTATTTAA TTGTGTGGCT TCCTTTTTTT
1201 TTGTTGAAAG CGCATTAATG AGTCGTCTTC GTGCAATGAG GCATCCAAAC
1251 TTCTGACATG CTCGGCCAGA AGTCTGAAAA CTGCTTATAT GGATCGGTTC
1301 GAGTTGATTG TTCCGCAGCA CTTTCGCTCA ATCTTTTTCT CAGTGCCGCA
1351 CTGGCATCCA ACTCAAATCG CTTCGAGGGA GAGCCGAGAT ATAAAGGCAG
1401 GACAGACCGA TCGGCGTGCC ATTTGTTCTT GAATCTAGTT GTCAACAGGA
1451 ATCGAACGTG CGACTCTATC CAATTTTTCT CCTTTCGTTG ACCTAAAAGG
1501 TGTGTGAGTG CGACCTCAAT GTCGAAGGAT CCAAGGATTA TTACAGAAAA
1551 AGCCAAGAGG ACTAAGGATA TTAAAACTCT TTTTAATAAG TTCGGATTGT
1601 TTGATGGATT TTTCTACAAG TCCACTAATC GGTCTTCCGA AACGCTTCAA
1651 CTCATCTAAA CTATAAAGTG AAGAGTCACT TGCAACGAAA CGTATTTCAA
1701 TTAATTTGAT ACGTTTAAAT TAAGTTCCAT GAGCCATTCC TTTCCGATAT
1751 TTCCGATATT TTTAGAGCAC TGATTTAGTT TCAAGTGAAT AACCAATTAG
1801 GCATGACTCA AAAGGAAATG GAATATACCA ATTTTGGCAA TTTTTCATGG
1851 TTTTATTTAC TGAAATGTGC TCAAATGGAC AATAGAGTTT CACTTCACTT
1901 CTTCAATATC TTAAAAAGTT AAATATTTTC TTGAGACACA AATTAGTTTT
1951 CTATGTTGTC ATTAAAGTAG TAGAATTTAA AGAATTGAGA TGTAGGTGGG
2001 AGCTAACCGT GTGCACTTCC ATCTCCCTTC CAGATAAACA ACTGCCAAGA
2051 TGTGTGACGA TGATGCGGGT GCATTAGTTA TCGACAACGG ATCGGGCATG
2101 TGCAAAGCCG GCTTCGCCGG TGATGACGCT CCCCGTGCTG TCTTCCCCTC
2151 AATTGTGGGT CGTCCCCGAC ACCAGGGTGT GATGGTGGGT ATGGGTCAGA
2201 AGGACTCGTA CGTGGGCGAC GAGGCGCAAA GTAAGCGCGG TATCCTGACG
2251 CTGAAGTACC CCATCGAGCA CGGCATCATC ACGAACTGGG ACGACATGGA
2301 GAAGATCTGG CATCACACCT TCTACAACGA GCTGCGCGTG GCCCCCGAGG
2351 AGCATCCAGT ATTATTGACC GAGGCTCCAC TGAACCCCAA GGCCAATCGC
2401 GAGAAGATGA CCCAGATCAT GTTCGAGACC TTCAACTCGC CGGCCATGTA
2451 CGTGGCCATC CAGGCCGTGC TCTCCCTGTA CGCCTCCGGT CGTACCACCG
2501 GTATTGTGCT GGACTCCGGC GATGGTGTCT CCCACACCGT GCCCATCTAT
2551 GAGGGCTTCG CCCTGCCCCA CGCCATTCTG CGTCTGGACT TGGCTGGTCG
2601 CGATCTGACC GATTACCTGA TGAAGATCCT GACGGAGCGC GGCTACAGCT
2651 TCACCACCAC CGCCGAGCGT GAGATCGTGC GCGACATCAA GGAGAAGCTG
2701 TGCTACGTGG CTCTGGACTT CGAGCAGGAG ATGGCCACCG CTGCCGCCTC
2751 CACCTCGCTG GAGAAGTCGT ACGAGTTGCC TGACGGCCAG GTGATCACCA
2801 TTGGCAACGA GCGCTTCCGC TGCCCCGAGG CCCTGTTCCA GCCCTCGTTC
2851 CTGGGCATGG AGTCGTGCGG CATCCACGAG ACCGTCTACA ACTCGATCAT
2901 GAAGTGCGAC GTGGACATCC GCAAGGACCT GTATGCCAAC TCTGTGCTGT
2951 CCGGCGGTAC CACCATGTAC CCTGGTACAC GGATCGTTCG CTTCAGCAGT
3001 TGCACTTGTG CTTAATCCTT TGGTGCACTT TCAGGTATTG CCGATCGTAT
3051 GCAGAAGGAG ATCACTGCCC TGGCCCCATC GACCATCAAG ATCAAGATCA
3101 TTGCGCCACC CGAGAGGAAG TACTCCGTCT GGATCGGTGG CTCCATCCTG
3151 GCCTCGCTGT CCACCTTCCA GCAGATGTGG ATCTCGAAGC AGGAGTACGA
3201 CGAGTCCGGC CCCGGAATCG TTCACCGCAA ATGCTTTTAA GTCTTCGCCC
3251 GCCGCGAAAG CTCTTCAAAG GCAGCAACCA GCAGCGACCA ACAAGCATCC
3301 ATCTCGACCT TACCCAACAA CCTCGGCTCG GACAGTGATA GACAAAAGCA
3351 GCGAACCCAT CGCGACAACA ATTATCATCC AACTCAGATT CATAGCAGAT
3401 AATCAGAGGC AACCTCGGTT GTCGGTGGTT ATCTTATGGC ATTTCATCGG
3451 CAGCGGTATA GCGGATTTTT ATTTTGAAGA ACTAATCGTA ATCGTAAGAG
3501 TCGTGGTCTG CTCAGG
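
The numbered lines above can be turned back into a plain sequence string
and indexed with the 1-based coordinates used in the header. The Python
sketch below is illustrative only and is not part of the original file;
parse_gcg_lines, subsequence and gcg_checksum are hypothetical helper
names. The checksum follows the commonly described GCG scheme (position
weights cycling 1 to 57, sum taken modulo 10000); whether it reproduces
the Check value recorded in this file has not been verified here.

```python
def parse_gcg_lines(lines):
    """Join numbered sequence lines into one uppercase string.

    Each line is expected to start with its 1-based position, followed by
    space-separated blocks of bases; other lines are skipped.
    """
    bases = []
    for line in lines:
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue
        bases.extend(fields[1:])
    return "".join(bases).upper()


def subsequence(seq, start, end):
    """Return bases start..end using 1-based, inclusive coordinates."""
    return seq[start - 1:end]


def gcg_checksum(seq):
    """Checksum in the commonly described GCG style (an assumption here)."""
    total = 0
    for i, base in enumerate(seq.upper()):
        total += (i % 57 + 1) * ord(base)
    return total % 10000


# First two sequence lines of this file, used as a small sample.
sample = [
    "1 TCTAGAATGC ACAATAGGCA AATTTAGTTA AGATATGAAT TTTTAAATAA",
    "51 ATGGTGAGCC CAATCAATTC AGTGGTTGAA TGACTTTTCA TAAATTAAAA",
]
seq = parse_gcg_lines(sample)
print(len(seq), subsequence(seq, 1, 6))
```

Applied to the whole file, subsequence(seq, 2050, 2052) and
subsequence(seq, 3238, 3240) give a direct check of the start and stop
codon positions listed under SEQUENCE FEATURES.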