Act88F.seq

VERSION 5:
  HTML version of the Sparrow lab working sequence, prepared by JDC 20/10/94.
  All HTML formatting was deliberately placed in this header region to
  retain compatibility with the GCG software. To recover the workable file,
  save this page in ASCII format.

  SEQUENCE FEATURES:

   Transcription start     1420
   5' intron (in UTR)      1499-2033
   Translation start (ATG) 2050-2052
   3' intron (in cds)      2975-3034
   Translation stop (TAA)  3238-3240
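
  As a quick consistency check on the feature table above, the coordinates
  can be turned into exon ranges for the coding sequence. The Python sketch
  below is illustrative only: cds_exons and splice are hypothetical helper
  names, not part of GCG, and it assumes the coordinates are 1-based and
  inclusive, with the stop codon counted as part of the coding sequence.

```python
# Feature coordinates copied from the SEQUENCE FEATURES table
# (assumed 1-based and inclusive).
TRANSLATION_START = 2050      # first base of the ATG
TRANSLATION_STOP_END = 3240   # last base of the TAA
CDS_INTRON = (2975, 3034)     # 3' intron, which lies inside the cds


def cds_exons(start, stop_end, intron):
    """Return the two coding exons as (first, last) pairs, 1-based inclusive."""
    intron_first, intron_last = intron
    return [(start, intron_first - 1), (intron_last + 1, stop_end)]


def splice(seq, exons):
    """Join the exon segments of seq; exon coordinates are 1-based inclusive."""
    return "".join(seq[first - 1:last] for first, last in exons)


exons = cds_exons(TRANSLATION_START, TRANSLATION_STOP_END, CDS_INTRON)
cds_length = sum(last - first + 1 for first, last in exons)
print(exons, cds_length, cds_length // 3)
```

  With these coordinates the two coding exons are 2050-2974 and 3035-3240,
  1131 bases in total, which is 377 codons including the stop. Passing the
  3516-base sequence as a plain string to splice(sequence, exons) would
  return the spliced coding sequence.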

  EXTRA INFORMATION:
3' end of Act88F (from R. Cripps)
Insertion in 5' intron of allele A138V
3D structure information.
VERSION 4:
   Revised 16/10/89 to include changes in the 5' non-coding region found by
 Emma Hennessey.
   The sequence from 1536 to 1580 was confirmed as correct; between 1580 and
 1684 there are several changes, including the addition of 7 new bases.
 There are also extra bases TAG at positions 1970-1972 and G at 1999,
 which are not shown in Fyrberg. Apart from these changes, the sequence
 from 1904-2114 was confirmed as correct.

   In the coding region (as shown above) the sequence to aa22 was confirmed
 as correct.

   In addition, the sequence from aa100-aa163 was confirmed as correct except
 for changes in position 3 of aa150 and aa151 (no change in aa).

   In addition, I have confirmed the coding region sequence from the KpnI site
 (2955) to the termination codon (3237) [aa305-aa375] as correct.

VERSION 3:
   This contains JCS revisions to the Fyrberg version of the coding region
 based on the Fyrberg gels.

VERSION 2:
 A revision of file allact88.seq made on 11.1.87
 This version of the Actin88f gene sequence contains the Fyrberg et al
 version of the coding sequence.


CONTAINS ALL REVISIONS BY JCS AS OF 16.9.86
  ENTERED BY DRD

   The sequence to the transcription start site is from Geyer and
 Fyrberg (1986) Molec. Cell Biol. 6 000-000 "5'-Flanking Sequence Required
 for Regulated Expression of a Muscle-Specific Drosophila melanogaster
 Actin Gene".

   Coding sequence is from Sanchez et al, corrected to correspond to
 Fyrberg. Details of Fyrberg's version of the Canton S wild type
 Actin 88F sequence are given in table 1 of Mahaffey et al (1985)
 Cell 40 101-110 "The Flightless Drosophila Mutant raised has Two Distinct
 Genetic Lesions Affecting Accumulation of Myofibrillar Proteins in Flight
 Muscles."
   Also see Karlik et al. (1984) Cell 38 711-719.

   All other sequences are from Sanchez et al (1983) J Mol Biol 163
 533-551 "Two Drosophila actin genes in detail"

   NB Fyrberg gives nucleotides    8           as G not C
                                  25 (Sanchez) as G not A
                               52-53           as GG not AA
      Fyrberg omits nucleotide 63 (Sanchez) (T).
      This file has the Sanchez version of these nucleotides.
  
Act88fgene.Seq  Length: 3516  October 21, 1994  18:10  Type: N  Check: 5191  ..

       1  TCTAGAATGC ACAATAGGCA AATTTAGTTA AGATATGAAT TTTTAAATAA 

      51  ATGGTGAGCC CAATCAATTC AGTGGTTGAA TGACTTTTCA TAAATTAAAA 

     101  AATAAAGATA AGAATGGTGA ACAATTCTGT TCGCAGCCAA TAACCTCTTG 

     151  CTCAATACAC GTGTCAATCA AGGCAATCCA AATAAAACGC TTTGGGAATG 

     201  CCACCAATTC ACTTCCGAGC ATCAGTTCCT ATCTTTAGCC AACCGATTCG 

     251  ATTATTTCAT GTGGGCAAGC AATAAAAACG TAAATAGAAG AAGTAAAAAA 

     301  TAATTAAATC TACATAAAGG AATAAATACA GTTCGATTGA GAAAATACAT 

     351  TTTCGCTCGG TCTGGCTGGC AATGGTTGGT TAATTGCACT GATAAATGGT 

     401  CGGCACGGTG ATTTCGCAAC TTCGGGATTG CATCGGCGCC GCAATGCAAA 

     451  GTGCAGCAGC ATTCTGTAGA ATGCGATTGC AAATGTGGAT GCAGCTTCCT 

     501  CGAGCACCGC GCGCGGAGAT CTGATCAACC TTGCGTGTTG ATTTATCGGT 

     551  GCCGCTCTGC TTGGCGCGTC TATTTTAGAT TCGCCTCGCT GCGTGCCCGT 

     601  TGAAATGTCC CATTCTCCCA GTCCCTGCCG CGGATGCCAA TTGTCTTGCG 

     651  TCGGTCCTTC TAAGGTCCGT TTCTATTTTC CGAAGCTCTC AGCACCGAAT 

     701  GAGTCGTCCG CCGCAGCAGT CGCCCATTGG CAGCAGGATT GGGACAGAGA 

     751  TGGGGACGGA GATGGGGCTA ATTGGCCGCT CGAGAGTGCT GATTGCCGTT 

     801  TAGGTGGCCC ATACACCGCT ATCACGCACC TCTGCTAATC ACTCGGCTAT 

     851  GGCGTTCTCT TATCTTTCGA GAGCTTTCTC TCTCCTGGCA CTCCCTAGAA 

     901  ATAATGAATA GGGTCCTAAG ATTGATAGCT TACTTCCATC ATATATTGTC 

     951  AATTAATTAA ATATTTCAGG ATTAAAATAT GAAACGAATT GAACATAAAG 

    1001  TTTCTACTAC ATAGTTATTT AAGCTGTTAT ATGTTATGAG ACCATTTTCT 

    1051  CAGGATTTGT ACCTACTAAC AATGTGAAAA AAATATAAAA TTGTCATATT 

    1101  TTCGCAGTTT GGAAATTCCC TCGTTTATTG AATTTATTGG TAATCTTAAT 

    1151  AAATGATTCT ATGCTTTATT AAGTATTTAA TTGTGTGGCT TCCTTTTTTT 

    1201  TTGTTGAAAG CGCATTAATG AGTCGTCTTC GTGCAATGAG GCATCCAAAC 

    1251  TTCTGACATG CTCGGCCAGA AGTCTGAAAA CTGCTTATAT GGATCGGTTC 

    1301  GAGTTGATTG TTCCGCAGCA CTTTCGCTCA ATCTTTTTCT CAGTGCCGCA 

    1351  CTGGCATCCA ACTCAAATCG CTTCGAGGGA GAGCCGAGAT ATAAAGGCAG 

    1401  GACAGACCGA TCGGCGTGCC ATTTGTTCTT GAATCTAGTT GTCAACAGGA 

    1451  ATCGAACGTG CGACTCTATC CAATTTTTCT CCTTTCGTTG ACCTAAAAGG 

    1501  TGTGTGAGTG CGACCTCAAT GTCGAAGGAT CCAAGGATTA TTACAGAAAA 

    1551  AGCCAAGAGG ACTAAGGATA TTAAAACTCT TTTTAATAAG TTCGGATTGT 

    1601  TTGATGGATT TTTCTACAAG TCCACTAATC GGTCTTCCGA AACGCTTCAA 

    1651  CTCATCTAAA CTATAAAGTG AAGAGTCACT TGCAACGAAA CGTATTTCAA 

    1701  TTAATTTGAT ACGTTTAAAT TAAGTTCCAT GAGCCATTCC TTTCCGATAT 

    1751  TTCCGATATT TTTAGAGCAC TGATTTAGTT TCAAGTGAAT AACCAATTAG 

    1801  GCATGACTCA AAAGGAAATG GAATATACCA ATTTTGGCAA TTTTTCATGG 

    1851  TTTTATTTAC TGAAATGTGC TCAAATGGAC AATAGAGTTT CACTTCACTT 

    1901  CTTCAATATC TTAAAAAGTT AAATATTTTC TTGAGACACA AATTAGTTTT 

    1951  CTATGTTGTC ATTAAAGTAG TAGAATTTAA AGAATTGAGA TGTAGGTGGG 

    2001  AGCTAACCGT GTGCACTTCC ATCTCCCTTC CAGATAAACA ACTGCCAAGA 

    2051  TGTGTGACGA TGATGCGGGT GCATTAGTTA TCGACAACGG ATCGGGCATG 

    2101  TGCAAAGCCG GCTTCGCCGG TGATGACGCT CCCCGTGCTG TCTTCCCCTC 

    2151  AATTGTGGGT CGTCCCCGAC ACCAGGGTGT GATGGTGGGT ATGGGTCAGA 

    2201  AGGACTCGTA CGTGGGCGAC GAGGCGCAAA GTAAGCGCGG TATCCTGACG 

    2251  CTGAAGTACC CCATCGAGCA CGGCATCATC ACGAACTGGG ACGACATGGA 

    2301  GAAGATCTGG CATCACACCT TCTACAACGA GCTGCGCGTG GCCCCCGAGG 

    2351  AGCATCCAGT ATTATTGACC GAGGCTCCAC TGAACCCCAA GGCCAATCGC 

    2401  GAGAAGATGA CCCAGATCAT GTTCGAGACC TTCAACTCGC CGGCCATGTA 

    2451  CGTGGCCATC CAGGCCGTGC TCTCCCTGTA CGCCTCCGGT CGTACCACCG 

    2501  GTATTGTGCT GGACTCCGGC GATGGTGTCT CCCACACCGT GCCCATCTAT 

    2551  GAGGGCTTCG CCCTGCCCCA CGCCATTCTG CGTCTGGACT TGGCTGGTCG 

    2601  CGATCTGACC GATTACCTGA TGAAGATCCT GACGGAGCGC GGCTACAGCT 

    2651  TCACCACCAC CGCCGAGCGT GAGATCGTGC GCGACATCAA GGAGAAGCTG 

    2701  TGCTACGTGG CTCTGGACTT CGAGCAGGAG ATGGCCACCG CTGCCGCCTC 

    2751  CACCTCGCTG GAGAAGTCGT ACGAGTTGCC TGACGGCCAG GTGATCACCA 

    2801  TTGGCAACGA GCGCTTCCGC TGCCCCGAGG CCCTGTTCCA GCCCTCGTTC 

    2851  CTGGGCATGG AGTCGTGCGG CATCCACGAG ACCGTCTACA ACTCGATCAT 

    2901  GAAGTGCGAC GTGGACATCC GCAAGGACCT GTATGCCAAC TCTGTGCTGT 

    2951  CCGGCGGTAC CACCATGTAC CCTGGTACAC GGATCGTTCG CTTCAGCAGT 

    3001  TGCACTTGTG CTTAATCCTT TGGTGCACTT TCAGGTATTG CCGATCGTAT 

    3051  GCAGAAGGAG ATCACTGCCC TGGCCCCATC GACCATCAAG ATCAAGATCA 

    3101  TTGCGCCACC CGAGAGGAAG TACTCCGTCT GGATCGGTGG CTCCATCCTG 

    3151  GCCTCGCTGT CCACCTTCCA GCAGATGTGG ATCTCGAAGC AGGAGTACGA 

    3201  CGAGTCCGGC CCCGGAATCG TTCACCGCAA ATGCTTTTAA GTCTTCGCCC 

    3251  GCCGCGAAAG CTCTTCAAAG GCAGCAACCA GCAGCGACCA ACAAGCATCC 

    3301  ATCTCGACCT TACCCAACAA CCTCGGCTCG GACAGTGATA GACAAAAGCA 

    3351  GCGAACCCAT CGCGACAACA ATTATCATCC AACTCAGATT CATAGCAGAT 

    3401  AATCAGAGGC AACCTCGGTT GTCGGTGGTT ATCTTATGGC ATTTCATCGG 

    3451  CAGCGGTATA GCGGATTTTT ATTTTGAAGA ACTAATCGTA ATCGTAAGAG 

    3501  TCGTGGTCTG CTCAGG