Question

In python Attached is a file called sequences.txt, it contains 3 sequences (one sequence per line)....

In python

Attached is a file called sequences.txt, it contains 3 sequences (one sequence per line). Also attached is a file called AccessionNumbers.txt. Write a program that reads in those files and produces 3 separate FATSA files. Each accession number in the AccessionNumbers.txt file corresponds to a sequence in the sequences.txt file. Remember a FASTA formatted sequence looks like this:

>ABCD1234

ATGCTTTACGTCTACTGTCGTATGCTTTACGTCTACTGACTGTCGTATGCTTACGTCTACTGTCG

The file name should match the accession numbers, so for 1st one it should be called ABCD1234.txt.

Note: Print out the sequences in upper case AND remove any special characters in the sequence!

sequences.txt contains the following sequence:

ATGCTTTACGTCTACTGTCGTATGCTTTACGTCTACTGACTGTCGTATGCTTACGTCTACTGTCG
atgctttacgtctactgtcgtatgctttacgtctactgactgtcgtatgcttacgtctactgtcg
atgctttacgt-tcgtatgctttacgtc---tatgcttacgt--cttacgt-----ctactgtcg

AccessionNumbers.txt contains the following:

ABCD1234
GFRT7890
HJIS5630

the output should produce 3 separate text files named "ABCD1234.txt" "GFRT7890.txt" and "HJIS5630.txt"

each text file output should look like this:

>ABCD1234

ATGCTTTACGTCTACTGTCGTATGCTTTACGTCTACTGACTGTCGTATGCTTACGTCTACTGTCG

0 0
Add a comment Improve this question Transcribed image text
Answer #1

#function that takes sequence
#and returns the valid sequence in upper case.
def formatSeq(seq):
   #resultant seq
   res = ""
   #for each char in seq
   for i in seq:
       #if i_th char is one of the ATCG, then add it.
       if( i.upper() in "ATCG"):
           res += i.upper()
   #return the result
   return res

#open sequences.txt and AccessionNumbers.txt
seq = open("sequences.txt")
acNum = open("AccessionNumbers.txt")

#read one line from sequence and accession txt files
sequence = seq.readline()
accession = acNum.readline()
#till the end of the files
while(sequence and accession):
   #format the sequence by calling function
   sequence = formatSeq(sequence.strip())
   accession = accession.strip()
   #create file with accession number as txt file
   out = open(accession+".txt","w")
   #write accession number
   out.write(">"+accession+"\n")
   #and sequence
   out.write(sequence)
   out.close()
   #read next sequence and accession number.
   sequence = seq.readline()
   accession = acNum.readline()

seq.close()
acNum.close()

#sequence.txt

ATGCTTTACGTCTACTGTCGTATGCTTTACGTCTACTGACTGTCGTATGCTTACGTCTACTGTCG
atgctttacgtctactgtcgtatgctttacgtctactgactgtcgtatgcttacgtctactgtcg
atgctttacgt-tcgtatgctttacgtc---tatgcttacgt--cttacgt-----ctactgtcg
#accessionnumbers.txt

ABCD1234
GFRT7890
HJIS5630

ABCD1234.txt

>ABCD1234
ATGCTTTACGTCTACTGTCGTATGCTTTACGTCTACTGACTGTCGTATGCTTACGTCTACTGTCG

GFRT7890.txt

>GFRT7890
ATGCTTTACGTCTACTGTCGTATGCTTTACGTCTACTGACTGTCGTATGCTTACGTCTACTGTCG

HJIS5630.txt

>HJIS5630
ATGCTTTACGTTCGTATGCTTTACGTCTATGCTTACGTCTTACGTCTACTGTCG

Add a comment
Know the answer?
Add Answer to:
In python Attached is a file called sequences.txt, it contains 3 sequences (one sequence per line)....
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for? Ask your own homework help question. Our experts will answer your question WITHIN MINUTES for Free.
Similar Homework Help Questions
  • In python Attached is a file called sequences.txt, it contains 3 sequences (one sequence per line)....

    In python Attached is a file called sequences.txt, it contains 3 sequences (one sequence per line). Also attached is a file called AccessionNumbers.txt. Write a program that reads in those files and produces 3 separate FATSA files. Each accession number in the AccessionNumbers.txt file corresponds to a sequence in the sequences.txt file. Remember a FASTA formatted sequence looks like this: >ABCD1234 ATGCTTTACGTCTACTGTCGTATGCTTTACGTCTACTGACTGTCGTATGCTTACGTCTACTGTCG The file name should match the accession numbers, so for 1st one it should be called ABCD1234.txt. Note:...

  • Write a Python program that converts an input file in FASTA format, called "fasta.txt", to an...

    Write a Python program that converts an input file in FASTA format, called "fasta.txt", to an output file in PHYLIP format called "phylip.txt". For example, if the input file contains: >human ACCGTTATAC CGATCTCGCA >chimp ACGGTTATAC CGTACGATCG >monkey ACCTCTATAC CGATCGATCC >gorilla ATCTATATAC CGATCGATCG Then the output file should be human ACCGTTATACCGATCTCGCA chimp ACGGTTATACCGTACGATCG monkey ACCTCTATACCGATCGATCC gorilla ATCTATATACCGATCGATCG FASTA format has a description (indicated with a '>') followed by 1 or more lines of a DNA sequence. PHYLIP format has a description...

  • python/idle 1. Write a program that counts the number of A’s in a DNA sequence. The...

    python/idle 1. Write a program that counts the number of A’s in a DNA sequence. The input is one sequence in FASTA format in a file called ‘dna.txt’. For example, if the file contains: >human ACCGT then the output of the program should be 1. Your program should work for any sequence and not just the one in the example.

  • There is a file lets call it the.faa. It contains all of the polypeptide coding sequences...

    There is a file lets call it the.faa. It contains all of the polypeptide coding sequences in the E. coli K12 genome in FASTA format. Your task for this exercise is to generate an amino acid usage report with counts only, and in no particular order. Your output should look like this: T: 69645 G: 95475 V: 91683 Y: 36836 H: 29255

  • In Python, do a basic encryption of a text file in the following manner. The program...

    In Python, do a basic encryption of a text file in the following manner. The program encrypt.py will read in the following text file and rearrange the lines in the file randomly and save the rearranged lines of txt to another file called encrypted.txt. It will also save another file called key.txt that will contain the index of the lines that were rearranged in the encrypted file, so for example if the 4th line from the original file is now...

  • Python Assignment Tasks 1. Place the two provided plaintext files (file_1.txt, file_2.txt) on your desktop. 2....

    Python Assignment Tasks 1. Place the two provided plaintext files (file_1.txt, file_2.txt) on your desktop. 2. Write a Python program named fun_with_files.py. Save this file to your desktop as well. 3. Have your program write the Current Working Directory to the screen.   4. Have your program open file_1.txt and file_2.txt, read their contents and write their contents into a third file that you will name final.txt . Note: Ponder the open‐read/write‐close file operation sequence carefully. 5. Ensure your final.txt contains...

  • Create a class called DuplicateRemover. Create an instance method called remove that takes a single parameter...

    Create a class called DuplicateRemover. Create an instance method called remove that takes a single parameter called dataFile (representing the path to a text file) and uses a Set of Strings to eliminate duplicate words from dataFile. The unique words should be stored in an instance variable called uniqueWords. Create an instance method called write that takes a single parameter called outputFile (representing the path to a text file) and writes the words contained in uniqueWords to the file pointed...

  • Lab Description : Exercise : a) Create an interface named ReadAndWriteFile, that has readAndReturnFile() and writeFile()...

    Lab Description : Exercise : a) Create an interface named ReadAndWriteFile, that has readAndReturnFile() and writeFile() methods. b) readAndReturnFile() and writeFile() both throws IOException, FileNotFoundException and Exception. c) Create a class named DealingWithTextFile that implements ReadAndWriteFile. This class read and write txt files. d) Create a class named DealingWithCsvFile that implements ReadAndWriteFile. This class read and write csv files. - Please mention that you have to deal with exceptions, use a dialog message to catch the exceptions. e) DealingWithTextFile class...

  • Program Language: PYTHON Consider the following file structure: each line of the file contains a word....

    Program Language: PYTHON Consider the following file structure: each line of the file contains a word. The words are in sorted order. For example, a file might look like this apple apple apple apple banana bargain brick brick sample sample simple text text text Write a program that asks the user for a filename with this structure. The program's job is to write the sequence of words to another file, without any duplicates. Name the output file  result.txt. Each word is...

  • Write a complete C++ program that defines the following class: Class Salesperson { public: Salesperson( );...

    Write a complete C++ program that defines the following class: Class Salesperson { public: Salesperson( ); // constructor ~Salesperson( ); // destructor Salesperson(double q1, double q2, double q3, double q4); // constructor void getsales( ); // Function to get 4 sales figures from user void setsales( int quarter, double sales); // Function to set 1 of 4 quarterly sales // figure. This function will be called // from function getsales ( ) every time a // user enters a sales...

ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT