Question

Consider the following variant of string alignment: given two strings x,y, and a positive integer L,...

Consider the following variant of string alignment: given two strings x,y, and a positive integer L, find all contiguous substrings of length at least L that are aligned (using no-ops = substituting a letter for itself) in some optimal alignment of x and y. Assume the costs of substitution, insertion, and deletion are given by constants csubs,cins,cdel, and that the cost of substituting a letter for itself is zero.

(a) Show that the output here contains at most O((n−L)^3) substrings.

(b) Give an algorithm that solves this problem

0 0
Add a comment Improve this question Transcribed image text
Answer #1
According to my understanding of your question, you want all the contiguous substrings of length L to appear in the list for strings X and Y. I will first show you the algorithm and code to help you understand and then discuss the time complexity of the same. Note : Coded in Python

Let X and Y be two strings 

X = 'Today is a good day, it is a good idea to have a walk.'
Y = 'Yesterday was not a good day, but today is good, shall we have a walk?'

Algorithm

1. Remove punctuation from both X and Y.

2. Split both Strings into words using Split() function.

3. Use Two loops to detect common strings and while loop so as to change value of i in the loop itself.

4. Take a string r to store values of substring and check if string is same then keep adding to r. Else if not same then check if there is already one in buffer and add it to result (here z as in code)

5. Print the list Z

Code

import string

X = 'Today is a good day, it is a good idea to have a walk.'
Y = 'Yesterday was not a good day, but today is good, shall we have a walk?'
z=[]
L = 5                                    #random length 5
s1=X.translate(None, string.punctuation) #remove punctuation
s2=Y.translate(None, string.punctuation)
print s1
print s2
sw1=s1.lower().split()                   #split it into words
sw2=s2.lower().split()
print sw1,sw2
i=0
while i<len(sw1):          #two loops to detect common strings. used while so as to change value of i 
    x=0
    r=""
    d=i
    #print r
if len(r)<=L                                     #checking the length less than L
    for j in range(len(sw2)):                        
        #print r
        if sw1[i]==sw2[j] :        
            r=r+' '+sw2[j]                       #if string same keep adding to a variable
            x+=1
            i+=1
        else:
            if x>0        # if not same check if there is already one in buffer and add it to result                                                                  
                z.append(r)
                i=d
                r=""
                x=0
    if x>0:                                            #end case of above loop
        z.append(r)
        r=""
        i=d
        x=0
    i+=1 
    #print i
print list(set(z)) 

#O(n^3) normally but for length L it will become O((n-L)^3)

Time Complexity

Now, we have used a while loop and a for loop for 2 strings and calculating substrings for the same. Its taking O(n^3) time.

However since we have introduced condition that only length L substrings are to be found then, loop will take n-L steps to calculate the same.

Hence Time Complexity = O((n-L)^3) at worst case

Add a comment
Know the answer?
Add Answer to:
Consider the following variant of string alignment: given two strings x,y, and a positive integer L,...
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for? Ask your own homework help question. Our experts will answer your question WITHIN MINUTES for Free.
Similar Homework Help Questions
  • Given two strings X and Y, a third string Z is a common superstring of X...

    Given two strings X and Y, a third string Z is a common superstring of X and Y if X and Y are both subsequences of 2. (Example: if X = sos and Y = soft, then Z = sosft is a common superstring of X and Y.) Design a dynamic programming algorithm which, given as input two strings X and Y, returns the length of the shortest common superstring (SCS) of X and Y. Specifically, you have to write...

  • Write an algorithm that takes two strings X and Y and a positive integer k as...

    Write an algorithm that takes two strings X and Y and a positive integer k as input, and determines whether Y can be turned into X by at most k insertions or deletions. For example: let X = “abacad", Y = “cebacad" and k = 3 then your algorithm should give a yes answer because Y can be turned into X in 3 steps by a left-to-right processing as follows: delete leftmost c delete “e" insert “a" Give an tight...

  •    Problem 2: Sequence similarity measure. Let 3 and y be two given DNA sequences, represented...

       Problem 2: Sequence similarity measure. Let 3 and y be two given DNA sequences, represented as strings with characters in the set {A, G, C,T}. The similarity measure of r and y is defined as the maximum score of any alignment of r and y, where the score for an alignment is computed by adding substitution score and deletion and insertion scores, as explained below. (Some operations have negative scores.) The score for changing a character T, into a...

  • OpenSooq Assignment - Software Engineer Given two strings input of printable text, a and b ....

    OpenSooq Assignment - Software Engineer Given two strings input of printable text, a and b . let the edit distance between a and b is least number of operations needed to transform a to b, write php code that calculate edit distance using: 1. hamming distance that only have substitute operations (ex. substitute letter X with letter Y). 2. Levenshtein distance: that have 3 possible operators: insert, delete or substitution operations the function should consider all possibilities but should not...

  • QUESTION 1 Given two double variables named x and y, which of the following statements could...

    QUESTION 1 Given two double variables named x and y, which of the following statements could you use to initialize both variables to a value of 0.0? a. x | y = 0.0; b. x = y = 0.0; c. x, y = 0.0; d. none of the above 1 points    QUESTION 2 When you use a range-based for loop with a vector, you a. can avoid out of bounds access b. must still use a counter variable c....

  • This C++ Program consists of: operator overloading, as well as experience with managing dynamic memory allocation...

    This C++ Program consists of: operator overloading, as well as experience with managing dynamic memory allocation inside a class. Task One common limitation of programming languages is that the built-in types are limited to smaller finite ranges of storage. For instance, the built-in int type in C++ is 4 bytes in most systems today, allowing for about 4 billion different numbers. The regular int splits this range between positive and negative numbers, but even an unsigned int (assuming 4 bytes)...

ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT