Question

Hi, this is my Python code.

I have almost finished my project, but one thing is left, and it is the most confusing part. Please help me.

For each starting letter from a to z, I have to find the most frequently occurring ending character.

End    a    b     c    d     e    f    g    h    i    j  ...    r     s     t    u    v    w    x    y    z  Count
Start
a    2.0  0.0  10.0  7.0  49.0  1.0  2.0  2.0  0.0  0.0  ...  3.0  11.0  15.0  0.0  0.0  0.0  1.0  11

For instance, the output should look like this, with one line for every letter from a to z.

Words starting with ‘a’ end mostly with ‘O’

Words starting with ‘b’ end mostly with ‘O’

......
No words start with ‘O’ (this line should be printed for any letter from a to z that no word starts with)

....

Words starting with ‘z’ end mostly with ‘O’

For example, in the screenshot above, 'e' seems to be the most frequent ending, so the answer for 'a' might be 'e'.

In addition,

1. I have to write this output into a txt file.

2. Please get rid of the decimal points in the output. I don't know where they came from...

Thanks!!

-----------------------------------------------------------------------------------------

Here's my code

import numpy as np
import pandas as pd

df_input=pd.read_csv('/Users/Loveyou/Downloads/words_file2.txt',header=None)

df_input.head()

df_output=pd.DataFrame(np.zeros(676,dtype=int).reshape((26,26)),
index = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'],
columns = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'])
df_output.index.name="Start"
df_output.columns.name="End"

#Here I have mentioned first i[-1] because in DataFrame first index goes for column
#so we go for End as we mentioned the Row name
for i in df_input[0]:
    df_output[i[-1]][i[0]]+=1

#axis=1 sums across the columns of each row to get the Count
df_output.sum(axis=1)
df_output["Count"] = df_output.sum(axis=1)
#axis=0 is the default, so sum() adds up each column to get the Total row
df_output.sum()
df_output.loc["Total", :] = df_output.sum()

df_output.to_csv("Loveyou_project31.csv", mode='w')

df_output

I am not sure about this line: print("Words starting with ‘a’ end mostly with : {}").format(df_output['a'].value_counts())

The txt file is here:

https://drive.google.com/open?id=1VXtEPNBJ6ypJZ62ypeeWtzS9TjcGhBp4

Answer #1
############## your provided code ############################# 

import numpy as np
import pandas as pd

df_input=pd.read_csv('/Users/Loveyou/Downloads/words_file2.txt',header=None)

df_input.head()

rows = list('abcdefghijklmnopqrstuvwxyz')
columns = list(rows)
# 26x26 table of integer zeros: rows are starting letters, columns are ending letters
df_output = pd.DataFrame(np.zeros((26, 26), dtype=int), index=rows, columns=columns)
# name the axes after the labels are set, otherwise the names are lost
df_output.index.name = "Start"
df_output.columns.name = "End"
  
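# Rows are indexed by the first letter (Start) and columns by the last letter (End).
# .loc[row, column] updates the cell directly; chained indexing such as
# df_output[col][row] += 1 may fail to update the table in newer pandas versions.
for i in df_input[0]:
    df_output.loc[i[0], i[-1]] += 1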

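# axis=1 sums across the columns of each row, giving the count per starting letter
df_output["Count"] = df_output.sum(axis=1)
# axis=0 is the default, so sum() adds up each column, giving the total per ending letter
# note: adding the new "Total" row this way appears to convert the table to floats,
# which is where the decimal points in the output below come from
df_output.loc["Total", :] = df_output.sum()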

df_output.to_csv("Loveyou_project31.csv", mode='w')

############### till here #################################################################

################## additional code as per your requirements #################################

# Work on a copy without the 'Total' row and 'Count' column, so that idxmax
# only compares the 26 ending letters. df_output itself is left unchanged.
df_temp = df_output.drop('Total')
df_temp = df_temp.drop('Count', axis=1)

# idxmax(axis=1) returns, for each row (starting letter), the column label
# (ending letter) with the largest count
max_end_list = df_temp.idxmax(axis=1)

# print the most common ending letter for each starting letter
for char in rows:
    # a row of all zeros means no word starts with this letter
    if df_temp.loc[char].sum() > 0:
        print("words starting with " + char + " ends mostly with " + max_end_list[char])
    else:
        print("No words start with " + char)
df_output
##########################################################################
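
The question also asks for the result to be written to a txt file, which the code above only prints. A minimal sketch of one way to do that is below. It assumes a small made-up word list in place of words_file2.txt, and the names `build_report` and `most_common_endings.txt` are placeholders chosen for this illustration, not part of the original code.

```python
import string

import pandas as pd

# Hypothetical sample words; the original reads them from words_file2.txt.
words = ["apple", "angle", "alto", "bath", "bush", "bee", "cat"]

letters = list(string.ascii_lowercase)
# 26x26 integer table: rows are starting letters, columns are ending letters.
counts = pd.DataFrame(0, index=letters, columns=letters)
for word in words:
    counts.loc[word[0], word[-1]] += 1


def build_report(counts):
    """Return one report line per starting letter (hypothetical helper)."""
    lines = []
    for start, row in counts.iterrows():
        if row.sum() == 0:
            lines.append(f"No words start with '{start}'")
        else:
            # idxmax gives the column label (ending letter) with the largest count
            lines.append(
                f"Words starting with '{start}' end mostly with '{row.idxmax()}'"
            )
    return lines


report = build_report(counts)

# "most_common_endings.txt" is a placeholder file name.
with open("most_common_endings.txt", "w") as f:
    f.write("\n".join(report) + "\n")
```

With the real data, `counts` would be `df_temp` from the code above, so the same loop could be reused without rebuilding the table.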

output:

words starting with a ends mostly with e
words starting with b ends mostly with h
words starting with c ends mostly with e
words starting with d ends mostly with e
words starting with e ends mostly with e
words starting with f ends mostly with e
words starting with g ends mostly with s
words starting with h ends mostly with y
words starting with i ends mostly with e
words starting with j ends mostly with n
words starting with k ends mostly with s
words starting with l ends mostly with d
words starting with m ends mostly with e
words starting with n ends mostly with e
words starting with o ends mostly with e
words starting with p ends mostly with e
words starting with q ends mostly with c
words starting with r ends mostly with e
words starting with s ends mostly with e
words starting with t ends mostly with e
words starting with u ends mostly with e
words starting with v ends mostly with e
words starting with w ends mostly with e
No words start with x
words starting with y ends mostly with e
words starting with z ends mostly with h

a b c d e f g h i j ... r s t u v w x y z Count
a 2.0 0.0 10.0 7.0 49.0 1.0 2.0 2.0 0.0 0.0 ... 3.0 11.0 15.0 0.0 0.0 0.0 1.0 11.0 0.0 141.0
b 0.0 0.0 1.0 1.0 4.0 0.0 0.0 6.0 0.0 0.0 ... 0.0 4.0 4.0 0.0 0.0 0.0 0.0 1.0 0.0 27.0
c 1.0 0.0 1.0 5.0 38.0 0.0 4.0 1.0 0.0 0.0 ... 3.0 15.0 22.0 0.0 0.0 0.0 0.0 13.0 0.0 134.0
d 0.0 0.0 3.0 5.0 36.0 0.0 1.0 3.0 0.0 0.0 ... 4.0 6.0 12.0 0.0 0.0 1.0 0.0 6.0 0.0 85.0
e 0.0 0.0 5.0 3.0 24.0 0.0 0.0 1.0 1.0 0.0 ... 3.0 3.0 16.0 0.0 0.0 1.0 0.0 7.0 0.0 72.0
f 0.0 0.0 1.0 5.0 10.0 0.0 0.0 0.0 0.0 0.0 ... 2.0 7.0 4.0 0.0 0.0 0.0 0.0 1.0 0.0 37.0
g 0.0 0.0 0.0 2.0 3.0 0.0 0.0 1.0 0.0 0.0 ... 0.0 4.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 12.0
h 0.0 0.0 0.0 3.0 1.0 0.0 1.0 0.0 0.0 0.0 ... 0.0 4.0 1.0 0.0 0.0 0.0 0.0 5.0 0.0 17.0
i 0.0 0.0 1.0 2.0 42.0 0.0 0.0 0.0 0.0 0.0 ... 3.0 11.0 18.0 0.0 0.0 0.0 0.0 3.0 0.0 88.0
j 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 3.0
k 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
l 0.0 0.0 2.0 4.0 1.0 0.0 0.0 1.0 0.0 0.0 ... 0.0 3.0 3.0 0.0 0.0 0.0 0.0 3.0 0.0 21.0
m 0.0 0.0 0.0 2.0 9.0 0.0 0.0 1.0 0.0 0.0 ... 1.0 9.0 2.0 0.0 0.0 0.0 0.0 2.0 0.0 33.0
n 0.0 0.0 1.0 0.0 5.0 0.0 0.0 0.0 0.0 0.0 ... 1.0 4.0 4.0 0.0 0.0 0.0 0.0 0.0 0.0 17.0
o 0.0 0.0 0.0 0.0 10.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 8.0 1.0 0.0 0.0 0.0 1.0 0.0 0.0 22.0
p 2.0 0.0 5.0 4.0 35.0 0.0 1.0 1.0 0.0 0.0 ... 0.0 12.0 10.0 0.0 0.0 0.0 2.0 11.0 0.0 91.0
q 0.0 0.0 1.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 1.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 7.0
r 0.0 0.0 0.0 3.0 31.0 0.0 0.0 4.0 0.0 0.0 ... 2.0 2.0 7.0 0.0 0.0 0.0 0.0 0.0 0.0 56.0
s 0.0 0.0 3.0 2.0 15.0 0.0 2.0 0.0 0.0 0.0 ... 0.0 10.0 8.0 0.0 0.0 0.0 0.0 7.0 0.0 51.0
t 0.0 0.0 0.0 3.0 9.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 6.0 5.0 0.0 0.0 0.0 0.0 3.0 0.0 31.0
u 1.0 0.0 0.0 1.0 2.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 2.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 9.0
v 0.0 0.0 1.0 2.0 10.0 0.0 0.0 0.0 0.0 0.0 ... 1.0 5.0 2.0 0.0 0.0 0.0 1.0 3.0 0.0 27.0
w 0.0 0.0 0.0 1.0 2.0 0.0 0.0 1.0 0.0 0.0 ... 1.0 0.0 0.0 0.0 0.0 1.0 0.0 1.0 0.0 10.0
x 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
y 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
z 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 ... 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3.0
Total 6.0 0.0 35.0 55.0 338.0 1.0 11.0 23.0 1.0 0.0 ... 25.0 130.0 136.0 0.0 0.0 3.0 5.0 80.0 0.0 996.0

27 rows × 27 columns
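
The decimal points in the table above seem to come from the step that adds the "Total" row, since the counts themselves start out as integers. One possible fix, sketched here on a tiny stand-in table rather than the full 26x26 one, is to cast the table back to integers once the totals are in place. This assumes every value is a whole number, which holds for counts.

```python
import pandas as pd

# Tiny stand-in for the 26x26 table, using integer counts.
df = pd.DataFrame({"a": [2, 0], "e": [49, 4]}, index=["a", "b"])
df["Count"] = df.sum(axis=1)
df.loc["Total", :] = df.sum()  # same step as in the answer's code

# Cast every column back to integers; safe here because all values are whole counts.
df_int = df.astype(int)
print(df_int)
```

Applied to the answer's code, this would be `df_output = df_output.astype(int)` placed just before the `to_csv` call.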
