

The URL is part of the assignment; it is a web scraper for Wikipedia.
Comments on every line explain the code.
[Screenshot: the Python program, for checking indentation]
[Screenshot: the output of the program]
Below is the code to copy:
#CODE STARTS HERE----------------
import re
import urllib.request
from bs4 import BeautifulSoup

# Add the list of URLs here
urls = ['https://en.wikipedia.org/wiki/AA', 'https://en.wikipedia.org/wiki/AB',
        'https://en.wikipedia.org/wiki/AC', 'https://en.wikipedia.org/wiki/ZY']

web_text = ""  # Used to store all the text
for url in urls:
    res = urllib.request.urlopen(url)  # Make a request to the URL
    soup = BeautifulSoup(res.read(), 'html.parser')  # Parse the HTML into a soup object
    # Keep only the text and drop the HTML tags; the trailing space keeps
    # the last word of one page from fusing with the first word of the next
    web_text = web_text + soup.text + " "

# Remove punctuation and digits (the digit 0 is included here as well)
punc = '''!()-[]{};:'"\\,<>./?@#$%^&*_~0123456789'''
for ele in punc:  # Loop through each character to strip
    web_text = web_text.replace(ele, " ")  # Replace it with " "

counts = dict()  # Dictionary mapping each word to its count
for word in re.findall(r'\b\S+\b', web_text):  # Find words using a regex
    word = word.lower()  # Convert the word to lowercase
    counts[word] = counts.get(word, 0) + 1  # Count the word

# Sort by count, highest first, and print the 15 most common words
for k, v in sorted(counts.items(), key=lambda item: item[1], reverse=True)[:15]:
    print(v, k)
#CODE ENDS HERE------------------
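The counting step in the program above can also be written with `collections.Counter`. The following is a minimal sketch, not the original answer's method: the helper name `top_words` and the sample sentence are made up for illustration, and the regex treats any run of letters as a word, so tokens may differ slightly from the punctuation-replacement approach above. It works on a plain string, so it can be tried without any network access.

```python
import re
from collections import Counter


def top_words(text, n=15):
    """Return the n most common lowercase words in text as (word, count) pairs.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # A word is any run of letters; digits, underscores and punctuation
    # act as separators ([^\W\d_] means "word character that is not a
    # digit or an underscore").
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(words).most_common(n)


if __name__ == "__main__":
    sample = "The cat saw the dog. The dog ran!"  # made-up sample text
    for word, count in top_words(sample, 2):
        print(count, word)
```

To use it with the scraper, pass the accumulated `web_text` to `top_words` instead of running the manual replace-and-count loops.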
Write a batch script that combines a few Linux tools to finish a big-data processing task: finding the most frequently used words on Wikipedia pages. Executing the script generates a list of the distinct words used in the Wikipedia pages and the number of occurrences of each word on these pages. The words are sorted by the number of occurrences in ascending order. The following is a sample of output generated for 4 Wikipedia pages. 126...
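The task statement asks for every distinct word, sorted by count in ascending order. A rough sketch of that ordering in Python follows, assuming the counts are already in a dictionary. The function name `ascending_counts` and the alphabetical tie-break are assumptions for illustration, not something the assignment specifies.

```python
from collections import Counter


def ascending_counts(counts):
    """Return (word, count) pairs sorted by count, lowest first.

    Hypothetical helper; ties are broken alphabetically, which is an
    assumption and not stated in the assignment.
    """
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    demo = Counter("b a b c".split())  # made-up sample data
    for word, count in ascending_counts(demo):
        print(count, word)
```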