Question

One recurring idea in software engineering research is using various kinds of data to try to...

  1. One recurring idea in software engineering research is using various kinds of data to try to identify "problematic" areas of a codebase: classes/methods that seem to have notably higher defect density (more bugs per line of code) than elsewhere, areas that tend to be tied to flaky tests (those that fail nondeterministically for no immediately apparent reason), etc. How might you try to gather such data for a project using version control and issue trackers? What are some of the major caveats of the approach you're suggesting? (Using this kind of data is inherently heuristic, but can still provide useful results, with caveats.)
  2. Consider some problematic ways to use this data.
    • One classic example is to add up the lines of code written by a developer in some period of time, based on changes in version control, and use that to measure "productivity" (and thus to influence raises and promotions). There are all sorts of problems with this:
      – More code is not necessarily better (maybe it's longer because it's poorly written).
      – Does removing code count negatively? (If so, anyone unlucky enough to be the one who deletes dead code from a removed feature is in for a bad performance review.)

What are other problematic ways this data could be misused to draw incorrect conclusions, and why are they problematic?

Answer #1

Misleading statistics are simply the misuse, purposeful or not, of numerical data. The results give misleading information to the recipient, who then believes something false if he or she does not notice the error or does not have the full picture of the data.

Given the importance of data in today’s rapidly evolving digital world, it is important to be familiar with the basics of misleading statistics and oversight. As an exercise in due diligence, we will review some of the most common forms of misuse of statistics, and various alarming (and sadly, common) misleading statistics examples from public life.

Here are a few potential mishaps that commonly lead to misuse:

  • Faulty polling

The manner in which questions are phrased can have a huge impact on the way an audience answers them. Specific wording patterns have a persuasive effect and induce respondents to answer in a predictable manner. For example, in a poll seeking opinions on taxes, consider these two potential questions:

– Do you believe that you should be taxed so other citizens don’t have to work?
– Do you think that the government should help those people who cannot find work?

Both questions concern the same policy, but the first is worded to invite a "no" and the second a "yes".

  • Flawed correlations

The problem with correlations is this: if you measure enough variables, eventually it will appear that some of them correlate. At the usual 5% significance level, roughly one in twenty comparisons between unrelated variables will be deemed significant purely by chance, so studies can be manipulated (with enough data) to "prove" a correlation that does not exist, or to present a weak correlation as if it demonstrated causation.
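The "one in twenty" effect can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names, the sample size of 30, and the seed are made up for illustration, and 0.361 is the approximate two-sided 5% critical value of |r| for 30 samples taken from standard tables. Every series is pure random noise, so any "significant" correlation it reports is spurious.

```python
import math
import random

SAMPLE_SIZE = 30     # assumed number of observations per variable
CRITICAL_R = 0.361   # approximate two-sided 5% critical |r| for 30 samples


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(num_variables=1000, seed=0):
    """Count random, unrelated variables that look 'significantly' correlated."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(SAMPLE_SIZE)]
    hits = 0
    for _ in range(num_variables):
        # Each candidate variable is independent noise, unrelated to the target.
        unrelated = [rng.gauss(0, 1) for _ in range(SAMPLE_SIZE)]
        if abs(pearson(target, unrelated)) > CRITICAL_R:
            hits += 1
    return hits


if __name__ == "__main__":
    hits = count_spurious()
    print(f"{hits} of 1000 unrelated variables passed the 5% threshold")
```

With 1000 unrelated variables, the count should land somewhere near 50, which is why a "significant" correlation found by testing many variables at once deserves suspicion.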

  • Data fishing

This misleading data example is also referred to as “data dredging” (and is related to flawed correlations). It is a data mining technique in which extremely large volumes of data are analyzed in order to discover relationships between data points. Seeking a relationship between data points isn’t a misuse of data per se; doing so without a hypothesis is.

  • Misleading data visualization

Insightful graphs and charts include a very basic, but essential, set of elements. Whatever type of data visualization you choose, it must convey:

– The scales used
– The starting value (zero or otherwise)
– The method of calculation (e.g., dataset and time period)

  • Purposeful and selective bias

The last of our most common examples for misuse of statistics and misleading data is, perhaps, the most serious. Purposeful bias is the deliberate attempt to influence data findings without even feigning professional accountability. Bias is most likely to take the form of data omissions or adjustments.

  • Using percentage change in combination with a small sample size

Another way of creating misleading statistics, linked to the choice of sample, is the size of that sample. When an experiment or survey is run on a sample too small to be statistically significant, not only are the results unusable, but presenting them as percentages is thoroughly misleading.
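A short worked example makes the point concrete. The numbers are hypothetical (bug counts for two modules across two releases), and `percent_change` is an illustrative helper, not something from the question or answer above.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, expressed as a percentage."""
    return 100.0 * (after - before) / before


# Hypothetical bug counts: (previous release, current release).
small_module = (1, 2)      # one extra bug on a tiny base
large_module = (100, 101)  # one extra bug on a large base

for name, (before, after) in [("small", small_module), ("large", large_module)]:
    change = percent_change(before, after)
    print(f"{name}: {before} -> {after} bugs ({change:+.0f}%)")
```

Both modules gained exactly one bug, yet the small one shows a 100% increase and the large one a 1% increase. Reporting the absolute counts next to the percentage makes the small base visible.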
