Using a new dataset, select two qualitative variables and two quantitative variables. Explain why you selected these variables.
Analysis:
For your qualitative variables, create a contingency table and calculate the association between them.
For your quantitative variables, calculate the correlation between them. Include a scatter plot to visually represent this relationship.
Interpretation: Explain your findings. What does the association or correlation say about the relationship between your variables? Is the relationship strong, weak, positive, negative, or nonexistent?
Reflection: Reflect on the importance of understanding associations and correlations in data analysis and how they can guide further data investigation.
Submission Format: Your submission should be 500-600 words. Submit your assignment in APA format as a Word document or a PDF file. Include your written analysis and any tables or visualizations that support your findings. If you used any software for your calculations (such as R, Python, or Excel), please include your code or formulas as well. Include an APA-formatted reference list for any external resources used.
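For the Analysis step, a minimal R sketch might look like the following. It assumes the built-in mtcars data as a stand-in for the new dataset (the assignment does not name one), treats cyl and am as the two qualitative variables and wt and mpg as the two quantitative ones, and uses Cramér's V as one possible measure of association. All of these choices are illustrative assumptions, not requirements of the assignment.

```r
# Assumption: mtcars (shipped with R) stands in for the "new dataset".
data(mtcars)

# Two qualitative variables: number of cylinders and transmission type.
cylinders    <- factor(mtcars$cyl)
transmission <- factor(mtcars$am, levels = c(0, 1),
                       labels = c("automatic", "manual"))

# Contingency table and chi-square test of independence.
# suppressWarnings(): some expected counts are small in this tiny dataset.
contingency <- table(cylinders, transmission)
chi_sq <- suppressWarnings(chisq.test(contingency))

# Cramer's V: strength of association on a 0-1 scale.
cramers_v <- sqrt(unname(chi_sq$statistic) /
                    (sum(contingency) * (min(dim(contingency)) - 1)))

# Two quantitative variables: weight and fuel economy.
pearson_r <- cor(mtcars$wt, mtcars$mpg)

# Scatter plot with a least-squares line to show direction and strength.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Weight vs. fuel economy")
abline(lm(mpg ~ wt, data = mtcars))

print(contingency)
print(c(cramers_v = cramers_v, pearson_r = pearson_r))
```

The sign of pearson_r gives the direction of the linear relationship and its magnitude the strength; cramers_v plays the corresponding role for the contingency table, though it has no sign.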
Category: R
# Create the following data frame.
# a. Use UNITID as the row names; are you able to use SEGMENT_ID as the row names?
# b. Set the column SEGMENT_ID as a factor.
# c. Check the class of the data frame.
#UNITID SEGMENT_ID MW PRICE
#AYER band1 135 24
#SHUER band1 230 50
#BUSHA band1 105 26
#MINA band1 97 34
#TEUA band1 300 74
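One way to approach steps a to c is sketched below. The name power_units is a hypothetical choice (the exercise does not name the data frame), and the values are copied from the table above.

```r
# `power_units` is a hypothetical name; the exercise does not name the data frame.
power_units <- data.frame(
  UNITID     = c("AYER", "SHUER", "BUSHA", "MINA", "TEUA"),
  SEGMENT_ID = rep("band1", 5),
  MW         = c(135, 230, 105, 97, 300),
  PRICE      = c(24, 50, 26, 34, 74),
  stringsAsFactors = FALSE
)

# a. UNITID is unique, so it can serve as the row names.
rownames(power_units) <- power_units$UNITID
power_units$UNITID <- NULL

# SEGMENT_ID repeats "band1" in every row. Row names must be unique,
# so assigning SEGMENT_ID with rownames<- would raise an error.
segment_id_is_unique <- anyDuplicated(power_units$SEGMENT_ID) == 0

# b. Convert SEGMENT_ID to a factor.
power_units$SEGMENT_ID <- factor(power_units$SEGMENT_ID)

# c. Check the class of the data frame.
print(class(power_units))
str(power_units)
```

Checking for duplicates first avoids triggering the error that R raises when non-unique values are assigned as row names.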
I want all of the code to run properly; then please knit the file and send me the PDF and all the tables.
Once you figure out all the accuracy values, you can add a table to compare the values of the supervised models, like this one:
library(ggplot2)    # qplot(), theme(), element_blank(), annotation_custom()
library(gridExtra)  # tableGrob()
Model <- c('Decision Tree-C5.0','Random Forest','kNN','SVM-vanilladot')
Accuracy_percent <- c(88.57,88.32,88.29,88.00)
mytable <- data.frame(Model, Accuracy_percent)
# Draw the table as a grob on an otherwise blank plot
qplot(1:10, 1:10, geom = "blank") + theme(line = element_blank(), text = element_blank()) + annotation_custom(grob = tableGrob(mytable))
I’m working on an R project and need support to help me learn. Hi, I need someone to help me correct some issues in my code for a project presentation. I want all of the code to run properly; then please knit the file and send me the PDF and all the tables. You can add a model evaluation table listing all of the supervised models and their ROC values. This is an example:
library(ggplot2)    # qplot(), theme(), element_blank(), annotation_custom()
library(gridExtra)  # tableGrob()
Model <- c('Decision Tree-C5.0','Random Forest','kNN','SVM-vanilladot')
Accuracy_percent <- c(88.57,88.32,88.29,88.00)
mytable <- data.frame(Model, Accuracy_percent)
# Draw the table as a grob on an otherwise blank plot
qplot(1:10, 1:10, geom = "blank") + theme(line = element_blank(), text = element_blank()) + annotation_custom(grob = tableGrob(mytable))
Hi there,
I have a short assignment on digit heaping. It does not need external resources or any coding.
Just write one paragraph that contains at least two ideas (methods or approaches) on how to improve data collection for early childhood height and weight data. Remember that there may be a variety of reasons why measurement is imprecise (rounding). Your suggestions can address any of these reasons: difficulty measuring small humans, manual input of data, incentives to rush or manipulate data, carelessness, data fabrication, etc.
Please don’t bid unless you are familiar with the subject.
No ChatGPT or any AI; it must be 100% original.
No need for external resources or references.
Thank you
Hi there,
I have an assignment on digit heaping. Only researchers who hold a master’s degree should bid.
PLEASE DON’T BID IF YOU ARE NOT FAMILIAR WITH THE COURSE
NO CHATGPT OR ANY AI, MUST BE 100% ORIGINAL
No need for external resources or references.
Write one paragraph which contains at least two ideas (methods or approaches)
on how to improve data collection for early childhood height and weight
data. Remember that there may be a variety of reasons why measurement
is imprecise (rounding). Your suggestions can address any of these
reasons: difficulty measuring small humans, manual input of data,
incentives to rush or manipulate data, carelessness, data fabrication,
etc.
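Although the task itself calls for no code, it may help to see what digit heaping looks like in recorded measurements. The R sketch below uses simulated weights, not real survey data, and assumes for illustration that 40% of readings are rounded to the nearest 0.5 kg. It then tabulates the terminal (first decimal) digit and tests it against a uniform distribution; the heaping rate, sample size, and variable names are all assumptions.

```r
set.seed(1)

# Simulated "true" child weights in kg (illustrative values, not real data).
true_weight <- rnorm(500, mean = 12, sd = 2)

# Careful recording: one decimal place.
recorded <- round(true_weight, 1)

# Assumption: 40% of readings are rounded to the nearest 0.5 kg.
heaped <- sample(c(TRUE, FALSE), 500, replace = TRUE, prob = c(0.4, 0.6))
recorded[heaped] <- round(recorded[heaped] * 2) / 2

# Terminal digit = first decimal digit of the recorded value.
terminal <- factor(round(recorded * 10) %% 10, levels = 0:9)
digit_counts <- table(terminal)

# Without heaping, each digit should appear about 10% of the time.
heaping_test <- chisq.test(digit_counts)
share_0_5 <- sum(digit_counts[c("0", "5")]) / sum(digit_counts)

print(digit_counts)
print(share_0_5)
```

A share of terminal digits 0 and 5 well above the 20% expected under uniform recording is the signature of heaping that improved collection methods aim to reduce.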
Exercise 2: Mapping Disease Ecology with
Data: Zipped data file from Intro to QGIS tutorial
Learning Objectives:
Use data to produce maps in QGIS
Use proxy data to make a claim about the disease ecology of malaria
Consider the choices cartographers make when creating a map
Practice putting map elements together to make a cohesive argument through a map
Description:
Students are expected to complete three exercises over the course of the quarter. This is the second of those. In this exercise, students are introduced to data manipulation and mapping in QGIS. QGIS is an open-source GIS mapping and spatial analysis platform with growing usage around the world.
This exercise uses proxy data about the climatic and environmental conditions associated with malaria and other parasitic diseases to predict disease risk in Kenya. Effectively, we are building a disease ecology model and using it to map expected risk of infections. Because we are dealing with environmental proxies, we will be working primarily with raster data. This exercise follows the QGIS tutorial introduced in section on October 26th, which was designed to illustrate how we can construct ecological models to predict disease and to give you some introductory skills in QGIS so that you can use it both to produce maps and to conduct spatial analysis.
This exercise is worth 10% of your course grade and will be graded out of 10 points.
Instructions:
Using the QGIS tutorial introduced in section on Thursday, October 26th and available at https://canvas.uw.edu/courses/1666622/pages/intro-to-qgis-tutorial, answer the questions that follow. After you have made your two maps (one each in response to the two questions below), you are asked to reflect on the experience following the prompt.
Question 1: In the Intro to QGIS tutorial, we walked through modeling year-round malaria risk in Kenya. In that model, we simply looked at whether malaria was present or absent, but of course even in areas where malaria is found, it isn’t necessarily found in the same amount. You are now being tasked with making a map of malaria risk in Kenya in May, showing areas of high, medium, low, and no risk. You’ll need to use all of the exclusionary factors identified in the introductory tutorial (elevation, humidity, and temperature) to identify areas of no risk. Remember that the mosquitoes that carry malaria:
Live in areas that are below 1500m in elevation
Are active when temperatures are between 21 and 32°C (NOTE: when assessing year-round risk, we only excluded areas that were too hot or too cold all year round. Since we are now looking at a specific month and assessing how active mosquitoes are in that month, we can exclude areas that are too hot or too cold during the month of May using the maximum and minimum May temperatures.)
Can reproduce if humidity is above 60%
Because mosquitoes breed faster and thus bite more at higher temperatures (as long as it isn’t too hot!), you’ll then use the following temperature ranges to assess malaria risk:
High risk: average monthly temp >28
Medium risk: average monthly temp 25-27.9
Low risk: average monthly temp <24.9
Note: this intentionally requires you to think about how you can use Raster Calculator to create three different risk bands. If you get stuck, feel free to ask for help, but please try to think it through for at least a couple minutes on your own first. You’ll also need to think about how to symbolize your final map as you have a couple of different options depending on how you created your risk bands. Ultimately, it is up to you how you make your map, but you will be assessed on whether the bands are correct and on how clearly your map conveys the information. As you work, make note of the choices you are making to include in your reflection (see below).
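The cell-by-cell logic behind the risk bands can be sketched outside QGIS. The R example below uses a handful of made-up cell values rather than the tutorial rasters, and it rests on several assumptions that the assignment leaves open: a cell is excluded as too cold when its May maximum is below 21°C and as too hot when its May minimum is above 32°C, and the stated bands are treated as contiguous (low below 25, medium from 25 up to but not including 28, high at 28 and above). All variable names are hypothetical.

```r
# Hypothetical values for six raster cells (not the tutorial data).
elev_m       <- c(300, 1200, 1700, 800, 500, 900)
tmin_may     <- c(24, 20, 15, 22, 33, 22)
tmax_may     <- c(33, 27, 22, 30, 38, 30)
tavg_may     <- c(29, 24, 18, 26, 35, 26)
humidity_may <- c(75, 70, 80, 50, 65, 68)

# Exclusionary factors: TRUE where mosquitoes can live, be active, and breed.
# Assumption: "too cold" means the May maximum never reaches 21 C and
# "too hot" means the May minimum never drops to 32 C.
suitable <- elev_m < 1500 &
  tmax_may >= 21 & tmin_may <= 32 &
  humidity_may > 60

# Each logical comparison counts as 0 or 1, so the sum gives
# 1 = low, 2 = medium, 3 = high.
band <- 1 + (tavg_may >= 25) + (tavg_may >= 28)

# Multiply by the suitability mask: unsuitable cells become 0 = no risk.
risk <- suitable * band
print(risk)
```

Adding logical comparisons and multiplying by a mask is the same pattern a raster calculator expression would use, which yields a single layer that can be symbolized with four classes.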
Question 2: Intestinal parasites are another common cause of disease in Kenya. One type of these parasites, nematodes, moves through the soil during parts of its lifecycle and so requires particular climatic conditions to become endemic in a region. First, nematodes are temperature sensitive and require temperatures between 15 and 25 degrees Celsius to survive (remember that, like mosquitoes, they have a relatively short lifecycle, so there are places where they will be endemic during parts of the year even if it is too hot or cold during other parts of the year). Second, they require 6% or more soil moisture, which is found in places with at least 65% humidity. Given these two constraints, construct a binary (risk / no risk) map of where in Kenya nematode infection is likely to be endemic. Once again, it is up to you how you make your map, but you will be assessed on whether the information it communicates is correct and on how clearly your map conveys the information. As you work, make note of the choices you are making to include in your reflection (see below).
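The binary rule in Question 2 can be sketched in R on a few made-up cells. The values and names below are hypothetical, and the sketch assumes one particular reading of the question: a cell counts as at risk if, in at least one month, the temperature is between 15 and 25°C and humidity is at least 65% in that same month. Other readings (for example, using annual humidity) are possible.

```r
# Hypothetical monthly values: rows are raster cells, columns are months.
cells  <- c("cell1", "cell2", "cell3")
months <- c("Jan", "May", "Sep")

temp_c <- matrix(c(27, 24, 26,
                   28, 29, 30,
                   20, 18, 16),
                 nrow = 3, byrow = TRUE, dimnames = list(cells, months))

humidity <- matrix(c(70, 66, 60,
                     80, 80, 80,
                     50, 55, 64),
                   nrow = 3, byrow = TRUE, dimnames = list(cells, months))

# A month is suitable when both constraints hold at once.
month_ok <- temp_c >= 15 & temp_c <= 25 & humidity >= 65

# Binary risk: 1 if any month is suitable, otherwise 0.
nematode_risk <- as.integer(rowSums(month_ok) > 0)
print(nematode_risk)
```

Here only the first cell has a month that satisfies both constraints; the second is always too hot and the third is always too dry.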
Question 3: Finally, please write a reflection (250 – 400 words) that considers:
What choices did you make as a cartographer?
Why / how did you make them?
How did the technology you were using (QGIS, raster data) constrain or support those choices?
What impact did the choices you made and the technology and data you used have on the story your maps tell?
Note: please do not simply copy and paste your reflection from Exercise 1 here and change R to QGIS. We will be looking for your reflection to be specific to this technology and these maps. If you wish, you are welcome to compare the choices you made here to those you made in Exercise 1.
Data:
You can find all of the data you will need for this exercise in the zipped file on Canvas that we used for the Intro to QGIS tutorial.
Exercise 2 Rubric
Question 1 (4 pts): How well does your map communicate the information and answer the question?
4 pts, Excellent: Map does a great job of answering the question. Information is presented clearly and effectively. Map is aesthetically pleasing, technically correct, and does not include superfluous map elements.
3 pts, Good: Map answers the question, but could be a little clearer. It may contain superfluous map elements or distracting map features, but it does answer the question and is technically correct.
2 pts, Acceptable: Map contains information relevant to the question, but doesn't fully answer it. Map may lack clarity or information presented may not be correct.
1 pt, Attempted: If you attempt to make a map, you will earn at least one point, even if you do not successfully answer the question with a map.
0 pts: No Map/Answer Included
Question 2 (4 pts): How well does your map communicate the information and answer the question?
4 pts, Excellent: Map does a great job of answering the question. Information is presented clearly and effectively. Map is aesthetically pleasing, technically correct, and does not include superfluous map elements.
3 pts, Good: Map answers the question, but could be a little clearer. It may contain superfluous map elements or distracting map features, but it does answer the question and is technically correct.
2 pts, Acceptable: Map contains information relevant to the question, but doesn't fully answer it. Map may lack clarity or information presented may not be correct.
1 pt, Attempted: If you attempt to make a map, you will earn at least one point, even if you do not successfully answer the question with a map.
0 pts: No Map/Answer Included
Question 3 (2 pts): Does your reflection touch on all of the points of Question 3 and present a coherent reflection on the exercise process and the limitations of the technology?
2 pts, Excellent: Reflection shows significant thought and addresses all aspects of the question. Reflection is clear and brings in terms and/or concepts from class.
1 pt, Acceptable: Reflection addresses the prompts in the question but may not be complete or could be strengthened with more attention.
0 pts: No Reflection Included
Total Points: 10
A researcher wishes to predict the selling price of houses in City ABC using a multiple linear
regression model. A random sample of 543 houses located in the non-core area and sold in the last
year was selected to form the dataset “Housing.csv”.
The dataset includes the following eight variables:
Variable Description
price Price of the houses in $
area Area of a house in square feet
bedrooms Number of house bedrooms
bathrooms Number of bathrooms
stories Number of house stories
mainroad Whether connected to mainroad (Yes/No)
basement Whether has a basement (Yes/No)
parking Number of house parking
The dependent variable is “price”. The “Housing.csv” dataset can be downloaded
(a) Use R to determine the multiple linear regression model for predicting the price of houses,
using forward stepwise regression to decide which of the other given variables to include as
independent variables. You are expected to perform relevant model checking, including plotting
relevant graphs, after the desired model is formulated. All R programs must be included in the
answer; marks will be deducted otherwise.
(40 marks)
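A minimal sketch of forward stepwise selection for part (a) is given below. Because the actual file is not available here, it assumes a simulated stand-in data frame with the same column names as the variable list; the simulated coefficients and sample size are arbitrary. It also assumes selection by AIC, which is what R's step() uses by default; the course may expect a different criterion such as p-values.

```r
# Assumption: simulated stand-in data with the same column names as Housing.csv.
# With the real file, replace this block with: housing <- read.csv("Housing.csv")
set.seed(42)
n <- 200
housing <- data.frame(
  area      = round(runif(n, 1500, 12000)),
  bedrooms  = sample(1:5, n, replace = TRUE),
  bathrooms = sample(1:3, n, replace = TRUE),
  stories   = sample(1:4, n, replace = TRUE),
  mainroad  = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  basement  = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  parking   = sample(0:3, n, replace = TRUE)
)
housing$price <- 1e6 + 250 * housing$area + 4e5 * housing$bathrooms +
  rnorm(n, sd = 3e5)

# Forward stepwise selection by AIC: start from the intercept-only model
# and let step() add predictors from the upper scope one at a time.
null_model <- lm(price ~ 1, data = housing)
forward_model <- step(
  null_model,
  scope = list(lower = ~ 1,
               upper = ~ area + bedrooms + bathrooms + stories +
                 mainroad + basement + parking),
  direction = "forward",
  trace = 0
)

# Overall F-test and per-coefficient t-tests, relevant to part (b).
print(summary(forward_model))

# Standard diagnostic plots: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage.
par(mfrow = c(2, 2))
plot(forward_model)
```

Setting trace = 1 instead would print each step of the selection, which can be useful evidence to include in the answer.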
(b) Perform relevant hypothesis testing to assess the validity of the multiple linear regression model
obtained as well as the validity of individual regression coefficients. (5 marks)
(c) Interpret the regression coefficients of the model. (5 marks)
(d) Write a reflective journal of not more than 200 words that summarizes your learning experience
in applying the knowledge and skills acquired in the course to build the regression model for the
given problem, and that explains how this experience could enrich your ability to apply course
knowledge to real-life applications. (10 marks)