Binary file added .DS_Store
Binary file not shown.
4 changes: 4 additions & 0 deletions .gitignore
@@ -0,0 +1,4 @@
.Rproj.user
.Rhistory
.RData
.Ruserdata
16 changes: 16 additions & 0 deletions Exercise09.Rproj
@@ -0,0 +1,16 @@
Version: 1.0

RestoreWorkspace: Default
SaveWorkspace: Default
AlwaysSaveHistory: Default

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8

RnwWeave: Sweave
LaTeX: pdfLaTeX

AutoAppendNewline: Yes
StripTrailingWhitespace: Yes
38 changes: 38 additions & 0 deletions Exercise9_Gallo_Script.R
@@ -0,0 +1,38 @@
# Hayden Gallo
# Intro to Biocomputing
# Exercise 09

library(ggplot2)

# Problem 1
#1. Find some data on two variables that you would expect to be related to each other.
#These data can come from your own research, your daily life, or the internet.
#Enter those data into a text file or Excel and then save a text file, and write a script
#that loads this text file and produces a scatter plot of those two variables that includes a trend line.

baseball_data <- read.table('baseball_data.txt',header = TRUE, sep = '\t')

ggplot(data = baseball_data, aes(x = H, y = R)) + geom_point() + geom_smooth(se = FALSE, method ='lm') +
xlab('Hits') + ylab('Runs') + ggtitle('Hits vs. Runs for Professional Baseball Teams')
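
Note: baseball_data.txt ends with a "League Average" row, so the plot above appears to include that summary row as if it were a 31st team. A minimal sketch of one way to drop it before fitting the trend line is shown here. It assumes the team name is read into a column called `Tm`, as the file header suggests; `baseball_subset` and `teams_only` are hypothetical names, and the three team rows are copied from the data file only to keep the example self-contained.

```r
# Small subset of baseball_data.txt (team name, hits, runs), plus the summary row
baseball_subset <- data.frame(
  Tm = c("Arizona Diamondbacks", "Atlanta Braves", "Baltimore Orioles", "League Average"),
  H  = c(1232, 1394, 1281, 1322),
  R  = c(702, 789, 674, 694)
)

# Keep only real teams; the summary row is not an independent observation
teams_only <- subset(baseball_subset, Tm != "League Average")

# Same linear model that geom_smooth(method = 'lm') fits, shown explicitly
fit <- lm(R ~ H, data = teams_only)
coef(fit)
```

In the script itself, the same `subset()` call could be applied to `baseball_data` right after `read.table()`, with the filtered data frame passed to `ggplot()`.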


# Problem 2
# 2. Given the data in “data.txt”. Write a script that generates two figures that summarize the data.
# First, show a barplot of the means of the four populations. Second, show a scatter plot of all of the observations.
# You may want to “jitter” the points (geom_jitter()) to make it easier to see all of the observations
# within a population in your scatter plot. Alternatively, you could also try setting the alpha argument
# in geom_point() to 0.1. Answer the following questions as a comment in your R script - Do the bar
# and scatter plots tell you different stories? Why?

data <- read.table('data.txt', header = TRUE, sep = ',')

data.mean <- aggregate(observations ~ region, data, mean)

ggplot(data.mean, aes(x = region, y = observations)) + geom_bar(stat = 'identity') + ggtitle('Mean of Each of the Four Populations')

Owner commented: You can also use stat_summary(), as shown in the lecture slides.
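
The suggestion above could look roughly like the following sketch. It lets ggplot2 compute the per-region means itself, so no separate `aggregate()` step is needed. The `toy` data frame and its values are made up for illustration and only mirror the `region` and `observations` column names used in the script; the `fun` argument assumes ggplot2 3.3.0 or later (older versions used `fun.y`).

```r
library(ggplot2)

# Made-up stand-in for data.txt with the same column names
toy <- data.frame(
  region = rep(c("north", "south"), each = 3),
  observations = c(2, 4, 6, 10, 20, 30)
)

# stat_summary() applies mean() to the y values within each x group
# and draws the result with the bar geom
p <- ggplot(toy, aes(x = region, y = observations)) +
  stat_summary(fun = mean, geom = "bar") +
  ggtitle("Mean of Each Population")

# print(p) draws the figure when run in a script
```

This keeps the raw observations in a single data frame that both the bar plot and the scatter plot can share.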

# geom_jitter() draws the points itself; adding geom_point() as well would plot every observation twice
ggplot(data = data, aes(x = region, y = observations)) + geom_jitter() + ggtitle('Region vs Observation Value')

# Yes, the bar and scatter plots tell different stories. The bar plot shows that the means
# of the populations are very close to one another, but the scatter plot shows that the spread
# of the observations differs greatly between populations. The populations are made up of very
# different observation values that just happen to have similar means.
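
One way to back up the answer above with numbers is to tabulate the spread next to the mean for each region. The following is a minimal sketch, not part of the submitted script: the `toy` values are invented so that both groups share a mean but differ in spread, and `summary_tbl` is a hypothetical name.

```r
# Invented data: both regions have mean 10, but very different spread
toy <- data.frame(
  region = rep(c("a", "b"), each = 3),
  observations = c(9, 10, 11, 0, 10, 20)
)

# Per-region mean and standard deviation, using the same aggregate() pattern as the script
region_mean <- aggregate(observations ~ region, data = toy, FUN = mean)
region_sd   <- aggregate(observations ~ region, data = toy, FUN = sd)

summary_tbl <- data.frame(
  region = region_mean$region,
  mean   = region_mean$observations,
  sd     = region_sd$observations
)
print(summary_tbl)
```

Run against the real `data` object, a table like this would show whether the similar bar heights hide different standard deviations.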
32 changes: 32 additions & 0 deletions baseball_data.txt
@@ -0,0 +1,32 @@
Tm Bat BatAge R/G G PA AB R H 2B 3B HR RBI SB CS BB SO BA OBP SLG OPS OPS+ TB GDP HBP SH SF IBB LOB
Arizona Diamondbacks 57 26.5 4.33 162 6027 5351 702 1232 262 24 173 658 104 29 531 1341 0.23 0.304 0.385 0.689 95 2061 97 60 31 50 14 1039
Atlanta Braves 53 27.5 4.87 162 6082 5509 789 1394 298 11 243 753 87 31 470 1498 0.253 0.317 0.443 0.761 111 2443 103 66 1 36 13 1030
Baltimore Orioles 58 27 4.16 162 6049 5429 674 1281 275 25 171 639 95 31 476 1390 0.236 0.305 0.39 0.695 97 2119 95 83 12 43 10 1095
Boston Red Sox 54 28.8 4.54 162 6144 5539 735 1427 352 12 155 704 52 20 478 1373 0.258 0.321 0.409 0.731 102 2268 131 63 12 50 23 1133
Chicago Cubs 64 27.9 4.06 162 6072 5425 657 1293 265 31 159 620 111 37 507 1448 0.238 0.311 0.387 0.698 96 2097 130 84 19 36 16 1100
Chicago White Sox 44 29.3 4.23 162 6123 5611 686 1435 272 9 149 654 58 10 388 1269 0.256 0.31 0.387 0.698 97 2172 127 73 16 35 9 1117
Cincinnati Reds 66 29.4 4 162 5978 5380 648 1264 235 18 156 618 58 33 452 1430 0.235 0.304 0.372 0.676 83 2003 127 92 12 33 6 1020
Cleveland Guardians 50 25.9 4.31 162 6163 5558 698 1410 273 31 127 662 119 27 450 1122 0.254 0.316 0.383 0.699 102 2126 115 81 22 52 36 1156
Colorado Rockies 43 29.1 4.31 162 6105 5540 698 1408 280 34 149 669 45 20 453 1330 0.254 0.315 0.398 0.713 90 2203 139 61 10 40 10 1113
Detroit Tigers 53 27.9 3.44 162 5870 5378 557 1240 235 27 110 530 47 24 380 1413 0.231 0.286 0.346 0.632 84 1859 108 58 10 44 8 1015
Houston Astros 45 29.3 4.55 162 6054 5409 737 1341 284 13 214 715 83 22 528 1179 0.248 0.319 0.424 0.743 111 2293 118 60 9 42 18 1068
Kansas City Royals 55 27.1 3.95 162 6010 5437 640 1327 247 38 138 613 104 34 460 1287 0.244 0.306 0.38 0.686 93 2064 101 48 20 44 7 1091
Los Angeles Angels 66 27.9 3.85 162 5977 5423 623 1265 219 31 190 600 77 27 449 1539 0.233 0.297 0.39 0.687 94 2116 95 54 25 25 28 1050
Los Angeles Dodgers 51 29.6 5.23 162 6247 5526 847 1418 325 31 212 812 98 18 607 1374 0.257 0.333 0.442 0.775 112 2441 85 56 3 53 22 1159
Miami Marlins 56 28.8 3.62 162 5949 5395 586 1241 248 20 144 554 122 29 436 1429 0.23 0.294 0.363 0.658 86 1961 120 70 4 36 6 1045
Milwaukee Brewers 51 29.1 4.48 162 6122 5417 725 1271 251 17 219 703 96 30 577 1464 0.235 0.315 0.409 0.724 105 2213 117 80 11 37 25 1102
Minnesota Twins 61 26.9 4.3 162 6113 5476 696 1356 269 18 178 668 38 17 518 1353 0.248 0.317 0.401 0.718 107 2195 133 62 10 46 11 1126
New York Mets 61 29.7 4.77 162 6176 5489 772 1422 272 27 171 735 62 22 510 1217 0.259 0.332 0.412 0.744 113 2261 122 112 20 44 25 1158
New York Yankees 54 30.2 4.98 162 6172 5422 807 1308 225 8 254 764 102 33 620 1391 0.241 0.325 0.426 0.751 113 2311 121 70 14 41 36 1093
Oakland Athletics 64 28.3 3.51 162 5863 5314 568 1147 249 15 137 537 78 23 433 1389 0.216 0.281 0.346 0.626 82 1837 109 59 22 33 7 969
Philadelphia Phillies 56 28.1 4.61 162 6077 5496 747 1392 255 29 205 719 105 28 478 1363 0.253 0.317 0.422 0.739 107 2320 116 52 6 44 15 1075
Pittsburgh Pirates 68 26.3 3.65 162 5912 5331 591 1186 221 29 158 555 89 32 476 1497 0.222 0.291 0.364 0.655 84 1939 96 54 19 32 14 1016
San Diego Padres 55 28.2 4.35 162 6175 5468 705 1317 275 18 153 682 49 22 574 1327 0.241 0.318 0.382 0.7 104 2087 95 65 17 46 24 1174
Seattle Mariners 59 27.5 4.26 162 6117 5375 690 1236 229 19 197 663 83 27 596 1397 0.23 0.315 0.39 0.704 106 2094 120 89 9 45 17 1129
San Francisco Giants 66 30 4.42 162 6117 5392 716 1261 255 18 183 683 64 16 571 1462 0.234 0.315 0.39 0.705 98 2101 109 95 6 53 14 1115
St. Louis Cardinals 51 28.8 4.77 162 6165 5496 772 1386 290 21 197 739 95 25 537 1226 0.252 0.325 0.42 0.745 114 2309 112 80 5 45 11 1132
Tampa Bay Rays 61 27 4.11 162 6008 5412 666 1294 296 17 139 634 95 37 500 1395 0.239 0.309 0.377 0.686 100 2041 93 57 7 31 13 1074
Texas Rangers 55 28 4.36 162 6029 5478 707 1308 224 20 198 670 128 41 456 1446 0.239 0.301 0.395 0.696 98 2166 82 47 10 38 12 1007
Toronto Blue Jays 51 27.1 4.78 162 6158 5555 775 1464 307 12 200 756 67 35 500 1242 0.264 0.329 0.431 0.76 116 2395 136 55 8 33 13 1111
Washington Nationals 55 28.7 3.72 162 5998 5434 603 1351 252 20 136 579 75 31 442 1221 0.249 0.31 0.377 0.688 99 2051 140 60 20 37 12 1099
League Average 50 28.2 4.28 162 6068 5449 694 1322 265 21 174 663 83 27 495 1360 0.243 0.312 0.395 0.706 100 2152 113 68 13 41 16 1087