Big Data Session 4: Oct. 6, 2021

Big Data in Environmental Science and Toxicology is a 2021 seminar series from the Texas A&M Superfund Research Center.

Download Slide Deck (PDF) | Download Supporting Files (ZIP) (right-click and save file)


Fred Wright
Burcu Beykal
Allison Dickey

Wednesday, Oct. 6, 2021 | 1:00–3:00 p.m. (Central US Time) 
Fred Wright (North Carolina State University), Burcu Beykal (University of Connecticut), and Allison Dickey (North Carolina State University)

Zoom Details: Will be emailed to registrants on the morning of the session

MANIPULATING AND DISPLAYING BIG(ISH) DATA IN R 

This session will provide a tutorial on commonly used features of R, taught through the RStudio interface. The example datasets are relevant to bench scientists and environmental researchers. No prior familiarity with R is assumed.

  • Introduction to R
    • An introduction to RStudio and installation of packages
    • Reading data into R in various formats
    • Exploring data types and dimensions
    • Extracting data and identifying missing data
    • Sorting data and using the apply function
    • Merging data frames
  • Data visualization and analysis
    • Plotting/graphics in base R (scatterplots, histograms, boxplots, etc.)
    • Basic summary statistics
    • Basic inferential statistics (e.g., t-tests, ANOVA, multiple-testing correction)
    • Clustering and dimensionality reduction (e.g., PCA)
  • More advanced visualization
    • Using ggplot2
    • Customizing plots
    • Spatial displays and maps in ggplot2
    • Interactive plots using plotly

Post about the series on social media and use this hashtag!
#TAMUSuperfundBigData2021