Data Management & Analysis Core

"Big Data Analytics for Environmental Health Decisions in Emergency Response"

The Data Management & Analysis Core (DMAC) is one of the key components of the Center, supporting all projects and cores in their data management, analysis, and quality control needs. Directed by Dr. Efstratios N. Pistikopoulos in collaboration with co-Investigators Dr. Fred A. Wright, Dr. Lan Zhou, and Dr. Candice Brinkmeyer-Langford, the DMAC will provide essential services to the Center’s researchers, assisting them in achieving key environmental and biomedical outcomes under four specific aims: (i) providing a new platform for data management and sharing across the Center, (ii) applying best-practice analysis methods to Center data, (iii) developing new methods that are urgently needed to solve the problems posed in the Projects, and (iv) maintaining research and data quality control protocols for the Center. The DMAC will establish a data universe (“dataverse”) for data sharing, integration, and collaboration. The “dataverse” will manage Center datasets: each component will securely deposit and access data through a web-based platform, ensuring that Center-generated data comply with the Findable, Accessible, Interoperable, and Reusable (FAIR) principles. Our central hypothesis remains that the Center, with its interconnected projects and cores producing multidimensional and variable kinds of data, must be able to conduct corresponding state-of-the-art computational analyses that are accessible, free of erroneous inputs, interpretable, and reusable by all stakeholders. The DMAC responds to Superfund mandate #1: advanced techniques for the detection, assessment, and evaluation of the effects of hazardous substances on human health.
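
As an illustration of what programmatic deposit through such a platform could look like: the Texas Data Repository is a Dataverse installation, and Dataverse exposes a native HTTP API for creating datasets. The sketch below is a minimal, hypothetical Python example, not actual Center tooling; the base URL, API token, collection alias, and metadata values are placeholders, and several metadata fields a production Dataverse requires are omitted for brevity.

```python
import json

import requests

BASE_URL = "https://dataverse.tdl.org"      # placeholder TDR/Dataverse instance URL
API_TOKEN = "REPLACE-WITH-DEPOSITOR-TOKEN"  # issued per depositor account
COLLECTION = "tamu-superfund"               # hypothetical collection alias


def field(type_name, value, type_class="primitive", multiple=False):
    """Build one field of a Dataverse citation metadata block."""
    return {"typeName": type_name, "typeClass": type_class,
            "multiple": multiple, "value": value}


# Citation metadata makes a deposit Findable; the persistent identifier that
# Dataverse assigns makes it Accessible and citable (the F and A of FAIR).
# Author, contact, and description fields (also required in practice) are
# omitted here to keep the sketch short.
metadata = {"datasetVersion": {"metadataBlocks": {"citation": {
    "displayName": "Citation Metadata",
    "fields": [
        field("title", "Example exposure dataset"),
        field("subject", ["Medicine, Health and Life Sciences"],
              type_class="controlledVocabulary", multiple=True),
    ],
}}}}

# Create a dataset draft inside the collection via the Dataverse native API.
resp = requests.post(
    f"{BASE_URL}/api/dataverses/{COLLECTION}/datasets",
    headers={"X-Dataverse-key": API_TOKEN},
    data=json.dumps(metadata),
)
resp.raise_for_status()
print("Created dataset:", resp.json()["data"]["persistentId"])
```

Once created, the dataset can have files attached and can be reviewed and published, at which point its persistent identifier becomes resolvable by the wider community.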

The DMAC of the Texas A&M Superfund Research Center serves as a hub for translating the raw experimental data produced by the Research Projects into useful knowledge for the community via data collection, integration, quality control, analysis, and model generation. The Core utilizes state-of-the-art methods in data science, optimization, and machine learning; develops and applies novel dimensionality reduction techniques; and establishes a common data management and computational platform for collaboration within the Center and data dissemination to Center stakeholders.
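
To make the dimensionality reduction step concrete: the Core's own novel techniques are not detailed here, so the sketch below uses principal component analysis (PCA) from scikit-learn on synthetic data as a stand-in for the general workflow of compressing high-dimensional assay measurements before downstream modeling. The sample and feature counts are illustrative only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a multidimensional Center dataset:
# 100 samples by 500 measured features (both counts are hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 500))

# Standardize features so no single assay dominates the decomposition.
X_std = StandardScaler().fit_transform(X)

# Keep however many components are needed to explain 90% of the variance.
pca = PCA(n_components=0.90)
scores = pca.fit_transform(X_std)

print(f"Reduced {X.shape[1]} features to {scores.shape[1]} components "
      f"({pca.explained_variance_ratio_.sum():.0%} variance retained)")
```

The reduced component scores, rather than the raw feature matrix, would then feed statistical or machine learning models downstream.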

Leadership:

Director: Dr. Efstratios N. Pistikopoulos
Co-Investigators: Dr. Fred A. Wright, Dr. Lan Zhou, Dr. Candice Brinkmeyer-Langford

Specific aims:

  1. Advance the existing data management infrastructure to create a “dataverse” under the Texas Data Repository (TDR) for sharing, integrating, and disseminating data and data science methods across all projects of the Center and to the wider community.
  2. Provide service and expertise in state-of-the-art methods of statistical analysis, machine learning, data science, and mathematical optimization to support disaster research response (DR2) and successful completion of each project.
  3. Advance and develop high-performance data-driven analysis techniques and frameworks for use by Center projects.
  4. Develop and maintain data quality assurance and control protocols for each Center component and the data management system (a minimal sketch of such an automated check follows this list).
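
As a concrete illustration of aim 4, the sketch below implements a simple automated quality control gate in Python. The column names, value ranges, and example records are hypothetical; in practice each Center component would define its own schema, and incoming files would be checked against it before deposit into the shared “dataverse”.

```python
import pandas as pd

# Hypothetical per-component schema: which columns must be present and
# complete, and what value ranges are physically plausible.
SCHEMA = {
    "sample_id": {"required": True},
    "analyte":   {"required": True},
    "conc_ug_l": {"required": True, "min": 0.0, "max": 1e6},
}


def qc_report(df: pd.DataFrame) -> list[str]:
    """Return a list of QC violations; an empty list means the file passes."""
    problems = []
    for col, rules in SCHEMA.items():
        if col not in df.columns:
            problems.append(f"missing column: {col}")
            continue
        if rules.get("required") and df[col].isna().any():
            problems.append(f"{col}: {df[col].isna().sum()} missing values")
        lo, hi = rules.get("min"), rules.get("max")
        if lo is not None and (df[col] < lo).any():
            problems.append(f"{col}: values below {lo}")
        if hi is not None and (df[col] > hi).any():
            problems.append(f"{col}: values above {hi}")
    return problems


# Example gate before deposit (records are made up for illustration).
df = pd.DataFrame({"sample_id": ["s1", "s2"],
                   "analyte":   ["Pb", "As"],
                   "conc_ug_l": [12.4, None]})
for issue in qc_report(df):
    print("QC FAIL:", issue)
```

A gate of this kind keeps erroneous inputs out of the shared repository, supporting the Center's goal that analyses remain interpretable and reusable by all stakeholders.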