data-analysis

Mentors and Regional Facilitators
Name Region Skills Interests
Andrew Fullard Campus Champions
Alyssa Pivirotto ACCESS CSSN, Campus Champions
Alana Romanella Campus Champions
Craig Gross Campus Champions, CCMNet
Bala Desinghu ACCESS CSSN, Campus Champions, CAREERS, Northeast
diana Trotman CAREERS
Deborah Penchoff Campus Champions
David Ryglicki
Daniel Sierra-Sosa Campus Champions
Fernando Garzon ACCESS CSSN
Feseha Abebe-Akele CCMNet
Georgia Stuart TRECIS
Iman Rahbari Campus Champions, ACCESS CSSN
Jacob Fosso Tande ACCESS CSSN, Campus Champions, CCMNet
Jason Yalim Campus Champions
Katia Bulekova ACCESS CSSN, Campus Champions, CAREERS, CCMNet, Northeast
Laura Christopherson Campus Champions, CCMNet
Luis Cueva Parra Campus Champions, CCMNet
shuai liu ACCESS CSSN
Mohsen Ahmadkhani CCMNet, ACCESS CSSN
Mahmoud Parvizi Campus Champions
Michael Puerrer Campus Champions, Northeast
Maryam Taeb
Nannan Shan CCMNet, ACCESS CSSN
Paul Rulis Campus Champions
Rebecca Belshe Campus Champions, CCMNet
Russell Hofmann ACCESS CSSN, CCMNet
Xiaoqin Huang ACCESS CSSN
Liwen Shih Campus Champions, ACCESS CSSN
Swabir Silayi ACCESS CSSN, CCMNet, Campus Champions
Suhong Li CAREERS, ACCESS CSSN
Sathish Srinivasan ACCESS CSSN
Thomas Pranzatelli
Yun Shen CAREERS, Northeast, ACCESS CSSN, CCMNet
Users
Name Roles Skills Interests
Amin Sepehri
student facilitator
Anish Gaikwad
student facilitator
Ahamed Arafaat…
student facilitator
Arvind Sai Che…
student facilitator
Carlos Paniagua
researcher/educator
Bala Desinghu
mentor
researcher/educator
rcf
diana Trotman
student facilitator
rcf
mentee
Denis Kornev
researcher/educator
Emma Strand
student facilitator
cheng fang
student facilitator
Yuanyuan Zeng
student facilitator
Gabrielle Martinez
student facilitator
Chris Hemme
researcher/educator
Hening Cui
student facilitator
Ifeoma Ugwuanyi
student facilitator
Rui Feng Ye
student facilitator
Katia Bulekova
mentor
rcf
Katie Salas
student facilitator
Lenore Martin
researcher/educator
Matt Ferguson
researcher/educator
Minxiu Shi
student facilitator
Mohamed Eltayeb
student facilitator
Parth Shah
student facilitator
Qingdou Han
student facilitator
RAM PALASETTY
student facilitator
Ryan De Lorenzo
student facilitator
Sanguthevar Ra…
researcher/educator
Shagesh Sridharan
student facilitator
Suhong Li
mentor
researcher/educator
Stanley Nwoji
researcher/educator
Som Bishoyi
student facilitator
Seth Rosen
student facilitator
Sydney Shearer
student facilitator
Ying-Chih Sun
student facilitator
William Feng
student facilitator
Yun Shen
mentor
Yuming Ding
student facilitator
Zoe Reich
student facilitator
Blog Entries
There are no Blog Entries associated with this topic.

Affinity Groups

Logo Name Description Tags Join
Launch Launch is a regional computational resource that supports researchers incorporating computational and data-enabled approaches in their scientific workflows at eleven under-resourced institutions in…

Engagements

Investigation of robustness of state of the art methods for anxiety detection in real-world conditions
University of Illinois at Urbana-Champaign

I am new to ACCESS. I have a little bit of past experience running code on NCSA's Blue Waters. As a self-taught programmer, it would be interesting to learn from an experienced mentor. 

Here's an overview of my project:

Anxiety detection is a topic that is actively studied, but existing methods struggle to generalize and perform outside of controlled lab environments. I propose to critically analyze state-of-the-art detection methods to quantify the failure modes of existing applied machine learning models and to introduce methods that make these models more robust to real-world challenges. The aim is to start the study by performing sensitivity analysis of the existing best-performing models, then testing existing hypotheses about why these models fail in the real world. We expect this to deepen our understanding of why models fail and to let us use explainability to design better in-lab experimental protocols and machine learning models that perform better in real-world scenarios. Findings will dictate future directions, which may include improving personalized health detection, carefully designing experimental protocols that enable transfer learning to extend the reach of existing anxiety detection models, and using explainability techniques to inform better sensing methods and hardware.

Status: Complete
Prediction of Polymerization of the Yersinia Pestis Type III Secretion System
Nova Southeastern University

Yersinia pestis, the bacterium that causes the bubonic plague, uses a type III secretion system (T3SS) to inject toxins into host cells. The structure of the Y. pestis T3SS needle has not been modeled using AI or cryo-EM, although T3SS structures in homologous bacteria have been solved using cryo-EM. Previously, we created possible hexamers of the Y. pestis T3SS needle protein, YscF, using ColabFold and AlphaFold2 Colab on Google Colab in an effort to understand more about the needle structure and calcium regulation of secretion. Hexamers and mutated hexamers were designed using data from a wet lab experiment by Torruellas et al. (2005). T3SS structures in homologous organisms show a 22- or 23-mer structure in which rings of hexamers interlock in layers. When folding was attempted with more than six monomers, we observed larger single rings of monomers instead. This revealed the inaccuracies of these online systems. To create a more accurate complete needle structure, different software capable of building a helical polymerized needle is required. The number of atoms in the predicted final needle is very high, more than our computational infrastructure can handle. For that reason, we need the computational resources of a supercomputer. We have hypothesized two ways to direct the folding that have the potential to result in a more accurate needle structure. The first option involves fusing the current hexamer structure into one protein chain, so that the software recognizes the hexamer as one protein. This will make it easier to connect multiple hexamers together. Alternatively, or additionally, the cryo-EM structures of the T3SS of Shigella flexneri and Salmonella enterica Typhimurium can be used as models to guide the construction of the Y. pestis T3SS needle. The full AlphaFold library or a program like RoseTTAFold could help us predict protein-protein interactions more accurately for large structures.
Based on our needs, we have identified TAMU ACES, Rockfish, and Stampede-2 as promising resources for this project. The generated model of the Y. pestis T3SS YscF needle will provide insight into a possible structure of the needle.

Status: Complete
Bayesian nonparametric ensemble air quality model predictions at high spatio-temporal daily nationwide 1 km grid cell
Columbia University

I aim to run a Bayesian Nonparametric Ensemble (BNE) machine learning model implemented in MATLAB. Previously, I successfully tested the model on Columbia's HPC GPU cluster using SLURM. I have since enabled MATLAB parallel computing and enhanced my script with additional lines of code for optimized execution. 

I want to leverage ACCESS Accelerate allocations to run this model at scale.

The BNE framework is an innovative ensemble modeling approach designed for high-resolution air pollution exposure prediction and spatiotemporal uncertainty characterization. This work requires significant computational resources due to the complexity and scale of the task. Specifically, the model predicts daily air pollutant concentrations (PM2.5 and NO2) at a 1 km grid resolution across the United States, spanning the years 2010–2018. Each daily prediction dataset is approximately 6 GB in size, resulting in substantial storage and processing demands.
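
As a rough check on the scale described above, the following Python sketch estimates the total volume of daily prediction output for 2010–2018. It is a back-of-the-envelope calculation, not part of the BNE code: the function names are hypothetical, the 6 GB figure is the approximate size quoted above, and whether that figure covers one pollutant or both is an assumption (the text does not say), so both cases are shown.

```python
from datetime import date

# Approximate size of one daily prediction dataset, as quoted in the text.
GB_PER_DAILY_DATASET = 6
POLLUTANTS = ("PM2.5", "NO2")


def days_in_span(first_year: int, last_year: int) -> int:
    """Number of calendar days from Jan 1 of first_year through Dec 31 of last_year."""
    return (date(last_year + 1, 1, 1) - date(first_year, 1, 1)).days


def total_output_tb(first_year: int, last_year: int, datasets_per_day: int = 1) -> float:
    """Estimated total output in decimal terabytes (1 TB = 1000 GB)."""
    total_gb = days_in_span(first_year, last_year) * datasets_per_day * GB_PER_DAILY_DATASET
    return total_gb / 1000


if __name__ == "__main__":
    n_days = days_in_span(2010, 2018)
    print(f"days in 2010-2018: {n_days}")
    # Assumption A: one ~6 GB dataset per day covers both pollutants.
    print(f"one dataset per day: {total_output_tb(2010, 2018):.1f} TB")
    # Assumption B: one ~6 GB dataset per day for each pollutant.
    per_pollutant = total_output_tb(2010, 2018, datasets_per_day=len(POLLUTANTS))
    print(f"one dataset per pollutant per day: {per_pollutant:.1f} TB")
```

Under these assumptions the prediction output alone comes to roughly 20 TB, or roughly 40 TB if each pollutant has its own daily dataset; input data, intermediate files, and model checkpoints are not counted.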

To ensure efficient training, validation, and execution of the ensemble models at a national scale, I need access to GPU clusters with the following resources:

  • Permanent storage: ≥100 TB
  • Temporary storage: ≥50 TB
  • RAM: ≥725 GB

In addition to MATLAB, I also require Python and R installed on the system. I use Python notebooks to analyze output data and run R packages through a conda environment in Jupyter Notebook. These tools are essential for post-processing and visualization of model predictions, as well as for running complementary statistical analyses.

To finalize the GPU system configuration based on my requirements and initial runs, I would appreciate guidance from an expert. Since I already have approval for the ACCESS Accelerate allocation, this support will help ensure a smooth setup and efficient utilization of the allocated resources.

Status: Complete

People with Expertise

Deborah Penchoff

University of Tennessee - Knoxville

Programs

Campus Champions

Roles

research computing facilitator

Expertise

Ahamed Arafaath Muthalif Mubarak Ali

New Jersey Institute of Technology

Programs

CAREERS

Roles

student-facilitator

Expertise

Kyle Randall

Programs

ACCESS CSSN

Roles

student-facilitator

Expertise

People with Interest

Elie Alhajjar

Programs

ACCESS CSSN, CCMNet

Roles

mentor, cssn, Consultant, CCMNet

Interests

Mohsen Ahmadkhani

Programs

CCMNet, ACCESS CSSN

Roles

student-facilitator, mentor, cssn, CCMNet

Interests

Adedeji Adekunle

Rutgers University, Camden

Programs

CAREERS

Roles

student-facilitator

Interests