Parallel analysis of variants in multiple bear genomes: Visualization of complexity

Project Information

Researcher-facing, domain-specific, bioinformatics, visualization
Project Status: In Progress
Project Region: CAREERS
Submitted By: Galen Collier
Project Email: grigorie@camden.rutgers.edu
Project Institution: Rutgers University–Camden
Anchor Institution: CR-Rutgers
Project Address: 303 Cooper St
Camden, New Jersey 08102

Mentors: Galen Collier
Students: Bill Ni

Project Description

Introduction: This interdisciplinary proposal applies high-performance computing (HPC) approaches and efficient visualization to the analysis of next-generation sequencing (NGS) data from the genomes of brown and polar bears. It aims to improve and speed up the detection of common variants in cohorts of related genomes in order to establish the evolutionary trajectories of the corresponding species. The work will be performed by a graduate student under the supervision of Dr. Andrey Grigoriev, Professor at the Biology Department and the Center for Computational and Integrative Biology at Rutgers-Camden. Remote work is the most likely mode of operation for this project.

Genomes of all organisms and species undergo constant change, and mutations occur at varying scales. Structural variants (SVs) typically affect much larger genome intervals than single nucleotide variants (SNVs) or short insertions/deletions (indels). Currently, comparative genomics efforts mostly focus on SNVs and indels in protein-coding regions, while the role of SVs (especially outside those regions) remains largely unknown. There is an unmet need for, and a growing interest in, understanding the effect of SVs in evolution using NGS.

Project details: The low accuracy of current SV-finding pipelines necessitates visual inspection of the variants they report. Efficient graphical representation of these variants remains a challenge, and existing tools cannot cope with more than 3 samples (1). A flexible, data-driven pipeline connecting the output of the variant-finding algorithm to the graphical user interface is also needed. This project will include the development of such an interface and pipeline. It will also complement our work on parallel search in multiple samples: combining weaker evidence at similar locations in similar subspecies will further improve SV prediction accuracy compared to the current pipelines based on our algorithm, GROM (2).
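To illustrate what "combining weaker evidence at similar locations" could look like in practice, the Python sketch below pools SV calls from several samples when they share a chromosome and SV type and overlap reciprocally, then keeps only pooled calls whose summed read support passes a threshold. This is not GROM's method or the project's actual pipeline; the `SVCall` record, the function names, the greedy clustering against the first call in each cluster, and both thresholds are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVCall:
    """One candidate SV in one sample (hypothetical record layout)."""
    sample: str
    chrom: str
    start: int
    end: int
    sv_type: str
    support: int  # number of supporting reads; an assumed evidence score


def reciprocal_overlap(a, b):
    """Smaller of the two fractions of each interval covered by the other."""
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def pool_calls(calls, min_overlap=0.5, min_total_support=6):
    """Greedily cluster similar calls and sum their support across samples.

    Both thresholds are arbitrary illustrative defaults, not tuned values.
    """
    clusters = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.sv_type, c.start)):
        for cluster in clusters:
            rep = cluster[0]  # compare against the cluster's first call
            if (rep.chrom == call.chrom and rep.sv_type == call.sv_type
                    and reciprocal_overlap(rep, call) >= min_overlap):
                cluster.append(call)
                break
        else:
            clusters.append([call])

    pooled = []
    for cluster in clusters:
        total = sum(c.support for c in cluster)
        if total >= min_total_support:
            pooled.append({
                "chrom": cluster[0].chrom,
                "sv_type": cluster[0].sv_type,
                "start": min(c.start for c in cluster),
                "end": max(c.end for c in cluster),
                "samples": sorted({c.sample for c in cluster}),
                "total_support": total,
            })
    return pooled


# Made-up calls: each deletion is weakly supported in its own sample.
calls = [
    SVCall("polar_1", "chr1", 1000, 2000, "DEL", 3),
    SVCall("polar_2", "chr1", 1050, 2010, "DEL", 2),
    SVCall("brown_1", "chr1", 990, 1980, "DEL", 2),
    SVCall("brown_1", "chr2", 5000, 5600, "DUP", 2),
]
pooled = pool_calls(calls)
```

In this made-up input, no single deletion call on chr1 reaches the support threshold alone, but the three overlapping calls do once pooled, while the isolated duplication on chr2 is dropped. A real pipeline would need a more careful clustering strategy and an evidence model suited to the caller's output.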

References

1. Robinson, J.T., Thorvaldsdóttir, H., Wenger, A.M., Zehir, A. and Mesirov, J.P. (2017) Variant review with the Integrative Genomics Viewer. Cancer Research, 77, e31-e34.
2. Smith, S., Kawash, J. and Grigoriev, A. (2017) Lightning-fast genome variant detection with GROM. GigaScience, 6(10), 1-7.
