CI Pathways: HPC Data Science with Pandas & SQL
This session covers traditional tools for data manipulation, focusing on Pandas, a widely used Python library, and SQL, the standard query language for relational databases. Participants will learn about the features, practical applications, and limitations of these popular data management tools.
Pre-requisites:
- Basic Python programming, including familiarity with NumPy/SciPy
- Basic Linux shell commands
- To participate in the hands-on exercises, you must know the basics of using NCSA's Delta system
CI Pathways is a training program led by the National Center for Supercomputing Applications and the Pittsburgh Supercomputing Center, and funded by NSF award 2417789. For more information about the program, please visit the CI Pathways webpage on HPC-Moodle.