Statistical Analysis of Criminal Cases in the United States District Court of Puerto Rico

Project Information

Project Status: Complete
Project Region: CAREERS
Submitted By: Gaurav Khanna
Project Email: mchou@providence.edu
Project Institution: Providence College
Anchor Institution: CR-University of Rhode Island

Mentors: Michael Chou
Students: Emily Gelchie

Project Description

For the purposes of submitting an amicus brief to the US Supreme Court, the Puerto Rico Association of Criminal Defense Lawyers (PRACDL) compiled several indictments and docket sheets from the PACER system. Data from these documents were extracted and analyzed with sociodemographic data from the US Census. The wealth of data contained in these documents is not easily accessible for statistical study. The goal of this project is two-fold. First, a script will be written to mine these documents for information including, but not limited to: the length of time that a case is "open", the percentage of persons represented by a court-appointed attorney, the average length of sentences, the number of persons granted bail, and the number of persons with bail violations and the reasons for those violations. Second, data science techniques will be used to provide insightful visualizations and detect correlations among these categories. An understanding of these data will facilitate related future social justice projects in this jurisdiction, and will be applicable to other indictments and docket sheets from the PACER system at large.
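As a rough illustration of the summary statistics named above, the following Python sketch computes a few of them from already-extracted case records. It is a minimal sketch under stated assumptions, not the project's actual script: the record layout, the field names (`filed`, `closed`, `appointed_counsel`, `sentence_months`, `bail_granted`), and the sample values are all hypothetical and are not drawn from the PRACDL documents.

```python
from datetime import date
from statistics import mean

# Hypothetical, hand-made records. Field names and values are illustrative
# assumptions only; they do not come from the PACER documents.
cases = [
    {"filed": date(2016, 3, 1), "closed": date(2016, 3, 31),
     "appointed_counsel": True, "sentence_months": 24, "bail_granted": False},
    {"filed": date(2016, 5, 1), "closed": date(2016, 7, 30),
     "appointed_counsel": True, "sentence_months": 60, "bail_granted": True},
    {"filed": date(2017, 1, 1), "closed": date(2017, 3, 2),
     "appointed_counsel": False, "sentence_months": 12, "bail_granted": True},
    {"filed": date(2017, 6, 1), "closed": date(2017, 6, 21),
     "appointed_counsel": True, "sentence_months": 36, "bail_granted": False},
]


def summarize(records):
    """Return a few per-dataset summary statistics for a list of case records."""
    # Number of days each case was "open" (filing date to closing date).
    days_open = [(r["closed"] - r["filed"]).days for r in records]
    return {
        "mean_days_open": mean(days_open),
        # Share of persons represented by a court-appointed attorney, in percent.
        "pct_appointed_counsel":
            100 * sum(r["appointed_counsel"] for r in records) / len(records),
        "mean_sentence_months": mean(r["sentence_months"] for r in records),
        # Count of persons granted bail.
        "bail_granted": sum(r["bail_granted"] for r in records),
    }


if __name__ == "__main__":
    for name, value in summarize(cases).items():
        print(f"{name}: {value}")
```

In practice the hard part is the extraction step that produces such records from indictments and docket sheets; this sketch assumes that step has already been done and shows only the aggregation.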
