Criminal History Record Assessment & Research Program
Problem
Studying recidivism is costly and inefficient.
For years, the Bureau of Justice Statistics (BJS) has used information stored in the nation’s automated criminal history records to assess the officially recognized, law-violating behavior of various samples of individuals. To conduct recidivism studies, BJS provided state criminal history repositories with identifying information on study subjects and asked each participating repository to extract selected information on each subject’s criminal justice activities. Because the structure and content of the extracted data vary from state to state, transforming each state’s data into a commonly formatted, researchable database required a significant amount of manual review and coding.
Solution
State-of-the-art rule system and natural language processing.
NORC developed a state-of-the-art rule-processing system (UI) that standardizes criminal history records, coding them by offense type, charge severity, agency type, and sentence length, among other attributes. The system supports rules that standardize offenses as well as mathematical formulas that calculate sentence lengths. The UI also performs sophisticated error checking and evaluation to help ensure accuracy.
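To make the idea of offense rules and sentence-length formulas concrete, here is a minimal Python sketch. It is not NORC's system: the rule patterns, the offense labels, the `Sentence` fields, and the 365-day-year and 30-day-month conversion factors are all hypothetical assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical rules: each pairs a pattern over raw state offense text
# with a standardized offense label. The first matching rule wins.
OFFENSE_RULES = [
    (re.compile(r"\b(burg|b&e|breaking and entering)", re.I), "BURGLARY"),
    (re.compile(r"\b(dui|dwi|driving under)", re.I), "DUI"),
    (re.compile(r"\b(aslt|assault)", re.I), "ASSAULT"),
]


def standardize_offense(raw_text):
    """Return the standardized label for raw offense text, or None if no rule matches."""
    for pattern, label in OFFENSE_RULES:
        if pattern.search(raw_text):
            return label
    return None


@dataclass
class Sentence:
    """A sentence as reported by a state, split into its components (hypothetical layout)."""
    years: int = 0
    months: int = 0
    days: int = 0


def sentence_days(sentence):
    """Convert a sentence to days, assuming 365-day years and 30-day months."""
    parts = (sentence.years, sentence.months, sentence.days)
    # Simple error check: reject impossible negative components.
    if any(p < 0 for p in parts):
        raise ValueError(f"negative sentence component in {sentence}")
    return sentence.years * 365 + sentence.months * 30 + sentence.days
```

For example, `standardize_offense("BURG 2ND DEG")` returns `"BURGLARY"`, and `sentence_days(Sentence(years=1, months=6))` returns `545` under the stated conversion assumptions. Records that match no rule return `None`, which a real workflow would presumably route to manual review.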
NORC also uses natural language processing (NLP) to predict offense codes. Using the Natural Language Toolkit (NLTK) and scikit-learn, NORC developed an expert system that combines artificial intelligence and text analytics methods to code offenses faster and more accurately. The expert system uses NLTK to process the extensive set of offense crosswalks that NORC developed under the CCHRRD award into ground-truth datasets, which scikit-learn then turns into predictive models using a variety of the machine learning algorithms available in the library.
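The sketch below shows one way such a pipeline could be wired together in Python. It is an assumption-laden illustration, not NORC's implementation: the tiny `CROSSWALK` sample, the offense codes, the helper names, and the choice of TF-IDF features with logistic regression are all hypothetical. NLTK handles tokenization and stemming, and scikit-learn fits the classifier.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical crosswalk sample: (raw state offense text, standardized offense code).
CROSSWALK = [
    ("BURGLARY 2ND DEGREE", "BURGLARY"),
    ("BURGLARY OF A DWELLING", "BURGLARY"),
    ("RESIDENTIAL BURGLARY", "BURGLARY"),
    ("SIMPLE ASSAULT", "ASSAULT"),
    ("AGGRAVATED ASSAULT WITH WEAPON", "ASSAULT"),
    ("ASSAULT ON AN OFFICER", "ASSAULT"),
    ("DRIVING UNDER THE INFLUENCE", "DUI"),
    ("DUI ALCOHOL", "DUI"),
    ("DRIVING UNDER INFLUENCE OF DRUGS", "DUI"),
]

_stemmer = PorterStemmer()


def normalize(text):
    """Lowercase, tokenize, and stem offense text with NLTK; drop punctuation tokens."""
    tokens = wordpunct_tokenize(text.lower())
    return " ".join(_stemmer.stem(t) for t in tokens if t.isalnum())


def train_offense_model(crosswalk):
    """Fit a TF-IDF + logistic regression classifier on normalized crosswalk text."""
    texts = [normalize(raw) for raw, _ in crosswalk]
    labels = [code for _, code in crosswalk]
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(texts, labels)
    return model


def predict_offense(model, raw_text):
    """Predict the standardized offense code for one raw offense description."""
    return str(model.predict([normalize(raw_text)])[0])


model = train_offense_model(CROSSWALK)
```

Calling `predict_offense(model, "Burglary 1st degree")` then yields a predicted code for text the model has not seen verbatim. A production system would need a far larger ground-truth set and a held-out evaluation before its predictions could be trusted.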
Result
NORC’s system reduces research costs and increases efficiency.
By integrating these features into a comprehensive system, NORC’s software can achieve high predictive accuracy when converting a diverse range of state offense and sentencing variables, making the conversion process considerably more efficient and accurate. The Criminal History Record Assessment and Research Program (CHRARP) enhances the Department of Justice’s research capabilities and allows BJS to conduct recidivism and criminal career research far more cost-efficiently and, as a result, more frequently.