Development of a child sexual abuse conversation (CSAC) dataset

Mark Warner

EPSRC sub-award (REPHRAIN)

Durham Constabulary and Northumbria Police

This project will advance our understanding of how perpetrators of child sexual grooming engage with young people online through computer-mediated communication tools and platforms (e.g., Facebook, WhatsApp). We will work with Durham Constabulary and Northumbria Police to develop a child sexual abuse conversation (CSAC) dataset, drawing the data from forensic evidence extracts of mobile and computer devices legitimately seized as part of criminal investigations into perpetrators of child sex offences. Our work will consist of identifying, acquiring, sanitising, and anonymising the data needed to build the dataset. This will lay the foundations for research into online grooming behaviours and, through future labelling of the dataset for language modelling, for the development of automated grooming detection tools.