Line listings are defined as follows: "The data and information provided at substance level is in the form of an electronic Reaction Monitoring Report (eRMR) containing aggregated data and a line listing with details of the individual cases." (see: glossary).
At first, we followed the links on the web pages and clicked every download individually. We then renamed all downloaded files and used automated routines to process them into a database for further investigation. Processing the files became easy, but downloading them required a lot of manual labour. We realised that, because of the large number of available ICSRs, this would probably take years.
The error reported was often that a maximum number of records had been reached or that a timeout had occurred. So we wanted to find a way to download smaller line listings which would not generate an error. To achieve this, we inspected the available query parameters and tested various ways of setting or unsetting them. We managed to use some of them to filter the result; for instance, we could download a single year by setting the parameter "Line Listing Objects"."Gateway Year" "eq" "2020", which indeed resulted in records for the year 2020 only. We kept at this until we had defined a couple of interchangeable filters that were usable for both product and substance line listings and that kept us within the maximum number of records. Our scripts, written in Python 3, are able to download line listings as they are available via the portal, using the filter options (which the portal also offers as select options in visual elements), for further automated processing on a personal computer.
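The per-year splitting described above can be sketched as follows. This is a minimal illustration, not the scripts mentioned in the text: only the column name, the "eq" operator and the example year come from the description above. The `Filter` class, `year_filters`, `download_in_chunks`, the `fetch` callable and the `max_records` value are hypothetical names introduced for the example, and no request to the portal is made.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

# Column name as quoted in the text; all other names here are hypothetical.
GATEWAY_YEAR = '"Line Listing Objects"."Gateway Year"'


@dataclass(frozen=True)
class Filter:
    """One query filter: column, operator, value."""
    column: str
    operator: str
    value: str


def year_filters(first_year: int, last_year: int) -> List[Filter]:
    """Build one 'Gateway Year eq <year>' filter per year (inclusive range)."""
    return [Filter(GATEWAY_YEAR, "eq", str(y)) for y in range(first_year, last_year + 1)]


def download_in_chunks(
    filters: Iterable[Filter],
    fetch: Callable[[Filter], Sequence],
    max_records: int,
) -> Iterator[Tuple[Filter, Sequence]]:
    """Fetch one chunk per filter and yield (filter, rows).

    `fetch` is an assumed callable that returns the rows for one filter.
    A chunk that reaches `max_records` may be truncated, so it is reported
    as an error instead of being silently accepted.
    """
    for flt in filters:
        rows = fetch(flt)
        if len(rows) >= max_records:
            raise RuntimeError(f"{flt.value}: record limit reached, use a narrower filter")
        yield flt, rows
```

With a stand-in such as `fetch = lambda flt: []`, calling `list(download_in_chunks(year_filters(2019, 2021), fetch, max_records=1000))` yields three (filter, rows) pairs, one per year. Raising on a full chunk is a design choice for the sketch: it makes it obvious when a filter needs to be combined with a second one to stay under the limit.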
We used two different scripts for the download.
- One script parses the web pages that show maps and tables with the number of cases per product or substance per country. This script generated the CSV files available in the file EMA-countries-2023-06-20.zip; these files were uploaded into the database’s public schema as the tables
- Another script parses the web pages that generate “line listings”, resulting in CSV files containing detailed information about ICSRs. The results are the files available in the files
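The step of loading CSV files from an archive into database tables, as mentioned in the list above, might look roughly like the sketch below. It is an assumption-laden illustration, not the routine actually used: it swaps in SQLite from the Python standard library instead of the database described in the text, stores every column as text, and derives each table name from the CSV file name. The function names `load_zip_csvs` and `quote_ident` are hypothetical.

```python
import csv
import io
import sqlite3
import zipfile


def quote_ident(name: str) -> str:
    """Quote a table or column name for use in an SQL statement."""
    return '"' + name.replace('"', '""') + '"'


def load_zip_csvs(zip_source, conn: sqlite3.Connection) -> dict:
    """Load every CSV in a zip archive into its own table.

    Returns a mapping of table name to number of rows inserted.
    Assumption: each CSV has a header row and is UTF-8 encoded.
    """
    counts = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue
            # Table name = file name without directory and extension.
            table = name.rsplit("/", 1)[-1][:-4]
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                reader = csv.reader(text)
                header = next(reader, None)
                if header is None:
                    continue  # empty file, nothing to load
                columns = ", ".join(f"{quote_ident(c)} TEXT" for c in header)
                conn.execute(f"CREATE TABLE IF NOT EXISTS {quote_ident(table)} ({columns})")
                marks = ", ".join("?" for _ in header)
                # Skip malformed rows whose width differs from the header.
                rows = [r for r in reader if len(r) == len(header)]
                conn.executemany(f"INSERT INTO {quote_ident(table)} VALUES ({marks})", rows)
                counts[table] = len(rows)
    conn.commit()
    return counts
```

A call such as `load_zip_csvs("EMA-countries-2023-06-20.zip", sqlite3.connect("ema.db"))` would create one table per CSV file in the archive; the database file name `ema.db` is made up for the example.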
All scripts are written in Python 3 and are available on request. We chose not to publish the scripts used for downloading line listings, but to share them on request, so that people running them know what they are doing; we do not want to put undue load on the EMA web application.