We want to know about unique reports. The key identifying value of a report is the local number. We asumed these to be unique but reports can be updated or even deleted over time. We noticed that when updates take place, another report with the same local number can appear in the downloads. this can be across multiple line listings so it is a bit hard to detect when using the EMA portal, but we discovered they are present. Besides that, reports are overlapping between substance line listings and product line listings. Our first course of action is to grab all local number's and isolate them in a table. We use a table to increase performance later on when we want to access unique reports again.
Because we do not want to polute the
import schema's, we create a new schema to create derived tables.
CREATE SCHEMA analyses;
First, we select all unique reports from the substance table and create a new table with the results.
CREATE TABLE analyses.icsr_temp AS SELECT local_number, max(report_date) AS report_date, max(country) AS country FROM import.substance GROUP BY local_number;
This should render
Then we insert the unique reports from the product table into the table generated in the previous query.
INSERT INTO analyses.icsr_temp SELECT local_number, max(report_date) AS report_date, max(country) AS country FROM import.product GROUP BY local_number;
Which should add another
3106773 into the
Then we copy the reports into a table that merges the unique reports from the substance table and the product table so it cannot contain double records.
CREATE TABLE analyses.icsr AS SELECT local_number, max(report_date) AS report_date FROM analyses.icsr_temp GROUP BY local_number;
This should render
We can now drop the temporary table.
DROP TABLE analyses.icsr_temp;
Let's see how much unique reports remain.
The final result is that we have
10210452 unique reports in the
EMA reports 25.3 million Case Safety Reports (that represent 15 million cases) in the following document.
21 March 2023, EMA/900566/2022, H-Division, 2022 Annual Report on EudraVigilance for the European Parliament, the Council and the Commission, Reporting period: 1 January to 31 December 2022.
What we found is that the resulting table contains
10210452 rows, approx. 10 million unique reports.
We can also count the number of unique reports per year running the following query.
SELECT extract(year from date_trunc('year', report_date)) AS "year", count(*) FROM analyses.icsr GROUP BY date_trunc('year', report_date);
This should render:
From this table we can see that the database starts to really receive reports in 2003, reaching a tenfold of reports from 2006, seeing a steep rise in 2008 and eventually reaching 100 times the number of reports from the early 2000's in 2021 declining again in 2022 and given our reports were downloaded in July, our careful estimate is that this decline will probably continue in 2023.