General Issues

Understanding Broad Data Quality Concerns in EM-DAT

Three types of data quality issues can be considered:

A cross-comparison of EM-DAT with a local database and/or a disaster-specific database can help identify Issue 1 (e.g., Koç & Thieken, 2018; Lin et al., 2021). For an account of missing values for existing events, we refer to Jones et al. (2021) and the section on Accounting Biases. Issue 3 is partially related to the data collection sources, protocols, or reporting systems generally used by different databases.
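As an illustration only, the cross-comparison described above can be sketched as a simple matching exercise. The record layout (country code, hazard type, start date), the placeholder values, the `find_unmatched` helper, and the three-day tolerance are all assumptions made for this sketch; they do not reflect the EM-DAT schema or the method used in the cited studies.

```python
from datetime import date

# Made-up placeholder records: (country code, hazard type, start date).
# "XYZ" is not a real country code and the dates are not real events.
emdat_events = [
    ("XYZ", "Flood", date(2009, 9, 7)),
    ("XYZ", "Earthquake", date(2011, 10, 23)),
]
local_events = [
    ("XYZ", "Flood", date(2009, 9, 8)),
    ("XYZ", "Flood", date(2010, 6, 2)),
    ("XYZ", "Earthquake", date(2011, 10, 23)),
]


def find_unmatched(reference, candidates, tolerance_days=3):
    """Return events in `reference` that have no counterpart in `candidates`.

    Two events count as the same when country and hazard type are equal and
    their start dates differ by at most `tolerance_days` (an assumed value).
    """
    unmatched = []
    for iso, hazard, start in reference:
        has_match = any(
            iso == c_iso
            and hazard == c_hazard
            and abs((start - c_start).days) <= tolerance_days
            for c_iso, c_hazard, c_start in candidates
        )
        if not has_match:
            unmatched.append((iso, hazard, start))
    return unmatched


# Events present in the local database but absent from the EM-DAT extract.
missing_from_emdat = find_unmatched(local_events, emdat_events)
for event in missing_from_emdat:
    print(event)
```

The date tolerance is a deliberate choice in this sketch: since sources may define the beginning of an event differently, requiring identical start dates would likely overstate the number of missing events. In practice, any candidate flagged this way would still need manual checking.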

Data quality issues within EM-DAT are related to the data collection protocols from dedicated sources. EM-DAT’s completeness reflects the coverage of its sources. Since source reporting has improved over the years, EM-DAT data coverage has improved significantly over the last 30 to 40 years. Nevertheless, gaps and quality issues remain. EM-DAT protocols are meant to guide the way information is monitored and collected from sources. However, no universally applied protocol ensures that different sources report disaster impact and losses using the same guidelines to define, for instance:

  • the beginning and end of disaster events.
  • the geographical footprint of a disaster.
  • impact variables such as deaths (in particular, when computed based on excess mortality), affected people, or economic costs.
  • the disaster type selected by the sources.

Some references illustrate the issues and challenges related to collecting and maintaining a disaster database, e.g., Guha-Sapir & Misson (1992), Kron et al. (2012), and Wirtz et al. (2014).

To some extent, EM-DAT owes its popularity to its simplicity: it reports disaster events as rows in an Excel table. However, this simplicity comes at the cost of conceptual limitations in dealing with complex and compound events and situations. In such cases, as exemplified in the box below, EM-DAT will probably report the disaster in the same way as the source that reported it. The EM-DAT database manager can only choose to retain some figures over others (see Daily Encoding). No model is applied to correct for differences in reporting protocols because this task goes beyond the information monitoring conducted at the CRED by the EM-DAT team.

Gall et al. (2009) generally referred to such biases, which result from differences in impact reporting systems, as systemic biases. Some studies point to systemic biases by highlighting that EM-DAT does not correlate well with other databases (e.g., Moriyama et al., 2018; Panwar & Sen, 2020). In their article, Gall et al. (2009) cover four other types of biases: time, hazard-related, spatial, and accounting biases. These are illustrated in the next sections.

