Chapter 5: Data Analysis

Learning objectives

  • What are the different ways of analysing quantitative data?
  • What kinds of questions can be answered through quantitative data analysis?
  • How can we develop effective data visualizations?
  • Which methods can be used to generate statistics of hard-to-reach migrant groups?

Summary

‘Analysis’ means different things to different actors: academics, practitioners, government officials and international civil servants often take different approaches to analysis depending on their needs. At the most basic level, quantitative analysis entails transforming data in order to produce descriptive statistics, which include measures of frequency, measures of central tendency and measures of dispersion. The objective of descriptive analysis is to summarize the data in meaningful ways in order to suggest potential patterns or trends. Exploratory analysis provides an overview of how the variables in the dataset are related, in order to reveal unanticipated connections and patterns. Correlation analysis evaluates the relationship between two variables in the dataset, but a correlation between variables does not automatically imply causality. Finally, experimental analysis establishes a causal relationship between two variables through a randomized intervention or manipulation by the researcher.
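
As a minimal sketch of the descriptive and correlation measures named above, the following Python example computes measures of frequency, central tendency and dispersion, plus a Pearson correlation coefficient, using only the standard library. The data values and the function names `describe` and `pearson_r` are hypothetical and chosen purely for illustration; they do not come from any real dataset.

```python
import math
import statistics
from collections import Counter

# Hypothetical ages of ten survey respondents (illustrative values only).
ages = [22, 25, 25, 31, 34, 34, 34, 40, 47, 58]


def describe(values):
    """Summarize a list of numbers with basic descriptive statistics."""
    return {
        "frequency": Counter(values),           # measure of frequency
        "mean": statistics.mean(values),        # measures of central tendency
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),     # measures of dispersion
        "stdev": statistics.stdev(values),      # sample standard deviation
    }


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


summary = describe(ages)

# Hypothetical paired observations with a perfectly linear relationship.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

A coefficient close to 1 or -1 indicates a strong linear association, but, as noted above, it says nothing by itself about whether one variable causes the other.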

Using visualizations such as charts, graphs and maps enhances data accessibility for a wide range of audiences, making it easier for both non-technical individuals and experts to understand complex patterns and trends. Examples include scatter plots, bar charts, pie charts, proportional symbol maps, and alluvial and chord diagrams. However, visualizations can also confuse readers if the message is not clear, or mislead them if the data are not accurately presented. Therefore, picking the appropriate visualization for the goal of the analysis is key. While visualizations are a powerful tool for exploring and communicating data patterns, one should avoid them if the data are highly dispersed, if there are too few or too many values, or if there is not enough variation.

Finally, methodologies have been developed to generate estimates for situations where data collection is limited or cannot reach ‘hidden’ migrant groups. For example, UNODC highlighted the potential of the Multiple Systems Estimation methodology (which is based on the capture-recapture method) for estimating the number of undetected victims of Trafficking in Persons (TiP) in a country. The US Census Bureau uses the residual method to estimate foreign-born and native-born emigration during the intercensal period. The Pew Research Center calculates the difference between the stock of migrants in survey and census data and the stock of registered international migrants to generate estimates of irregular migration.
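
The capture-recapture idea underlying Multiple Systems Estimation can be illustrated with its simplest two-list form. The sketch below is not the UNODC procedure, which combines several lists and typically relies on statistical modelling; it only shows the classic Lincoln-Petersen estimator and Chapman's bias-corrected variant, assuming two independent lists and a closed population. All counts and the function names are hypothetical.

```python
def lincoln_petersen(n1, n2, m):
    """Estimate total population size from two overlapping lists.

    n1: individuals recorded on the first list
    n2: individuals recorded on the second list
    m:  individuals appearing on both lists
    """
    if m <= 0:
        raise ValueError("the two lists must overlap (m > 0)")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's bias-corrected version of the Lincoln-Petersen estimator."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical counts: 200 cases known to one agency, 150 known to another,
# and 30 cases recorded by both.
n1, n2, m = 200, 150, 30

estimated_total = lincoln_petersen(n1, n2, m)   # 200 * 150 / 30 = 1000.0
observed = n1 + n2 - m                          # distinct recorded cases: 320
estimated_hidden = estimated_total - observed   # cases on neither list: 680.0
corrected_total = chapman(n1, n2, m)
```

The smaller the overlap between the lists relative to their sizes, the larger the estimated hidden population. In practice the independence assumption rarely holds exactly, which is one reason approaches based on more than two lists are preferred.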