ISSS608 2016-17 T3 Group5 Immigration Methodology

From Visual Analytics and Applications

Group 5 - Why Did the Migrant Cross the Road?

Design Framework

A detailed description of the design principles used and the data visualisation elements built.

Addressing the analytic gap

Currently, we observe that attempts at visualising migration flows tend to be more descriptive than analytic: they focus on presenting the directions and volumes of migration flows rather than offering any analytical link that explains them. This tendency seems to stem partly from the complexity of migration as a phenomenon, which makes its flows difficult to explain at a general level. Ernest Ravenstein, the grandfather of migration theory, theorised migration as being caused by push-pull factors, where “push” factors in the origin country propel people to emigrate and “pull” factors in the destination country attract people to immigrate. Most contemporary approaches to migration, be it neoclassical economics, segmented labour market theory, or the world systems approach, have not fundamentally departed from Ravenstein’s thesis. However, as the reasons why people migrate become more varied, a comprehensive theory of migration becomes harder to formulate. Hence, most studies of migration that move beyond a descriptive level tend to address the effects and implications of migration rather than its causes. Moreover, as scholars and policy makers turn to more context-focused approaches and “middle-level” theoretical explanations, they tend to lose sight of how variables at the structural level, such as macro-economic indicators, political regime, sociocultural factors and other country-level attributes, could play a part in influencing migration patterns.

Our ISSS608 project aims to address this gap by moving beyond current visualisations and creating a dashboard application that provides an additional analytic lens. We return to Ravenstein’s theory of migration by looking at how country attributes influence migration flows. To provide a more holistic overview, we decided to combine economic data from the World Bank, data on political regime characteristics from the Polity IV database, and sociocultural measures from Geert Hofstede’s 6 Cultural Dimensions to create a general profile for each country. To study migration flows, we decided upon the World Bank Global Bilateral Migration dataset. The dashboard application allows the user to explore not only the migration flows but also the push and pull factors behind them, by visualising the various country attributes as differential levels that may correlate with the magnitude and degree of migration.

Data Description

Migration data (Bilateral migration data)

The migration data records the number of people moving from each country to each other country in each period, and contains migration flows for 232 countries. The flows are extracted from an origin-destination matrix of countries.

The dataset provides estimates of bilateral migration flows at region and country levels for four 5-year periods (mid-year to mid-year): 1990-1995, 1995-2000, 2000-2005 and 2005-2010. The estimates capture the number of people who change their country of residence over each 5-year period.
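As an illustration of how flows can be extracted from an origin-destination matrix, the following minimal sketch flattens a small matrix into one record per directed country pair. It is written in Python with the standard library only and is not the project's R code; the country labels, the numbers and the function name `matrix_to_flows` are made up for the example.

```python
# Hypothetical origin-destination matrix for one 5-year period.
# Rows are origins, columns are destinations; all numbers are invented.
countries = ["A", "B", "C"]
od_matrix = [
    [0, 120, 30],
    [45, 0, 10],
    [5, 60, 0],
]


def matrix_to_flows(names, matrix, period):
    """Flatten an origin-destination matrix into one record per directed flow."""
    flows = []
    for i, origin in enumerate(names):
        for j, destination in enumerate(names):
            if i == j:
                continue  # the diagonal is not a migration flow
            flows.append({
                "period": period,
                "origin": origin,
                "destination": destination,
                "flow": matrix[i][j],
            })
    return flows


flows = matrix_to_flows(countries, od_matrix, "2005-2010")
for record in flows[:3]:
    print(record)
```

A long format like this, with one row per origin, destination and period, is usually easier to filter by region and to join with country attributes than the matrix itself.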


Polity data (Polity IV dataset)

The Polity IV dataset covers all major, independent states in the global system over the period 1800-2015 (i.e., states with a total population of 500,000 or more in the most recent year; currently 167 countries).


World Bank data (World Development Indicators)

The World Development Indicators describe attributes of individual countries from 1985 to 2010. We include 69 quantitative attributes such as GDP, fertility, CO2 emissions, inflation, health expenditure and unemployment. These attributes are used in the subsequent analysis in our project.


Hofstede's cultural dimensions data

Hofstede's cultural dimensions theory conceptualises national culture as comprising 6 dimensions: the Power Distance Index (PDI), Individualism versus Collectivism (IDV), Masculinity versus Femininity (MAS), the Uncertainty Avoidance Index (UAI), Long Term Orientation versus Short Term Normative Orientation (LTO) and Indulgence versus Restraint (IND).

Culture.png
  • The PDI score measures the degree to which less powerful members of society accept the unequal way in which power is distributed. A high PDI score suggests that power differences are accepted as the norm, while a low PDI score suggests that people focus more on social justice issues and strive to equalise how power is distributed.
  • IDV scores measure how far a culture of individualism permeates society. A high IDV score suggests that societal values are more individualistic and self-image is defined by “I” rather than “we”. A low IDV score suggests the opposite: societies have a more collectivist outlook and practise more communitarian values.
  • The MAS dimension represents whether success in society is defined through a culture of “achievement, heroism, assertiveness and material rewards” (high score; “masculine”) or whether society instead prefers “cooperation, modesty, caring for the weak, and quality of life” (low score; “feminine”).
  • UAI measures risk aversion and how uncomfortable societies feel when faced with uncertainty. When UAI is high, societal values manifest in a low tolerance of unorthodoxy and contestation, and strict codes of behaviour are enforced. In low UAI countries, practice and what works count for more, and people are more flexible and relaxed about what the proper way of doing things is.
  • LTO is related and somewhat similar to UAI; it measures how societies deal with their own historicity and manage challenges in the present and the future. Countries with low LTO tend to be more suspicious of societal change, while countries with high LTO tend to prefer more pragmatic approaches to dealing with the future.
  • IND scores measure how societies deal with basic gratification needs. High IND societies are more predisposed to having fun and enjoying life, while low IND societies delay gratification and regulate it through strict social norms and codes.

The data collected for Hofstede’s Cultural Dimensions dataset (https://geert-hofstede.com/national-culture.html) originate from different periods. The first scores from more than 70 countries – of which only 40 were used – were collated between 1967 and 1973. Later editions expanded the range of countries and the current third edition covers a total of 111 countries. However, Hofstede argues that as “culture changes very slowly, the scores can be considered up to date”.

Design Principles

Having reviewed past visualisations of migration, we distilled several design principles. Our design was driven by the following considerations:

  • For Data Visualisation:
      1. We needed a way to represent all the migration flows between source and destination countries.
      2. We needed a way to compare the attributes of source and destination countries.
      3. We needed a way to let users compare the 5-yearly migration data with the annual data on country attributes.

  • For User Interface Design:
      1. To allow the user to explore creatively, within a sandbox environment, the relationships between country attributes and migration flows.
      2. To embed controls within the user interface that allow the user to explore without being overwhelmed by too many country pairs or attributes.
      3. To provide some form of prior analysis or recommendation system that reduces the problem of multidimensionality in the country attributes and gives users a better sense of which attributes to study.
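The mismatch between 5-yearly migration data and annual attribute data noted in the list above can be handled in several ways. One possible approach, shown here as a minimal Python sketch rather than as the method used in the dashboard, is to average each annual attribute over the years of a migration period. The values, the function name `period_average` and the choice to count a period as its start year up to, but not including, its end year are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical annual values of one attribute for one country (invented numbers).
annual_values = {
    1990: 2.0, 1991: 3.0, 1992: 4.0, 1993: 1.0, 1994: 5.0,
    1995: 6.0, 1996: 4.0,
}

# The four 5-year migration periods named in the data description.
periods = [(1990, 1995), (1995, 2000), (2000, 2005), (2005, 2010)]


def period_average(annual, start, end):
    """Mean of the annual values with start <= year < end, or None if no data."""
    values = [value for year, value in annual.items() if start <= year < end]
    return mean(values) if values else None


for start, end in periods:
    print(f"{start}-{end}", period_average(annual_values, start, end))
```

Averaging smooths out single-year spikes but hides within-period change; picking a single reference year per period is an equally simple alternative.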


We developed an analytical framework to explore the determinants of migration, i.e. how the attributes of source and destination countries relate to the in-migration and out-migration rates of these countries. We decided to build our data visualisation dashboard using RStudio and R Shiny, as R is a flexible and powerful language that is good at data manipulation and has many packages for data visualisation. Using R/Shiny and other packages, we attempted to integrate bilateral migration flow data with data describing the characteristics of both source and destination countries, drawing on the World Bank, the Polity IV dataset, and measures from Hofstede’s cultural dimensions theory. This culminated in the design of an analytical dashboard that allows users to perform exploratory data analysis to aid policy and academic research on migration.
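To make the integration step concrete, the sketch below joins flow records with the attributes of their origin and destination countries and computes a differential (destination value minus origin value) for each attribute. It is an illustrative Python example, not the project's R/Shiny code; the attribute names, values and the helper `add_differentials` are hypothetical.

```python
# Hypothetical flow records and country attributes (all values invented).
flows = [
    {"origin": "A", "destination": "B", "flow": 120},
    {"origin": "B", "destination": "A", "flow": 45},
    {"origin": "A", "destination": "Z", "flow": 7},  # "Z" has no attribute data
]
attributes = {
    "A": {"gdp_per_capita": 1000.0, "polity_score": -3},
    "B": {"gdp_per_capita": 5000.0, "polity_score": 8},
}


def add_differentials(flow_records, country_attributes, names):
    """Attach destination-minus-origin differentials to each flow record.

    Flows whose origin or destination lacks attribute data are skipped.
    """
    enriched = []
    for record in flow_records:
        origin = country_attributes.get(record["origin"])
        destination = country_attributes.get(record["destination"])
        if origin is None or destination is None:
            continue
        row = dict(record)  # copy so the input records are left unchanged
        for name in names:
            row[name + "_diff"] = destination[name] - origin[name]
        enriched.append(row)
    return enriched


enriched = add_differentials(flows, attributes, ["gdp_per_capita", "polity_score"])
for row in enriched:
    print(row)
```

Under the push-pull reading, a positive differential means the destination scores higher on that attribute than the origin, which is the kind of quantity that can then be compared with the size of the flow.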


Given their utility as observed in our review of past work, we decided to represent the migration flows using a chord diagram and the country attributes using a parallel coordinate plot. We also wanted to create an automated variable selection function that recommends the most important variables, so that the user does not need to sift through more than a hundred attributes before finding one of interest. In addition, we decided to provide a series of controls that let the user choose the origin and destination countries by specifying their respective regions. That way, the user can better manage which countries are represented while exploring. At the same time, loading only part of the dataset reduces processing overheads, especially on slower computers. We also wanted users to be able to toggle between different time periods for both migration flows and country attributes, so that they can explore and compare whether the attributes in different time periods are related to the migration flows. Finally, we wanted to provide a partial regression analysis function so that the user can understand, based on the attributes selected, the effect that the attribute(s) in question had on the migration flows. The process flow of our app is presented below:

                             Process.jpeg
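The text above does not specify how the automated variable selection ranks attributes. As a simple stand-in, the following Python sketch ranks attribute differentials by the absolute value of their Pearson correlation with the flow size. This is an assumption made for illustration, not the method used in the app; the data and the names `pearson` and `rank_attributes` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rank_attributes(flow_sizes, attribute_columns):
    """Return (name, correlation) pairs sorted by absolute correlation, largest first."""
    scores = [(name, pearson(values, flow_sizes))
              for name, values in attribute_columns.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Invented data: four country pairs, two attribute differentials per pair.
flow_sizes = [10, 20, 30, 40]
attribute_columns = {
    "attr_x_diff": [4, 1, 3, 2],
    "attr_y_diff": [1, 2, 3, 4],
}

ranking = rank_attributes(flow_sizes, attribute_columns)
for name, r in ranking:
    print(name, round(r, 2))
```

A correlation ranking looks at each attribute in isolation, so it cannot account for attributes that overlap with one another; a regression on several attributes at once, as in the partial regression function mentioned above, addresses that at the cost of more computation.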