"The UN plays a very important role in ensuring that the data revolution for sustainable development is inclusive, not only by incorporating big data and analytics into planning and decision making, but also by working with governments, policy leaders, and the broader international community to address the gaps in women's access to ICT [information and communication technology] and other tools and activities that generate new sources of data."
This United Nations Entity for Gender Equality and the Empowerment of Women (UN Women) report outlines the value of big data (organic, unstructured data) for monitoring the Sustainable Development Goals (SDGs) in relation to women. It provides background context on how big data can be used to facilitate and assess progress towards the SDGs, and focuses in particular on SDG 5: "Achieve gender equality and empower all women and girls". The report presents the benefits (for example, real-time data), risks (for example, elite capture and privacy), and policy implications (for example, how big data can be incorporated into project cycles from planning to evaluation) of using big data to improve the lives of women and girls. Finally, it identifies concrete data innovation projects from across the development sector that have considered the gender dimension.
Research methods included a literature review focused on big data and gender, interviews with colleagues from UN Women and UN Global Pulse, interviews with individuals and organisations working in the field of big data (LIRNEAsia, International Development Research Centre (IDRC), World Wide Web Foundation, and the University of Southern California), and answers to a short questionnaire posed to six UN Women country offices.
Traditional data (e.g., household surveys, institutional records, or censuses) are often collected with a specific intention, following a structured format, and with valid and reliable measurements. In contrast, big data generally:
- are produced as a byproduct of people's digital behaviour;
- are not gathered in a way guided by a research question, and therefore require interpretation after the event;
- imperfectly match the entire universe of cases. Representativeness is generally not a factor in big data collection, as there is no sampling strategy involved, and non-coverage is often a concern when assessing data quality;
- are often accessible in real time (at the time that the data are produced). However, data analytics may require some days or weeks;
- can be analysed by combining different data sources; multiple datasets can yield new insights, validate indicators, or help with triangulation of data sources;
- can be harnessed to improve decision-making. Appropriate guidance and frameworks can help to translate insights from big data into value for organisations, governments, or businesses.
Sources such as social media trails, call data records, radio data, and satellite imagery - both alone and combined with traditional data sources - can shed light on the lives of women and girls, UN Women explains. Big data analytics can facilitate:
- real-time situational awareness;
- the ability to "shine a light on the invisible", by improving information on the lives of women and girls;
- new information on mobility, social interactions, sentiment and cultural beliefs, and economic activity;
- early warning of emerging issues and crises;
- improved understanding of community well-being;
- understanding of both local impacts and larger geographic patterns;
- identification of trends and correlations within and across large datasets that would otherwise be unknown;
- data visualisation for more nuanced and accessible insights;
- opportunities for participatory monitoring, real-time feedback, and learning loops;
- the ability to recalibrate and iterate within the implementation of a programme; and
- improvements in accountability and transparency.
According to UN Women, there are several issues to consider in using big data to foster a more gender-responsive data revolution for sustainable development. These include:
- Addressing the gender gaps in "traditional" data - including areas where women's activities, women's needs, women's interests, and threats women face are largely invisible - to get a richer, more nuanced understanding of gender equality and women's empowerment issues.
- Real-time monitoring of gender indicators and progress (or women's perceptions on progress) on gender equality across the SDGs, including but also extending beyond SDG 5.
- Understanding the social norms and political realities around gender equality and women's empowerment in order to effectively interpret big data analytics. For example, what women are comfortable saying online may not reflect their actual opinions. In addition, women may face digital-world threats that prevent them from feeling comfortable expressing themselves online, such as issues of privacy and security or the risk of online harassment and abuse.
- Addressing the gaps in women's access to information and communication technologies (ICTs) and other technologies that generate data. The gender gap in internet access is 11.6% globally and 32.9% in Least Developed Countries (LDCs) (ITU, 2017). Moreover, there is a gender gap in the degree of sophistication of use, as well as in the degree of control that women and men typically have over these resources and tools.
- Improving the development community's understanding of methodological issues in gender-focused studies that limit generalisability, such as self-selection of cases and the context in which the data are produced.
- Understanding and properly addressing the potential risks of harm that may result from the use of data, even in its de-identified form. Many privacy-related regulations treat gender by itself as sensitive information, because knowing or being able to identify a person's gender, even without other demographic information, may lead to discrimination against that individual.
As the report elucidates, big data can address multiple types of research questions, including those that are descriptive, predictive, and prescriptive. Roughly a quarter of all SDG indicators (53 out of 230) explicitly or implicitly address gender equality. Of the 14 proposed indicators to monitor SDG 5, specifically, there are only three for which internationally accepted standards for measurement exist and for which data are regularly collected by most countries (referred to as Tier I indicators). Of the remaining 11 indicators, five have internationally accepted standards, but data collection by most countries is largely irregular (referred to as Tier II indicators). For the remaining six (referred to as Tier III indicators), international standards do not yet exist, and most countries do not regularly collect the data. UN Women explains that big data can have a role to play for all three tiers. For example, for Tier II indicators, big data can help with indicator estimation via triangulation with other data sources.
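The indicator counts in this paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures quoted above; the variable names are illustrative and do not come from the report.

```python
# Figures as quoted in the text; variable names are illustrative only.
total_sdg_indicators = 230
gender_related_indicators = 53

# SDG 5 indicators by tier, as described in the text.
sdg5_tiers = {"Tier I": 3, "Tier II": 5, "Tier III": 6}

# Share of all SDG indicators that address gender equality.
gender_share = gender_related_indicators / total_sdg_indicators

# The three tiers should add up to the 14 proposed SDG 5 indicators.
sdg5_total = sum(sdg5_tiers.values())

print(f"Gender-related share of SDG indicators: {gender_share:.1%}")
print(f"SDG 5 indicators across all tiers: {sdg5_total}")
for tier, count in sdg5_tiers.items():
    print(f"{tier}: {count} of {sdg5_total} ({count / sdg5_total:.0%})")
```

The share comes out at just under a quarter, consistent with the "roughly a quarter" stated in the text, and the three tiers sum to the 14 proposed indicators.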
However, there are areas for caution in assessing policy issues related to big data and gender. UN Women stresses that a balance needs to be struck between protecting privacy and maximising the utility of big data for safeguarding civil rights, ensuring fairness, and preventing discrimination. This implies that data teams need to understand how insights drawn from the data may impact people's lives, and they must be thoughtful and responsible in the ways they present any conclusions to responders, civil society leaders, or policymakers. Further, there remain concerns that careless interpretation of big data might over-represent those who are capable of producing digital trails, while leaving out those who do not have access to technology, are not online, or prefer not to engage.
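The over-representation concern can be made concrete with a small numerical sketch. Every number below is hypothetical and chosen only to show the mechanism; none comes from the report. The reweighting step is a simple post-stratification adjustment, which is not a method the report describes, and it rests on the strong assumption that people who are online hold the same views as offline members of the same group, an assumption the report's own cautions suggest may not hold.

```python
# All numbers are hypothetical and serve only to illustrate the mechanism.
population_share = {"women": 0.5, "men": 0.5}
online_rate = {"women": 0.3, "men": 0.6}    # share of each group leaving digital trails
support_rate = {"women": 0.8, "men": 0.4}   # share of each group holding some view

# Composition of the observed (online-only) sample.
online_mass = {g: population_share[g] * online_rate[g] for g in population_share}
total_online = sum(online_mass.values())
sample_share = {g: online_mass[g] / total_online for g in online_mass}

# Naive estimate: treat the online sample as if it were the whole population.
naive = sum(sample_share[g] * support_rate[g] for g in sample_share)

# Reweighted estimate: weight each group back to its population share.
# Assumes online and offline members of a group hold the same views.
reweighted = sum(population_share[g] * support_rate[g] for g in population_share)

print(f"Women in online sample: {sample_share['women']:.1%}")
print(f"Naive estimate: {naive:.3f}")
print(f"Reweighted estimate: {reweighted:.3f}")
```

In this toy setup, women make up half of the population but only a third of the online sample, so the naive estimate understates a view that women hold more often than men.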
The report ends with a compendium of gender-related big data projects and their relevance to the SDGs. The projects listed include examples of big data for sustainable development projects sourced from UN Global Pulse, the UN Global Working Group on Big Data for Official Statistics, the UNECE/Sandbox, Data2X, the NYU Governance Lab, the Data Science for Social Good programme at the University of Chicago, Flowminder, the United Nations Children's Fund (UNICEF), and the World Bank Group. Projects related to SDG 5 focus on overall gender discrimination (5.1), gender-based violence (5.2), early marriage and female genital mutilation (FGM) (5.3), and sexual and reproductive decision-making (5.6). Other projects are relevant to the gender dimensions of other SDGs, and still others focus more on methods and tools that could be applicable to understanding the gender dimensions of several SDGs (for example, a Twitter tool to label gender, or a tool to analyse speech on radio broadcasts in local languages).
UN Women website, February 9 2018. Image credit: Copyright UN Women