Curated list of research and data on gender and social network analysis
3. Data on gender for social network analysis
Data on gender come from many sources, which vary in how well they suit social network analysis.
To conduct social network analysis, researchers require relational data that map interactions, ties, and relationships between individuals or entities (Hanneman & Riddle, 2005).
In the case of social media datasets, hashtags function as relational links between users, while tweets, retweets, and replies create ties that can be analyzed.
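As a minimal sketch of how such ties might be extracted, the Python snippet below builds a directed edge list from retweets and replies, and an undirected edge list between users who share a hashtag. The field names (`user`, `retweet_of`, `reply_to`, `hashtags`) and the sample records are hypothetical; they are not taken from any particular platform API or dataset, and real exports will need their own mapping.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical, hand-made records; real exports will use different field names.
posts = [
    {"user": "ana", "retweet_of": None, "reply_to": None, "hashtags": ["metoo"]},
    {"user": "ben", "retweet_of": "ana", "reply_to": None, "hashtags": ["metoo"]},
    {"user": "cleo", "retweet_of": None, "reply_to": "ben", "hashtags": ["equalpay"]},
    {"user": "dev", "retweet_of": None, "reply_to": None, "hashtags": ["equalpay", "metoo"]},
]

def interaction_edges(posts):
    """Directed (source, target, kind) ties from retweets and replies."""
    edges = []
    for post in posts:
        for kind in ("retweet_of", "reply_to"):
            target = post.get(kind)
            if target:
                edges.append((post["user"], target, kind))
    return edges

def hashtag_edges(posts):
    """Undirected ties between users who used at least one common hashtag."""
    users_by_tag = defaultdict(set)
    for post in posts:
        for tag in post.get("hashtags", []):
            users_by_tag[tag.lower()].add(post["user"])
    pairs = set()
    for users in users_by_tag.values():
        # Sorting makes each pair canonical, so (a, b) and (b, a) collapse.
        pairs.update(combinations(sorted(users), 2))
    return sorted(pairs)

directed = interaction_edges(posts)
shared = hashtag_edges(posts)
```

Keeping the two edge lists separate is a deliberate choice in this sketch: interaction ties are directed, while hashtag co-use is undirected, so they would normally be analyzed as different networks.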
Archival data, such as survey responses or academic dissertations, may require preprocessing to be suitable for network analysis. For instance, datasets need to be transformed into a format representing nodes (e.g., individuals, documents, or institutions) and edges (e.g., relationships, shared themes, or communication links). Once structured, data can be analyzed to reveal how gender dynamics have shaped intellectual, political, or cultural networks over time.
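The preprocessing step described above could look roughly like the following Python sketch, which turns a small table into a node list with a gender attribute and an edge list of supervision ties. The column names, the sample rows, and the gender coding are all hypothetical and only illustrate the node and edge format; an actual archive will need its own cleaning rules.

```python
import csv
import io

# Hypothetical archival table: one row per dissertation.
raw = """author,author_gender,advisor,advisor_gender
Ines,woman,Karl,man
Omar,man,Karl,man
Lena,woman,Ruth,woman
"""

nodes = {}  # person -> attributes
edges = []  # (advisor, author) supervision ties

for row in csv.DictReader(io.StringIO(raw)):
    nodes[row["author"]] = {"gender": row["author_gender"]}
    nodes[row["advisor"]] = {"gender": row["advisor_gender"]}
    edges.append((row["advisor"], row["author"]))

# Share of ties that connect two people coded with the same gender.
same_gender = sum(nodes[a]["gender"] == nodes[b]["gender"] for a, b in edges)
share_same_gender = same_gender / len(edges)
```

Once the data are in this shape, the node and edge lists can be passed to a network analysis package of choice; the same-gender share is only one simple example of a measure that becomes possible after structuring.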
Where, then, do we find data on gender? Researchers increasingly rely on “archived” data instead of gathering data themselves. Such data are usually not cleaned or adapted for SNA, and the way they are organised differs from source to source.
Gender data examples:
The World Bank's Gender Data Portal https://genderdata.worldbank.org/en/home
UN Women's Women Count Data Hub https://data.unwomen.org/
EU Gender Statistics Database https://eige.europa.eu/gender-statistics/dgs
Open data initiatives
Zenodo (open science, EU-funded): https://about.zenodo.org/ and, as an example search for open CSV datasets on Facebook and gender, https://zenodo.org/search?q=facebook%20gender&f=access_status%3Aopen&f=resource_type%3Adataset&f=file_type%3Acsv&l=list&p=1&s=10&sort=bestmatch
Center for Open Science (OSF): https://osf.io/5azmc/, https://osf.io/cjh7x/, https://osf.io/mxehs
University initiatives
Harvard University’s “Harvard Dataverse” https://dataverse.harvard.edu/dataverse/harvard/?q=gender+social+media
Stanford Large Network Dataset Collection: https://snap.stanford.edu/data/
Social Science Research Council, Datasets for Social Media Research: https://labs.ssrc.org/social-media-datasets/ (Airtable view: https://airtable.com/appcP72xxEj2xhNSF/shrjfIgBXXDlY76Ly/tblE45Xhf7kYbLaNP?viewControls=on)
Online dataset "portals", some of which also provide code or programming packages
GitHub, for example the FakeNewsNet repository: https://github.com/KaiDMML/FakeNewsNet
Kaggle, https://www.kaggle.com/datasets
Google Dataset Search: https://datasetsearch.research.google.com/search?ref=TDJjdk1URnlNbW8yWkRJd2NRPT0sTDJjdk1URjNPWEYwTW5GMmNnPT0sTDJjdk1URndlREZmTURseFl3PT0%3D&query=open%20social%20media%20dataset%20gender&docid=L2cvMTFyOGtoOXlzeg%3D%3D