Workshop participants from CREST, UNICAMP, UNISA and UKZN

Workshop dinner

26 April 2024

By Rein Treptow

A workshop on Big data analytics and AI for STI Policy was hosted at CREST from 17 to 19 April 2024. The workshop brought together participants from the University of Stellenbosch, the University of Campinas, the University of São Paulo, Leiden University, the University of KwaZulu-Natal, and the University of South Africa. The workshop received funding under a joint NRF-FAPESP ISS funding call, which forms part of a 5-year inter-agency agreement to support collaborative scientific activities between South African researchers and researchers based in the State of São Paulo in Brazil. The main outcome stipulated for the funding is a set of clearly defined collaborative research projects ready to be submitted for multi-year funding. Presentations at the workshop included:
• The Laboratory of Studies on Research Organization and Innovation (Lab-GEOPI) within the national and local context – an overview of the history, context and some of the major research and teaching outputs. [Adriana Bin & Sergio Salles-Filho]
• Big data and AI for STI policy at CREST – an overview of CREST’s data infrastructure (the hardware, software, database structure and the data integration processes). [Johann Mouton, Herman Redelinghuys, Graeme Mehl & Ahmed Hassan]
• Embracing openness and AI: New projects and developments in the CWTS data infrastructure – an example of Leiden University’s efforts to support open information (the Leiden Ranking Open Edition) and a discussion on the labeling of publication classifications using large language models. [Nees-Jan van Eck]
• Big Data & STI Policy Decision-Making – a discussion of the research trajectory of InSySPo. [Nick Vonortas]
• Data infrastructure experience – a discussion on the economics, tools, and methodologies of big data analysis. [Alysson Mazoni & Rodrigo Costas]
• Lessons learnt from working with full-text documents and LLMs – advice for parsing full-text documents, hierarchical text classification, citation context extraction, sentiment classification of citation contexts, and citation intent identification. [Marcel Dunaiski]
• How can evidence-informed STI policy benefit from AI: AI Tools as Policy Advisors – a discussion of some of the opportunities and major risks of adopting LLMs and other AI systems. [Robert Tijssen]
• Alternative data sources: new interactions, new questions – a federation of diverse data sources is proposed. [Rodrigo Costas]
• Big data analytics and Artificial Intelligence for STI policy: Exploring opportunities and options for South Africa, Brazil, and the Global South – a research project that seeks to identify, develop, and employ approaches, methodologies, and indicators that generate evidence on how research in its various fields of knowledge is (and should be) planned, financed, evaluated, and communicated. [Robert Tijssen]
• Big data analytics and Artificial Intelligence for STI policy – opportunities for the development and cross-applications of methodologies and comparative studies between South Africa, Brazil, and the Global South. [Adriana Bin & Sergio Salles-Filho]
• Studying the regional relevance of universities – a study presented as an opportunity for ‘big data’ cooperation and joint institutional learning. [Robert Tijssen]

During the workshop, it became increasingly evident that there are significant areas of overlap and exciting opportunities for collaboration between CREST and Lab-GEOPI (UNICAMP), as well as among the other research partners represented at the workshop. CREST and Lab-GEOPI are both well positioned to utilize big data and AI for STI policy. There are exciting prospects, not just for collaborative research projects, but also for joint courses (executive courses for students and training in data management for staff) and for the joint development of data infrastructure and tools. Further steps are being taken to operationalize these opportunities for greater collaboration and resource sharing. Updates to follow.