University of Oulu, Center for Ubiquitous Computing
.01

HOME

PERSONAL DETAILS
Center for Ubiquitous Computing, P.O.Box 4500, FIN-90014, Finland
panos.kostakos@oulu.fi
+358 505 950 718
Hello. I am passionate about Digital Forensics, Big Data, and Digital humanities. Welcome to my Personal and Academic profile.

BIO

ABOUT ME

Author of various articles on illegal arms trade, cocaine smuggling, illegal migration and terrorism. I am currently a postdoc researcher in the Center for Ubiquitous Computing, University of Oulu, Finland. My research is focused on the development of new computational methodologies and their application to organised crime, terrorism, and corruption.
Online Viz

Topic Modelling on mafia related news articles

Named Entity Recognition (NER)

Machine Perception of the Mafia

CUTLER data flow with Sankey Particles

.02

Projects

Funded Projects

CUTLER 2018-2020

H2020

GRAGE 2014-2018

H2020

YoungRes 2019-2021

ISF-Police Action Grant

PRINCE 2019-2021

ISF-Police Action Grant

Project descriptions

CUTLER - Coastal Urban Development Through The Lenses of Resiliency

H2020-CO-CREATION-2017

Coastal urban development incorporates a wide range of development activities that take place as a result of the water element existing in the fabric of the city. This element may take different forms (e.g. a bay, a river, or a brook), but in almost all cases the surrounding area constitutes what may be considered the heart of the city.

Every city that incorporates the water element in its fabric is confronted with the fundamental requirement of developing policies for driving development in the surrounding area, while balancing: a) economic growth, b) protection of the environment, and c) safeguarding social cohesion. This requirement is tightly connected with the concept of Urban Resilience, which is the capacity of individuals, communities, businesses and systems within a city to survive, adapt and grow no matter what chronic stresses and acute shocks they experience.

In developing policies that add value to the resilience of a city, we shift the existing paradigm of policy making, which is largely based on intuition, towards an evidence-driven approach enabled by big data. Our attention is placed on policies related to the water element. Our basis is the sensing infrastructures installed in the cities, which offer demographic data, statistical information, sensor readings and user-contributed content that together form the big data layer. Methods for big data analytics are used to measure the economic activity, assess the environmental impact and evaluate the social consequences. The extracted pieces of evidence are used to inform, advise, monitor, evaluate and revise the decisions made by policy planners.

Finally, effective policies are developed dealing with: a) the economic and urban development of Thermaikos Bay, Thessaloniki, b) the transformation of Düden Brook into a recreation and park area, Antalya, c) the development of a Storm Water Plan, Antwerp, and d) the review of the Country Development Plan in the River Lee territory, City of Cork.

GRAGE - Grey and green in Europe: elderly living in urban areas

H2020-MSCA-RISE-2014

The EU has to face many challenges in achieving a more balanced regional development and sustainable economic recovery. Many of those challenges have to do with the ageing population trend, urbanization and an environment under distress. More liveable and efficient communities are a target to be reached in Europe, where the “silver hair” trends can become a challenging opportunity from a social, economic and cultural perspective. Although those challenges are strongly interlinked, solutions provided in urban contexts do not often pay due attention to the social processes underlying urban trends and to the needs and behaviour of elderly citizens.

GRAGE intends to contribute to filling this gap, developing winning ideas to promote an active, harmonious and inclusive citizenship for elderly people living in urban contexts. The consortium gathers ground-breaking expertise from different scientific backgrounds (legal, economic, humanities, engineering), from academic and non-academic institutions, belonging to several countries (from the EU and Ukraine). Using a mix of methodologies, the research and innovation programme of the project will revolve around the idea of citizenship as a collector of interest, healthy environment and suitable urban solutions for an ageing society. The main themes will be: green buildings, food and urban agriculture, information and language technology. Researchers will analyze their role in transforming cities into environments that support green and healthy lifestyles for elderly people. GRAGE intends to boost dialogue throughout Europe, strengthening both academic and non-academic collaboration and a practical understanding of elderly living across Europe. Such cooperation can have a series of returns for Europe, ranging from more effective solutions to strategic challenges (sustainable cities and demographic change) to new business opportunities for European firms, offering solutions and products for smart/inclusive/ageing societies at a global level.

Project ID: 645706

Funded under: H2020-EU.1.3.3. - Stimulating innovation by means of cross-fertilisation of knowledge

Total cost: EUR 828 000

YoungRes - Strengthening European Youngsters Resilience through Social Media and Serious Games

ISF-Police Action Grant

Despite national and Pan-European efforts to tackle the polarization and radicalization of opinions, which can translate into extremist behaviors, especially among youth and vulnerable populations, the lack of fruitful interaction has often been pointed out as a crucial barrier to reaching the desired impact.

For this purpose, YoungRES aims to exploit the familiarity of youth with social media and gaming platforms in order to build an eLearning platform that would i) enable the presentation and discussion of relevant debatable topics as an interactive game, imitating and extending the original concept of Shimpai Muyou! and utilizing Brandsma’s polarization model; ii) identify student argumentation in a way that pinpoints potential hidden factors of polarization and/or radicalization; iii) trigger personalized feedback to both learner and educator to reflect on potential polarization risk while promoting EU values; iv) analyze the social network of participating users in the context of polarization.

Therefore, the overall objective of YoungRES is to identify signs of polarization / radicalization in the teenage and student population, first in Spain and Finland and second at European level, through a set of actions. First, it promotes the aforementioned eLearning platform, which would enable educationalists to track polarization patterns through gamification. Second, it develops online learning tools that will elicit and follow up learner argumentation when debating controversial and sensitive issues that may exhibit hidden polarization factors of the individuals. Third, it tests new counter-measures that promote dissemination of EU democratic values and moderate voices through both online and selected workshop-related events. Fourth, it critically reviews the current practice of cooperation between law enforcement, social services, NGOs, community-based organizations and academia in the context of preventing extremism and polarization, while paving the way.

PRINCE - Preparedness & Response for CBRNE Incidents

ISF-Police Action Grant

Chemical, Biological, Radiological, Nuclear, and high-yield Explosive (CBRNE) events have the potential to destabilize governments, create conditions that exacerbate violence, or promote terrorism. These events can quickly overwhelm the infrastructure and capability of the responders.

PRINCE aims to support first aid responders and law enforcement/security authorities by providing them with an evidence base for strategic-level decisions related to prevention, detection, respiratory protection, decontamination and response to CBRN events. PRINCE aims to produce a roadmap based on EU & International Action plans and recommendations by creating a PRINCE catalogue of training curricula in line with the INTERNATIONAL CBRN TRAINING CURRICULUM and EU, based on best practices and internationally proven CBRNE exercises.

PRINCE aims to produce CBRNE SOPs and plans for two incidents (Chemical and Radiological) in two major exercises (Greece, Portugal). The exercises will be performed with representatives from all responders to (1) share information on CBRN threats and risks; (2) exchange best practices; (3) perform joint trainings and exercises. PRINCE will provide recommendations on CBRNE equipment, systems, and training content and will develop ICT tools (E-training platform, CBRN Emergency system). PRINCE aims to enhance the protection of public spaces, communities and infrastructure by sharing project outcomes with a wider audience through online information material, presentations at public events and media.

Short-term beneficiaries are CBRN responders and authorities from GR, PT, CY, FL and DE. Medium-term beneficiaries are EU CBRN authorities and stakeholders. Long-term beneficiaries are citizens, public authorities, CBRNE technology partners, business, Government advisors, R&D and industry. PRINCE increases sustainability through cross-border/cross-sectoral collaboration and by exchanging best practices and knowledge on joint exercises and training courses between five member states.

.03

CURRENT RESEARCH

FORTHCOMING PAPERS
Face Detection, Forensics, Online Behaviour

Detecting the Age of Twitter Users

Corpora, Forensics

Paraphrasing detection

About The Project

On Web Based Sentence Similarity for Paraphrasing Detection

Mourad Oussalah and Panos Kostakos Center for Ubiquitous Computing, University of Oulu, P.O.Box 4500, FIN-90014, Oulu, Finland

Semantic similarity measures play vital roles in information retrieval, natural language processing and paraphrasing detection. With the growing number of plagiarism cases in both the commercial and research communities, designing efficient tools and approaches for paraphrasing detection becomes crucial. This paper contrasts a web-based approach, which analyses the snippets returned by a search engine, with a WordNet-based measure. Several refinements of the web-based approach will be investigated and compared. Evaluations of the approaches with respect to the Microsoft paraphrasing dataset will be performed and discussed.

To appear: KDIR 2017, International Conference on Knowledge Discovery and Information Retrieval.

Online Behaviour, Prostitution

Early detection of individuals at risk.

About The Project

Early detection of individuals at risk of being drawn into online sex trade: A mixed method approach using covert online ethnography, SNA and machine learning.

by: Panos Kostakos, University of Oulu; Lucie Špráchalová, Charles University; Mourad Oussalah, University of Oulu.

How can we identify individuals at risk of being drawn into online sex trade? Recent research shows that technology enables a greater number of individuals to be involved in illicit sex markets. Because technology reduces transaction costs and breaks down market entry barriers, the number of open-air and indoor sex workers has increased very rapidly in the past years. This has far-reaching implications for economic development, social cohesion, and public health. As a result, there is an urgent need for tools that prevent the spread of illegal sex trade online. In this paper, we present work in progress on a tool that uses social network data to enable early detection of individuals at risk of being drawn into sex trade online. Our method can be summarised as follows. First, we extracted users’ profiles (N=28,000) from an online European adult forum. Second, we conducted covert online ethnography and carried out interviews with a random sample of the users. This enabled us to develop a user typology that highlights the social organisation of the illicit market in conjunction with self-reported data about the risk of exposure to illicit activities. Third, we used graph theory to analyse the structural position of users. Finally, we used machine learning to train a model that predicts the risk and social role of individual users within the network.

To appear: Illicit Networks Workshop, Adelaide, Australia, 11-12 December 2017.

Online Behaviour, Smuggling

Predicting Refugee flows with Google Trends

About The Project

Correlating refugee flows with Internet search data

by: Panos Kostakos, University of Oulu; Simo Hosio, University of Oulu; Daniela Irrera, University of Catania; Christoph Breidbach, University of Melbourne; Vassilis Kostakos, Melbourne, Australia.

Can Internet search data be used as a proxy to monitor refugee mobility? Thousands of refugees and migrants cross the Mediterranean Sea every year to reach various parts of Europe. The increasing number of irregular crossings is also leading to more deaths. The soaring refugee death toll creates an urgent need for novel tools that monitor and forecast refugee flows. Because existing monitoring systems rely extensively on international and regional human networks, there is a lack of tools that forecast refugee mobility patterns with hyperlocal precision. As a result, local authorities and search and rescue (SAR) organisations cannot deploy resources in a timely and effective manner to manage the risks of irregular border-crossing. This study investigates the correlation between refugee mobility data (arrival dates) and Internet search data from Google Trends. Google Trends is a freely accessible tool that provides access to Internet search data by analysing a sample of all web queries submitted by end-users to Google. This online tool has already been used to study a variety of phenomena including suicide, drug use, unemployment, drug crime, and influenza, to name a few. In our method, we carried out interviews with end-user organisations and a survey with refugees in Greece (entry point) and Finland (destination point) to identify which search queries they used in every leg of their journey. Next, we conducted time series analysis on Google search data to investigate whether interest in user-defined and/or generic search queries correlates with the refugee arrivals recorded by UNHCR and SAR organisations. Finally, Pearson’s correlation coefficients were calculated as a measure of association between refugee arrival dates, SAR data, and Internet search trends.
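As an illustration of the final step, the sketch below computes Pearson's correlation coefficient between two equally spaced series, such as weekly search interest and weekly arrivals. It is a minimal example and not the study's actual pipeline: the function names, the sample numbers, and the optional time lag between searches and arrivals are all assumptions introduced here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_r(search, arrivals, lag):
    """Correlate search interest at time t with arrivals at time t + lag.

    The lag is a hypothetical extension: it assumes searches may precede
    arrivals by a fixed number of periods.
    """
    if lag < 0 or lag > len(search) - 2:
        raise ValueError("lag must leave at least two overlapping points")
    return pearson_r(search[: len(search) - lag], arrivals[lag:])


if __name__ == "__main__":
    # Made-up weekly values, for illustration only.
    search_interest = [12, 30, 45, 20, 15, 60, 80, 40]
    recorded_arrivals = [100, 150, 310, 420, 230, 180, 590, 760]
    for lag in range(3):
        r = lagged_r(search_interest, recorded_arrivals, lag)
        print(f"lag={lag} r={r:.3f}")
```

Scanning a few lags is one simple way to check whether search interest leads arrivals, but a high coefficient on short series like these should be read with caution.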

To appear: Illicit Flows Workshop, Adelaide, Australia, 13 December 2017.

.04

PUBLICATIONS

All Publications

Panos Kostakos, Abhinay Pandya, Olga Kyriakouli, Mourad Oussalah (2018) Inferring Demographic data of Marginalized Users in Twitter with Computer Vision APIs, IEEE European Intelligence and Security Informatics Conference (EISIC) 2018, October 24-25, 2018, Blekinge Institute of Technology, Karlskrona, Sweden.

Lu Jiyan, Panos Kostakos, Mourad Oussalah, Susanna Pirttikangas (2018) Combining Semantic and Phonetic Word Association in Verbal Learning Context, 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Barcelona, Spain, 28-31 August.

Panos Kostakos, Lucie Špráchalová, Abhinay Pandya, Mohamed Aboeleinen and Mourad Oussalah (2018) Covert online ethnography and machine learning for detecting individuals at risk of being drawn into online sex work. 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Barcelona, Spain, 28-31 August.

Panos Kostakos, Abhinay Pandya, Mourad Oussalah, Simo Hosio, Christoph Breidbach, Vassilis Kostakos, Niels van Berkel, Olga Kyriakouli and Arash Sattari (2018) Correlating Refugee Border Crossings with Internet Search Data, 2018 IEEE International Conference on Information Reuse and Integration for Data Science. July 7 – 9, 2018. Salt Lake City, Utah, USA.

Abhinay Pandya, Mourad Oussalah, Paola Monachesi, Panos Kostakos and Lauri Loven (2018) On the use of URLs and hashtags for age prediction of Twitter users, 2018 IEEE International Conference on Information Reuse and Integration for Data Science, July 7 – 9, 2018. Salt Lake City, Utah, USA.

Panos Kostakos, Markus Nykänen, Mikael Martinviita, Abhinay Pandya and Mourad Oussalah (2018) Meta-Terrorism: identifying linguistic patterns in public discourse after an attack, 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Barcelona, Spain, 28-31 August, 2018

Panos Kostakos (2018) ”Inferring serious crime perceptions from Twitter and Google Trends”, International Journal of Cyber Criminology, 12(1):282-299.

Kostakos P. (2017) ‘Transnational Organised Crime 1985-2014: mapping the knowledge structure through co-word analysis and Social Network Analysis’, 8th Annual Illicit Networks Workshop, London, UK.

Panos Kostakos, Miika Moilanen, Arttu Niemelä, Mourad Oussalah (2017), ’Catchem: A browser plugin for the Panama Papers using approximate string matching’, 2017 European Intelligence and Security Informatics Conference (EISIC).

Mourad Oussalah and Panos Kostakos, (2017), ‘On Web Based Sentence Similarity for Paraphrasing detection’, KDIR 2017, 9th International Conference on Knowledge Discovery and Information Retrieval.

Kostakos P. (2016) Μεταναστευτικό: Ματ στην «τελευταία οριζόντια», On Alert, 5 Mar.

Kostakos P. (2016) ‘Transnational Organised Crime 1985-2014: mapping the knowledge structure through co-word analysis and Social Network Analysis’, 8th Annual Illicit Networks Workshop, London, UK.

Kostakos P. (2016) ’European Criminology 1975-2015: Mapping the knowledge structure through co-word analysis’, European Society of Criminology, 2016, Muenster, Germany.

Kostakos P. (2015) ‘Understanding serious crime: A “Big Data” approach’, 1st ECPR Standing Group on Organised Crime Conference, 2015, Napoli, Italy.

Kostakos P. (2013) Gangster Politics: Organized crime as the continuation of politics by other means, 7th ECPR General Conference, 2013, Bordeaux.

Kostakos P. (2013) What does the public think about organized crime? Frames, perceptions and implication for the EU’s strategy in the Southeast Europe, The Aspen Institute, May 2013, Montenegro.

Kostakos P. (2013) ’What does the public think about organized crime? Frames, perceptions and implication for the EU’s strategy in the Southeast Europe’, The Aspen Institute, Germany.

Kostakos P. (2011) Conflict, Power and Wealth: Organised Crime as an Everyday Phenomenon. A case Study of Greece, Bath: University of Bath.

Kostakos P. & Antonopoulos A. G. (2010) ‘The ‘Good’, the ‘Bad’ and the ‘Charlie’: The Business of Cocaine Smuggling in Greece’, Global Crime 11(1): 34-57.

Kostakos V and Kostakos P. (2010) ’Inferring social networks from physical interactions: a feasibility study’, International Journal of Pervasive Computing and Communications, 6(4): 423 – 431.

Arsovska, J. & Kostakos P. (2008) ’Illicit arms trafficking and the limits of rational choice theory: the case of the Balkans’, Trends in Organized Crime 11: 352-78.

Felia Allum, Francesca Longo, Daniela Irrera, Panos A. Kostakos (eds) (2010) Defining and defying organized crime: discourses, perceptions and reality, London: Routledge.

Arsovska J. & Kostakos P. (2010) ‘The social perception of organized crime in the Balkans: a world of diverging views?’ in Defining and defying organized crime: discourses, perceptions and reality, London: Routledge.

Allum, F. & Kostakos P. (2010) ‘Deconstruction in progress: towards a better understanding of organized crime?’ in Defining and defying organized crime: discourses, perceptions and reality, London: Routledge.

Kostakos P. (2010) ‘Organized Crime: Problems of Methodology and Research’, 29th May, Workshop: Organised crime, civil society, and the policy process, Risk Monitor and the Open Society Institute, Borovetc, Bulgaria, May 28-30.

Kostakos P. (2009) Illegal immigration from the East-Lifting the burden from the Greek State? Research Institute for European and American Studies (RIEAS) online: http://rieas.gr/images/KOSTAKOS.pdf.

Kostakos P. (2011) ‘Greek Organized Crime: An Essentially Contested Concept’, Athens Security Forum.

Kostakos P., Antonopoulos, G., Gramatikakis, G. and Maspero, A. (2010) ‘Women in Organised crime in Greece: A Methodological note and some Preliminary Data’, ECPR Standing Group on Organized Crime Newsletter 9(2): 5-7.

Kostakos P. (2010) ‘Islamist Terrorism in Europe: Could Greece Be Next?’, Jamestown Terrorism Monitor 3(36): 3-5 (October).

Kostakos P. (2010) ‘Ισλαμιστές και τρομοκρατία στην Ευρώπη: Η σειρά της Ελλάδας;’ On Alert, 11, Oct.

Kostakos V & Kostakos P. (2008) ‘Intelligence gathering by capturing the social processes within prisons’ arXiv: 0804.3064v2.

Kostakos P. (2008) ‘Theoretical issues in the crime-terror nexus literature’ (November 6, 2008). Available at SSRN: http://ssrn.com/abstract=1296837.

Kostakos P. (2008) ‘Al Qaeda’s dark networks in Greece’ (November 6, 2008), Available at SSRN: http://ssrn.com/abstract=1296825.

Kostakos P. (2007) ‘Flexible foe-Transnational Criminal Networks in Greece’ Jane’s Intelligence Review (June)

Arsovska J. and Kostakos A. P (2007) ‘Cocaine traffickers turn to the Balkans – Changing routes’ Jane’s Intelligence Review (March).

Kostakos P. (2007) ‘The threat of Islamic radicalism in Greece’ Jamestown Terrorism Monitor 5(15):9-12 (August).

Kostakos V. and Kostakos P. (2007) ‘Social network analysis (SNA): real world methodological challenges… Catching up with technology’ ECPR Standing Group on Organized Crime Newsletter 6(1):7-9.

Kostakos P. & Arsovska J. (2007) ‘Emerging cocaine routes in the Balkans’, ECPR Standing Group on Organized Crime Newsletter 6(2):3-5.

Kostakos P. & Kostakos V. (2006) ‘Criminal group behaviour and operational environments’, ECPR Standing Group on Organized Crime Newsletter 5 (2): 6-7.

Kostakos P. (2006) ‘Theories and international cooperation against transnational organized crime in the Balkans’, ECPR Standing Group on Organized Crime Newsletter 5(1): 6-7.

Kostakos P. (2008) ‘Terrornomics’ in: Newsletter of the ECPR-SG on Extremism & Democracy (Book Review).

Kostakos P. (2007) (eds) ‘Organized crime in Europe: Concepts, patterns and control policies in the European Union and beyond’ Global Crime 8(2): 187-190 (Book Review).

Kostakos P. (2006) Five families: The rise, decline and resurgence of America’s most powerful Mafia empires Global Crime 7(2): 282-284 (Book Review).

.05

STUDENTS

VISITING RESEARCHERS

Olga Kiriakouli

Harokopio University

Luci Špráchalová

Charles University

Lu Jiyan (陆积堰)

NorthWestern Polytechnical University

Olga Kiriakouli

Twitter Data Analytics, Visualisation, Online tools

I studied Informatics and Telematics (BSc.) and majored in web services management workflow mechanisms. I have worked as a project manager for the Data Digitization of the Municipality of Eleusis (Greece), as a web developer in various companies, and as an IT Support Specialist and General Officer at the Manpower Agency of Greece. I am doing my MSc in Web Engineering at Harokopio University of Athens, where I focus on analyzing social networking services.

Luci Špráchalová

Online ethnography, sub-cultures, illicit online sex work

I studied for a B.A. in Social Pathology and Prevention and an M.A. in Social Pedagogy. My final theses were both about risky sexual behaviour, sexual self-perception and the perception of normality. I worked as a free-time pedagogue and social worker with children from risky areas and poor backgrounds, where I also focused on sexual education. Now I am a project manager for a research project dedicated to university students, their life aspirations, and their study motivation. I am also an assistant at the Department of Social Pathology and Sociology, Pedagogical Faculty, where I teach ethics, sociology and philosophy (all basic courses). I am doing my PhD in Sociology at Charles University in Prague (I have just finished my second year) and I focus on sexual minorities and communities/subcultures, especially the BDSM community online.

Lu Jiyan

Vocabulary Memory Based on Semantic and Phonetic Word Associations

I’m in the fourth year of my Bachelor's degree at NorthWestern Polytechnical University in China. My bachelor thesis proposes an effective way to discover and memorize new English vocabulary based on both semantic and phonetic associations. The proposed method aims to automatically find the words most strongly associated with a given word. Semantic association is measured by calculating the cosine similarity of two word vectors, and phonetic association is measured by calculating the longest common subsequence of the phonetic symbol strings of two words. Finally, the method is implemented as a web application.
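The two measures described above can be sketched in a few lines of Python. This is an illustrative example rather than the thesis implementation: the function names are hypothetical, the word vectors are assumed to come from some pre-trained embedding model, and normalising the longest common subsequence by the longer string is an assumption made here to obtain a score between 0 and 1.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length word vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def lcs_length(a, b):
    """Length of the longest common subsequence of two symbol strings."""
    # Dynamic programming over prefixes, keeping only the previous row.
    previous = [0] * (len(b) + 1)
    for symbol_a in a:
        current = [0]
        for j, symbol_b in enumerate(b, 1):
            if symbol_a == symbol_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def phonetic_association(a, b):
    """LCS length scaled by the longer transcription (assumed normalisation)."""
    longest = max(len(a), len(b))
    return 0.0 if longest == 0 else lcs_length(a, b) / longest


if __name__ == "__main__":
    # Toy vectors and IPA-like transcriptions, for illustration only.
    print(cosine_similarity([0.2, 0.7, 0.1], [0.25, 0.6, 0.05]))
    print(phonetic_association("naɪt", "laɪt"))
```

How the two scores are combined into a single ranking is not stated in the description above, so it is left out of the sketch.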
.06

STUDENT PROJECTS (NLP)

AGE DETECTION

Building age detection algorithms from short texts.

Students: Benedikt Putz and Gorka Urbizu

The problem of identifying the age of authors based on their use of language is a major scientific challenge. Age awareness tools are valuable in overcoming various forensic linguistics challenges like threatening letters, ransom demands, radicalisation, and pedophilia. A major limitation to overcome is the lack of data. In most cases, authorities have access only to fragmented pieces of written text. The aim of this project is to build models that predict the age of users based on their language usage on Twitter. The novelty of our approach is the use of approximate age recognition software in combination with custom Natural Language Processing tools.

Datasets: Twitter historical archive of Dutch Twitter users (10M tweets).

 

Crime Sensing

Crime sensing using data from hotel reviews in London, UK. 

Student: Niko Leinonen

Tourists are easy targets for theft and victimisation. For this reason, they are much more likely to remain alert, attentive, and vigilant to suspicious activities. Tourists are also more likely to report criminal incidents to the police, friends, and social media. In this project, you will develop a crime sensing tool using textual data from hotel reviews and police-recorded crime incidents from the wider London Metropolitan Area. Can hotel reviews be used as a proxy for sensing criminal activities? Can we use reviews to predict criminal behaviours in the wider vicinity of lodges? How do crime rates impact local businesses? How can we sense Victimisation Sentiments using customers’ reviews?

Available datasets: i) Historical archive of hotel reviews from London, ii) Data.police.uk.

 

Terrorism & Echo Chambers

Meta-Terrorism: identifying linguistic patterns in public discourse after an attack.

Students: Markus Nykänen and Mikael Martinviita, University of Oulu.

Terrorist events send shockwaves across communities and often lead to group polarization and community conflict. Open source communication data collected from social media channels can shed light on the pull and push mechanisms that drive group polarisation (e.g. echo chambers). Moreover, social media data can be used to study violence escalation and de-escalation. For example, previous research has used the 2013 murder of Fusilier Lee Rigby in Woolwich as a case study to show that social media communication data can illuminate the inter- and intra-community conflict dynamics arising in the aftermath of violent terrorist-like events. This project identifies linguistic features that drive the nascence and formation of echo chambers in the public sphere following a terrorist attack.

Available datasets: Historical archive of 20M tweets.

 

Corruption Detection

Catchem: A Browser Plugin for the Panama Papers using Approximate String Matching

Students: Miika Moilanen and Arttu Niemelä, University of Oulu.

Abstract— The Panama Papers is a collection of 11.5 million leaked records that contain information on more than 214,488 offshore entities. In this paper, we present work in progress on a web browser plugin that detects company names from the Panama Papers and alerts the user by means of unobtrusive visual cues. We compare company names from the Public Works and Government Services Canada (PWGSC) against the Panama Papers using three different string matching methods. Monge-Elkan gives the best match results, but is much slower than the other algorithms. Levenshtein gives reasonably good match results and is also fast. Jaccard is fast, but its matching performance is very poor if the names are modified even a little.
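To make the comparison concrete, here is a minimal Python sketch of the three measures applied to company names. It is not the plugin's code: the function names are hypothetical, Jaccard is assumed to operate on lowercase word tokens, and Monge-Elkan is assumed to use normalised Levenshtein similarity as its inner token measure (other inner measures are common).

```python
def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a, b):
    """Edit distance rescaled to [0, 1], where 1 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def jaccard(a, b):
    """Overlap of word-token sets (token-level Jaccard is an assumption)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def monge_elkan(a, b):
    """Average, over tokens of a, of the best inner similarity in b."""
    tokens_a, tokens_b = a.lower().split(), b.lower().split()
    if not tokens_a or not tokens_b:
        return 0.0
    best = [max(levenshtein_similarity(x, y) for y in tokens_b)
            for x in tokens_a]
    return sum(best) / len(best)


if __name__ == "__main__":
    # Invented company names, for illustration only.
    name, variant = "acme holdings", "acme holding"
    print(levenshtein_similarity(name, variant))
    print(jaccard(name, variant))
    print(monge_elkan(name, variant))
```

The nested token loop in `monge_elkan` runs an edit-distance computation for every token pair, which is consistent with it being the slowest of the three, while a single changed character breaks an exact token match and sharply lowers the token-level Jaccard score.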

Dataset: Panama Papers and PWGSC.