ParlAI

A collection of software which likely saved me 1000 hours of work.

Introduction

ParlAI (pronounced “par-lay”) is a Python framework for sharing, training, and testing dialogue models, from open-domain chitchat to VQA (Visual Question Answering).

Facebook has collected and made available an amazing collection of tools for NLP researchers, but these same tools act as an incredible platform for collecting conversational sociological data. I am able to use this platform to very quickly have short conversations with social actors about any part of their social world, for example:

  • perception and understanding of friends and family, coworkers and social workers
  • understandings of politics, political events, political parties and ideologies
  • understandings of others' understandings of politics, friends and family, etc.
  • the conversational tactics and systems of reasoning people use in practice to negotiate their understandings

This means making novel machine-readable large-N datasets concerning the understandings and social worlds of everyday individuals -- attempting to bring qualitative data to quantitative methods.

Links

Simple setup

I use the machine-learning AMI created for Amazon SageMaker (which is incredible). For this specific setup, it's helpful to follow Amazon's instructions to install Node.js.

You're then going to have to set up an account with Heroku, and a requester account with Amazon Mechanical Turk. You'll also have to create a role with administrator permissions so ParlAI can create and manage Turkin' tasks.

git clone https://github.com/facebookresearch/ParlAI.git ~/ParlAI
cd ~/ParlAI; python setup.py develop

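Running an MTurk task (here `-nc` sets the number of conversations, `-r` the reward per assignment in dollars, and `--sandbox` keeps the run on the MTurk sandbox rather than the live site):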

python run.py -nc 2 -r 0.05 --sandbox

Developing with ParlAI

Examples

Question / Answer data collection

The code

The original paper presents the following figure as an example, showing an incredibly simple conversation between a bot and a human:

Multi-agent dialog

The code

Wow! This set of MTurk tasks pairs two turkers with the researcher in a turn-based chat: the researcher speaks first, then turker 1, then turker 2.
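The turn order described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the ParlAI task code: `ScriptedAgent`, `run_turns`, and the placeholder lines are hypothetical names made up for this example. The `act` / `observe` method names are borrowed from ParlAI's agent interface, but nothing here imports or calls ParlAI.

```python
from typing import Dict, List

Message = Dict[str, str]


class ScriptedAgent:
    """Hypothetical stand-in for a chat participant (not a ParlAI class)."""

    def __init__(self, agent_id: str, lines: List[str]) -> None:
        self.agent_id = agent_id
        self._lines = list(lines)
        self.seen: List[Message] = []

    def observe(self, message: Message) -> None:
        # Record what another participant said.
        self.seen.append(message)

    def act(self) -> Message:
        # Say the next scripted line, or an empty string once the script runs out.
        text = self._lines.pop(0) if self._lines else ""
        return {"id": self.agent_id, "text": text}


def run_turns(agents: List[ScriptedAgent], rounds: int) -> List[Message]:
    """Each agent speaks in list order; every other agent observes the message."""
    transcript: List[Message] = []
    for _ in range(rounds):
        for speaker in agents:
            message = speaker.act()
            transcript.append(message)
            for listener in agents:
                if listener is not speaker:
                    listener.observe(message)
    return transcript


# Placeholder lines, invented for illustration only.
researcher = ScriptedAgent("researcher", ["(researcher's opening question)"])
turker_1 = ScriptedAgent("turker_1", ["(turker 1 reply)"])
turker_2 = ScriptedAgent("turker_2", ["(turker 2 reply)"])

# The list order fixes the speaking order: researcher, turker 1, turker 2.
transcript = run_turns([researcher, turker_1, turker_2], rounds=1)
for message in transcript:
    print(f"{message['id']}: {message['text']}")
```

Keeping the speaking order as nothing more than the order of the list makes it easy to try other arrangements (say, letting a turker open the conversation) without touching the loop.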