

Population Infinite Final Project: Simulated Intimacy

 

THE IDEA

Can I simulate my own relationship?

What will I learn from simulating my own relationship?

The deliverables would be two chatbots, one for me and one for my partner, trained on our Facebook message data. I wanted to create these bots and have them talk to each other. I wondered if the chatbots would fight, and if they did, could they resolve their issues better than my partner and I do?

I downloaded all of my and my partner’s Facebook message data following these steps. I planned to use a sequence-to-sequence (seq2seq) framework to train two models to speak like me and my partner.

THE PROCESS

I aimed to follow this tutorial by Adit Deshpande to create a Facebook Messenger bot. I ran into a problem immediately, though: the file I needed to feed into the first Python script was generated by fbchat-archive-parser, a tool that no longer works since Facebook updated its download process to be more user friendly. To create the file I needed (a practice file), I used p5.js to parse a long conversation JSON into a format the Python script could read. The resulting file looked like:

Sender Name: hey what’s up

My Name: Not too much, you?

let chat;

function preload() {
  // Load one exported Facebook conversation
  chat = loadJSON("message_1.json");
}

function setup() {
  const messages = chat.messages;
  // Stop one short of the end so messages[i + 1] always exists
  for (let i = 0; i < messages.length - 1; i++) {
    const current = messages[i];
    const next = messages[i + 1];
    // Print a pair only when someone else's message is followed by one from "Cara Neel"
    if (current.sender_name != "Cara Neel" && next.sender_name == "Cara Neel") {
      console.log(current.sender_name + " : " + current.content + "\n" + next.sender_name + " : " + next.content);
    }
  }
}
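Since the scripts that consume this file are written in Python, the same pairing can also be sketched in Python, which makes it easy to loop over more than one conversation file. This is a minimal sketch, not the code used in the project: `extract_pairs` and `pairs_from_files` are hypothetical names, and it assumes each export has the same `messages`, `sender_name`, and `content` keys that the p5.js sketch reads. Unlike the p5.js version, it skips entries that have no `content` key.

```python
import json


def extract_pairs(messages, my_name):
    """Return (other_line, my_line) pairs where my_name replies to someone else."""
    pairs = []
    # zip pairs each message with the one after it and stops at the last full pair
    for current, following in zip(messages, messages[1:]):
        # Assumption: some entries (photos, stickers) may carry no "content" key
        if "content" not in current or "content" not in following:
            continue
        if current["sender_name"] != my_name and following["sender_name"] == my_name:
            pairs.append((
                f'{current["sender_name"]} : {current["content"]}',
                f'{following["sender_name"]} : {following["content"]}',
            ))
    return pairs


def pairs_from_files(paths, my_name):
    """Collect pairs from several exported conversation files."""
    all_pairs = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            chat = json.load(f)
        all_pairs.extend(extract_pairs(chat["messages"], my_name))
    return all_pairs
```

Calling `pairs_from_files(["message_1.json", "message_2.json"], "Cara Neel")` would gather pairs from both files into one list; the second filename here is only an illustration.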

The next thing I did was rewrite the createDataset.py script so it could read my file. I had to update it to work with the newer version of Python I was using, and rewrite its parsing function to handle the file I had created. After this, I ran Word2Vec.py and Seq2Seq.py (after adapting them to Python 3), and I got… a fairly poorly trained model.
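To make the parsing step concrete, here is a minimal sketch of how a file of "Name : message" lines could be turned into a dictionary that maps each incoming message to the reply that followed it. It is not the tutorial's createDataset.py or the rewritten version from this project: `parse_pairs` is a hypothetical name, and the " : " separator is assumed from the output of the p5.js sketch.

```python
def parse_pairs(text, my_name, sep=" : "):
    """Map each message from someone else to the reply my_name sent right after it."""
    responses = {}
    previous = None  # most recent unanswered message from someone other than my_name
    for line in text.splitlines():
        sender, found, message = line.partition(sep)
        if not found:
            continue  # blank line or line without the separator
        if sender != my_name:
            previous = message
        elif previous is not None:
            responses[previous] = message
            previous = None
    return responses
```

Using `partition` splits only at the first separator, so a message that itself contains " : " stays intact.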


THE RESULT

I retrained the model using another, longer conversation and the responses were better, although still repetitive and strange. I trained one model on a file composed of my message data and a separate model on a file composed of my partner’s data. The last change I made was to the Seq2Seq.py test sentences. I changed them to reflect the conversations I have with my partner:

encoderTestStrings = ["hi babe",
                      "hi",
                      "hey how are you",
                      "when will you be home?",
                      "are you mad?"
                      ]

TROUBLESHOOTING

From my perspective, there are two issues that are preventing my model from running smoothly:

  1. Not enough data: For the proof of concept, I was working with one conversation (because I had only written code to parse one conversation JSON file at a time). Thus, the model was working from a small dataset.

  2. Something seems to go wrong when I run createDataset.py. Looking at the conversationsDictionary.npy file, there are tons of weird characters jumbled into the file, for reasons I haven’t pinned down. I think this is a contributing factor in why the model doesn’t perform as well.
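One possible source of the weird characters (an assumption, not something confirmed in this project) is the export itself: Facebook's JSON downloads are known to store non-ASCII characters such as curly apostrophes as UTF-8 bytes mis-read as Latin-1, so "what’s" comes out as "whatâ\x80\x99s". Separately, a .npy file is a binary NumPy format, so it will look garbled in a text editor either way. A minimal sketch of a repair step, with `repair_mojibake` as a hypothetical helper name, that could be applied to each message before building the dataset:

```python
def repair_mojibake(text):
    """Undo UTF-8 text that was mis-decoded as Latin-1.

    Returns the input unchanged when the round trip does not apply,
    for example when the text is already correct.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text
```

Plain ASCII text passes through untouched, so the function is safe to run on every message.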

 
Caroline Neel