Assignment #1

I am really interested in analyzing text, specifically my own. What types of words do I use most frequently? What types of links have I shared most often? By looking at data about me, maybe I can begin to better understand myself, or at least the way I talk and express myself. So for this assignment, I downloaded my Facebook archive and focused on my chat data. Facebook is the social media platform I have used most in my life, so it seemed like the right place to start. I received a huge folder of my messages in HTML format, so I converted each file to JSON using a Python plugin, and from there I began parsing and analyzing each file.

Using NLTK, I wrote and ran a Python script on my JSON files to find the links I have shared and the words I have used most frequently. Here are some things I discovered and found interesting about myself…
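A rough sketch of what such a script might look like is below. It is not the actual script: the JSON layout (a list of messages with "sender" and "text" fields), the sample messages, and the function name count_words_and_links are all assumptions made for illustration. It also uses the standard library's Counter in place of NLTK's FreqDist so that it runs without any extra downloads.

```python
import json
import re
from collections import Counter

# Hypothetical message layout; the real converter output may look different.
SAMPLE_JSON = """
[
  {"sender": "me", "text": "ugh I am so tired, like really tired"},
  {"sender": "me", "text": "yeah look at this https://example.com/article"},
  {"sender": "friend", "text": "like what?"},
  {"sender": "me", "text": "this one again https://example.com/article"}
]
"""

URL_RE = re.compile(r"https?://\S+")   # crude link matcher
WORD_RE = re.compile(r"[a-z']+")       # crude lowercase word tokenizer


def count_words_and_links(messages, sender="me"):
    """Count words and shared links in messages written by `sender`."""
    words = Counter()
    links = Counter()
    for msg in messages:
        if msg.get("sender") != sender:
            continue  # skip messages written by other people
        text = msg.get("text") or ""
        links.update(URL_RE.findall(text))
        # Remove links first so their pieces are not counted as words.
        text_without_links = URL_RE.sub(" ", text).lower()
        words.update(WORD_RE.findall(text_without_links))
    return words, links


messages = json.loads(SAMPLE_JSON)
word_counts, link_counts = count_words_and_links(messages)
print(word_counts.most_common(3))
print(link_counts.most_common(1))
```

With real data, the same function could be run once per JSON file and the resulting counters added together, since Counter objects support addition.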

I have said:

“like” = 3,336 times
“yes”(677) + “yea”(1397) + “yeah”(243) = 2,317 times
“no”(1109) + “nooo”(90) + “nope”(126) = 1,325 times
“sleep”(178) + “tired”(100) = 278 times
“arab”(84) + “arabic”(90) = 174 times
“ugh” = 461 times
“work”(547) + “working”(223) = 770 times
“sorry” = 220 times
“dunno”(115) + “idk”(242) = 357 times
“nouf” = 90 times
“love” = 434 times
“hate” = 232 times
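Several of the totals above add up variant spellings of the same word. A minimal sketch of that grouping step follows, using the counts from the list. The group labels and the group_totals helper are hypothetical names chosen for illustration.

```python
from collections import Counter

# Per-word counts copied from the list above.
raw_counts = Counter({
    "yes": 677, "yea": 1397, "yeah": 243,
    "no": 1109, "nooo": 90, "nope": 126,
    "dunno": 115, "idk": 242,
})

# Hypothetical grouping of variant spellings under one label.
GROUPS = {
    "yes": ["yes", "yea", "yeah"],
    "no": ["no", "nooo", "nope"],
    "unsure": ["dunno", "idk"],
}


def group_totals(counts, groups):
    """Sum the counts of every variant listed under each label."""
    return {
        label: sum(counts[word] for word in variants)
        for label, variants in groups.items()
    }


totals = group_totals(raw_counts, GROUPS)
print(totals)  # {'yes': 2317, 'no': 1325, 'unsure': 357}
```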

And hilarious articles like this that I have shared 10 times…

