I followed the latest trend and downloaded the zip archive of my Facebook data, but what I found after analyzing it was not what I was expecting.
I wanted to know what Facebook has on me, like everyone else right now, but I started to dig deeper, beyond contact data, ad clicks, and my activity history. For the first time, all the ideas I had in my head about parsing my Facebook data seemed doable. The pages were clean and they were just screaming at me, ‘use me, you know how to code something around me’. I started to write a scraper to analyze all this data: tens of thousands of messages, my history of adding friends, my vocabulary. I was curious about everything. The whole Facebook Cambridge Analytica scandal should remind us how big Facebook is in our lives, how much we use it daily, and that we exhibit behaviors there far beyond simple likes and photos.
I wanted to know my patterns and whether something had changed over the years. I have been using Facebook for many years, and Messenger was my primary mode of communication for a very long time. There had to be interesting data there; I would be very disappointed if all my work turned out dull and boring. I created a script in Ruby to parse my zip archive and I am sharing it with all of you. All you need is the zip archive of your Facebook data (it has to be in English).
Here it is for you to try after reading. Sorry about the code quality, it’s a quick hack, so forgive me:
This script will generate an Excel file with all the stats spread across worksheets: ‘Friends ranking’, ‘My message statistics’, ‘Vocabulary statistics’, ‘Contact list’, ‘Making friends’.
The first thing I was interested in was who I was writing with the most and what the ratio in our conversations looked like.
So far, so good: my best friend is in first place, followed by my current girlfriend, and I see they bring a little more engagement to the conversation, as they write more messages and words than I do. This is a very interesting statistic, probably telling me that I really prefer to talk with people who engage equally in a conversation.
So far it was fun: I have a ranking of my friends by number of messages, words, and characters. The next section brought back memories. Time for a little confession: I was a really shy type, and I sat alone at school until the end of high school. I wasn’t going to parties, drinking alcohol, or even hanging out with friends. If any of those things ever happened, it was so rare as to be almost non-existent. After I moved out of my parents’ place, which I know was the best decision of my life, I was alone and desperate for friends, and they were nowhere to be found. I never went on dates and rarely went with anyone to the cinema or for a beer; I spent my time binge-watching TV shows. But something clicked one year and I started to go out, and I mean really a lot. It wasn’t good to party that much, but I needed that: contact with girls, going to hip hop classes, caring more about my looks. And the funny thing is, you can see it in the data.
**This is my number of sent messages by year. Try to guess where I started to become more social.**
Almost 10x more in 2016, when I started to take my first steps on the social and dating scene, and then almost 40x compared to 2015. I know it’s just data from one communication channel, but Messenger had always been my main way of communicating. There is no denying that there is a strong correlation between my social interactions with people and the number of messages I was sending. In 2014 and 2015 I was at an average of 3.6 messages per day, and that just shows how lonely I was at that time. There wasn’t anyone on the other end of the line for me to talk to, to set up a meeting with, to care about my day or say goodnight. In 2017, the average is 108 messages per day. I had people to talk to, interesting people that I felt comfortable talking to and exchanging that many messages with every day. My social skills were at their peak and I felt really unbeatable when approaching girls.
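For the curious, the yearly numbers above boil down to a group-and-count. Here is a minimal sketch of that idea in Ruby. It is not the code from the script itself, and the `sent_at` sample and the `messages_by_year` and `daily_average` helpers are hypothetical names made up for illustration.

```ruby
require "date"

# Hypothetical sample: dates of messages I sent (made-up data, not my archive).
sent_at = [
  Date.new(2015, 3, 1),
  Date.new(2016, 5, 20),
  Date.new(2016, 11, 2),
  Date.new(2017, 6, 14),
  Date.new(2017, 6, 14),
  Date.new(2017, 6, 15)
]

# Count messages per calendar year, sorted by year.
def messages_by_year(dates)
  dates.group_by(&:year).transform_values(&:count).sort.to_h
end

# Average messages per day for a given year, taking leap years into account.
def daily_average(count, year)
  days = Date.leap?(year) ? 366 : 365
  (count.to_f / days).round(1)
end

messages_by_year(sent_at).each do |year, count|
  puts "#{year}: #{count} messages, #{daily_average(count, year)} per day"
end
```

Dividing by the full number of days in the year is a simplification; for the current, unfinished year you would probably want to divide by the days elapsed so far.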
Most busy messaging days
OK, even that statistic shows something wonderful about being in social mode. My busiest messaging days happen to be from June 2017, just before I met my girlfriend and made a really good impression, good enough that we are together.
And here we are, aren’t we cute together? :)
I also analyzed my history of adding new friends. The progress is still there, but this statistic is not so trustworthy, as every job change, or a sudden spike of old colleagues creating Facebook accounts, easily makes the data chaotic. But you can also see progress here.
**Making friends by year**
My most social year is also the year in which I made the most friends. But as I said before, this is not data that I trust 100%, because any change of social environment makes it unreliable.
I was also interested in when I was making new friends: was it on weekends or on working days? As I predicted, most new friendships came from going out on weekends.
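The weekend versus working day split is easy to reproduce once you have the dates on which friends were added. A small sketch under that assumption (the `weekend_vs_weekday` helper and the sample dates are hypothetical, not taken from the script):

```ruby
require "date"

# Split a list of "friend added" dates into weekend and working-day counts.
# Saturday and Sunday count as the weekend here, which is an assumption;
# you may want to treat Friday nights as weekend too.
def weekend_vs_weekday(dates)
  weekend, weekday = dates.partition { |d| d.saturday? || d.sunday? }
  { weekend: weekend.size, weekday: weekday.size }
end

# Made-up sample dates for illustration.
added_on = [
  Date.new(2017, 6, 17), # Saturday
  Date.new(2017, 6, 18), # Sunday
  Date.new(2017, 6, 19)  # Monday
]

puts weekend_vs_weekday(added_on).inspect
```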
You know what else you can get from Facebook data? Your type of person: night owl vs. early bird. It’s not that hard, just analyze your messaging patterns, broken down by hour.
A quick look and you know I am not the early bird type. I always suspected that I was a night owl, but now I have the data to back it up. I am more productive during night hours, and my most intensive coding sessions happened at night. Even this article I’m writing during the late hours. It’s easier for me to get into a flow state, and it’s also quiet.
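If you want to try the night owl check yourself, the breakdown is just a histogram over the hour of each message. This is a rough sketch, not the script's actual code; `messages_by_hour`, `busiest_hour` and the sample timestamps are hypothetical, and it assumes the timestamps are already in your local time zone.

```ruby
# Count messages for every hour of the day (0..23), including empty hours.
def messages_by_hour(times)
  counts = Hash.new(0)
  times.each { |t| counts[t.hour] += 1 }
  (0..23).map { |hour| [hour, counts[hour]] }.to_h
end

# The hour with the most messages; the earliest hour wins on ties.
def busiest_hour(by_hour)
  by_hour.max_by { |_hour, count| count }.first
end

# Made-up sample timestamps for illustration.
timestamps = [
  Time.new(2017, 6, 14, 23, 15),
  Time.new(2017, 6, 14, 23, 40),
  Time.new(2017, 6, 15, 1, 5),
  Time.new(2017, 6, 15, 9, 30)
]

by_hour = messages_by_hour(timestamps)
puts "Busiest hour: #{busiest_hour(by_hour)}:00"
```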
Let’s take a look at the vocabulary statistics. I was curious about my vocabulary. I have used 43k unique words and 366k words in total. That’s just the tip of the iceberg. What about the words I use most often? Once we get past the common words, we get to the interesting stuff. I suspected that, as I am quite a sensitive person, this should also be reflected in my writing. I will translate these Polish words into English expressions for better comprehension. I don’t want to overanalyse this, but compared to my friends, I would say it looks different. Evidently I like to express my emotions and how I feel, while people tend to hide them, especially when texting. Maybe that’s also part of being shy: it’s easier to write things than to say them in front of other people.
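Counts like the unique and total word numbers above can be sketched in a few lines. This is an illustration only, not the script's real implementation: `vocabulary_stats` is a hypothetical helper, and splitting on runs of Unicode letters is my assumption about what counts as a word (it keeps Polish characters like "ę" intact).

```ruby
# Compute total words, unique words, and the most frequent words
# for a list of message strings.
def vocabulary_stats(messages, top_n = 3)
  # \p{L}+ matches runs of letters in any alphabet, so "lubię" stays one word.
  words = messages.flat_map { |message| message.downcase.scan(/\p{L}+/) }
  counts = words.each_with_object(Hash.new(0)) { |word, acc| acc[word] += 1 }
  # Sort by count descending, then alphabetically to make ties deterministic.
  top = counts.sort_by { |word, count| [-count, word] }.first(top_n)
  { total: words.size, unique: counts.size, top: top }
end

# Made-up sample messages for illustration.
stats = vocabulary_stats(["Lubię to", "to jest to"])
puts "Total: #{stats[:total]}, unique: #{stats[:unique]}"
stats[:top].each { |word, count| puts "#{word}: #{count}" }
```

To get to the "interesting stuff" you would also filter out common stop words before ranking; that list depends on the language, so it is left out here.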
Also, I really like to talk about Taco Hemingway, a Polish hip-hop artist :) I communicate my feelings quite openly and, based on my experience, people who do that can easily become a target of bullying. That leads to closing yourself off, talking less, and taking fewer chances, because the first tries ended horribly.
So now you know my story (at least a small part of it, without going into depressing moods), expressed by data, a story that I had almost forgotten because things are so different for me now. I don’t remember the guy who was too shy to approach girls, dance with them, or go out with friends. And that’s what is really interesting about our age: we can make memories not only by writing journals and taking photos, we can also create memories by creating data. Data that we can analyze in the future and that could spark memories, even though they are sometimes a bit sad.
Want to run this script and get insights about yourself? Maybe just for fun, or maybe you will also find something interesting about yourself.
Check my GitHub repository for instructions on how to run it: