Facebook is collecting your data — 500 terabytes a day

With more than 950 million users, Facebook is collecting a lot of data. Every time you click a notification, visit a page, upload a photo, or check out a friend’s link, you’re generating data for the company to track. Multiply that by 950 million people, who spend on average more than 6.5 hours on the site every month, and you have a lot of information to deal with.

Here are some of the stats the company provided Wednesday to demonstrate just how big Facebook’s data really is:

  • 2.5 billion content items shared per day (status updates + wall posts + photos + videos + comments)
  • 2.7 billion Likes per day
  • 300 million photos uploaded per day
  • 100+ petabytes of disk space in one of FB’s largest Hadoop (HDFS) clusters
  • 105 terabytes of data scanned via Hive, Facebook’s Hadoop query language, every 30 minutes
  • 70,000 queries executed on these databases per day
  • 500+ terabytes of new data ingested into the databases every day
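
To put the figures above on a more human scale, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in the list, plus two assumptions that are not from the source: decimal units (1 TB = 10^12 bytes) and activity spread evenly over a 24-hour day. Because several figures are stated as lower bounds ("500+", "more than 950 million"), the results are approximations rather than official statistics, and the name `derived_rates` is purely illustrative.

```python
# Back-of-the-envelope rates derived from the figures quoted above.
# Assumptions (not from the source): decimal units (1 TB = 10**12 bytes)
# and activity spread uniformly over a 24-hour day.

TB = 10**12
SECONDS_PER_DAY = 24 * 60 * 60

# Figures as quoted in the article (several are lower bounds).
users = 950_000_000
ingested_per_day = 500 * TB
scanned_per_half_hour = 105 * TB
queries_per_day = 70_000
photos_per_day = 300_000_000
likes_per_day = 2_700_000_000


def derived_rates():
    """Return approximate per-user and per-second rates."""
    return {
        "ingest_bytes_per_user_per_day": ingested_per_day / users,
        "ingest_bytes_per_second": ingested_per_day / SECONDS_PER_DAY,
        # 48 half-hour windows in a day.
        "hive_scanned_tb_per_day": scanned_per_half_hour * 48 / TB,
        "queries_per_second": queries_per_day / SECONDS_PER_DAY,
        "photos_per_second": photos_per_day / SECONDS_PER_DAY,
        "likes_per_user_per_day": likes_per_day / users,
    }


for name, value in derived_rates().items():
    print(f"{name}: {value:,.2f}")
```

Under those assumptions, the ingest figure works out to roughly 526 KB of new data per user per day (about 5.8 GB per second overall), and the Hive scans add up to about 5 petabytes per day.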

“If you aren’t taking advantage of big data, then you don’t have big data, you have just a pile of data,” said Jay Parikh, VP of infrastructure at Facebook, on Wednesday. “Everything is interesting to us.”

Parikh said the company is constantly trying to figure out how to better analyze and make sense of the data, including doing extensive A/B testing on all potential updates to the site, and making sure it responds in real time to user input.

“We’re growing fast, but everyone else is growing faster,” he said.

via Facebook is collecting your data — 500 terabytes a day — Data | GigaOM.


Filed under electronic discovery, Privacy Rights

2 responses to “Facebook is collecting your data — 500 terabytes a day”

  1. The worrying thing is, what are they going to do with all the data? Will it be analysed and sold? I am one of the small number of people who don’t have a Facebook account — by the looks of it, I’ll never have one.

  2. There are several problems with Facebook keeping data for so long and to such an extent.

    1. While Facebook states that the information belongs to the User, it also states that the User allows Facebook to use the information.

    2. As a user, you can download all of your data. That means it is discoverable in a lawsuit. So ALL of that data is kept until you delete your account, and opposing counsel might have access to all of the rants/vents and pictures in a legal process (assuming it is relevant, etc.).

    3. Facebook provides in its Privacy Terms that they use the data for many purposes, including:
    (a) to see how effective their ads are for you and others; and
    (b) for internal operations (such as troubleshooting, data analysis, testing, research, and service improvement)
