Tech Threads

Data storage mechanism in Facebook

The data storage mechanism behind Facebook is both impressive and intriguing. Most of us are familiar with social media, and with Facebook in particular, where users upload about 300 million photos and generate 4.5 billion likes every day. Every 60 seconds, 510,000 comments are posted and 293,000 statuses are updated. It is natural to wonder how Facebook stores such a huge volume of data, which is impractical to handle with a traditional relational database management system (RDBMS) alone.

Facebook uses a distributed database management system called Cassandra.
Cassandra was initially developed by Facebook to power its inbox search feature. In July 2008, it was released as an open-source project on Google Code. In 2009, it became an Apache Incubator project. It is written in Java and belongs to the NoSQL family of database management systems.
Cassandra is designed for high scalability and performance, and to hold very large volumes of data across many commodity servers in a cluster. It is a schema-less database that organizes data into column families. Cassandra has proven to be fault-tolerant on commodity hardware and cloud infrastructure.
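
To make the column family idea more concrete, here is a minimal Python sketch. It is a toy in-memory model, not Cassandra's actual API or storage engine: the class `ColumnFamily`, its methods, and the sample keys such as `user:1` and `term:hello` are all hypothetical names chosen for illustration. It assumes a column family maps a row key to a set of named columns, that each column value carries a timestamp, and that the newest write wins.

```python
import time
from collections import defaultdict


class ColumnFamily:
    """Toy model: row key -> {column name: (value, timestamp)}."""

    def __init__(self, name):
        self.name = name
        self.rows = defaultdict(dict)

    def insert(self, row_key, column, value, timestamp=None):
        # Assumption: last write wins, decided by the timestamp.
        ts = time.time() if timestamp is None else timestamp
        current = self.rows[row_key].get(column)
        if current is None or ts >= current[1]:
            self.rows[row_key][column] = (value, ts)

    def get(self, row_key):
        # Return the row's columns (without timestamps), sorted by name.
        columns = self.rows.get(row_key, {})
        return {col: val for col, (val, _) in sorted(columns.items())}


# Hypothetical sample data, loosely inspired by an inbox search index.
inbox = ColumnFamily("inbox_search")
inbox.insert("user:1", "term:hello", "msg-17", timestamp=1)
inbox.insert("user:1", "term:photo", "msg-42", timestamp=2)
# Rows do not have to share the same set of columns.
inbox.insert("user:2", "term:video", "msg-99", timestamp=1)
```

Calling `inbox.get("user:1")` returns only the columns stored for that row. Because no fixed schema is declared up front, each row can carry a different set of columns, which is the flexibility the schema-less description refers to.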


Written by
Gautam Goswami

Gautam can be reached at gautambangalore@gmail.com for real-time POC development and hands-on technical training, as well as for help with the design and development of any Hadoop/Big Data related task. He is an advisor and an educator. Before that, he worked as a Sr. Technical Architect across different technologies and business domains in numerous countries.
He is passionate about sharing knowledge through blogs and training workshops on various Big Data technologies, systems, and related topics.
