Do you know which databases Twitter stores its data in?

Information Technology

Keywords: information technology


Answer

An DANG

I searched a bit and found a Quora answer along these lines. The idea is that Twitter uses many different kinds of databases, each for a different purpose.

Their GitHub page is quite informative on this topic, if you don't mind digging around a bit:

MySQL: Twitter uses MySQL heavily for primary storage of Tweets and Users, and maintains a custom fork that they recently open-sourced at https://github.com/twitter/mysql (more information on the engineering blog: http://engineering.twitter.com/2...)
FlockDB: This is Twitter's in-house graph database, which they use to store social graph information (following, etc). Ultimately it's built on MySQL, but it's still basically a proper database in its own right: https://github.com/twitter/flockdb
Memcached: Twitter uses a "heavily modified" fork of Memcached 1.4.4 which they call Twemcache. It had been in production for well over a year as of 7/2012 (https://github.com/twitter/twemc...). There is a lot more information on the engineering blog: http://engineering.twitter.com/2...
Cassandra: Twitter has spent a lot of time experimenting with Cassandra. Ultimately this led to services like Snowflake for quickly generating unique identifiers (https://github.com/twitter/snowf...). There's also a Ruby gem for Cassandra here: https://github.com/twitter/cassa...
Gizzard: This is an in-house Scala framework for building custom distributed databases (with arbitrary storage technology) that underlies a number of the other systems discussed here. GitHub page: https://github.com/twitter/gizzard
Apache Lucene: This isn't on the GitHub page, but has been talked about publicly on the engineering blog. The search index is now powered by Lucene, through a system they call Earlybird. See http://engineering.twitter.com/2... and http://engineering.twitter.com/2... for more detail.
HBase and Hadoop: Twitter uses Hadoop and HBase heavily, although this also isn't clear from the GitHub page. Check out Kevin Weil's "Elephant Bird" project on GitHub (https://github.com/kevinweil/ele...), with more information on the blog: http://engineering.twitter.com/2...
Redis: Finally, there's some experimental timeline storage technology that was developed on Redis. It's unclear whether Redis is being used in production or not at this time, but see the Haplocheirus project: https://github.com/twitter/haplo...
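
To give a feel for what "quickly generating unique identifiers" can look like, here is a minimal Python sketch of a Snowflake-style generator. It assumes the layout described in the public Snowflake README (a millisecond timestamp, a 10-bit machine id, and a 12-bit per-millisecond sequence). The class name, the `clock` parameter, the `decode` helper, and the epoch constant are assumptions made for this example; the real service is written in Scala and this is not its code.

```python
import threading
import time

# Assumed constants for illustration only.
EPOCH_MS = 1288834974657      # custom epoch in milliseconds (assumption)
MACHINE_BITS = 10             # up to 1024 machines
SEQUENCE_BITS = 12            # up to 4096 ids per machine per millisecond
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeLikeGenerator:
    """Hypothetical generator producing roughly time-ordered 64-bit ids."""

    def __init__(self, machine_id, clock=None):
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self.machine_id = machine_id
        # The clock is injectable so the sketch can be exercised deterministically.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the sequence, wrapping at 4096.
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted: spin until the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (self.machine_id << SEQUENCE_BITS)
                | self._sequence
            )


def decode(snowflake_id):
    """Split an id back into (timestamp_ms, machine_id, sequence)."""
    sequence = snowflake_id & MAX_SEQUENCE
    machine_id = (snowflake_id >> SEQUENCE_BITS) & ((1 << MACHINE_BITS) - 1)
    timestamp_ms = (snowflake_id >> (MACHINE_BITS + SEQUENCE_BITS)) + EPOCH_MS
    return timestamp_ms, machine_id, sequence
```

Because the timestamp occupies the high bits, ids generated later sort after earlier ones, and each machine can hand out ids without coordinating with the others.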

There's plenty more, and surely lots of technology that has not been publicly disclosed, but that's the majority of public information that's available on the topic.
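
The "different stores for different jobs" idea can be sketched with a durable primary store and a cache in front of it. The Python example below is only an illustration of the generic cache-aside pattern: plain dictionaries stand in for a MySQL-like primary store and a Memcached-like cache, and the `CachedTweetStore` class and its methods are hypothetical. The quoted answer does not describe how Twitter actually wires these systems together.

```python
class CachedTweetStore:
    """Hypothetical cache-aside wrapper over a primary store and a cache."""

    def __init__(self, primary, cache):
        self.primary = primary    # stands in for the durable store (e.g. MySQL)
        self.cache = cache        # stands in for the cache (e.g. Memcached)
        self.primary_reads = 0    # counts how often the cache was missed

    def get(self, tweet_id):
        # Serve from the cache when possible.
        if tweet_id in self.cache:
            return self.cache[tweet_id]
        # Cache miss: read the primary store and populate the cache.
        self.primary_reads += 1
        tweet = self.primary.get(tweet_id)
        if tweet is not None:
            self.cache[tweet_id] = tweet
        return tweet

    def put(self, tweet_id, text):
        # Write to the primary store, then invalidate any stale cached copy.
        self.primary[tweet_id] = text
        self.cache.pop(tweet_id, None)


store = CachedTweetStore(primary={1: "hello"}, cache={})
first = store.get(1)     # miss: read from the primary store
second = store.get(1)    # hit: served from the cache
```

The primary store remains the source of truth, while repeated reads of the same item are absorbed by the cache.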
