The big names in the dotcom world are busy open sourcing some of their secret sauce. It is worth becoming familiar with these often strangely named projects because they are behind several of these companies' competitive advantages. Since the list keeps growing, please suggest new projects in the comments section so they can be added.
- Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store.
- Hive is a data warehouse infrastructure that provides data summarization and ad hoc querying.
- FlashCache is a general purpose writeback block cache for Linux. It was developed as a loadable Linux kernel module using the Device Mapper, and it sits below the filesystem.
- HipHop for PHP transforms PHP source code into highly optimized C++. HipHop offers large performance gains and was developed over the past two years.
- Open Compute Project is an open hardware project that aims to accelerate data center and server innovation while increasing computing efficiency through collaboration on relevant best practices and technical specifications.
- Scribe is a scalable service for aggregating log data streamed in real time from a large number of servers.
- Thrift provides a framework for scalable cross-language services development in C++, Java, Python, PHP, and Ruby.
- Tornado is a relatively simple, non-blocking web server framework written in Python. It is designed to handle thousands of simultaneous connections, making it ideal for real-time Web services.
- codemod assists with large-scale codebase refactors that can be partially automated but still require human oversight and occasional intervention.
- Online Schema Change for MySQL lets you alter large database tables without taking your cluster offline.
- Phabricator is a collection of web applications which make it easier to write, review, and share source code. It is currently available as an early release and is used by hundreds of Facebook engineers every day.
- PHPEmbed is a library that makes embedding PHP simple. It offers a more accessible and simplified API built on top of the PHP SAPI.
- phpsh provides an interactive shell for PHP that features readline history, tab completion, and quick access to documentation. It is ironically written mostly in Python.
- Three20 is an Objective-C library for iPhone developers which provides many of the UI elements and data helpers behind Facebook's iPhone application.
- XHP is a PHP extension which augments the syntax of the language such that XML document fragments become valid expressions.
- XHProf is a function-level hierarchical profiler for PHP with a simple HTML-based navigational interface.
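To make the phrase "function-level hierarchical profiler" from the XHProf entry more concrete, here is a minimal sketch of the idea in Python. It is not XHProf's implementation or API: the `CallProfiler` class, its `profile` decorator, and the sample functions `render_page` and `load_user` are all hypothetical names made up for illustration. The sketch records how often each caller invokes each callee and how much wall-clock time is spent inside those calls.

```python
import time
from collections import defaultdict
from functools import wraps


class CallProfiler:
    """Hypothetical helper: tracks (parent, child) call edges, not single functions."""

    def __init__(self):
        self._stack = ["<root>"]              # names of functions currently running
        self.calls = defaultdict(int)         # (parent, child) -> number of calls
        self.inclusive = defaultdict(float)   # (parent, child) -> seconds, children included

    def profile(self, func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # The caller is whatever profiled function is on top of the stack.
            edge = (self._stack[-1], func.__name__)
            self._stack.append(func.__name__)
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                self.inclusive[edge] += time.perf_counter() - start
                self._stack.pop()
                self.calls[edge] += 1
        return wrapper

    def report(self):
        lines = []
        for (parent, child), count in sorted(self.calls.items()):
            millis = self.inclusive[(parent, child)] * 1000
            lines.append(f"{parent} -> {child}: {count} call(s), {millis:.3f} ms")
        return "\n".join(lines)


profiler = CallProfiler()


@profiler.profile
def load_user(user_id):
    return {"id": user_id}


@profiler.profile
def render_page(user_ids):
    return [load_user(u) for u in user_ids]


render_page([1, 2, 3])
load_user(4)
print(profiler.report())
```

Keeping the data per caller and callee pair, rather than per function, is what makes the report hierarchical: the same `load_user` function shows up once under `render_page` and once under the top level, so you can see which call path is the expensive one.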
Twitter has open sourced some complete projects (e.g. FlockDB) but mostly contributes extensions to existing projects. For a full list see here.
- Apache Traffic Server is a fast, scalable and extensible HTTP/1.1 compliant caching proxy server.
- Hadoop, the leading large-scale data processing framework at the moment, was created by Doug Cutting and Mike Cafarella and developed largely at Yahoo. Yahoo also actively contributes to several related projects such as Avro and Pig.
- Azkaban is a simple batch scheduler for constructing and running Hadoop jobs or other offline processes.
is a Faceted Search implementation written purely in Java, an extension of Apache Lucene
is a flexible, partial, out-of-order and real-time typeahead search.
is a Hadoop library for large-scale data processing.
is a deployment automation platform
Indexing engine for IndexTank
, BackOffice, Storefront, and Nebulizer for IndexTank
is a distributed publish/subscribe messaging system
is a utility package for performing operations on compressed arrays of sorted integers
is a simple persistent data store with very low latency and high throughput
is a library that provides easy cluster management and workload distribution
is a distributed, elastic, realtime, searchable database
is a distributed key-value storage system
is a real-time search and indexing system built on Apache Lucene
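One entry above describes a utility package for "compressed arrays of sorted integers". As a rough illustration of why sorted integers compress well, here is a minimal sketch of a generic delta plus variable-length-byte encoding. This is an assumption chosen for illustration, not necessarily the scheme that package uses, and the function names `encode_sorted` and `decode_sorted` are hypothetical.

```python
def encode_sorted(values):
    """Encode a sorted list of non-negative ints as gaps, 7 bits per byte."""
    out = bytearray()
    previous = 0
    for value in values:
        if value < previous:
            raise ValueError("input must be sorted and non-negative")
        gap = value - previous          # sorted input keeps the gaps small
        previous = value
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)   # high bit set: more bytes follow
            gap >>= 7
        out.append(gap)                       # high bit clear: last byte of this gap
    return bytes(out)


def decode_sorted(data):
    """Inverse of encode_sorted: rebuild the original sorted list."""
    values = []
    current = 0
    gap = 0
    shift = 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += gap
            values.append(current)
            gap = 0
            shift = 0
    return values


doc_ids = [3, 7, 8, 300, 1000000]
packed = encode_sorted(doc_ids)
print(len(packed), "bytes for", len(doc_ids), "integers")
print(decode_sorted(packed) == doc_ids)
```

Storing the differences between neighbours instead of the values themselves means most numbers fit in a single byte, which is why this kind of layout is popular for things like search index posting lists.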