Open Source Solution Index from the Big Dotcoms

January 26, 2012

The big names in the dotcom world are busy open sourcing some of their secret sauce. It is worth becoming familiar with these often strangely named projects because they are behind several of these companies' competitive advantages. Since the list is growing, please suggest new solutions in the comments section so they can be added.

Google

Facebook

  • Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store.
  • Hive is a data warehouse infrastructure that provides data summarization and ad hoc querying.
  • FlashCache is a general purpose writeback block cache for Linux. It was developed as a loadable Linux kernel module using the Device Mapper, and it sits below the filesystem.
  • HipHop for PHP transforms PHP source code into highly optimized C++. HipHop offers large performance gains and was developed over the past two years.
  • Open Compute Project is an open hardware project that aims to accelerate data center and server innovation while increasing computing efficiency through collaboration on relevant best practices and technical specifications.
  • Scribe is a scalable service for aggregating log data streamed in real time from a large number of servers.
  • Thrift provides a framework for scalable cross-language services development in C++, Java, Python, PHP, and Ruby.
  • Tornado is a relatively simple, non-blocking web server framework written in Python. It is designed to handle thousands of simultaneous connections, making it ideal for real-time Web services.
  • codemod assists with large-scale codebase refactors that can be partially automated but still require human oversight and occasional intervention.
  • Facebook Animation is a JavaScript library for creating customizable animations using DOM and CSS manipulation.
  • Online Schema Change for MySQL lets you alter large database tables without taking your cluster offline.
  • Phabricator is a collection of web applications which make it easier to write, review, and share source code. It is currently available as an early release and is used by hundreds of Facebook engineers every day.
  • PHPEmbed makes embedding PHP truly simple. It is a more accessible and simplified API built on top of the PHP SAPI.
  • phpsh provides an interactive shell for PHP that features readline history, tab completion, and quick access to documentation. Ironically, it is written mostly in Python.
  • Three20 is an Objective-C library for iPhone developers which provides many of the UI elements and data helpers behind Facebook's iPhone application.
  • XHP is a PHP extension which augments the syntax of the language such that XML document fragments become valid expressions.
  • XHProf is a function-level hierarchical profiler for PHP with a simple HTML-based navigational interface.

Twitter

Twitter has open sourced some complete projects (e.g. FlockDB) but mainly contributes extensions to existing projects. For a full list see here.

Yahoo

  • Apache Traffic Server is a fast, scalable, and extensible HTTP/1.1-compliant caching proxy server.
  • Hadoop, THE NoSQL solution at the moment, was started at Yahoo. Yahoo also actively contributes to several related projects such as Avro and Pig.
  • YUI is a free, open source JavaScript and CSS framework for building richly interactive web applications.

LinkedIn

  • Azkaban is a simple batch scheduler for constructing and running Hadoop jobs or other offline processes
  • Bobo is a Faceted Search implementation written purely in Java, an extension of Apache Lucene
  • Cleo is a flexible, partial, out-of-order and real-time typeahead search.
  • DataFu is a Hadoop library for large-scale data processing
  • Decomposer is a library for massive matrix decompositions
  • Glu is a deployment automation platform
  • A set of useful gradle plugins
  • Indexing engine for IndexTank, plus the API, BackOffice, Storefront, and Nebulizer for IndexTank
  • Kafka is a distributed publish/subscribe messaging system
  • Kamikaze is a utility package for performing operations on compressed arrays of sorted integers
  • Krati is a simple persistent data store with very low latency and high throughput
  • Base utilities shared by all LinkedIn open source projects
  • A set of utility classes and wrappers around ZooKeeper
  • Norbert is a library that provides easy cluster management and workload distribution
  • Sensei is a distributed, elastic, realtime, searchable database
  • Voldemort is a distributed key-value storage system
  • Zoie is a real-time search and indexing system built on Apache Lucene