Why can Facebook, Google, Salesforce and Twitter roll out new features every day, while regular telecom operators manage it only every 6 months? Although they are dotcoms, they have thousands of employees and a lot of legacy systems as well. Yet they are able to roll out a new feature every day, if not every hour or minute, and large new systems every few months, weeks or even days.
How do they do it, and what can the telecom industry learn from it?
On highscalability, you will find a lot of information on the architectures of large dotcoms. If you look across the different articles, you see that each of the larger dotcoms has an architecture that is shared among different products and services, e.g. scaling messages at Facebook.
This is the secret sauce of the dotcoms. They have built and continuously improved a highly distributed architecture that can handle millions of users and petabytes of information. On top of this “shared architecture” go the services. New employees are able to quickly create new services because they do not have to worry about scaling data, monitoring the service, deploying/upgrading versions, backing up data, versioning code, etc.
On the other hand, operators have no standardized shared architecture. Instead there is a puzzle of different solutions that often use totally different technologies, hardware, etc. Maintenance and upgrades are a nightmare.
Trying to launch any new service requires a massive amount of planning, lots of different skills, expensive investments in third-party licenses and hardware, etc.
How can you do it differently?
Building a private cloud with virtual servers and storage will not resolve operators’ problems. Just virtualizing the puzzle of solutions is not going to do away with complex integrations.
Operators need to make a bolder move. They need to separate the new from the old. Legacy systems should be kept and isolated, while a new architecture is built that works in parallel with them. This new architecture should focus on launching new services and partner services at dotcom speeds. Everything should be handled as an independent service, and each service should get its own API: a storage service, a billing service, a monitoring service, a provisioning service, an identity service, a data warehouse service, a deployment service, a mobile shop service, an inventory service, a support service, etc.
All APIs should use a common technology. APIs for third parties could use REST. APIs for internal high-load usage could use Thrift or Protocol Buffers. Each API should have two versions: the easy version and the low-level version. The easy API offers the most used, generally basic, functionality, e.g. sendSMS(from, to, message). The low-level API offers the complete feature set, e.g. sendBinarySMS, sendSMSWithDeliveryConfirmation, etc. This allows most services to use the easy API while still having access to the advanced functionality when needed.
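The split between the easy and the low-level API could be sketched as follows. This is only an illustration under assumptions: the class names, the `SmsRequest` fields and the in-memory "gateway" are hypothetical, and the method names are Python-style renderings of the sendSMS, sendBinarySMS and sendSMSWithDeliveryConfirmation calls mentioned above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SmsRequest:
    """All options understood by the hypothetical low-level API."""
    sender: str
    recipient: str
    payload: bytes
    binary: bool = False
    delivery_confirmation: bool = False


class LowLevelSmsApi:
    """Low-level API: exposes the complete feature set."""

    def __init__(self) -> None:
        # Assumption: an in-memory list stands in for the real SMS gateway.
        self.sent: List[SmsRequest] = []

    def submit(self, request: SmsRequest) -> int:
        # Record the request and return a simple sequential message id.
        self.sent.append(request)
        return len(self.sent)

    def send_binary_sms(self, sender: str, recipient: str, payload: bytes) -> int:
        return self.submit(SmsRequest(sender, recipient, payload, binary=True))

    def send_sms_with_delivery_confirmation(
        self, sender: str, recipient: str, message: str
    ) -> int:
        return self.submit(
            SmsRequest(
                sender,
                recipient,
                message.encode("utf-8"),
                delivery_confirmation=True,
            )
        )


class EasySmsApi:
    """Easy API: one call with default options, built on the low-level API."""

    def __init__(self, low_level: LowLevelSmsApi) -> None:
        self._low_level = low_level

    def send_sms(self, sender: str, recipient: str, message: str) -> int:
        return self._low_level.submit(
            SmsRequest(sender, recipient, message.encode("utf-8"))
        )
```

A service would normally call `EasySmsApi.send_sms` and only reach for `LowLevelSmsApi` when it needs binary payloads or delivery confirmation. Because the easy API is a thin layer over the low-level one, both stay consistent.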
Load balancing when using the services is key. The load balancer is the secret behind many rolling upgrades in the dotcom world. An application that uses a certain service will use client-based load balancing. If the load balancer is able to receive events, it becomes possible to dynamically add or remove instances of an API, gradually move requests to a new version of the API, etc.
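A minimal sketch of such an event-driven, client-side load balancer is shown below. It is an assumption about how this could work, not a description of any particular dotcom's implementation: the event format (`add`, `remove`, `set_weight`), the instance names and the weighted-random selection are all hypothetical choices made for illustration.

```python
import random
from typing import Dict, Optional


class ClientLoadBalancer:
    """Client-side balancer that reacts to events about API instances."""

    def __init__(self, rng: Optional[random.Random] = None) -> None:
        # Maps an instance name to its share of the traffic.
        self._weights: Dict[str, float] = {}
        self._rng = rng or random.Random()

    def on_event(self, event: dict) -> None:
        """Apply one event; the event shape is a hypothetical convention."""
        kind = event["type"]
        instance = event["instance"]
        if kind == "add":
            self._weights[instance] = float(event.get("weight", 1.0))
        elif kind == "remove":
            self._weights.pop(instance, None)
        elif kind == "set_weight":
            # Ignore weight changes for instances we do not know about.
            if instance in self._weights:
                self._weights[instance] = float(event["weight"])
        else:
            raise ValueError(f"unknown event type: {kind}")

    def pick(self) -> str:
        """Choose an instance at random, in proportion to its weight."""
        candidates = [(i, w) for i, w in self._weights.items() if w > 0]
        if not candidates:
            raise RuntimeError("no instances available")
        instances, weights = zip(*candidates)
        return self._rng.choices(instances, weights=weights, k=1)[0]
```

A rolling upgrade then becomes a sequence of events: add the new version with weight 0, raise its weight step by step while lowering the old version's weight, and finally remove the old instances. The application code that calls `pick()` does not change at any point.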
New service developers can now focus on building the business logic for the new service and not on data migrations, scaling, monitoring, backups, etc. The service can have completely new ways of billing and charging, a complex deployment workflow, advanced monitoring requirements, large data storage requirements, etc. However, it is not the billing or charging system that has to be extended, nor a centralized EAI, nor the monitoring system. Instead it is the service that decides what is best for the service, through its use of the easy or low-level APIs. By moving the peculiarities of every service into the service and not into generic OSS and BSS systems, these support systems can be drastically simplified.
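As a rough illustration of keeping a service's peculiarities inside the service, the sketch below puts an unusual pricing rule in a niche service while the billing API stays generic. Everything in it is hypothetical: the `BillingService` and `DoctorSchedulingService` classes, the price and the free-booking allowance are invented for the example.

```python
from typing import Dict, List, Tuple


class BillingService:
    """Hypothetical generic billing API: it only records charges."""

    def __init__(self) -> None:
        self.charges: List[Tuple[str, int, str]] = []

    def charge(self, customer_id: str, amount_cents: int, description: str) -> None:
        self.charges.append((customer_id, amount_cents, description))


class DoctorSchedulingService:
    """Hypothetical niche service; its pricing rule lives here, not in billing."""

    PRICE_PER_BOOKING_CENTS = 50  # assumed price, for illustration only
    FREE_BOOKINGS = 10            # assumed free allowance per practice

    def __init__(self, billing: BillingService) -> None:
        self._billing = billing
        self._bookings: Dict[str, int] = {}

    def book(self, practice_id: str, slot: str) -> None:
        count = self._bookings.get(practice_id, 0) + 1
        self._bookings[practice_id] = count
        # The service decides when and how much to charge; the billing
        # service needs no change to support this pricing model.
        if count > self.FREE_BOOKINGS:
            self._billing.charge(
                practice_id, self.PRICE_PER_BOOKING_CENTS, f"booking {slot}"
            )
```

A different niche service could apply a completely different pricing rule against the same `charge` call, which is what keeps the shared billing system simple.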
Operators should try to focus on launching a lot more niche services and opening up their infrastructure to a long tail of service suppliers. Instead of general services like PBX for SME, operators should think about hotel reservation services, doctor scheduling services, etc. The value of the operator should be in offering a reliable back-office architecture, assuring service quality and managing the support ecosystem. The long tail of service suppliers should be put to work launching competing niche offerings, letting customers decide which ones survive.