Hadoop has run into architectural limitations and the community has started working on the Next Generation Hadoop [NGN Hadoop]. NGN Hadoop brings new management features, of which multi-tenant application management is the major one. However, the key change is that MapReduce is no longer entangled with the rest of Hadoop. This will allow Hadoop to be used for MPI, machine learning, master-worker, iterative processing, graph processing, etc. New tools to better manage Hadoop are also being incubated, e.g. Ambari and HCatalog.
Why is this important for telecom?
Having one platform that allows massive data storage, petabyte data analytics, complex parallel computations, large-scale machine learning, big data MapReduce processing, etc. in one multi-tenant set-up means that telecom operators could see massive reductions in their architecture costs, together with faster go-to-market, better data intelligence, etc.
Telecom applications that are redesigned around this new paradigm can all use one shared back-office architecture. Having data centralized in one large Hadoop cluster, instead of tens or hundreds of application-specific databases, will enable unprecedented data analytics possibilities and bring much-needed efficiencies.
What is needed is for several large operators to define this approach as their standard architecture; telecom solution providers will then start incorporating it into their solutions. Commercial support can easily be acquired from companies like Hortonworks, Cloudera, etc.
Having one shared data architecture and multi-tenant application virtualization in the form of a Telco PaaS would allow third parties to launch new services quickly and cheaply; think days instead of years…
Most telecom projects involve installing an Oracle RAC cluster, a SAN, application server clusters, etc. Procuring and installing the basic hardware and software alone takes months. We are not even talking about the costs…
If you want to launch new ideas every month, you have to use cloud computing. This can be public, private or hybrid cloud. However, even then too much time is spent on installing and configuring software.
Infrastructure automation is about making a team productive in quickly launching new services or updating existing ones. It starts with a standardized development environment, automated build tools (e.g. Maven or Ant for Java), continuous test automation (JUnit, but also Hudson or CruiseControl), etc. The next step is to also automate the deployment of server software and cloud infrastructure (e.g. with Puppet and MCollective from Puppet Labs).
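The core idea behind such deployment tools is declarative, idempotent configuration: describe the desired state and only act when reality has drifted from it. A minimal sketch of that idea, with all names hypothetical and no real Puppet API involved:

```python
import hashlib

def config_fingerprint(config: dict) -> str:
    """Stable hash of a service's desired configuration."""
    canonical = ";".join(f"{k}={config[k]}" for k in sorted(config))
    return hashlib.sha256(canonical.encode()).hexdigest()

def needs_redeploy(deployed_fp: str, desired: dict) -> bool:
    """Redeploy only when the desired configuration has drifted
    from what is currently running (idempotent otherwise)."""
    return deployed_fp != config_fingerprint(desired)

# Hypothetical service configuration
desired = {"port": "8080", "heap": "2g", "version": "1.4.2"}
fp = config_fingerprint(desired)

print(needs_redeploy(fp, desired))                          # False: nothing changed
print(needs_redeploy(fp, {**desired, "version": "1.4.3"}))  # True: config drifted
```

Running this check on every node, every few minutes, is roughly what keeps a large fleet converged without manual installation work.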
This type of automation is often 90% identical across projects, so a standardized framework would dramatically shorten the time to get software development up and running, as well as to deploy into test and production environments.
This would, however, only be the start of the journey. Dotcoms launch new features on a weekly or even daily basis. They monitor in detail what users do and often launch multiple alternative versions of a new feature. Gradual deployment of small features reveals performance problems straight away, avoids extensive regression tests and allows fast rollback.
Let’s see how this could be applied in telecom.
The key to success is copying Google. Google has standardized architecture components that are reused among different teams (e.g. Bigtable). By building up a shared infrastructure, and the tools to quickly deploy new services onto it or update existing features, time to market can be dramatically reduced. Infrastructure is a secret competitive weapon that too few companies use.
Google has changed its basic architecture building blocks very little over the years. Everything runs on top of the Google File System and Bigtable. Except for Google Instant, which reverses the usual MapReduce usage, new services have reused the existing architecture.
Similar observations can be made for the rest of the main players. So why is it that telecom operators have not invested in one architecture to launch multiple services? No idea.
One architecture for VAS
The concept is simple. Create one common architecture. This architecture should have multiple components:
- A highly available real-time data store – stores all application and user data
- A right-time data analytics service – allows collective intelligence and data mining
- An asset exposure layer – applications can re-use network assets and get isolated from internal complexities
- Presentation layer – facilitate mobile GUI and Web 2.0 development
- Application Engine – allows applications to run and focus on business logic instead of scaling and integration
- Continuous Deployment – instead of monthly big-bang deployments, incremental daily or weekly releases are possible, even hourly like some dotcoms.
- Unified Administration – one place to know what is happening both technically and business-wise with the applications.
- Long-Tail Business Link – all business and accounting transactions for customers, partners, providers, etc. are centralized.
Having such an architecture in place would allow telco innovations to be brought to market at least ten times faster. Application and service designers could focus on business logic and nothing else. Administrators would have one platform to manage instead of a puzzle of systems. Integrations would have to be done once, against a common integration layer.
Building such an architecture should be done dotcom-style, not via a telco RFQ. Only by doing iterative projects that bring the components together can you build an architecture that is really used, rather than a side project that takes on a life of its own.
It even makes sense to open source the architecture. A telco's business is not building architectures, so a common platform started by one operator would benefit the whole industry. It would even give the originator a competitive advantage: knowing the architecture better than any competitor. Of course, for this to happen, a telco has to recognize that its future major competitors are not the neighboring telcos but the global dotcoms…
In the telecom domain a scalable real-time architecture means paying a lot of money in hardware and licenses. You buy the Oracle RAC solution, build a WebLogic cluster, set up a storage area network, etc.
In the dotcom world things look different. Facebook, Google, Twitter, Yahoo, Amazon, etc. have more active users than any telecom system. Yet they have built their architectures on top of open source solutions and average servers. Some even built their own software, and sometimes open-sourced it.
Some of this software has very exotic names: Hadoop, Bigtable, Cassandra, Pig, Elephant Bird, Dremel, Pregel, Dynamo, etc. Additionally, design decisions are taken that would surprise every IT teacher: “do not normalize”, “do not expect immediate consistency”, “no transaction support”, “store in memory instead of on disk”, etc.
However, if you can support 500 million users, 100 million daily hits, 130 TB of logs, 20 billion tweets, 1 million servers, etc., then you must be doing something right.
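The mental shift from middleware to MapReduce is easy to illustrate in miniature. The sketch below counts words over some hypothetical log lines with an explicit map phase (emit key-value pairs) and reduce phase (aggregate per key), the same split Hadoop applies across thousands of machines:

```python
from collections import defaultdict
from itertools import chain

def map_phase(line: str):
    # Mapper: emit a (word, 1) pair for every word in the input line
    return [(word.lower(), 1) for word in line.split()]

def reduce_phase(pairs):
    # Reducer: group the pairs by key and sum the counts
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# Hypothetical call-log lines
logs = ["call dropped", "call completed", "call dropped"]
pairs = chain.from_iterable(map_phase(line) for line in logs)
print(reduce_phase(pairs))  # {'call': 3, 'dropped': 2, 'completed': 1}
```

Because mappers only see one line and reducers only see one key's pairs, both phases parallelize trivially; that locality, not the word counting, is the point.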
The telecom software industry seems to have been isolated from the Internet during the last five years. With the shift to IP, it is expected that more IT companies will be able to provide telecom solutions. Is this the solution? Not sure! IT companies are still playing catch-up in the cloud computing domain, and few IT solution providers are demonstrating that they now think MapReduce instead of middleware.
Google Voice is coming, and most operators still seem more worried about subscriber churn. Google Latitude and Maps demonstrated that with new technology and innovation you can destroy the telecom monopoly on location-based services overnight…
If you are a telecom operator and you are worried, perhaps it is time we talk.
When the words “technology and innovation” come to mind, most people think about Google, Apple, Amazon, Facebook, Salesforce, etc. Just a few think about telecom operators. The biggest telecom innovation has been mobile voice. SMS was never a technological innovation but an unplanned surprise success. MMS never got close to SMS. The iPhone and Android did not come from any telecom operator or provider.
Why is it that five people with a limited budget can stun the world in months, whereas massive multinationals with deep pockets cannot?
The reason is simple: “To innovate you have to try often and kill quickly”.
Google launched Wave at the beginning of the year and “killed” it about six months later. Every day, Google makes changes to its search algorithm.
The current process
The cost for a telecom operator to innovate is massive. A simplified process would be the following:
- The marketing department receives calls and visits from every possible telecom provider on a daily basis. New ideas are thrown on the table to see if they stick.
- The marketing department selects the best ideas.
- These ideas are scanned by the different other departments, e.g. operations, finance, legal, IT, etc.
- A multi-disciplinary team is assembled to write the requirements for the new service, a.k.a. the RFI (request for information).
- Several possible telecom providers receive the RFI and provide a response.
- A budget is allocated based on the responses and an RFQ, request for quotation, is organized.
- Several telecom providers respond to the RFQ.
- A bidding war is started and one or two winners are selected. If there are no clear winners then a proof of concept is requested.
- The winner develops the solution. Operations, IT, marketing, legal, accounting, etc. all work together to launch the new service.
- The service is launched.
The whole process can easily take a year or more and cost multiple millions. If this were your money, you too would be very careful how you spent it. The end result is that only a handful of new services are launched: only those that are expected to be immediate successes.
This process is very “useful” for driving down large integration and network equipment costs. However, it does not stimulate innovation.
How can you bring innovation back to telecom?
The first step is to stop a small set of marketing people from deciding what is a good service. The only person who can legitimately decide whether a service is good is the end-user.
The dotcoms therefore launch incremental and new services very often. They monitor in detail which ones users like. Different alternatives may even be launched in parallel to see which ones users prefer. Direct feedback is critical. If a service is not picking up, or users complain about it, it gets killed quickly. Services that get good feedback are continuously improved based on users' feedback.
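Launching alternatives in parallel only works if each user consistently sees the same variant, so that feedback can be compared across groups. A common trick is deterministic hashing of the user and experiment ids; a minimal sketch, with all names hypothetical:

```python
import hashlib

def variant_for(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically bucket a user into one experiment variant.
    Hashing (experiment, user) means the same user always sees the
    same version, and different experiments split users differently."""
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# Hypothetical experiment on a new ringback-tone service
print(variant_for("user-1001", "ringback-redesign"))
print(variant_for("user-1002", "ringback-redesign"))
```

No assignment table has to be stored: any front-end node can compute the bucket, which is what makes this cheap enough to run dozens of experiments at once.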
How to apply “try often, kill quickly” to the telecom world?
The major show-stopper is telecom architectural complexity. Even when a marketing person has a good idea, it often takes months to update all the systems, because network operations, business logic and user data are scattered over multiple systems and departments.
To solve this problem, services and data should be separated from the network. Google's technological differentiator is its generic data store, Bigtable: an in-house developed, generic, high-volume, always-available data store. More than 60 services are reported to use this common data store, services as different as Docs, Maps and App Engine.
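What makes such a store "generic" is its schema-free data model: rows hold arbitrary columns, and each cell keeps timestamped versions. The toy sketch below mimics that shape in memory; it is loosely modeled on Bigtable's published data model, not on any real Bigtable API:

```python
import time
from collections import defaultdict

class MiniTable:
    """Toy wide-column store: each row maps column names to a list of
    timestamped versions, newest first."""
    def __init__(self):
        # row_key -> {column: [(timestamp, value), ...]}
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value, ts=None):
        versions = self._rows[row_key].setdefault(column, [])
        versions.append((ts if ts is not None else time.time(), value))
        versions.sort(key=lambda v: v[0], reverse=True)  # newest first

    def get(self, row_key, column):
        versions = self._rows[row_key].get(column)
        return versions[0][1] if versions else None

t = MiniTable()
t.put("user:42", "profile:name", "Alice", ts=1)
t.put("user:42", "profile:name", "Alicia", ts=2)
print(t.get("user:42", "profile:name"))  # Alicia (latest version wins)
```

Because any service can invent its own row keys and column names without schema changes, sixty different services can share one store, which is exactly the property the article argues telecom is missing.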
Google has over a million servers. Maintenance and operations are fully automated. Software is written on the assumption that hardware will fail. The hardware is not top-end but commodity, rather low-end servers. Software can easily scale across hundreds of servers.
Applications are isolated and use the servers and data through standard interfaces.
I can´t throw away my legacy
Of course an established operator cannot throw away its legacy systems. So, until we have a common data store and isolation between software and hardware, what can we do?
The trick is to start small, move quickly and use asset exposure. Isolate the legacy systems and expose simple APIs to the telecom assets. Via asset exposure, many of the “hard-coded” SS7 services can be substituted with network intelligence in the cloud. Mini applications can be written by anybody, from a large multinational to an individual developer, as long as users can pick their preferred services and applications from the “net app store”.
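Asset exposure is essentially the facade pattern at network scale: applications call a simple interface and never see the legacy protocol behind it. A minimal sketch of the idea, where both the facade and the stand-in legacy lookup are entirely hypothetical:

```python
def legacy_ss7_locate(msisdn: str) -> str:
    """Stand-in for a legacy SS7 location lookup (hypothetical);
    a real implementation would talk to the network core."""
    return f"cell-{sum(ord(c) for c in msisdn) % 100}"

class AssetExposureLayer:
    """Hypothetical facade: mini applications get a simple method per
    network asset and stay isolated from SS7 internals."""
    def location(self, msisdn: str) -> dict:
        return {"msisdn": msisdn, "cell": legacy_ss7_locate(msisdn)}

api = AssetExposureLayer()
print(api.location("32475000001"))
```

When the SS7 system behind the facade is eventually replaced by cloud-based network intelligence, only the facade's internals change; the mini applications built on top keep working.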
Data should also be transitioned to a common data store. In the beginning this might mean nightly synchronization of the different silos, but little by little the common data store should become the master of the data. Dotcoms no longer use an SQL database as a one-size-fits-all solution. Google, Yahoo, Facebook, Twitter, etc. all developed in-house solutions, and some even open-sourced them.
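The nightly-synchronization phase needs a conflict rule for users who appear in several silos. A minimal sketch, assuming a hypothetical "most recently updated silo wins" policy and per-record update timestamps:

```python
def merge_silos(silos):
    """Merge per-application silos into one common record per user.
    On conflict, the record with the newest 'updated' timestamp wins
    (a hypothetical policy; real migrations need per-field rules)."""
    common = {}
    for silo in silos:
        for user_id, record in silo.items():
            current = common.get(user_id)
            if current is None or record["updated"] > current["updated"]:
                common[user_id] = record
    return common

# Hypothetical silos with toy records
billing = {"u1": {"plan": "gold", "updated": 10}}
crm     = {"u1": {"plan": "silver", "updated": 20},
           "u2": {"plan": "basic", "updated": 5}}

merged = merge_silos([billing, crm])
print(merged)  # u1 takes the newer CRM record; u2 comes from CRM only
```

Once applications start reading from (and eventually writing to) the merged view, the common store can be promoted to master and the silos retired one by one.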
Applications should run on public or private clouds, scaling the top applications up during the day and down during the night. This keeps hardware CAPEX under control. Today too much logic is packed into proprietary hardware. Software should be separated from hardware and written in such a way that it can scale to hundreds of servers.
Development teams should not have a 12-month contract with a waterfall of requirements. Teams should be small (5-6 people) and have short iterations to deliver small incremental innovations. The dotcoms tend to release new features multiple times a week, some even multiple times a day. Get immediate feedback, and kill what is not successful. For telecom innovation teams to do the same, they should be multi-disciplinary, ideally a mix of people from the operator and strategic partners. It pays off to have a common architecture to deploy the individual services quickly. It pays off even better to have an open API, so the innovation team works on infrastructure for others to innovate on.
Each small team should have at least one person who is business and marketing focused and has the commercial responsibility to make the service a success. Different small teams under high time pressure innovate quickly. Time pressure is important: if there is no external time pressure, it has to be built internally. A simple technique is to allow people from all over the organization to take a break from their day job and take part in an innovation team. They should have clear milestones. One month to come up with an idea. Two months for a prototype. Three months to launch the first beta. Two months to get user traction. Any milestone missed means the project is stopped, or at least potentially stopped. Failure is not a shame; quite the opposite. People will go back to their day-to-day job with new ideas and new energy. After a while they might try again and succeed. Innovation and failure go hand in hand. If you cannot afford failure, you cannot get innovation…