In 1999 you could easily spend $1M on having a company build a static web site. A few years later any student could make you a web site. HTML became a commodity. The same commodity effect needs to happen to Big Data.
The past: build your own petabyte solution
A few years back only the happy few extremely technically gifted companies were able to create solutions to store TBs and even PBs of data. Google started to write papers. Yahoo and Facebook started to release open source solutions. Shortly after Big Data became a buzz word and anybody that was somebody in the IT consultancy space was talking about Hadoop.
Now: open source solutions and lots of handholding
In 2014 it is possible to download Hadoop, Spark, Storm, etc. You can even find prepackaged solutions from Hortonworks, Cloudera, MapR, Pivotal, IBM, etc. But still Big Data projects are hard. You need very bright people or spend quite a lot to get anywhere. Many projects run over budget and under deliver.
Future: instant Big Data solutions
We are ready for the next step and convert Big Data in a commodity. Several startups are launching Big Data solutions as a service. Unfortunately for many SaaS providers, having a Big Data SaaS solution is not enough. Big Data means lots of data. Data that can hold sensitive information. Data that can grow with GBs a day. This is the reason why if any SaaS Big Data solution ought to be successful, it also needs an on-premise alternative.
We are also missing a portable Big Data logic container. The industry is raving about Docker. Several startups are working on making Docker containers the way to share your map-reduce logic. I predict that many more Big Data logic can be containerised and made portable. Any data scientist should be able to reuse Deep Belief or Random Forest algorithms by just reusing a container.
The other part of the puzzle that is still missing is data visualisation and manipulation tools. There are many Big Data key-value stores and map-reduce engines. However the data visualisation and reporting space is still wide open. The Apache Foundation does not [yet] provide a drag-and-drop tool to setup dashboards, generate reports, schedule notifications, run workflows, automate data imports, etc.
Industry specific reusable assets is another part that is missing. Nobody wants to go and reinvent eCommerce recommendation algorithms every time a new Big Data platform becomes available.
However all of this is coming at enormous speeds. As soon as all the pieces of the puzzle are coming together then cloud orchestration solutions like Juju, ServiceMesh, Brooklyn, etc. will allow enterprises to start consuming Big Data solutions as a commodity. Instant Big Data solutions are 6-36 months away depending on your requirements.
Have you ever counted the number of Linux devices at home or work that haven’t been updated since they came out of the factory? Your cable/fibre/ADSL modem, your WiFi point, television sets, NAS storage, routers/bridges, media centres, etc. Typically this class of devices hosts a proprietary hardware platform, an embedded proprietary Linux and a proprietary application. If you are lucky you are able to log into a web GUI often using the admin/admin credentials and upload a new firmware blob. This firmware blob is frequently hard to locate on hardware supplier’s websites. No wonder the NSA and others love to look into potential firmware bugs. They are the ideal source of undetected wiretapping.
The next IT revolution: micro-servers
The next IT revolution is about to happen however. Those proprietary hardware platforms will soon give room for commodity multi-core processors from ARM, Intel, etc. General purpose operating systems will replace legacy proprietary and embedded predecessors. Proprietary and static single purpose apps will be replaced by marketplaces and multiple apps running on one device. Security updates will be sent regularly. Devices and apps will be easy to manage remotely. The next revolution will be around managing millions of micro-servers and the apps on top of them. These micro-servers will behave like a mix of phone apps, Docker containers, and cloud servers. Managing them will be like managing a “local cloud” sometimes also called fog computing.
Micro-servers and IoT?
Are micro-servers some form of Internet of Things. Yes they can be but not all the time. If you have a smarthub that controls your home or office then it is pure IoT. However if you have a router, firewall, fibre modem, micro-antenna station, etc. then the micro-server will just be an improved version of its predecessor.
Why should you care about micro-servers?
If you are a mobile app developer then the micro-servers revolution will be your next battlefield. Local clouds need “Angry Bird”-like successes.
If you are a telecom or network developer then the next-generation of micro-servers will give you unseen potentials to combine traffic shaping with parental control with QoS with security with …
If you are a VC then micro-server solution providers is the type of startups you want to invest in.
If you are a hardware vendor then this is the type of devices or SoCs you want to build.
If you are a Big Data expert then imagine the new data tsunami these devices will generate.
If you are a machine learning expert then you might want to look at algorithms and models that are easy to execute on constraint devices once they have been trained on potentially thousands of cloud servers and petabytes of data.
If you are a Devop then your next challenge will be managing and operating millions of constraint servers.
If you are a cloud innovator then you are likely to want to look into SaaS and PaaS management solutions for micro-servers.
If you are a service provider then this is the type of solutions you want to have the capabilities to manage at scale and easily integrate with.
If you are a security expert then you should start to think about micro-firewalls, anti-micro-viruses, etc.
If you are a business manager then you should think about how new “mega micro-revenue” streams can be obtained or how disruptive “micro- innovations” can give you a competitive advantage.
If you are an analyst or consultant then you can start predicting the next IT revolution and the billions the market will be worth in 2020.
The next steps…
It is still early days but expect some major announcements around micro-servers in the next months…
At TADHack some months ago it was clear that SMS and phone calls are out and WebRTC is the new hot technology for developers. Via your browser you can talk to your salesman, doctor and coach. Your browser can be mobile. This means that video calls will be universal as soon as 4G is everywhere. Bad news for operators that will see data on their networks balloon without new revenues. Good news for users that will have a whole new world of communication opening up with voice, video, screen sharing, web apps, etc. all seamlessly integrated.
How can business be generated with WebRTC?
Per minute call billing is out. Unless of course you are talking to a highly paid consultant that charges you by the second or minute. One time payment like mobile apps are only viable if you can embed WebRTC technology in a mobile app, not if you need to support an ongoing business. This means that we need a new subscription model for WebRTC. We need a micro subscription model. Especially for services that will be used on a long term basis, e.g. conference facilities, next generation voice mails, etc. As always operators will be hesitant to cannibalise a juicy per minute business for a low margin 1-99 cents per months subscription service. So are there others that could bill micro-subscriptions? The obvious choice would be cloud providers. They can already do hourly micro billing on monthly cycles hence adding some recurring element would be straightforward. So my prediction is that WebRTC will see operator’s problems accelerate whereby cloud will no longer deliver you only IT solutions but also your communication services.
Every day a new orchestration solution is being presented to the world. This post is not about which one is better but about what will happen if you embrace these new technologies.
The traditional scale-up architecture
Before understanding the new solutions, let’s understand what is broken with the current solutions. Enterprise IT vendors have traditionally made software that was sold based on the number of processors. If you were a small company you would have 5 servers, if you were big you would have 50-1000 servers. With the cloud anybody can boot up 50 servers in minutes, so reality has changed. Small companies can manage easily 10000 servers, e.g. think of successful social or mobile startups.
Also software was written optimised for performance per CPU. Many traditional software comes with a long list of exact specifications that need to be followed in order for you to get enterprise support.
Big bloated frameworks are used to manage the thousands of features that are found in traditional enterprise solutions.
The container micro services future
Enterprise software is often hard to use, integrate, scale, etc. This is all the consequence of creating a big monolithic system that contains solutions for as many use cases possible.
In come cloud, containers, micro-services, orchestration, etc. and all rules change.
The best micro services architecture is one where important use cases are reflected in one service, e.g. the shopping cart service deals with your list of purchases however it relies on the session storage service and the identity service to be able to work.
Each service is ran in a micro services container and services can be integrated and scaled in minutes or even seconds.
What benefits do micro services and orchestration bring?
In a monolithic world change means long regression tests and risks. In a micro services world, change means innovation and fast time to market. You can easily upgrade a single service. You can make it scale elastically. You can implement alternative implementations of a service and see which one beats the current implementation. You can do rolling upgrades and rolling rollbacks.
So if enterprise solutions would be available as many reusable services that can all be instantly integrated, upgraded, scaled, etc. then time to market becomes incredibly fast. You have an idea. You implement five alternative versions. You test them. You combine the best three in a new alternative or you use two implementations based on a specific customer segment. All this is impossible with monolithic solutions.
This sounds like we reinvented SOA
Not quite. SOA focused on reusable services but it never embraced containers, orchestration and cloud. By having a container like Docker or a service in the form of a Juju Charm, people can exchange best practice’s instantly. They can be deployed, integrated, scaled, upgraded, etc. SOA only focused on the way services where discovered and consumed. Micro services focus additionally on global reuse, scaling, integration, upgrading, etc.
We are not quite there yet. Standards are still being defined. Not in the traditional standardisation bodies but via market adoption. However expect in the next 12 months to see micro services being orchestrated at large scale via open source solutions. As soon as the IT world has the solution then industry specific solutions will emerge. You will see communication solutions, retail solutions, logistics solutions, etc. Traditional vendors will not be able to keep pace with the innovation speed of a micro services orchestrated industry specific solution. Expect the SAPs, Oracles, etc. of this world to be in chock when all of a sudden nimble HR, recruiting, logistics, inventory, supplier relationship management solutions, etc. emerge that are offered as SaaS and on-premise often open source. Super easy to use, integrate, manage, extend, etc. It will be like LEGO starting a war against custom made toys. You already know who will be able to be more nimble and flexible…
Cisco came up with the term of Fog Computing and The Wall Street Journal has endorsed it, so I guess Fog Computing will become the next hype.
What is Fog Computing?
Internet of Things will embed connectivity into billions of devices. Common thinking says your IoT device is connected to the cloud and shares data for Big Data analytics. However if your Fitbit starts sending your heartbeat every 5 seconds, your thermometer tells the cloud every minute that it is still 23.4 degrees, your car tells the manufacturer its hourly statistics, farmers measure thousands of acres, hospitals measure remote patients health continuously, etc. then your telecom operator will go bankrupt because their network is not designed for this IoT Data Tsunami.
Fog Computing is about taking decisions as close to the data as possible. Hadoop and other Big Data solutions have started the trend to bring processing close to where the data is and not the other way around. Now Fog Computing is about doing the same on a global scale. You want decisions to be taken as close to where the data is generated and stop it from reaching global networks. Only valuable data should be travelling on global networks. Your Fitbit could sent average heartbeat reports every hour or day and only sent alerts when your heartbeat passed a threshold for some amount of time.
How to implement Fog Computing?
Fog Computing is best done via machine learning models that get trained on a fraction of the data on the Cloud. After a model is considered adequate then the model gets pushed to the devices. Having a Decision Tree or some Fuzzy Logic or even a Deep Belief Network run locally on a device to take a decision is lots cheaper than setting up an infrastructure in the Cloud that needs to deal with raw data from millions of devices. So there are economical advantages to use Fog Computing. What is needed are easy to use solutions to train models and send them to highly optimised and low resource intensive execution engines that can be easily embedded in devices, mobile phones and smart hubs/gateways.
Fog Computing is also useful for Non-IoT
Also network elements should become a lot more intelligent. When was the last time you were on a large event with many people around you. Can you imagine any event in the last 24 months where WiFi was working brilliantly? Most of the time WiFi works in the morning when people are still getting in but soon after it stops working. Fog Computing can be the answer here. You only need to analyse data patterns and take decisions on what takes up lots of data. Chances are that all the mobiles, tablets and laptops that are connected to the event WiFi have Dropbox or some other large file sharing enabled. You take some pictures of things on the event and since you are on WiFi the network gets saturated by a photo sharing service that is not really critical for the event. Fog Computing would detect this type of bandwidth abuse and would limit it or even block it. At the moment this has to be done manually but computers would do a lot better job at it. So Software Defined Networking should be all over Fog Computing.
Telecom Operators and Equipment Manufacturers Should Embrace Fog Computing
Telecom operators should heavily invest in Fog Computing by making Open Source standards that can be easily embedded in any device and managed from any cloud. When I say standards, I don’t mean ETSI. I mean organise a global Fog Computing competition with a $10 million award for the best open source Fog Computing solution. Make a foundation around it with a very open license, e.g. Apache License. Invite and if necessary oblige all telecom and general network suppliers to embed it.
The alternatives are…
Not solving this problem will provoke heavy investment in global networks that carry 90% junk data and an IoT Data Tsunami. Solving this problem via network traffic shaping is a dangerous play in which privacy and net neutrality will come up earlier than later. You can not block Dropbox, YouTube or Netflix traffic globally. It is a lot easier if everybody blocks what is not needed or at least minimises such traffic themselves. Most people have no idea how to do it. Creating easy to use open source tools would be a first good step…
Adrian was speaking at Gigaom’s Structure event and one detail of Gigaom’s article struck my attention. According to them Adrian thinks: ” Google will not be a huge factor in enterprise computing”.
How can it be that one of the biggest technology companies, owner of the most advanced distributed systems in the world and the inventor of cloud computing for internal use, can not get enterprise computing?
Why is Google’s Cloud not ready for Enterprise Computing?
1) Cloud-only vision
Google is the only of the three that has a Cloud-only vision. The two others understand that enterprises will not drop everything their doing and moving overnight all systems to the cloud. Without a “VPC” or hybrid cloud vision, Google is going nowhere.
2) Focused on the visionnaires
API solutions for mobile, prediction, etc. are all good and well but most enterprises don’t know what oAuth and REST mean. They are still stuck in the Corba, J2EE/RMI, Dotnet, etc. era. Yes Google has Apps, Gmail, etc. and they can compete with Office, Exchange, etc. but most enterprise software is customised for Office integration, not yet for Apps integration.
3) Lack of exit strategy
If you are a challenger you need to convince enterprises that the risk of moving to your platform is worth it. The best strategy is to say that people can easily go back. When AWS was only starting, you had Eucalyptus being the exit strategy. What will a CTO do when Google’s prediction API becomes too expensive? In enterprises the expression has always been: “Nobody ever got fired for choosing [fill in SAP, Microsoft, Oracle, etc.]“. AWS is the dominant player. Without an exit strategy Google is a big risk for enterprises.
4) Lack of trust
Google’s Gmail is famous for reading your emails and putting targeted ads on the Internet. Snowden, the NSA and Google scare non-American enterprises.
1) Cheapest without free movement is worthless
Google is starting a price war but AWS and Azure have done a good job at locking people into their services APIs. Google should work on multi-cloud solutions that allow people to convert any software into as-a-Service, a.k.a. Anything-as-a-Service / XaaS. Make people independent of the cloud provider and price becomes the most important aspect. There are solutions already for XaaS, you just need to know where to look.
2) On-Site Option
Google should embrace OpenStack and make sure it delivers on-par with the market leader VMWare but more importantly make sure that there is a one click option to move between OpenStack on-premise and the Google cloud as well as vice-versa.
3) Easy path from yesterday to tomorrow
Are you hooked on Exchange, Oracle, SAP, etc.? There should be easy migration tools as well as solutions to encapsulate the past and make it work with the future. Instant legacy integration is possible. Again you just need to know where to look.
4) Trust & SLA
One simple message: “Google will not spy on you and will give you the best SLA of the cloud industry”.