Archive

Posts Tagged ‘big data’

IoT and personal health

November 9, 2014

I just saw Eric Dishman’s TED talk, “Health care should be a team sport”. I love the idea of giving people with chronic illnesses the means to be diagnosed and treated remotely, and of using big data to learn from large groups of patients with similar issues. Personally this would mean that when my sons have breathing problems, we would not have to drag them in the middle of the night to a hospital where they are exposed to many viruses. Instead, by measuring their oxygen level and listening to their lungs, a personalised remote diagnosis could be made and a nebulizer or other treatment administered.

At scale, all the equipment would probably cost less than £200: Maplin already sells the nebulizer and oxygen level meter for a combined £110, and add another £90 at worst for a stethoscope that can connect to a smartphone via Bluetooth. A doctor could then diagnose the results remotely via Hangouts, and in the future perhaps even a computer programme could. The results of millions of patients would be collected in order to improve treatment. No need for an expensive hospital in London with a receptionist, nurse and doctor dedicating two hours. By avoiding just one hospital night, the whole system would pay for itself many times over.

Additionally, Ubuntu’s Juju can be used to set up all the big data and diagnostic software in minutes, in any cloud or on any server anywhere in the world. If other open source solutions are used, the total solution would be within reach of any developing country. There is probably more than one developer whose kids are asthmatic and who would happily contribute time. It sounds like an ideal Gates Foundation or Kickstarter project. If you think you can help, please reach out to me, because this is not work for me, this is personal engagement.
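To make the idea concrete, here is what the device-side software could start as: a minimal Python sketch, assuming a hypothetical cloud endpoint and a placeholder oximeter read-out. None of the names below refer to a real service or device API.

    import time

    import requests  # pip install requests

    # Hypothetical endpoint and IDs -- placeholders, not a real service.
    CLOUD_API = "https://example-health-cloud.org/api/v1/readings"
    PATIENT_ID = "patient-42"

    def read_oxygen_level():
        """Stand-in for a Bluetooth pulse oximeter query; returns SpO2 in percent."""
        return 94.0

    def submit_reading():
        reading = {
            "patient": PATIENT_ID,
            "timestamp": time.time(),
            "spo2_percent": read_oxygen_level(),
        }
        # The cloud side would pool millions of such readings to improve treatment.
        response = requests.post(CLOUD_API, json=reading)
        response.raise_for_status()

    if __name__ == "__main__":
        submit_reading()

A doctor, or one day a diagnostic model trained on the pooled readings, would then look at the stream instead of at a child dragged into a waiting room at 3am.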

The Cloud Winners and Losers?

October 15, 2014

The cloud is revolutionising IT. However, there are two sides to every story: the winners and the losers. Who are they going to be, and why? If you can’t wait, here are the losers: HP, Oracle, Dell, SAP, RedHat, Infosys, VMware, EMC, Cisco, etc. The survivors: IBM, Accenture, Intel, Apple, etc. The winners: Amazon, Salesforce, Google, CSC, Workday, Canonical, Metaswitch, Microsoft, ARM and the ODMs.

Now the question is: why? And is this list written in stone?

What has cloud changed?
If you are in the hardware business (storage, networking, etc. included), then cloud computing is a value destroyer. Your organisation assumes that small, medium and large enterprises have and always will run their own data centre. You have been blown out of the water by the fact that cloud has changed this fundamental rule. All of a sudden Amazon, Google and Facebook buy specialised webscale hardware directly from your suppliers, the ODMs. Facebook open sources hardware, networking, rack and data centre designs, making it possible for anybody to compete with you. Cloud is all about scale-out and open source, hence commodity storage, software defined networks and network virtualisation functions are converting your portfolio into commodity products.

If you are an enterprise software vendor, then you always assumed that companies would buy an instance of your product, customise it and manage it themselves. You did not expect that software could be offered as a service, and that one platform could offer individual solutions to millions of enterprises. You also did not expect that software could be sold by the hour instead of licensed forever.

If you are an outsourcing company, then you assume that companies that have invested in customising Siebel will want you to run it forever and will not move to Salesforce.

Reviewing the losers
HP’s Cloud Strategy
HP has been living from printers and hardware. Meg Whitman has rightly decided to separate the cash cow, stop using it to subsidise the other, less profitable divisions, and let it be milked till it dies. The other group will focus on Cloud, Big Data, etc. However, HP Cloud is more expensive and slower moving than any of the big three, so economies of scale will push it into niche areas or make it die. HP’s OpenStack came 2-3 years late to a market that, as we will see later, is about to be commoditised. HP’s Big Data strategy? Overpay for Vertica and Autonomy and focus your marketing on the lawsuits with the former owners, not on any unique selling proposition. Also, Big Data can only be sold if you have an open source solution that people can test. Big Data customers are small startups that have quickly become large dotcoms. Most enterprises would not know what to do with Hadoop even if they could download it for free [YES, you can actually download it for free!!!].
Oracle’s Cloud Strategy
Oracle kept denying the Cloud existed until even its most laggard customers started asking questions. Until very recently you could only buy Oracle databases by the hour from Amazon. Oracle has been milking the enterprise software market for years, paying surprise visits to audit your usage of its database and sending you an unexpected bill. Recently it has started to cloud-wash [and Big Data-wash] its software portfolio, but Salesforce and Workday are already too far ahead to catch. A good Christmas book Larry could buy from Amazon would be “The Innovator’s Dilemma”.
Dell’s Cloud Strategy
Go to the main Dell page and you will not find the words “Big Data” or “Cloud”. I rest my case.
SAP’s Cloud Strategy
Workday is working hard on making SAP irrelevant. Salesforce overtook Siebel, and Workday is likely to do the same with SAP. People don’t want to manage their ERP themselves.
RedHat’s Cloud Strategy
[I work for their biggest competitor] A RedHat salesperson to customers: there are three versions. Fedora if you need innovation but don’t want support. CentOS if you want free but no security updates. RHEL, expensive and old, but with support. Compare this to Canonical: there is only one Ubuntu; it is innovative, free to use, and if you want support you can buy it separately.
For Cloud, the story is that RedHat is three times cheaper than VMware, and that your old stuff can be kept working as long as you want, according to a prescribed recipe. Compare this with an innovator that wants to completely commoditise OpenStack [ten times cheaper] and bring the most innovative and flexible solution [any SDN, any storage, any hypervisor, etc.] that instantly solves your problems [deploy different flavours of OpenStack in minutes without needing any help].
Infosys or any outsourcing company
If the data centre is going away, then the first thing to go with it is that CRM solution we bought in the ’90s from a company that no longer exists, together with the outsourcing contract to run it.
VMware
For the company that brought virtualisation into the enterprise, it is hard to admit that once you put a REST API in front of virtualisation, enterprises no longer need their solution in every data centre.
EMC
Commodity storage means that scale-out storage can be offered at a fraction of the price of a regular EMC SAN solution. The big killer, however, is Amazon’s S3, which can give you unlimited storage in minutes without worries.
Cisco
A Cisco router is an extremely expensive device that is hard to manage and built on top of proprietary hardware, a proprietary OS and proprietary software. What do you think will happen in a world where cheap ASICs plus commodity CPUs, a general purpose OS and many thousands of network apps from an app store become available? Or worse, where a network no longer needs many physical boxes because most of it is virtualised?
What does it mean to be a cloud loser?
Being a cloud loser means that your existing cash cows will be crunched by disruptive innovations. Does this mean that losers will disappear or cannot recover? Some might disappear. However, if smart executives in these losing companies were given the freedom to bring to market new solutions that build on top of the new reality, they might come out stronger. IBM has shown many times that it is able to do so.

Let’s look at the cloud survivors.
IBM
IBM has shown over and over again that it can reinvent itself. It sold its x86 server business to show its employees and the world that the future is no longer there. In the past it bought PwC’s consulting arm, which will keep reinventing new service offerings for customers that are lost in the cloud.
Accenture
Just like PwC’s consulting arm within IBM, Accenture will have consultants that help people make the transition from the data centre to the cloud. Accenture will not be leading the revolution, but it will be a “me-too” player that can put more people on the ground faster than others.
Intel
x86 is not going to die soon. The cloud just means others will be buying it. Intel will keep trying to innovate in software and go nowhere [e.g. Intel’s Hadoop was going to eat the world], but at least its processors will keep it above water.
Apple
Apple knows what consumers want, but it still needs to prove it understands enterprises. A locked-in world is fine for consumers, but enterprises don’t like it. Either Apple comes up with a creative solution or the billions will not keep growing.
What does it mean to be a cloud survivor?
Being a cloud survivor means that your key cash cows will not be killed by the cloud. It is no guarantee that the company will grow. It just means that in this revolution, the eye of the tornado rushed over your neighbour’s house, not yours. You can still suffer lots of collateral damage…

Amazon
IaaS = Amazon. No further words needed. Amazon will extend GovCloud into Health Cloud, Bank Cloud, Energy Cloud, etc. and remove the main laggard’s argument: “for legal & security reasons I can’t move to the cloud”. Amazon currently has 40-50 Anything-as-a-Service offerings; in 36 months it will have 500.
Salesforce
PaaS & SaaS = Salesforce. Salesforce will become more than a CRM on steroids; it will be the world’s business solutions platform. If there is no business solution for it on Salesforce, then it is not a business problem worth solving. They are likely to buy competitors like Workday.
Google
Google is the king of the consumer cloud. Google Apps has taken the SME market by storm. Its enterprise cloud, however, is not going anywhere soon. Google was too late with IaaS and, unlike its competitors, is not solving on-premise transitional problems. With Kubernetes, Google will re-educate the current star programmers, revolutionise over time the way software is written and managed, and might win in the long run. Google’s cloud future will be decided in 5-10 years. They invented most of this technology, and only showed the world five years later in a paper.
CSC
CSC has moved away from being a bodyshop to having several strategically important products for cloud orchestration and big data. They have a long-term focus on the future, employing cloud visionaries like Simon Wardley, that few others match. You don’t win a cloud war in the next quarter. It took Simon four years to take Ubuntu from 0% to 70% on public clouds.
Workday
What Salesforce did to Oracle’s Siebel, Workday is doing to SAP. Companies that have bought into Salesforce will easily switch to Workday in phase two.
Canonical
Since RedHat is probably reading this blog post, I can’t be explicit. But a company of 600 people that controls up to 70% of the operating systems on public clouds and more than 50% of OpenStack, brings out a new server OS every six months, a phone OS in the coming months, a desktop every six months and a complete cloud solution every six months, can convert bare metal into virtual-like cloud resources in minutes, enables anybody to deploy/integrate/scale any software on any cloud or bare-metal server [Intel, IBM Power 8, ARM 64], and is on a mission to completely commoditise cloud infrastructure via open source solutions in 2015, deserves to make it onto the list.
Metaswitch
Metaswitch has been developing network software for the big network guys for years. These big network guys would put it in a box and sell it at an extremely high price. In a world of commodity hardware, open source and scale-out, Clearwater and Calico have catapulted Metaswitch onto the list of most innovative telecom suppliers. Telecom providers will be like cloud providers: they will go to the ODM that really knows how things work and ignore the OEM that just puts a brand on the box. The Cloud still needs WAN networks. Google Fibre will not rule the world in one day. Telecom operators will have to spend their billions with somebody.
Microsoft
If you are into Windows you will be on Azure and it will be business as usual for Microsoft.
ARM
In an ODM-dominated world, ARM processors are likely to move from smartphones into networking and into the cloud.
ODM
Nobody knows them, but they are the ones designing everybody’s hardware. Over time Amazon, Google and Microsoft might make their own hardware, but for the foreseeable future they will keep buying it “en masse” from ODMs.
What does it mean to be a cloud winner?
Billions and fame for some, large take-overs or IPOs for others. But the cloud war is not over yet. Winning the first battles does not mean enemies can’t invent new weapons or join forces. So the war is not over, it is just beginning. History is written today…

The next IT revolution: micro-servers and local cloud

Have you ever counted the number of Linux devices at home or work that haven’t been updated since they came out of the factory? Your cable/fibre/ADSL modem, your WiFi access point, television sets, NAS storage, routers/bridges, media centres, etc. Typically this class of devices hosts a proprietary hardware platform, an embedded proprietary Linux and a proprietary application. If you are lucky you are able to log into a web GUI, often using the admin/admin credentials, and upload a new firmware blob, which is frequently hard to locate on the hardware supplier’s website. No wonder the NSA and others love to look into potential firmware bugs. These devices are an ideal source of undetected wiretapping.

The next IT revolution: micro-servers
The next IT revolution is about to happen, however. Those proprietary hardware platforms will soon give way to commodity multi-core processors from ARM, Intel, etc. General purpose operating systems will replace their proprietary embedded predecessors. Static, single-purpose proprietary apps will be replaced by marketplaces and multiple apps running on one device. Security updates will be sent regularly. Devices and apps will be easy to manage remotely. The next revolution will be about managing millions of micro-servers and the apps on top of them. These micro-servers will behave like a mix of phone apps, Docker containers and cloud servers. Managing them will be like managing a “local cloud”, sometimes also called fog computing.
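What would managing such a local cloud look like? Below is a minimal Python sketch of a device-side agent, assuming a hypothetical fleet-management service; the URL, device ID and reply format are invented for illustration.

    import time

    import requests  # pip install requests

    FLEET_API = "https://example-fleet-manager.org/api/v1"  # hypothetical service
    DEVICE_ID = "router-00:1a:2b:3c:4d:5e"

    def heartbeat():
        """Report installed apps and versions; learn which updates are pending."""
        state = {
            "device": DEVICE_ID,
            "apps": {"firewall": "1.2.0", "parental-control": "0.9.1"},
        }
        reply = requests.post(FLEET_API + "/heartbeat", json=state)
        reply.raise_for_status()
        for update in reply.json().get("pending_updates", []):
            print("would download, verify and apply: %s" % update)

    # Unlike today's fire-and-forget firmware, the device phones home regularly,
    # so security updates can reach millions of micro-servers without manual work.
    while True:
        heartbeat()
        time.sleep(3600)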

Micro-servers and IoT?
Are micro-servers some form of Internet of Things? They can be, but not always. If you have a smart hub that controls your home or office, then it is pure IoT. However, if you have a router, firewall, fibre modem, micro-antenna station, etc., then the micro-server will just be an improved version of its predecessor.

Why should you care about micro-servers?
If you are a mobile app developer, then the micro-server revolution will be your next battlefield. Local clouds need “Angry Birds”-like successes.
If you are a telecom or network developer, then the next generation of micro-servers will give you unseen potential to combine traffic shaping with parental control with QoS with security with …
If you are a VC, then micro-server solution providers are the type of startups you want to invest in.
If you are a hardware vendor, then this is the type of device or SoC you want to build.
If you are a Big Data expert, then imagine the new data tsunami these devices will generate.
If you are a machine learning expert, then you might want to look at algorithms and models that are easy to execute on constrained devices once they have been trained on potentially thousands of cloud servers and petabytes of data.
If you are a DevOps engineer, then your next challenge will be managing and operating millions of constrained servers.
If you are a cloud innovator, then you are likely to want to look into SaaS and PaaS management solutions for micro-servers.
If you are a service provider, then this is the type of solution you want the capabilities to manage at scale and integrate with easily.
If you are a security expert, then you should start thinking about micro-firewalls, anti-micro-viruses, etc.
If you are a business manager, then you should think about how new “mega micro-revenue” streams can be obtained, or how disruptive “micro-innovations” can give you a competitive advantage.
If you are an analyst or consultant, then you can start predicting the next IT revolution and the billions the market will be worth in 2020.

The next steps…
It is still early days, but expect some major announcements around micro-servers in the coming months…

Software Defined Everything

The other day taxis in London were on strike because Uber was setting up shop in the city. Do you know a lot of people that still send paper letters? Book holiday flights via a travel agent? Buy books in bookstores? Rent DVD movies?

Five smart programmers can bring down a whole multi-billion industry and change people’s habits. It has long been known that any company that changes people’s habits becomes a multi-billion company. Cereals for breakfast, brown-coloured sweet water, throw-away shaving equipment, the online bookstore, online search & ads, etc. You probably figured out the names of the brands already.

Software Defined Everything is Accelerating

The Cloud, crowdfunding, open source, open hardware, 3D printing, Big Data, machine learning, Internet of Things, mobile, wearables, nanotechnology, social networks, etc. all seem like individual technology innovations. However, things are changing.

Your Fitbit will send your vital signs via your mobile to the cloud, where deep belief networks analyse them and find out that you are stressed. Your smart hub detects you are approaching your garage, and your Arduino controller, linked to an IP camera in a 3D-printed housing, detects that you brought a visitor. A LinkedIn and Facebook image scan finds that your visitor is your boss’s boss. Your Fitbit and Google Calendar have given away over the last months that whenever you have a meeting with your boss’s boss, you get stressed. His music preferences are guessed from public information available on social networks. Your smart watch gets a push notification with the personal profile data that could be gathered about him: he has two boys and a girl, got recently divorced, the girl recently won a chess award, a tagged Facebook picture shows him in a golf tournament three weeks ago, an Amazon book review indicates that he likes Shakespeare but only the early work, etc. All of a sudden your house shows pictures of that one time you played golf. Music plays according to what 96.5% of Shakespeare lovers like, from a crowd-funded Bluetooth in-house speaker system…

It might be a bit far-fetched, but what used to be disjoint technologies and innovations are fast coming together. Those companies that can both understand the latest cutting-edge innovations and apply them to improve their customers’ lives or solve business problems will have a big competitive edge.

Software is fast defining more and more industries. Media, logistics, telecom, banking, retail, industry, even agriculture will see major changes due to software (and hardware) innovations.

What should you do if you are technology-savvy?

You should look for customers that want faster horses and draw a picture of a car. Make a slide deck. Get feedback and adjust. Build a prototype. Get feedback and adjust. Create a minimum viable product. Get feedback and adjust… Change the world.

What if you have a business problem and money but are not technology-savvy?

Organise a competition in which you ask people to solve your problem and give prizes to the best solutions. You will be amazed by what can come out of these.

What if you work in a traditional industry and think software is not going to redefine what you do?

Call your investment manager and ask whether you have enough money in the bank to retire in case you got fired next year and couldn’t find a job any more. If the answer is no, then start reading this blog post again from the top…

The future of Big Data is linked to Cloud

Data volumes are growing exponentially. Unstructured data from Twitter, LinkedIn, mailing lists, etc. has the potential to transform many industries if it could be combined with structured data. Machine learning, natural language processing, sentiment analysis, etc.: everybody talks about them, but hardly anybody is really using them at scale. Unfortunately, too many people who talk about Big Data start with the answer and then ask what the problem is. The answer seems to be Hadoop. News flash: Hadoop is not the answer, and if you start from the answer and look for problems, you are doing it wrong.

What are Common Data Problems?

Most Big Data problems are about storage and reporting. How do I store all the exponentially growing data in such a way that business managers can get to it in seconds when they need it? Ad-hoc reporting, adequate prediction, and making sense of the exponentially growing data stream are the key problems.

Big Data Storage?

Do you have relational data, unstructured data, graph data, etc.? How do you store different types of data and make them available inside an enterprise? The basis for big data storage is cloud storage technology. You want to store any type of data and be able to quickly scale up storage. RedHat did not buy Inktank for $175M because traditional storage has solved all of today’s problems. Premium SANs and other storage technologies are old school. They are too expensive for Big Data. They were designed with the idea that each byte of data is critical for an enterprise. That is no longer the case. You mind losing transactional sales data; you don’t mind so much losing sample tweets you bought from Datasift, or Apache log files from an internal low-impact server. This is where cloud storage solutions like Inktank’s Ceph allow commodity storage to be built that is reliable, scalable and extremely cost effective. Does this mean you don’t need SANs any more? Wrong again. TV did not kill radio. Same here.

Cloud storage technologies are needed because each type of data behaves differently. If you have log data that is only appended, then HDFS is fine. If you have read-mostly data, then a relational database is ideal. If you have write-mostly data, then you need to look at NoSQL. If you need heavy reads and writes, then you need strong Big Data architecture skills. What is more important: low latency, consistency, reliability, cheap storage, etc.? Each answer points to a different solution. Low latency means in-memory or SSD. Consistency means transactional. Reliability means replication. You can now even find approximate databases like BlinkDB that trade accuracy for speed. There is no longer one size that fits all. Oracle is no longer the answer to everybody’s data questions.

What will companies need? Companies need cloud storage solutions that offer these different storage capabilities as a service. Amazon’s RDS, DynamoDB, S3 and Redshift are examples of what companies need. However, companies need more flexibility. They need to be able to migrate their data between public cloud providers, to optimise their costs and for added security. They also need to be able to store data in private local clouds, or in nearby hosted private clouds, for latency or regulatory reasons.
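The “as a service” part already works today. Here is a minimal sketch using the boto library against Amazon S3; the bucket name and paths are illustrative, and AWS credentials are assumed to be configured in the environment.

    import boto  # pip install boto

    # Create a bucket and archive a log file; capacity scales with no planning.
    conn = boto.connect_s3()
    bucket = conn.create_bucket("acme-log-archive")  # illustrative name

    key = bucket.new_key("apache/2014-10-15/access.log")
    key.set_contents_from_filename("/var/log/apache2/access.log")

    # Minutes later the same bytes are readable from anywhere in the world.
    print(key.get_contents_as_string()[:200])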

The future of ETL & BI

Traditional ETL will see a revolution. ETL never worked. Business managers don’t want to ask their IT department to change a star schema in order to import some extra data from the Internet, followed by updates to reports and dashboards. Business managers want an easy-to-use tool that can answer their ad-hoc queries. This is the reason why Tableau Software + Amazon Redshift are growing like crazy. However, if your organisation is starting to pump terabytes of data into Redshift, be warned: the day will come when Amazon sends you a bill that your CxO will not want to pay, and he/she will want you to move out of Amazon. What will you do then? Do you have an exit strategy?

The future of ETL and BI is web tools that any business manager can use to create ad-hoc reports. The Office generation wants dynamic HTML5 GUIs that allow them to drag-and-drop data queries into ad-hoc reports and dashboards. If you need training, then the tool is too difficult.

These next-generation BI tools will need dynamic back-office solutions that allow storing real-time, graph, blob, historical relational, unstructured and other data in a commonly accessible cloud storage solution. Each type will be hosted by a different cloud service, but they will all be an API away. Software will be packaged in such a way that it knows how to export its own data. Why should you need to know where Apache stores its access and error logs, and in which format? Apache should be able to export whatever interesting information it contains in a standardised way into some deep storage. Machine learning should be used to decide how best to store that data for ad-hoc reporting afterwards. Humans should no longer be involved in this process.
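As a sketch of what such self-exporting packaging could look like, here is Apache’s access log turned into standardised JSON records. The regex and the output schema are my own assumptions, not an existing standard.

    import json
    import re

    # Simplified pattern for Apache's common log format.
    LOG_LINE = re.compile(
        r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
        r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)')

    def export_access_log(path):
        """Yield one standardised JSON record per parseable log line."""
        with open(path) as log:
            for line in log:
                match = LOG_LINE.match(line)
                if match:
                    yield json.dumps(match.groupdict())

    # Each record can be shipped to deep storage; nobody downstream needs to
    # know where Apache keeps its logs or which format it uses.
    for record in export_access_log("/var/log/apache2/access.log"):
        print(record)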

Talking about machine learning: with data volumes growing from gigabytes into petabytes, traditional data scientists will not scale. In many companies a data scientist is treated like a report monkey: “find out why in region X we sold Y% less”, etc. Data scientist should not be a synonym for dynamic report generator. Data scientists should be machine learning experts. They should tell the computer what they want, not how they want it. Today’s data scientists pride themselves on knowing R, Python, etc. These tools are too low-level to be usable at scale. There are just not enough people in the world to learn R; data grows exponentially, while R experts at best grow linearly. What we need are machine learning GUI solutions like RapidMiner Studio, but backed by petabyte-scale cloud solutions. A short-term solution could be an HTML5 GUI version of RapidMiner Studio that connects to a back-end set of cloud services using some of the nice Apache Spark extensions for machine learning, streaming, Big Data warehousing/SQL, graph retrieval, etc., or solutions based on Druid.io. Other solutions are certainly possible.
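The point is the interface, not the algorithm: the expert declares what to learn and the cluster works out how. A minimal PySpark MLlib sketch follows; the input path and the CSV layout are invented for illustration.

    from pyspark import SparkContext
    from pyspark.mllib.classification import LogisticRegressionWithSGD
    from pyspark.mllib.regression import LabeledPoint

    sc = SparkContext(appName="churn-model")

    def parse(line):
        """Each line: label,feature1,feature2,... (an invented CSV layout)."""
        values = [float(x) for x in line.split(",")]
        return LabeledPoint(values[0], values[1:])

    # The same two lines work whether the file holds gigabytes or petabytes;
    # Spark, not the data scientist, decides how to distribute the work.
    data = sc.textFile("hdfs:///warehouse/customer-features.csv").map(parse)
    model = LogisticRegressionWithSGD.train(data, iterations=100)

    print(model.predict([0.5, 1.2, 3.4]))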

What is important is that companies start realising that data is becoming a strategic weapon. Those companies that are able to collect more of it and convert it into valuable knowledge and wisdom will be tomorrow’s giants. Most average machine learning algorithms become substantially better just by throwing more and more data at them. This means that having a Big Data architecture is not as critical as having the best-trained models in the industry and continuing to train them. There will be a data divide between the haves and have-nots. Google, Facebook, Microsoft and others have been buying any startup that smells like Deep Belief Networks, and with good reason. They know that tomorrow’s algorithms and models will be more valuable than diamonds and gold. If you want to be one of the haves, then you need to invest in cloud storage now. You need massive historical data volumes to train tomorrow’s algorithms, so start building the foundations today…


Solving the pressing need for Linux talent…

September 17, 2013

The Linux Foundation recently shared the infographic below; click on it to get the associated report. The short message: if you are a Linux expert, you are in high demand, because thanks to the Cloud and Big Data boom companies cannot find enough experts.

Unless cloning machines are invented later this year, the number of Linux experts is unlikely to expand quickly. This means the total cost of ownership for enterprises is likely to rise. This is ironic, since Linux is all about open source and provides some of the most amazing solutions for free.

The obvious alternative is to focus on Microsoft products. They are relatively cheap in total cost of ownership, since licences are “payable” and average Windows skills are easier to find.

However, Microsoft is losing the server war, especially in the web application space. So this is not a winning strategy if you are going to do Cloud.

How to solve the pressing need for Linux talent?

The only possible strategy is to lower the number of experts needed per company. Larger companies will always need some, but they should be focused on the interesting, high-value tasks. This concept of interesting and high-value is key. With the number of cloud servers exploding, we cannot expect the number of experts to explode too.

Open source products like Puppet and Chef have helped to alleviate the pain for the more “skilled” companies. One DevOps engineer is able to manage more than ten times as many machines as before. Unfortunately these server provisioning tools are not for the faint of heart. They require experts that know both administration and coding.

It is time for the next generation of tools. Ubuntu, the #1 Cloud operating system, is leading the way with Juju. If Linux wants to remain successful, then the common problems, the boring problems, the repetitive problems, etc. should be solved, solved by Linux gurus in such a way that we, the less IT-gifted, can get instant solutions to these common problems.

We need a Linux democracy in which the less skilled, unfortunately the majority, can instantly reuse best-in-class blueprint solutions. Juju is a new class of tool that gives you instant solutions to all those common problems: scaling a web application, monitoring your infrastructure, sharding MongoDB, replicating a database, installing a Hadoop cluster, setting up continuous integration, etc. The individual software components have been “charmed”. A Charm allows the software to be instantly deployed, integrated and scaled. However, the real revolution is just starting. Juju will have bundles pretty soon. Technically speaking, a bundle is a collection of pre-configured and integrated Charms. In layman’s terms, a bundle is an instant solution to a common problem. You deploy a bundle instantly [one command or drag-and-drop] and you get a blueprint solution. Since Juju is open source, the community can create as many instant solutions as there are common problems.

So if you want to scale your IT solutions without stretching your budget or cloning your employees, and without the lock-in of proprietary and expensive commercial software, then you should try Juju. Play with the GUI or install Juju today.

MapD – Massively Parallel GPU-based database

An MIT student recently created a new type of massively distributed database, one that runs on graphics processors instead of CPUs. MapD, as it has been called, makes use of the immense computational power available in the off-the-shelf graphics cards found in any laptop or PC. MapD is especially suitable for real-time querying, data analysis, machine learning and data visualisation. MapD is probably only one of many databases that will try new hardware configurations to cater for specific application use cases.

Alternative approaches could focus on large sets of cheap mobile processors, Parallella processors, Raspberry Pis, etc., all stitched together. The idea would be to create massive processing clouds based on cheap specialised hardware that could beat traditional CPU clouds on both price and performance, at least for some specific use cases…
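To see why graphics cards fit this workload, note that a WHERE-clause scan is embarrassingly parallel: every row can be tested by its own GPU thread. Below is a toy sketch (nothing to do with MapD’s actual code) using numba’s CUDA support; it assumes an NVIDIA GPU and the numba package.

    import numpy as np
    from numba import cuda  # pip install numba; requires an NVIDIA GPU

    @cuda.jit
    def scan_filter(values, threshold, flags):
        """Each GPU thread evaluates the WHERE clause for one row."""
        i = cuda.grid(1)
        if i < values.size:
            flags[i] = 1 if values[i] > threshold else 0

    rows = np.random.rand(10 ** 7).astype(np.float32)  # a ten-million-row column
    flags = np.zeros(rows.size, dtype=np.int32)

    threads_per_block = 256
    blocks = (rows.size + threads_per_block - 1) // threads_per_block
    scan_filter[blocks, threads_per_block](rows, 0.99, flags)

    print("matching rows:", flags.sum())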
