Many developers and DevOps engineers perform a lot of repetitive tasks every day. One of them is deploying a web app and scaling it. We all know the theory of deployment: install an app server, install a database, deploy your app on the app server and your data in the database.
Scaling is also a common problem; however, many people already have answers for it: put a load balancer in front, duplicate your app server, create database slaves for read-only data, create a database cluster for high volumes of writes, use in-memory or NoSQL databases for extremely high write volumes, use memcached to avoid going to the database, use Varnish to avoid going to the web server, etc.
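To make two items from that list a bit more tangible, here is a minimal Python sketch of round-robin load balancing and a cache-aside lookup. Everything in it is an assumption for illustration: the server names are made up, `FAKE_DB` stands in for a real database and a plain dict stands in for memcached. A real deployment would use an actual load balancer and cache client.

```python
import itertools

# Hypothetical app servers sitting behind a load balancer.
APP_SERVERS = ["app1", "app2", "app3"]
_rotation = itertools.cycle(APP_SERVERS)


def pick_server():
    """Round-robin: hand each new request to the next server in turn."""
    return next(_rotation)


# Stand-ins for a real database and a memcached instance (assumptions).
FAKE_DB = {"user:1": "alice", "user:2": "bob"}
cache = {}
stats = {"db_reads": 0}


def get(key):
    """Cache-aside: try the cache first, fall back to the database on a miss."""
    if key in cache:
        return cache[key]
    stats["db_reads"] += 1
    value = FAKE_DB.get(key)
    if value is not None:
        cache[key] = value  # the next read for this key skips the database
    return value
```

Calling `get("user:1")` twice only touches the database once, which is the whole point of putting memcached in front of it.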
So these are not new problems but common, recurring tasks for DevOps engineers and developers. What if instant solutions were made available so that anybody in the world, regardless of their level of knowledge, could instantly install a scalable solution?
At Ubuntu we think Open Source blueprint solutions for these common problems should be within everybody’s reach. Instantly deploying and scaling a Rails app on any cloud is already a reality: https://juju.ubuntu.com/docs/howto-rails.html. The next step is to make it even easier: one command or drag-and-drop to deploy a complete stack in high-availability, or even one command to get continuous deployment and high-availability at once. This is exactly why we are organizing a contest to win $10,000 with 6 categories. Two of them should be familiar to you now: high-availability and continuous deployment.
Can you imagine the extra time you would gain if all common recurring problems instantly disappeared? Especially when you consider that what is common and recurring for some experts might be rocket science for the rest of us. If you haven’t played around with Juju, this is the best time ever…
In this post I want to show a technique that is an alternative to creating a business case: “Lean Canvas”. Lean Canvas was proposed in the book “Running Lean”, which itself is based on “Lean Startup”.
The idea of Lean Canvas is to put what would go into the executive summary of a business case on one page and to forget about writing the rest of the business case. The justification is that writing a business case takes 2 to 3 months and CxOs normally only read the executive summary. So instead of spending 2 to 3 months, you spend hours or days and get it in front of customers to get feedback. With the feedback you can then refine your idea and create a Minimum Viable Product in the same 2 to 3 months. So instead of having a nice paper report nobody reads, you can start earning money.
The Lean Canvas contains the major customer problems you want to solve. These customer problems need to be important [a painkiller, not a vitamin], shared by many and without an easy workaround. Customers need to validate them before you start thinking about solutions. Customers are the ideal party to tell you about their problems but not necessarily the best source of ideas for a solution. Think about Henry Ford’s words: “If I had asked what people wanted they would have said faster horses…”. Most startups focus excessively on the solution and forget that they need to validate a lot more things. After the problem, the second most important part of the Lean Canvas is the customer segment and channel: who do you want to offer a product to and how do you reach them? The unique value proposition is key as well. The other elements of the Lean Canvas are the unfair advantage [how can I prevent others from simply copying my business?], key metrics [how can I measure success?] and last but not least the cost structure [what does it cost to acquire a customer, build a minimum viable product, etc.?] and revenue streams [how much am I going to charge and what other revenue sources are there?]. You can create a Lean Canvas on paper or use a SaaS version.
So much for the theory; now let’s review an example…
The customer problems:
Door keys are a nuisance. You can lose them. You have to give copies to family and friends if you want them to be able to get into your house when you are not there. Do you really want to give the cleaning lady or man a copy? And is your lock safe from burglars?
The mailman or delivery guy comes to my home, but packages often do not fit in my mailbox.
When people ring my bell and nobody answers, they know I am not home. That is unsafe.
So what is the solution?
My proposed solution is the iDoor. The iDoor is an intelligent door which you control remotely to decide who gets access, who can deliver and who is shut out. Via a camera and full-duplex audio system, you are able to see who is standing in front of your door and communicate with them. Your smartphone will be your remote door manager. Advanced models could have face recognition and share data with other intelligent doors in the neighbourhood, so if you are taking a siesta and those annoying door-to-door vendors approach your door, they will automatically hear a message to go away and your bell will not ring. If a burglar is detected, the police can be warned. If the postman has a big package, you can remotely open a compartment so they can store the package. If your family comes, they can go into the house without problems. Your cleaning lady can as well, as long as it is during her normal working hours and she comes alone.
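The access rules described above could be sketched roughly as follows. This is only an illustration in Python: the `Visitor` class, the role names, the cleaner's working hours and the returned action strings are all hypothetical, and face recognition is assumed to have already produced the visitor's role.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Visitor:
    role: str          # hypothetical roles: "family", "cleaner", "vendor", "postman"
    arrived_at: time
    alone: bool = True


# Hypothetical normal working hours for the cleaner.
CLEANER_HOURS = (time(9, 0), time(13, 0))


def decide(visitor, owner_resting=False):
    """Return the action the door takes for this visitor."""
    if visitor.role == "family":
        return "open door"
    if visitor.role == "cleaner":
        start, end = CLEANER_HOURS
        if visitor.alone and start <= visitor.arrived_at <= end:
            return "open door"
        return "ask owner"
    if visitor.role == "postman":
        return "open parcel compartment"
    if visitor.role == "vendor" and owner_resting:
        return "play go-away message"
    # Anyone else is forwarded to the owner's smartphone.
    return "ask owner"
```

For example, `decide(Visitor("cleaner", time(10, 0), alone=False))` returns `"ask owner"`, because the cleaner did not come alone.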
Unique value proposition?
As if you were home 24×7. Busy people will never miss an Amazon package again. Burglars will not know if you are in the garden or not home at all.
Customer segment?
Mid-to-high-end homeowners.
Channel?
Partner with an existing door manufacturer that targets the upper end of the market. An example could be Hörmann.
Key metrics?
Door sales and door usage.
Cost structure?
A complete costing has to be done. TBD.
Revenue streams?
Door sales and door installation/maintenance services are the primary revenue stream. However, door apps and selling anonymous aggregated data could be additional sources.
You can find a quick summary in the following slides as well as some details about the technology components. This example needs customer validation and several areas need quite a bit more work [e.g. cost, revenue, unfair advantage, etc.]. However, I hope the idea is clear.
Maarten Ectors is a senior executive specialised in value innovation: creating new products and generating new revenues based on cutting-edge technologies like Big Data, Cloud, etc. He is currently looking for new challenges. You can contact him at: maarten at telruptive dot com.
I initially complained about the complexity of installing Mesos when I was playing around with Spark and Shark. However, when I saw the Twitter Mesos and Framework presentation, I understood why Mesos can be disruptive to how you architect applications in the highly distributed manner typical of Cloud Computing.
You can see the presentation here.
The key is that Twitter combined Mesos with Zookeeper, Linux Control Groups and Google’s Protocol Buffers as well as Spark, Storm and Hadoop. This gives them a way to easily program services that can be scaled to hundreds of Mesos nodes, automatically upgraded and restarted in case of failure. Resource usage can be controlled via the control groups. Zookeeper manages the configuration. Protocol Buffers ensure efficient communication between nodes. Services can use Spark and Storm for real-time operations and Hadoop for batch. Developers do not have to worry about scaling the services, deploying them to different nodes, etc. This is handled by the Twitter Framework and the Mesos master.
There is only one thing to add: “TWITTER PLEASE OPEN SOURCE YOUR TWITTER FRAMEWORK” or in Twitter language: “#mesos please #opensource #twitterfw now @telruptive”…
The website defines Spark as a MapReduce-like cluster computing framework designed to support low-latency iterative jobs. However it would be easier to say that Spark is Hadoop for real-time.
Spark allows you to run MapReduce jobs together with your data on distributed machines. Unlike Hadoop, Spark can distribute your data in slices and keep it in memory, so your processing and data are co-located in memory. This gives an enormous performance boost. Spark is more than MapReduce, however. It offers a new distributed framework on which different distributed computing paradigms can be modelled. Examples are: Hadoop’s Hive => Shark (40x faster than Hive), Google’s Pregel / Apache’s Giraph => Bagel, etc. An upcoming Spark Streaming is supposed to bring real-time streaming to the framework.
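The idea of slicing data, keeping the slices in memory and running map and reduce steps over them can be imitated on a single machine. The sketch below is plain Python and deliberately does not use the Spark API; the function names and the sample lines are made up for illustration, and real Spark would spread the slices over many machines.

```python
from collections import Counter
from functools import reduce


def split_into_slices(records, n_slices):
    """Distribute records round-robin over n_slices in-memory partitions."""
    return [records[i::n_slices] for i in range(n_slices)]


def map_slice(lines):
    """Map step: count the words within a single slice."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def word_count(slices):
    """Reduce step: merge the per-slice counts into one result."""
    return reduce(lambda a, b: a + b, (map_slice(s) for s in slices), Counter())


lines = ["spark is fast", "hadoop is batch", "spark is in memory"]
slices = split_into_slices(lines, 2)  # sliced once and kept in memory
totals = word_count(slices)           # further jobs can reuse the same slices
```

Because `slices` stays in memory, a second or third job over the same data does not have to reload it, which is where iterative workloads gain the most.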
The excellent part
Spark is written in Scala and has a very straightforward syntax for running applications from the command line or via compiled code. The ability to run iterative operations over large datasets, or very compute-intensive operations in parallel, makes it ideal for big data analytics and distributed machine learning.
The points for improvement
In order to use Spark, you need to install Mesos. Mesos is a framework for distributed computing that was also developed by Berkeley. So in a sense they are eating their own dog food. Unfortunately Mesos is not written in Scala, so installing Spark becomes a mix of make’s, ant’s, .sh, XML, properties, .conf, etc. This would not be so bad if Mesos had consistent documentation, but due to incubation into Apache the installation process is currently undergoing changes and is not straightforward.
Spark can connect to Hadoop, HBase, etc. However, running Hadoop on top of Mesos is “experimental” to say the least. The integration with Hadoop should be lighter. In the end only access to HDFS, SequenceFiles, etc. is required. This should not mean that a complete Hadoop has to be installed and Spark has to be recompiled for each specific Hadoop version.
If Spark wants to become as successful as Hadoop, then they should learn from Hadoop’s mistakes. Complex installation is a big problem because Spark needs to be installed on many machines. The Spark team should take a look at Ruby’s Rubygems, Node.js’s npm, etc. and make the installation simple, ideally via Scala’s package manager, although it is less popular.
If possible the team should drop Mesos as a prerequisite and make it optional. One of Spark’s competitors is Storm & Trident: you can install a Storm cluster in minutes and there is a one-click command to run Storm on an EC2 cluster.
It would be nice if there were an integration SDK that allows extensions to be plugged in. Integrations with Cassandra, Redis, Memcache, etc. could then be developed by others. Another option could be a distribution in which Cassandra’s Brisk is used to mimic Hive and HDFS (a.k.a. CassandraFS), all pre-bundled with Shark. Spark’s in-memory execution and read speed, combined with Cassandra’s write speed, should make for a pretty quick and scalable solution. Ideally without the need to fight with namenodes, datanodes, jobtrackers, etc. and other Hadoop hard-to-configure inventions…
The conclusion is that distributed computing and programming are already hard enough by themselves. Programmers should be focusing on their algorithms and should not need a professional admin to get started.
All in all, Spark, Shark, Spark Streaming, Bagel, etc. have a lot of potential; they are just a little bit rough around the edges…
Update: I am reviewing my opinion about Mesos. See the Mesos post.
Cloudify, from the scalability experts GigaSpaces, is still in its early stages. Unlike Google App Engine, Azure, Heroku, etc., this PaaS is more focused on the application life cycle and not on being a “transparent” application server and database. The main focus is automating application and services deployment, monitoring, autoscaling, etc. The closest competitor would be Scalr.
Unlike Scalr, Cloudify’s focus is on Cloud-neutrality. Cloudify is not focusing on using specific Amazon services for scalability but instead on building a neutral Cloud platform. The advantage is that every possible Cloud, be it private or public, can be used and scenarios like hybrid clouds with Cloud bursting from private to public cloud are possible. The deep understanding of large-scale architectures in a company like GigaSpaces is a good sign that Cloudify will scale in the future.
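Cloud bursting itself comes down to a simple placement rule: fill the private cloud first and send the overflow to a public cloud. The following Python sketch shows only that rule under made-up assumptions; `place_instances` and its parameters are hypothetical and have nothing to do with Cloudify's actual API.

```python
def place_instances(requested, private_free, public_quota):
    """Fill private capacity first, then burst the remainder to the public cloud."""
    on_private = min(requested, private_free)
    overflow = requested - on_private
    on_public = min(overflow, public_quota)
    return {
        "private": on_private,
        "public": on_public,
        "unplaced": overflow - on_public,  # demand neither cloud could absorb
    }
```

With 10 instances requested and room for 6 in the private cloud, `place_instances(10, 6, 100)` keeps 6 private and bursts 4 to the public cloud.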
Cloudify is still missing some important functionality like security, multi-tenancy, integrations with lower-level automation frameworks (e.g. Chef and Puppet), complex upgrade management [e.g. rolling upgrades, MySQL schema upgrades, A/B testing of new features, etc.], etc. However the roadmap is pointing towards most of these items.
Software architects should understand the possibilities Cloudify, Scalr, etc. bring. By having a reusable automation framework companies are able to spend more development and operations time on bringing new business features and less on reinventing the wheel.
Twitter has a Real-Time Analytics solution, Rainbird, that could easily become as important as Hadoop. They talked about open sourcing it but so far have not done so.
This post is an open invitation to Twitter to open source Rainbird and accelerate Real-Time Analytics adoption in the world. Hadoop has changed thousands if not millions of companies. Rainbird could do a similar thing.
In order to gather people around this subject, I am proposing that you include #TWOSRB in your tweets. #TWOSRB stands for Twitter please Open Source RainBird:
Although the number of solutions Amazon AWS is offering has become very large, here are 5 ideas of what Amazon could be adding next.
API Marketplace
There are thousands of APIs out there. However, what is missing is an easy way for companies to control their costs. In line with other marketplaces Amazon runs, there could be an API marketplace. An API marketplace would allow third-party API providers to let Amazon do the charging. Companies would be able to pay one bill to Amazon AWS and use thousands of APIs. Third-party API providers would win as well because they often cannot charge small amounts to a large set of developers. Amazon already sends you a bill or charges your credit card, hence adding some dollar/euro cents for external API usage would be easy to do. The third-party API provider would no longer have to lock users into large monthly usage fees to offset credit card and management charges. Amazon of course would be the big winner because they could get a revenue share on these thousands of APIs. End-users would also win because they could easily compare different APIs, get community feedback from other developers and pick the APIs with the best reputation: the typical advantages of any online marketplace. Amazon could also reuse its existing cross-selling, advertising and similar capabilities. A final advantage would be to have Amazon sit in the middle and offer a standard interface, with third parties offering competing implementations. This would allow developers to easily switch providers.
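To illustrate how such a marketplace could add small per-call charges to a single bill and take a revenue share, here is a rough Python sketch. The API names, the per-call prices and the 20% share are invented for the example; nothing here reflects real Amazon pricing or an existing AWS service.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical per-call prices in dollars and a hypothetical revenue share.
PRICES = {"geo-lookup": Decimal("0.002"), "sms-send": Decimal("0.010")}
MARKETPLACE_SHARE = Decimal("0.20")


def monthly_bill(usage):
    """usage: iterable of (customer, api, calls). Returns the total per customer."""
    bills = defaultdict(Decimal)
    for customer, api, calls in usage:
        bills[customer] += PRICES[api] * calls
    return dict(bills)


def payout(api, calls):
    """Split one API's revenue between its provider and the marketplace."""
    revenue = PRICES[api] * calls
    fee = revenue * MARKETPLACE_SHARE
    return {"provider": revenue - fee, "marketplace": fee}
```

A customer making 1,000 `geo-lookup` calls and 50 `sms-send` calls would see one combined line of $2.50, an amount that is too small for most providers to bill on their own.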
Language APIs
A lot of applications would be helped if they could use language APIs that are paid per request. Language APIs is a group name for text-to-speech, speech recognition, natural language processing and even mood analysis APIs. These are all APIs that are available individually, but there is a clear economies-of-scale effect: the more speech you transcribe or text documents you process, the better your algorithms become. There is also an over-supply of English language APIs but an under-supply for any other language in the world, except perhaps Spanish, French and German. Another problem with existing APIs is that a high monthly volume is needed in even the most basic subscription plan. An example is the Acapela VaaS pricing, which costs a minimum of €1500. Very few applications will use this amount of voice.
M2M APIs and Services
Amazon is already working hard on Big Data solutions. M2M sensors can generate large volumes of data pretty quickly. S3 or DynamoDB would be ideal to store this data. However, what is missing is an easy way to connect and manage large numbers of sensors and devices and their accompanying applications. There are few standards, but with examples like Pachube, Amazon should be able to get inspired. Especially the end-to-end service management, provisioning, SLA management, etc. could use a big boost from a disruptive innovator like Amazon. M2M sensor intelligence could also be offered by Amazon; see my other article about this subject.
Mobile APIs and Solutions
With billions of phones out there, mobilizing the Web will be the next challenge. Securely exposing company data, applications and processes to mobile devices is a challenge today. BYOD, bring-your-own-device, is a headache for CIOs. Not all of us have a Mac, so we cannot sign iPhone apps and launch them on the App Store. Ideally there would be a technical solution for enterprises to manage private app stores, deploy apps on different devices and send notifications to all or subsets of their employees. Functionality like Usergrid, in which developers do not have to focus on the back-office logic, would also be of interest. Tools to develop front-ends for different devices would be appreciated as well; examples like Tiggzi come to mind. There are a lot of island solutions but few really integrated total solutions.
Support APIs and Services
Amazon is becoming more and more important in the global IT infrastructure business. This means that solutions will move more and more to the Cloud and sometimes be hybrid cloud. With these complex solution scenarios in which third-parties, Amazon and on-site enterprise services have to be combined, risks of things going wrong are high. Support services both from a technical point of view:
- detect failures and automatically try to resolve them
- manage support ticket distributions between different partners
- measure SLAs
as well as from a functional point of view:
- dynamic call centers with temporary agents
- 3rd party certification programs in case small partners do not have local resources
- 3rd party support marketplace to offer more competition and compare reputations
are all areas in which global solutions could disrupt local and island solutions that are currently in place.
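As a small illustration of the "measure SLAs" item above, availability over a period can be computed from the recorded downtime and compared with a target. This is a minimal Python sketch; the function names and the 99.9% default target are assumptions picked for the example.

```python
def availability(total_minutes, downtime_minutes):
    """Fraction of the period during which the service was up."""
    return (total_minutes - downtime_minutes) / total_minutes


def sla_met(total_minutes, downtime_minutes, target=0.999):
    """True if the measured availability reaches the agreed target."""
    return availability(total_minutes, downtime_minutes) >= target


MINUTES_IN_30_DAYS = 30 * 24 * 60  # 43,200 minutes
```

At a 99.9% target, a 30-day month leaves room for about 43 minutes of downtime, so 30 minutes of outage passes while 60 minutes does not.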
Fujitsu just presented SaaSification at CeBIT. Existing applications can be easily brought to the Cloud and sold via App Stores and SaaS marketplaces. IBM is also working on SaaSification and even adds multi-tenancy.
What is next?
Everybody wants to have a full App Store or SaaS Marketplace, so SaaSification is the next step after launching your store. However converting a client/server application to the Cloud is only step 1. Step 2 is creating new services that are specifically built for the Cloud.
What does Built-for-the-Cloud mean?
Cloud-Ready applications should also accept the new reality of APIs. Both for exposure as well as consumption. This means that applications need to be redesigned according to application slices.
So if SaaSification wants to be successful, it needs to add quick enablers for multi-tenancy, big data, integration with external APIs as well as API exposure, etc. This integration concept can be called iPaaS or integration Platform-as-a-Service. iPaaS should not only focus on exposing or integrating APIs but also on providing complex services by integrating multiple SaaS solutions.
Other enablers should be added as well. Basically 80% of a SaaS solution consists of the same elements or tries to solve the same problems. These could all be provided via a SaaSification PaaS:
- Blog – to describe the newest ideas.
- Forum – for people to get answers from the community.
- IT PaaS – where you run the actual business logic and UI. Data storage is assumed to be provided by the Big Data elements.
- Portal and Mobile Portal – allows you to quickly define the “static” content for the web and mobile site.
- Deployment management – ideally continuous deployment or integration tools that allow fast feature by feature deployment.
- A/B testing – allow new features to be deployed to subsets of users and check which version of a feature has the highest impact on the bottom-line. A/B testing was made popular by Amazon.
- Automated testing – lots of testing can be automated but especially end-to-end and performance testing are the harder tests that should be focused on.
- Configuration management – manage the version control of the code.
- Metering and billing – be able to meter the resource usage by users, companies or any other element you want to meter and be able to bill users both for subscriptions as well as for usage, ideally with advanced set-up with overage, etc.
- Marketplace listing and provisioning – automate the listing of products on the marketplace as well as the provisioning of new services.
- Single sign-on & identity management – allow companies to use their own user credentials (e.g. SAML), authorization for third-parties (e.g. OAuth), etc.
- Reporting and data warehousing – this can be part of the big data stack, but it is especially important to be able to create ad-hoc reports, for instance for A/B testing. Of course regular business reporting needs to be included as well.
- ERP – accounting, resource management, etc.
- CRM – sales and lead management
- Operations & Maintenance – automation of back-ups, monitoring for performance and fault management as well as business monitoring.
- Support – helpdesk, ticketing system, SLA management, etc.
- Social integration – tools to add social aspects like Facebook apps, Twitter feeds, etc.
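To make the A/B testing enabler from the list above more concrete: a common way to expose a feature to a subset of users is to hash the user into a stable bucket. The Python sketch below shows one such approach; the function name, the experiment label and the 50% default are hypothetical and not taken from any particular product.

```python
import hashlib


def variant_for(user_id, experiment, rollout_percent=50):
    """Deterministically assign a user to variant "B" or the control "A"."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return "B" if bucket < rollout_percent else "A"
```

Because the bucket depends only on the experiment name and the user id, a user always sees the same version, and the rollout percentage can be raised step by step without reshuffling users who already have the new feature.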
The idea is not that a SaaSification PaaS offers all these solutions by custom development. Instead the SaaSification PaaS should allow startups to assemble an ideal architecture by combining different solutions from different providers. For example you would be able to select the support solution you prefer, e.g. desk.com, zendesk.com, etc. and this solution would be completely integrated into the overall stack, e.g. CRM integration with help desk and fault management together with single sign-on.
SaaSification 2.0 should focus on making sure that 2-5 people can start a new dotcom solution and focus on creating a killer service and not on building up yet another stack of solutions for configuration management, support, billing, etc. If a SaaSification PaaS can shorten the time to launch by months and reduce the number of people needed to operate the solution, then startups will see the value. Instead of SaaSification PaaS a good term could be Incubation PaaS, to incubate SaaS solutions. Once the business model and solution are proven, there will be money to move to a custom-built stack, but during incubation and crossing-the-chasm entrepreneurs should be able to focus on delivering value to their customers and not on re-inventing the startup wheel.
The Raspberry Pi is nothing less than a revolution. A fully equipped computer the size of an iPhone for an amazing price, starting from $25. The first units will be shipped in Q2. The sites of the two global suppliers went down in the first hours due to overwhelming demand for pre-orders.
The Raspberry Pi is initially targeted as an educational device. However, a small, low-power, fully equipped computer can do so much more. Hobbyists all over the world are working on new solutions.
Expect new solutions for consumers and the enterprise that will incorporate the Raspberry Pi. From home automation, think: “a server in every home”, to industrial use. Machine-to-Machine – M2M – solutions will be given a big boost.
Operators should look into offering industrial solutions whereby thousands of Raspberry Pis need to be monitored, remotely upgraded and administered.
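At its simplest, monitoring such a fleet means tracking when each device last reported in and which software version it runs. The Python sketch below illustrates that idea only; the device ids, the 300-second timeout and the version strings are made up, and a real service would also need secure transport and a rollout mechanism.

```python
def find_stale(last_seen, now, timeout=300):
    """Ids of devices whose last heartbeat is older than `timeout` seconds."""
    return sorted(dev for dev, ts in last_seen.items() if now - ts > timeout)


def needs_upgrade(versions, target):
    """Ids of devices that are not yet running the target software version."""
    return sorted(dev for dev, version in versions.items() if version != target)
```

Given heartbeats `{"pi-001": 1000, "pi-002": 400}` and a current time of 1100, `find_stale` reports only `pi-002`, which has been silent for 700 seconds.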