
Archive for the ‘Collective Intelligence’ Category

Big Data Apps and Big Data PaaS

March 21, 2012 2 comments

Enterprises no longer have a lack of data. Data can be obtained from everywhere. The hard part is converting data into valuable information that can trigger positive actions. The problem is that you currently need four experts to get this process up and running:

1) Data ETL expert – is able to extract, transform and load data into a central system.

2) Data Mining expert – is able to suggest great statistical algorithms and able to interpret the results.

3) Big Data programmer – is an expert in Hadoop, MapReduce, Pig, Hive, HBase, etc.

4) Business expert – is able to guide the other experts toward extracting the right information and taking the right actions based on the results.

A Big Data PaaS should focus on making sure that the first three are needed as little as possible. Ideally they are not needed at all.

How could a business expert be enabled in Big Data?

The answer is Big Data Apps and Big Data PaaS. What if a Big Data PaaS were available, ideally open source as well as hosted, that comes with a community marketplace for Big Data ETL connectors and Big Data Apps? You would have Big Data ETL connectors to all major databases, Excel, Access, web server logs, Twitter, Facebook, LinkedIn, etc. For a fee, additional data sources could be accessed in order to enhance the quality of the data. Companies should be able to easily buy access to the data of others on a pay-as-you-use basis.

The next step is Big Data Apps. Business experts often have very simple questions: “Which age group is buying my product?”, “Which products are also bought by my customers?”, etc. Small reusable Big Data Apps could be built by the technical experts and reused by business experts.

A Big Data App example

A medium-sized company sells household appliances. This company has one database with all its customers and another with all its product sales. What if a Big Data App could find which products tend to be sold together, and whether any specific customer features (age, gender, customer since, hobbies, income, number of children, etc.) or other features (e.g. time of the year) are significant? Customer data in the company’s database could be enhanced with publicly available information (from Facebook, Twitter, LinkedIn, etc.). Perhaps the Big Data App could find out that parents (number of children >0) whose children like football (Facebook) are 90% more likely to buy waffle makers, pancake makers, oil fryers, etc. three times a year. Local football clubs might organize events three times a year to gain extra funding. Sponsorship, direct mailing, special offers, etc. could all help to attract more parents of football-loving kids to the shop.

The Big Data Apps would each focus on solving one specific problem: “Finding products that are sold together”, “Clustering customers based on social aspects”, etc. As long as a simple wizard can guide a non-technical expert in selecting the right data sources and understanding the results, it could be packaged up as a Big Data App. A marketplace could exist for the best Big Data Apps. External Big Data PaaS platforms could also allow data from different enterprises to be brought together and generate extra revenue, as long as individual persons cannot be identified.
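As a rough illustration, the core of a “Finding products that are sold together” app could be as small as the sketch below. It is plain Python rather than Hadoop code, the orders are made-up sample data, and the function name and the min_orders threshold are assumptions chosen for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sales data: one set of product names per order.
orders = [
    {"waffle maker", "pancake maker"},
    {"waffle maker", "pancake maker", "oil fryer"},
    {"toaster", "kettle"},
    {"waffle maker", "oil fryer"},
    {"toaster", "kettle", "pancake maker"},
]


def products_sold_together(orders, min_orders=2):
    """Count how often each pair of products appears in the same order."""
    pair_counts = Counter()
    for order in orders:
        # sorted() gives every pair a stable (a, b) ordering
        for pair in combinations(sorted(order), 2):
            pair_counts[pair] += 1
    # Keep only the pairs seen in at least min_orders orders
    return {pair: n for pair, n in pair_counts.items() if n >= min_orders}


pairs = products_sold_together(orders)
for pair, n in sorted(pairs.items(), key=lambda kv: (-kv[1], kv[0])):
    print(pair, n)
```

On real sales volumes the same pair counting would be expressed as a MapReduce or Pig job, and the wizard would only need to ask which table holds the orders.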

NextGen Hadoop, beyond MapReduce

Hadoop has run into architectural limitations and the community has started working on the Next Generation Hadoop [NGN Hadoop]. NGN Hadoop has some new management features, of which multi-tenant application management is the major one. However, the key change is that MapReduce is no longer entangled with the rest of Hadoop. This will allow Hadoop to be used for MPI, Machine Learning, Master-Worker, Iterative Processing, Graph Processing, etc. New tools to better manage Hadoop are also being incubated, e.g. Ambari and HCatalog.

Why is this important for telecom?
Having one platform that allows massive data storage, petabyte data analytics, complex parallel computations, large-scale machine learning, big data MapReduce processing, etc., all in one multi-tenant setup, means that telecom operators could see massive reductions in their architecture costs together with faster go-to-market, better data intelligence, etc.

Telecom applications that are redesigned around this new paradigm can all use one shared back-office architecture. Having data centralized in one large Hadoop cluster, instead of in tens or hundreds of application-specific databases, will enable previously unseen data analytics possibilities and bring much-needed efficiencies.

Is this shared-architecture paradigm new? Not at all. Google has been using it since at least 2004, when it published the MapReduce paper; the Bigtable paper followed in 2006.

What is needed is for several large operators to define this approach as their standard architecture, so that telecom solution providers start incorporating it into their solutions. Commercial support can easily be acquired from companies like Hortonworks, Cloudera, etc.

Having one shared data architecture and multi-tenant application virtualization in the form of a Telco PaaS would allow third parties to launch new services quickly and cheaply: think days instead of years…

Social Graph for Big Data just got a new Open-Source member: Giraph

February 21, 2012 1 comment

The Big Data elephant just got a well-connected giraffe friend. Put differently, Yahoo and LinkedIn have open sourced scalable social graph software. If Hadoop was the open source version of the Google File System and HBase that of Bigtable, it is now time for an open source version of Google’s Pregel: Giraph.

In Open Source Social Graph Software Not Ready Yet I complained about the social graph not being ready. Giraph should change this.

So why is this important for operators?

Any service that wants to “be social” needs a social graph solution. A social graph links the Twitter followers, the LinkedIn colleagues, the Facebook friends, etc. For operators, a mobile social graph can link callers: who calls whom, who influences whom, who is going to churn with whom, who might also appreciate this marketing campaign, who should definitely know about this new service, etc.

The “Hello World” example of Giraph is Google’s PageRank. PageRank is the power behind Google search, and now it is available to anybody who has millions of users. Be sure to keep an eye on this Giraph because the “Apache Zoo” just acquired an important new animal in its Big Data Analytics department…
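To give a feel for what PageRank does on a call graph, here is a minimal single-machine sketch in plain Python. It is not Giraph code and does not use Giraph’s API; the subscriber names, the damping factor of 0.85 and the fixed number of iterations are assumptions for illustration only.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of node -> list of called nodes.

    Every node that is called must also appear as a key in the dict.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node that calls nobody spreads its rank evenly
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical call graph: who calls whom
calls = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": ["alice"],
    "dave": ["carol"],
}
ranks = pagerank(calls)
most_influential = max(ranks, key=ranks.get)
print(most_influential, round(ranks[most_influential], 3))
```

A Pregel-style system such as Giraph runs the same per-node update in parallel across a cluster, which is what makes it feasible for graphs with millions of subscribers.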

Social Niche Marketplaces and SaaSification

February 8, 2012 Leave a comment

Google App Marketplace was the first marketplace for SaaS. Lately, however, there has been an explosion of SaaS marketplaces. Unfortunately most of them are eCommerce sites that support subscriptions and resell Microsoft 365, some cloud backup and three to five other things.

Operators that are considering such a me-too marketplace should try harder

There is nothing like an average enterprise customer. Each customer is looking for a unique mix of services. You have innovators, early adopters, early majority, late majority, laggards. You have self-employed, micro, small, medium and large companies. You have industries. Users are working on different functions within a company (finance, operations, sales, etc.).

However, it has never been easier to personalize product portfolios according to market segments, industries, adoption likelihood, usage, etc. Operators should not set up one marketplace but instead set up intelligent, personalized niche marketplaces. Users can tell you which industry they belong to, what their company size is, what their function is, and whether they are eager to use the latest and greatest or want a full eco-system around a market-leading product. This means that a highly personalized portfolio can be shown instead of a bunch of generalist products.

Why sell different products via different channels?

If you have customers segmented, then ideally all relevant products are presented in one personalized marketplace, ranging from phones, tablets, mobile apps, SaaS and on-site equipment to advanced consultancy services and support.

Bringing in intelligence and social commerce

The next step is to increase the likelihood of selling a product and of cross-selling products. Users like product reviews and ratings. However, users love product reviews and ratings from people they trust. What if each product, in addition to a general section on product reviews and ratings, also had a social review section? The social review section would look like:

  • these contacts from my LinkedIn network have bought this service
  • these contacts have bought these alternative services
  • their ratings are
  • in addition they also bought these services
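The four bullets above map onto a handful of set operations. The sketch below is only one possible shape for it: the function name, the data layout (plain dicts and sets) and the sample products are all hypothetical, and a real marketplace would read the contacts from a social network API and the purchases from its own order database.

```python
from collections import Counter


def social_review_section(product, contacts, purchases, ratings, alternatives):
    """Build the four parts of a social review section for one product."""
    # 1) contacts who bought this product
    buyers = sorted(c for c in contacts if product in purchases.get(c, set()))
    # 2) contacts who bought an alternative product instead
    alt_products = alternatives.get(product, set())
    alt_buyers = {}
    for contact in contacts:
        bought_alts = sorted(purchases.get(contact, set()) & alt_products)
        if bought_alts:
            alt_buyers[contact] = bought_alts
    # 3) the ratings those buyers gave this product
    buyer_ratings = {
        c: ratings[(c, product)] for c in buyers if (c, product) in ratings
    }
    # 4) what the buyers bought in addition
    also_bought = Counter()
    for contact in buyers:
        for other in purchases[contact] - {product}:
            also_bought[other] += 1
    return {
        "bought_by": buyers,
        "alternatives_bought": alt_buyers,
        "ratings": buyer_ratings,
        "also_bought": also_bought.most_common(),
    }


# Hypothetical sample data
contacts = ["ann", "ben", "cho"]
purchases = {
    "ann": {"crm-a", "invoicing"},
    "ben": {"crm-b"},
    "cho": {"crm-a", "invoicing", "backup"},
    "zed": {"crm-a"},  # not one of my contacts, so ignored
}
ratings = {("ann", "crm-a"): 4, ("cho", "crm-a"): 5}
alternatives = {"crm-a": {"crm-b"}}

section = social_review_section("crm-a", contacts, purchases, ratings, alternatives)
print(section)
```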

How to go from 0 to 1,000,000 products?

Many operators offer services for “the average customer”. The product catalog is relatively small. Few have more than a couple of niche products per industry. Setting up a social niche marketplace is no good if you do not have a large catalog of personalized services to sell.

SaaSification to the rescue. Every industry has a lot of small companies that have built niche products. Most of these products require on-site installations. This means a lot of CAPEX. Often more is spent on buying the hardware, base software, services to maintain the data center, support services, etc. than on the actual software. By offering these small companies a SaaSification solution whereby they can migrate their on-site solution to an operator-hosted SaaS solution, the product catalog can be quickly extended with thousands of niche products. Offering tools to make single-tenant solutions multi-tenant and to make web solutions mobile-enabled will substantially improve your chances of attracting ISVs.

New SaaS will move from the innovators towards the early adopters, early majority, etc. Early majority products will be niche market leaders, have strict SLAs, a support eco-system, etc. Leading products can be identified by the market. Operators can spot those niche market-leading products and offer special deals, even co-branding. This will allow a personalized long tail strategy without the long tail costs…

Scaling Machine Learning

November 14, 2011 1 comment

I stumbled upon GraphLab. GraphLab allows for scaling machine learning. It is sort of the Hadoop for Machine Learning. It recently changed to the very business friendly open source Apache license. GraphLab is written in C++ but has Java and Python APIs. Things like PageRank, Collaborative Filtering, Clustering, etc. are what GraphLab can be used for.

So what is so important about GraphLab for telcos?

Having a scalable and business-friendly open source solution for machine learning will allow operators to use algorithms like PageRank for calls (who is the most important person among a set of subscribers?), Collaborative Filtering (recommending which services or apps a subscriber should buy based on what similar subscribers have bought), Clustering (automatically grouping subscribers that have common features and targeting them with promotions), etc.
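The collaborative filtering idea can be shown in a few lines. The sketch below is plain Python, not GraphLab code; it uses simple Jaccard similarity between subscribers’ purchase sets, which is just one of many possible similarity measures, and all subscriber and service names are invented.

```python
def jaccard(a, b):
    """Similarity of two sets: size of the overlap divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(subscriber, purchases, top_n=3):
    """Recommend services bought by similar subscribers but not yet by this one."""
    own = purchases[subscriber]
    scores = {}
    for other, items in purchases.items():
        if other == subscriber:
            continue
        similarity = jaccard(own, items)
        if similarity == 0:
            continue  # nothing in common, so no evidence
        for item in items - own:
            # Each similar subscriber votes with the weight of their similarity
            scores[item] = scores.get(item, 0.0) + similarity
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


# Hypothetical purchase history per subscriber
purchases = {
    "sub1": {"visual voicemail", "music streaming"},
    "sub2": {"visual voicemail", "music streaming", "theft insurance"},
    "sub3": {"visual voicemail", "theft insurance", "cloud backup"},
    "sub4": {"roaming pack"},
}
print(recommend("sub1", purchases))
```

With millions of subscribers the pairwise comparison becomes the expensive part, which is where a distributed framework earns its keep.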

New NoSQL and similar products to keep on the radar.

August 12, 2011 1 comment

Google has open sourced a low-level NoSQL storage engine that is authored by the creators of MapReduce and Bigtable. Definitely worth keeping an eye on: LevelDB. Especially for the products that will be incorporating it.

In a previous post I mentioned that open source graph databases were not ready yet. This one looks promising, especially because its authors are the number three in the social networking space. At least if they provide access to the code and use a business-friendly open source license like Apache’s: stigdb.

Twitter is open sourcing Storm on September 19th. It has been referred to as the Hadoop of real-time processing. Anything that deals with streaming data is likely to see big advantages from using Storm. Update: Storm has been released on GitHub. Check out the wiki pages.

Why don't you recommend me?

September 8, 2010 13 comments

Every time I use Google or Amazon, they become a bit smarter. They know which type of information I search for, which products I like, which messages I write, etc. Afterwards they can find similar users and recommend me products and services I might like.

The usage of data warehouses is common in the telecom domain. However, the collective intelligence that is held in them hardly sees daylight. I will receive the odd call if my profile fits that of a potential churner. I might even get notified about a new tariff. But this is where it normally stops.

Why are telecom operators not replicating what made the Amazons and Googles big: collective intelligence? You can easily cluster users, categorize their behavior, help them find what they need, recommend them services they might want, etc.

Let's take an easy example: tariff plans.

They come in all colors and sizes, change frequently and have a direct impact on my happiness. So why don't I get a recommendation like:

Similar users of our services:

  • subscribed to the 500 MB data plan (73%)
  • added the “call-for-free-in-the-weekend” option (65%)
  • removed the 300 free SMS option (43%)

Even better would be:

Based on your last 3 months' behavior and the behavior of users similar to you, you can save €21/month if you change your tariff plan from “expensive-tariff-A” to “cheap-tariff-B”.
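The saving in such a message is simple arithmetic once the usage history is available. The sketch below only replays a subscriber's own last three months against each tariff; it leaves out the “users similar to you” part. The Tariff fields, the prices and the usage numbers are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Tariff:
    name: str
    monthly_fee: float
    included_minutes: int
    included_mb: int
    price_per_extra_minute: float
    price_per_extra_mb: float

    def monthly_cost(self, minutes, mb):
        """Fee plus whatever is used beyond the included bundle."""
        extra_minutes = max(0, minutes - self.included_minutes)
        extra_mb = max(0, mb - self.included_mb)
        return (self.monthly_fee
                + extra_minutes * self.price_per_extra_minute
                + extra_mb * self.price_per_extra_mb)


def best_tariff(current, tariffs, usage_history):
    """Return the cheapest tariff and the average monthly saving versus current."""
    def average_cost(tariff):
        costs = [tariff.monthly_cost(minutes, mb) for minutes, mb in usage_history]
        return sum(costs) / len(costs)

    cheapest = min(tariffs, key=average_cost)
    return cheapest, average_cost(current) - average_cost(cheapest)


# Hypothetical tariffs and three months of (minutes, MB) usage
tariff_a = Tariff("expensive-tariff-A", 45.0, 1000, 1000, 0.10, 0.05)
tariff_b = Tariff("cheap-tariff-B", 22.0, 200, 500, 0.10, 0.02)
usage_history = [(180, 400), (200, 450), (260, 500)]

cheapest, saving = best_tariff(tariff_a, [tariff_a, tariff_b], usage_history)
print(f"Switch to {cheapest.name} and save about €{saving:.0f}/month")
```

The same replay could be run over the usage of similar subscribers to estimate how stable the saving is likely to be.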

Yes, of course the operators would be losing all the money they are overcharging. So, to avoid lowering ARPU, how can we use collective intelligence to increase sales or create new customized services?

“Congratulations on your new iPhone. Other users that purchased an iPhone also:

  • subscribed to visual voicemail (54%)
  • took out theft insurance (39%)”

“We have noticed that you call these 5 persons most frequently. Two of them are not users of our services. You would save €5.43/month if they joined our services. Additionally, for €3/month you would be able to make unlimited calls to these 5 persons.”

Are all my examples too complex to handle technology-wise? I don't think they are more complex than what Google, Facebook and Amazon are doing. You just need to make sure you use their technology…
