Social Graph for Big Data just got a new Open-Source member: Giraph

February 21, 2012

The Big Data elephant just got a well-connected giraffe friend. Putting it differently, Yahoo and LinkedIn have open sourced scalable social graph software. If Hadoop was the Open Source version of the Google File System and HBase the Bigtable version, now it is time for an Open Source version of Google’s Pregel: Giraph.

In “Open Source Social Graph Software Not Ready Yet” I complained that open source social graph software was not ready. Giraph should change this.

So why is this important for operators?

Any service that wants to “be social” needs a social graph solution. A social graph links the Twitter followers, the LinkedIn colleagues, the Facebook friends, etc. For operators a mobile social graph can link callers. Who calls who, who influences who, who is going to churn with whom, who might also appreciate this marketing campaign, who should definitely know about this new service, etc.?

The “Hello World” example of Giraph is Google’s PageRank. PageRank is the power behind Google search and now it is available to everybody who has millions of users. Be sure to keep an eye on Giraph because the “Apache Zoo” just acquired an important new animal in its Big Data Analytics department…
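To make the “Hello World” concrete, here is a minimal sketch of PageRank written in the vertex-centric, superstep-by-superstep style that Pregel introduced. It is plain Python and not Giraph’s Java API; the function name `pagerank`, the damping factor of 0.85, the 50 supersteps and the tiny “who calls whom” graph are all assumptions chosen for illustration.

```python
def pagerank(graph, damping=0.85, supersteps=50):
    """Vertex-centric PageRank in the style of Pregel.

    graph maps each vertex to the list of vertices it points to.
    Every vertex must appear as a key, even if it has no outgoing edges.
    """
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(supersteps):
        # Message phase: each vertex sends rank / out_degree to its targets.
        inbox = {v: 0.0 for v in graph}
        dangling = 0.0
        for v, targets in graph.items():
            if targets:
                share = rank[v] / len(targets)
                for t in targets:
                    inbox[t] += share
            else:
                # A vertex without outgoing edges spreads its rank evenly.
                dangling += rank[v]
        # Compute phase: each vertex updates its rank from received messages.
        rank = {
            v: (1 - damping) / n + damping * (inbox[v] + dangling / n)
            for v in graph
        }
    return rank


# Hypothetical call graph: an edge means "calls".
calls = {
    "ann": ["bob", "cat"],
    "bob": ["cat"],
    "cat": ["ann"],
    "dan": ["cat"],
}
ranks = pagerank(calls)
for person in sorted(ranks, key=ranks.get, reverse=True):
    print(person, round(ranks[person], 3))
```

In this toy graph `cat` comes out on top because most calls end there, while `dan`, whom nobody calls, keeps the minimum rank. A real deployment would spread the vertices over many workers, which is the part Giraph takes care of.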

LTE will kill the telecom cash-cow. Is your cash-calf ready to take over?

Long Term Evolution (LTE), sometimes also referred to as 4G, is the next generation mobile network technology. It promises to bring network speeds to the mobile that can beat the current ADSL offerings. In the beginning LTE prices might be high but competition, especially from new entrants – “the Ryanairs of telecom” / “4G Bitpipes” – is likely to bring affordable pricing plans soon. The US already has the first “4G Bitpipe players”: Clearwire and Lightsquared.

So what does it mean if tomorrow you can have ADSL-like speeds for an (almost) flat rate? In practice, end-users would be crazy to still pay €0.15 per minute for a call or per SMS. Skype with its optimized codecs (e.g. SILK) will offer better voice quality and will throw in video for free. Instant messaging, Twitter and Facebook chat will completely substitute SMS. This will be the end of the telecom cash-cows: calls and SMS…

What will be the next cash-calf? For those operators that are still looking for the “Killer App” – that single technology that only telecom operators can offer and is extremely successful – I have some news. Postal services are still looking for their killer app after the stamp was substituted by email. So is the music industry. There is no economic law that says that a former monopolist has the right to pick its next monopoly.

So if there is no “Killer App” does it mean that all telecom operators are doomed to become bit-pipes tomorrow? Over time several will but not necessarily all. Although dotcoms have the sexiest solutions, large corporations are unlikely to massively shift their communication services to a heavily indebted 25-person company close to a surf-paradise beach. So due to inertia the abyss is still some years away. However, should you just give up and let consumer ARPU drop year by year?

I believe there is still a window of opportunity for telecom operators to bring new appealing services. However they must be willing to abandon some important historical laws of telecom.

1) Standards slow innovation

Collectively negotiating a standard that is more a political compromise than the simplest, most effective way of doing things does not help innovation. In the Web 2.0 era, dotcoms launch new ideas all the time. Most of the time it is a “winner takes it all or at least most” market. So the winner sets the standard. How many Twitter competitors do you use?

Because operators design their architecture around obscure standards, few of them have employees who can explain their company’s architecture. Google and others have invested heavily in their architecture. They constantly update it. But on a blackboard a Google architect can draw you exactly why they chose Bigtable, GFS, etc.

2) Don’t talk about subscribers, call them users

A subscriber is an entity that signs a monthly contract with a telecom operator. By doing so a subscriber seems to subscribe to a list of applications that the marketing department of the telecom operator has preselected as the most adequate for him or her. The operator seems to know what is best for its subscribers. WRONG!

Call them users and give them the tools to select/create/design/customize/configure the services they want. Let the community vote about which feature is needed. Ask users why they stop using a new service after a week. Let users define the price they are willing to pay by offering multiple alternative solutions in different price ranges with different feature sets.

3) Go from a catalog of few to an infinite catalog

If telecom operators can no longer survive based on a few hit services, then they could go to the other extreme: the long tail telco. A long tail telco offers an almost infinite catalog of solutions that combine communication assets with other solutions in order to solve users’ problems, to make them more productive or to entertain them.

Users should be able to combine products to resolve their needs. A good example is what is offered by Invox. Via wizards, templates or a Yahoo Pipes drag-and-drop configuration, small to large enterprises can configure their own telecom services like call centers, PBX, etc. They can easily integrate the best of the Internet (Salesforce, Google, Yahoo, etc.) with IP-based communication. You use what you need. You configure it the way you want it.

What is missing is a market in which those users that don’t want to do it themselves or who need specific support (e.g. custom integrations), can go and find the right help.

Telecom operators should no longer focus on end-user services but on enabling the end-user and an eco-system of independent third-parties to be able to create and sell solutions and services to one another. As long as it is easier, faster and cheaper for a third-party to use an operator’s tools and assets they will see no need to design an alternative solution. This brings us to the next point…

4) Monopolists die because of greediness

Revenue shares of 40-95% are often not in line with the value the operator adds and the risk it takes in the value chain. Those operators that think that “squeezing partners until the last drop” is a good long-term strategy will be the first to die. Innovation needs out-of-the-box thinking. People don’t take risks if they don’t see rewards.

You will need to do more than just blindly follow these four rules. But by applying them and listening to users, you are on your way to creating new cash-calves…

If you are not doing Cloud Computing now, then you are late!

September 14, 2010

Any operator that has not started a project on Cloud Computing is late. The typical data center at an operator is filled with servers that are underutilized, e.g. application servers and database servers running at 30% of memory, disk and CPU. Just by doing step one of getting to Cloud Computing, virtualization, operators are able to save substantially on the cost of hardware, electricity, maintenance, etc. Virtualization means decoupling software from hardware. This allows you to run multiple operating systems on one server.
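A rough back-of-the-envelope sketch of that saving, assuming all servers are the same size and that a virtualized host can safely run at a 70% target utilization (the 70% target and the fleet of 100 servers are assumptions, only the 30% comes from the text above):

```python
import math


def hosts_after_consolidation(servers, current_utilization, target_utilization):
    """Estimate how many equally sized hosts can carry the same total load."""
    if not 0 < current_utilization <= target_utilization <= 1:
        raise ValueError("expected 0 < current <= target <= 1")
    total_load = servers * current_utilization
    return math.ceil(total_load / target_utilization)


before = 100  # hypothetical fleet size
after = hosts_after_consolidation(before, 0.30, 0.70)
print(f"{before} servers at 30% fit on {after} hosts at 70%")
```

Under these assumptions 100 servers shrink to 43 hosts. Real consolidation ratios depend on peak load, memory footprints and licensing, so treat the number as an order of magnitude only.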

However this would only be focusing on the tip of the iceberg. Cloud Computing is so much more…

Private Clouds

Automatic Scaling

Let’s first focus on the internal systems of an operator. After solutions have been virtualized, you are able to scale them to more or fewer servers. The first step is to automate this process. If you have an application server cluster, do you need 8 nodes all the time? You probably only need them the week before Christmas or during some other peak period. So the ideal is to be able to measure the load and to automate the deployment of more or fewer cluster nodes based on load. The same can be done with the database. During the night you have 2 nodes. In the morning 3. During the day 4. During peak moments 8. In the evening 3 again. You could save massive amounts of money if application servers and databases can be scaled in this way. Ideally you are also able to pay licenses based on what you really use and not on your maximum number of nodes during a yearly peak.
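The scaling rule and the saving it produces can be sketched in a few lines. The node counts (2, 3, 4, 8, 3) come from the paragraph above; the number of hours spent at each level, the 25% headroom and the per-node capacity are assumptions made up for the example.

```python
import math


def nodes_needed(load, capacity_per_node, min_nodes=2, max_nodes=8, headroom=0.25):
    """Return how many cluster nodes to run for the measured load.

    headroom keeps spare capacity so a sudden burst does not overload the cluster.
    """
    required = math.ceil(load * (1 + headroom) / capacity_per_node)
    return max(min_nodes, min(max_nodes, required))


def node_hours(schedule):
    """Sum node-hours over a list of (hours, nodes) pairs."""
    return sum(hours * nodes for hours, nodes in schedule)


# Assumed split of a 24-hour day: night, morning, day, peak, evening.
day = [(8, 2), (3, 3), (8, 4), (2, 8), (3, 3)]
scaled = node_hours(day)       # node-hours with automatic scaling
fixed = node_hours([(24, 8)])  # node-hours when sized for the peak all day
saving = 1 - scaled / fixed
print(f"{scaled} vs {fixed} node-hours, saving {saving:.0%}")
```

With this assumed day the cluster uses 82 node-hours instead of 192, a bit more than half saved. The same function can drive a deployment script: measure the load every few minutes and start or stop nodes until the running count matches `nodes_needed`.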

Redesigning Applications and Data

Both Amazon and Google found out that if they redesign their applications they can get even more gains than from pure virtualization. Amazon’s S3 service is a clear example. However, internally they started with services like Dynamo on which S3 is built. The first step is to build general data stores. Multiple applications should be using a common data store instead of needing a separate database cluster each.

Contrary to popular belief in the IT world, the dotcoms are not filling their data centers with Oracle RAC clusters. The dotcoms are designing special purpose data stores. The data volumes any market-leading dotcom has to deal with are so massive that a SQL database cannot keep up. SQL databases are very good at running efficient queries on structured data or making sure transactions are consistent. However, they fail when data is unstructured, write operations are massive or data volumes grow by terabytes every day.

Relational Data

So for all low-volume applications that need transactional data and read more than they write, you could still use a unified Oracle RAC cluster to serve multiple applications. An alternative approach is to use the data stores that have been built by Amazon (Relational Database Service or SimpleDB) or Google’s App Engine (Datastore with JDO).

What other alternatives are there?

Read Mostly Data

Data that needs to be read a lot and is not updated frequently can get an enormous performance and scalability boost by using an in-memory data store. The dotcom standard is memcached. Facebook (800 servers and 28TB) and Twitter are addicted to memcached.
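The usual way to put such a store in front of a database is the cache-aside pattern: look in memory first and only fall back to the slow source on a miss. The sketch below uses a plain Python dictionary as a stand-in for memcached, so it shows the pattern and not the memcached client API; the class name `ReadThroughCache`, the TTL and the `load_profile` function are hypothetical.

```python
import time


class ReadThroughCache:
    """In-memory cache that loads missing or expired keys from a slow source."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader    # function that fetches a value on a miss
        self._ttl = ttl_seconds  # how long an entry stays fresh
        self._clock = clock
        self._entries = {}       # key -> (value, expiry time)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        # Miss or expired entry: go to the source and remember the answer.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Drop a key after the underlying data has been updated."""
        self._entries.pop(key, None)


def load_profile(user_id):
    # Hypothetical slow lookup, e.g. a query against the subscriber database.
    return {"id": user_id, "plan": "prepaid"}


profiles = ReadThroughCache(load_profile, ttl_seconds=300)
profiles.get(42)  # miss: goes to the database
profiles.get(42)  # hit: served from memory
print(profiles.hits, profiles.misses)
```

The rare updates are handled by calling `invalidate` whenever the underlying record changes, which is why the pattern suits data that is read far more often than it is written.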

Documents, Images & Videos

Binary and media files are best stored outside of a database. In small numbers they are often stored on a file system. However, they occupy a lot of disk as well as network bandwidth when moved around. The ideal is a document store with a content-delivery network or CDN as a front-end. Amazon’s S3 and CloudFront are examples. Storing them in a compressed format, e.g. LZO, can save valuable space. Also transcoding into different formats, e.g. thumbnails or previews, can help save network bandwidth.

Unstructured Real-time Data

Data that is unstructured and needs to be stored and accessed in real-time in high volumes is best stored in special purpose data stores. You can write a book about the latest NoSQL solutions. Write an email to maarten at telruptive dot com if you are interested.

Analytics Data

Twitter has described most extensively how they use all the unstructured data they get from their logs and other sources. They use technology from Facebook to stream it into a highly available file system from Yahoo. There they run massively parallel map-reduce operations to get to know a lot more about what users are doing, who is influencing whom, etc.
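The shape of such a map-reduce operation fits in a few lines. This is a single-process sketch of the programming model only, not Hadoop’s API; the log format (`user action target`) and the idea of counting retweets as a crude influence score are assumptions for illustration.

```python
from collections import defaultdict


def map_retweets(line):
    """Map step: emit (author, 1) for every retweet found in a log line."""
    user, action, target = line.split()
    if action == "retweet":
        yield target, 1


def sum_counts(key, values):
    """Reduce step: add up all the counts emitted for one author."""
    return key, sum(values)


def map_reduce(records, mapper, reducer):
    # Shuffle: group every emitted value under its key.
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


log = [
    "ann retweet bob",
    "cat retweet bob",
    "bob follow ann",
    "dan retweet cat",
]
influence = map_reduce(log, map_retweets, sum_counts)
print(influence)
```

Because the map step looks at one line at a time and the reduce step at one key at a time, a framework can run both on thousands of machines in parallel without changing these two functions.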

Social Graph

The social graph is about who knows who and what kind of relationship you have. This data is best stored in graph data stores.
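A minimal sketch of what a graph data store keeps track of: people as vertices, typed relationships as edges, and queries that walk those edges. The class `SocialGraph` and the sample names and relationship types are hypothetical; a real graph store adds persistence, indexing and distribution on top of this idea.

```python
from collections import defaultdict


class SocialGraph:
    """Undirected graph in which every edge carries a relationship type."""

    def __init__(self):
        self._edges = defaultdict(dict)  # person -> {other person: relationship}

    def connect(self, a, b, relationship):
        self._edges[a][b] = relationship
        self._edges[b][a] = relationship

    def neighbours(self, person, relationship=None):
        """People directly linked to person, optionally filtered by type."""
        links = self._edges.get(person, {})
        return {
            other
            for other, kind in links.items()
            if relationship is None or kind == relationship
        }

    def suggestions(self, person):
        """Friends of friends that person is not yet connected to."""
        direct = self.neighbours(person)
        second_degree = set()
        for friend in direct:
            second_degree |= self.neighbours(friend)
        return second_degree - direct - {person}


graph = SocialGraph()
graph.connect("ann", "bob", "colleague")
graph.connect("ann", "cat", "friend")
graph.connect("bob", "dan", "friend")
print(graph.neighbours("ann", "friend"))
print(graph.suggestions("ann"))
```

The “people you may know” query is a two-hop walk here; in a relational database the same question needs a self-join per hop, which is why this kind of data tends to end up in a dedicated store.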

Collective Intelligence

Again a chapter by itself, but dotcoms are also heavy users of collective intelligence, which often means dedicated systems.

Accessing Data

Instead of stove pipes with data, the dotcoms are making data accessible to all their applications, either via search interfaces, web technology to access data (e.g. REST and JSON) or efficient binary interfaces (Thrift and Protocol Buffers).

Messaging and Notification

Amazon has a Simple Queue Service and a Simple Notification Service to make sure applications communicate in a uniform manner.

Applications

If applications have access to all the above services then the architecture of an application is simplified enormously. Most of the famous dotcoms don’t use middleware. They prefer the SOA principle. However, unlike the IT SOA solutions, a dotcom would take an application and make it into a chain of reusable services. Let’s take an IVR application as an example. There would be a service to do voice recognition. Another one for voice transcription. Another one for text-to-speech. A transcoding service to transcode between different media formats (e.g. high-quality voice and low-quality phone voice). And so on. Each service has independent load-balancing and can be scaled separately. Services can be re-used between applications. An application is very short because it just needs to define which services need to work together and how.
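The “very short application” can be illustrated as follows. Each function is a hypothetical stand-in for a remote service that would run behind its own load balancer; the stubs only pass a dictionary along, and all names and the tiny menu are invented for the example.

```python
def compose(*services):
    """Chain services so the output of one becomes the input of the next."""
    def application(payload):
        for service in services:
            payload = service(payload)
        return payload
    return application


def transcode_to_high_quality(call):
    # Stub for the transcoding service.
    return {**call, "codec": "wideband"}


def recognize_speech(call):
    # Stub for voice recognition: pretend the audio matched the "said" field.
    return {**call, "text": call["said"].lower()}


def route_intent(call):
    # Stub for the routing logic that maps a request to a queue.
    menu = {"balance": "billing", "top up": "payments"}
    return {**call, "queue": menu.get(call["text"], "operator")}


def text_to_speech(call):
    # Stub for text-to-speech: return the prompt that would be spoken.
    return {**call, "prompt": f"Connecting you to {call['queue']}"}


# The whole IVR application is just the list of services and their order.
ivr = compose(transcode_to_high_quality, recognize_speech, route_intent, text_to_speech)
result = ivr({"caller": "+34000000000", "codec": "narrowband", "said": "Balance"})
print(result["prompt"])
```

A voicemail application could reuse `transcode_to_high_quality` and `recognize_speech` unchanged and only swap the last two steps, which is the re-use the paragraph above describes.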

Application Deployment

The dotcoms deploy new features on a daily and even hourly basis. This means that all application deployment is fully automated. When a new feature is deployed it does not necessarily overwrite an existing feature. It is possible that a new functionality has been implemented in 5 different ways. Dotcoms would split the total user base and let small groups of users try out the different approaches. Depending on the users’ feedback they would take the preferred approach and slowly scale up from 1% to 100%. If they detect that the feature has a performance problem or a bug then they would be able to roll back or decrease the load, fix it and deploy gradually again.
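One common way to implement this kind of gradual rollout is to hash each user into a stable bucket, so the same user always sees the same behaviour and raising the percentage only adds users. The sketch below is one possible approach, not a description of any particular dotcom’s system; the function names and the feature name `new-inbox` are hypothetical.

```python
import hashlib


def stable_hash(*parts):
    """Deterministic integer derived from the given strings."""
    digest = hashlib.sha256(":".join(parts).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def is_enabled(user_id, feature, rollout_percent):
    """True if the user falls inside the first rollout_percent of 100 buckets."""
    return stable_hash(feature, "rollout", user_id) % 100 < rollout_percent


def pick_variant(user_id, feature, variants):
    """Assign the user to one of several competing implementations."""
    return variants[stable_hash(feature, "variant", user_id) % len(variants)]


users = [f"user{i}" for i in range(10000)]
at_1_percent = [u for u in users if is_enabled(u, "new-inbox", 1)]
at_10_percent = [u for u in users if is_enabled(u, "new-inbox", 10)]
print(len(at_1_percent), len(at_10_percent))
print(pick_variant("user7", "new-inbox", ["A", "B", "C", "D", "E"]))
```

Rolling back is then a configuration change: set the percentage to 0 and every user is back on the old code path without a new deployment.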

The Network, OSS and BSS

There is a substantial effort needed to redesign a network to be cloud-aware. Some components need latencies lower than 10 milliseconds (e.g. antennas), hence most of this logic will have to be processed locally. However, all systems that can live with latencies of 100 milliseconds benefit from a cloud make-over.

Especially in the area of OSS and BSS there is room for optimizing applications and making them cloud-aware. Global services like a network inventory service, a user profile service, a device profile service, etc. would mean simpler applications and less data duplication.

Opening the Cloud

So the network and IT infrastructure is being redesigned to allow for faster innovation and lower costs. However, Cloud Computing can also be used to increase revenues.

Being a Cloud Infrastructure Provider

Many IT consultancies and software/hardware vendors will tell an operator that it could be a Cloud infrastructure provider. On slides this really looks nice. However, unless an operator is using the cloud computing principles for its own systems as described in the first part, it lacks substantial knowledge about how to manage such an infrastructure. Without this knowledge it would be hard to build a highly optimized solution and as such be price competitive with the existing players.

Being a Cloud Platform Provider

Although closer to the operator’s core competencies, being a cloud platform provider would still be for those operators that are Cloud experts. A Cloud platform provider would allow others to use the infrastructure services to create applications on top. The complexity lies in the fact that malicious users will try to break the platform, which could have a very negative effect on the infrastructure if not handled correctly.

Being a Cloud Service Provider

This is the default option most operators should explore first before moving into the other areas. Being a service provider also has a roadmap:

Reselling SaaS

The easiest step is to be the storefront and to resell IT applications from others, e.g. cloud backup storage, security solutions, etc.

Offering Telco SaaS

The next step would be to offer specific telecom applications: applications that are built for the operator or, even better, applications that can be built by others based on the operator’s assets. An example would be a PBX in the Cloud.

Open Market for SaaS

Building all telecom applications yourself is hard. Attracting others to do it for you is easier. However, just putting a “Net App Store” and an SDK on the web will not get you to dominate the market. Only an open market with a large eco-system of companies and developers can generate large quantities of “Net Apps”. If you are thinking about building an open market, why don’t we talk first? Send an email to maarten at telruptive dot com.

Scaling to 500 million users

September 10, 2010

In the telecom domain a scalable real-time architecture means paying a lot of money for hardware and licenses. You buy the Oracle RAC solution, build a Weblogic cluster, set up a storage area network, etc.

In the dotcom world things look different. Facebook, Google, Twitter, Yahoo, Amazon, etc. have more active users than any telecom system. However, they have built their architecture on top of open source solutions and average servers. Some even built their own software and sometimes open-sourced it.

Some of this software has very exotic names: Hadoop, Bigtable, Cassandra, Pig, Elephant-Bird, Dremel, Pregel, Dynamo, etc. Additionally design decisions are taken that would surprise every IT teacher: “do not normalize”, “do not expect immediate consistency”, “no transaction support”, “store in memory instead of on disk”, etc.

However, if you can support 500 million users, 100 million daily hits, 130TB of logs, 20 billion tweet messages, 1 million servers, etc. then you must be doing something right.

The telecom software industry seems to have been isolated from the Internet during the last five years. With the shift to IP it is expected that more IT companies will be able to provide telecom solutions. Is this the solution? Not sure! IT companies, too, are still playing catch-up in the cloud computing domain. Few IT solution providers are demonstrating that they now think MapReduce instead of middleware.

Google Voice is coming and most operators still seem to be more worried about churning subscribers. Google Latitude and Maps demonstrated that with new technology and innovation you can destroy the telecom monopoly on location-based services overnight…

If you are a telecom operator and you are worried, perhaps it is time we talk.
