ECommerceTimes.com
MapReduce and the Database: Analytics in Hyperdrive

Greenplum and Aster Data will both integrate MapReduce technology into their database engines. By expanding the role and reach of MapReduce technologies and methods, a powerful new tool is added to the BI arsenal, writes contributor Dana Gardner of Interarbor Solutions. More data, more data types and more data sources are all rolled into an analytical framework.


In what could best be termed a photo finish, Greenplum and Aster Data Systems have both announced that they have integrated MapReduce into their massively parallel processing (MPP) database engines.

MapReduce, pioneered by Google (Nasdaq: GOOG) for analyzing the Web, now becomes available to enterprises and service providers, giving them more access and visibility into more data from more origins. Originally created to analyze massive amounts of unstructured data, the approach has been updated to analyze structured data as well.
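For readers unfamiliar with the approach, the following is a minimal single-process sketch of the map, shuffle and reduce steps, using word counting as the example. It is an illustration only, not Google's, Greenplum's or Aster Data's implementation; every function name here (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) is hypothetical, and a real system would spread each phase across many machines.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record; the mapper yields (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    # Emit (word, 1) for every word in a line of unstructured text.
    for word in line.lower().split():
        yield word, 1


def count_reducer(key, values):
    # Sum the 1s emitted for this word.
    return sum(values)


def word_count(lines):
    return reduce_phase(shuffle(map_phase(lines, word_mapper)), count_reducer)
```

Calling `word_count(["a b a", "b c"])` groups the emitted pairs by word and sums them. Because each mapper call sees only one record and each reducer call sees only one key, both phases can in principle be run in parallel.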

Greenplum, in San Mateo, Calif., says that MapReduce will be part of its Greenplum Database beginning in September. Aster Data, of Redwood Shores, Calif., says that MapReduce will be included in its Aster nCluster.

Fruitful Marriage

Curt Monash, president of Monash Research, editor of DBMS2, and a leading authority on MapReduce, sees this as a major leap forward. He reports that both companies had completed adding MapReduce to their existing products and had been racing to the finish line to get their news out first. As it turned out, both made their announcements within hours of each other.

Monash lists several points on his blog about what this marriage of technologies means.

  • Google's internal use of MapReduce is impressive. So is Hadoop's success. Now commercial implementations of MapReduce are getting their shots too.
  • The hardest part of data analysis is often the recognition of entities or semantic equivalences. The rest is arithmetic, Boolean logic, sorting, and so forth. MapReduce is already proven in use cases encompassing all of those areas.
  • MapReduce isn't needed for tabular data management. That's been efficiently parallelized in other ways. But, if you want to build non-tabular structures such as text indexes or graphs, MapReduce turns out to be a big help.
  • In principle, any alphanumeric data at all can be stuffed into tables. But in high-dimensional scenarios, those tables are super-sparse. That's when MapReduce can offer big advantages by bypassing relational databases. Examples of such scenarios are found in CRM and relationship analytics.
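The point about non-tabular structures can be made concrete with a small sketch of a text index built in map/reduce style. This is an assumption-laden illustration, not code from Monash or either vendor: the names `map_document` and `build_index` are hypothetical, and tokenizing by whitespace is a deliberate simplification.

```python
from collections import defaultdict


def map_document(doc_id, text):
    """Map step: emit (term, doc_id) once for each distinct term in a document."""
    for term in set(text.lower().split()):
        yield term, doc_id


def build_index(documents):
    """Reduce step: gather the document ids for each term into a sorted posting list."""
    postings = defaultdict(list)
    for doc_id, text in documents.items():
        for term, emitted_id in map_document(doc_id, text):
            postings[term].append(emitted_id)
    return {term: sorted(ids) for term, ids in postings.items()}
```

Given `{1: "map reduce", 2: "reduce data"}`, the index maps each term to the documents that contain it. The result is a term-to-documents structure rather than a fixed-column table, which is the kind of output the bullet above describes.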

Enterprise's Crystal Ball

Greenplum customers have been involved in an early-access program using Greenplum MapReduce for advanced analytics. For example, LinkedIn is using Greenplum Database for new, innovative social networking features such as "People You May Know" and sees it as a way to develop compelling analytics products faster. A primary benefit of the new capability is that customers can combine SQL queries and MapReduce programs into unified tasks that are executed in parallel across hundreds or thousands of cores.
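As a rough sketch of what combining a SQL query with a MapReduce-style program might look like, the example below uses Python's built-in `sqlite3` module as a stand-in for an MPP database and counts mutual contacts to suggest new connections. It is not Greenplum's API and not LinkedIn's method; the `connections` table, its columns, the sample data and all function names are assumptions made for illustration, and everything runs in one process rather than in parallel.

```python
import sqlite3
from collections import defaultdict
from itertools import permutations


def make_demo_db():
    """Create a hypothetical in-memory table of member-to-contact links."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE connections (member TEXT, contact TEXT)")
    conn.executemany(
        "INSERT INTO connections VALUES (?, ?)",
        [("ann", "bob"), ("bob", "cat"), ("ann", "dan"), ("dan", "cat")],
    )
    return conn


def load_neighbors(conn):
    """SQL step: read the links and treat them as undirected."""
    neighbors = defaultdict(set)
    for member, contact in conn.execute("SELECT member, contact FROM connections"):
        neighbors[member].add(contact)
        neighbors[contact].add(member)
    return dict(neighbors)


def map_candidates(neighbors):
    """Map step: any two contacts of one member who are not yet linked share that member."""
    for contacts in neighbors.values():
        for x, y in permutations(sorted(contacts), 2):
            if y not in neighbors[x]:
                yield (x, y), 1


def reduce_counts(pairs):
    """Reduce step: sum the mutual-contact count for each candidate pair."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def suggest_connections(conn):
    return reduce_counts(map_candidates(load_neighbors(conn)))
```

Running `suggest_connections(make_demo_db())` returns a count of shared contacts for each pair of members who are not already linked. The SQL step selects the rows, and the map and reduce steps do the work that is awkward to express as a relational query.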

Part of the appeal of business intelligence, and of its huge ramp-up over the past five years, is that IT assets play an ever-larger role in providing unprecedented strategic guidance and insights to leaders of enterprises, governments, telcos and cloud providers. IT has gone from a role of automating business functions to an essential crystal-ball service of the highest order. By gaining access to larger data sets that, more than ever before, can be mined and analyzed for higher levels of process and business refinement, IT has become a member of the board.

With better data reach and inclusion come better results. BI allows leaders to identify early the trends that will determine their future success or failure. In a fast-paced, global, hyper-competitive business landscape, these insights are the currency of future success. The better you do BI, the better you do business in the current, near and long term. There's no better way to know your customers, competitors, employees and the variables that buffet and stir markets than effective BI.

Insatiable Appetite for Data

Now, by expanding the role and reach of MapReduce technologies and methods, a powerful new tool is added to the BI arsenal. More data, more data types, more data sources -- all rolled into an analytical framework that can be directly targeted by developers, scripters, business analysts, executives and investors.

These new MapReduce announcements mark a significant advancement that raises IT another notch in its utility and indispensability to business. And it comes at a time when more data, metadata, complex events, transactions and Internet-scale inferences demand tools that can do for enterprise BI what Google has done for Web search and indexing.

Comprehensive, deep analytics on massive data sets offers a new mantra: The database is dead, long live the data. Structured data and the containers that hold it are simply not enough to organize and access the intelligence lurking on modern networks, at Internet scale and in Internet time.


Dana Gardner is president and principal analyst at Interarbor Solutions, which tracks trends, delivers forecasts and interprets the competitive landscape of enterprise applications and software infrastructure markets for clients. He also produces BriefingsDirect sponsored podcasts. Disclosure: Greenplum is a sponsor of BriefingsDirect podcasts.

