By Dana Gardner CRM Buyer Part of the ECT News Network
09/11/08 4:00 AM PT
Greenplum and Aster Data will both build MapReduce technology into their database engines. Expanding the role and reach of MapReduce technologies and methods adds a powerful new tool to the BI arsenal, writes contributor Dana Gardner of Interarbor Solutions. More data, more data types and more data sources are all rolled into an analytical framework.
In what could best be termed a photo finish, Greenplum and Aster Data Systems have both announced that they have integrated MapReduce into their massively parallel processing (MPP) database engines.
MapReduce, pioneered by Google (Nasdaq: GOOG) for analyzing the Web, now becomes available to enterprises and service providers, giving them more access to and visibility into more data from more origins. Originally created to analyze massive amounts of unstructured data, the approach has been updated to analyze structured data as well.
Greenplum, in San Mateo, Calif., says that MapReduce will be part of its Greenplum Database beginning in September. Aster Data, of Redwood Shores, Calif., says that MapReduce will be included in its Aster nCluster.
Fruitful Marriage
Curt Monash, president of Monash Research, editor of DBMS2, and a leading authority on MapReduce, sees this as a major leap forward. He reports that both companies had completed adding MapReduce to their existing products and had been racing to the finish line to get their news out first. As it turned out, both made their announcements within hours of each other.
Monash lists several points on his blog about what this new technology marriage means.
Google's internal use of MapReduce is impressive. So is Hadoop's success. Now commercial implementations of MapReduce are getting their shots too.
The hardest part of data analysis is often the recognition of entities or semantic equivalences. The rest is arithmetic, Boolean logic, sorting, and so forth. MapReduce is already proven in use cases encompassing all of those areas.
MapReduce isn't needed for tabular data management. That's been efficiently parallelized in other ways. But, if you want to build non-tabular structures such as text indexes or graphs, MapReduce turns out to be a big help.
In principle, any alphanumeric data at all can be stuffed into tables. But in high-dimensional scenarios, those tables are super-sparse. That's when MapReduce can offer big advantages by bypassing relational databases. Examples of such scenarios are found in CRM and relationship analytics.
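To make the text-index point concrete, here is a minimal single-process sketch of how a MapReduce-style job might build an inverted index. It is an illustration only: the function names (`map_phase`, `shuffle`, `reduce_phase`, `build_index`) and the sample documents are assumptions made for this example, not part of Greenplum's, Aster Data's, Google's or Hadoop's APIs, and a real engine would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(doc_id, text):
    """Map step: emit one (word, doc_id) pair per distinct word in a document."""
    words = {w.strip('.,;:!?"').lower() for w in text.split()}
    return [(w, doc_id) for w in words if w]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key (here, by word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, doc_ids):
    """Reduce step: collapse the doc ids for one word into a sorted posting list."""
    return word, sorted(set(doc_ids))


def build_index(docs):
    """Run map, shuffle and reduce in sequence over a dict of {doc_id: text}."""
    mapped = chain.from_iterable(map_phase(d, t) for d, t in docs.items())
    return dict(reduce_phase(w, ids) for w, ids in shuffle(mapped).items())


# Hypothetical sample documents, for illustration only.
docs = {
    1: "MapReduce builds text indexes",
    2: "Graphs and text indexes are non-tabular",
}
index = build_index(docs)
```

The resulting `index` maps each word to the list of documents that contain it, which is a non-tabular structure of the kind the third point above describes.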
Enterprise's Crystal Ball
Greenplum customers have been involved in an early-access program using Greenplum MapReduce for advanced analytics. For example, LinkedIn is using Greenplum Database for new, innovative social networking features such as "People You May Know" and sees it as a way to develop compelling analytics products faster. A primary benefit of the new capability is that customers can combine SQL queries and MapReduce programs into unified tasks that are executed in parallel across hundreds or thousands of cores.
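As a rough sketch of what combining a SQL query with a MapReduce program can look like, the example below uses Python's built-in `sqlite3` module as a stand-in database and counts mutual contacts between members who are not yet connected. Everything here is an assumption for illustration: the `connections` table, the sample names and the `map_phase` and `reduce_phase` functions are hypothetical, this is not Greenplum's interface or LinkedIn's actual method, and it runs in a single process rather than in parallel across cores.

```python
import sqlite3
from collections import defaultdict
from itertools import permutations

# Stand-in database with a hypothetical table of member connections.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE connections (member TEXT, contact TEXT)")
conn.executemany(
    "INSERT INTO connections VALUES (?, ?)",
    [("ann", "bob"), ("ann", "cat"), ("bob", "dan"), ("cat", "dan")],
)

# SQL step: produce a symmetric edge list so each link is visible from both sides.
rows = conn.execute(
    "SELECT member, contact FROM connections "
    "UNION SELECT contact, member FROM connections"
).fetchall()

# Collect each member's direct contacts from the SQL result.
contacts = defaultdict(set)
for member, contact in rows:
    contacts[member].add(contact)


def map_phase(member_contacts):
    """Map step: every ordered pair of one member's contacts shares that member."""
    return [((a, b), 1) for a, b in permutations(sorted(member_contacts), 2)]


def reduce_phase(pair, counts):
    """Reduce step: total the mutual contacts counted for one candidate pair."""
    return pair, sum(counts)


# Shuffle step: group emitted counts by pair, skipping pairs already connected.
grouped = defaultdict(list)
for member_contacts in list(contacts.values()):
    for (a, b), one in map_phase(member_contacts):
        if b not in contacts.get(a, set()):
            grouped[(a, b)].append(one)

suggestions = dict(reduce_phase(p, c) for p, c in grouped.items())
```

In this toy data set, `suggestions` pairs up members who share contacts but are not directly connected, with the number of mutual contacts as the score. The SQL step handles the tabular part of the work and the map and reduce steps handle the graph-shaped part.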
Part of the appeal of business intelligence, and of its huge ramp-up over the past five years, is that IT assets play an ever-larger role in providing unprecedented strategic guidance and insights to leaders of enterprises, governments, telcos and cloud providers. IT has gone from a role of automating business functions to an essential crystal ball service of the highest order. By gaining access to larger data sets that, more than ever before, can be mined and analyzed for higher levels of process and business refinement, IT has become a member of the board.
With better data reach and inclusion come better results. So BI allows leaders to identify early the trends that will determine their future success or failure. In a fast-paced, global, hyper-competitive business landscape, these insights are the currency of success for the future. The better you do BI, the better you do business: current, near-term and long-term. There's no better way to know your customers, competitors, employees and the variables that buffet and stir markets than effective BI.
Insatiable Appetite for Data
Now, expanding the role and reach of MapReduce technologies and methods adds a powerful new tool to the BI arsenal. More data, more data types, more data sources -- all rolled into an analytical framework that can be directly targeted by developers, scripters, business analysts, executives and investors.
These new MapReduce announcements mark a significant advancement that helps move IT another notch higher in its utility and indispensability to business. And they come at a time when more data, metadata, complex events, transactions and Internet-scale inferences demand tools that can do for enterprise BI what Google has done for Web search and indexing.
Being comprehensive and deep with analytics on massive data sets offers a new mantra: The database is dead, long live the data. Structured data and the containers that hold it are simply not enough to organize and access the intelligence lurking on modern networks, at Internet scale and in Internet time.
Dana Gardner is president and principal analyst at Interarbor Solutions, which tracks trends, delivers forecasts and interprets the competitive landscape of enterprise applications and software infrastructure markets for clients. He also produces BriefingsDirect sponsored podcasts. Disclosure: Greenplum is a sponsor of BriefingsDirect podcasts.