From time to time the IT community falls in love with certain concepts – cloud, virtualization and SaaS, to name a few. More recently, another idea that has gained currency is In-Memory, especially in conjunction with Big Data and analytics. It is a good concept and it does what it implies – data is stored in RAM as opposed to disk. While it may be a no-brainer that running data analytics using RAM rather than disk is considerably faster, the substantial hype around In-Memory, mostly aided by the marketing push made by vendors, has convinced businesses that there is no alternative to a bank-breaking In-Memory solution. It would appear that the bitter pill of an expensive price tag is sweetened by the promise of BI and analytical nirvana.
Recently, Gartner revised down its forecast for growth in overall global IT spending from 3.7% to 2.5% (1), and many CIOs are staring at slashed IT budgets. In this situation it’s important for them to re-examine their BI and analytic needs and assess whether committing a massive portion of their IT budget to In-Memory is the best use of their already squeezed spending.
With the digital revolution sweeping through the world and organizations increasingly collecting data from various sources – whether from social networks, digital patterns, trends or online conversations – data volumes will continue to grow exponentially. The ability to capitalize on growing data volumes represents a business opportunity that can help organizations gain a competitive edge, but I wonder whether they are equipped to leverage this opportunity. In-Memory analytic vendors are correct in that RAM can step in by providing quick, actionable BI. But at what cost?
One of the prerequisites for implementing In-Memory is to ensure that an organization’s database system has enough RAM to hold the data volumes that need to be analyzed. With the mass of data multiplying at the speed of thought – a McKinsey report estimates that unstructured data volumes will increase 44 times by 2020 (2) – sooner or later the size of the database will outgrow the memory on which it is running. What do CIOs do then?
One option is to do nothing and be content with a hybrid RAM/disk-based system that offers varying analytic speeds. In my opinion, that goes completely against the concept of fast analytics and business intelligence. After all, how do you determine which data is to be analyzed in RAM and which on disk? How do you ensure that your database system has enough memory to meet the data volumes you want to analyze? In many cases, servers have only a fraction of the RAM needed, which means that only a portion of the data can be pinned in memory while the rest sits on disk, trudging along at usual disk speeds.
The next solution, which is proving popular with a lot of organizations, is to buy more RAM. This has been aided by the falling prices of memory over the years but it is still a costly solution. At the risk of sounding like a doomsday alarmist, by going down this road organizations are setting themselves up for a fall; eventually data volumes will outgrow RAM capacities in servers again and necessitate further hardware investment.
Yes, RAM prices have fallen considerably over the years – in 1990, 8MB of RAM cost $851, in 2000 64MB of RAM cost just $72 and in 2011 8GB cost just $50 to $70 (3). But it is still a pricey solution when you consider that data volumes are now in the terabyte or even petabyte range. Ultimately it comes down to how much CIOs are prepared to spend on an In-Memory solution in order to have the optimum system that can derive benefit from the speeds that RAM offers. And the cost doesn’t stop there – the immediate consequence of investing in an In-Memory solution is more servers in the database system. So it’s not just the cost of acquiring the RAM and server hardware but also the cost of implementation, administration, power and cooling. The list goes on.
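To put those prices in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above; the helper names (`price_per_gb`, `ram_cost`) are made up for illustration, and it assumes 1GB = 1,024MB and that the quoted module prices scale linearly, which is an optimistic simplification since it covers the memory alone.

```python
MB_PER_GB = 1024
GB_PER_TB = 1024


def price_per_gb(price_usd, size_mb):
    """Convert a quoted module price into US dollars per gigabyte."""
    return price_usd / size_mb * MB_PER_GB


def ram_cost(volume_gb, usd_per_gb):
    """Cost of enough RAM to hold volume_gb of data (memory only, no servers)."""
    return volume_gb * usd_per_gb


# Price points quoted in the article (3).
per_gb_1990 = price_per_gb(851, 8)
per_gb_2000 = price_per_gb(72, 64)
per_gb_2011_low = price_per_gb(50, 8 * MB_PER_GB)
per_gb_2011_high = price_per_gb(70, 8 * MB_PER_GB)

# What it would cost to hold one terabyte entirely in RAM at 2011 prices.
one_tb_low = ram_cost(GB_PER_TB, per_gb_2011_low)
one_tb_high = ram_cost(GB_PER_TB, per_gb_2011_high)

if __name__ == "__main__":
    print(f"1990: ${per_gb_1990:,.2f} per GB")
    print(f"2000: ${per_gb_2000:,.2f} per GB")
    print(f"2011: ${per_gb_2011_low:,.2f} to ${per_gb_2011_high:,.2f} per GB")
    print(f"1 TB in RAM at 2011 prices: ${one_tb_low:,.0f} to ${one_tb_high:,.0f}")
```

On these figures a single terabyte works out at roughly $6,400 to $8,960 for the memory alone, and a petabyte at about a thousand times that, before any of the server, administration, power and cooling costs are counted.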
So, what can CIOs do to meet this insatiable analytic demand, and is there a way to end this vicious In-Memory cycle? Perhaps it would be more rewarding to make best use of the fast processors that modern servers offer: to go beyond RAM capacity and leverage the performance gains that have been made in modern CPUs.
Or perhaps the only way to handle vast amounts of data is to accept that there will indeed be a distribution between disk and RAM, but that the disadvantages of this can be mitigated through modern disk technologies, highly intelligent caching algorithms, and column-store architectures with compression and vector-based processing.
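For readers who have not met these ideas, the following is a minimal Python sketch of why a compressed column store can get away with less RAM. It is an illustration under stated assumptions, not how any particular product works: the table, the column names (`region`, `revenue`) and the helper functions are all hypothetical, run-length encoding stands in for compression in general, and the run-at-a-time loop is only a loose stand-in for the batch-oriented processing that real engines implement with CPU vector instructions.

```python
from array import array
from itertools import groupby


def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, length in runs:
        out.extend([value] * length)
    return out


def sum_where(runs, measure, wanted):
    """Sum `measure` over the rows whose run-length-encoded key equals `wanted`.

    The key column is scanned one run at a time, so runs that do not match
    are skipped without looking at their individual rows.
    """
    total = 0
    offset = 0
    for value, length in runs:
        if value == wanted:
            total += sum(measure[offset:offset + length])
        offset += length
    return total


# Hypothetical table, stored column by column and sorted by region.
region = ["EU"] * 4 + ["US"] * 3 + ["APAC"] * 2
revenue = array("q", [10, 20, 30, 40, 5, 15, 25, 7, 8])

# Nine region values compress to three (value, run_length) pairs.
region_rle = rle_encode(region)

if __name__ == "__main__":
    print(region_rle)
    print(sum_where(region_rle, revenue, "US"))
```

The point of the sketch is that the query touches only the two columns it needs, and the repetitive column shrinks to a handful of runs, so far less data has to sit in memory or be read from disk to answer the question.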
What is certainly clear is that RAM has its part to play in offering accelerated analytical speeds. However, there are newer technologies out there that allow CIOs to bypass the additional expense of resources, overhead and power consumption, and the whole “do I or don’t I buy more RAM?” argument. Indeed, the question is not “in-memory” or “no in-memory”, but one that is more forward-looking. CIOs should be asking themselves how they can break the confines of disk and RAM capacity to cope with the analytical data tsunami that is coming their way. How can they get the most from their hardware and make best use of the CPUs they have invested in? Only then will they be free to interrogate large, complex data volumes with ease, at a significantly reduced cost and with minimal energy impact.