Apache Spark's powerful open-source platform enables high-speed data processing for large and complex datasets. The joint benchmarking used the k-core decomposition algorithm of Spark's GraphX analytics engine, a particularly stressful series of memory-intensive tests. Previous collaboration between Diablo and Inspur demonstrated the advantages of Memory1 for Apache Spark Streaming.
The new graph testing on Memory1 shows that users can achieve more work per server and process increasingly large datasets in far less time than on servers with DRAM alone. As a result, users can get more work done with existing resources, minimize server sprawl, and improve Total Cost of Ownership.
Behind the tests
Diablo and Inspur tested Apache Spark (version 1.5.2) k-core decomposition performance on a single cluster of five servers (Inspur NF5180M4, two Intel Xeon CPU E5-2683 v3 processors, 28 cores each, 256GB DRAM, 1TB NVMe drive). The servers were first configured to use only the installed DRAM to process multiple datasets. Next, the cluster was set up to run the tests on the same datasets with 2TB of Memory1 per server.
The k-core algorithm in Apache Spark was run against three graph datasets of varying sizes:
- 164GB set of 100 million vertices with 10 billion edges
- 340GB set of 200 million vertices with 20 billion edges
- 516GB set of 300 million vertices with 30 billion edges
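For readers unfamiliar with k-core decomposition, the following is a minimal single-machine sketch in Python of the standard "peeling" approach: repeatedly remove the vertex with the lowest remaining degree and record the highest degree seen so far as its core number. It is an illustration only, not the distributed GraphX code used in the benchmark; the function name `core_numbers` and the toy graph are assumptions made for this example.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} for an undirected graph given as (u, v) pairs."""
    # Build an adjacency map, ignoring self-loops and duplicate edges.
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:
            adjacency[u].add(v)
            adjacency[v].add(u)

    degree = {vertex: len(neighbors) for vertex, neighbors in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0

    while remaining:
        # Peel the vertex with the smallest remaining degree.
        # (A linear scan keeps the sketch short; bucket queues make this O(V + E).)
        vertex = min(remaining, key=degree.__getitem__)
        k = max(k, degree[vertex])
        core[vertex] = k
        remaining.remove(vertex)
        # Removing the vertex lowers the degree of its surviving neighbors.
        for neighbor in adjacency[vertex]:
            if neighbor in remaining:
                degree[neighbor] -= 1

    return core


# Toy graph: a triangle (1, 2, 3) with one pendant vertex (4) attached to 3.
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
toy_cores = core_numbers(toy_edges)
print(toy_cores[4], toy_cores[1], toy_cores[2], toy_cores[3])  # prints: 1 2 2 2
```

The algorithm has to track the degree of every vertex and the full adjacency structure while it peels, which is one reason graph workloads of this kind are sensitive to available memory.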
As illustrated in the chart above, completion times for the smallest sets were comparable. However, the medium-sized sets using Memory1 completed nearly twice as fast as the traditional DRAM configuration (156 minutes versus 306 minutes). On the large sets, the Memory1 servers completed the job in 290 minutes, while the DRAM servers were unable to complete it because they ran out of memory. As the dataset grew, the advantage of Memory1 over DRAM alone widened, from parity on the smallest sets to being the only configuration that finished on the largest.
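The medium-dataset comparison can be checked with simple arithmetic. The short Python sketch below uses only the two completion times reported above; the helper names `speedup` and `time_saved_percent` are introduced here for illustration.

```python
def speedup(baseline_minutes, candidate_minutes):
    """How many times faster the candidate finished than the baseline."""
    return baseline_minutes / candidate_minutes


def time_saved_percent(baseline_minutes, candidate_minutes):
    """Reduction in completion time relative to the baseline, in percent."""
    return 100.0 * (baseline_minutes - candidate_minutes) / baseline_minutes


# Completion times reported for the medium (340GB) dataset.
DRAM_MINUTES = 306
MEMORY1_MINUTES = 156

print(f"{speedup(DRAM_MINUTES, MEMORY1_MINUTES):.2f}x")             # prints: 1.96x
print(f"{time_saved_percent(DRAM_MINUTES, MEMORY1_MINUTES):.1f}%")  # prints: 49.0%
```

No equivalent ratio can be computed for the largest dataset, because the DRAM-only configuration did not finish the job.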
Read the white paper in full here: http://www.inspursystems.com/downloads/Inspur_Spark_Whitepaper_2_Apache_Spark_Graph_Performance_with_Memory1.pdf.
"While we anticipated a substantial performance improvement with Memory1, what's notable is that as the dataset scaled, the cluster without Memory1 failed," said Maher Amer, Chief Technology Officer at Diablo Technologies. "This clearly illustrates the complexity of analytics on big data workloads. Graph processing is the latest use case that shows the benefits of expanded memory in addition to SQL queries, machine learning, and streaming."
"Inspur Memory1 Servers have shown that they are the best solution on the market to complete analytics processing of big data tasks and are a necessary infrastructure for Apache Spark," said Alfie Lew, Solutions Architect at Inspur. "These results are exciting for the big data world, and we look forward to demonstrating more impressive results on additional memory-intensive applications."
Inspur Memory1 servers use Diablo's high-capacity flash-as-memory DIMMs and intelligent memory management software to enable more work per server. Memory1 scales up memory resources, delivering up to 40TB of application memory in a single rack. More efficient and resource-dense servers mean improved real-time analytics, faster business decisions, and more transactions completed in a shorter amount of time. The net result gives users the flexibility to address evolving business needs and technologies at a lower TCO.
To learn more about the Inspur Memory1 Server™, visit: http://www.inspursystems.com/solutions/integrated-solutions/inspur-memory1-server/.
ABOUT DIABLO TECHNOLOGIES
Diablo is at the forefront of developing breakthrough technologies for next-generation enterprise computing. The company's flagship Memory1 is a first-of-its-kind memory technology that delivers four times the capacity of the largest DRAM modules. Diablo's Memory Channel Storage platform combines innovative software and hardware architectures with Non-Volatile Memory to introduce a new and disruptive generation of Solid State Storage for data-intensive applications. The Diablo leadership team has decades of experience in system architecture, chipset design, enterprise software and business development at companies including PMC-Sierra, Anobit, AT&T-Microelectronics, Bell Labs, Nortel Networks, Intel, Cisco, AMD, SEGA, Cadence Design Systems, Matrox Graphics, BroadTel Communications and ENQ Semiconductor. Learn more at http://www.diablo-technologies.com.
ABOUT INSPUR SYSTEMS INC.
Inspur Systems Inc., located in Fremont, Calif., is part of Inspur Group, a leading Cloud Computing and global IT Solutions Provider. Inspur was founded in 1945 and has since provided IT products and services in more than 85 countries. Inspur is ranked by Gartner as one of the Top 5 largest server manufacturers in the world and #1 in China. Inspur provides its global customers with data center servers and storage solutions that offer Tier 1 quality and performance, are energy efficient and cost effective, and are built specifically for actual workloads and data center environments. As a leading total solutions and services provider, Inspur is capable of providing total solutions at the IaaS, PaaS and SaaS levels with high-end servers, mass storage systems, cloud operating systems and information security technology. For more information, visit www.inspursystems.com.
To view the original version on PR Newswire, visit: http://www.prnewswire.com/news-releases/benchmarks-show-diablo-technologies-memory1-doubles-the-speed-of-apache-spark-graph-processing-300403107.html
SOURCE Diablo Technologies; Inspur Systems Inc.