Violin Solutions for Data Warehousing

Solution Brief
Run your Data Warehouse Applications At the Speed of Memory
Highlights
Simplify Data Warehouse Environments
• Speed up reporting times by removing stages in the batch process
• Eliminate the need for production copy data marts
• Reduce storage overprovisioning
To support decision makers in an era of data overload, data warehouses
must now run ad hoc, dynamic queries and generate reports in real time,
at any time, to deliver business or financial insights. To extract
real-time insights from large data volumes, storage solutions need to
perform more predictably and scale cost-effectively.
Challenges with Existing Data Warehouse Environments
Data warehouses ingest data from multiple sources and then transform it into
meaningful data sets for consumption. Data is loaded, updated, changed and accessed
throughout the day. A typical data warehouse process comprises several steps, including
importing, staging, transformation, integration and sorting. The challenges for
data warehouse systems are therefore manifold:
• Time-consuming and complicated data warehouse processes are difficult to manage and
support, and take too long to run.
• Legacy storage with slow performance is limiting the move towards real-time analytics.
• Secondary copies for reporting (data marts) are used to protect the performance and
integrity of production, leading to storage inefficiencies and data replication complexity.
Working through the time-consuming data warehouse steps may require multiple copies of data.
When transforming or sorting, the application might create a new dataset while
retaining the raw data, storing multiple copies of the same data in the staging area. As a result,
storage is overprovisioned yet used only during the batch process.
The Violin Flash Memory Array Difference
The value in a data warehouse lies in the ingress and egress of terabytes of data – data
coming from the database and data expelled to the database for consumption – as well as
the administration of the data over time. Most businesses batch jobs into overnight
activities so as not to impact production, but in reality these batch processes take hours to
complete, and a run that spills into the next working day’s activities can cause a massive
performance and financial hit.
Accelerate Data Transformations (ETL)
• Up to 20x increase in ETL performance
• Make data available faster and improve user experience
• Up to 20x faster report generation
• Sub-millisecond latency for any workload
Lower Total Cost of Ownership
• Reduced operational and capital expenses
• Up to 80% reduction in power, cooling, space costs
• Lower licensing costs for data locality management software
VMEM.COM
©2013 Violin Memory, Inc. All rights reserved. These products and technologies are protected by U.S. and international copyright and intellectual property laws.
Violin Memory is a registered trademark of Violin Memory, Inc. in the United States and/or other jurisdictions.
Flash Speeds Up the Transaction Processes, Enabling Real-time Reporting
Flash storage provides maximum raw, random I/O performance with very low, microsecond-level latency
under any workload. The distributed block nature of Violin’s flash Memory Arrays allows for massive
parallelism, resulting in faster and more stable throughput of data. Sustained microsecond latency
allows for fast data and log writing, producing 5-10x faster load times.
The natural fragmentation of data in data warehouse environments is irrelevant to Violin’s unique,
distributed architecture, so data loads can be simplified by dropping the sort and
single-threaded write steps. With Violin, an ETL job can be a full blast of data utilizing all cores
straight into the final partition or table, instead of having to sort the data sequentially and
write it out sequentially to the physical storage.
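The difference between the two load patterns can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the row layout and the in-memory lists standing in for tables are all hypothetical, and nothing here is a Violin interface. Python threads are also used purely to show the shape of the work; a real ETL job would rely on the database's own parallel load facilities.

```python
from concurrent.futures import ThreadPoolExecutor


def load_sorted_sequential(rows):
    """Disk-style load: sort first, then write rows one at a time in key order."""
    table = []
    for row in sorted(rows, key=lambda r: r["key"]):
        table.append(row)
    return table


def load_parallel(rows, workers=4):
    """Parallel load: no sort step; each worker writes its own chunk."""
    # Split the incoming rows into one chunk per worker, in arrival order.
    chunks = [rows[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() is a stand-in for the real write call (hypothetical).
        partitions = list(pool.map(list, chunks))
    # The final table is the union of what every worker wrote.
    return [row for part in partitions for row in part]


# Hypothetical sample data arriving in no particular order.
rows = [{"key": k, "value": k * 10} for k in (5, 3, 9, 1, 7, 2, 8, 4, 6, 0)]
sequential = load_sorted_sequential(rows)
parallel = load_parallel(rows, workers=4)

# Both tables hold the same rows; only the physical order differs.
same_rows = sorted(r["key"] for r in sequential) == sorted(r["key"] for r in parallel)
print(same_rows)
```

Both functions end up with the same set of rows. The parallel version skips the sort and the single writer, which is only worthwhile when the physical order of data on storage does not affect read performance.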
Violin’s faster read/write latencies also allow for quicker sorting, ordering and temp space utilization,
shortening overall report run times. Its distributed architecture allows for maximum
concurrency. All memory addresses are equally accessible at the same speed at all times, so any
number of concurrent users can access data without degrading storage performance.
Data warehouse processes on Violin flash Memory Arrays can thus be reduced to the following:
[Figure: Source-1, Source-2, Source-3 ... Source-n feed a Staging Table (ETL), which feeds the Main Table. Ready immediately!]
By removing the layers of the batch process required to make reports perform well on traditional
disks, the overall time taken from ingest to egress is dramatically reduced on Violin’s flash Memory
Arrays, with less complexity. An 8-hour overnight batch process can be completed in 30-60 minutes.
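As a quick check on what that example implies, the speedup factor can be worked out directly. The helper name below is hypothetical and the figures are only the ones quoted above.

```python
def speedup(before_minutes, after_minutes):
    """Return how many times faster the shorter run is."""
    return before_minutes / after_minutes


before = 8 * 60              # 8-hour overnight batch, in minutes
low = speedup(before, 60)    # finishing in 60 minutes
high = speedup(before, 30)   # finishing in 30 minutes

print(f"{low:.0f}x to {high:.0f}x")  # prints "8x to 16x"
```

Going from 8 hours to 30-60 minutes is therefore an 8x to 16x reduction in batch run time.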
Flash Simplifies Data Warehouse Administration
Violin’s distributed architecture allows any number of LUNs to run at the same speed. The same
distributed architecture removes the concept of hot spots and other data locality issues,
so you do not need expensive or time-consuming software to distribute hot data, run tiering
or otherwise manage data locality. Violin arrays engage as many flash chips as possible
at all times to increase parallelization and deliver maximum speed.
Storage tasks such as backups and archives are orders of magnitude faster and can be run
concurrently, without impact.
Violin’s unique and dynamic all-flash, all-silicon, parallelized array enables data warehouse
applications to be predictable in their performance and linearly scalable, at a lower TCO than
traditional disk-based storage.
Violin Memory, Inc.
685 Clyde Ave, Mountain View, CA 94043
Ph: 1-888-9VIOLIN (984-6546)
E-mail: [email protected]
vmem-13q2-sb-datawarehouse-usltr-en-r01