Customer Case Study: Celtra

Benefits
• Increased the amount of ad-hoc analysis sixfold, leading to better-informed product design and quicker issue detection and resolution.
• Reduced the load on the analytics engineering team by increasing the number of people able to work with the data directly by a factor of four.
• Increased collaboration and improved the reproducibility and repeatability of analyses.
• Reduced the cost of cloud infrastructure through faster and easier management of Spark clusters.

Summary
• Celtra relied on data analytics to inform product design, troubleshoot anomalies, and fine-tune the performance of its display advertising software platform.
• Celtra had difficulty meeting the rising demand for data analysis because of the large scale of the data, the diversity of data sources, and the small size of the analytics team.
• Celtra selected Databricks as its data processing platform, enabling teams from Engineering, Product Management, and QA to work with data directly and perform the required analysis.

databricks.com Customer Case Study – Celtra

Business Background
Celtra provides agencies, media suppliers and brand leaders alike with an integrated, scalable HTML5 technology for brand advertising on smartphones, tablets and desktop. The platform, AdCreator 4, gives clients such as MEC, Kargo, Pepsi and Macy’s the ability to easily create, manage, and traffic sophisticated data-driven dynamic ads, optimize them on the go, and track their performance with insightful analytics.
Celtra collects a wide variety of data, including data related to internal company processes, data on clients’ usage of the product and, most importantly, data on consumers’ engagement with its clients’ ads.
In addition to providing analytics to its clients, Celtra is constantly exploring new ways to use this information to improve its offering, for example:
• Product usage analysis: Analyzing feature adoption, usage patterns and support cases to direct further development focus.
• Environment analysis: Assessing the feasibility of new product concepts and detecting trends by analyzing the context in which Celtra’s ads run, such as the publisher and device of choice.
• Technical performance: Monitoring ad load times closely across multiple dimensions, such as ad complexity, geography, connectivity and CDNs. Most recently, Celtra has been evaluating the performance benefits of SPDY and HTTP/2 for improved page load times.
• Quality control: Computing key performance metrics at each deployment and automatically detecting anomalies, so that regressions that would otherwise get lost in the averages are caught.

Challenge
As Celtra’s business grew, it struggled to meet the corresponding increase in demand for analytics, for three reasons:
1. Diversity of data sources: The production and engineering data from Celtra’s systems are scattered across different locations. Celtra did not have an easy way to combine the data from these disparate sources and perform the necessary analysis in a single analytics platform.
2. Large scale of data: Celtra’s production systems generate tens of terabytes of data monthly. Although Celtra has used Spark as its data processing platform since its early days and has accumulated a lot of expertise, this knowledge was limited to the team working on the analytics portion of the product.
3. Small analytics team: The analytics team consisted of only four people, who quickly became the bottleneck in servicing requests from Product Management, Engineering, and QA.
To overcome these challenges, Celtra needed a powerful data platform capable of integrating data from disparate sources while being fast enough to support interactive analysis at terabyte scale. The platform also had to be user-friendly enough to let teams outside of analytics perform analysis themselves, removing the bottleneck created by the small analytics team.
© 2015 Celtra Inc. All rights reserved.

Solution
Celtra adopted Databricks as its centralized analytics platform because the key features of Databricks addressed all of Celtra’s needs:
• Zero-management Apache Spark: Spark is an open source big data processing framework built for speed and scale. Databricks made Spark much easier to deploy by combining the power of Spark with a zero-management hosted platform on Amazon Web Services (AWS), allowing Celtra to take advantage of Spark without the DevOps burdens typically associated with big data infrastructure.
• Seamless connection to diverse data sources: Databricks provided built-in APIs to access data from AWS S3 and relational databases. Since the full power of Scala is available in Databricks, data from various web service APIs could be accessed as well. Celtra could consolidate its disparate data sources in Databricks.
• Reuse of production code in ad-hoc analyses: Since Databricks is based on Apache Spark, like Celtra’s production analytics pipeline, a lot of production code could be reused as the foundation for ad-hoc analyses instead of being rewritten in another framework.
• User-friendly interactive workspace: Databricks included an intuitive, multiuser interactive workspace for real-time analysis and visualization, enabling teams other than analytics to work with data directly in a single, easy-to-use environment.
With the adoption of Databricks, Celtra has enabled teams from Engineering, Product Management, and QA to perform complex data analysis on their own, using the massive volume of production data to improve product design, address anomalies rapidly, and fine-tune the performance of production systems.

Benefits
The most important benefit Celtra gained from deploying Databricks is the removal of the bottleneck within its analytics team, allowing it to meet the surging demand for big data analysis across the company. Since its introduction, Databricks has been used by over a third of the technical staff in Engineering, QA and Product Management. Because they can work with the data directly, many more questions have been asked and hypotheses tested, leading to better-informed product design and quicker issue detection and resolution. In the first four months alone after adopting Databricks, Celtra increased the amount of analysis done and insights obtained sixfold, and increased the number of people working with its most valuable data fourfold.
Aside from dramatically boosting the amount of analytics done, Celtra also experienced two additional benefits from using Databricks:
• Improved collaboration and reproducibility: The self-documenting nature of notebooks in Databricks meant that ad-hoc analysis code was automatically stored in a centralized location. This encouraged teams to build on the existing codebase instead of duplicating past efforts in new code, eventually leading to a maintainable collective codebase for ad-hoc analysis. Additionally, because all work was stored by default, past results could be easily reproduced when additional insight was needed.
“The notebooks feature in Databricks encourages good documentation by automatically recording the code written during an ad hoc analysis session.
This has had profound effects for us, from increasing collaboration and improving reproducibility to making analysis more approachable to a wider audience, who can start off by cloning someone else’s research.”
– Jaka Jančar, Chief Technology Officer at Celtra
• Reduced cloud infrastructure cost: The faster and easier provisioning, resizing, and deprovisioning of Spark clusters made Celtra engineers more comfortable with shutting down unused clusters whenever possible. Agility in cluster management also facilitated the use of Spot Instances by making their use less risky. Combined with the “Jobs” feature of Databricks, this allowed Celtra to substantially reduce the cost of its cloud infrastructure by scheduling long-running jobs that automatically provision and deprovision clusters as needed.
“Databricks is used by over a third of our technical staff, from engineering to product management, to help us make smart, data-driven decisions. After implementation, the amount of analysis performed has increased sixfold, meaning more questions are being asked, more hypotheses tested.”
– Jaka Jančar, Chief Technology Officer at Celtra

Evaluate Databricks with a trial account now: databricks.com/registration