Apache Hadoop Innovation Summit
Don’t Be Afraid of the Elephant in the Room
February 12 & 13, 2015
Westin San Diego, San Diego, CA
#Hadoop15

Confirmed Speakers
• Enterprise Engineer, Google
• Big Data Engineer, Groupon
• Senior Director, Data Solutions, The New York Times
• Director, Consumer Science Engineering, Netflix
• Lead Research Scientist, eBay
• Director, Data Engineering, Wikia
• Data Scientist, Live Nation
• Software Architect, AOL
• Manager, Business Analytics, LinkedIn
• Enterprise Architect, Art.com
• Director, Big Data, Sears
• Data Informatics Leader, GE
• Engineering Lead, Twitter
• Engineering Manager, Etsy
• Senior Director, Data Management, Time Warner Cable
• Principal Architect, Schneider Electric
• Data Architect, Simmons Prepared Foods
• Vice President, Data Platforms, ESPN
• Architect, Salesforce.com

Who Will You Meet?
There is no question that IE. provides the gold standard events in the industry and will connect you with decision makers within the analytics industry. You will be meeting senior level executives from major corporations and innovative small to medium size companies.

Company Size Of Attendees
• 1000+ Employees: 56%
• 300-999 Employees: 25%
• 50-299 Employees: 11%
• Less than 49 Employees: 8%
81% of attendees are from companies with at least 300 employees.

Job Title Of Attendees
• President / Principal: 3%
• SVP / VP: 21%
• C-Level: 12%
• Snr. Director / Director: 42%
• Global Head / Head: 13%
• Snr. Manager / Manager: 8%
• Academic: 1%
78% of attendees are at Director level or above.

Past Delegates include
• Director, Analytics - Facebook
• Director, Insight - Red Bull
• Vice President - Google
• Senior Director - Coca-Cola
• Data Engineer - Blizzard Entertainment
• Senior Vice President - Samsung

About The Summit
In the cutting edge market of Big Data, modern businesses are faced with the challenge of storage, management, analysis, visualization and security.
New technologies, solutions and challenges are exploding outwards as Big Data continues to grow exponentially. Hadoop, a huge piece of the puzzle, continues to present both exciting opportunities and engineering challenges. Can you become cloud native? What new alternative paradigms are available with Hadoop? What are the limitations of sole Hadoop use? How can you use it for machine learning? What about integration? Corporate accessibility? Ethics? These burning issues are what the summit looks to address.

The Apache Hadoop Innovation Summit is an industry-led event. In principle, this means that attendees are working in engineering, architectural and data science roles. In practice, this means fewer sales pitches and more in-depth discussion on what like-minded professionals are doing with their Big Data.

Confirmed Speaker Information

Sriram Krishnan
Big Data, Cloud, Distributed Systems Engineering Leader, Twitter
Sriram is an Engineering Manager on the Data Platform team at Twitter, where he leads a fantastic group of engineers building core big data processing frameworks such as Summingbird, Scalding, Spark, and Parquet. Prior to that, he was the tech lead of the Big Data Platform team at Netflix, where he built and open sourced Genie, which is Netflix’s Hadoop Platform as a Service. Sriram has a Ph.D. in Computer Science from Indiana University, and spent several years at the San Diego Supercomputer Center working on advanced cyberinfrastructures for science and engineering applications.

Gopal Krishnan
Director, Consumer Science Engineering, Netflix
Gopal Krishnan is Director of Consumer Science Engineering at Netflix. He leads many aspects of the AB testing innovation that helps personalize and improve the Netflix experience. Previously, he spent over a decade at Yahoo working on high scale infrastructure, including building the first global Yahoo homepage.
Data Platform at Twitter - Enabling Realtime & Batch Analytics at Scale
The data platform at Twitter supports engineers and data scientists running batch jobs on Hadoop clusters of several thousand nodes, and real-time jobs on top of systems such as Storm. In this presentation, I will discuss the overall data platform stack at Twitter. In particular, I will talk about Scalding, which is a Scala DSL for batch jobs using MapReduce; Summingbird, which is a framework for combined real-time and batch processing; and Tsar, which is a framework for real-time time-series aggregations. I will also discuss our experience with Spark, and where it fits in the overall ecosystem.

Netflix is renowned for its use of big data to improve personalization for our members. Previously, our personalization depended only on explicit user inputs like star ratings, taste preference, plays, etc. We recently incorporated additional implicit user signals such as on-device interactions like scrolling, navigation, and idle time. This session will focus on the challenges of using these new high volume data sources with billions of events/day. What are the challenges of maintaining data quality across hundreds of device types? How do we scale efficient nearline systems to serve this data for algorithmic consumption close to real time?

Arek Kaczmarek
Senior Director, Platform & Data Solutions, The New York Times
Arek Kaczmarek is responsible for the company's data platform and implementation of a new data platform based on Big Data technologies. He previously worked at Intel as a Senior Big Data Solutions Architect in the Data Center Group.
His skills include, among others, knowledge of the Big Data ecosystem, Hadoop/Hive/Pig, NoSQL, ELK (ElasticSearch/Logstash/Kibana), Lambda architecture, Oracle, data warehousing, ETL, BI Analytics, systems architecture, PaaS and the cloud.

Thanigai Vellore
Enterprise Architect, Art.com
Thanigai is an enterprise architect, technologist and innovator with over 14 years of progressive experience specializing in building large, highly scalable software systems. At Art.com, Thanigai is the lead architect responsible for defining and driving the technology roadmap initiatives for building the next generation technology vision and platform for the company. Thanigai’s interests and specialties include Hadoop/Big Data, NoSQL, Distributed Systems, Enterprise Architecture, Scalability, etc. Prior to joining Art.com, Thanigai worked in engineering roles at Sanmina and Flextronics.

Michael Lurye
Senior Director, Enterprise Data Management, Time Warner Cable
Mike Lurye is Senior Director, Enterprise Data Management for Time Warner Cable. He and his team are responsible for shared data warehousing assets and functions that benefit multiple Business Intelligence (BI) teams and their customers. This includes creation of enterprise data assets, BI architecture, quality assurance, and data quality management. In addition, Mike and his team are responsible for evaluation and adoption of Big Data technologies. Prior to joining TWC, Mike held Product Management and Product Marketing positions with Amdocs, focused on decision automation, mobile content and personalization solutions. Mike’s prior experience includes senior roles at major analytical CRM & marketing services companies.

The Next Enterprise Data Warehouse is a Hadoop Data Lake
As the data volumes and data generation velocity start growing, so does the value of all the enterprise data being generated.
At the New York Times, we have moved away from the traditional Enterprise Data Warehouse based on dimensional modeling and created a data lake where the time to market for data solutions and applications is much faster and much more robust than it ever was before. This presentation will provide an overview of the data lake approach, how to get there, and why it makes sense for companies with growing data volumes. The discussion will focus on cost, architecture, and time to market solutions.

Leveraging Hadoop in Polyglot Architectures
At Art.com, we have a heterogeneous web stack (Java, Node.js and .NET) to support our global brands and multiple websites. In this session, I will share our experience in leveraging the power of Hadoop to reach multiple business goals. The talk will also focus on the tools that help in addressing concerns related to polyglot architectures such as interoperability, multi-tenancy, schema evolution and standardization. I will also talk about some frameworks and packages that help in codifying best patterns and practices in integrating Hadoop with other systems such as traditional Business Intelligence systems, Web Analytics and other distributed computing technologies like Apache Spark.

Offloading ELT Workloads to Hadoop - Time Warner Cable’s Journey
Shifting ELT workloads from the enterprise data warehouse (EDW) to Hadoop is gaining traction for reducing costs, incorporating new data faster, and freeing up EDW capacity for user-facing analytics and BI workloads. But where do you start and what’s the best approach?
This presentation outlines the framework and processes that Time Warner Cable used to:
• Evaluate potential use cases and architectural options for Hadoop
• Identify ELT offload as the first focus area
• Choose technology components for the next generation enterprise data integration solution
• Apply best practices to configure the Hadoop environment for data integration

Weidong Zhang
Manager, Business Analytics, LinkedIn
Weidong Zhang earned his Ph.D. in computational fluid dynamics. He has a passion for analytics, research and data-driven decision-making. He has spent 10+ years in the data warehouse field, and combines his knowledge of business intelligence with Hadoop's massive data processing capability to address business needs. Currently, he works as a manager on the Data Analytics Infrastructure team at LinkedIn and leads the marketing and customer service data warehouse vertical.

Nazli Dereli
Data Scientist, Live Nation
Nazli Dereli is currently a data scientist on the Live Nation Userscoring team. She works on realtime classification of users and detection of abusive actors who stop users from buying tickets by holding the tickets. Before joining Live Nation, she worked in the Data Mining and Bioinformatics Lab at the University of California, Santa Barbara, focusing on mining brain activity networks to discover insights on human learning. Her interests include social and biological network analysis, and interesting problems in data and graph mining.

Beena Ammanath
Data Informatics Leader, GE
Beena is the Data Science Informatics Leader at GE. She leads the data efforts to support data science at GE. She works across the GE businesses to drive advanced analytics development leveraging big data technologies. She is passionate about using data and analytics to help cross-functional teams derive insights, articulate questions they did not know they had, and view data in more effective ways.
Beena has over 20 years’ experience in the data arena with a number of international organizations including British Telecom, E*trade and Thomson Reuters. She holds a Master's in Computer Science and an MBA in Finance.

Releasing the Power of Hadoop
As a data driven company, LinkedIn has very strong analytical teams, with many data engineers, data scientists, business analysts and business users who focus on different domains and businesses of the company. These users have different usage types and needs. Making them more productive and efficient is key to the company's success. This talk covers the ecosystem our Data Analytics Infrastructure (DAI) team built, which releases the power of Hadoop and makes it easy to use. This ecosystem contains several open sourced products, such as Pinot, Cubert, and Gobblin (for fast computation and real time reporting support), and some tools for automatic report generation. I will also cover the roles of our data warehouse team and our mission.

Detecting Abusive Actors in Hadoop Ecosystem
Live Nation is the global event ticketing leader with 400,000,000 tickets sold and 180,000 events ticketed in 19 countries. However, there is always the threat of a growing multibillion-dollar secondary market that intends to prevent users from buying primary tickets. This talk will explain how to detect such abusive actors in the Hadoop ecosystem using different approaches from offline, semi-online and online learning. We will go over the process of building our system, starting with different Hadoop-based approaches and leading to our final decision to use Apache Storm for realtime classification built on top of the Hadoop ecosystem.

Making Hadoop Relevant for the Industrial Internet
Data management and advanced analytics are core to GE’s recent success in delivering superior software-based services to customers across aviation, power generation, oil & gas, healthcare, and transportation.
The torrent of data generated from machines, networks, devices and data centers in industry verticals provides challenges and opportunities. The challenge is to make this machine data meaningful and actionable to deliver on opportunities around operational efficiencies. I will share real-world case studies, leveraging Hadoop to demonstrate tangible operational benefits - ranging from fuel savings to improving productivity to reducing unscheduled maintenance to enhancing on-time performance - by tightly integrating machines, networked sensors, industrial-strength data, and software to enable intelligent insights and effect measurable outcomes.

Ben Jackson
Software Engineer, AOL
As an engineering leader, Ben is as comfortable with strategy documents and presentations as he is deep in the code. He uses his understanding of the bigger picture to make the best tactical choices for his team in an agile environment. Ben’s specialties include: technical writing, Hadoop, SaaS applications, big data, parallel algorithms, distributed computing, and high performance computing.

Ranjan Sinha
Lead Research Scientist, eBay Inc.
Ranjan Sinha is a Lead Data Scientist at eBay Inc. where he has led projects that significantly enhanced consumers’ shopping experiences. Previously, Dr. Sinha was a research academic at the University of Melbourne and holds a PhD in Computer Science from RMIT University, Australia. He has over 25 publications in top-tier venues such as IEEE Big Data, VLDB Journal, and ACM SIGMOD. He was awarded the Sort Benchmark medals for JouleSort and PennySort and was amongst WSJ’s Top-12 Asia-Pacific Young Inventors. He is a regular speaker on Big Data and Data Science and co-organizes the popular Bay Area Search Meetup.

Ameya Kantikar
Big Data Infrastructure Engineer, Groupon
Ameya is a lead engineer on Groupon’s deal relevance and personalization system working on big data technologies such as Hadoop and HBase.
Earlier he also built a scalable message bus system that now powers Groupon's global service oriented architecture, handling hundreds of millions of messages. Before Groupon, he was a Sr Software Engineer at LiveOps working with distributed systems. Ameya holds a master's in Information Systems from Carnegie Mellon University and a master's in Computer Science from Pune University.

Valentino Tereshko
Enterprise Sales Engineer, Google
Valentino is a Solutions Architect with Google Cloud Platform, helping companies accelerate innovation. Valentino focuses on Big Data and Cloud Computing use cases for large Enterprises. Prior to Google, Valentino spent his time at several startups, ranging from Streaming Big Data to Cloud Monitoring and Financial Analytics, and he began his career as a trader and quant developer at an options trading firm in Chicago.

The Information
Apache Hadoop Innovation Summit
Date: February 12 & 13, 2015
Location: San Diego, California
Venue: Westin San Diego
Accommodation: Click here for online reservations

Registration Pricing
• Silver Pass: $1495 (Early Bird Price before Dec 12: $1295). Access to all sessions & networking events, 7 days access to presentations from the summit via ieOnDemand
• Gold Pass: $1795 (Early Bird Price before Dec 12: $1595). Access to all sessions, networking events & unlimited access to presentations from the summit via ieOnDemand
• Diamond Pass: $1995 (Early Bird Price before Dec 12: $1795). Access to all sessions, networking events, annual subscription to all content on the Big Data & Analytics channels via ieOnDemand
• Access All Areas Pass: $2295. Access to all sessions of the Apache Hadoop Innovation Summit, Data Science Innovation Summit & Predictive Analytics Innovation Summit, annual subscription to content on the Big Data & Analytics channels via ieOnDemand
• 1 Day Pass: $795. Full access to the sessions on your chosen day of the summit, 7 days access to presentations from the summit via ieOnDemand
• On-Demand Pass: $600. Unlimited access to presentations from the summit via ieOnDemand, including presentations, interviews & the ability to contact speakers

Group Discount Offers
• 3 Silver Passes: $3000 ($1000 per attendee)
• 5 Silver Passes: $4500 ($900 per attendee)
• 3 Gold Passes: $3900 ($1300 per attendee)
• 5 Gold Passes: $6000 ($1200 per attendee)
• 3 Diamond Passes: $4500 ($1500 per attendee)
• 5 Diamond Passes: $7000 ($1400 per attendee)
For larger groups or special requests contact Bola by calling +1 415 692 5378 or email [email protected]
* Team discounts are applicable at the point of registration only.

Ways to Register
Phone: +1 415 692 5378
Fax: +1 323 446 7673
Online: Register Here

Registration Form
Apache Hadoop Innovation Summit
February 12 & 13, 2015 | Westin San Diego | San Diego, CA
For registration or more information on the program, please call Bola on +1 415 692 5378, or fax this registration form to +1 (323) 446 7673

1. Delegate Information...
NAME OF EACH ATTENDEE
TITLE OF EACH ATTENDEE
DEPARTMENT
COMPANY
INDUSTRY
ADDRESS
CITY
STATE/PROVINCE
ZIP/POSTAL CODE
COUNTRY
EMAIL OF EACH ATTENDEE
BUSINESS PHONE NUMBER

2. Pass Types...
Early Bird Pass Options until December 12, 2014
• Early Bird Silver: $1295 Attendees ____
• Early Bird Gold: $1595 Attendees ____
• Early Bird Diamond: $1795 Attendees ____
• Early Bird One Day: $795 Attendees ____

Regular Pass Options after December 12, 2014
• Silver Pass: $1495 Attendees ____
• Gold Pass: $1795 Attendees ____
• Diamond Pass: $1995 Attendees ____
• One Day: $995 Attendees ____

Group Discount Pass Options
• 3 Silver Passes $3000 ($1000 per attendee)
• 5 Silver Passes $4500 ($900 per attendee)
• 3 Gold Passes $3900 ($1300 per attendee)
• 5 Gold Passes $6000 ($1200 per attendee)
• 3 Diamond Passes $4500 ($1500 per attendee)
• 5 Diamond Passes $7000 ($1400 per attendee)
For larger groups or special requests contact Bola by calling +1 415 692 5378 or email [email protected]
Group passes only available when all participants register together.

Pass Descriptions:
• Silver Pass: Access to all sessions & networking events
• Gold Pass: Access to all sessions, networking events & unlimited access to the summit presentations via ieOnDemand
• Diamond Pass: Access to all sessions, networking events, annual subscription to all content on the Big Data & Analytics channels via ieOnDemand
• Access All Areas Pass: Access to all sessions of the Apache Hadoop Innovation Summit, Data Science Innovation Summit & Predictive Analytics Innovation Summit, networking events, annual subscription to all content on the Big Data & Analytics channels via ieOnDemand

3. Payment Options...
• Check (Make checks payable to The Innovation Enterprise Ltd)
• Visa
• Mastercard
• American Express
• Diners Club
• Discover
• Invoice me
CARD NUMBER
EXPIRATION DATE
SECURITY NO.
CARDHOLDER’S NAME
CARDHOLDER’S SIGNATURE
BILLING ADDRESS - (same as above)
INDUSTRY

Prices are exclusive of VAT. Places are transferable without any charge to another Summit occurring within 12 months of the original purchase. Team discounts are applicable at the point of registration only.
Any cancellations within a group registration will in turn incur an increase in registration fee for the remaining group participants. Cancellations before January 12, 2015 incur an administrative charge of 50%. If you cancel your registration after January 12, 2015 you will be charged the full fee. You must notify The Innovation Enterprise in writing of a cancellation, or you will be charged the full fee. The Innovation Enterprise reserves the right to make changes to the program without notice.
NB: FULL PAYMENT MUST BE RECEIVED BEFORE THE EVENT.

Schedule
Day One - February 12
08.30 - 10.00 Session One
10.00 - 10.30 Coffee Break
10.30 - 12.00 Session Two
12.00 - 13.30 Lunch
13.30 - 15.00 Session Three
15.00 - 15.30 Coffee Break
15.30 - 17.00 Session Four
17.00 - 19.00 Networking Drinks

Day Two - February 13
08.30 - 10.00 Session Five
10.00 - 10.30 Coffee Break
10.30 - 12.00 Session Six
12.00 - 13.30 Lunch
13.30 - 15.00 Session Seven
15.00 - 15.30 Coffee Break
15.30 - 17.00 Session Eight

Sponsors
Platinum Sponsor
Media Partner
For sponsorship information contact Giles Godwin-Brown

2015 Calendar January May Big Data Innovation Summit Big Data Innovation Summit January 22 & 23, Las Vegas May 13 & 14, London Cloud Innovation Expo Big Data & Analytics in Healthcare January 22 & 23, Las Vegas May 13 & 14, Philadelphia February Chief Data Officer Summit September Continued September 23 & 24, Boston May 20 & 21, San Francisco Data Science Innovation Summit February 12, San Diego Apache Hadoop Innovation Summit February 12 & 13, San Diego The Digital Oilfield Innovation Summit Big Data & Analytics Innovation Summit November Big Data & Analytics for Pharma November 4 & 5, Philadelphia June Big Data & Marketing Innovation Summit Big Data & Analytics for Pharma Big Data for Finance June 10 & 11, Philadelphia Open Data Innovation Summit June 10 & 11, Boston February
19 & 20, Houston Big Data Innovation Summit November 4 & 5, Miami November 11 & 12, Boston Data Visualization Summit November 11 & 12, London Big Data & Analytics for Retail Summit June 17 & 18, Chicago Chief Data Officer Summit November 11 & 12, London February 27 & 28, Singapore August March Big Data & Analytics Innovation Summit Big Data & Analytics Innovation March 25 & 26, Brazil August 5 & 6 Kuala Lumpur April Big Data & Analytics Innovation Summit Big Data & Analytics Innovation Summit November 11 & 12, London Big Data & Analytics Innovation Summit November 25 & 26, Beijing August 19 & 20, Brazil December September Big Data & Analytics in Banking Summit April 15 & 16, Santa Clara Big Data & Analytics Innovation Summit December 2 & 3, New York Data Visualization Summit September 17 & 18, Sydney Big Data Innovation Summit April 15 & 16, Santa Clara DataTalent April 15 & 16, Santa Clara Big Data Innovation Summit April 23 & 24, Hong Kong Chief Data Officer Summit December 2 & 3, New York Data Visualization Summit September 23 & 24, Boston Flagship Women Hadoop High Tech Government CXO Finance Expected Healthcare Pharma Oil & Gas Partnership Opportunities: Giles Godwin-Brown | [email protected] | +1 415 692 5498 Attendee Invitation: Sean Foreman | [email protected] | +1 415 692 5514