The Apache Flink® Conference
Stream Processing | Event Driven | Real Time
San Francisco April 9–10, 2018
March 6, 2018
Stream processing is a powerful paradigm, especially when backed by a system like Apache Flink. With each release and year, we see Flink being used for more challenging use cases and applications. But beyond the individual application (though it may be grand and challenging in itself), stream processing is a much broader building block: the foundational piece of a platform that brings together the different parts of a data architecture, a platform that integrates data analytics, data ingestion, SQL, machine learning, data provenance, databases, and other aspects of a data-driven infrastructure in a meaningful way. In this keynote, we look at what goes into building a stream processing platform that is more than the sum of its parts.
February 20, 2018
Stream processing in conjunction with consistent, durable, reliable stream storage is kicking the Big Data revolution up a notch. This modern paradigm is enabling a new generation of data middleware that delivers on the streaming promise of a simplified and unified programming model. From data ingest, transformation, and messaging to search, time series, and more, a robust streaming data ecosystem means we’ll all be able to more quickly build applications that solve problems we could not solve before.
March 16, 2018
Over the past few months, the Apache Flink and Apache Beam communities have been busy developing an industry-leading solution for authoring batch and streaming pipelines with Python. This was made possible by a significant effort to revamp Beam’s portability framework, build the corresponding Flink Runner, and simplify Flink’s artifact distribution and deployment mechanisms. What is the “killer big-data app” enabled by this integration? Production TensorFlow pipelines. Building production machine learning pipelines that process large distributed data sets can get complex. In this talk, we will describe a set of open source libraries developed at Google that simplify and unify the pre- and post-processing stages of a production TensorFlow pipeline. These libraries are authored on Beam’s Python SDK and can be run on Apache Flink at scale. Last, but not least, we will describe how Beam and Flink aim to bring the power of big data to newer audiences, in particular developers of the Go programming language. Beginner Anand Iyer, Google Cloud
February 21, 2018
Last year, during the Flink Forward conference, a group of Yelp engineers saw in Apache Flink a perfect candidate to solve many of the pressing challenges that Yelp’s fast-growing data pipeline was encountering. One year later, Yelp is running hundreds of streaming Flink jobs. These provide everything from simple data transformations to complex stateful stream joins and streaming SQL queries. In this talk, I’ll introduce the challenges that the Yelp data pipeline was facing and how Flink tackled them. I’ll then focus on the integration of Apache Flink into an existing multi-regional, Kafka-based streaming ecosystem at scale. I’ll discuss the challenges that we encountered during our journey and some of the best practices we learned while building and scaling the infrastructure for Flink. Intermediate Enrico Canzonieri, Yelp Inc.
January 28, 2018
Extracting insights out of continuously generated data requires a stream processor with powerful data analytics features, such as Apache Flink. A streaming data pipeline with Flink typically includes a storage component to ingest and serve the data. Pravega is a stream store that ingests and stores stream data permanently, making the data available for tail, catch-up, and historical reads. One important challenge for such pipelines is coping with variations in the workload. Daily cycles and seasonal spikes might require the provisioning of the application to adapt accordingly. Pravega has a feature called stream scaling, which enables the capacity offered for ingesting the events of a stream to grow and shrink over time according to the workload. Such a feature is useful when the downstream application is able to accommodate such changes and scale its own provisioning accordingly. In this presentation, we introduce stream scaling in Pravega and show how Flink jobs leverage this feature to rescale stateful jobs according to variations in the workload. Advanced Flavio Junqueira, Pravega by DellEMC
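The core idea behind stream scaling can be pictured as splitting the routing-key hash space a stream owns. The sketch below is purely illustrative, not the Pravega API; segment structure, the split policy, and all names are assumptions for the sake of the example:

```python
# Illustrative sketch (not the Pravega API): stream scaling modeled as
# splitting a hot segment so that each resulting segment owns half of the
# original routing-key hash range, doubling ingestion parallelism.

class Segment:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi          # owned key-hash range [lo, hi)

def scale_up(segments, hot_index):
    """Split the hot segment in two so ingestion capacity can grow."""
    hot = segments.pop(hot_index)
    mid = (hot.lo + hot.hi) / 2
    segments[hot_index:hot_index] = [Segment(hot.lo, mid), Segment(mid, hot.hi)]
    return segments

def route(segments, key_hash):
    """Pick the segment that owns a routing-key hash in [0, 1)."""
    return next(s for s in segments if s.lo <= key_hash < s.hi)

segments = [Segment(0.0, 1.0)]
segments = scale_up(segments, 0)           # one segment becomes two
assert (route(segments, 0.25).lo, route(segments, 0.25).hi) == (0.0, 0.5)
```

A downstream job that wants to exploit the extra capacity would, analogously, repartition its keyed state along the same split boundaries.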
Is it possible to build an efficient, focused web crawler using Flink? That was the question that led to the creation of the flink-crawler open source project. In this talk I’ll discuss how we use Flink’s support for AsyncFunctions and iterations to create a scalable web crawler that continuously and efficiently performs a focused web crawl with no additional infrastructure. I’ll also discuss some of the testing and debugging challenges encountered when using features such as AsyncFunctions and iterations. Intermediate/Advanced Ken Krugler, Scale Unlimited
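The reason async I/O matters for a crawler is that page fetches are latency-bound, so issuing them concurrently (with a cap) keeps throughput high. A minimal language-agnostic sketch of that idea, using Python's asyncio rather than Flink's AsyncFunction API, with a stand-in `fetch` instead of a real HTTP client:

```python
# Concept sketch: bounded-concurrency asynchronous fetching, the same idea
# Flink's AsyncFunction enables inside a streaming job. The fetch function
# and its fake outlinks are hypothetical stand-ins for real HTTP requests.
import asyncio

async def fetch(url):
    await asyncio.sleep(0.01)               # stands in for network latency
    return url, [url + "/a", url + "/b"]    # page plus discovered outlinks

async def crawl(seeds, max_in_flight=10):
    sem = asyncio.Semaphore(max_in_flight)  # cap concurrent requests
    async def bounded(url):
        async with sem:
            return await fetch(url)
    # All fetches are in flight at once (up to the cap), so total time is
    # governed by the slowest request, not the sum of all requests.
    return await asyncio.gather(*(bounded(u) for u in seeds))

results = asyncio.run(crawl(["http://example.com"]))
```

In the real project, the discovered outlinks would feed back into the crawl frontier via Flink's iteration support, which is what makes the crawl continuous.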
Stream processing plays an important role in Uber’s real-time business. It has been widely used to support many use cases at Uber, like surge pricing and Restaurant Manager. To support all the stream processing use cases at Uber, the stream processing platform team has built the Flink as a Service platform. In this talk, we will present the design and architecture of the Flink as a Service platform. Specifically, we will discuss how we manage deployment, how we make the platform highly available to support critical real-time business, how we scale the platform to support the entire company, and our experience running the platform in production. Intermediate Shuyi Chen, Uber
The eBay monitoring platform collects metrics, events, and logs from network devices, Kubernetes clusters, applications, and other monitoring tools; the data is fed into a preprocessing and alerting engine to enrich, normalize, dedupe, alert, and more. We started by building the first generation of the preprocessing and alerting engine on top of Storm, then migrated from Storm to Flink to embrace its unified streaming/batch interface, exactly-once semantics, and rich support for windowing. Our monitoring platform aims to provide self-service alert definition, which means tens of thousands of alert rules on streaming data. So we built a metadata-driven processing engine on top of Flink. It provides capability APIs that translate capabilities into Flink DataStream APIs, and policy APIs to submit or update users’ processing and alerting requests and deploy those requests to a shared Flink job for processing without restarting jobs. An alert DSL is also designed to enable self-service alert definition. Beginner Garret Li and Ralph Su, eBay
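The key property of a metadata-driven engine is that rules are data, not code, so they can change while the job keeps running. A toy sketch of that idea (the rule shape and helper names are hypothetical, not eBay's APIs):

```python
# Concept sketch: alert rules live in a mutable registry that a long-running
# job consults per record, so adding or updating a rule needs no job restart.
rules = {}   # rule_id -> (metric_name, threshold); shape is assumed

def upsert_rule(rule_id, metric, threshold):
    rules[rule_id] = (metric, threshold)   # hot update, no restart

def evaluate(record):
    """Return the ids of all rules the record violates."""
    return [rid for rid, (metric, th) in rules.items()
            if record["metric"] == metric and record["value"] > th]

upsert_rule("r1", "cpu", 90)
assert evaluate({"metric": "cpu", "value": 95}) == ["r1"]
upsert_rule("r1", "cpu", 99)               # rule tightened in place
assert evaluate({"metric": "cpu", "value": 95}) == []
```

In the real system the registry updates would arrive as a second (broadcast) stream into the shared Flink job rather than as direct function calls.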
Event-driven microservices are great for decomposing and decoupling large monolithic platforms, but as these platforms get decoupled, we need a way to reconcile and balance transactions across microservice boundaries. In this talk, we will demonstrate a prototype we developed using Apache Kafka and Flink to showcase this balance and control mechanism. Beginner Faraz Babar, American Express
Over 109 million subscribers are enjoying more than 125 million hours of TV shows and movies per day on Netflix. This leads to a massive amount of data flowing through our data ingestion pipeline to improve service and user experience. It powers various data analytics use cases like personalization, operational insight, and fraud detection. At the heart of this massive data ingestion pipeline is a self-serve stream processing platform that processes 3 trillion events and 12 PB of data every day. We have recently migrated this stream processing platform from Samza to Flink. In this talk, we will share the challenges and issues that we ran into when running Flink at scale in the cloud. We will dive deep into the troubleshooting techniques and lessons learned. Intermediate Steven Wu, Netflix
CloudStream is a fully managed service in Huawei Cloud. It supports several features, such as on-demand billing, an easy-to-use online Stream SQL editor, real-time testing of Stream SQL, multi-tenancy, security isolation, and more. We chose Apache Flink as the streaming compute platform. Inside a CloudStream cluster, Flink jobs can run on YARN, Mesos, or Kubernetes. We have also extended Apache Flink to meet the needs of IoT scenarios, and have run specialized tests of Flink’s reliability in cooperation with universities. Finally, we continuously improve the infrastructure around CloudStream, including open source projects and cloud services. CloudStream is different from any other real-time analysis cloud service, and we will also share its architecture and design principles.
SQL is the lingua franca of data processing, and everybody working with data knows SQL. Apache Flink provides SQL support for querying and processing batch and streaming data. Flink’s SQL support powers large-scale production systems at Alibaba, Huawei, and Uber. Based on Flink SQL, these companies have built systems for their internal users as well as publicly offered services for paying customers. In our talk, we will discuss why you should and how you can (not being Alibaba or Uber) leverage the simplicity and power of SQL on Flink. We will start by exploring the use cases that Flink SQL was designed for and present real-world problems that it can solve. In particular, you will learn why unified batch and stream processing is important and what it means to run SQL queries on streams of data. After exploring why you should use Flink SQL, we will show how you can leverage its full potential. The Flink community has recently been working on a service that integrates a query interface, (external) table catalogs, and result serving functionality for static, appending, and updating result sets. We will discuss the design and feature set of this query service and how it can be used for exploratory batch and streaming queries, ETL pipelines, and live updating query results that serve applications, such as real-time dashboards. The talk concludes with a brief demo of a client running queries against the service. Intermediate Fabian Hueske and Timo Walther, data Artisans
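What "running SQL on a stream" means in practice is that a query like `SELECT user, COUNT(*) FROM clicks GROUP BY user` is evaluated incrementally: each arriving event updates the result row for its key. A toy sketch of that updating-result semantics (plain Python, not Flink SQL itself):

```python
# Concept sketch: a GROUP BY COUNT over a stream does not wait for the end
# of the input; it emits an updated (key, count) row per incoming event,
# forming the changelog of an updating result set.
from collections import defaultdict

def continuous_count(events):
    counts = defaultdict(int)
    updates = []                  # the changelog an updating result produces
    for user in events:
        counts[user] += 1
        updates.append((user, counts[user]))
    return updates

updates = continuous_count(["alice", "bob", "alice"])
assert updates == [("alice", 1), ("bob", 1), ("alice", 2)]
```

A downstream dashboard consuming this changelog simply overwrites the row for `alice` when `("alice", 2)` arrives, which is exactly the live-updating behavior the query service serves.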
February 21, 2018
November 11th (Double Eleven) has become Alibaba’s Global Shopping Festival, and Alibaba generated gross merchandise volume (GMV) of US$25.3 billion on Nov 11 this year. On that day, Alibaba’s real-time computing engine Blink, Alibaba’s version of Flink, processed more than 472 million records per second at peak, with sub-second latency. In this talk, we will introduce the optimizations in Blink, such as the credit-based network stack, dynamic load balancing, and improved checkpointing for large-scale jobs. Some of this work has been contributed to Flink and will be released in Flink 1.5. Advanced Feng Wang, Alibaba
January 28, 2018
Apache Flink is a popular framework for real-time stream computing. Many stream compute algorithms require trailing data in order to compute the intended result. One example is computing the number of user logins in the last 7 days. This creates a dilemma where the results of the stream program are incomplete until the runtime of the program exceeds 7 days. The alternative is to bootstrap the program using historic data to seed the state before shifting to real-time data. This talk will discuss alternatives for bootstrapping programs in Flink. Some alternatives rely on technologies exogenous to the stream program, such as enhancements to the pub/sub layer, that are more generally applicable to other stream compute engines. Other alternatives include enhancements to Flink source implementations. Lyft is exploring another alternative using orchestration of multiple Flink programs. The talk will cover why Lyft pursued this alternative and future directions to further enhance bootstrapping support in Flink. Intermediate Jason Carey, Lyft
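The bootstrap dilemma described above can be made concrete with the 7-day login example: seed the keyed state by replaying history first, then continue from the live feed, so the count is complete from day one. A minimal sketch under those assumptions (all names are hypothetical, not Lyft's implementation):

```python
# Concept sketch of state bootstrapping: phase 1 replays historic events to
# seed per-key state, phase 2 continues from the live stream over the same
# state, so a trailing-window metric is correct immediately.
from collections import defaultdict

def bootstrap_then_stream(historic, live):
    logins = defaultdict(list)              # per-user login timestamps
    for user, ts in historic:               # phase 1: replay history
        logins[user].append(ts)
    for user, ts in live:                   # phase 2: switch to live feed
        logins[user].append(ts)
    return logins

WEEK = 7 * 24 * 3600
def logins_last_week(logins, user, now):
    return sum(1 for ts in logins[user] if now - ts <= WEEK)

state = bootstrap_then_stream([("u1", 100)], [("u1", 200)])
assert logins_last_week(state, "u1", now=300) == 2
```

The hard parts the talk addresses are exactly what this sketch elides: performing the handover from historic to live data without gaps or duplicates, and doing it across parallel sources.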
Event streams as the source of truth for applications have risen in popularity. ThoughtWorks lists the technique in the Assess phase of their Technology Radar (Nov ’17). At DriveTribe, we’re slightly ahead of the curve, as we’ve been using this technique in production with Flink since the launch of our platform in Nov 2016. In this talk, we discuss how we can use this technique with some functional programming and Flink to successfully design distributed and highly scalable applications and data platforms. Beginner Aris Koliopoulos and Alex Garella, DriveTribe
TensorFlow is all kinds of fancy, from helping startups raise their Series A in Silicon Valley to detecting whether something is a cat. However, when things start to get “real”, you may find yourself no longer dealing with mnist.csv and instead needing to do large-scale data prep as well as training. This talk will explore how TensorFlow can be used in conjunction with Apache Beam, Flink, and Spark to create a full machine learning pipeline, including the annoying “feature engineering” and “data prep” components that we like to pretend don’t exist. We’ll also talk about how these feature prep stages need to be integrated into the serving layer. In addition to Apache Beam, this talk also examines changing industry trends, like Apache Arrow, and how they impact cross-language development for things like deep learning. Even if you’re not trying to raise a round of funding in Silicon Valley, this talk will give you tools to work on interesting machine learning problems at scale. Beginner Holden Karau, Google Cloud
There are many batch and stream scenarios at Alibaba, and many data analysts are non-technical and like to use GUI or scripting tools to work with data in support of business decisions. We’d like to share our experiences developing algorithms on Apache Flink and building a web UI and client that help people easily use algorithms for data analysis, training, and inference with machine learning models. Intermediate Xu Yang
January 30, 2018
Apache Flink has become the engine powering many streaming platforms at companies like Uber and Alibaba. The benefits of Flink are well documented, including its streaming-first design allowing for low-latency streaming. Flink supports deployments with many resource managers, including YARN, Mesos, and Kubernetes. Deploying on Kubernetes has many benefits, including making Flink “Cloud Native.” Cloud Native deployments benefit from (1) log aggregation, (2) tracing, (3) containers, and (4) extensibility, among other things. This talk is about making Flink Cloud Native and the benefits that come as a result. These benefits include distributed tracing, log aggregation, service mesh capabilities, and new patterns of exposing Flink’s APIs. In this talk, I walk through the process of taking low-level constructs from Flink and plugging them into the Cloud Native ecosystem. I’ll walk through getting distributed tracing, log aggregation, and a service mesh working with Apache Flink on Kubernetes. I do this with a demo of live streaming data produced from automobile telemetry. Attendees can expect to learn about the benefits of Cloud Native deployments of Flink and how to get started with them. They will also walk away with new patterns that can be used in Flink deployments. Intermediate Jowanza Joseph, One Click Retail
January 28, 2018
Operationalizing machine learning models is never easy. Our team at Comcast has been challenged with operationalizing predictive ML models to improve customer care experiences. Using Apache Flink, we have been able to apply real-time streaming to all aspects of the machine learning lifecycle. This includes data feature exploration and preparation by data scientists, deploying live models to serve near-real-time predictions, and validating results for model retraining and iteration. We will share best practices and lessons learned from Flink’s role in our operationalized lifecycle, including: • Executing as the “Prediction Pipeline” – a model container environment for near-real-time streaming and batch predictions • Preparing streaming features and data sets for model training, as input for production model predictions, and for a continually updated customer context • Using connected streams and savepoints for “live in the dark”, multi-variant testing, and validation scenarios • Incorporating Flink’s Queryable State as an approach to the online “Feature Store” – a data catalog for reuse by multiple models and use cases • Enabling versioned models, versioned feature sets, and versioned data through DevOps approaches Intermediate Dave Torok and Sameer Wadkar, Comcast Corporation
Many marketplace products (e.g., pricing, positioning, etc.) at Uber require intensive real-time optimizations. Such applications help Uber automatically maintain marketplace reliability, generate market insights, and improve network efficiency across more than 600 cities in real time. Underneath, Uber engineers leverage Apache Flink to build a platform that not only runs compute-intensive optimization models, but also reacts very quickly to rapid changes in the marketplace. In this talk, I will cover the compute platform that leverages Apache Flink to (i) aggregate billions of real-time and forecasted demand and supply data points across the globe, (ii) trigger on-demand optimization models to respond to changes in the marketplace, and (iii) scale both horizontally and vertically as we expand the platform to onboard new applications and experiences. Intermediate Xingzhong Xu, Uber Technologies Inc.
Flink has supported Apache Mesos officially since the 1.2 release, and many users had been using them together even before that. The latest releases, 1.4 and 1.5 (not yet released at the time of writing), add a deeper integration for resource schedulers such as Mesos, which has also resulted in many new features around this integration. But what does that mean in practice for operating a large cluster? In this talk, we will discuss operational best practices, alongside some pitfalls, for operating large Flink clusters on top of Apache Mesos, including topics such as deployments, monitoring, scaling, upgrades, and debugging. Intermediate Jörg Schad, Mesosphere
February 21, 2018
Within fintech, catching fraudsters is one of the primary opportunities to use streaming applications to apply ML models in real time. This talk reviews our journey to bring fraud decisioning to our tellers at Capital One using Kafka, Flink, and AWS Lambda. We will share our learnings and experiences with common problems such as custom windowing, breaking down a monolithic app into small queryable-state apps, feature engineering with Jython, dealing with backpressure from combining two disparate streams, model/feature validation in a regulatory environment, and running Flink jobs on Kubernetes. Intermediate Andrew Gao and Jeff Sharpe, Capital One
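"Custom windowing" often comes up in fraud detection because fixed windows fit the domain poorly; session-style windows that close after a gap of inactivity are a common alternative. A small illustrative sketch of that idea (the 30-second gap is an assumed value, and this is plain Python, not Flink's window API):

```python
# Concept sketch of gap-based session windows: events belong to the same
# window until a gap of inactivity longer than GAP separates them.
GAP = 30  # seconds of inactivity that ends a session (assumed value)

def sessionize(timestamps):
    sessions, current = [], []
    for ts in sorted(timestamps):
        if current and ts - current[-1] > GAP:
            sessions.append(current)       # gap exceeded: close the session
            current = []
        current.append(ts)
    if current:
        sessions.append(current)
    return sessions

assert sessionize([0, 10, 100, 110]) == [[0, 10], [100, 110]]
```

In a streaming job the same logic runs per key (e.g., per card number), with a timer firing when the gap elapses rather than sorting a finished list.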
January 28, 2018
In this talk, we are going to present dA Platform, a production-ready platform for stream processing with Apache Flink® from data Artisans. The platform includes open source Apache Flink and Application Manager, a central deployment and management component. dA Platform schedules clusters on Kubernetes, deploys stateful Flink applications, and controls these applications and their state. We will look at how dA Platform makes stream processing easier for enterprises and explain how it works. The talk will also include a demonstration of dA Platform’s capabilities.
April 3, 2018
“Customer experience is the next big battle ground for telcos,” Amit Akhelikar, Global Director of Lynx Analytics, recently proclaimed at TM Forum Live! Asia in Singapore. But how do you fight in this battle? A common approach has been to keep “under control” some well-known network quality indicators, like dropped calls, radio access congestion, availability, and so on; but this has proven not to be enough to keep customers happy, just as a siege weapon is not enough to conquer a city. But what if it were possible to know how customers perceive services, at least the most demanded ones, like web browsing or video streaming? That would be like a squad of archers ready for battle. And even having that, how do you extract value from it and take action in no time, giving our skilled archers the right targets? Meet CANVAS (Customer And Network Visualization and AnalyticS), one of the first LATAM implementations of a Flink-based stream processing use case for a telco, which successfully combines leading and innovative technologies like Apache Hadoop, YARN, Kafka, NiFi, Druid, and advanced visualizations with Flink core features like non-trivial stateful stream processing (joins, windows, and aggregations on event time) and CEP capabilities for alarm generation, delivering a next-generation tool for SOC (Service Operation Center) teams. Intermediate David Reniz and Dahyr Vergara, everis
January 28, 2018
Stream processing has evolved quickly in a short time: a few years ago, stream processing was mostly simple real-time aggregations with limited throughput and consistency. Today, many stream processing applications have complex logic, strict correctness guarantees, high performance, low latency, and maintain large state without databases. Stream processing has become much more sophisticated because the stream processors – the systems that run the application code, coordinate the distributed execution, route the data streams, and ensure correctness in the face of failures and crashes – have become much more technologically advanced. In this talk, we walk through some of the techniques and innovations behind Apache Flink, one of the most powerful open source stream processors. In particular, we plan to discuss: the evolution of fault tolerance in stream processing, Flink’s approach of distributed asynchronous snapshots, and how that approach looks today after multiple years of collaborative work with users running large-scale stream processing deployments; how Flink supports applications with terabytes of state and offers efficient snapshots, fast recovery, rescaling, and high throughput; how to build end-to-end consistency (exactly-once semantics) and transactional integration with other systems; and how batch and streaming can both run on the same execution model with best-in-class performance. Intermediate Stefan Richter, data Artisans
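Flink's distributed snapshots are based on checkpoint barriers that flow through the stream with the records: each operator captures its state exactly at the point the barrier passes, so all captured states are consistent without pausing the pipeline. A greatly simplified single-operator sketch of that semantics:

```python
# Greatly simplified sketch of barrier-based snapshotting: the checkpoint
# barrier is an in-band marker; state captured when it passes reflects
# exactly the records that preceded it.
BARRIER = object()

def run_with_checkpoint(records):
    count = 0            # the operator's state: records seen so far
    snapshots = []
    for r in records:
        if r is BARRIER:
            snapshots.append(count)   # state as of the barrier position
        else:
            count += 1
    return count, snapshots

count, snapshots = run_with_checkpoint([1, 2, BARRIER, 3])
assert count == 3 and snapshots == [2]   # snapshot excludes post-barrier records
```

On failure, every operator restores its snapshot and sources rewind to just after the barrier, which is what yields exactly-once state semantics; the real mechanism additionally aligns barriers across multiple input channels and writes state asynchronously.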
As more workloads migrate from batch to stream processing, there is ever-increasing demand on streaming applications to operate based on more complex business rules. Apache Flink provides many high- and low-level APIs for writing a wide breadth of stateful streaming applications. However, as applications get more complex, they can become harder to understand and debug. This talk will discuss best practices for writing well-tested streaming applications, with an eye towards working with stateful operators and validating against failure. We will do this by walking through the world’s most thoroughly tested implementation of word count. Beginner Seth Wiesman, MediaMath
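One practice in this spirit is to keep the operator's core a pure function of (state, input), which makes both stateful behavior and recovery trivially unit-testable. A sketch of that style using word count (plain Python, not the talk's actual code):

```python
# Concept sketch: a stateful operator written as a pure function over
# (state, input) -> (new_state, output), so tests need no running cluster.
def count_words(state, word):
    new_state = dict(state)
    new_state[word] = new_state.get(word, 0) + 1
    return new_state, (word, new_state[word])

# Ordinary assertions cover normal operation...
state = {}
for w in ["to", "be", "to"]:
    state, emitted = count_words(state, w)
assert state == {"to": 2, "be": 1}

# ...and recovery is just resuming from a saved state snapshot:
restored, emitted = count_words({"to": 2, "be": 1}, "to")
assert emitted == ("to", 3)
```

In a real Flink test the same idea is exercised through test harnesses that feed records and checkpoints into the operator, but the pure-function core keeps most of the logic testable without them.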
The Flink metrics module allows using Dropwizard-like metrics and reporters in Flink pipelines. It opens a rich opportunity not only to monitor the health of Flink pipelines but also to attach real-time business intelligence metrics that run alongside existing Flink data jobs, thus avoiding the need to build a separate BI data-flow infrastructure. However, most of the work still resides with application developers: they have to define how those metrics are registered, calculated, and attached to Flink functional operators, and then Flink does the computation and reporting. The most challenging task is adding BI-like metrics, with key dimensions dynamically extracted from streamed data and metric values that likely need to be aggregated over multiple Flink tasks. We present a toolkit that extends Flink metrics and makes it possible to decorate existing Flink operators with both simple health-check and complex BI-like metrics, updated and observable in real time through IT monitoring and BI visualization dashboards. This work is based on a real-world production use case of computing retail merchandise prices in real time for the Walmart e-commerce catalog using Flink. Intermediate Sanket Kadarkar, Tata Consultancy Services Ltd.
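The "dynamically extracted dimensions" part is what ordinary static metric registration cannot do: the set of metric series is not known up front but emerges from the data. An illustrative sketch of that pattern (the class and its API are hypothetical, not the toolkit's actual interface):

```python
# Concept sketch: a counter whose series are keyed by dimensions extracted
# from each streamed record, so new (category, region) series appear
# dynamically as new dimension values arrive in the data.
from collections import defaultdict

class DimensionedCounter:
    def __init__(self, extract_dims):
        self.extract_dims = extract_dims   # record -> dimension tuple
        self.counts = defaultdict(int)

    def update(self, record):
        self.counts[self.extract_dims(record)] += 1

prices = DimensionedCounter(lambda r: (r["category"], r["region"]))
prices.update({"category": "toys", "region": "US"})
prices.update({"category": "toys", "region": "US"})
assert prices.counts[("toys", "US")] == 2
```

Wrapped around an operator, each processed element would call `update`, and a reporter would periodically ship the per-dimension counts to the monitoring or BI dashboard; aggregating the same series across parallel Flink tasks is the part the toolkit handles beyond this sketch.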
Apache Flink committers, Aljoscha Krettek, Till Rohrmann, Fabian Hueske, and Gordon Tai, will be available in this bonus session to answer questions about Apache Flink. Join this informal session to ask about anything that you didn’t get answered today or feel free to just listen and learn more.