Low-Synchronization, Mostly Lock-Free, Elastic Scheduling for Streaming Runtimes
We present the scalable, elastic operator scheduler in IBM Streams 4.2. Streams is a distributed stream processing system used in production at many companies in a wide range of industries. The programming language for Streams, SPL, presents operators, tuples and streams as the primary abstractions. A fundamental SPL optimization is operator fusion, where multiple operators execute together in the same process. Streams 4.2 automatically performs fusion at submission time, because we discovered that, in practice, customers did not have the expertise to do so themselves. However, this presented a new problem: potentially thousands of operators would execute together in the same process, with no user guidance for thread placement. We needed a way to automatically determine how many threads to use, for arbitrarily sized applications on a wide variety of hardware, and without any input from programmers. Our solution has two components. The first is a scalable operator scheduler that minimizes synchronization, locks and global data, while allowing threads to execute any operator and to dynamically come and go. The second is a set of elastic algorithms that dynamically adjust the number of threads to optimize performance, using the principles of trust and establishing trends. We demonstrate our scheduler’s ability to scale to over a hundred threads, and our elasticity algorithm’s ability to adapt to different workloads, on an Intel Xeon system with 176 logical cores and an IBM Power8 system with 184 logical cores.