BabbleFlow : a translator for analytic data flow programs
Document type: Conference lecture
Publisher: Association for Computing Machinery (ACM)
Rights access: Restricted access - publisher's policy
A complex analytic data flow may perform multiple, inter-dependent tasks, where each task uses a different processing engine. Such a multi-engine flow, termed a hybrid flow, may comprise subflows written in more than one programming language. However, as the number and variety of these engines grow, developing and maintaining hybrid flows at the physical level becomes increasingly challenging. To address this problem, we present BabbleFlow, a system that enables flow design at a logical level and automatic translation to physical flows. BabbleFlow translates a hybrid flow expressed in one set of languages into a semantically equivalent hybrid flow expressed in the same or a different set of languages. To this end, it composes the multiple physical flows of a hybrid flow into a single logical representation expressed in a unified flow language called xLM. This representation enables a number of graph transformations, such as (de-)composition and optimization. BabbleFlow then converts the (possibly transformed) xLM data flow graph into an executable form by expressing it in one or more target programming languages.
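The compose, transform, and decompose steps described in the abstract can be illustrated with a small sketch. This is only a toy model under stated assumptions: the `Op` and `Flow` classes, the `compose`, `retarget`, and `decompose` functions, and the engine names "sql" and "pig" are all hypothetical and are not taken from BabbleFlow or xLM. Parsing of source languages and code generation for target languages are omitted.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Op:
    """One logical operator (hypothetical stand-in, not the xLM format)."""
    name: str
    kind: str    # e.g. "scan", "filter", "aggregate"
    engine: str  # engine the operator is currently assigned to


@dataclass
class Flow:
    """A data flow graph: operators keyed by name, plus directed edges."""
    ops: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add(self, op, after=None):
        """Add an operator, optionally wiring it after an existing one."""
        self.ops[op.name] = op
        if after is not None:
            self.edges.append((after, op.name))
        return self


def compose(flows, connectors):
    """Merge per-engine flows into one logical graph.

    `connectors` holds the (source, target) edges that cross engines.
    """
    merged = Flow()
    for f in flows:
        merged.ops.update(f.ops)
        merged.edges.extend(f.edges)
    merged.edges.extend(connectors)
    return merged


def retarget(flow, from_engine, to_engine):
    """A trivial transformation: return a copy with operators reassigned."""
    out = Flow(edges=list(flow.edges))
    for name, op in flow.ops.items():
        engine = to_engine if op.engine == from_engine else op.engine
        out.ops[name] = replace(op, engine=engine)
    return out


def decompose(flow):
    """Split a logical graph into one subflow per engine.

    Returns the subflows and the edges that cross engine boundaries.
    """
    parts, crossing = {}, []
    for op in flow.ops.values():
        parts.setdefault(op.engine, Flow()).ops[op.name] = op
    for src, dst in flow.edges:
        e_src, e_dst = flow.ops[src].engine, flow.ops[dst].engine
        if e_src == e_dst:
            parts[e_src].edges.append((src, dst))
        else:
            crossing.append((src, dst))
    return parts, crossing


# Two physical subflows on different (hypothetical) engines.
sql = Flow().add(Op("scan_sales", "scan", "sql"))
sql.add(Op("filter_2014", "filter", "sql"), after="scan_sales")
pig = Flow().add(Op("agg_region", "aggregate", "pig"))

# Compose into one logical graph, then move the Pig part to SQL.
logical = compose([sql, pig], connectors=[("filter_2014", "agg_region")])
moved = retarget(logical, "pig", "sql")
parts, crossing = decompose(moved)
```

In this sketch, decomposing `logical` directly would yield two subflows joined by one cross-engine edge, while decomposing `moved` yields a single subflow with no crossing edges. A real translator would additionally need per-language parsers and code generators at either end of this pipeline.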
Citation: Jovanovic, P.; Simitsis, A.; Wilkinson, K. BabbleFlow : a translator for analytic data flow programs. In: ACM SIGMOD Conference. "Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data : Snowbird, UT, USA : June 22 - 27, 2014". Snowbird, UT: Association for Computing Machinery (ACM), 2014, p. 713-716.