Projects related to ParalleX
Evaluating Execution Models
In partnership with the Pacific Northwest National Laboratory, CREST analyzed methodologies for modeling and evaluating parallel execution models. This work encompassed the development of modeling methodologies and techniques to quantify the performance impact of execution models, the development of formal methods for modeling execution model semantics, the derivation of performance models for quantitative evaluation of execution models, the design and implementation of mini-apps translated to ParalleX, and a comparative analysis quantifying the performance impact of execution model primitive operations. This project was one of the key funded projects that led to the creation of the current ParalleX model.
XPS: FP: Language Support for the ParalleX Execution Model
The XPS program, funded by Indiana University’s Faculty Research Support Program, supported research leading to a new era of parallel computing. XPS sought research that re-evaluates the traditional computer hardware and software stack for today’s heterogeneous parallel and distributed systems and explores new holistic approaches to parallelism and scalability. This project was one of the key funded projects that led to the creation of the current ParalleX model.
Brodowicz, Maciej (PI), Sterling, Thomas (co-PI). BIGDATA: F: DKM: Collaborative Research: PXFS: ParalleX Based Transformative I/O System for Big Data
NSF #1447650; $300,000; 09/01/2014-08/31/2017. Summary: PXFS is a parallel file system based on the ParalleX execution model that will allow researchers to develop highly efficient data management, discovery, and analysis codes for Big Data applications in a wide range of fields that require both large-scale computation and efficient handling of large data sets. To achieve this, PXFS combines the C++ based HPX runtime system developed at LSU with Clemson University’s OrangeFS parallel file system. The performance and correctness of the solution are tested with multiple concurrent instances of a framework for quantifying gene expression levels that process large files (up to a few hundred GB) containing raw RNA sequencing data. The intellectual merit of this work is the development of a new approach to I/O suited to the ParalleX parallel execution model; together they would lead to software capable of efficiently executing big data applications such as genomics. It contributes new understanding, concepts, and methods for realizing advanced persistent storage management. It is inspired by two innovative system classes: an advanced parallel file system, OrangeFS, and a unique execution model, ParalleX. The result is an exploratory vehicle for establishing a new paradigm for mass storage at extreme scale. The project explores and integrates ephemeral execution data and persistent data into a single unified object domain. It provides an event-driven, dynamic, adaptive computation environment that responds to the uncertainty of data access times, the related asynchrony, and the imposed overheads by embedding futures-based synchronization. It eliminates the division of programming imposed by conventional file systems through the unification of name spaces and their management. Critical aspects of this new approach include the use of dynamic locality management, dynamic resource management, hierarchical name spaces, and an Active Global Address Space.
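As a rough illustration of how futures-based synchronization can absorb unpredictable data access times, the following minimal Python sketch issues several reads at once and processes each block as soon as it arrives. It is not PXFS, HPX, or OrangeFS code: `read_block`, `process`, and `run` are hypothetical names, and the random delay merely simulates variable storage latency.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import random
import time


def read_block(block_id):
    """Hypothetical stand-in for a storage read with unpredictable latency."""
    time.sleep(random.uniform(0.0, 0.02))  # simulated, variable access time
    return block_id, list(range(block_id * 4, block_id * 4 + 4))


def process(data):
    """Hypothetical per-block computation."""
    return sum(data)


def run(num_blocks=8):
    totals = {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Issue every read up front; each submit() returns a future immediately.
        futures = [pool.submit(read_block, b) for b in range(num_blocks)]
        # Consume results in completion order rather than submission order,
        # so one slow read does not stall work on blocks that are ready.
        for fut in as_completed(futures):
            block_id, data = fut.result()
            totals[block_id] = process(data)
    return totals
```

Calling `run(8)` returns one total per block regardless of the order in which the simulated reads complete; the caller never blocks on a specific read until its result is actually needed.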
The broader impacts of this research relate to three areas: education and training, dissemination of research results, and engagement of a diverse set of people. This research is conducted at three major universities and involves both undergraduate and graduate students (including Ph.D. track students). The students at the different locations are exposed to a collaborative environment in which they can work together effectively and learn from each other. They not only acquire programming and problem-solving skills that prepare them for future careers, but also gain an understanding of approaches to successful multi-disciplinary collaboration. The computational codes developed by the project are archived in a git repository with integrated tools such as bug tracking software, a wiki, and a continuous integration test system, in addition to code releases with accompanying documentation. In each of these areas we intend to make an effort to engage members of traditionally underserved groups with the goal of inspiring them to pursue a career in computer science or genomics.
Projects related to HPX-5 development and applications running on HPX-5
HOBBES: OS and Runtime Support for Application Composition
This project is sponsored by the Department of Energy and subcontracted from Sandia National Laboratories. Indiana University’s contributions centered on debugging and graph analytics. In debugging, a key component added to the runtime software will be diagnostic support that identifies a detected error and reports its correlation with the regime of the software fault (bug) that caused it, for future correction. IU’s graph analytics contribution will explore active messages. Specific features of the active message layer include message coalescing, active routing, message reduction, and termination detection. More information at http://xstack.sandia.gov/hobbes/
NSF Graphs: CSR: Small: High-Level Programming Languages and Environments for Scalable Graph Processing
This NSF-sponsored project studied an approach to high-level, domain-specific programming languages that would enable a single representation of a graph algorithm to run efficiently and scalably on a variety of parallel architectures. Existing languages in this area are limited in scope and expressiveness. Accordingly, this research incorporates the study of appropriate abstractions to create reusable algorithm implementations enabling scalable graph algorithms on multiple types of high-performance platforms.
Funded by the Defense Advanced Research Projects Agency (DARPA), CREST developed an experimental software tool called Stencil, based on a declarative language for building visualizations that use dynamic data. Stencil was used internally to explore the actual runtime behavior of AM++ applications. These explorations assisted further algorithm development and tuning. Stencil was also used by the Epidemics Cyberinfrastructure Tool to display the predictions of epidemiology models.
XPRESS: eXascale Programming Environment and System Software
Funded by the Department of Energy (DOE), XPRESS developed new software for DOE mission-critical applications. This project was led by Sandia National Laboratories and included CREST, along with Louisiana State University, University of Houston, University of Oregon, RENCI at UNC Chapel Hill, Oak Ridge National Laboratory, and Lawrence Berkeley National Laboratory.
This project created AM++, a user-level library for active messages based on the Active Pebbles programming model. AM++ allows message handlers to be run in an explicit loop that can be optimized and vectorized by the compiler and that can also be executed in parallel on multicore architectures. The library also provides runtime optimizations, such as message combining and filtering, so that this functionality does not need to be implemented at the application level. More information at https://dl.acm.org/citation.cfm?id=1854323 and http://ieeexplore.ieee.org/document/7851538/
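To make the idea of message combining concrete, here is a toy Python sketch, written under stated assumptions, of a layer that buffers small messages per destination and hands each full buffer to a handler that loops over the whole batch. It does not use or reproduce the AM++ API (AM++ is a C++ library); `CoalescingSender`, `make_counting_handler`, and the buffer size are hypothetical, and the handler call stands in for a network transfer.

```python
from collections import defaultdict


class CoalescingSender:
    """Toy coalescing layer (hypothetical; not the AM++ API)."""

    def __init__(self, handler, buffer_size=4):
        self.handler = handler            # called once per delivered batch
        self.buffer_size = buffer_size    # messages per destination before a flush
        self.buffers = defaultdict(list)  # destination rank -> pending messages
        self.batches_sent = 0

    def send(self, dest, message):
        buf = self.buffers[dest]
        buf.append(message)
        if len(buf) >= self.buffer_size:
            self.flush(dest)

    def flush(self, dest):
        batch = self.buffers.pop(dest, [])
        if batch:
            self.batches_sent += 1
            self.handler(dest, batch)     # stands in for one network transfer

    def flush_all(self):
        for dest in list(self.buffers):
            self.flush(dest)


def make_counting_handler(counts):
    def handler(dest, batch):
        # The handler body runs in an explicit loop over the whole batch.
        for vertex in batch:
            counts[vertex] = counts.get(vertex, 0) + 1
    return handler


counts = {}
sender = CoalescingSender(make_counting_handler(counts), buffer_size=4)
for i in range(10):
    sender.send(dest=i % 2, message=i % 3)
sender.flush_all()
```

In this run, ten individual sends to two destinations are delivered as four batches, which is the effect coalescing aims for: fewer, larger transfers, with the application code still written as if it sent one small message at a time.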
Bokeh: A Declarative, Scalable Framework for Extensive Visualization
A new visualization system, named Bokeh, was created for interactive visual exploration of large, multidimensional datasets. Domain expertise and intuition are critical to effective exploration of large data. Thus, we chose a novel architectural approach that addresses scalability, interactivity, and extensibility as its core challenges, while maintaining a simple conceptual model for the non-programmer end user. This project was sponsored by DARPA. More information at https://bokeh.pydata.org/en/latest/