The quest for greater computational power is never-ending, and exploiting parallelism is central to this quest. We are helping to design systems that will enable programmers to implement their applications not only on supercomputers and high-performance workstations, but also on multiple architectures connected by networks. The development of algorithms, models, languages, compilers, and tools for these systems is among the most compelling and difficult problems in computing.
Our faculty are Senior Fellows at the San Diego Supercomputer Center (SDSC) and also partners in NPACI, the NSF-funded National Partnership for Advanced Computational Infrastructure, led by SDSC. These connections give faculty access to a wide variety of machines at SDSC as well as to computational scientists and their challenging computing problems.
Scientific applications are becoming increasingly complex, in part because they rely more heavily on elaborate, dynamic numerical representations to capture irregular physical phenomena. Such applications are a challenge to implement on parallel computers, since compilers presently cannot manage the requisite data decomposition and data motion at run time.
The Scientific Computation Group (SCG) investigates run-time software techniques aimed at reducing the development time of high-performance scientific applications. The group focuses on applications with special needs, including adaptive computations that employ elaborate, dynamic representations to enhance accuracy, improve running time, or both. The SCG has developed the KeLP software infrastructure to meet the needs of such applications. KeLP permits the programmer to exploit specific properties of an application in order to optimize data decompositions and data motion at run time. KeLP has been applied to a variety of scientific applications ranging from computational materials science to computational fluid dynamics.
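As a rough illustration of what run-time data decomposition involves, the sketch below partitions an index space into blocks and computes the ghost regions each block must fetch from its neighbors. The function names and the one-dimensional setting are hypothetical simplifications; KeLP itself is a C++ library with a richer interface.

```python
# Run-time block decomposition of an index space, in the spirit of KeLP.
# Hypothetical Python sketch in one dimension -- not KeLP's actual API.

def decompose(n, nprocs):
    """Split the index range [0, n) into nprocs contiguous blocks,
    sized at run time from the actual problem and machine size."""
    base, extra = divmod(n, nprocs)
    blocks, start = [], 0
    for p in range(nprocs):
        size = base + (1 if p < extra else 0)
        blocks.append((start, start + size))   # half-open range owned by proc p
        start += size
    return blocks

def ghost_cells(blocks, width=1):
    """For each block, the index ranges it must fetch from its left and
    right neighbors -- the data motion the run-time system must perform."""
    out = []
    for i, (lo, hi) in enumerate(blocks):
        left = (lo - width, lo) if i > 0 else None
        right = (hi, hi + width) if i + 1 < len(blocks) else None
        out.append((left, right))
    return out
```

A stencil solver, for instance, would exchange these ghost regions between neighboring blocks before each iteration.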
More recently, the SCG has been investigating multi-tier parallelism, in which multiple levels of parallelism exist within a single application. Multi-tier parallelism is well suited to multicomputers built from multiple symmetric multiprocessor (SMP) nodes.
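The structure of a multi-tier program can be sketched as an outer tier that partitions work across nodes and an inner tier that partitions each node's share across its processors. In practice the outer tier would be message passing (e.g., MPI) across SMP nodes and the inner tier shared-memory threads; the sketch below simulates both tiers with plain thread pools purely to show the nesting.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative two-tier structure only. In a real multi-tier program the
# outer tier would be message passing across SMP nodes and the inner tier
# shared-memory threads on each node.

def node_task(chunk, threads_per_node=2):
    # Inner tier: split this node's share across its own threads.
    pieces = [chunk[t::threads_per_node] for t in range(threads_per_node)]
    with ThreadPoolExecutor(threads_per_node) as inner:
        return sum(inner.map(sum, pieces))

def multi_tier_sum(data, nodes=2):
    # Outer tier: split the data across nodes.
    chunks = [data[p::nodes] for p in range(nodes)]
    with ThreadPoolExecutor(nodes) as outer:
        return sum(outer.map(node_task, chunks))
```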
The programming abstractions explored by the SCG are vital not only in eliminating non-essential detail (e.g., load balancing and locality management), but also in protecting application software against inevitable changes in hardware.
Fast networks have made it possible to use distributed CPUs, memory, and data resources as an ensemble to support the increasing resource requirements of high-performance applications. Such networked resource ensembles form a "metacomputer" or "computational grid," which aggregates resources to improve the execution performance of high-performance and resource-intensive applications. The Grid Computing Laboratory (GCL) focuses on the development of applications that can tap the performance potential of metacomputing platforms. Applications must use adaptive techniques to leverage the deliverable performance of resources in this dynamic, heterogeneous environment.
Currently, the GCL is focusing on the development of an adaptive scheduling methodology for metacomputing applications. Recent work focuses on the development of adaptive agent-based application schedulers (called "AppLeS" for Application-Level Scheduler). Each AppLeS agent determines a customized application schedule for its application, and executes the scheduled application on the underlying metacomputing system. This research involves the development of system software for scheduling, performance monitoring and prediction, and application execution, as well as the development of models for predicting application behavior in a dynamic environment. Research group members participate in active collaborations with the developers of metacomputing infrastructure environments, computational scientists, supercomputer center staff, and others interested in high-performance metacomputing.
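As a simplified illustration of the idea (not the published AppLeS algorithms), a customized schedule can be built by predicting each resource's delivered compute rate, dividing work in proportion to those rates so all resources finish at about the same time, and choosing the candidate resource set with the smallest predicted completion time. The function names and rate model below are hypothetical.

```python
# Hypothetical sketch of application-level scheduling -- not the actual
# AppLeS algorithms: allocate work in proportion to predicted rates,
# then pick the resource set with the smallest predicted makespan.

def work_allocation(total_work, predicted_rates):
    """Give each resource a share proportional to its predicted compute
    rate, so all resources are expected to finish together."""
    total_rate = sum(predicted_rates.values())
    return {r: total_work * rate / total_rate
            for r, rate in predicted_rates.items()}

def pick_schedule(total_work, candidate_rate_sets):
    """Choose the candidate resource set minimizing predicted makespan."""
    def makespan(rates):
        alloc = work_allocation(total_work, rates)
        return max(alloc[r] / rates[r] for r in rates)
    return min(candidate_rate_sets, key=makespan)
```

Because the predicted rates would come from dynamic performance monitoring, the chosen schedule adapts to the current load on the metacomputing system.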
When a particular kernel computation will run for weeks or years, it makes economic sense to be obsessed with its performance. Performance programming is the design, writing, and tuning of programs to sustain near-peak performance. The objective is to fully utilize a computer's potential; this requires a different set of programming paradigms and tools, and even a new model of computation. Techniques such as parallelization, localization, load balancing, and overlapping communication with computation can be applied at many levels, from optimizing the inner loops to choreographing data movement between processors.
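Overlapping communication with computation, for example, often takes the form of double buffering: while block k is being computed, the transfer of block k+1 is already in flight. The sketch below simulates this with a background thread standing in for a nonblocking receive; the function names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def process_blocks(fetch, compute, n_blocks):
    """Double buffering: overlap fetching block k+1 with computing block k.
    `fetch` stands in for a communication operation such as a
    nonblocking receive (illustrative sketch)."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as comm:
        in_flight = comm.submit(fetch, 0)              # start the first transfer
        for k in range(n_blocks):
            block = in_flight.result()                 # wait for block k to arrive
            if k + 1 < n_blocks:
                in_flight = comm.submit(fetch, k + 1)  # prefetch the next block
            results.append(compute(block))             # compute while it transfers
    return results
```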
The Performance Programming Group is pursuing the goal of changing performance programming from an art into a science. By hand-tuning specific scientific codes, we gain insight into which methods are effective. We study which tools can make performance programming easier and to what extent the techniques can be automated. Architectural support is another concern. The most elusive goal is finding methods for developing programs that are truly portable and truly high-performance.
Compiler technology provides the necessary interface between programming languages and architectures, and is intimately tied to new developments in both. The group's research focuses on the development of compiler technology, with particular interest in exploiting multiple levels of parallelism and optimizing data movement to achieve high performance.
Trends in both high-level languages and machines have widened the gap between them, making the compiler's job of bridging this gap increasingly challenging. For example, a major trend in machine architecture is toward hierarchies of both memory and parallelism.
We developed Hierarchical Tiling to exploit such parallelism and locality simultaneously. Hierarchical Tiling extends traditional tiling to include optimization for storage and data movement. It has been used on perfectly nested loops with regular dependence structure to reduce cache misses, enhance interprocessor parallelism, manage secondary storage, and improve the use of superscalar, pipelined, and vector processors. Our current research extends Hierarchical Tiling to obtain its advantages on a larger class of programs, including unstructured applications with less regular memory accesses. We are also examining the compiler implications of the multithreading and predicated-execution capabilities of emerging architectures.
Scientific computing also includes the development of parallel algorithms and software for use in computational molecular biology. Problems such as molecular structure prediction and protein-ligand docking require new methods and high-performance computing. Such methods include a new global optimization algorithm that has been implemented on the SDSC CRAY T3E to compute global minimum-energy conformations of small protein molecules.
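The group's algorithm itself is not described here, but the general shape of global energy minimization can be sketched as repeated local refinement from many starting points, keeping the lowest-energy result. The one-dimensional "energy" function and the restart strategy below are illustrative stand-ins, not the published method.

```python
import random

# Illustrative global-minimization sketch -- not the group's algorithm.
# A toy one-dimensional "energy" landscape stands in for a molecular
# conformation energy; many random starting points are refined locally
# and the lowest-energy result is kept.

def local_descent(f, x, step=0.1, iters=200):
    """Simple local refinement: move downhill; halve the step when stuck."""
    for _ in range(iters):
        for cand in (x - step, x + step):
            if f(cand) < f(x):
                x = cand
                break
        else:
            step *= 0.5
    return x

def global_minimize(f, lo, hi, restarts=20, seed=0):
    """Random restarts help escape local minima; keep the best refined point."""
    rng = random.Random(seed)
    return min((local_descent(f, rng.uniform(lo, hi)) for _ in range(restarts)),
               key=f)
```

For a double-well energy such as f(x) = (x^2 - 1)^2 + 0.3x, the restarts let the search find the lower of the two wells rather than whichever one the first guess happens to fall into.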