High-Performance Scheduling
a chapter by Fran Berman
from
"The Grid -- Blueprint for a New Computing Infrastructure",
edited by Ian Foster and Carl Kesselman.
This chapter is part of a text covering the basic issues involved in developing applications, middleware, and infrastructure for a metacomputing platform ("computational grid"). The metaphor of the "computational grid" is based on the electrical power grid, which provides ubiquitous electrical power. In the same way, the metacomputer has the potential to develop into a computational grid that provides ubiquitous computing power.
The chapter deals with the problem of achieving application performance via scheduling on such a platform. Application scheduling on a metacomputer is compared with application scheduling on MPPs. Current work in the area of metacomputer application scheduling is reviewed. Finally, trends and challenges in metacomputer application scheduling, as well as system support for such "high-performance" schedulers, are discussed.
In this paper we examine the performance sensitivity of a master/slave parallel ray-tracing application when running on non-dedicated clusters of workstations. Our experiments show that two different scheduling strategies, one static and one dynamic, exhibit different performance sensitivities to variabilities in resource capabilities and workload distributions. We demonstrate for our example application that neither scheduling strategy by itself consistently induces the best application performance (minimal execution time) when running on the same resources under normally experienced production operating conditions. These results support the idea that dynamic selection of appropriate scheduling strategies to match run-time conditions could be a promising approach to achieving better application performance for master/slave applications on heterogeneous platforms.
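For illustration only, a minimal sketch of the two strategy families compared in this paper, under assumptions of my own: a static strategy that partitions the tiles among slaves up front in proportion to assumed relative speeds, and a dynamic self-scheduling strategy in which slaves pull one tile at a time from a shared queue. The names and the threading model are hypothetical, not the paper's implementation.

    # Sketch: static vs. dynamic assignment of ray-tracing tiles to slaves.
    # Hypothetical illustration only -- not the implementation evaluated in the paper.

    from queue import Queue, Empty
    from threading import Thread

    def static_assignment(tiles, relative_speeds):
        """Partition the tile list among slaves up front, in proportion to assumed speeds."""
        total = sum(relative_speeds)
        assignments, start = [], 0
        for speed in relative_speeds:
            count = int(round(len(tiles) * speed / total))
            assignments.append(list(tiles[start:start + count]))
            start += count
        if start < len(tiles):
            assignments[-1].extend(tiles[start:])   # leftover tiles go to the last slave
        return assignments

    def dynamic_assignment(tiles, num_slaves, render_tile):
        """Self-scheduling: each slave repeatedly pulls the next tile from a shared queue."""
        work = Queue()
        for t in tiles:
            work.put(t)

        def slave():
            while True:
                try:
                    tile = work.get_nowait()
                except Empty:
                    return                          # queue exhausted: this slave is done
                render_tile(tile)

        threads = [Thread(target=slave) for _ in range(num_slaves)]
        for th in threads:
            th.start()
        for th in threads:
            th.join()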
This paper investigates the efficacy of Application-Level Scheduling (AppLeS) for a parallel gene sequence library comparison application in production metacomputing settings. We compare an AppLeS-enhanced version of the application to an original implementation designed and tuned to use the native scheduling mechanisms of Mentat -- a metacomputing software infrastructure. The experimental data shows that the AppLeS versions outperform the best Mentat versions over a range of problem sizes and computational settings.
The structure of the AppLeS we have defined for this application does not depend on the scheduling algorithms that it uses. Instead, the AppLeS scheduler considers the uncertainty associated with the information it uses in its scheduling decisions to choose between the static placement of computation, and the dynamic assignment of computation during execution. We propose that this framework is general enough to represent the class of metacomputing applications that are organized as a master and set of parallel slaves, in which the master distributes uncomputed work.
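A minimal sketch of this kind of uncertainty-driven choice, under the assumption (mine, not necessarily the paper's) that forecast variability is summarized by a coefficient of variation and compared against a tunable threshold:

    # Sketch: choose a work-distribution strategy from the uncertainty of the
    # performance information available at scheduling time (hypothetical heuristic).

    import statistics

    def choose_strategy(cpu_forecasts, cv_threshold=0.25):
        """cpu_forecasts: {host: recent availability samples in [0, 1]}.

        If every host's availability forecast is stable (low coefficient of
        variation), static placement based on the mean forecasts is acceptable;
        otherwise fall back to dynamic assignment of uncomputed work at run time.
        """
        for host, samples in cpu_forecasts.items():
            mean = statistics.mean(samples)
            if mean == 0:
                return "dynamic"
            cv = statistics.pstdev(samples) / mean
            if cv > cv_threshold:
                return "dynamic"
        return "static"

    # Example: one noisy workstation pushes the decision toward dynamic assignment.
    forecasts = {"hostA": [0.9, 0.88, 0.91], "hostB": [0.7, 0.2, 0.95]}
    print(choose_strategy(forecasts))   # -> "dynamic"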
This paper describes the state-of-the-art of the AppLeS Project as of Summer, 1997. Application-level scheduling, the AppLeS architecture, ongoing work in developing application AppLeS, and AppLeS coordinated subprojects are described. References are given to related work both within and outside the AppLeS group.
We focus on the problems of scheduling applications on metacomputing systems. We introduce the concept of application-centric scheduling, in which everything about the system is evaluated in terms of its impact on the application. Application-centric scheduling is used by virtually all metacomputer programmers to achieve performance on metacomputing systems. We describe two successful metacomputing applications to illustrate this approach, and describe AppLeS scheduling agents, which generalize the application-centric scheduling approach. Finally, we show preliminary results which compare AppLeS-derived schedules with conventional strip and blocked schedules for a two-dimensional Jacobi code.
In this paper, we define a set of principles underlying application-level scheduling and describe our work-in-progress building AppLeS (application-level scheduling) agents. We illustrate the application-level scheduling approach with a detailed description and results for a distributed 2D Jacobi application on 2 heterogeneous platforms.
While running on parallel distributed resources, schedulers may find it advantageous to redistribute elements of a computation in response to changing conditions. In this paper, we focus on the development of dynamically parametrizable models to determine the cost (in terms of execution delay) of performing redistribution. We illustrate our approach by examining in detail the modeling of redistribution costs for a 2D Jacobi application running in a cluster of workstations environment.
Computational Grids, composed of distributed, high-performance computing resources, are quickly becoming the platform-of-choice for performance-challenged applications. In this paper, we describe the AppLeS (Application-Level Scheduling) approach to achieving application performance via scheduling on dynamic multiple-user Computational Grids and clusters. We illustrate the AppLeS methodology by describing an AppLeS developed for JPL's Synthetic Aperture Radar Atlas (SARA) -- a visualization tool for satellite radar images. We demonstrate the effectiveness of application scheduling by providing a performance-efficient strategy for retrieving SARA data files in everyday, multiple-user Grid environments.
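As a hedged illustration of the flavor of such a retrieval strategy (the actual AppLeS logic is more involved), the sketch below picks, for a given file, the replica site whose bandwidth and latency forecasts give the smallest predicted transfer time; the site names, numbers, and cost model are made up.

    # Sketch: pick the replica site with the smallest predicted transfer time.
    # Site names, forecasts, and the simple cost model are illustrative only.

    def predicted_transfer_time(file_bytes, bandwidth_bytes_per_s, latency_s):
        return latency_s + file_bytes / bandwidth_bytes_per_s

    def best_site(file_bytes, forecasts):
        """forecasts: {site: (bandwidth in bytes/s, latency in s)} from a monitor such as NWS."""
        return min(forecasts,
                   key=lambda s: predicted_transfer_time(file_bytes, *forecasts[s]))

    forecasts = {
        "siteA": (2.0e6, 0.020),   # 2 MB/s, 20 ms
        "siteB": (0.8e6, 0.015),
        "siteC": (1.2e6, 0.060),
    }
    print(best_site(50e6, forecasts))  # 50 MB file -> "siteA"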
In this paper, we show that distributed parallel application performance in clusters of workstations benefits from knowledge about network topology in cases where LAN resources can become potential performance bottlenecks. These performance bottlenecks often appear in common networking technologies that employ highly shared resources, such as ethernets and token-rings. A method to discover network configuration as it relates to application performance, called Effective Network Views (ENV), is presented, and experimental results with an example application are given to show the value of including ENV information in the scheduling process when good performance is required.
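A rough sketch of the intuition behind effective network views, under an assumption of my own that bandwidth probes to a reference host are used to group hosts that appear to sit behind the same shared segment; the real ENV method is more sophisticated.

    # Sketch: group hosts into "effective" segments from bandwidth probes to a
    # reference (scheduling) host. A crude stand-in for ENV, for illustration only.

    def group_by_effective_bandwidth(bw_to_master, tolerance=0.25):
        """bw_to_master: {host: measured bandwidth to the scheduling host, Mb/s}.
        Hosts are sorted by bandwidth and merged with the previous host's group when
        the relative difference is within `tolerance`; similar effective bandwidth
        suggests a shared link such as the same ethernet segment."""
        groups = []
        for host, bw in sorted(bw_to_master.items(), key=lambda kv: kv[1]):
            if groups and abs(bw - groups[-1][-1][1]) / groups[-1][-1][1] <= tolerance:
                groups[-1].append((host, bw))
            else:
                groups.append([(host, bw)])
        return [[h for h, _ in g] for g in groups]

    probes = {"ws1": 85.0, "ws2": 80.0, "ws3": 9.5, "ws4": 8.7}
    print(group_by_effective_bandwidth(probes))
    # -> [['ws4', 'ws3'], ['ws2', 'ws1']]  (a shared 10 Mb segment vs. a switched one)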
One of the compelling reasons for developing the Information Power Grid (IPG) is to provide a platform for more rapid development and execution of simulations and other resource-intensive applications. However, the IPG will ultimately not be successful unless users and application developers can achieve execution performance for their codes. In this paper, we describe a performance-efficient approach to scheduling applications in dynamic multiple-user distributed environments such as the IPG. This approach provides the basis for application scheduling agents called AppLeS. We describe the AppLeS methodology and discuss the lessons learned from the development of AppLeS for a variety of distributed applications. In addition, we describe an AppLeS-in-progress currently being developed for NASA's INS2D code, a distributed "parameter sweep" application.
The computational grid is becoming the platform of choice for large-scale distributed data-intensive applications. Accurately predicting the transfer times of remote data files, a fundamental component of such applications, is critical to achieving application performance. In this paper, we introduce a performance prediction method, AdRM (Adaptive Regression Modeling), to determine file transfer times for network-bound distributed data-intensive applications. We demonstrate the effectiveness of the AdRM method on two distributed data applications, SARA (Synthetic Aperture Radar Atlas) and SRB (Storage Resource Broker), and discuss how it can be used for application scheduling. Our experiments use the Network Weather Service, a resource performance measurement and forecasting facility, as a basis for the performance prediction model. Our initial findings indicate that the AdRM method can be effective in accurately predicting data transfer times in wide-area multi-user grid environments.
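The following sketch conveys the flavor of regression-based transfer-time prediction, not the AdRM formulation itself: fit a linear model relating observed transfer times to file size divided by the bandwidth forecast, then predict new transfers from the current forecast. All numbers are illustrative.

    # Sketch: regression-based prediction of file transfer time from NWS-style
    # bandwidth forecasts. Illustrative only; not the AdRM model from the paper.

    import numpy as np

    def fit_transfer_model(file_sizes_mb, bw_forecasts_mbps, observed_times_s):
        """Least-squares fit of  time ~ a + b * (size / forecast_bandwidth)."""
        x = np.asarray(file_sizes_mb) / np.asarray(bw_forecasts_mbps)
        A = np.column_stack([np.ones_like(x), x])
        (a, b), *_ = np.linalg.lstsq(A, np.asarray(observed_times_s), rcond=None)
        return a, b

    def predict_transfer_time(a, b, file_size_mb, bw_forecast_mbps):
        return a + b * (file_size_mb / bw_forecast_mbps)

    # Past transfers (sizes in MB, forecasts in MB/s, observed times in s) -- made up.
    a, b = fit_transfer_model([10, 40, 80], [2.0, 2.5, 1.6], [6.1, 17.8, 51.5])
    print(predict_transfer_time(a, b, 25, 2.2))   # predicted seconds for a new transfer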
Computational Grids are becoming an increasingly important and powerful platform for the execution of large-scale, resource-intensive applications. However, it remains a challenge for applications to tap the potential of Grid resources in order to achieve performance. In this paper, we illustrate how applications can leverage Grids to achieve performance through coallocation. We describe our experiences developing a scheduling strategy for a real-life parallel tomography application targeted to Grids which contain both workstations and parallel supercomputers. Our strategy uses dynamic information exported by a supercomputer's batch scheduler to simultaneously schedule on workstations and immediately available supercomputer nodes. This strategy is of great practical interest because it combines resources available to the typical research lab: time-shared workstations and CPU time in remote space-shared supercomputers. We show that this strategy improves the performance of the parallel tomography application compared to traditional scheduling strategies, which target the application to either type of resource alone.
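A very small sketch of the coallocation idea (not the paper's scheduler): query how many supercomputer nodes are immediately available, then divide the tomography slices between those nodes and the lab's workstations in proportion to their predicted per-slice rates. The node counts and rates are invented.

    # Sketch: co-allocate work across immediately available supercomputer nodes and
    # time-shared workstations, in proportion to predicted per-slice processing rates.
    # Hypothetical numbers; not the scheduling strategy described in the paper.

    def coallocate(num_slices, ws_rates, free_nodes, node_rate):
        """ws_rates: predicted slices/s for each workstation (dynamic, load-dependent).
        free_nodes: nodes the batch scheduler reports as immediately available.
        node_rate: predicted slices/s of one dedicated supercomputer node."""
        rates = list(ws_rates) + [node_rate] * free_nodes
        total = sum(rates)
        shares = [int(num_slices * r / total) for r in rates]
        shares[rates.index(max(rates))] += num_slices - sum(shares)  # assign remainder
        return shares[:len(ws_rates)], shares[len(ws_rates):]

    ws_share, node_share = coallocate(240, ws_rates=[0.4, 0.7], free_nodes=4, node_rate=1.5)
    print(ws_share, node_share)   # slices per workstation, slices per supercomputer node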
Also visit main AppLeS page for more info.
The goal of the Network Weather Service is to provide accurate forecasts of dynamically changing performance characteristics from a distributed set of metacomputing resources. Providing a ubiquitous service that can both track dynamic performance changes and remain stable in spite of them requires adaptive programming techniques, an architectural design that supports extensibility, and internal abstractions that can be implemented efficiently and portably. In this paper, we describe the current implementation of the NWS for Unix and TCP/IP sockets and provide examples of its performance monitoring and forecasting capabilities.
The Network Weather Service is a generalizable and extensible facility designed to provide dynamic resource performance forecasts in metacomputing environments. In this paper, we outline its design and detail the predictive performance of the forecasts it generates. While the forecasting methods are general, we focus on their ability to predict the TCP/IP end-to-end throughput and latency that is attainable by an application using systems located at different sites. Such network forecasts are needed both to support scheduling, and by the metacomputing software infrastructure to develop quality-of-service guarantees.
In this paper, we focus on the problem of making short and medium term forecasts of CPU availability on time-shared Unix systems. We evaluate the accuracy with which availability can be measured using Unix load average, the Unix utility vmstat, and the Network Weather Service CPU sensor that uses both. We also examine the autocorrelation between successive CPU measurements to determine their degree of self-similarity. While our observations show a long-range autocorrelation dependence, we demonstrate how this dependence manifests itself in the short and medium term predictability of the CPU resources in our study.
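As a hedged aside, the autocorrelation referred to above can be computed from a measurement trace as in the sketch below; the lag-k autocorrelation of CPU-availability samples (a made-up series here) indicates how much one measurement reveals about measurements k steps later.

    # Sketch: lag-k autocorrelation of a CPU-availability trace (illustrative data).

    import statistics

    def autocorrelation(samples, lag):
        n = len(samples)
        mean = statistics.mean(samples)
        var = sum((s - mean) ** 2 for s in samples)
        cov = sum((samples[i] - mean) * (samples[i + lag] - mean) for i in range(n - lag))
        return cov / var

    availability = [0.92, 0.90, 0.88, 0.55, 0.52, 0.50, 0.85, 0.87, 0.86, 0.84]
    for lag in (1, 2, 5):
        print(lag, round(autocorrelation(availability, lag), 3))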
We describe the architecture of the Network Weather Service and implementations that we have developed and are currently deploying for the Legion and Globus/Nexus metacomputing infrastructures. We also detail NWS forecasts of resource performance using both the Legion and Globus/Nexus implementations. Our results show that simple forecasting techniques substantially outperform measurements of current conditions (commonly used to gauge resource availability and load) in terms of prediction accuracy.
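For intuition, a toy sketch (not the NWS forecaster itself): the last measurement is only one of several candidate predictors, and even a running mean or median often tracks a noisy series better. The code compares predictors by mean absolute error over a measurement history.

    # Toy sketch: compare a "current conditions" predictor (last value) against
    # simple forecasters (running mean, running median) on a measurement series.
    # Not the NWS forecaster; for intuition only.

    import statistics

    def evaluate(history, predictor, window=10):
        errors = []
        for i in range(1, len(history)):
            recent = history[max(0, i - window):i]
            errors.append(abs(predictor(recent) - history[i]))
        return statistics.mean(errors)

    last_value     = lambda recent: recent[-1]
    running_mean   = lambda recent: statistics.mean(recent)
    running_median = lambda recent: statistics.median(recent)

    bandwidth = [5.2, 5.0, 1.1, 5.3, 5.1, 4.9, 0.9, 5.2, 5.0, 5.1]   # Mb/s, made up
    for name, p in [("last", last_value), ("mean", running_mean), ("median", running_median)]:
        print(name, round(evaluate(bandwidth, p), 2))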
Also visit NWS website for more info.
Accurate performance predictions are difficult to achieve for parallel applications executing on production distributed systems. Conventional point-valued performance parameters and prediction models are often inaccurate since they can only represent one point in a range of possible behaviors. We address this problem by allowing characteristic application and system data to be represented by a set of possible values and their probabilities, which we call stochastic values. In this paper, we give a practical methodology for using stochastic values as parameters to adaptable performance prediction models. We demonstrate their usefulness for a distributed SOR application, showing stochastic values to be more effective than single (point) values in predicting the range of application behavior that can occur during execution in production environments.
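A minimal sketch of the stochastic-value idea under an assumption of my own: a stochastic parameter is carried as (value, probability) pairs and propagated through a simple prediction model by sampling, yielding a distribution of predicted times rather than a single number. The representation and model are illustrative, not the paper's exact methodology.

    # Sketch: propagate stochastic (value, probability) parameters through a simple
    # execution-time model by sampling.

    import random

    def sample(stochastic_value):
        values, probs = zip(*stochastic_value)
        return random.choices(values, weights=probs)[0]

    def predict_iteration_time(cpu_avail, bandwidth_mbps, work_units=100, msg_mb=2.0):
        return work_units / cpu_avail + msg_mb / bandwidth_mbps

    # Stochastic parameters: possible values with their probabilities (made up).
    cpu = [(0.9, 0.6), (0.5, 0.3), (0.2, 0.1)]
    bw  = [(8.0, 0.7), (2.0, 0.3)]

    predictions = sorted(predict_iteration_time(sample(cpu), sample(bw)) for _ in range(1000))
    print("median:", predictions[500], "  90th percentile:", predictions[900])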
Accurate performance predictions are difficult to achieve for applications on production distributed parallel systems. We address this problem by allowing characteristic application and system data to be represented as an interval of possible values. We use intervals as parameters to a structural modeling technique, and predict the range of application behavior that will occur. We test this approach on a production system using a distributed SOR code, and demonstrate the resulting additional information and accuracy resulting from the interval values.
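A tiny sketch of interval-valued prediction under the simple model time = compute + communicate, which is my simplification rather than the structural model used in the paper: propagating [low, high] intervals for the parameters yields an interval of possible execution times.

    # Sketch: interval-valued performance prediction for time = comp + comm.
    # Intervals are (low, high) pairs; the operations below are standard interval
    # arithmetic for positive quantities. All numbers are illustrative.

    def div(a, b):
        return (a[0] / b[1], a[1] / b[0])

    def add(a, b):
        return (a[0] + b[0], a[1] + b[1])

    work_flops = (4.0e9, 4.0e9)        # fixed amount of work per iteration
    cpu_rate   = (0.8e9, 2.0e9)        # available flop/s varies with load
    msg_bytes  = (8.0e6, 8.0e6)
    bandwidth  = (0.5e6, 1.2e6)        # bytes/s

    predicted = add(div(work_flops, cpu_rate), div(msg_bytes, bandwidth))
    print("predicted time in [%.1f, %.1f] seconds" % predicted)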
We present a structural performance model that uses application profiles and component models to predict an application's performance on a set of distributed resources. We decompose application performance in accordance with the structure of the application: that is, into interacting component models that correspond to component tasks. Then, using the application profile and available information as guides, we select models for each component appropriately. As a proof of concept, we have implemented this approach for two distributed applications, a master-slave genetic algorithm code and a red-black stencil successive over-relaxation code. Our predictions are within 10% of actual time.
Also visit Structural Modeling and Performance Prediction Engineering pages for more info.
In this paper, we present a model that provides an estimate of the slowdown imposed by competing load on applications targeted to high-performance clusters and networks of workstations. The model provides a basis for predicting realistic communication and computation costs and is shown to achieve good accuracy for a set of scientific benchmarks commonly found in high-performance applications.
Data-parallel applications executing in clustered environments share resources with other applications. Since system load can vary dramatically, it is critical to provide an accurate model of the effects of contention on application performance in order to provide realistic assessments of application behavior. In this paper, we present a model to predict contention effects in clustered environments. This model provides a basis for predicting realistic execution times for applications on clusters of workstations and is parameterized by the data allocation policy employed by the targeted application, the local slowdown present in each node of the cluster, and the relative weight associated with each node in the cluster.
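A rough sketch of the general shape of such a model (my own simplification, not the model developed in the paper): predicted time on each node scales the node's share of the work by its local slowdown and dedicated speed, and a data-parallel step ends when the slowest node finishes.

    # Sketch: predict execution time on a shared cluster from per-node data allocation,
    # local slowdown, and relative node weight (dedicated speed). Illustrative only.

    def predict_time(allocation, slowdown, weight, total_work=1000.0):
        """allocation: fraction of the data assigned to each node (sums to 1).
        slowdown: per-node multiplicative slowdown due to competing load (>= 1).
        weight: relative dedicated speed of each node (work units per second)."""
        per_node = [total_work * a * s / w for a, s, w in zip(allocation, slowdown, weight)]
        return max(per_node)             # a data-parallel step ends with the slowest node

    print(predict_time(allocation=[0.5, 0.3, 0.2],
                       slowdown=[1.0, 2.5, 1.2],
                       weight=[10.0, 8.0, 5.0]))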
We present a model for predicting the effects of contention on application behavior in a two-machine heterogeneous system. Our model provides a characterization of contention in a heterogeneous system, and predicts the effect of contention on application communication and computation costs. We describe the model for the Sun/CM2 and Sun/Paragon coupled heterogeneous systems. We also present experiments which show the predicted communication and computation costs to be within 15% on average of actual costs. These experiments were performed on production systems with emulated contention.
In this paper, we propose a mapping strategy for applications formed by multiple tasks targeted to heterogeneous platforms. We first define a mapping model, the match-tree, which reflects the data movement and conversion costs of distributed algorithms and allows for alternative implementations of individual tasks on different machines. We then define the find-mapping and split-partition algorithms, based on the match-tree model, to determine the best allocation of tasks to resources in heterogeneous systems. We illustrate the use of these algorithms with a sample distributed application.
We describe Zoom, a hierarchical representation in which heterogeneous applications can be described. The goal of Zoom is to provide an abstraction that computer and computational scientists can use to describe heterogeneous applications, and to provide a foundation from which program development tools for heterogeneous network computing can be built. Three levels (structure, implementation and data) of the Zoom hierarchy are described and are used to illustrate two heterogeneous applications. Extensions to Zoom to include additional resource parameters required by program development tools are also discussed.
We couple the Zoom representation designed to facilitate development of heterogeneous applications, and the HeNCE graphical language and tool, designed as a representation for and an executional model of heterogeneous programs targeted to PVM. The combination of Zoom and HeNCE provides a hierarchical representation which exposes performance issues and a means of automatically translating that representation into code executable on heterogeneous networks of computers.
Program speedup is an important measure of the performance of an algorithm on a parallel machine. Of particular importance is the near linear or superlinear speedup exhibited by the most performance-efficient algorithms for a given system. We describe network and program models for heterogeneous networks, define notions of speedup and superlinear speedup, and observe that speedup consists of both heterogeneous and parallel components. We also consider the case of linear tasks, give a lower bound for speedup, and show that there is no theoretical upper limit on heterogeneous speedup.
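For orientation, one conventional way to write down these quantities (the paper's own definitions distinguish heterogeneous from parallel components and may differ in detail): if T_i is the execution time of the program on machine i alone and T_net is its execution time on the heterogeneous network of p machines, then

    S_{\mathrm{het}} \;=\; \frac{\min_i T_i}{T_{\mathrm{net}}}, \qquad \text{superlinear when } S_{\mathrm{het}} > p .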
The LeMans FunRun is a heterogeneous competition which was held at Supercomputing '93 in Portland, Oregon. The participants raced to "solve" a puzzle in a fixed amount of time on a network of machines comprised of a CM-5, Paragon, MasPar, Cray, and Silicon Graphics workstations. This paper describes the design and implementation of the LeMans FunRun and discusses its potential for use as a heterogeneous education tool.
This report explores the feasibility of using market mechanisms for efficient, automated resource allocation within a software system.
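As a hedged illustration only (the report itself should be consulted for the mechanisms it actually studies), one of the simplest market mechanisms is a first-price sealed-bid auction in which each task bids for resources and each resource goes to its highest bidder:

    # Sketch: first-price sealed-bid allocation of resources to bidding tasks.
    # Purely illustrative; not necessarily a mechanism analyzed in the report.

    def allocate(bids):
        """bids: {task: {resource: bid}}. Each resource goes to its highest bidder."""
        winners = {}
        for task, offers in bids.items():
            for resource, price in offers.items():
                if resource not in winners or price > winners[resource][1]:
                    winners[resource] = (task, price)
        return {res: task for res, (task, price) in winners.items()}

    print(allocate({"simA": {"cpu1": 5, "cpu2": 3}, "simB": {"cpu1": 4, "cpu2": 6}}))
    # -> {'cpu1': 'simA', 'cpu2': 'simB'}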