%A M. F. Schwartz
%T Autonomy vs. Interdependence in the Networked Resource Discovery Project
%O Position paper, ACM SIGOPS European Workshop, Cambridge, England
%D September 1988
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/PostScript/Auton.vs.Interdep.Wkshop.ps.Z
(compressed PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/Auton.vs.Interdep.Wkshop.txt.Z
(compressed ASCII).

%A M. F. Schwartz
%T The Networked Resource Discovery Project
%J Proceedings of the IFIP XI World Congress
%C San Francisco, California
%D August 1989
%P 827-832
%K Track on Communications and distributed systems
%K Early project description, probabilistic yellow pages
%X Available by anonymous FTP from ftp.cs.colorado.edu in the directory
pub/cs/techreports/schwartz/PostScript/Early.Pjct.Descr (compressed
PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/Early.Pjct.Descr.txt.Z (compressed
ASCII).
%X Abstract:  "Large scale computer networks provide access to a
bewilderingly large number and variety of resources, including retail
products, network services, and people in various capacities.  We
consider the problem of allowing users to \fIdiscover the existence\fR
of such resources in an administratively decentralized environment,
using a system architecture that accesses the distributed collection of
repositories that naturally maintain resource information.  A key
problem is organizing the resource space flexibly.  Rather than
imposing a hierarchical organization, our approach allows the resource
space organization to evolve in accordance with usage patterns.
Concretely, a set of \fIagents\fR organize and search the resource
space by constructing links between the repositories of resource
information based on keywords that describe the contents of each
repository, and the semantics of the resources being sought.  The links
form a general graph, with a flexible set of hierarchies embedded
within the graph to provide some measure of scalability.  The graph
structure evolves over time through the use of cache aging protocols.
Additional scalability is targeted through the use of probabilistic
graph protocols.  A simulation, prototype implementation, and
measurement study are under way."

%A M. F. Schwartz
%T A Scalable, Non-Hierarchical Resource Discovery Mechanism Based on
Probabilistic Protocols
%R Technical Report CU-CS-474-90
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%D June 1990
%K Yellow pages, YP
%X Available by anonymous FTP from ftp.cs.colorado.edu in the directory
pub/cs/techreports/schwartz/PostScript/ProbYP (compressed PostScript) or
in the file pub/cs/techreports/schwartz/ASCII/ProbYP.txt.Z (compressed
ASCII).
%X Abstract:  "Computer network interconnection provides access to a
bewildering array of resources, including databases, network services,
and people in various capacities.  We consider the problem of allowing
users to discover the existence of such resources in a large scale,
administratively decentralized environment.  While hierarchically
organized resource registries have good scalability properties, they
provide poor support for resource discovery, because users must
understand how the nested components are arranged.  In this paper we
present a probabilistic approach that supports non-hierarchical,
attribute based "yellow pages" searches.  The protocols support
locating a small number of instances of moderately large classes of
objects.  The resource graph evolves over time in accordance with what
resources exist and the types of searches that users make.  Simulation
results indicate that the approach can support scalable and flexible
resource discovery for an environment roughly the size of a large
country, with several thousand administrative domains participating in
resource registration and searches.  Moreover, the probabilistic search
strategy naturally supports fair access among competing information
providers."

%A M. F. Schwartz
%A D. H. Goldstein
%A R. K. Neves
%A D. C. M. Wood
%T An Architecture for Discovering and Visualizing Characteristics of Large
Internets
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%R Technical Report CU-CS-520-91
%D February 1991
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/PostScript/NetVis.ps.Z (compressed
PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/NetVis.txt.Z (compressed ASCII).
%X Abstract: "In this paper we present an architecture for discovering
characteristics of large internets, such as topology, congestion,
routing, and protocol usage.  Our approach uses a very loosely coupled
architecture that does not require global agreement over a particular
network management standard, such as the Simple Network Management
Protocol.  Instead, we use a number of different network protocols and
information sources to derive information about networks,
cross-correlating this information when necessary to determine important
characteristics or to uncover inconsistent information.  This approach
recognizes that different sources of network information have different
characteristics with respect to timeliness of discovered information,
expense, danger of generating network problems, and completeness of
discovered information.  Our architecture gives the network
administrator control over which discovery protocols are used, and how
frequently each is scheduled.  Moreover, the architecture focuses on
supporting network management in large scale internets, such as the
global TCP/IP Internet.  We have built a prototype implementation that
can collect network information using a few network protocols, and
display this information graphically."

%A M. F. Schwartz
%T The Great Disconnection?
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%R Technical Report CU-CS-521-91
%D February 1991
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/PostScript/Disconnection.ps.Z (compressed
PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/Disconnection.txt.Z (compressed
ASCII).
%X Abstract: "In this paper we present measured data about the types of
sites reachable by upper layer services (such as mail and "finger") on
the global TCP/IP Internet.  We analyze changes in this type of
reachability by comparing data from two world wide measurements,
conducted 6 months apart.  Our impetus for this analysis is to examine
the extent to which sites are reducing their accessibility from the
Internet, in response to increasing security concerns.  We consider
upper layer service connectivity instead of basic IP connectivity
because the former indicates the willingness of organizations to
participate in inter-organizational computing, which will be an
important component of future wide area distributed applications.
Surprisingly, we find that while some sites are disconnecting or
otherwise distancing themselves from the Internet, the vast majority of
sites have retained full or nearly full Internet connectivity.
Moreover, we estimate that the number of sites accessible via the
Internet has grown by approximately 31% in the past 6 months,
significantly outpacing the rate at which sites are distancing
themselves from the Internet.  Our measurements are broken down by
distancing mechanism and institution type/location."

%A M. F. Schwartz
%A P. G. Tsirigotis
%T Experience with a Semantically Cognizant Internet White Pages Directory
Tool
%J Journal of Internetworking Research and Experience
%D March 1991
%P 23-50
%K Netfind
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/PostScript/White.Pages.ps.Z (compressed
PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/White.Pages.txt.Z (compressed ASCII).
%X Note: the Netfind prototype is available by anonymous FTP from
ftp.cs.colorado.edu, in the directory pub/cs/distribs/netfind.  You can
use the University of Colorado Netfind server by telnetting to
bruno.cs.colorado.edu and logging in as "netfind" (with no password).
%X Abstract: "As wide area networking technology and interconnection
improve, an increasingly important problem is allowing users to navigate
through the vast array of network accessible resources.  In this paper
we discuss experience with one technique we have developed in this
regard, applied to a specific resource class.  We have built a prototype
tool that provides a simple Internet "white pages" directory facility.
Given the name of a user and a rough description of where the user works
(e.g., the company name or city), the tool attempts to locate telephone
and electronic mailbox information about that user.  Measurements
indicate that the scope of the directory is upwards of 1,147,000 users
in 1,929 administrative domains, yet the tool does not require the type
of global cooperation that many existing or proposed directory services
require, namely, running special directory servers at many sites around
the Internet.  We accomplish this by building an understanding of the
semantics of this particular resource discovery application into the
algorithms that support searches, allowing the tool to make aggressive
use of existing sources of relatively unstructured information.  Being
able to make use of such information is important in heterogeneous,
administratively decentralized environments, where global agreement
about highly structured information formats is difficult to achieve.  At
present, the tool utilizes information from USENET news messages, the
Domain Naming System, the Simple Mail Transfer Protocol, and the
"finger" protocol, as well as a variety of information about the meaning
of and relationships between these information sources.  Other sources
of resource information (such as the CCITT X.500 directory service) can
easily be incorporated into the tool as they become available.  The tool
achieves good response time through the use of parallel queries."

%A M. F. Schwartz
%T The Role of Resource Discovery in Support of a National Software
Exchange
%D March 1991
%O Position paper, RIACS National Software Exchange Workshop
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/PostScript/NSE.Wkshop.ps.Z (compressed
PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/NSE.Wkshop.txt.Z (compressed ASCII).

%A M. F. Schwartz
%A D. R. Hardy
%A W. K. Heinzman
%A G. Hirschowitz
%T Supporting Resource Discovery Among Public Internet Archives Using a
Spectrum of Information Quality
%R Technical Report CU-CS-487-90
%J Proceedings of the Eleventh International Conference on Distributed
Computing Systems
%C Arlington, Texas
%D May 1991
%P 82-89
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/PostScript/RD.For.Anon.FTP.ps.Z (compressed
PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/RD.For.Anon.FTP.txt.Z (compressed
ASCII).
%X Abstract: "Wide area networks offer access to an increasing number
and variety of resources, such as documents, software, and network
services.  Yet, it is difficult to locate resources of interest, because
of the scale and decentralized nature of the environment.  We are
interested in supporting a global confederation of loosely cooperating
systems and users that share far more resources than can be completely
organized.  Therefore, mechanisms are needed to support incremental
organization of the resources, based on the efforts of many
geographically decentralized individuals, and a range of different
information sources of varying degrees of quality.  In this paper we
describe a prototype implementation of a set of mechanisms intended to
explore this problem in the specific domain of public Internet archives,
accessible via the "anonymous" File Transfer Protocol.  This is an
interesting test case, because it encompasses a very large scale,
administratively decentralized collection of resources, with
considerable practical value."

%A M. F. Schwartz
%A P. G. Tsirigotis
%T Techniques for Supporting Wide Area Distributed Applications
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%R Technical Report CU-CS-519-91
%O Submitted for publication
%D February 1991; Revised August 1991
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/PostScript/Techniques.Wide.Area.ps.Z
(compressed PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/Techniques.Wide.Area.txt.Z (compressed
ASCII).
%X Abstract: "In this paper we present a number of techniques for
supporting distributed applications that span many nodes across national
or international networks.  We focus particular attention on issues
concerning scalability and administrative decentralization.  Our
experiences derive in large part from prototypes we have built in the
context of research into resource discovery.  However, many of these
experiences are applicable to supporting any wide area distributed
application.  The techniques covered relate to fault tolerance,
administrative decentralization, scalability, organization, controlling
the spread of distributed operations, and user interface issues."

%A M. F. Schwartz
%T A Measurement Study of Changes in Service-Level Reachability in the
Global TCP/IP Internet: Goals, Experimental Design, Implementation, and
Policy Considerations
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%R Internet Request For Comments 1273
%D November 1991
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/PostScript/Inet.Meas.Plan.ps.Z (compressed
PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/Inet.Meas.Plan.txt.Z (compressed
ASCII, in RFC format) or in the file pub/InetMeasStudy/Study.Plan
(uncompressed ASCII, in RFC format).
%X An earlier version of this report appeared as University of Colorado
Technical Report CU-CS-551-91, October 1991.
%X "In this report we discuss plans to carry out a longitudinal measurement
study of changes in service-level reachability in the global TCP/IP
Internet.  We overview our experimental design, considerations of
network and remote site load, mechanisms used to control the measurement
collection process, and network appropriate use and privacy issues,
including our efforts to inform sites measured by this study.  A list of
references and information on how to contact the Principal Investigator
are included."

%A D. J. Ewing
%A R. S. Hall
%A M. F. Schwartz
%T A Measurement Study of Internet File Transfer Traffic
%D January 1992
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%R Technical Report CU-CS-571-92
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/PostScript/FTP.Meas.ps.Z (compressed
PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/FTP.Meas.txt.Z (compressed ASCII).
%X "The lack of a topology-based information distribution mechanism in
wide area networks causes users to waste a large amount of bandwidth on
repeat transfers of widely accessed files.  As a first step towards
designing such a mechanism, in this paper we present measurements of the
Internet File Transfer Protocol, gathered over two weeks at the main
gateway to the University of Colorado network.  We found that nearly 40%
of all transfers were duplicate transmissions, accounting for over 54%
of the file transmission traffic.  A small proportion of files were
significantly larger and more frequently transferred, with 6.30% of the
files accounting for 19.30% of the transfers. 8.21% of duplicate file
transmissions were caused by user errors when transferring binary data,
underscoring the need to insulate users from such details.  We also
found that file transfers occurred in only 44% of the FTP connections.
Other connections were probably directory-only requests, underscoring
the need for better resource discovery support in the Internet.  We
present measurements of a number of other statistics as well, including
distributions of file sizes, file types, peak transfer times, and
sources and destinations."

%A M. F. Schwartz
%A D. C. M. Wood
%T Discovering Shared Interests Among People Using Graph Analysis of Global
Electronic Mail Traffic
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%D February 1992
%O Submitted for publication
%X Based in part on an earlier paper entitled "A Measurement Study of
Organizational Properties in the Global Electronic Mail Community"
(University of Colorado technical report CU-CS-482-90, August 1990).
%X Available by anonymous FTP and e-mail from ftp.cs.colorado.edu in the
directory pub/cs/techreports/schwartz/PostScript/Email.Study (compressed
PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/Email.Study.txt.Z (compressed ASCII).
%X "An important problem faced by users of large networks is how to
discover resources of interest, such as data or people.  In this paper we
focus on locating people with particular interests or expertise.  The usual
approach is to build interest group lists from explicitly registered data.
However, doing so assumes one knows what lists should be built, and who
should be included in each list.  We present an alternative approach, which
can support a more fine grained and dynamically adaptive notion of shared
interests.  Our approach deduces interests from the history of electronic
mail communication, using a set of heuristic graph algorithms.  We
demonstrate the algorithms by applying them to data collected from 15 sites
for two months.  Using these algorithms we were able to deduce shared
interest lists for people far beyond the data collection sites, in such
closely related areas as distributed computing and networks.  The
algorithms we present are powerful, and if abused can threaten privacy.  We
propose guidelines that we believe should underlie the ethical use of these
algorithms.  We discuss several possible applications that we believe do
not threaten privacy, including discovering resources other than people,
such as file system data."

%A M. F. Schwartz
%T Paradigms for Resource Discovery
%O Workshop on Resource Discovery - Principles and Practice.  In
Proceedings of the IFIP International Working Conference on Upper Layer
Protocols, Architectures and Applications
%C Vancouver, Canada
%D May 1992
%K RD, WG 6.5
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/PostScript/ParadigmsRD.ps.Z (compressed
PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/ParadigmsRD.txt.Z (compressed ASCII).

%A M. F. Schwartz
%T Internet Resource Discovery at the University of Colorado
%J To appear, IEEE Computer
%D Revised October 1992
%K Project overview
%X A preliminary version of this paper appeared as "Resource Discovery and
Related Research at the University of Colorado", ConneXions - The
Interoperability Report, pp. 12-20, Interop, Inc., May 1991.  A second
version appeared as University of Colorado Technical Report
CU-CS-555-91, entitled "Resource Discovery in the Global Internet",
November 1991.
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/PostScript/Proj.Overview.ps.Z (compressed
PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/Proj.Overview.txt.Z (compressed
ASCII).
%X "Rapidly increasing global Internet connectivity offers tremendous
opportunities for collaboration and information sharing.  An important
problem in this environment is how to discover resources of interest,
such as documents, network services, and people.  In this paper we
discuss a number of aspects of the resource discovery problem, and
summarize results from efforts to address these problems carried out in
the Networked Resource Discovery Project at the University of Colorado."

%A D. Heimbigner
%T Experiences With an Object Manager for a Process-Centered Environment
%R Technical Report CU-CS-484-92
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%D Revised 20 February 1992
%J Proceedings of the 18th International Conference on Very Large Data Bases (to appear)
%C Vancouver, B.C.
%K triton experience
%X Available by anonymous FTP from ftp.cs.colorado.edu in the directory
pub/cs/techreports/arcadia/triton.
%X Abstract: "Process-centered software engineering environments, such as
Arcadia, impose a variety of requirements on database technology that to
date have not been well supported by available object-oriented
databases.  Some of these requirements include multi-language access and
sharing, support for independent relations, and support for triggers.
Triton is an object-oriented database management system designed to
support the Arcadia software engineering environment.  It can be used as
a general purpose DBMS, although it has specialized features to support
the software process capabilities in Arcadia in the form of the APPL/A
language.  Triton was developed as a prototype to explore the
requirements for software environments and to provide prototypical
solutions.  By making these requirements known, it is hoped that better
solutions will eventually be provided by the database community."

%A D. Heimbigner
%T Triton Reference Manual Version 0.8.1
%R Technical Report CU-CS-483-92
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%D Revised 5 May 1992
%K triton reference
%X Available by anonymous FTP from ftp.cs.colorado.edu in the directory
pub/cs/techreports/arcadia/triton.

%A M. F. Schwartz
%A A. Emtage
%A B. Kahle
%A B. C. Neuman
%T A Comparison of Internet Resource Discovery Approaches
%D August 1992
%K CUDCS
%X An earlier version appeared as University of Colorado Technical
Report CU-CS-601-92, July 1992.
%J \fRTo appear,\fP Computing Systems
%K USENIX Journal, resource discovery taxonomy, tax, systems, RD,
dimensions: object granularity, object distribution, object interconnection
topology, and data integration scheme, information retrieval, IR
%K Archie, Prospero, WAIS, Netfind, X.500, Gopher, WWW, Indie, DLS, Alex,
AFS, Univers
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/PostScript/RD.Comparison.ps.Z (compressed
PostScript) or in the file
pub/cs/techreports/schwartz/ASCII/RD.Comparison.txt.Z (compressed ASCII).
%X "In the past several years, the number and variety of resources
available on the Internet have increased dramatically.  With this increase,
many new systems have been developed that allow users to search for and
access these resources.  As these systems begin to interconnect with one
another through "information gateways", the conceptual relationships
between the systems come into question.  Understanding these relationships
is important, because they address the degree to which the systems can be
made to interoperate seamlessly, without the need for users to learn the
details of each system.  In this paper we present a taxonomy of approaches
to resource discovery.  The taxonomy provides insights into the
interrelated problems of organizing, browsing, and searching for
information.  Using this taxonomy, we compare a number of resource
discovery systems, and examine several gateways between existing systems."

%A Dirk Grunwald
%A Benjamin Zorn
%T CustoMalloc: Efficient Synthesized Memory Allocators
%D July 1992
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%R Technical Report CU-CS-602-92
%O Submitted for publication
%K memory allocation
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
   pub/cs/techreports/grunwald/CU-CS-602-92.ps.Z (compressed PostScript).
%X "The allocation and disposal of memory is a ubiquitous operation in
    most programs. Rarely do programmers concern themselves with details
    of memory allocators; most assume that memory allocators provided by
    the system perform well. Yet, in some applications, programmers use
    domain-specific knowledge in an attempt to improve the speed or memory
    utilization of memory allocators.
    In this paper, we describe a program (CustoMalloc) that synthesizes a
    memory allocator customized for a specific application. Our
    experiments show that the synthesized allocators are uniformly faster
    than the common binary-buddy (bsd) allocator,
    and are more space efficient.
    Constructing a custom allocator requires little programmer
    effort. The process can usually be accomplished in a few minutes, and
    yields results superior even to domain-specific allocators designed by
    programmers. Our measurements show the synthesized allocators are from
    two to ten times faster than widely used allocators."
    
%A Benjamin Zorn
%A Dirk Grunwald
%T Evaluating Models of Memory Allocation
%D July 1992
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%R Technical Report CU-CS-603-92
%O Submitted for publication
%K memory allocation, program behavior, synthetic models, simulation
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
   pub/cs/techreports/zorn/CU-CS-603-92.ps.Z (compressed PostScript).
%X "Because dynamic memory management is an important part of a large
class of computer programs, high-performance algorithms for dynamic
memory management have been, and will continue to be, of considerable
interest. We evaluate and compare models of the memory allocation
behavior in actual programs and investigate how these models can be
used to explore the performance of memory management algorithms.
These models, if accurate enough, provide an attractive alternative to
algorithm evaluation based on trace-driven simulation using actual
traces.  We explore a range of models of increasing complexity
including models that have been used by other researchers.  Based on
our analysis, we draw three important conclusions.  First, a very
simple model, which generates a uniform distribution around the mean
of observed values, is often quite accurate.  Second, two new models
we propose show greater accuracy than those previously described in
the literature.  Finally, none of the models investigated appear
adequate for generating an operating system workload."

%A Benjamin Zorn
%A Dirk Grunwald
%T Empirical Measurements of Six Allocation-intensive C Programs
%D July 1992
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%R Technical Report CU-CS-604-92
%O Submitted for publication in SIGPLAN Notices
%K memory allocation, program behavior
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
   pub/cs/techreports/zorn/CU-CS-604-92.ps.Z (compressed PostScript).
%X "Dynamic memory management is an important part of a large class of
computer programs and high-performance algorithms for dynamic memory
management have been, and will continue to be, of considerable
interest.  This paper presents empirical data from a collection of six
allocation-intensive C programs.  Extensive statistics about the
allocation behavior of the programs measured, including the
distributions of object sizes, lifetimes, and interarrival times, are
presented.  This data is valuable for the following reasons: first,
the data from these programs can be used to design high-performance
algorithms for dynamic memory management.  Second, these programs can
be used as a benchmark test suite for evaluating and comparing the
performance of different dynamic memory management algorithms.
Finally, the data presented gives readers greater insight into the
storage allocation patterns of a broad range of programs.  The data
presented in this paper is an abbreviated version of more extensive
statistics that are publicly available on the Internet."
    
%A Dirk Grunwald
%A Harini Srinivasan
%T Data Flow Equations for Explicitly Parallel Programs
%D July 1992
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%R Technical Report CU-CS-605-92
%K dataflow equations, parallel compilers
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
   pub/cs/techreports/grunwald/CU-CS-605-92.ps.Z (compressed PostScript).
%X "We have extended the standard monotone dataflow system for the
    reaching definitions problem to accommodate explicitly parallel
    programs; this information is used in many standard optimization problems.
    This paper considers the parallel_sections construct,
    both with and without explicit synchronization; a future paper
    considers the parallel_do construct.
    Although work has been
    done on analyzing parallel programs to detect data races, 
    little work has been done on optimizing such programs; the
    equations in this paper should form the basis for extensive work
    on optimization."


%A Benjamin Zorn
%T The Measured Cost of Conservative Garbage Collection
%D April 1992, revised August 1992
%I Department of Computer Science, University of Colorado, Boulder,
Colorado
%R Technical Report CU-CS-573-92
%O Submitted for publication
%K garbage collection, memory allocation, conservative collection
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
   pub/cs/techreports/zorn/CU-CS-573-92.ps.Z (compressed PostScript).
%X "Because dynamic memory management is an important part of a large
class of computer programs, high-performance algorithms for dynamic
memory management have been, and will continue to be, of considerable
interest.  Experience indicates that for many programs, dynamic
storage allocation is so important that programmers feel compelled to
write and use their own domain-specific allocators to avoid the
overhead of system libraries.  Conservative garbage collection has
been suggested as an important algorithm for dynamic storage
management in C programs.  In this paper, I evaluate the costs of
different dynamic storage management algorithms, including
domain-specific allocators; widely-used general-purpose allocators;
and a publicly available conservative garbage collection algorithm.
Surprisingly, I find that programmer enhancements often have little
effect on program performance.  I also find that the true cost of
conservative garbage collection is not the CPU overhead, but the
memory system overhead of the algorithm.  I conclude that conservative
garbage collection is a promising alternative to explicit storage
management and that the performance of conservative collection is
likely to be improved in the future.  C programmers should now
seriously consider using conservative garbage collection instead of
malloc/free in programs they write."

%A M. F. Schwartz
%T Which White Pages Service is Appropriate for My Site?
%J Internet News: The Newsletter of the Internet Society
%V 1
%N 4
%D \fRTo appear,\fR Fall 1992
%K UNIV of Colorado - Boulder
%X N-1-4-040.31.1
%K RD, ISOC, WP, X.500, Netfind, WAIS, Gopher, KIS, WHOIS, CSO, PH, 411,
UIUC, da
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/schwartz/ASCII/White.Pages.Comparison.txt.Z
(compressed ASCII).

%A William M. Waite
%A Anthony M. Sloane
%T Software Synthesis via Domain-Specific Software Architectures
%R CU-CS-611-92
%I Department of Computer Science, University of Colorado, Boulder
%D September 1992
%O Submitted for publication in IEEE Software
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/Eli/dssa.ps.Z (compressed PostScript).
%X "Current software engineering practice concentrates on improving the
process by which a programmer develops a solution from the description
of a problem; we describe a new paradigm for software synthesis based
on Domain-Specific Software Architectures (DSSAs) that eliminates this
process entirely.  A DSSA provides an overall software design that
solves a whole class of problems in a broad area.  It focuses the
designer's attention on the unique requirements of the current problem,
suppressing those that are common to all problems of the type addressed
by that DSSA.  To use the DSSA approach, a software engineer provides a
description of the unique requirements of a particular problem.  A
solution to that problem is then generated according to the DSSA's
overall design by a system that implements the DSSA.  Problem
descriptions are checked for consistency by the system, and the
generated software is guaranteed to solve the problem described.  We
briefly describe how we have used the DSSA approach to build Eli, a
system for compiler construction.  Generalizing from Eli, we identify
requirements that the implementation of any DSSA should satisfy: 1)
incorporation of a manufacturing language to describe the incremental
derivation of software objects with architecture-based error reporting,
2) incorporation of an authoring language to allow on-line access to
documentation and system components, and 3) the ability to incorporate
externally developed tools and export constructed programs."

%A Uwe Kastens
%A William M. Waite
%T Modularity and Reusability in Attribute Grammars
%R CU-CS-613-92
%I Department of Computer Science, University of Colorado, Boulder
%D September 1992
%O Submitted for publication in Acta Informatica
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/Eli/modular.ps.Z (compressed PostScript).
%X "An attribute grammar is a declarative specification of dependence
among computations carried out at the nodes of a tree.  Attribute
grammars have proven remarkably difficult to decompose into logical
fragments.  As a result, they have not yet been accepted as a viable
specification technique.  By combining the ideas of remote attribute
access and inheritance, we have been able to define "attribution
modules" that can be reused in a variety of applications.  As an
example, we show how to define reusable modules for name analysis that
embody different scope rules."

%A William M. Waite
%T Error Analysis and Reporting in Programming Environments
%R CU-CS-456-90
%I Department of Computer Science, University of Colorado, Boulder
%D January 1990
%O Submitted for publication in Computer Journal
%X Available by anonymous FTP from ftp.cs.colorado.edu in the file
pub/cs/techreports/Eli/reports.ps.Z (compressed PostScript).
%X "A programming environment combines elementary operations into
single composite operations to be applied by the user.  The operands of
the composite operation may be decomposed, their components processed
by elementary operations, the results combined in arbitrary ways and
manipulated by other elementary operations, and a final product
delivered to the user.  If one of the elementary operations detects an
inconsistency in its data, the raw report of that inconsistency will
almost certainly be incomprehensible to the user.  The programming
environment must therefore be capable of using information about its
own structure to process the raw report and provide the user with
relevant diagnostics.  We have developed a general approach to the
design of such error recovery mechanisms and demonstrated its
usefulness in an experimental programming environment.  Our technique
can also be used to provide a comprehensive help facility based on a
hypertext version of the system documentation."

%A G. Graefe
%A A. Linville
%A L. D. Shapiro
%T Sort versus Hash Revisited
%K Volcano duality join benchmarks equal different size skew
%J To appear, IEEE Transactions on Knowledge and Data Engineering
%K TKDE
%D 1993
%O An earlier version is available as
CU Boulder CS TR 534, July 1991

%A G. Graefe
%T Five Performance Enhancements for Hybrid Hash Join
%K hash-join hashjoin tuning compression  cluster size fan-out
recursion depth statistics histograms non-uniformity duplicate
skew role reversal multi-way joins
%J Submitted for publication
%K IEEE TKDE
%D July 1992
%O Also CU Boulder CS TR 606

%A G. Graefe
%A D. L. Davison
%T Encapsulation of Parallelism and Architecture-Independence
in Extensible Database Query Processing
%K Volcano hierarchical memory
%J Submitted for publication
%K IEEE TSE
%D November 1991
%O Also CU Boulder CS TR 559

%A G. Graefe
%A W. J. McKenna
%T Extensibility and Search Efficiency in the Volcano Optimizer
Generator
%K rule-based query OptGen intro
logical physical algebra operator method enforcer property vector
%J Submitted for publication
%K IEEE DE
%D 1992
%O An earlier version is also available as
CU Boulder CS TR 563, December 1991

%A G. Graefe
%T Volcano, An Extensible and Parallel Dataflow Query Processing System
%K overview file buffer operators iterators exchange choose-plan
hash one-to-one match overflow
%J To appear, IEEE Transactions on Knowledge and Data Engineering
%K TKDE
%D 1993
%O A more detailed version is available as
CU Boulder CS TR 481, July 1990

%A G. Graefe
%T Query Processing Techniques for Large Databases
%K survey execution iterators algorithms parallelism
logical physical algebra operator extensible object-oriented
%J Submitted for publication
%K ACM Computing Surveys
%D January 1992
%O Also CU Boulder CS TR 579

%A G. Graefe
%A R. L. Cole
%A D. L. Davison
%A W. J. McKenna
%A R. H. Wolniewicz
%T Extensible Query Optimization and Parallel Execution in Volcano
%K optgen generator exchange operator distributed memory hierarchy
%B Query Processing for Advanced Database Applications
%K Dagstuhl 1991 Germany workshop
%E J. C. Freytag, G. Vossen and D. Maier
%K Freytag Vossen Maier
%I Morgan Kaufmann
%C San Mateo, CA
%D 1992

