Using DistributedMake


This chapter describes how DistributedMake (DMake) distributes a build over several hosts so that programs are built concurrently on a number of workstations or on multiple CPUs.


Basic Concepts
What You Should Know About DMake Before You Use It
How to Use DMake

Basic Concepts

DistributedMake (DMake) allows you to concurrently distribute the process of building large projects, consisting of many programs, over a number of workstations and, in the case of multiprocessor systems, over multiple CPUs. DMake parses your makefiles and:

Determines which targets can be built concurrently
Distributes the build of those targets over a number of hosts designated by you

DMake is a superset of the make utility.

To understand DMake, you should know about the following:

Configuration files
- Runtime
- Build server
The DMake host
The build server


The following is an example of a runtime configuration file:

# My machine. This entry causes dmake to distribute to it.
falcon { jobs = 1 }
hawk
eagle { jobs = 3 }
# Manager's machine. She's usually at meetings
heron { jobs = 4 }
avocet

The entries falcon, hawk, eagle, heron, and avocet are the listed build servers.
You can specify the number of jobs you want distributed to each build server. The default number of jobs is two.
Any line that begins with the "#" character is interpreted as a comment.

Note - This list of build servers includes falcon which is also the DMake host. The DMake host can also be specified as a build server. If you do not include it in the runtime configuration file, no DMake jobs are distributed to it.

You can also construct groups of build servers in the runtime configuration file. This lets you switch easily between different groups of build servers as circumstances warrant. For instance, you may define one group of build servers for each operating system you build under, or a group of build servers that have special software installed on them.

The following is an example of a runtime configuration file that contains groups of build servers:

earth { jobs = 2 }
mars { jobs = 3 }

group lab1 {
    host falcon { jobs = 3 }
    host hawk
    host eagle { jobs = 3 }
}

group lab2 {
    host heron
    host avocet { jobs = 3 }
    host stilt { jobs = 2 }
}

group labs {
    group lab1
    group lab2
}

group sunos5.x {
    group labs
    host jupiter
    host venus { jobs = 2 }
    host pluto { jobs = 3 }
}

Formal groups are specified by the "group" directive and lists of their members are delimited by braces ({}).
Build servers that are members of groups are specified by the optional "host" directive.
Groups can be members of other groups.
Individual build servers can be listed in runtime configuration files that also contain groups of build servers; in this case DMake treats these build servers as members of the unnamed group.

In order of precedence, DMake distributes jobs to:

1. The formal group specified on the command line as an argument to the -g option
2. The formal group specified by the DMAKE_GROUP makefile macro
3. The formal group specified by the DMAKE_GROUP environment variable
4. The first group specified in the runtime configuration file
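
As a minimal sketch of the second item, a makefile could select a group through the DMAKE_GROUP macro. The group name lab1 comes from the sample runtime configuration file above. Giving the macro the bare group name as its value is an assumption, not something this chapter confirms.

```makefile
# Distribute jobs to the build servers in group lab1 (name taken from the
# sample runtime configuration file). A group named with -g on the command
# line takes precedence over this macro.
DMAKE_GROUP = lab1
```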

The Build Server

The /etc/opt/SPROdmake/dmake.conf file resides in the file system of each build server. Use this file to limit the maximum total number of DMake jobs (from all users) that can run concurrently on a build server. The following is an example of an /etc/opt/SPROdmake/dmake.conf file. This file sets the maximum number of DMake jobs permitted to run on a build server (from all DMake users) to eight.


jobs: 8

Note - If the /etc/opt/SPROdmake/dmake.conf file does not exist on a build server, no DMake jobs are allowed to run on that server.

What You Should Know About DMake Before You Use It

To use DMake, you run the executable file (dmake) in place of the standard make utility. You should understand the Solaris make utility before you use DMake. If you need to read more about the make utility, see the Programming Utilities Guide in the Solaris 2.5 Software Developer AnswerBook documentation set. If you already use the make utility, the transition to DMake requires little, if any, alteration to your makefiles.

DMake's Impact on Makefiles

The methods and examples shown in this section present the kinds of problems that lend themselves to solution with DMake. This section does not suggest that any one approach or example is the best. Compromises between clarity and functionality were made in many of the examples.

As procedures become more complicated, so do the makefiles that implement them. You must know which approach will yield a reasonable makefile that works. The examples in this section illustrate common code-development predicaments and some straightforward methods to simplify them using DMake.

Using Makefile Templates

If you use a makefile template from the outset of your project, custom makefiles that evolve from the makefile templates will be:

More familiar
Easier to understand
Easier to integrate
Easier to maintain
Easier to reuse

The less time you spend editing makefiles, the more time you have to develop your program or project.

Building Targets Concurrently

Large software projects typically consist of multiple independent modules that can be built concurrently. DMake supports concurrent processing of targets on multiple machines over a network. This concurrency can markedly reduce the time required to build a large project.

When given a target to build, DMake checks the dependencies associated with that target, and builds those that are out of date. Building those dependencies may, in turn, entail building some of their dependencies. When distributing jobs, DMake starts every target that it can. As these targets complete, DMake starts other targets. Nested invocations of DMake are not run concurrently by default, but this can be changed (see "Restricting Parallelism" for more information).

Since DMake builds multiple targets concurrently, the output of each build is produced simultaneously. To avoid intermixing the output of various commands, DMake collects output from each build separately. DMake displays the commands before they are executed. If an executed command generates any output, warnings, or errors, DMake displays the entire output for that command. Since commands started later may finish earlier, this output may be displayed in an unexpected order.

Limitations on Makefiles

Concurrent building of multiple targets places some restrictions on makefiles. Makefiles that depend on the implicit ordering of dependencies may fail when built concurrently. Targets in makefiles that modify the same files may fail if those files are modified concurrently by two different targets. Some examples of possible problems are discussed in this section.

Dependency Lists

When building targets concurrently, it is important that dependency lists be accurate. If two executables use the same object file but only one specifies the dependency, the build may fail when done concurrently. For example, consider the following makefile fragment:


all: prog1 prog2
prog1: prog1.o aux.o
	$(LINK.c) prog1.o aux.o -o prog1
prog2: prog2.o
	$(LINK.c) prog2.o aux.o -o prog2

When built serially, the target aux.o is built as a dependent of prog1 and is up-to-date for the build of prog2. If built in parallel, the link of prog2 may begin before aux.o is built, and is therefore incorrect. The .KEEP_STATE feature of make detects some dependencies, but not the one shown above.
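
One way to repair the fragment above, shown here as a sketch, is to list aux.o as a dependency of prog2 as well. The target and file names are those of the original fragment. Nothing else changes.

```makefile
all: prog1 prog2

prog1: prog1.o aux.o
	$(LINK.c) prog1.o aux.o -o prog1

# aux.o is now an explicit dependency, so the link of prog2 cannot
# start until aux.o has been built, even in a parallel build.
prog2: prog2.o aux.o
	$(LINK.c) prog2.o aux.o -o prog2
```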

Explicit Ordering of Dependency Lists

Other examples of implicit ordering dependencies are more difficult to fix. For example, if all of the headers for a system must be constructed before anything else is built, then everything must be dependent on this construction. This causes the makefile to be more complex and increases the potential for error when new targets are added to the makefile. The user can specify the special target .WAIT in a makefile to indicate this implicit ordering of dependents. When DMake encounters the .WAIT target in a dependency list, it finishes processing all prior dependents before proceeding with the following dependents. More than one .WAIT target can be used in a dependency list. The following example shows how to use .WAIT to indicate that the headers must be constructed before anything else.


all: hdrs .WAIT libs functions

You can add an empty rule for the .WAIT target to the makefile so that the makefile is backward-compatible.
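
A sketch of such a backward-compatible makefile follows. The targets hdrs, libs, and functions are those of the example above. The assumption is that a make utility without .WAIT support treats .WAIT as an ordinary target with nothing to do once the empty rule exists.

```makefile
all: hdrs .WAIT libs functions

# Empty rule: no dependents and no commands. DMake still treats .WAIT
# as an ordering barrier. The rule exists only so that a make utility
# that does not know .WAIT has a definition for the name.
.WAIT:
```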

Concurrent File Modification

You must make sure that targets built concurrently do not attempt to modify the same files at the same time. This can happen in a variety of ways. If a new suffix rule is defined that must use a temporary file, the temporary file name must be different for each target. You can accomplish this by using the dynamic macros $@ or $*. For example, a .c.o rule which performs some modification of the .c file before compiling it might be defined as:


.c.o:
	awk -f modify.awk $*.c > $*.mod.c
	$(COMPILE.c) $*.mod.c -o $*.o
	$(RM) $*.mod.c
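
The same rule can also be sketched with the $@ macro instead of $*. This variant is an illustration and does not come from the original manual. It assumes the compiler accepts a source file named after the target with a .mod.c suffix (for example, foo.o.mod.c).

```makefile
.c.o:
	# $< is the .c source and $@ is the .o target, so the temporary
	# file name is unique for each target being built.
	awk -f modify.awk $< > $@.mod.c
	$(COMPILE.c) $@.mod.c -o $@
	$(RM) $@.mod.c
```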

Concurrent Library Update

Another potential concurrency problem is the default rule for creating libraries that also modifies a fixed file, that is, the library. The inappropriate .c.a rule causes DMake to build each object file and then archive that object file. When DMake archives two object files in parallel, the concurrent updates will corrupt the archive file.


.c.a:
	$(COMPILE.c) -o $% $<
	$(AR) $(ARFLAGS) $@ $%
	$(RM) $%

A better method is to build each object file and then archive all the object files after completion of the builds. An appropriate suffix rule and the corresponding library rule are:

.c.a:
	$(COMPILE.c) -o $% $<
lib.a: lib.a($(OBJECTS))
	$(AR) $(ARFLAGS) $@ $(OBJECTS)
	$(RM) $(OBJECTS)

Multiple Targets

Another form of concurrent file update occurs when the same rule is defined for multiple targets. An example is a yacc(1) program that builds both a program and a header for use with lex(1). When a rule builds several target files, it is important to specify them as a group using the + notation. This is especially so in the case of a parallel build.


y.tab.c y.tab.h: parser.y
	$(YACC.y) parser.y

This rule is actually equivalent to the two rules:

y.tab.c: parser.y
	$(YACC.y) parser.y
y.tab.h: parser.y
	$(YACC.y) parser.y

The serial version of make builds the first rule to produce y.tab.c and then determines that y.tab.h is up-to-date and need not be built. When building in parallel, DMake checks y.tab.h before yacc has finished building y.tab.c and concludes that y.tab.h does need to be built. It then starts another yacc in parallel with the first one. Since both yacc invocations write to the same files (y.tab.c and y.tab.h), these files are apt to be corrupted. The correct rule uses the + construct to indicate that both targets are built simultaneously by the same rule. For example:

y.tab.c + y.tab.h: parser.y
	$(YACC.y) parser.y

Restricting Parallelism

Sometimes file collisions cannot be avoided in a makefile. An example is xstr(1), which extracts strings from a C program to implement shared strings. The xstr command writes the modified C program to the fixed file x.c and appends the strings to the fixed file strings. Since xstr must be run over each C file, the following new .c.o rule is commonly defined:


.c.o:
	$(CC) $(CPPFLAGS) -E $*.c | xstr -c -
	$(CC) $(CFLAGS) $(TARGET_ARCH) -c x.c
	mv x.o $*.o

DMake cannot concurrently build targets using this rule, since the build of each target writes to the same x.c and strings files, and the files that xstr uses cannot be changed. You can use the special target .NO_PARALLEL: to tell DMake not to build these targets concurrently. For example, if the objects being built using the .c.o rule were defined by the OBJECTS macro, the following entry would force DMake to build those targets serially:

.NO_PARALLEL: $(OBJECTS)


If most of the objects must be built serially, it is easier and safer to force all objects to default to serial processing by including the .NO_PARALLEL: target without any dependents. Any targets that can be built in parallel can be listed as dependencies of the .PARALLEL: target:

.NO_PARALLEL:
.PARALLEL: $(LIB_OBJECT)

Nested Invocations of DistributedMake

When DMake encounters a target that invokes another DMake command, it builds that target serially, rather than concurrently. This prevents problems where two different DMake invocations attempt to build the same targets in the same directory. Such a problem might occur when two different programs are built concurrently, and each must access the same library. The only way for each DMake invocation to be sure that the library is up-to-date is for each to invoke DMake recursively to build that library. DMake only recognizes a nested invocation when the $(MAKE) macro is used in the command line.

If you nest commands that you know will not collide, you can force them to be done in parallel by using the .PARALLEL: construct.
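
The following sketch shows two nested invocations that are allowed to run in parallel. The target names (tools, docs) and directory names (tools_dir, docs_dir) are hypothetical. The sketch assumes the two subdirectories share no targets or output files.

```makefile
all: tools docs

# Each rule uses $(MAKE), so DMake recognizes it as a nested invocation
# and would build it serially by default.
tools:
	cd tools_dir; $(MAKE) all

docs:
	cd docs_dir; $(MAKE) all

# The two nested builds are known not to collide, so let them
# run concurrently.
.PARALLEL: tools docs
```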

When a makefile contains many nested commands that run concurrently, the load-balancing algorithm may force too many builds to be assigned to the local machine. This may cause high loads and possibly other problems, such as running out of swap space. If such problems occur, allow the nested commands to run serially.

How to Use DMake

You execute dmake on a DMake host and distribute jobs to build servers. You can also distribute jobs to the DMake host, in which case it is also considered to be a build server. DMake distributes jobs based on makefile targets that DMake determines (based on your makefiles) can be built concurrently. You can use any machine as a build server that meets the following requirements:

From the DMake host (the machine you are using) you must be able to use rsh, without being prompted for a password, to remotely execute commands on the build server. See the rsh(1) man page or the system AnswerBook for more information about the rsh command. For example:
demo% rsh build_server which dmake
/opt/SUNWspro/bin/dmake
The bin directory in which the DMake software is installed must be accessible from the build server. See the share(1M) and mount(1M) man pages or the system AnswerBook for more information.
The bin directory in which the DMake software is installed must be in your execution path when you rsh to the build server. Be sure this directory is added to the PATH variable in your .cshrc file (or equivalent), not in your .login file. You can verify this as follows:
demo% rsh build_server which dmake
/opt/SUNWspro/bin/dmake
The source hierarchy you are building must be:
- accessible from the build server
- mounted under the same name


From the DMake host you can control which build servers are used and how many DMake jobs are allotted to each build server. The number of DMake jobs that can run on a given build server can also be limited on that server.

Notes

If you specify the -m option with the "parallel" argument, or set the DMAKE_MODE variable or macro to the value "parallel", DMake does not scan your runtime configuration file. Therefore, you must specify the number of jobs using the -j option or the DMAKE_MAX_JOBS variable/macro. If you do not specify a value this way, a default of two jobs is used.
If you modify the maximum number of jobs using the -j option, or the DMAKE_MAX_JOBS variable/macro when using DMake in distributed mode (DMake default, or specified either by option, variable or macro), the value you specify overrides the values listed in the runtime configuration file. The value you specify is used as the total number of jobs that can be distributed to all build servers.
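
As a sketch of the first note, the two macros could be set in a makefile as follows. The job count of 4 is an arbitrary illustrative value, not a recommendation from this chapter.

```makefile
# Run in parallel mode. In this mode DMake does not scan the runtime
# configuration file.
DMAKE_MODE = parallel

# Number of jobs to run. If this is not set (and -j is not given),
# the default of two jobs applies.
DMAKE_MAX_JOBS = 4
```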

Controlling DistributedMake Jobs

The distribution of DMake jobs is controlled in two ways:

1. A DMake user on a DMake host can specify the machines to use as build servers and the number of jobs to distribute to each build server.
2. The "owner" of a build server can control the maximum total number of DMake jobs that can be distributed to that build server. The owner is a user who can alter the /etc/opt/SPROdmake/dmake.conf file.

Note - If you access DMake from the GUI (Building), use the online help to learn how to specify your build servers and jobs. If you access DMake from the CLI, see the DMake man page (dmake.1).

Getting Help on the GUI or the CLI

DMake is fully implemented in both the GUI and CLI. To use DMake from the GUI, see the Sun WorkShop TeamWare online help.

To Access the Online Help

1. Open any Sun WorkShop TeamWare GUI from the Sun WorkShop.

2. Or, open any Sun WorkShop TeamWare from the command line. For example, to open the Configuring GUI, enter the following:


demo% twconfig &

3. Open the pull-down menu from the Help button.

4. Click on Help Contents.

To Access Help for the CLI

To access the manual page for information on how to use DMake from the CLI, enter the following. The manual page gives information on all command-line options, variables, and macros necessary to use DMake.

demo% man dmake