Globally rename an identifier throughout a set of source files
|Debian (Lenny, Squeeze)|
To globally rename an identifier throughout a set of source files
Suppose that you have a set of C source files in which the function foo is declared, defined and used. You wish to change the name of this function to bar, but without affecting other identifiers whose names merely contain foo as a substring.
The files are located in a single directory and (as is normal practice) have names ending in the extensions .c and .h. One of the files that makes use of the function foo is named main.c.
There are a number of ways in which the required effect can be achieved with commonly available tools. Three methods are described here:
- using GNU sed
- using POSIX sed
- using Perl.
All are equally effective, but vary in terms of portability and complexity.
Be warned that making global changes to a body of source code has the potential to cause severe data loss if the procedure were to go wrong for any reason. It would be prudent either to make a copy of the code, or better, ensure that it is fully checked into a revision control system before attempting to use any of the methods described here.
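As one way of following this advice, a safety copy of the source directory can be taken with cp before any editing begins. The sketch below is illustrative only: the function name backup_sources and the .orig suffix are assumptions made for this example.

```shell
# backup_sources DIR
# Copy DIR recursively to DIR.orig, preserving timestamps and permissions.
# Both the function name and the ".orig" suffix are hypothetical choices.
backup_sources() {
    src=$1
    backup="$src.orig"
    # Refuse to overwrite an earlier backup.
    if [ -e "$backup" ]; then
        echo "backup_sources: $backup already exists" >&2
        return 1
    fi
    cp -R -p "$src" "$backup"
}
```

It could be called as backup_sources /path/to/project (without a trailing slash) before running any of the commands that follow.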
If the GNU implementation of sed is available then the required effect can be obtained using the following editing script:
s/\bfoo\b/bar/g
The s command means ‘substitute’ and it takes two arguments: a regular expression to search for (\bfoo\b) and the text to substitute when a match is found (bar). The g flag requests that the substitution be performed globally, as opposed to only for the first match on each line.
Within the regular expression, \b matches a zero-width string at a word boundary. Word characters are letters, digits and underscores, therefore the boundaries matched are in most cases the same as would be recognised by a C compiler.
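The effect of the word boundaries can be seen by running the script over a single line that contains foo both on its own and as part of longer identifiers. This is a small sketch that assumes GNU sed; the function name demo_word_boundary and the sample line are made up for illustration.

```shell
# Feed one sample line through the GNU sed script.
# Only the stand-alone occurrences of foo should be renamed.
demo_word_boundary() {
    printf '%s\n' 'foo(); foobar(); xfoo(); foo_2 = foo;' |
        sed 's/\bfoo\b/bar/g'
}

# demo_word_boundary prints:
#   bar(); foobar(); xfoo(); foo_2 = bar;
```

The identifiers foobar, xfoo and foo_2 are left alone because the characters adjoining foo in each of them are word characters, so no boundary exists at that point.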
sed should be invoked with just the script and the name of one of the files:
sed 's/\bfoo\b/bar/g' main.c
The result will be written to
stdout. If it is satisfactory then the
-i option can be added to enable in-place editing, and the list of files extended to include all source and header files:
sed -i 's/\bfoo\b/bar/g' *.c *.h
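Before committing to in-place editing across many files, it may be helpful to view the pending changes as a unified diff. The following sketch assumes GNU sed and a diff that supports -u; the function name preview_rename is hypothetical.

```shell
# preview_rename FILE...
# Show what the rename would change in each file, without modifying anything.
preview_rename() {
    for f in "$@"; do
        # diff exits with status 1 when the inputs differ, which is the
        # expected case here, so that status is deliberately ignored.
        sed 's/\bfoo\b/bar/g' "$f" | diff -u "$f" - || true
    done
}
```

For example, preview_rename *.c *.h prints one diff per file that would be altered and nothing for files that contain no match.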
Both \b and -i are GNU extensions that are not required by POSIX. This is therefore not a suitable method for use in scripts that should be portable.
It is possible to achieve the required effect using a minimally POSIX-compatible implementation of sed. However, the procedure for doing so is somewhat more complicated than when the \b extension is available, and the effort is unlikely to be worthwhile in most cases. A suitable editing script would be:
s/\(^\|[^a-zA-Z0-9_]\)foo\([^0-9A-Za-z_]\|$\)/\1bar\2/g
The boundaries are detected here by looking for non-word characters before and after the identifier. Because those characters now form part of the string that will be matched they must be reinserted into the output using backreferences. Identifiers at the start and/or end of a line cannot be matched using this technique so are treated as a special case.
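Two caveats may be worth keeping in mind. The script for this method relies on \| for alternation, which is not accepted by every implementation of sed. Also, because each match consumes the delimiter that follows it, two occurrences separated by a single character (as in foo(foo(x))) are not both replaced in one pass. The sketch below is one possible way around both points, not the only one: it handles the start and end of the line with separate commands and repeats the general substitution in a loop until nothing more matches. The function name rename_foo_posix is hypothetical.

```shell
# rename_foo_posix: filter stdin to stdout, renaming foo to bar.
# Avoids \| and \b; uses only basic regular expressions and a t loop.
rename_foo_posix() {
    sed -e 's/^foo$/bar/' \
        -e 's/^foo\([^A-Za-z0-9_]\)/bar\1/' \
        -e 's/\([^A-Za-z0-9_]\)foo$/\1bar/' \
        -e ':a' \
        -e 's/\([^A-Za-z0-9_]\)foo\([^A-Za-z0-9_]\)/\1bar\2/' \
        -e 'ta'
}
```

The loop terminates because the replacement text bar cannot itself be matched as foo. It can be used in the same way as the single-command form, for example rename_foo_posix < main.c > main.c.tmp.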
In-place editing can be achieved either by writing to a temporary file then renaming it:
sed 's/\(^\|[^a-zA-Z0-9_]\)foo\([^0-9A-Za-z_]\|$\)/\1bar\2/g' < main.c > main.c.tmp
mv main.c.tmp main.c
or by using the
sponge command (provided by the
moreutils package on Debian-based systems):
sed 's/\(^\|[^a-zA-Z0-9_]\)foo\([^0-9A-Za-z_]\|$\)/\1bar\2/g' < main.c | sponge main.c
This method does not by itself allow multiple files to be processed, but it can be placed within a for loop or used in combination with the find command if that is a requirement.
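A for loop of the kind mentioned might look like the following sketch, which applies the temporary-file technique to each file named on the command line. The function name rename_in_files and the .tmp suffix are assumptions; the script relies on \| being accepted by the installed sed, as in the commands above.

```shell
# rename_in_files FILE...
# Rewrite each file via a temporary file, replacing it only if sed succeeds.
rename_in_files() {
    for f in "$@"; do
        tmp="$f.tmp"
        sed 's/\(^\|[^a-zA-Z0-9_]\)foo\([^0-9A-Za-z_]\|$\)/\1bar\2/g' \
            < "$f" > "$tmp" && mv "$tmp" "$f"
    done
}
```

It would be invoked as rename_in_files *.c *.h. Because each file is replaced by a newly created one, attributes of the original such as its permissions are not necessarily preserved.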
The same outcome can be achieved using the following Perl script:
s/\bfoo\b/bar/g
This is identical to the GNU sed script presented above, and has the same meaning. For testing it should be invoked with the -p and -e options and the name of one of the files:
perl -p -e 's/\bfoo\b/bar/g' main.c
The -e option specifies that the next argument is the script to be executed. The -p option causes repeated execution of that script, once for each line of input, with each line printed after the script has run. The result will be written to stdout. If it is found to be satisfactory then the -i option can be added to enable in-place editing and the script applied to all files:
perl -pi -e 's/\bfoo\b/bar/g' *.c *.h