Perl Replace Substring

Let's say you've got a string in Perl and you want to replace just a part of it. How can you do it?

Replacing a Fixed-Length Section of a String in Perl



If the bit you want to replace is a fixed range of characters in the string, you can use the substr function.

substr has the form substr EXPRESSION, OFFSET, LENGTH and is usually used for returning substrings from within larger strings. For instance:

my $string = "Tea is good with milk.";

print substr($string, 4, 2);




is




In the program above, we've extracted a 2-character substring from the larger string, starting at character offset 4 (offsets count from zero, so offset 4 is the fifth character). The interesting thing is that you can also use substr on the left-hand side of an assignment, giving the substring a new value of any length:

my $string = "Tea is good with milk.";

substr($string, 4, 2) = "might be";

print $string;




Tea might be good with milk.




We've replaced the 2-character substring "is" with a longer substring, "might be".
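substr also accepts the replacement text as a fourth argument, which does the same job without an assignment. Here is a minimal sketch reusing the string above; the variable name $removed is only for illustration:

```perl
use strict;
use warnings;

my $string = "Tea is good with milk.";

# Four-argument form: replace 2 characters starting at offset 4.
# The return value is the text that was replaced.
my $removed = substr($string, 4, 2, "might be");

print "$removed\n";   # is
print "$string\n";    # Tea might be good with milk.
```

The assignment form and the four-argument form change the string in the same way; the four-argument form is handy when you also want to keep the text you removed.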



Replacing a Substring of Unknown Position Or Length in Perl



If we don't know the exact position and length of the substring we want to replace, we need to use a regular expression instead of substr.

For example, suppose we want to replace all occurrences of "tea" with "coffee".

my $string = "Tea is good with milk.";

$string =~ s/tea/coffee/ig;

print $string;




coffee is good with milk.




Here we've used the s/FIND/REPLACE/ syntax to replace all occurrences of "tea" with "coffee".

We've used two flags here. i specifies that the match should be case-insensitive. g says that we want to replace all occurrences of the specified string or regular expression, not just the first one. In this case g is superfluous since there is only one "tea" in the string.
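To see what g actually changes, it helps to use a string with more than one match. The following is a small sketch with a made-up sample string, applying the same substitution with and without g:

```perl
use strict;
use warnings;

my $first_only = "tea, Tea and TEA";
my $all        = $first_only;

# Without g, only the first match is replaced.
$first_only =~ s/tea/coffee/i;

# With g, every match is replaced.
$all =~ s/tea/coffee/ig;

print "$first_only\n";   # coffee, Tea and TEA
print "$all\n";          # coffee, coffee and coffee
```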

Let's look at a slightly more complex example. We'll search for all words in some multi-line text beginning with "c" and replace them with the word "badger".

my $string = q|
Here is some multi-line text.
We could replace all words in this text
beginning with "c" with the word "badger".
Or of course we could do something 
completely different.
|;

$string =~ s/\bc\w*\b/badger/ig;

print $string;




Here is some multi-line text.
We badger replace all words in this text
beginning with "badger" with the word "badger".
Or of badger we badger do something
badger different.




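Here we've used the incredibly useful q| ... | quoting operator, which behaves like single quotes and can span multiple lines. The pipe characters can be swapped for almost any other delimiter, such as q{ ... } or q( ... ).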
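As a quick illustration of the delimiter choice, this sketch writes the same string three ways and checks that the results are identical. The variable names are just for the example:

```perl
use strict;
use warnings;

# The same single-quoted string written with three different delimiters.
my $pipes  = q|Tea is good with milk.|;
my $braces = q{Tea is good with milk.};
my $bangs  = q!Tea is good with milk.!;

print "same\n" if $pipes eq $braces and $braces eq $bangs;

# Like single quotes, q does not interpolate variables.
my $drink = "tea";
print q{I like $drink}, "\n";   # I like $drink
```

Picking a delimiter that doesn't occur in the text saves you from escaping it.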

In the regular expression we've used \b to match word boundaries and \w to match word characters (letters, digits and underscores).
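The word boundary matters more than it might seem. As a rough sketch with a made-up sentence, compare the pattern with and without the leading \b:

```perl
use strict;
use warnings;

my $with    = "We could replace cats.";
my $without = $with;

# \b anchors the match to the start of a word.
$with =~ s/\bc\w*\b/badger/ig;

# Without \b, a "c" in the middle of a word matches too.
$without =~ s/c\w*/badger/ig;

print "$with\n";      # We badger replace badger.
print "$without\n";   # We badger replabadger badger.
```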

Sometimes you want to use the thing you've matched in the replacement text itself. For instance, what if we want to find all the numbers in a string and surround them with double quotes?


my $string = q|
In 1956, Hungarians rose up against
their Russian-influenced communist
government. Stories abound of 13-year-old
girls throwing molotov cocktails at tanks.
For 4 days, the revolution seemed to have
succeeded.
|;


$string =~ s/(\d+)/"$1"/ig;

print $string;




In "1956", Hungarians rose up against
their Russian-influenced communist
government. Stories abound of "13"-year-old
girls throwing molotov cocktails at tanks.
For "4" days, the revolution seemed to have
succeeded.




We've used \d to match digits in the string; the \d+ specifies that we want to match one or more consecutive digits. We surround the regular expression with parentheses (...) to capture the matched string. The captured string can then be used in the replacement expression, where it can be referred to as $1.
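A pattern can contain more than one pair of parentheses, in which case the captures are numbered $1, $2, $3 and so on from left to right. Here is a small sketch that reorders the parts of a date; the sample string is invented for the example:

```perl
use strict;
use warnings;

my $date = "Due on 2024-03-15.";

# Three capture groups: year, month, day.
# $1, $2 and $3 refer to them in order.
$date =~ s/(\d{4})-(\d{2})-(\d{2})/$3.$2.$1/;

print "$date\n";   # Due on 15.03.2024.
```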

Complex Substring Replacement In Perl



Sometimes we want to do some kind of complex processing on the text we want to replace in order to determine what to replace it with. In this case, we can use the return value of a subroutine as our replacement text. The following example could have been written more simply without the use of a subroutine, but using this code as a template you can make extremely complex replacements in a body of text.

use strict;
use warnings;

my $string = q|
In 1956, Hungarians rose up against
their Russian-influenced communist
government. Stories abound of 13-year-old
girls throwing molotov cocktails at tanks.
For 4 days, the revolution seemed to have
succeeded.
|;

sub fix_numbers 
{
    # Get the subroutine's argument.
    my $arg = shift;
    
    # Hash of stuff we want to replace.
    my %replace = (
        "13" => "thirteen",
        "4" => "four",
    );
    
    # See if there's a replacement
    # for the given text.
    my $text = $replace{$arg};
    
    if(defined($text)) 
    {
        # Got a replacement; return it.
        return $text;
    }
    
    # No replacement; return original text.
    return $arg;
}

sub main
{
    $string =~ s/(\d+)/fix_numbers($1)/eig;

    print $string;
}

main();





In 1956, Hungarians rose up against
their Russian-influenced communist
government. Stories abound of thirteen-year-old
girls throwing molotov cocktails at tanks.
For four days, the revolution seemed to have
succeeded.




Note that we've had to add the e flag on the end of the s/// expression in order to run the replacement text as Perl code. As before we've captured the text we want to replace, but this time we've passed it to the subroutine as an argument.
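The code in the replacement doesn't have to be a subroutine call; with e, any Perl expression will do. The sketch below, using made-up strings, doubles every number and then shows what the same replacement does without e:

```perl
use strict;
use warnings;

my $order = "2 teas and 3 coffees";

# With e, the replacement is evaluated as a Perl expression,
# so each number is doubled.
$order =~ s/(\d+)/$1 * 2/eg;

print "$order\n";     # 4 teas and 6 coffees

# Without e, the same replacement is treated as a plain string.
my $literal = "2 teas";
$literal =~ s/(\d+)/$1 * 2/g;

print "$literal\n";   # 2 * 2 teas
```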

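Notice also that both lines of the hash containing the replacement texts end with a comma, including the last one. This is not a mistake; a trailing comma is allowed in Perl lists and is widely considered good practice, because you can add or reorder entries later without having to fix up the line before.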