Perl Array: Quick Guide to Arrays In Perl

Declaring Perl Arrays
Joining Arrays with join()
Creating arrays with split()
Dumping (Viewing) Perl Arrays
Accessing Array Elements
Perl Push: Adding Items to Arrays With Push and Unshift
Perl Pop: Removing Items from Arrays with Pop and Shift
The Size of an Array
Perl Foreach: Iterating Over Arrays
Arrays of Arrays
Add an Element to an Array of Arrays
Map and Grep: Two Useful Array Functions
Sorting Arrays




Declaring Perl Arrays



You can declare an array in Perl like this:

my @items;




You can declare and initialize the array at the same time like this:

# Declare an array and initialize with three strings.
my @items = ('apple', 'banana', 'orange');





Arrays in Perl can happily contain mixed data types, sort of. Perl treats every array element as a scalar, so even though we've put numbers, strings and a reference into the array below, the array really only contains the scalar datatype.

# Declare an empty hash
my %hash;

# Declare an array containing a number, a string, 
# a reference to a hash and another string.
my @items = (42, 'wombat', \%hash, 'hello');
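One way to check that every element really is a scalar is the built-in ref() function, which returns an empty string for an ordinary scalar and a type name for a reference. A minimal sketch, with the @kinds variable introduced purely for illustration:

```perl
use strict;
use warnings;

my %hash;
my @items = (42, 'wombat', \%hash, 'hello');

# ref() returns '' for ordinary scalars and a type name for references.
my @kinds = map { ref($_) || 'plain scalar' } @items;

print join(', ', @kinds), "\n";

# prints:
# plain scalar, plain scalar, HASH, plain scalar
```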




Joining Arrays with join()



A good way to join the elements of an array together is by using join().

The first argument to this function is the string you want to use to join the array elements together. The second argument is the array.


my @animals = ('dog', 'cat', 'rabbit');

print join(', ', @animals);

# prints:
# dog, cat, rabbit




join() can be particularly useful for forming SQL queries. It's also good for printing arrays, as above.
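As a hedged illustration of the SQL use case, join() can build the list of placeholders for an IN clause. The table and column names below are made up for the example, and nothing here talks to a real database.

```perl
use strict;
use warnings;

my @ids = (3, 7, 12);

# Build one '?' placeholder per value, then join them with commas.
my $placeholders = join(', ', ('?') x @ids);

my $sql = "SELECT name FROM animals WHERE id IN ($placeholders)";

print $sql, "\n";

# prints:
# SELECT name FROM animals WHERE id IN (?, ?, ?)
```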

Creating arrays with split()



The converse of join() is split(). Split allows you to split up a string into array elements.

The first argument is the string to split on (it can also, very usefully, be a regular expression), and the second argument is the string to split. An array is returned.


my $text = "No money and no hair";

# Split on whitespace (' ' is a special case that
# splits on any run of whitespace).
my @items = split ' ', $text;

use Data::Dumper;
print Dumper(@items);

# prints:
# $VAR1 = 'No';
# $VAR2 = 'money';
# $VAR3 = 'and';
# $VAR4 = 'no';
# $VAR5 = 'hair';
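Since the first argument can be a regular expression, split() copes well with messy input. The sketch below uses a made-up input string, and also shows the optional third argument, which limits the number of fields returned.

```perl
use strict;
use warnings;

my $line = "apple ,banana,  orange , kiwi";

# Split on a comma with optional surrounding whitespace.
my @fruits = split /\s*,\s*/, $line;

print join('|', @fruits), "\n";

# prints:
# apple|banana|orange|kiwi

# An optional third argument limits the number of fields returned.
my @first_two = split /\s*,\s*/, $line, 2;

print join('|', @first_two), "\n";

# prints:
# apple|banana,  orange , kiwi
```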






Dumping (Viewing) Perl Arrays



The quickest way to view the contents of an array in Perl is to use the Data::Dumper module.

use Data::Dumper;   
print Dumper(@items);




... displays:

$VAR1 = 42;
$VAR2 = 'wombat';
$VAR3 = {};
$VAR4 = 'hello';
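If you pass a reference to the array instead, Data::Dumper prints it as a single structure rather than one $VARn per element, which is often easier to read. A small sketch, with the array contents assumed for illustration:

```perl
use strict;
use warnings;
use Data::Dumper;

my @items = (42, 'wombat', 'hello');

# The backslash takes a reference, so the whole array is dumped as $VAR1.
my $dump = Dumper(\@items);

print $dump;
```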




Another good way of viewing an array is to join the elements with join() to form a string, then print the string.


my @fruits = ('apple', 'orange', 'banana');

print join(', ', @fruits);

# prints:
# apple, orange, banana





Accessing Array Elements


Accessing array elements in Perl is somewhat similar to array usage in other languages. The twist is that, since each element of the array is a scalar datatype, you need to access it using the '$' scalar prefix, not the '@' array prefix.

Don't forget that the first array element has index 0.

So in a three-element array, the last element is at index 2.

A quick way to get the index of the last element in a Perl array is to use $#array_name notation. For instance, if you have an array called @test, the index of the last element is $#test.

my @items = (42, 'wombat', 'hello');
    
print $items[0]; # prints '42'
print $items[1]; # prints 'wombat'
print $items[2]; # prints 'hello'
print $items[$#items]; # Also prints 'hello' (the last element)
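Two related shortcuts are worth a quick sketch: negative indexes, which count back from the end of the array, and slices, which fetch several elements at once. The @pair variable is just an illustrative name.

```perl
use strict;
use warnings;

my @items = (42, 'wombat', 'hello');

# Negative indexes count back from the end of the array.
print $items[-1], "\n"; # prints 'hello'
print $items[-2], "\n"; # prints 'wombat'

# A slice fetches several elements at once. Because the result
# is a list, it takes the '@' prefix rather than '$'.
my @pair = @items[0, 2];

print join(', ', @pair), "\n"; # prints '42, hello'
```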




Perl Push: Adding Items to Arrays With Push and Unshift



You can add items to the end of an array with push(), or to the start of the array with unshift():

my @fruit = ('apple', 'orange', 'banana');

# Add an item to the end of the array
push @fruit, 'pear';

# @fruit now contains 'apple', 'orange', 'banana', 'pear'

# Add an item to the start of the array
unshift @fruit, 'grape';

# @fruit now contains 
# 'grape', 'apple', 'orange', 'banana', 'pear'





Why the unwieldy name 'unshift'? Because as we'll see later, unshift() is the opposite of shift().

You can also use push() and unshift() to add one array to another. For example:

my @fruit = ('apple', 'orange', 'banana');

my @more = ('kiwi', 'lemon');

# Add @more to the end of @fruit
push @fruit, @more;

# @fruit now contains 
# 'apple','orange', 'banana', 'kiwi', 'lemon'
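unshift() works the same way at the other end. In this sketch, which reuses the same example data, the added elements keep their original order.

```perl
use strict;
use warnings;

my @fruit = ('apple', 'orange', 'banana');
my @more = ('kiwi', 'lemon');

# Add @more to the start of @fruit.
unshift @fruit, @more;

print join(', ', @fruit), "\n";

# prints:
# kiwi, lemon, apple, orange, banana
```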




Perl Pop: Removing Items from Arrays with Pop and Shift



You can remove items one at a time from the end of an array using pop(), which also returns the removed item, in case you need it.

If you want to remove an item from the front of an array, you need shift(), which is often used inside functions for getting at the function arguments.


# Declare and initialize array.
my @fruit = ('apple', 'orange', 'banana');

# Remove the last item ('banana') and save it.
my $end = pop @fruit; 

print $end; # prints 'banana'.

# Remove first item off the array and save it.
my $start = shift @fruit;

print $start; # prints 'apple'.

# The array now contains only 'orange'.
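Here is a rough sketch of the function-argument use of shift() mentioned above. The greet() sub and its parameters are hypothetical, invented only for this example.

```perl
use strict;
use warnings;

# Inside a sub, shift with no argument operates on @_,
# the array holding the arguments passed to the sub.
sub greet {
    my $name = shift;
    my $greeting = shift;
    return "$greeting, $name!";
}

my $message = greet('Alice', 'Hello');

print $message, "\n";

# prints:
# Hello, Alice!
```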





The Size of an Array



The simplest way to find the size of an array is to use the array like a scalar. An array used in a scalar context returns its size. What does that mean?


my @fruit = ('apple', 'orange', 'banana');

my $size = @fruit;

print $size; # prints '3'





By assigning the array to a scalar ($size), we get the size of the array. If you want to make the scalar context explicit to avoid confusion, you can use the scalar() function.


my @fruit = ('apple', 'orange', 'banana');

print scalar(@fruit); # prints '3'
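Scalar context turns up in conditions too, which gives a compact test for an empty array. A short sketch, with @empty added purely for illustration:

```perl
use strict;
use warnings;

my @fruit = ('apple', 'orange', 'banana');
my @empty;

# A condition is a scalar (boolean) context, so an array
# is true when it has elements and false when it is empty.
print "fruit has items\n" if @fruit;
print "empty has no items\n" unless @empty;

# The last index plus one is also the size.
my $size = $#fruit + 1;
print $size, "\n"; # prints '3'

# For an empty array, the last index is -1.
print $#empty, "\n"; # prints '-1'
```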




Perl Foreach: Iterating Over Arrays



The simplest way to iterate over an array is with foreach.


# Declare and initialize array.
my @fruit = ('apple', 'orange', 'banana');

foreach my $fruit(@fruit) {
    print $fruit, "\n";
}

# Prints:
# apple
# orange
# banana





Actually you don't even need the $fruit variable here. If you don't specify a variable, Perl sets $_ to each of the array elements in turn.


# Declare and initialize array.
my @fruit = ('apple', 'orange', 'banana');

foreach(@fruit) {
    print $_, "\n";
}

# Prints:
# apple
# orange
# banana




Since print() implicitly prints $_ if you don't tell it what else to print, you can even do this:


# Declare and initialize array.
my @fruit = ('apple', 'orange', 'banana');

foreach(@fruit) {
    print;
    print "\n";
}





... although some might consider that a bit confusing.

If you modify the variable you're using while looping, the actual array element is modified, like this:


# Declare and initialize array.
my @fruit = ('apple', 'orange', 'banana');

foreach my $fruit(@fruit) {
    $fruit = 'hello';
}

foreach my $fruit(@fruit) {
    print $fruit, "\n";
}

# Prints:
# hello
# hello
# hello





Very useful for modifying the contents of an array item by item.
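When you also need the position of each element, one common approach is to loop over the indexes instead, using the range operator together with $#fruit. A minimal sketch (the @lines array is only there to collect the output):

```perl
use strict;
use warnings;

my @fruit = ('apple', 'orange', 'banana');
my @lines;

# 0 .. $#fruit produces every valid index of @fruit.
foreach my $i (0 .. $#fruit) {
    push @lines, "$i: $fruit[$i]";
}

print join("\n", @lines), "\n";

# prints:
# 0: apple
# 1: orange
# 2: banana
```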


Arrays of Arrays



You can declare an array of arrays in Perl like this:


my @stuff = (
    ['apple', 'orange', 'banana'],
    [42, 1234],
    ['some', 'more', 'stuff', 'here'],
    ['assorted', 100, 0.7, 'hello']
);

print $stuff[0][1]; # prints 'orange'
print $stuff[2][3]; # prints 'here'
print $stuff[3][2]; # prints 0.7





What's going on here? The key to understanding this is that square brackets [] actually create a reference to an anonymous array, and a reference is a scalar.


# Declare a reference to an array.
my $arrayRef = ['apple', 'orange', 'banana'];

# Use the reference.

print $arrayRef->[1]; # prints 'orange'

# Note the -> operator. This 'dereferences' the reference, 
# allowing us to access the thing it points to (an array!)






So @stuff in the above example is actually just an array of (scalar) references. So $stuff[0] is a reference to an array containing fruits, and so on.

So the first array index after $stuff, e.g. $stuff[0], tells us that we want the first reference in the @stuff array -- a reference to an array of fruits.

The second index then gets us at the elements within the array pointed to by the reference; $stuff[0][2] is 'banana'.

It sounds worse than it is! You can of course build arrays with many dimensions, to your heart's content.

Part of the upshot of all this is that an alternative way of initializing @stuff would be to declare each sub-array separately, then take a reference to each one using the backslash operator.


my @fruits = ('apple', 'orange', 'banana');
my @numbers = (42, 1234);
my @strings = ('some', 'more', 'stuff', 'here');
my @assorted = ('assorted', 100, 0.7, 'hello');

# Build an array from references to the above arrays.
# (The same trick works with references to hashes!)

my @stuff = (\@fruits, \@numbers, \@strings, \@assorted);

print $stuff[2][0]; # prints 'some'





Add an Element to an Array of Arrays



What if we want to add an element to an array of arrays? There are lots of ways of doing this, but here's one.

Remember that each element of an array of arrays is a reference to an array. For instance, in the above example, $stuff[0] is a reference to the first array in the @stuff array of arrays.

We can use this reference with push, pop, grep and so on if we first dereference it to get at the actual array. To dereference an array reference in Perl, surround the reference with {} brackets and prefix it with the array symbol @.

In a nutshell:

# Declare and initialize array of arrays.
my @stuff = (
    ['apple', 'orange', 'banana'],
    [42, 1234],
    ['some', 'more', 'stuff', 'here'],
    ['assorted', 100, 0.7, 'hello']
);

# Append to the first array
# We need to dereference the reference to get
# at the array before using push.
push @{$stuff[0]}, 'kiwi';

print join(', ', @{$stuff[0]});
print "\n"; # print newline.

# prints:
# apple, orange, banana, kiwi

# Append to the fourth array.
push @{$stuff[3]}, 'there';

print join(', ', @{$stuff[3]});

# prints:
# assorted, 100, 0.7, hello, there





The {} brackets around the reference are only necessary because of the [] index brackets on the end of the array name. If we just had a simple reference by itself, we could have simply stuck a '@' onto the start of it.



my @fruit = ('orange', 'apple');

my $fruitRef = \@fruit;

push @$fruitRef, 'kiwi';

print join(', ', @fruit);

# prints: 
# orange, apple, kiwi
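The same @{...} syntax is what you need to loop over an array of arrays, because each element you get back is a reference. A sketch with a cut-down version of the earlier data (the @rows array is only there to collect the output):

```perl
use strict;
use warnings;

my @stuff = (
    ['apple', 'orange', 'banana'],
    [42, 1234],
);

my @rows;

# Each element of @stuff is an array reference, so the loop
# variable must be turned back into an array with @{ }.
foreach my $row (@stuff) {
    push @rows, join(', ', @{$row});
}

print "$_\n" foreach @rows;

# prints:
# apple, orange, banana
# 42, 1234
```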





There is an easier way to add an element to an array of arrays in Perl. Since you can create an array element in Perl merely by assigning a value to it (don't try this in C++!) you can just do this:


# Declare and initialize array of arrays.
my @stuff = (
    ['apple', 'orange', 'banana'],
    [42, 1234],
    ['some', 'more', 'stuff', 'here'],
    ['assorted', 100, 0.7, 'hello']
);

$stuff[1][2] = 666;

print join(', ', @{$stuff[1]});

# prints:
# 42, 1234, 666





If you were to assign to an element far past the end of the array by mistake, the intervening elements would also be created, but with undefined values. Then if you tried to print out the array, you'd end up with uninitialized value warnings (if you have use warnings at the top of your script, which you should!).

For that reason, push is often handier.
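The gap-filling behaviour is easy to see in a small sketch. The array here is a made-up example, and the map call simply substitutes the word 'undef' for each undefined element so that printing doesn't trigger warnings.

```perl
use strict;
use warnings;

my @numbers = (42, 1234);

# Assign well past the end of the array.
$numbers[5] = 666;

print scalar(@numbers), "\n"; # prints '6'

# Elements 2 to 4 now read as undef.
my @shown = map { defined $_ ? $_ : 'undef' } @numbers;

print join(', ', @shown), "\n";

# prints:
# 42, 1234, undef, undef, undef, 666
```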


Map and Grep: Two Useful Array Functions



Two useful functions to know about are map() and grep().

grep allows you to filter certain elements from an array.

map allows you to change all elements of an array.

Let's look at some examples.


# Create an array containing some small numbers
# and some big numbers.
my @numbers = (1, 2, 300, 4, 500);

# Get only the small numbers
my @small = grep($_ < 10, @numbers);

print join(', ', @small);

# prints:
# 1, 2, 4





Grep iterates over the array items, setting $_ to each one in turn. It returns only those items that fulfill the condition specified via its first argument.

Note that the original array remains unchanged. Unless of course, you assign some value to $_. But don't do that with grep() -- it's horribly confusing.

Very usefully, you can also use regular expressions to grep.


my @text = ('catfish', 'badger', 'dogfish', 'aardvark');

my @fish = grep(/fish/, @text);

print join(', ', @fish);

# prints: catfish, dogfish





map(), on the other hand, loops through each item of the array, setting $_ to each element, and returns a new array consisting of whatever you set in the first argument.

An example will make things clearer.



my @names = ('Bob', 'Pete', 'Sue', 'Alice');

my @prefixed = map("NAME: $_", @names);

print join(', ', @prefixed);

# Prints:
# NAME: Bob, NAME: Pete, NAME: Sue, NAME: Alice
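Both functions also accept a code block in place of the expression, which many people find easier to read once the condition gets longer. A brief sketch, with the @short and @upper names invented for the example:

```perl
use strict;
use warnings;

my @names = ('Bob', 'Pete', 'Sue', 'Alice');

# Block form: note there is no comma after the closing brace.
my @short = grep { length($_) == 3 } @names;
my @upper = map { uc($_) } @names;

print join(', ', @short), "\n"; # prints 'Bob, Sue'
print join(', ', @upper), "\n"; # prints 'BOB, PETE, SUE, ALICE'

# @names itself is unchanged, since nothing was assigned to $_.
print join(', ', @names), "\n"; # prints 'Bob, Pete, Sue, Alice'
```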





You can use map as a quick, sometimes confusing way to iterate through the array and change each element.

To actually change the original array, assign a value to $_.

One great use of this is to add quotes to each element of an array.


my @names = ('Bob', 'Pete', 'Sue', 'Alice');

# Note that we are assigning a new value to $_
map($_ = "'$_'", @names);

print join(', ', @names);

# prints: 
# 'Bob', 'Pete', 'Sue', 'Alice'





The combination of join() and map() can be very useful in creating quoted lists for SQL queries.

Sorting Arrays



You can sort arrays in Perl with the sort() function. If you just want to sort ordinary English strings in alphabetical order, you're in luck. All you have to do is this:

my @strings = ("cat", "dog", "aardvark", "lizard");

my @sorted = sort @strings;

# @sorted is now sorted. @strings remains the same, of course.

print join(', ', @sorted);

# prints:
# aardvark, cat, dog, lizard




If you want to sort numbers or complex data structures, you need to do something a bit more fancy.

Basically you need to supply sort() with a code block that specifies how the sort should be performed. The code block will be supplied with the elements of the array two at a time in the variables $a and $b. You need to return a negative number (such as -1) from the block if $a is less than $b (sorts before $b in the array, in other words), a positive number (such as 1) if $a is more than $b, and zero if they're equal.

Since Perl automatically returns the value of the last expression evaluated in a code block, there's no need to use the word 'return' most of the time.

The <=> (for numbers) and cmp (for strings) operators are very useful here.

A few examples will elucidate the matter.


my @strings = ("cat", "dog", "aardvark", "zebra");

# Sort in alphabetical order (default).
my @sorted = sort { $a cmp $b } @strings;

print join(', ', @sorted);

# prints:
# aardvark, cat, dog, zebra





Note, there's no comma after the {$a cmp $b}. It's not an argument, but something weirder.

To sort in reverse alphabetical order, all you have to do is switch $a and $b.

my @strings = ("cat", "dog", "aardvark", "zebra");

# Sort in reverse alphabetical order 
my @sorted = sort { $b cmp $a } @strings;

print join(', ', @sorted);

# prints:
# zebra, dog, cat, aardvark




cmp is no good for numbers. For numbers, you must use <=> (or roll your own!)


my @numbers = (1, 6, 4, 3, 10);

# Sort in numerical order
my @sorted = sort { $a <=> $b } @numbers;

print join(', ', @sorted);

# prints:
# 1, 3, 4, 6, 10





Let's sort some strings in order of string length, longest first.

my @strings = ("cat", "dog", "aardvark", "zebra");

# Sort in order of string length, longest first
my @sorted = sort { length($b) <=> length($a) } @strings;

print join(', ', @sorted);

# prints:
# aardvark, zebra, cat, dog




And finally, just for something really custom, let's sort some strings in order of how many e's they contain!

We'll use the regular expression /e/g to return the number of e's in each string.


my @strings = ('hello', 'eeeee', 'six', 'engine', 're-enter');

# Sort in order of number of e's, least e's first.
my @sorted = sort { 

    # Get number of matches for 'e'
    # in $a and $b
    my @inA = $a =~ /e/g;
    my @inB = $b =~ /e/g;

    # Compare numbers of matches.
    # return is just for clarity.
    return scalar(@inA) <=> scalar(@inB);

} @strings;

print join(', ', @sorted);

# prints:
# six, hello, engine, re-enter, eeeee
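The same approach extends to the complex data structures mentioned earlier. The sketch below sorts an array of hash references on two keys; the data and the field names (name, age) are made up for the example.

```perl
use strict;
use warnings;

# An array of hash references (the data is made up).
my @people = (
    { name => 'Sue',   age => 31 },
    { name => 'Bob',   age => 25 },
    { name => 'Alice', age => 31 },
);

# Sort by age, oldest first. Where ages are equal, <=> returns 0
# and 'or' falls through to comparing the names alphabetically.
my @sorted = sort {
    $b->{age} <=> $a->{age}
        or
    $a->{name} cmp $b->{name}
} @people;

my @names = map { $_->{name} } @sorted;

print join(', ', @names), "\n";

# prints:
# Alice, Sue, Bob
```

Chaining comparisons with 'or' works because a result of zero counts as false, so the next comparison only runs when the previous one found the two items equal.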








A Final Note ...



If there's anything else you'd like to know about Perl arrays, feel free to leave a question in the comments.

If you spot a mistake here, please let me know!