PHP Regular Expressions

In this tutorial we will learn about regular expressions in PHP and how a programmer can apply them to pattern matching.

What is a Regular Expression?

Regular expressions, popularly referred to as ‘regex’ or ‘RegExp’, are specially formatted text strings that are used to find patterns in text. They are very useful for processing and manipulating text. For instance, you can use them to check whether data typed by a user, such as a name, phone number, or email address, is in the correct format, or to find and replace a matching string in text content.

The table below shows some of the most common built-in PHP pattern matching functions.

| Function | What it Does |
| --- | --- |
| preg_match() | Performs a regular expression match. |
| preg_match_all() | Performs a global regular expression match. |
| preg_replace() | Performs a regular expression search and replace. |
| preg_grep() | Returns the elements of the input array that match the pattern. |
| preg_split() | Splits a string into substrings using a regular expression. |
| preg_quote() | Quotes regular expression characters found within a string. |

The PHP preg_match() function stops searching as soon as it finds the first match. The preg_match_all() function, on the other hand, keeps searching until it reaches the end of the string and returns all possible matches instead of halting at the first one.

The syntax of a Regular Expression

The syntax of a regular expression includes special characters. The characters that have a special meaning inside a regular expression are: . * ? + [ ] ( ) { } ^ $ | \

To match one of these characters literally, you must escape it with a backslash. For instance, to match “.”, you need to write \. in the pattern. All other characters take their literal meaning automatically.
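
The effect of escaping can be sketched as follows. This is a minimal illustration, not a definitive recipe: the sample text and variable names are assumptions, and preg_quote() is used to escape the special characters automatically.

```php
<?php
// Assumed sample text: one literal "1.5" and one look-alike "1x5".
$text = "Version 1.5 replaces version 1x5.";

// Unescaped dot: "." matches any character, so "1x5" matches as well.
$loose = preg_match_all('/1.5/', $text, $looseMatches);

// Escaped dot: "\." matches only a literal dot.
$strict = preg_match_all('/1\.5/', $text, $strictMatches);

// preg_quote() adds the backslashes for us; the second argument
// also escapes the "/" delimiter if it occurs in the input.
$quoted = preg_quote("1.5", "/");
$viaQuote = preg_match_all("/" . $quoted . "/", $text, $quoteMatches);

echo "Unescaped: $loose, escaped: $strict, preg_quote: $viaQuote";
```

Using preg_quote() is usually the safer choice when the text to match comes from user input rather than from the programmer.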

Character classes

Square brackets enclosing a set of characters are referred to as a character class. A character class matches a single character from the list of characters inside the brackets.

You can also define negated character classes, which match any character except the ones inside the brackets. To define a negated character class, place the caret symbol (^) immediately after the opening bracket, for example [^abc].

It is also possible to define a range of characters by placing the hyphen (-) character inside a character class, such as [0-9]. Below are examples of character classes.

| RegExp | What it Does |
| --- | --- |
| [abc] | Matches any one of the characters a, b, or c. |
| [^abc] | Matches any one character other than a, b, or c. |
| [a-z] | Matches any one character from lowercase a to lowercase z. |
| [A-Z] | Matches any one character from uppercase A to uppercase Z. |
| [a-zA-Z] | Matches any one character from lowercase a to uppercase Z, i.e. any single letter. |
| [0-9] | Matches a single digit between 0 and 9. |
| [a-z0-9] | Matches a single character between a and z or between 0 and 9. |

The example below demonstrates how to determine whether a pattern is present in a string or not by using regular expressions and the PHP preg_match() function.

<?php
// [kf] matches either "k" or "f", so the pattern matches "cake" or "cafe"
$pattern = "/ca[kf]e/";
$text = "He was eating cake in the cafe.";
if (preg_match($pattern, $text)) {
    echo "Match found!";
} else {
    echo "Match not found.";
}

Similarly, all matches in a string can be found by applying the preg_match_all() function:

<?php
$pattern = "/ca[kf]e/";
$text = "He was eating cake in the cafe.";
// preg_match_all() returns the number of matches and fills $array with them
$matches = preg_match_all($pattern, $text, $array);
echo $matches . " matches were found.";
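
Ranges and negated classes can be tried out in the same way. The following is a small sketch under assumed sample data (the order codes are invented for illustration): it extracts an uppercase letter followed by a digit, and then strips every character that is not a digit.

```php
<?php
// Assumed sample text for illustration only.
$text = "Order A7 shipped, order b9 pending, order C3 cancelled.";

// [A-Z][0-9]: one uppercase letter followed by one digit.
$count = preg_match_all('/[A-Z][0-9]/', $text, $codes);

// [^0-9]: any character that is NOT a digit; replace each with nothing.
$digitsOnly = preg_replace('/[^0-9]/', '', $text);

echo $count . " codes found: " . implode(", ", $codes[0]) . "<br>";
echo "Digits only: " . $digitsOnly;
```

Note that "b9" is not counted, because the lowercase b falls outside the [A-Z] range.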

Predefined Character Classes

Certain character classes, such as whitespace, letters, and digits, are used so often that shortcut names are defined for them. The table below lists some of the predefined character classes:

| Shortcut | What it Does |
| --- | --- |
| . | Matches any single character except a newline (\n). |
| \d | Matches any digit character. Similar to [0-9]. |
| \D | Matches any non-digit character. Similar to [^0-9]. |
| \s | Matches any whitespace character (space, tab, newline, or carriage return). Similar to [ \t\n\r]. |
| \w | Matches any word character (a letter, a digit, or an underscore). Similar to [a-zA-Z_0-9]. |
| \W | Matches any non-word character. Similar to [^a-zA-Z_0-9]. |

The example below shows how to find whitespace in a string and replace it with a hyphen character by applying a regular expression and the PHP preg_replace() function.

<?php
$pattern = "/\s/";
$replacement = "-";
$text = "Earth revolves around\nthe\tSun";
// Replace spaces, newlines and tabs
echo preg_replace($pattern, $replacement, $text);
echo "<br>";
// Replace only spaces
echo str_replace(" ", "-", $text);
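
The other shortcuts work the same way. As a hedged sketch, assuming an invented contact string, the snippet below collects runs of digits with \d, runs of word characters with \w, and removes every non-digit with \D.

```php
<?php
// Assumed sample input for illustration only.
$text = "Call 555-0199 or email jo_doe99@example.com today";

// \d+ : one or more digits in a row.
preg_match_all('/\d+/', $text, $digits);

// \w+ : one or more word characters (letters, digits, underscore).
preg_match_all('/\w+/', $text, $words);

// \D : any non-digit character; removing them leaves only the digits.
$onlyDigits = preg_replace('/\D/', '', $text);

echo implode(", ", $digits[0]) . "<br>";
echo count($words[0]) . " words found<br>";
echo $onlyDigits;
```

Because the underscore counts as a word character, "jo_doe99" is returned as a single word, while "@" and "." split the rest of the address.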

Repetition Quantifiers

Quantifiers specify how many times a character in a regular expression should match. The table below shows several ways to quantify a specific pattern.

| RegExp | What it Does |
| --- | --- |
| p+ | Matches one or more occurrences of the letter p. |
| p* | Matches zero or more occurrences of the letter p. |
| p? | Matches zero or one occurrence of the letter p. |
| p{2} | Matches exactly two occurrences of the letter p. |
| p{2,3} | Matches at least two occurrences of the letter p, but not more than three. |
| p{2,} | Matches two or more occurrences of the letter p. |
| p{0,3} | Matches at most three occurrences of the letter p. Write the lower bound explicitly, because older PCRE versions treat p{,3} as literal text. |

The regular expression in the example below splits the string at a comma, a sequence of commas, whitespace, or any combination of these by using the PHP preg_split() function.

<?php
$pattern = "/[\s,]+/";
$text = "My favourite colors are red, green and blue";
$parts = preg_split($pattern, $text);

// Loop through the parts array and display the substrings
foreach ($parts as $part) {
    echo $part . "<br>";
}
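
The counted quantifiers from the table can be illustrated with a short sketch. The sample strings and variable names below are assumptions chosen only to show {n}, {n,}, and ? in action.

```php
<?php
// Assumed sample data for illustration only.
$ids   = "Ids: 7, 42, 2024";
$phone = "Call 555-0199, not 55-019.";

// \d{2,} : runs of two or more digits (the single "7" is skipped).
preg_match_all('/\d{2,}/', $ids, $long);

// \d{3}-\d{4} : exactly three digits, a hyphen, exactly four digits.
preg_match_all('/\d{3}-\d{4}/', $phone, $numbers);

// u? : the letter u is optional, so both spellings match.
$american = preg_match('/colou?r/', 'color');
$british  = preg_match('/colou?r/', 'colour');

echo implode(", ", $long[0]) . "<br>";
echo implode(", ", $numbers[0]) . "<br>";
echo "color: $american, colour: $british";
```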

Position Anchors

In some cases you may want to match only at the start or end of a line, string, or word. To achieve this, you can use anchors. The two most common anchors are the caret (^), which marks the start of a string, and the dollar sign ($), which marks the end of a string.

| RegExp | What it Does |
| --- | --- |
| ^p | Matches the letter p at the start of a line or string. |
| p$ | Matches the letter p at the end of a line or string. |

The following example uses a regular expression and the preg_grep() function to display only those names from the names array that begin with the letter “J”.

<?php
$pattern = "/^J/";
$names = array("John Carter", "Clark Kent", "John Rambo");
$matches = preg_grep($pattern, $names);
// Loop through the matches array and display the matched names
foreach ($matches as $match) {
    echo $match . "<br>";
}
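
Using both anchors together is a common way to check that an entire string fits a pattern rather than just part of it. A minimal sketch, assuming a few invented inputs that should consist of digits only:

```php
<?php
// Assumed sample inputs for illustration only.
$inputs = array("12345", "12345abc", "abc12345");

// ^\d+$ : the string must start, contain only digits, and then end.
$valid = array();
foreach ($inputs as $input) {
    $valid[] = preg_match('/^\d+$/', $input) === 1;
}

// \d+$ alone only requires the string to END with digits.
$endsWithDigits = preg_match('/\d+$/', "abc12345") === 1;

foreach ($inputs as $i => $input) {
    echo $input . ": " . ($valid[$i] ? "valid" : "invalid") . "<br>";
}
```

Without the anchors, all three inputs would match \d+, because each of them contains at least one digit somewhere.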

Pattern Modifiers

A pattern modifier lets a developer control how a pattern match is performed. Pattern modifiers are placed directly after the closing delimiter of the regular expression. If you want to search for a pattern in a case-insensitive way, for example, you should use the i modifier, as in /pattern/i. The table below lists some of the most commonly used pattern modifiers.

| Modifier | What it Does |
| --- | --- |
| i | Makes the match case-insensitive. |
| m | Changes the behavior of ^ and $ to match against a newline boundary (i.e. start or end of each line within a multiline string), instead of a string boundary. |
| g | Performs a global match, i.e. finds all occurrences. This modifier is not available in PHP; use preg_match_all() instead. |
| o | Evaluates the expression only once. This modifier is not available in PHP. |
| s | Changes the behavior of . (dot) to match all characters, including newlines. |
| x | Allows you to use whitespace and comments within a regular expression for clarity. |

A global case-insensitive search can be carried out by combining the i modifier with the PHP preg_match_all() function.
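
A minimal sketch of such a search is shown here. The pattern and sample sentence are illustrative assumptions; the same text is searched with and without the i modifier so the difference is visible.

```php
<?php
// Assumed sample text for illustration only.
$text = "Color red is more visible than color blue in daylight.";

// With the i modifier, "Color" and "color" both match.
$insensitive = preg_match_all("/color/i", $text, $allMatches);

// Without it, only the lowercase "color" matches.
$sensitive = preg_match_all("/color/", $text, $exactMatches);

echo $insensitive . " matches were found with /i, ";
echo $sensitive . " without it.";
```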

Word Boundaries

A word boundary character (\b) helps you find words that start or end with a pattern. For instance, the regexp /\bcar/ matches words that start with the pattern car: it matches cartoon, carrot, and cart, but not Oscar.

In the same way, the regexp /car\b/ matches words that end with the pattern car: it matches scar, supercar, and Oscar, but not cart. Similarly, /\bcar\b/ matches words that both start and end with the pattern car, so it matches only the word car. The example below displays the words starting with car in bold:

<?php
$pattern = '/\bcar\w*/';
$replacement = '<b>$0</b>';
$text = 'Words beginning with car: cart, carrot, cartoon
words ending with car: scar, oscar, supercar';
echo preg_replace($pattern, $replacement, $text);
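
The three boundary variants described above can be compared side by side. This is a sketch under an assumed sample sentence, using preg_match_all() to list what each pattern picks up.

```php
<?php
// Assumed sample text for illustration only.
$text = "A car, a cart, a scar and an Oscar met a supercar.";

// \bcar\b : only the whole word "car".
preg_match_all('/\bcar\b/', $text, $whole);

// \bcar\w* : words that START with "car".
preg_match_all('/\bcar\w*/', $text, $starts);

// \w*car\b : words that END with "car".
preg_match_all('/\w*car\b/', $text, $ends);

echo "Whole word: " . implode(", ", $whole[0]) . "<br>";
echo "Starting with car: " . implode(", ", $starts[0]) . "<br>";
echo "Ending with car: " . implode(", ", $ends[0]);
```

The word car itself appears in all three lists, since it both starts and ends with the pattern.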