.?$|^(..+?)\\1+$
Regular Expression to find a prime number
[Primes Home][Home]
And, so, prime numbers have defeated us in finding a pattern. There may be an equation that allows us to find prime numbers, but we have not found it. But we can determine if we have a prime number with the regular expression of:
.?$|^(..+?)\\1+$ |
As a cybersecurity analyst, what’s one of the best skills you can pick up? Well, knowing how to use regular expressions is certainly well up there in terms of one of the best skills, as they allow us to search for complex patterns. So, let’s do a fun example of finding prime numbers.
And, so, prime numbers have defeated us in finding a pattern. There may be an equation that allows us to find prime numbers, but we have not found it. So, there’s a cool little RegEx trick that can find them:
import re import sys n=13 if (len(sys.argv)>1): n=int(sys.argv[1]) matches=re.match(r'^.?$|^(..+?)\1+$', '1'*n) print("Use the RegEx of ^.?$|^(..+?)\\1+$\n") if matches: print(f"{n} is not a prime. Match details: {matches.groups()}") else: print(f"{n} is a prime")
If we run we get:
$ python regprime.py 97 Use the RegEx of ^.?$|^(..+?)\1+$ 97 is a prime $ python regprime.py 98 Use the RegEx of ^.?$|^(..+?)\1+$ 98 is not a prime. Match details: ('11',) $ python regprime.py 99 Use the RegEx of ^.?$|^(..+?)\1+$ 99 is not a prime. Match details: ('111',) $ python regprime.py 997 Use the RegEx of ^.?$|^(..+?)\1+$ 997 is prime.
We can see that it works for these three values, and where 97 and 997 are prime numbers and 999 is not (as it is 3x3x3x37). A non-prime number is known as a composite number.
First, we create a string of 1s. For a value of 13, we get a string with 13 1's:
1111111111111
Now:
^.?$|^(..+?)\1+$
has two parts:
^.?$
and
^(..+?)\1+$
If the first part fails, we move on to the second part.
The expression
^.?$
with match strings that have ‘0’ or ‘1’ characters. As we are generating a number of 1’s, this will pass and go onto the next part.
With:
^(..+?)\1+$
the search will try to match two characters (for the number 2), and then four characters (for the number 4), and so on. This match all the even numbers (and which will not be a prime number — apart from 2). Next, it will try for matches of three characters. For three characters, such as “111”, then “111111”, and so on. If these match, then the number has three as a factor. After this, it will match multiples of 4, multiples of 5, and so on. If it cannot find a match, it will then fail, or get to the end of the possible multiples.
If that fails, it will try to first match 3 characters (corresponds to the number 3), then 6 characters (corresponds to the number 6), then 9 characters, then 12 characters, and so on. This means that it will try to match multiples of 3. If that fails, it proceeds to try to match multiples of 4, then if that fails it will try to match multiples of 5, and so on, until the number whose multiple it tries to match is the length of the string (failure case) or there is a successful match (success case).