Inactive
Classification: Semantic Change Syntactic Change
Human Validated: KW
Title: RegExp Atomic Groups & Possessive Quantifiers
Authors: Justin Ridgewell
Never presented; engines are not interested in the feature, mainly because it doesn’t solve backtracking for most users
Last Presented: None
Stage Upgrades:
Stage 1: NA Stage 2: NA
Stage 2.7: NA
Stage 3: NA
Stage 4: NA
Last Commit: 2020-01-26
Topics: regex concurrency
Keywords: regex concurrency atomic
GitHub Link: https://github.com/jridgewell/proposal-regexp-atomic-and-possessive
GitHub Note Link: None

Proposal Description:

proposal-regexp-atomic-and-possessive

A proposal to add Atomic Groups and Possessive Quantifiers to RegExps.

Atomic Groups

Atomic groups, denoted by (?>TEXT_TO_MATCH), prevent backtracking once the group matches and “locks”. Even if backtracking into an alternate branch could allow a full match, the atomic group will not unlock.

const atomic = /a(?>bc|b)c/;
 
// (?>bc) matches, so atomic group "locks".
// Then, the "c" after the group matches.
atomic.test('abcc'); // => true
 
// (?>bc) matches, so atomic group "locks".
// The rest of the expression no longer matches.
// Because the atomic group is locked, it will **not** backtrack
// to the alternate (b) branch.
atomic.test('abc'); // => false
 
// Atomic groups do not capture their match.
atomic.exec('abcc'); // => ['abcc']

Possessive Quantifiers

Possessive quantifiers, denoted by using + after a quantifier, prevent backtracking once the token has matched. Even if backtracking would allow a full match, the possessive quantifier will not allow it.

const possessive = /^(a|.b)++$/;
 
// (a) matches
possessive.test('a'); // => true
// (a) matches, then (.b) matches
possessive.test('abb'); // => true
 
// (a) matches, (a) matches, but we can't match "b"
// Because it's possessive, it will not **not** backtrack
// to the (.b) branch.
possessive.test('aab'); // => false

Champions

Status

Current Stage: 0

Motivation

JavaScript’s Regular Expressions are great, but developers can unintentionally write one with “catastrophic backtracking”. These regexs can completely freeze the program, and is especially dangerous for servers.

Atomic Groups and Possessive Quantifiers allow developers to write understandable performance guarantees. Since they do not allow backtracking, they’ll never suffer exponential execution time.

const regex = /^(a|[ab])*$/;
 
function test(length) {
  const str = 'a'.repeat(length) + 'c';
  const now = performance.now();
  regex.test(str);
  return performance.now() - now;
}
 
for (let i = 0; i < 50; i++) {
  console.log({ length: i, time: test(i) });
}
String LengthTime (seconds)
…snip0.00
190.01
200.01
210.03
220.06
230.11
240.23
250.44
260.89
271.79
283.65
297.32
3014.29
3128.45
3257.94
33114.02
34233.58
I got bored waiting…