Stage 1
Classification: API Change
Human Validated: KW
Title: Iterator unique
Authors: Michael Ficarra
Champions: Michael Ficarra
Last Presented: February 2024
Stage Upgrades:
Stage 1: 2024-02-08
Stage 2: NA
Stage 2.7: NA
Stage 3: NA
Stage 4: NA
Last Commit: 2024-02-08
Topics: iterators others
Keywords: iterator unique values duplicates
GitHub Link: https://github.com/tc39/proposal-iterator-unique
GitHub Note Link: https://github.com/tc39/notes/blob/HEAD/meetings/2024-02/feb-6.md#iterator-unique-for-stage-1
Proposal Description:
Iterator Unique
A TC39 proposal to produce an iterator of unique values from any iterator.
Stage: 1
See the January 2024 presentation to committee.
motivation
Removing duplicates from any kind of collection is a common operation. It’s not very easy to do for iterators.
For some iterables, you can do something like the following:
let uniques = new Set(iter).values();
This has a few downsides, though:
- Consumes the whole iterator before producing any results.
- Doesn’t work for infinite iterators.
- Yields 0 when the underlying iterator yields -0.
- Can’t yield both 0 and -0.
- Doesn’t work for non-iterable iterators.
A better solution is much harder to write and doesn’t work well with chaining, as it requires a bunch of side variables for state.
let objSeen = new WeakSet,
primSeen = new Set,
seenNegZero = false;
let uniques = iter.filter(e => {
if (e === 0 && 1/e < 0) {
if (seenNegZero) return false;
seenNegZero = true;
return true;
}
let seen = Object(e) === e ? objSeen : primSeen;
if (seen.has(e)) return false;
seen.add(e);
return true;
});
Worse, when you want to unique by some applied transform, you need to surround the filter
with tupling and untupling map
s.
let uniques = iter
.map(e => [e, f(e)])
.filter([, criteria] => { ... })
.map(([e]) => e);
chosen solution
Iterator.prototype.uniqBy
which takes an optional mapper.
let uniques = iter.uniqBy();
let uniques = iter.uniqBy(obj => obj.field);
design space
- still no good solution for composite keys, but that’s an unsolved problem generally
- mapper? comparator? both? neither?
- separate methods or combined with optional params?
- would mapper be passed an index?
- naming:
distinct
is also common
prior art
other languages
language | library | simple API | with comparator | with mapping |
---|---|---|---|---|
Clojure | core | distinct | — | — |
Elm | List.Extra | unique | — | uniqueBy |
Haskell | Data.List | nub | nubBy | — |
Java | Stream | distinct | — | — |
Kotlin | Sequence | distinct | — | distinctBy |
.NET | System.Linq | Distinct | Distinct , DistinctBy | DistinctBy |
PHP | array | array_unique | — | — |
Python | more-itertools | unique_everseen | — | unique_everseen |
Ruby | Enumerable | uniq | — | uniq |
Rust | Iterator | — | — | — |
Scala | Seq | distinct | — | distinctBy |
Shell | GNU coreutils | uniq | — | — |
Swift | Sequence | — | — | — |
JS libraries
library | simple API | with comparator | with mapping |
---|---|---|---|
extra-iterable | unique | unique | unique |
iter-ops | distinct | — | distinct |
iter-tools | distinct | — | distinct |
itertools-ts | distinct | — | distinct |
Lodash / Underscore | uniq | uniqWith | uniqBy |
Ramda | uniq | uniqWith | uniqBy |
sequency | distinct | — | distinctBy |
wu | unique | — | — |