How to find the number of unique elements in a stream?
So, I’ve been reading about streaming algorithms. Seems like the journey to streaming algorithms(aka Algorithms for Big Data) starts with the Flajolet-Martin algorithm.
The Problem We are given a sequence <u_0, u_1, u_2, u_3, … , u_n> of n elements. Each u_i comes from the fixed set U of some finite size. We want to see how many elements are unique.
A simple Python code like below can solve the problem, if we have required memory.