Count Distinct Elements in Every Window of Size K

Difficulty Level Medium
Frequently asked in Accolite Amazon Microsoft
Array Hash Sliding WindowViews 2734

Subsets are something which we have been dealing with for some time now. In the last episode, we covered the number of subsets we could make with distinct even numbers. This time we count distinct elements in every window of size K.

Section-1

About the problem.

Given an unsorted array of integers. We are to find out the distinct number of integers each subset has. Let us check the statement out with a sample test case.

Given:

Array:{1,2,3,4,4,2,3}

Size of subsets:4

The possible subsets can be:

{1,2,3,4}

{2,3,4,4}

{3,4,4,2}

{4,4,2,3}

The unique numbers in each of the above subsets are:

4,3,2 and 3

Section-2

There are multiple ways in which we can approach the problem. But I will be discussing the one that is the best out of the rest. As I scaled through the subsets my audiences would have noticed a pattern. We let go of the first element from the last window as we add a fresh element to maintain the size of k for each window.

Let us run through the process one by one

  • Firstly, run a loop of k to iterate through all the elements of the first window
  • Maintain a count of all the elements with a HashMap
  • The key is the element with the value being the number of times the element occurs in the HashMap
  • As we go forward we remove the first element out of the HashMap
  • If the element occurs only once we simply remove it
  • Elsewise we decrease the number of occurrences of the element in the HashMap
  • We check the fresh element
  • If it exists beforehand. We add to its occurrence.
  • If it does not exist we add it to the HashMap
  • In each iteration, we add the count of unique elements to the ArrayList or we may print it out.
  • The count of unique elements is the size of the HashMap

Here is an image to give an overview of the process.

We have here an array of size 6 and k has been given as 3.The red highlights are subsets as we move our sliding window.

Count Distinct Elements in Every Window of Size K
Showing subsets for window of size 3

Java Code for Count Distinct Elements in Every Window of Size K

// "static void main" must be defined in a public class.
public class Main 
{
    public static void countDist(int[] arr,int k)
    {
        HashMap<Integer,Integer>store=new HashMap<Integer,Integer>();
        List<Integer>ans=new ArrayList<Integer>();
        for(int i=0;i<k;i++)
        {
            if(!store.containsKey(arr[i]))
                store.put(arr[i],1);
            else
                store.put(arr[i],store.get(arr[i])+1);
        }
        ans.add(store.size());
        for(int i=k;i<arr.length;i++)
        {
            if(store.get(arr[i-k])==1)
                store.remove(arr[i-k]);
            else
                store.put(arr[i-k],store.get(arr[i-k])-1);
            if(!store.containsKey(arr[i]))
                store.put(arr[i],1);
            else
                store.put(arr[i],store.get(arr[i])+1);
            ans.add(store.size());
        }
        for(int i=0;i<ans.size();i++)
            System.out.print(ans.get(i)+" ");
    }
    public static void main(String[] args) 
    {
        int arr[] = { 1, 2, 1, 3, 4, 2, 3 }; 
        int k = 4; 
        countDist(arr, k); 
    }
}

As we switch from Java to C++. We change from the Collection framework to the STL. This means that we will use an unordered map instead of the HashMap and vector instead of ArrayList.

Let us now switch to the code for C++.

C++ Code for Count Distinct Elements in Every Window of Size K

#include<bits/stdc++.h>
using namespace std;
void countDist(int arr[],int k,int size)
{
        unordered_map<int,int>store;
        vector<int>ans;
        for(int i=0;i<k;i++)
        {
            if(store.find(arr[i])==store.end())
                store[arr[i]]=1;
            else
                store[arr[i]]=store[arr[i]]+1;
        }
        ans.push_back(store.size());
        for(int i=k;i<size;i++)
        {
            if(store[arr[i-k]]==1)
                store.erase(arr[i-k]);
            else
                store[arr[i-k]]=store[arr[i-k]]-1;
            if(store.find(arr[i])!=store.end())
                store[arr[i]]=1;
            else
                store[arr[i]]=store[arr[i]]+1;
            ans.push_back(store.size());
        }
        for(int i=0;i<ans.size();i++)
            cout<<ans[i]<<" ";
}
int main() 
{ 
    int arr[] = { 1, 2, 1, 3, 4, 2, 3 }; 
    int size = sizeof(arr) / sizeof(arr[0]); 
    int k = 4; 
    countDist(arr, k, size); 
  
    return 0; 
}
3 4 4 3

Complexity Analysis for Count Distinct Elements in Every Window of Size K

Time Complexity=O(n)

Space Complexity=O(k)

How?

About Time Complexity

  • When we search for the elements
  • The first loop runs for k elements
  • The loop we then run picks the first element and then takes into account the last element
  • In this way, we have picked up all the elements at least once
  • This makes the time complexity as O(n)

About Space Complexity

  • The HashMap we have takes only k elements at a time
  • The HashMap takes the first k elements in the first loop
  • As we move forward we get rid of the stale elements and add fresh elements
  • In case of repetition, there may be lesser elements
  • there will be k elements in case the elements are all distinct

References

Translate ยป