Minimum sum of squares of character counts in a given string after removing k characters

Difficulty Level Medium
Frequently asked in Amazon
Queue StringViews 1955

Problem Statement

The problem “Minimum sum of squares of character counts in a given string after removing k characters” states that you are given a string containing lower case characters only. You are allowed to remove k characters from the string such that in the remaining string the sum of squares of frequency of each character is minimum. Find that minimum value and print it.

Examples

str = "aabcd"
k = 2
3

 

str = "aabbcc"
k = 2
6

 

Minimum sum of squares of character counts in a given string after removing k characters

str = "aaaaabbxd"
k = 1
22

Naive Approach

It is always optimal to remove the character with maximum frequency.

  1. In the given original string, count the frequency of all the characters and store it in freq array.
  2. Sort the freq array in decreasing order.
  3. Repeat step 4 ‘k’ times.
  4. Reduce the frequency of first element in the freq array by 1 and sort the freq array again.
  5. The minimum value is the sum of squares of values present in the freq array.

Time Complexity = O(n + k * c)
Space Complexity = O(1)
where c is constant which is equals to the number of unique characters in the string.

Code to find Minimum sum of squares of character counts in a given string after removing k characters

Java Code

import java.util.Arrays;
import java.util.Collections;
class MinimumSumOfSquaresOfCharacterCountsInAGivenStringAfterRemovingKCharacters {
    private static int minSumSquares(String str, int k) {
        char sArr[] = str.toCharArray();
        int minSum = 0;

        // create a freq array, with initial value as 0
        Integer freq[] = new Integer[26];
        for (int i = 0; i < 26; i++) {
            freq[i] = 0;
        }

        // traverse in the original string and calculate the frequency of every character
        for (int i = 0; i < sArr.length; i++) {
            freq[sArr[i] - 'a']++;
        }

        // sort the array in descending order
        Arrays.sort(freq, Collections.reverseOrder());

        // remove k characters
        for (int i = 0; i < k; i++) {
            // remove the character with maximum freq and reduce the freq
            freq[0]--;
            // sort the array in descending order
            Arrays.sort(freq, Collections.reverseOrder());
        }

        // minSum is the sum of squares of all the elements in freq array
        for (int i = 0; i < 26; i++) {
            minSum += (freq[i] * freq[i]);
        }

        return minSum;
    }

    public static void main(String[] args) {
        // Example 1
        String str1 = "aabcd";
        int k1 = 2;
        System.out.println(minSumSquares(str1, k1));

        // Example 2
        String str2 = "aabbcc";
        int k2 = 2;
        System.out.println(minSumSquares(str2, k2));

        // Example 3
        String str3 = "aaaaabbxd";
        int k3 = 1;
        System.out.println(minSumSquares(str3, k3));
    }
}
3
6
22

C++ Code

#include<bits/stdc++.h> 
using namespace std; 

int minSumSquares(string str, int k) {
    int minSum = 0;
    
    // create a freq array, with initial value as 0
    int freq[26];
    for (int i = 0; i < 26; i++) {
        freq[i] = 0;
    }
    
    // traverse in the original string and calculate the frequency of every character
    for (int i = 0; i < str.length(); i++) {
        freq[str[i] - 'a']++;
    }
    
    // sort the array in descending order
    sort(freq, freq + 26, greater<int>());
    
    // remove k characters
    for (int i = 0; i < k; i++) {
        // remove the character with maximum freq and reduce the freq
        freq[0]--;
        // sort the array in descending order
        sort(freq, freq + 26, greater<int>());
    }
    
    // minSum is the sum of squares of all the elements in freq array
    for (int i = 0; i < 26; i++) {
        minSum += (freq[i] * freq[i]);
    }
    
    return minSum;
}

int main() {
    // Example 1
    string str1 = "aabcd";
    int k1 = 2;
    cout<<minSumSquares(str1, k1)<<endl;

    // Example 2
    string str2 = "aabbcc";
    int k2 = 2;
    cout<<minSumSquares(str2, k2)<<endl;

    // Example 3
    string str3 = "aaaaabbxd";
    int k3 = 1;
    cout<<minSumSquares(str3, k3)<<endl;
    
    return 0;
}
3
6
22

Optimal Approach

The goal of removing the character with maximum frequency can be optimally achieved using a Priority Queue.

  1. Create a Priority Queue q, which is a max heap.
  2. In the original string, count the frequency of all the characters and make a priority queue of them.
  3. Repeat step 4 ‘k’ times.
  4. Remove the top of priority queue, reduce it by 1 and if it is not zero push it back to the priority queue.
  5. The minimum value is the sum of squares of values present in the priority queue.

Time Complexity = O(n + k * log(c))
Space Complexity = O(c)
where c is constant which is equals to the number of unique characters in the string. Since we are removing k elements from the priority queue. Insertion and removal from priority queue take O(log N) time. And thus a factor logN came. The space complexity is because we created a frequency array but since the size of frequency array is independent of the input. And thus space complexity is constant.

Code to find Minimum sum of squares of character counts in a given string after removing k characters

Java code

import java.util.Collections;
import java.util.PriorityQueue;

class MinimumSumOfSquaresOfCharacterCountsInAGivenStringAfterRemovingKCharacters {
    private static int minSumSquares(String str, int k) {
        char[] sArr = str.toCharArray();
        // create a frequency array
        int freq[] = new int[26];

        // traverse in the original string and count the frequency of each character
        for (int i = 0; i < sArr.length; i++) {
            freq[sArr[i] - 'a']++;
        }

        // create a priority queue and store the frequency of each character
        PriorityQueue<Integer> priorityQueue = new PriorityQueue<>(Collections.reverseOrder());
        for (int i = 0; i < 26; i++) {
            if (freq[i] != 0)
                priorityQueue.add(freq[i]);
        }

        // remove k characters
        for (int i = 0; i < k; i++) {
            // remove the character with maximum frequency
            int curr = priorityQueue.poll();
            // reduce the frequency
            curr--;
            if (curr != 0)
                priorityQueue.add(curr);
        }

        int minSum = 0;

        // min sum is the sum of squares of elements in priority queue
        for (Integer i : priorityQueue) {
            minSum += (i * i);
        }

        return minSum;
    }

    public static void main(String[] args) {
        // Example 1
        String str1 = "aabcd";
        int k1 = 2;
        System.out.println(minSumSquares(str1, k1));

        // Example 2
        String str2 = "aabbcc";
        int k2 = 2;
        System.out.println(minSumSquares(str2, k2));

        // Example 3
        String str3 = "aaaaabbxd";
        int k3 = 1;
        System.out.println(minSumSquares(str3, k3));
    }
}
3
6
22

C++ Code

#include<bits/stdc++.h> 
using namespace std; 

int minSumSquares(string str, int k) {
    int minSum = 0;
    
    // create a freq array, with initial value as 0
    int freq[26];
    for (int i = 0; i < 26; i++) {
        freq[i] = 0;
    }
    
    // traverse in the original string and calculate the frequency of every character
    for (int i = 0; i < str.length(); i++) {
        freq[str[i] - 'a']++;
    }
    
    // create a priority queue and store the frequency of each character
    priority_queue<int> pq;
    for (int i = 0; i < 26; i++) {
        if (freq[i] != 0) {
            pq.push(freq[i]);
        }
    }
    
    // remove k characters
    for (int i = 0; i < k; i++) {
        // remove the character with maximum frequency
        int curr = pq.top();
        pq.pop();
        // reduce the frequency
        curr--;
        if (curr != 0) {
            pq.push(curr);
        }
    }
    
    // min sum is the sum of squares of elements in priority queue
    while (!pq.empty()) {
        minSum += (pq.top() * pq.top());
        pq.pop();
    }
    
    return minSum;
}

int main() {
    // Example 1
    string str1 = "aabcd";
    int k1 = 2;
    cout<<minSumSquares(str1, k1)<<endl;

    // Example 2
    string str2 = "aabbcc";
    int k2 = 2;
    cout<<minSumSquares(str2, k2)<<endl;

    // Example 3
    string str3 = "aaaaabbxd";
    int k3 = 1;
    cout<<minSumSquares(str3, k3)<<endl;
    
    return 0;
}
3
6
22
Translate »