How to divide a set into two subsets such that difference between the sum of numbers in two sets is minimal? How to divide a set into two subsets such that difference between the sum of numbers in two sets is minimal? arrays arrays

How to divide a set into two subsets such that difference between the sum of numbers in two sets is minimal?


The decision version of the problem you are describing is an NP-complete problem and it is called the partition problem. There are a number of approximations which provide, in many cases, optimal or, at least, good enough solutions.

The simple algorithm you described is a way playground kids would pick teams. This greedy algorithm performs remarkably well if the numbers in the set are of similar orders of magnitude.

The article The Easiest Hardest Problem, by American Scientist, gives an excellent analysis of the problem. You should go through and read it!


No, that doesn't work. There is no polynomial time solution (unless P=NP). The best you can do is just look at all different subsets. Have a look at the subset sum problem.

Consider the list [0, 1, 5, 6]. You will claim {0, 5} and {1, 6}, when the best answer is actually {0, 1, 5} and {6}.


No, Your algorithm is wrong. Your algo follows a greedy approach.I implemented your approach and it failed over this test case:(You may try here)

A greedy algorithm:

#include<bits/stdc++.h>#define rep(i,_n) for(int i=0;i<_n;i++)using namespace std;#define MXN 55int a[MXN];int main() {    //code    int t,n,c;    cin>>t;    while(t--){        cin>>n;        rep(i,n) cin>>a[i];        sort(a, a+n);        reverse(a, a+n);        ll sum1 = 0, sum2 = 0;        rep(i,n){            cout<<a[i]<<endl;            if(sum1<=sum2)                 sum1 += a[i];             else                 sum2 += a[i];         }        cout<<abs(sum1-sum2)<<endl;    }    return 0;}

Test case:

18 16 14 13 13 12 10 9 3Wrong Ans: 616 13 10 914 13 12 3Correct Ans: 016 13 13 314 12 10 9

The reason greedy algorithm fails is that it does not consider cases when taking a larger element in current larger sum set and later a much smaller in the larger sum set may result much better results. It always try to minimize current difference without exploring or knowing further possibilities, while in a correct solution you might include an element in a larger set and include a much smaller element later to compensate this difference, same as in above test case.

Correct Solution:

To understand the solution, you will need to understand all below problems in order:

My Code (Same logic as this):

#include<bits/stdc++.h>#define rep(i,_n) for(int i=0;i<_n;i++)using namespace std;#define MXN 55int arr[MXN];int dp[MXN][MXN*MXN];int main() {    //code    int t,N,c;    cin>>t;    while(t--){        rep(i,MXN) fill(dp[i], dp[i]+MXN*MXN, 0);        cin>>N;        rep(i,N) cin>>arr[i];        int sum = accumulate(arr, arr+N, 0);        dp[0][0] = 1;        for(int i=1; i<=N; i++)            for(int j=sum; j>=0; j--)                dp[i][j] |= (dp[i-1][j] | (j>=arr[i-1] ? dp[i-1][j-arr[i-1]] : 0));        int res = sum;        for(int i=0; i<=sum/2; i++)            if(dp[N][i]) res = min(res, abs(i - (sum-i)));        cout<<res<<endl;    }    return 0;}