Thursday, July 02, 2015

Tell if a List is a subset of another List

Problem:
two sorted list, please tell if one contains another.



Solutions:
1. Cheating: using Java's containsAll function in List
2. Seriously, this concept should be applied to other type of "list", such as array.



Analysis:
Do not mix this up with substring problem.  In substring, all elements have to be adjacent. in this problem, it only cares if one contains another.



2.1 sort the list the first, then from there, find each element in subset in list that is searched on.


 public static boolean isSubset(List<Integer> first, List<Integer> subset){
       if (subset==null)return true;
       if (first ==null) return false;

       int sCursor = 0;
       int fCursor = 0;
      
       while(sCursor<subset.size() && fCursor<first.size()){
        if (first.get(fCursor)>( subset.get(sCursor)))
          return false;
        else if (first.get(fCursor)<( subset.get(sCursor)))
          fCursor++;
 //then there is an equal.
        else{
//try to work on next element in subset,for that element, keep increasing/searching 
//cursor in super set to find a match
          sCursor++;

        }
       }
       if (sCursor==subset.size())
         return true;
      
       return false;
 }

2.2 how to do it if lists are not sorted.
Well we can sort them and then use 2.1. But let's be  realistic and have a real solution.
Off top of my head, we can loop through subset, for each one of elements, search it in searched list. O(m8n)==O(n^2) time. Naive but should work. I like the sorting idea.

A little bit improving, let's sort the list to be searched, then keeping the first loop of subset, using binary search on sorted searched list.  It'd be O(nlogn), still worse then 2.1, but better than first solution in 2.2

3. Resort to a Hashset
put elements in list to be searched to hashset, do not bother to store the duplicated values.
then loop through searching list to see if every element is in the hashset. return false if not in hashset, otherwise, after looping, return true. still linear time, but more space. This does not require the list to be sorted. and it more common solution to this kind of problems.


public static boolean isSubsetCommon2(int[] first, int[] subset){
    if (subset==null)return true;
    if (first ==null) return false;
    Set<Integer> superSet = new HashSet<Integer>();
    //this can be wirtten as a for loop to add array elements to set
    
    for(int i=0;i<first.length;i++){
     superSet.add(first[i]);
    }
    for (Integer o:subset){
     if (!superSet.contains(o))return false;
    }
    return true;
}

4. brute force approach
loop collection to search in
    loog collection of subset
        if there  is a match, break
        if reaching to end of subset, means no matching, return false
reach this point, return true

O(m*n)

No comments: