Longest Subsequence of a String

Longest Subsequence of a string is the longest chronological sequence of the characters in a string. For example, if “abcdggabcdeggabcdefghifklmabc” is a string then the longest subsequence of that string is, “abcdggabcdeggabcdefghifklmabc”. The following code shows how to find it using java.

public class LongestSubsequence {
    StringBuffer sub_sequ = new StringBuffer();
    List<String> sub_sequences = new ArrayList<>();
    int index = 0;
    public List<String> sequence(String s) {
        for (int i = 0; i < s.length(); i++) {
            for (char j = 'a'; j <= 'z'; j++, i++) {
                if (s.charAt(i) == j) {
                    sub_sequ.append(s.charAt(i));
                } else {
                    break;
                }
            }
            sub_sequences.add(index, sub_sequ.toString());
            sub_sequ.delete(0,sub_sequ.length());
        }
        sub_sequences.removeAll(Arrays.asList("",null));
        return sub_sequences;
    }
    

    public String longestSubsequence(List<String> sub_sequences) {
       Collections.sort(sub_sequences);
       return sub_sequences.get(sub_sequences.size()-1);
    } 
}

Quicksort with Lomuto partition scheme

An efficient sorting Algorithm, which follows the divide-and-conquer strategy. Partitioning is the center of attention in this algorithm. Partitioning is the process of dividing, and sorting a list of data. It is done through a selection of pivot. A pivot is the key value, using which the list of data is divided into two major sections in each iteration. But in Quicksort, data is sorted based on the pivot value.

Partitioning: reorder the array so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation. – quicksort

Lomuto partitioning scheme is one of the two well partitioning scheme. Another one is Haore partitioning scheme. In this article we will see how the quicksort can be implemented using the Lomuto partitioning scheme.

def quicksort(A, lo, hi):
 if(lo < hi):
       p = partition(A, lo, hi)
       quicksort(A, lo, p-1)
       quicksort(A, p+1, hi)
 return A
 

def partition(A, lo, hi):
 pivot = A[hi]
 i = lo
 for j in range(lo, hi):
       if(A[i]<pivot):
              temp = A[j]
              A[j] = A[i]
              A[i] = temp
              i = i+1
 temp = A[hi]
 A[hi] = A[i]
 A[i] = temp
 return i 

A = [0,32,11,0,67,34]
print(quicksort(A, 0, len(A)-1))
output:
[0, 0, 11, 32, 34, 67]       

Time complexity :

Worst-case : O(n^2); Average-case: O(nlogn); Best-case: O(nlogn)

Why is “dynamic programming?” & the key notes

Dynamic programming is the most influential paradigm in algorithm. It solves multiple key problems of computing.

Why is dynamic programming?

  • To reduce the function call (improves the time, and space complexity).
  • Breaks the whole problem into sub-problems (:only if the sub problems are similar to the whole problem, i.e. problems like Fibonacci series where each function call does the same thing).
  • Re-using the subproblems.
  • Anything else!?

Key Points for Dynamic Programming:

  • Recursion
  • Memoization (using the sub-solutions)

Further reading : dynamic programming

Bubble Sort in Moderation

Bubble sort is one of the lousy kinds of sorting approach available. But still it can perform very decent in moderation. The basic kind of bubble sort is:

def bubbleSort(A, n):
    for i in range(0, n-1):
        for j in range(0, n-1):
            if(A[j] > A[j+1]):
                temp = A[j]
                A[j] = A[j+1]
                A[j+1] = temp
    return A

If we look at the algorithm above we find that the second loop lapping over the iterated part of sorted elements, which increases the time complexity, so we can reduce the time complexity by reducing the iteration.

def bubbleSort(A, n):
    for i in range(0, n-1):
        for j in range(0, n-i-1):
            if(A[j] > A[j+1]):
                temp = A[j]
                A[j] = A[j+1]
                A[j+1] = temp
    return A

Comb Sort

In each run “gap” based sorting takes place in the algorithm. In each run, the sorting goes through the list over a distance of (0 to gap/shrink) on two elements at a time. To note, shrink value is a gap factor, which defines the gap between two elements in each iteration. And the gap between two elements will never be < 1, or we can say the list, or array becomes sorted when the gap value is equal to 1.

import math

def combsort(A):
    gap = len(A)
    shrink = 1.3 // most ideal gap shrink factor
    _sorted = False

    while(_sorted is False):
        gap = math.floor(gap/shrink)
        if(gap>1):
            _sorted = False
        else:
            gap = 1
            _sorted = True 
        i = 0
        while(i + gap < len(A)):             if(A[i]>A[i+1]):
                temp = A[i]
                A[i] = A[i+1]
                A[i+1] = temp
                _sorted = False
            i = i + 1
    return A

Graphical Demonstration of the Comb sort:

Shell Sort:

Shell Sort is almost the same as the comb sort except the shrink value, which is 2 instead of 1.3.

Binary Search

Binary Search follows the divide-and-conquer strategy. It’s effective when we need to find a key in a Large data set.

How: In a binary search operation the data set is divided into two parts in the first step. Then if the target value is lower then mid value, all the upper values will be ignored. It will ignore the same lower values in case of the opposite. And this procedure will keep going until there is one value. If it’s the target value then function will return its position, otherwise -1 will be returned.  By the way, the data set has to be sorted before we can apply this algorithm.

Binary Search algorithm can either be iterative, or recursive. Let’s see how they look in Java.

The recursive approach:

recursive_binary

The iterative approach:

iterative_binary

Pseudocode, and Performance Analysis

Pseudocode

Algorithm is a set of instructions written before the actual code is produced. A code is not the identical representation of the algorithm. It comes in the form of instructions which indeed resembles a programming language. It is known as the pseudocode. For an example let’s look at the pseudocode of linear search:

Algorithm linearSearch(A, n, t) 
// A is an array of size n, and t is the key to find
{
      for i:= 0 to n-1 do {
            if(A[i] := t)then
                  break;
      }
      if(i>n)then
         return -1;
      else
         return i;
}

The above pseudocode shows that a target value (t) is searched through an array (A) of size n.

Performance Analysis

The designed algorithm should be able to answer few questions, such as,

  • Does it solve the problem?
  • Cost of the solution in terms of time, and space.

In this discussion I will solely emphasize on the time, and space complexity.

Time Complexity

We can define time complexity of an algorithm is so many different ways, but in a few words, it is determined by the steps it takes to compute the function. In the above example we see that there is one iteration cycle to search through an array of size n. If we represent this in a form of asymptotic notation, it is O(n) the time complexity of this algorithm. It means that, for a data set of size n, it will iterate n times.

Space Complexity

Space complexity depends on the amount of space an algorithm needs to accomplish the computation. If an algorithm needs a fixed amount of memory to compute a task then the space complexity of that algorithm is constant, Sp = O(1). On the other hand, ignoring the constant variables we find that, the input size is n in the above mentioned pseudocode. And the algorithm is iterating till the last value of n; therefore, the space complexity of this algorithm is, Sp = O(n).

Now, if we want to make it a precise calculation then we have to take the constant parts into consideration. Each variable needs 2 bytes space in the memory. After calculating both constant, and auxiliary parts we get,

2*n bytes for the array A[] 
2 bytes for the variable i
2 bytes for the return keyword
Therefore, it will take (2n + 4) bytes memory for the algorithm to 
complete its execution.