We use cookies to ensure you have the best browsing experience on our website. Please read our cookie policy for more information about how we use cookies.
One approach to solve the Longest Common Substring (LCS) problem involves aligning the two strings and iterating over possible alignments to find the longest match. Given two strings, e.g., "abcde" and "efghicd" with LCS "cd", the alignment could look like:
abcdeefghicd
Multiple such alignments are possible. By iterating over these alignments and comparing characters, we can discover the LCS.
Below is the implementation of this method to find LCS:
functionlongestCommonString(a,b){/** * The algorithm works with two strings of varying lengths. * Swap the strings to ensure the shorter string is assigned to the variable 'a' to simplify the logic. * This swap can be omitted in the main challenge since both input strings have the same length. */if(a.length>b.length){[a,b]=[b,a];}letaStart=0;letaEnd=0;letbStart=b.length-1;letresult="";do{// Section A/* This part will be modified to solve the given challenge */lettmp="";letbIndex=bStart;for(leti=aStart;i<=aEnd;++i){constaChar=a[i];constbChar=b[bIndex++];if(aChar===bChar){tmp+=aChar;}// If characters are different or it's the last character, update the result if necessary.if(i===aEnd||aChar!==bChar){if(tmp.length>result.length){result=tmp;}tmp="";}}/* End of Section A */// Slide window to align next characters of the stringsif(aEnd<a.length-1){++aEnd;}elseif(bStart===0&&aStart<a.length){++aStart;}if(bStart>0){--bStart;}}while(aStart!==a.length);returnresult;}
In the algorithm, we use variables aStart, aEnd, and bStart to manage slices of the strings a and b for each alignment. The tmp string accumulates matching characters until we encounter a mismatch or reach the end. If tmp's length surpasses the current result, we update the result.
Given two strings, the challenge is to find the length of the longest common substring that has no more than 'k' differing characters. We approach this problem by adapting our earlier longest common substring algorithm, specifically by replacing Section A with Section B to account for the differences allowed.
Here's how our adapted algorithm works:
functionsubstringDiff(k,a,b){letaStart=0;letaEnd=0;letbStart=b.length-1;letresult=0;do{// Section B/** * 1. Iterate through the slices of `a` and `b`, and save the indices of characters that differ in `diffIndices`. * 2. If the length of `diffIndices` is less than or equal to k, it means the whole substring meets the criteria. * 3. If not, we iterate through `diffIndices`, calculating the length of the substring that contains * exactly `k` differing characters. */letdiffIndices=[];letlastDiffPosition=aStart;// Beginning of the substring containing k differences.letbIndex=bStart;for(leti=aStart;i<=aEnd;++i){constaChar=a[i];constbChar=b[bIndex++];if(aChar!==bChar){diffIndices.push(i);}}if(diffIndices.length<=k){result=Math.max(result,aEnd-aStart+1);}else{for(leti=0;i<=diffIndices.length-k;++i){result=Math.max(result,/** * Calculate the length of the substring with 'k' differing characters: * - `diffIndexes[i + k]`: Represents the ending index of our substring, which is not included. * There are 'k' differences between the indices `diffIndices[i]` and `diffIndices[i + k] - 1`. * - The subtraction of indices (like `A - B`) is used to determine the length of a substring * starting from index B and ending before index A. E.g., for a difference of 5-2, the substring * covers indices 2, 3, and 4. * - `aEnd + 1`: Adjusts to ensure the last character in the substring is included. */(diffIndices[i+k]??aEnd+1)-lastDiffPosition);lastDiffPosition=diffIndices[i]+1;// Update for the next substring starting position.}}// Proceed to the next slice.if(aEnd<a.length-1){++aEnd;}elseif(bStart===0&&aStart<a.length){++aStart;}if(bStart>0){--bStart;}}while(aStart!==a.length);returnresult;}
To understand the algorithm:
We start by aligning the two strings and sliding them against each other, similar to our earlier approach for finding the longest common substring.
For each alignment, we compute the number of differing characters (saved in diffIndices).
If the total differences are less than or equal to 'k', it's straightforward.
Otherwise, we use a sliding window on diffIndices to find the longest substring that has exactly k differences.
We continue this process, shifting the alignment, until we've explored all possible alignments.
To determine the time complexity of the algorithm, we need to focus on the number of iterations our inner loop performs.
Given two strings a and b with lengths n = |a| and m = |b| respectively, and assuming n <= m, the number of iterations in the inner loop can be captured as:
The sum of the first n integers for the increasing overlap of a with b: 1 + 2 + ... + n which equals n(n+1)/2.
n(m-n) iterations for the middle section where a slides over b with a constant length of n.
The sum of integers from n-1 to 1 for the decreasing overlap of a as it exits b: n-1 + n-2 + ... + 1 which equals n(n-1)/2.
Substring Diff
You are viewing a single comment's thread. Return to all comments →
One approach to solve the Longest Common Substring (LCS) problem involves aligning the two strings and iterating over possible alignments to find the longest match. Given two strings, e.g., "abcde" and "efghicd" with LCS "cd", the alignment could look like:
Multiple such alignments are possible. By iterating over these alignments and comparing characters, we can discover the LCS.
Below is the implementation of this method to find LCS:
In the algorithm, we use variables
aStart
,aEnd
, andbStart
to manage slices of the stringsa
andb
for each alignment. Thetmp
string accumulates matching characters until we encounter a mismatch or reach the end. Iftmp
's length surpasses the currentresult
, we update theresult
.Given two strings, the challenge is to find the length of the longest common substring that has no more than 'k' differing characters. We approach this problem by adapting our earlier longest common substring algorithm, specifically by replacing Section A with Section B to account for the differences allowed.
Here's how our adapted algorithm works:
To understand the algorithm:
diffIndices
).diffIndices
to find the longest substring that has exactlyk
differences.To determine the time complexity of the algorithm, we need to focus on the number of iterations our inner loop performs.
Given two strings
a
andb
with lengthsn = |a|
andm = |b|
respectively, and assumingn <= m
, the number of iterations in the inner loop can be captured as:n
integers for the increasing overlap ofa
withb
:1 + 2 + ... + n
which equalsn(n+1)/2
.n(m-n)
iterations for the middle section wherea
slides overb
with a constant length ofn
.n-1
to1
for the decreasing overlap ofa
as it exitsb
:n-1 + n-2 + ... + 1
which equalsn(n-1)/2
.Combining these:
n(n+1)/2 + n(m-n) + n(n-1)/2
=n^2 + nm - n^2
=nm
Therefore, the time complexity of the algorithm is O(nm).