We use cookies to ensure you have the best browsing experience on our website. Please read our cookie policy for more information about how we use cookies.
  • Hackerrank Home
  • Prepare
    NEW
  • Certify
  • Compete
  • Career Fair
  • Hiring developers?
  1. Prepare
  2. Algorithms
  3. Strings
  4. Similar Strings
  5. Discussions

Similar Strings

Problem
Submissions
Leaderboard
Discussions
Editorial

Sort 28 Discussions, By:

votes

Please Login in order to post a comment

  • shreshtha_sarka1
    6 years ago+ 1 comment

    can somebody explain the question properly ?

    20|
    Permalink
  • dzmitryc
    4 years ago+ 0 comments

    keep in mind that all templates are wrong and producing runtime error, as they're not reading the source string, which is line between n, q and queries

    18|
    Permalink
  • cguerrer
    6 years ago+ 1 comment

    Does anyone know the runtime of the most optimal solution?

    4|
    Permalink
  • chiaracoetzee
    5 years ago+ 0 comments

    In my solution I didn't end up relying on sorting, only on hashing, to help reduce the set of string positions that I need to examine. Then I do a full, expensive verify for each of those positions. It passes the tests but just barely. Room for improvement.

    #include <vector>
    #include <iostream>
    #include <unordered_map>
    using namespace std;
    
    // Two subranges that are similar will always give same hash
    int hash_subrange_similar(const string& s, int start, int len) {
        unsigned long hash = 5381;
        int similar_map[11] = {0};
        int count = 1;
        for (int i=start; i < start + len; i++) {
            int c = s[i];
            if (similar_map[c] == 0) {
                similar_map[c] = count;
                count++;
            }
            hash = hash * similar_map[c] + 33;
        }
        return (int)hash;
    }
    
    // Test if s[i..i+len] and s[j..j+len] are similar substrings
    bool are_similar(const string& s, int i, int j, int len) {
        int similar_map_i[11] = {0};
        int similar_map_j[11] = {0};
        for (int k=0; k < len; k++) {
            int ci = s[i + k];
            int cj = s[j + k];
            if (similar_map_i[ci] == 0 && similar_map_j[cj] == 0) {
                similar_map_i[ci] = cj;
                similar_map_j[cj] = ci;
            } else if (similar_map_i[ci] != cj || similar_map_j[cj] != ci) {
                return false;
            }
        }
        return true;
    }
    
    int main() {
        int s_len, num_tests;
        cin >> s_len >> num_tests;
        string s;
        cin >> s;
        
        // Reduce characters to 1-10 range (reserve 0 for "unknown")
        for (int i=0; i < s.length(); i++) {
            s[i] = s[i] - 'a' + 1;
        }
        
        // To speed things up we start by creating an index:
        // we hash the first 12 characters at each position and
        // then track for each hash which positions match that hash.
        const int hash_length = 12;
        unordered_map<int, vector<int>> index;
        for (int i=0; i <= (int)s.length() - hash_length; i++) {
            int hash = hash_subrange_similar(s, i, hash_length);
            index[hash].push_back(i);
        }
        
        for (int i=0; i < num_tests; i++) {
            int start, end;
            cin >> start >> end;
            int query_start = start - 1; // start is 1-based
            int query_len = end - start + 1; // start/end are inclusive
                    
            int count = 0;
            if (query_len >= hash_length) {
                // Hash first 12 chars of query then use index to get possible matching positions
                int hash = hash_subrange_similar(s, query_start, hash_length);
                for (int j : index[hash]) {
                    if (j <= s.length() - query_len) {
                        if (are_similar(s, query_start, j, query_len)) {
                            count++;
                        }
                    }
                }
            } else {
                // Brute force for short queries
                for (int j=0; j <= s.length() - query_len; j++) {
                    if (are_similar(s, query_start, j, query_len)) {
                        count++;
                    }
                }
            }
            cout << count << endl;
        }
        
        return 0;
    }
    
    2|
    Permalink
  • oyoungk10
    6 years ago+ 2 comments

    I don't understand how to not timeout on test cases 11 and up. I downloaded test case 20 with hackos, and just reading in the input and printing it took longer than the alotted 4s test cases are allowed. I'm not sure what I'm doing wrong or if I can even make things faster.

    Edit: I used Java 7, looks like C works though.

    2|
    Permalink
Load more conversations

Need Help?


View editorial
View top submissions
  • Blog
  • Scoring
  • Environment
  • FAQ
  • About Us
  • Support
  • Careers
  • Terms Of Service
  • Privacy Policy
  • Request a Feature