Detect the Domain Name

  • + 0 comments

    Detecting a domain name usually means parsing each URL string and extracting the host part, either with a regular expression or with a built-in library in a language like Python or JavaScript. According to Wikipedia, a domain name identifies a realm of administrative autonomy within the Internet. In coding challenges, urlparse or a regex handles this in a few lines. I practiced on real-world data such as food service websites; parsing the URLs on the Dairy Queen menu pages was a useful stand-in for dynamic URLs. Working with real examples is a fun way to sharpen string-manipulation skills.
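    As a rough illustration of the urlparse route mentioned above, here is a minimal sketch. The helper name domain_of and the sample URL are made up for this example, and stripping only the "www." and "ww2." prefixes is an assumption borrowed from the regex solutions in this thread.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the host of `url` without a leading 'www.' or 'ww2.' label.

    Hypothetical helper for illustration; returns '' when the URL has no
    scheme, because urlparse then finds no network location.
    """
    # .hostname is lower-cased and excludes any port or credentials
    host = urlparse(url).hostname or ""
    for prefix in ("www.", "ww2."):
        if host.startswith(prefix):
            return host[len(prefix):]
    return host


print(domain_of("https://www.example.com/menu?item=1"))
```

    Note that urlparse only handles one URL at a time, so for a whole HTML page you would still need a regex (or an HTML parser) to find the URLs first.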

  • + 0 comments
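    import re
    import sys

    html = sys.stdin.read()

    # Match http(s) URLs, skip an optional www./ww2. prefix, capture the domain
    pattern = r'https?://(?:www\.|ww2\.)?([\w-]+\.[a-zA-Z0-9.-]+)'

    # Deduplicate, then print in sorted order separated by semicolons
    rs = set(re.findall(pattern, html))

    print(";".join(sorted(rs)))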
    
  • + 0 comments

    import re
    import sys

    html = sys.stdin.read()

    pattern = r'https?://(?:www\.|ww2\.)?([a-zA-Z0-9-]+\.?[a-zA-Z0-9-]+\.[a-zA-Z]*\.?[a-zA-Z]*)'

    matches = re.findall(pattern, html)

    output = sorted(set(matches))

    print(';'.join(output))

  • + 0 comments
    import re
    import sys
    
    domain_pattern = re.compile(r'https?://(?:www\.|ww2\.)?([a-zA-Z0-9\-]+\.[a-zA-Z0-9.\-]+)[^a-zA-Z0-9.\-]')
    
    input_stream = sys.stdin
    
    unique_domains = set()
    
    n = int(input_stream.readline().strip())
    
    for _ in range(n):
        line = input_stream.readline().strip()
        for match in domain_pattern.findall(line):
            unique_domains.add(match)
    
    print(';'.join(sorted(unique_domains)))
    
  • + 0 comments

    I think there is a problem with the expected output of Test case 1. If you inspect the input HTML, you can see these lines:

    ... '//www.googletagservices.com/tag/js/gpt.js'; ... _gaq.push(['_addIgnoredOrganic', 'www.timesofindia.com']); ...

    However, those two URLs do not appear in the expected output.
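
    One possible explanation, offered as an assumption rather than a statement of the challenge's rules: neither quoted string starts with http:// or https://, and the regex solutions in this thread all require that scheme. The sketch below runs one of those patterns over the two quoted lines plus a hypothetical link (example.com is made up here) to show which ones it picks up.

```python
import re

# Pattern from a solution earlier in this thread; it requires an explicit scheme.
PATTERN = re.compile(r'https?://(?:www\.|ww2\.)?([\w-]+\.[a-zA-Z0-9.-]+)')

# The two lines quoted above, plus a hypothetical link with an explicit scheme.
sample = """
'//www.googletagservices.com/tag/js/gpt.js';
_gaq.push(['_addIgnoredOrganic', 'www.timesofindia.com']);
<a href="http://www.example.com/page">hypothetical link</a>
"""

# Only the third line matches: the first is protocol-relative ("//...") and
# the second is a bare host name with no scheme at all.
found = sorted(set(PATTERN.findall(sample)))
print(";".join(found))
```

    If the expected output was generated with a scheme-anchored pattern like this one, the two domains would be skipped, which would be consistent with what you are seeing.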