
Tag Content Extractor


  • RodneyShag 4 years ago

    Java solution - passes 100% of test cases

    From my HackerRank solutions.

    import java.util.Scanner;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    /* Solution assumes we can't have the symbol "<" as text between tags */
    public class Solution{
        public static void main(String[] args){
            Scanner scan = new Scanner(System.in);
            int testCases = Integer.parseInt(scan.nextLine());
            
            while (testCases-- > 0) {
                String line = scan.nextLine();
                
                boolean matchFound = false;
                Pattern r = Pattern.compile("<(.+)>([^<]+)</\\1>");
                Matcher m = r.matcher(line);
    
                while (m.find()) {
                    System.out.println(m.group(2));
                    matchFound = true;
                }
                if ( ! matchFound) {
                    System.out.println("None");
                }
            }
        }
    }
    

    Let me try to explain the regular expression:

    <(.+)>
    

    matches an HTML start tag. The parentheses capture the text between the angle brackets as Group 1.

    ([^<]+)
    

    matches the text between the HTML start and end tags. The negated character class [^<] restricts that text to characters other than "<", and the + requires at least one character. The parentheses capture this text as Group 2.

    </\\1>
    

    matches the HTML end tag that corresponds to the start tag. The backreference \1 (written \\1 inside a Java string literal) must match exactly the text captured in Group 1.
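
To make the role of the two groups and the backreference concrete, here is a minimal sketch that is not part of the original post. It reuses the pattern from the solution above; the class name `BackreferenceDemo` and the helper `firstMatch` are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackreferenceDemo {
    // Same pattern as in the solution above.
    static final Pattern TAG = Pattern.compile("<(.+)>([^<]+)</\\1>");

    // Hypothetical helper: returns "group1=group2" for the first match,
    // or null when the line contains no matching tag pair.
    static String firstMatch(String line) {
        Matcher m = TAG.matcher(line);
        return m.find() ? m.group(1) + "=" + m.group(2) : null;
    }

    public static void main(String[] args) {
        // Start and end tag names agree, so \1 can match.
        System.out.println(firstMatch("<h1>Hello</h1>")); // prints h1=Hello
        // The end tag name differs from Group 1, so there is no match.
        System.out.println(firstMatch("<h1>Hello</h2>")); // prints null
    }
}
```

The second call shows why the backreference matters: without it, a mismatched end tag such as `</h2>` would be accepted.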

  • sunrav8586 5 years ago
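    int count = 0;
    // Lazy tag name; the content may contain neither "<" nor ">".
    Pattern r = Pattern.compile("<(.+?)>([^<>]+)</\\1>");
    Matcher m = r.matcher(line);
    while (m.find()) {
        // [^<>]+ already guarantees non-empty content; the check is kept as a safeguard.
        if (m.group(2).length() != 0) {
            System.out.println(m.group(2));
            count++;
        }
    }
    if (count == 0) System.out.println("None");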
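
The pattern in this snippet differs from the one in the top-voted solution in two ways: the tag name is matched lazily (`.+?`), and the content excludes ">" as well as "<". The following sketch, which is not part of the original post, compares the two on a line whose content contains a literal ">". The class name `PatternComparison`, the constant names, and the helper `first` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternComparison {
    // Pattern from the top-voted solution: content excludes only "<".
    static final Pattern EXCLUDES_LT = Pattern.compile("<(.+)>([^<]+)</\\1>");
    // Pattern from this snippet: content excludes both "<" and ">".
    static final Pattern EXCLUDES_BOTH = Pattern.compile("<(.+?)>([^<>]+)</\\1>");

    // Hypothetical helper: content of the first match, or "None".
    static String first(Pattern p, String line) {
        Matcher m = p.matcher(line);
        return m.find() ? m.group(2) : "None";
    }

    public static void main(String[] args) {
        String plain = "<h1>Hello</h1>";
        String withGt = "<a>1 > 0</a>";

        // Both patterns agree on ordinary input.
        System.out.println(first(EXCLUDES_LT, plain));    // prints Hello
        System.out.println(first(EXCLUDES_BOTH, plain));  // prints Hello

        // They disagree once the content contains ">".
        System.out.println(first(EXCLUDES_LT, withGt));   // prints 1 > 0
        System.out.println(first(EXCLUDES_BOTH, withGt)); // prints None
    }
}
```

Whether content containing ">" should be accepted depends on the challenge's test data, which this sketch does not know; it only shows that the two patterns are not interchangeable.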
    
  • mschonaker 5 years ago

    Is a goal of this site to promote good programming practices? It should: http://stackoverflow.com/a/1732454/368544

  • fnhckr 3 years ago

    I don't like this challenge. It has at least three (!) unstated constraints on the input:

    1. the content cannot be an empty string
    2. the tag name cannot be an empty string
    3. a tag itself cannot always appear as content

    To make clear what I mean, here is what the outputs should look like according to the challenge rules.


    Empty content:

    input:

    <a></a>
    

    output:

    
    

    So an empty line instead of None.


    Empty tag name:

    input:

    <>abc</>
    

    output:

    abc
    

    So abc instead of None.


    Tags as content:

    input:

    <a>...</a>...</a>
    

    output:

    ...
    ...</a>...
    

    So two lines instead of one. In the first line, the first </a> is interpreted as the closing tag. In the second line, the first </a> is interpreted as part of the content and the second </a> as the closing tag.


    Please repair this challenge.
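
For comparison, the sketch below (not part of the original post) runs the regex from the top-voted solution on the three inputs above. It illustrates that such a solution builds the unstated constraints into its pattern rather than producing the outputs listed above. The class name `EdgeCaseDemo` and the helper `extract` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EdgeCaseDemo {
    // Pattern from the top-voted solution in this thread.
    static final Pattern TAG = Pattern.compile("<(.+)>([^<]+)</\\1>");

    // Hypothetical helper: every extracted content string for one line,
    // or a single "None" entry when nothing matches.
    static List<String> extract(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = TAG.matcher(line);
        while (m.find()) {
            out.add(m.group(2));
        }
        if (out.isEmpty()) {
            out.add("None");
        }
        return out;
    }

    public static void main(String[] args) {
        // Empty content: [^<]+ needs at least one character.
        System.out.println(extract("<a></a>"));           // prints [None]
        // Empty tag name: .+ needs at least one character.
        System.out.println(extract("<>abc</>"));          // prints [None]
        // Tags as content: [^<]+ cannot span the first </a>.
        System.out.println(extract("<a>...</a>...</a>")); // prints [...]
    }
}
```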

  • rohit_ntil 5 years ago

    This is my solution. It passes all test cases.

    // The backslash escapes are redundant in Java regex: <, > and / are ordinary characters.
    String pattern = "\\<(.+)\\>([^\\<\\>]+)\\<\\/\\1\\>";
    int count = 0;

    Pattern p = Pattern.compile(pattern);
    Matcher m = p.matcher(line);

    while (m.find()) {
        System.out.println(m.group(2));
        count++;
    }
    if (count == 0) {
        System.out.println("None");
    }
    