Java Regex 2 - Duplicate Words

  • + 0 comments

    With this code I got the expected results except for a mysterious character at the end of the last test case. It is neither whitespace nor an unprintable character. It makes no sense to me.

    import java.util.Scanner;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class DuplicateWords {
    
        public static void main(String[] args) {
    
            String regex = "\\b(\\w+)\\s+\\1\\b";
            Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    
            Scanner in = new Scanner(System.in);
            int numSentences = Integer.parseInt(in.nextLine());
            
            while (numSentences-- > 0) {
                String input = in.nextLine();
                Matcher m = p.matcher(input);
                
                // Check for subsequences of input that match the compiled pattern
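                while (m.find()) {
                    input = input.replaceAll("(?i)"+regex,"$1");
                    // Re-create the matcher on the updated line; the old one
                    // would keep scanning the original, unmodified text.
                    m = p.matcher(input);
                }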
                
                // Prints the modified sentence.
                System.out.print(input+'\n');
            }
    			
            in.close();
        }
    }
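A side note on the pattern itself: one that matches only a single pair (`word word`) leaves a word that appears three times in a row only partly collapsed after one pass, which is why the loop needs a fresh matcher after each replacement. A minimal sketch of the difference, assuming a made-up class `PairVersusRun` and a made-up sample line (neither is from the challenge):

```java
import java.util.regex.Pattern;

public class PairVersusRun {

    // Matches exactly one repetition: "word word".
    static final Pattern PAIR =
            Pattern.compile("\\b(\\w+)\\s+\\1\\b", Pattern.CASE_INSENSITIVE);

    // Matches a whole run of repetitions: "word word word ...".
    static final Pattern RUN =
            Pattern.compile("\\b(\\w+)(?:\\s+\\1\\b)+", Pattern.CASE_INSENSITIVE);

    // Applies the pattern once over the line, keeping the first captured word.
    static String singlePass(Pattern p, String line) {
        return p.matcher(line).replaceAll("$1");
    }

    public static void main(String[] args) {
        String line = "in in in the hall";
        System.out.println(singlePass(PAIR, line)); // prints "in in the hall"
        System.out.println(singlePass(RUN, line));  // prints "in the hall"
    }
}
```

With the repeated group, a single `Matcher.replaceAll("$1")` call is enough and no loop is needed.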
    
  • + 0 comments

        // Backslashes must be doubled inside a Java string literal.
        String regex = "\\b(\\w+)(\\s+\\1\\b)+";
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

        Scanner in = new Scanner(System.in);
        int numSentences = Integer.parseInt(in.nextLine());

        while (numSentences-- > 0) {
            String input = in.nextLine();

            Matcher m = p.matcher(input);

            // Check for subsequences of input that match the compiled pattern
            while (m.find()) {
                input = input.replaceAll(m.group(), m.group(1));
            }

            // Prints the modified sentence.
            System.out.println(input);
        }

        in.close();
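One thing to watch with `input.replaceAll(m.group(), m.group(1))`: `String.replaceAll` treats its first argument as a new regex and applies it to the whole line without the word boundaries, so it can also hit the same text inside other words. A small sketch of the difference, where the class name, method names, and the deliberately contrived input are all made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceScopeDemo {

    static final Pattern P =
            Pattern.compile("\\b(\\w+)(\\s+\\1\\b)+", Pattern.CASE_INSENSITIVE);

    // Mirrors the loop above: the matched text is reused as a new regex.
    static String viaStringReplaceAll(String line) {
        Matcher m = P.matcher(line);
        String out = line;
        while (m.find()) {
            out = out.replaceAll(m.group(), m.group(1));
        }
        return out;
    }

    // Replaces only the spans that the compiled pattern itself matched.
    static String viaMatcherReplaceAll(String line) {
        return P.matcher(line).replaceAll("$1");
    }

    public static void main(String[] args) {
        String line = "a a ba ab";
        System.out.println(viaStringReplaceAll(line));  // prints "a bab"
        System.out.println(viaMatcherReplaceAll(line)); // prints "a ba ab"
    }
}
```

Inputs like this may never show up in the test cases, but `Matcher.replaceAll("$1")` sidesteps the question entirely.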
    
  • + 0 comments

    My solution:

    1. \b - a word boundary
    2. \w+ - one or more word characters (letters, digits, or underscores)
    3. \s+ - one or more whitespace characters
    4. \1 - a back reference (whatever was captured by (\w+))

    public static void main(String[] args) {
    
            String regex = "\\b(\\w+)(\\s+\\1)+\\b";
            Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE /* Insert the correct Pattern flag here.*/);
    
            Scanner in = new Scanner(System.in);
            int numSentences = Integer.parseInt(in.nextLine());
            
            while (numSentences-- > 0) {
                String input = in.nextLine();
                
                Matcher m = p.matcher(input);
                
                // Check for subsequences of input that match the compiled pattern
                while (m.find()) {
                    input = input.replaceAll(m.group()/* The regex to replace */, m.group(1)/* The replacement. */);
                }
                
                // Prints the modified sentence.
                System.out.println(input);
            }
            
            in.close();
        }
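To see what each piece contributes, here is a small sketch that prints the whole match and the captured group, then compares the pattern with and without `\b`. The class name `RegexPieces`, the `UNBOUNDED` variant, and the sample sentences are my own assumptions for illustration, not part of the challenge:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexPieces {

    // Word boundaries on both sides of every repeated word.
    static final Pattern BOUNDED =
            Pattern.compile("\\b(\\w+)(\\s+\\1\\b)+", Pattern.CASE_INSENSITIVE);

    // Same idea without \b, to show what the boundaries protect against.
    static final Pattern UNBOUNDED =
            Pattern.compile("(\\w+)(\\s+\\1)+", Pattern.CASE_INSENSITIVE);

    static String dedupe(Pattern p, String line) {
        return p.matcher(line).replaceAll("$1");
    }

    public static void main(String[] args) {
        Matcher m = BOUNDED.matcher("I love love to code");
        if (m.find()) {
            System.out.println(m.group());   // prints "love love" (whole match)
            System.out.println(m.group(1));  // prints "love" (captured word)
        }
        // Without \b the "is" at the end of "this" pairs up with the next "is".
        System.out.println(dedupe(UNBOUNDED, "this is a test")); // prints "this a test"
        System.out.println(dedupe(BOUNDED, "this is a test"));   // prints "this is a test"
    }
}
```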
    
  • + 0 comments

    Is this what we really want? I think this paragraph should be improved:

    3. Write the two necessary arguments for replaceAll such **that each repeated word is replaced with the very first instance the word found in the sentence**. It must be the exact first occurrence of the word, as the expected output is case-sensitive.
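As I read it, the requirement boils down to: match repeats regardless of case, but keep the first occurrence exactly as it was written. A minimal sketch of that reading, assuming a made-up class `FirstOccurrence` and a sample line of my own:

```java
import java.util.regex.Pattern;

public class FirstOccurrence {

    static final String REGEX = "\\b(\\w+)(?:\\s+\\1\\b)+";

    // Group 1 always holds the first word of the run, with its original casing.
    static String dedupe(String line, int flags) {
        return Pattern.compile(REGEX, flags).matcher(line).replaceAll("$1");
    }

    public static void main(String[] args) {
        String line = "Hello hello Ab aB";
        // With the flag, the back reference ignores case when comparing.
        System.out.println(dedupe(line, Pattern.CASE_INSENSITIVE)); // prints "Hello Ab"
        // Without it, "Hello"/"hello" and "Ab"/"aB" are not treated as repeats.
        System.out.println(dedupe(line, 0)); // prints "Hello hello Ab aB"
    }
}
```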

  • + 0 comments

    public static void main(String[] args) {

        String regex = "\\b(\\w+)(?: +\\1)+\\b";
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    
        Scanner in = new Scanner(System.in);
        int numSentences = Integer.parseInt(in.nextLine());
        int i=0;
        while (i<numSentences) {
            String input = in.nextLine(); 
            Matcher m = p.matcher(input);
            input = m.replaceAll("$1");
    
            // Prints the modified sentence.
            System.out.println(input);
            i++;
        }
    
        in.close();
    }