Monday, April 9, 2018

Finding Japanese Characters in a Java String


import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class JapaneseTest {

    public static void main(String[] args) {

        // Unicode blocks treated as "Japanese" here. CJK_UNIFIED_IDEOGRAPHS (kanji)
        // is shared with Chinese and Korean text, so this is a heuristic.
        Set<Character.UnicodeBlock> japaneseUnicodeBlocks = new HashSet<>(Arrays.asList(
                Character.UnicodeBlock.HIRAGANA,
                Character.UnicodeBlock.KATAKANA,
                Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS));

        String mixed = "This is a Japanese newspaper headline: ラドクリフ、マラソン五輪代表に1m出場にも含み";

        // All three blocks lie in the Basic Multilingual Plane, so checking
        // one char at a time is sufficient for them.
        for (char c : mixed.toCharArray()) {
            if (japaneseUnicodeBlocks.contains(Character.UnicodeBlock.of(c))) {
                System.out.println(c + " is a Japanese character");
            } else {
                System.out.println(c + " is not a Japanese character");
            }
        }
    }
}

T is not a Japanese character
h is not a Japanese character
........
ラ is a Japanese character
ド is a Japanese character
......
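
The same block test can also be written as a regular expression, which is handy when you only need to know whether a string contains Japanese text or how many such characters it has. The following is a minimal sketch, not a definitive implementation: the class name JapaneseRegexSketch and the helper methods containsJapanese and countJapanese are hypothetical names chosen for illustration. It relies on java.util.regex block syntax (the "In" prefix) to match the same three Unicode blocks used in the loop above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
public class JapaneseRegexSketch {

    // One character from the Hiragana, Katakana, or CJK Unified Ideographs block.
    private static final Pattern JAPANESE = Pattern.compile(
            "[\\p{InHiragana}\\p{InKatakana}\\p{InCJKUnifiedIdeographs}]");

    // True if at least one character falls in one of the three blocks.
    static boolean containsJapanese(String text) {
        return JAPANESE.matcher(text).find();
    }

    // Number of characters that fall in one of the three blocks.
    static int countJapanese(String text) {
        Matcher matcher = JAPANESE.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String mixed = "This is a Japanese newspaper headline: ラドクリフ、マラソン五輪代表に1m出場にも含み";
        System.out.println("Contains Japanese: " + containsJapanese(mixed));
        System.out.println("Japanese characters: " + countJapanese(mixed));
    }
}
```

As with the loop version, punctuation such as 、 belongs to a different block (CJK Symbols and Punctuation) and is not counted; add more blocks to the character class if you need them.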
