Hide all alphanumerics and symbols in Japanese text using a regular expression?

Question:

Is there a way to keep alphanumerics and mathematical symbols from being extracted for translation in Japanese texts?
This is for MT without post-editing. We noticed that MT sometimes cannot handle mathematical formulas and English text properly when translating from Japanese into English.

Here are examples of texts I would like to hide:

X: 3σ ≦ 12μm
  [Target: 8.5μm]
Y: 3σ ≦ 12μm
  [Target: 8.5μm]

Answer:

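Use this regular expression in the "Do not translate" tab:

^[\s\p{Ll}\p{Lu}\p{Lt}\p{P}\p{S}\p{Z}\p{N}]+$

It matches a segment only when every character in it is whitespace, a cased letter (lowercase, uppercase, or titlecase), punctuation, a symbol, a separator, or a number. Kanji and kana belong to other Unicode letter categories, so any segment that contains Japanese characters does not match and is still translated.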
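If you want to check which segments the pattern would catch before adding it, here is a minimal sketch in Python. Python's built-in re module does not support \p{...} classes, so this emulates the character class with unicodedata.category instead of running the pattern itself. The function name is_non_translatable is made up for this example, and str.isspace() is only an approximation of \s, so treat it as a rough check rather than the tool's exact behaviour.

```python
import unicodedata

# Exact categories named in the pattern: \p{Ll}, \p{Lu}, \p{Lt}
EXACT_CATEGORIES = {"Ll", "Lu", "Lt"}
# Whole category groups named in the pattern: \p{P}, \p{S}, \p{Z}, \p{N}
CATEGORY_GROUPS = {"P", "S", "Z", "N"}


def is_non_translatable(segment: str) -> bool:
    """Hypothetical helper: emulate ^[\\s\\p{Ll}\\p{Lu}\\p{Lt}\\p{P}\\p{S}\\p{Z}\\p{N}]+$."""
    if not segment:
        # The "+" quantifier requires at least one character.
        return False
    for ch in segment:
        if ch.isspace():  # approximation of \s
            continue
        category = unicodedata.category(ch)
        if category in EXACT_CATEGORIES or category[0] in CATEGORY_GROUPS:
            continue
        # Anything else (e.g. kanji and kana, category Lo) breaks the match.
        return False
    return True


if __name__ == "__main__":
    samples = ["X: 3σ ≦ 12μm", "  [Target: 8.5μm]", "目標: 8.5μm"]
    for text in samples:
        print(repr(text), is_non_translatable(text))
```

The first two samples come from the question and should be reported as matching. The third sample is an invented mixed segment: it contains kanji, so it does not match and would still go to MT.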


