Changing SRX Segmentation Rules

Hi everyone,

Every time I enter a new text for translation and WB segments it, I end up with a great many tiny segments because of abbreviations such as Mr., Mrs. or art. (in Portuguese, Sr., Sra. and so on). I was wondering how I could edit the SRX segmentation rules file (which we can get under Settings->segmentation rules) and successfully (re)upload it.

I have been able to extract the file (Portuguese (pt) in my case) and, at least I think, to understand how the rules work (the XML syntax). What I am not sure about is which type of file I should upload in the end: a .txt document (which I uploaded, but nothing really changed), a .xml document (which I tried and got a message saying "The uploaded file does not comply with SRX standards"), or some other type of document.

Could you be so kind as to help me with this? Right now I no longer know whether the problem is the file extension I use or XML that is not correct, or whether I am even following all the necessary steps to complete this operation.
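
For illustration, here is a minimal sketch of the kind of change in question. It assumes the downloaded file is a standard SRX 2.0 document (root element `srx` in the `http://www.lisa.org/srx20` namespace) and that the first matching rule wins, so the no-break rule for abbreviations goes ahead of the break rules. The sample document, the rule name "Portuguese" and the helper `add_no_break_rule` are hypothetical and are not taken from the actual file. Whether WB expects a particular file extension is not something this sketch can confirm; it only checks that the result is still well-formed XML.

```python
import re
import xml.etree.ElementTree as ET

SRX_NS = "http://www.lisa.org/srx20"
# Keep the SRX namespace as the default namespace when serializing.
ET.register_namespace("", SRX_NS)

# Hypothetical, heavily trimmed SRX 2.0 document (not the real WB export).
SAMPLE_SRX = r"""<srx xmlns="http://www.lisa.org/srx20" version="2.0">
  <header segmentsubflows="yes" cascade="no"/>
  <body>
    <languagerules>
      <languagerule languagerulename="Portuguese">
        <rule break="yes">
          <beforebreak>[\.\?!]+</beforebreak>
          <afterbreak>\s</afterbreak>
        </rule>
      </languagerule>
    </languagerules>
    <maprules>
      <languagemap languagepattern="pt.*" languagerulename="Portuguese"/>
    </maprules>
  </body>
</srx>"""


def add_no_break_rule(srx_text, rule_name, abbreviations):
    """Return srx_text with a no-break rule for the given abbreviations.

    abbreviations are given without the trailing period, e.g. "Sr".
    """
    # fromstring raises ParseError if the input is not well-formed XML.
    root = ET.fromstring(srx_text)
    lang_rule = root.find(
        f".//{{{SRX_NS}}}languagerule[@languagerulename='{rule_name}']"
    )
    if lang_rule is None:
        raise ValueError(f"no languagerule named {rule_name!r}")

    # <rule break="no"> means: do NOT split where this pattern matches.
    rule = ET.Element(f"{{{SRX_NS}}}rule", {"break": "no"})
    before = ET.SubElement(rule, f"{{{SRX_NS}}}beforebreak")
    alternatives = "|".join(re.escape(a) for a in abbreviations)
    before.text = r"\b(" + alternatives + r")\."
    after = ET.SubElement(rule, f"{{{SRX_NS}}}afterbreak")
    after.text = r"\s"

    # Insert first so it is evaluated before the generic break rules.
    lang_rule.insert(0, rule)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_no_break_rule(SAMPLE_SRX, "Portuguese", ["Sr", "Sra", "art"]))
```

Running the script prints the modified document, which could then be saved and uploaded. If the upload is still rejected, comparing the result against the structure of the originally downloaded file (header, namespace, element order) would be the next thing to check.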

Thank you for your help!

