Clean up of translation memories? Thread poster: Peter Berntsen (X)
Peter Berntsen (X) Sweden English to Swedish + ...
Does anyone have any ideas on how to go about cleaning up a large TM that contains many old terms and incorrect translations? Where do you start?
Just Edit/Delete Specific Words | Feb 5, 2015
If I had to clean my translation memory (e.g. if someone paid me to do it), I would search for specific words or phrases that I consider incorrectly translated, find all instances of them in my TM, and either edit or delete them manually. I would do this directly in my TM environment. In my experience, both SDL Trados and MemoQ allow you to do that.
However, like I said, it would take someone paying me to do that. Normally, the idea of cleaning a large TM seems unproductive to me. Even if you only clean out specific words, it takes a significant amount of time. If you want to clean the whole thing, it could take months just to look through a large TM, depending on what you consider large (again, I would do it in my TM environment). And the benefits of a clean TM, with no incorrect translations in it, seem so small to me that they do not justify the effort.
Open ended question | Feb 5, 2015
It depends on what you want/need to do. If there are specific (incorrect) terms that you need to get rid of, and simply deleting the affected segments is an acceptable solution, then you may be able to do it fairly painlessly. If you want to fix poor translations of varying types, that will take a lot of work.
For scenario 1, you could use some kind of TM manager to batch delete segments. You can also do it with the newest version of TMLookup (link in TMLookup thread), but I would suggest using a TM editor instead. I believe Heartsome TMX Editor and Olifant are popular choices. I don't use them so I can't give specific instructions, but someone else surely will. You will probably need to: 1) export the TM to TMX; 2) open that TMX in your TM editor of choice, make the automated or manual changes you want, and save the TMX; 3) create a new TM in your CAT tool and import the modified TMX.
Silvio Picinini United States English to Portuguese + ... Criteria to decide cleanup | Aug 18, 2017
Hi,
Farkas above has pointed you to how to do it. Okapi Olifant is great; Heartsome I don't know, but it should work. Cleaning TMs in CAT tools is usually painful, so prefer a TM editor. However, before the "how" there is the "should I do it". How do you know if you have lots of errors? I am interested in this topic if people want to share ideas.
First I would run some of the QA checks that can be done with QA tools like Verifika and Xbench, or with the QA features in CAT tools. You can find out whether you have lots of inconsistent segments, whether you are following your glossary, and a variety of other things. You may have specific regular expressions for things your customer mandates (like "our slogan should not be translated"). You can apply those to the TM and find segments that do not comply, maybe because they were created before the rule was established.
Once you have found these errors, you have a count of them. Then consider whether that count is significant and decide whether the cleanup is needed.
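To make the "apply a rule to the TM and count the failures" idea concrete, here is a minimal sketch, assuming the TM has been exported to TMX. It is not how Verifika or Xbench work; the function names, the sample segments and the slogan "Think Big" are all made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample data: "Think Big" stands in for a slogan that the
# customer wants left untranslated.
SAMPLE_TMX = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en"><seg>Think Big with Acme</seg></tuv>
    <tuv xml:lang="sv"><seg>Think Big med Acme</seg></tuv></tu>
<tu><tuv xml:lang="en"><seg>Our motto: Think Big</seg></tuv>
    <tuv xml:lang="sv"><seg>Vårt motto: Tänk stort</seg></tuv></tu>
<tu><tuv xml:lang="en"><seg>Save the file</seg></tuv>
    <tuv xml:lang="sv"><seg>Spara filen</seg></tuv></tu>
</body></tmx>"""


def seg_text(tu, lang):
    """Return the segment text of `tu` for language `lang`, or None."""
    for tuv in tu.iter("tuv"):
        code = tuv.get(XML_LANG) or tuv.get("lang") or ""
        if code.lower().startswith(lang.lower()):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text inside inline tags.
                return "".join(seg.itertext())
    return None


def count_violations(tmx_string, src_lang, tgt_lang, src_pattern, tgt_pattern):
    """Count units whose source matches src_pattern while the target
    does not match tgt_pattern. Returns (violations, units_checked)."""
    root = ET.fromstring(tmx_string)
    checked = bad = 0
    for tu in root.iter("tu"):
        src = seg_text(tu, src_lang)
        tgt = seg_text(tu, tgt_lang)
        if src is None or tgt is None:
            continue  # skip units that lack one of the two languages
        checked += 1
        if re.search(src_pattern, src) and not re.search(tgt_pattern, tgt):
            bad += 1
    return bad, checked


if __name__ == "__main__":
    bad, checked = count_violations(
        SAMPLE_TMX, "en", "sv", r"Think Big", r"Think Big")
    print(f"{bad} of {checked} units break the slogan rule")
```

The ratio of violations to units checked is the kind of number you can then weigh when deciding whether a cleanup is worth the effort.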
I am also interested in criteria that are specific to TMs (different from the checks above, which can be applied to content that you have just translated). I wonder if purging older segments from the TM is a good idea. Also, if you (actually your end client) have TMs for obsolete products, should you recommend that they remove those segments from the TM?
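For anyone who wants to experiment with such a purge on a copy of the TM, here is a minimal sketch, assuming a TMX export whose units carry the TMX `changedate` or `creationdate` attributes (format `YYYYMMDDThhmmssZ`). The function name, cutoff and sample units are made up for illustration, and undated units are kept because their age cannot be judged.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

TMX_DATE = "%Y%m%dT%H%M%SZ"  # timestamp layout used by TMX date attributes

# Hypothetical sample: one old unit, one recent unit, one without a date.
SAMPLE_TMX = """<tmx version="1.4"><header/><body>
<tu changedate="20090314T101500Z">
  <tuv xml:lang="en"><seg>Insert the floppy disk</seg></tuv>
  <tuv xml:lang="sv"><seg>Sätt i disketten</seg></tuv></tu>
<tu creationdate="20160601T080000Z">
  <tuv xml:lang="en"><seg>Save the file</seg></tuv>
  <tuv xml:lang="sv"><seg>Spara filen</seg></tuv></tu>
<tu>
  <tuv xml:lang="en"><seg>Cancel</seg></tuv>
  <tuv xml:lang="sv"><seg>Avbryt</seg></tuv></tu>
</body></tmx>"""


def purge_older_than(tmx_string, cutoff):
    """Drop units last changed (or, failing that, created) before `cutoff`.
    Returns (new_tmx_string, number_of_units_removed)."""
    root = ET.fromstring(tmx_string)
    body = root.find("body")
    removed = 0
    # Iterate over a copy of the list because we remove while looping.
    for tu in list(body.findall("tu")):
        stamp = tu.get("changedate") or tu.get("creationdate")
        if stamp is None:
            continue  # no date information: keep the unit
        if datetime.strptime(stamp, TMX_DATE) < cutoff:
            body.remove(tu)
            removed += 1
    return ET.tostring(root, encoding="unicode"), removed


if __name__ == "__main__":
    cleaned, removed = purge_older_than(SAMPLE_TMX, datetime(2012, 1, 1))
    print(f"removed {removed} unit(s)")
```

Whether age alone is a sound criterion is exactly the open question; a count of how many units a given cutoff would remove is at least a cheap first look.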
I would appreciate hearing about it.
Thanks
CafeTran Training (X) Netherlands Correct your TM on the fly | Aug 20, 2017
Peter Berntsen wrote:
Does anyone have any ideas on how to go about cleaning up a large TM that contains many old terms and incorrect translations? Where do you start?
As others have written here, correcting a TM can be very time-consuming.
In CafeTran you have this nice feature to make changes via Find and Replace simultaneously in the project and in the translation memories attached to that project.
Whenever I encounter a wrong term, typo etc. in my legacy TM (and in the project segments that have been populated from this legacy TM), I make sure that the cursor is placed in the incorrect word, press Cmd+F (Ctrl+F), type the correct replacement word and make sure that the correct Scope radio buttons are selected.
CafeTran will make sure that the case of the replacement string is automatically adapted (when the corresponding checkbox is selected):
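As a rough illustration of what case-adapting replacement means in general, here is a minimal sketch. It is not CafeTran's implementation; the function names and the sample sentence are made up, and the rule (all caps stays all caps, an initial capital stays an initial capital) is an assumption about what "adapted" covers.

```python
import re


def adapt_case(matched, replacement):
    """Give `replacement` the same casing style as the matched text."""
    if matched.isupper():
        return replacement.upper()  # COLOUR -> COLOR
    if matched[:1].isupper():
        return replacement[:1].upper() + replacement[1:]  # Colour -> Color
    return replacement  # colour -> color


def replace_adapting_case(text, wrong, right):
    """Replace whole-word occurrences of `wrong`, ignoring case when
    matching and copying the casing of each match onto `right`."""
    pattern = re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda m: adapt_case(m.group(0), right), text)


if __name__ == "__main__":
    print(replace_adapting_case(
        "Colour picker. The colour is set. COLOUR!", "colour", "color"))
```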
While working in the translation project, I can also remove different translations for the same source segment (to condense the TM and possibly avoid the use of different target terms):
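Outside any particular tool, spotting source segments that have more than one translation can be sketched as follows. This is a minimal illustration, assuming the source/target pairs have already been pulled out of the TM (for example from a TMX export); the function name and the sample pairs are made up, and it only lists the conflicts, leaving the choice of which translation to keep to a human.

```python
from collections import defaultdict


def conflicting_sources(pairs):
    """Map each source segment to its distinct targets, keeping only
    the sources that have more than one distinct target."""
    targets = defaultdict(list)
    for source, target in pairs:
        if target not in targets[source]:
            targets[source].append(target)  # preserve first-seen order
    return {src: tgts for src, tgts in targets.items() if len(tgts) > 1}


if __name__ == "__main__":
    # Hypothetical English-to-Swedish pairs.
    pairs = [
        ("Save the file", "Spara filen"),
        ("Save the file", "Spara fil"),
        ("Save the file", "Spara filen"),
        ("Cancel", "Avbryt"),
    ]
    for source, options in conflicting_sources(pairs).items():
        print(source, "->", options)
```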
CafeTran also offers a full-fledged TMX editor that lets you run its QA tasks. Two of those tasks are especially useful:
In this QA mode you can also run a spell check, or check for forbidden words and for correct use of terminology from a glossary.
I_CH German to Italian + ... No "Remove" options | Jun 4
Hi CafeTran Training,
I have just downloaded the application but unfortunately I do not see any of the "Remove" options you mentioned under the "Task" menu.
Have I done something wrong?
KR
I_CH