Translation Memory

Last Updated: Mar 30, 2017 08:00PM CEST
A Translation Memory (TM) is a database composed of translation units (TUs). A translation unit is created when a piece of source text (e.g. a sentence, paragraph, or heading) is translated and paired with its equivalent in another language.

Using TMs comes with several benefits. A TM speeds up the translation process: a segment, once translated, never has to be translated again. For the very same reason, using a TM reduces costs. It also helps to maintain consistency across translated documents (e.g. when projects for a specific client aren't always assigned to the same translator).

To add translation units to a TM, you need to create the TM first. Once you have created your translation memory, there are three ways to add translation units to it:
  1. Add new TUs directly from within Memsource Editor when translating. A TU is added as soon as the translated segment is confirmed.
  2. Import TM from other translation tools (in TMX format) or from MS Excel.
  3. Align previously translated documents and import them as XLS into a Memsource translation memory.
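
For the TMX route in option 2, it may help to see roughly what such a file contains. The sketch below builds a one-unit TMX 1.4 document with Python's standard library. The tool name, language codes, and example sentences are placeholders chosen for illustration, and what a given tool accepts on import should be checked against its own documentation.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this qualified name out as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(units, srclang="en"):
    """Build a minimal TMX 1.4 tree from (source, target, target_lang) tuples."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(
        tmx,
        "header",
        {
            "creationtool": "example-script",  # placeholder value
            "creationtoolversion": "0.1",      # placeholder value
            "segtype": "sentence",
            "o-tmf": "none",
            "adminlang": "en",
            "srclang": srclang,
            "datatype": "plaintext",
        },
    )
    body = ET.SubElement(tmx, "body")
    for source, target, target_lang in units:
        # One <tu> per translation unit, one <tuv> per language variant.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((srclang, source), (target_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return tmx


tmx_root = build_tmx([("Good morning.", "Guten Morgen.", "de")])
tmx_text = ET.tostring(tmx_root, encoding="unicode")
```

Each `tu` element corresponds to one translation unit as described at the top of this article: a source segment paired with its translation.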

Consistent segmentation is crucial for retrieving the best TM match. Segmentation rules in Memsource Cloud reflect the specifics of each supported language and can be customized if needed. However, keep in mind that importing jobs with poor segmentation (e.g. poorly formatted Word files) or applying customized segmentation can lower the match value retrieved from the TM. An example is shown in the picture below. The sentence in the second and third segments was manually broken into two lines. As a result, the CAT pane shows only a 63% match, because the second half of the sentence is missing from segment no. 2.

Figure: Bad segmentation
There are two ways of selecting a translation memory for a project: you can either create a new one or select an existing one. Up to ten TMs can be assigned to each language pair in a project, so a project with two target languages can have 20 different TMs. Only two TMs can be set to Write mode. Keep in mind that a large number of large TMs can slow down analysis and pre-translation.
Memsource Cloud also allows selecting TMs whose source and target languages are the reverse of the project's languages. E.g. an en -> de TM can be selected for a de -> en project. Reverse TMs can only be selected in Read mode.
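
The limits above can be expressed as a small validation routine. This is only an illustrative sketch, not Memsource code: the `TmAssignment` type and `check_assignments` function are hypothetical names, and the sketch assumes the two-TM Write limit is counted per language pair, which the text does not state explicitly.

```python
from typing import NamedTuple

MAX_TMS_PER_LANGUAGE_PAIR = 10
MAX_WRITE_TMS = 2  # assumption: counted per language pair


class TmAssignment(NamedTuple):
    name: str
    tm_source: str  # source language of the TM itself
    tm_target: str  # target language of the TM itself
    mode: str       # "read" or "write"


def check_assignments(project_source, project_target, assignments):
    """Return a list of rule violations for one project language pair."""
    problems = []
    if len(assignments) > MAX_TMS_PER_LANGUAGE_PAIR:
        problems.append("more than 10 TMs for this language pair")
    write_count = sum(1 for a in assignments if a.mode == "write")
    if write_count > MAX_WRITE_TMS:
        problems.append("more than 2 TMs in write mode")
    for a in assignments:
        # A reverse TM has the project's languages swapped.
        is_reverse = (a.tm_source, a.tm_target) == (project_target, project_source)
        if is_reverse and a.mode == "write":
            problems.append(f"reverse TM {a.name} must be read-only")
    return problems
```

For a de -> en project, an en -> de TM passes the check in read mode and is flagged in write mode.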

Language locales
Memsource allows you to add a TM to a project that uses the same language but a different locale. Generally, all languages with the same prefix can be added (en, en_gb, en_us...). For example, if you create a de -> en TM, you can assign it to both de -> en_gb and de -> en_us projects. Keep in mind that all TUs will be stored as en only, with no distinction between US and GB, so using this TM for strictly GB or strictly US projects might produce inaccurate matches.
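
The prefix rule can be illustrated with a few lines of Python. This is a simplified sketch under the assumption that only the part of the code before the first separator matters; the function names are hypothetical and do not correspond to any Memsource API.

```python
def base_language(code):
    """Return the language prefix of a code such as 'en_gb' or 'en-US'."""
    return code.replace("-", "_").split("_")[0].lower()


def tm_matches_project(tm_langs, project_langs):
    """True when the source and target language prefixes agree."""
    tm_source, tm_target = tm_langs
    project_source, project_target = project_langs
    return (
        base_language(tm_source) == base_language(project_source)
        and base_language(tm_target) == base_language(project_target)
    )
```

Under this rule, `tm_matches_project(("de", "en"), ("de", "en_gb"))` and `tm_matches_project(("de", "en"), ("de", "en_us"))` are both true, which is exactly why the locale distinction is lost once TUs are stored.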

TM for workflow steps
When selecting TMs for a project with workflow steps, you can decide whether the same translation memories should be used for all of the project's workflow steps, or whether each workflow step should have its own TM setup.
If you use proofreading on a regular basis, it is recommended to:
  1. Create a TM for proofreaders only, where only reviewed translations will be stored (TM_REV), and always assign it as READ in the Translation step and WRITE in the Revision step.
  2. Create a TM for translation only (TM_TRA) and assign it as WRITE in the Translation step and READ in the Revision step.
  3. Optionally, set a 2% penalty on TM_TRA so that matches from TM_REV always have priority (101% matches from TM_TRA will be shown as 99% matches; see "Setting Penalties" below).
  4. After the project is completed and all translations have been reviewed and saved in TM_REV, you can delete TM_TRA.

Please note that you can first select a TM for all workflow steps and all languages, and afterwards select a specific TM for one step and one language (if needed). This does not work in the reverse order: you cannot select a TM for a specific step and language and afterwards select a TM for all steps and all languages.

Setting Penalties
If you assign multiple TMs to a project and want to prefer matches from one TM (or a few selected TMs), set a penalty on the others. As many as ten TMs can be added to a project for each language pair, and each can have a different penalty. Penalties affect both pre-translation and analysis.

In the editor's CAT pane, penalized matches are displayed in their new order (a 101% match penalized by 2% becomes a 99% match), with a small arrow indicating the penalty.
Figure: CAT pane penalized match
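
The arithmetic behind penalties is simple subtraction followed by re-sorting. The sketch below is an illustration only: `apply_penalties` is a hypothetical helper, and the TM names reuse TM_TRA and TM_REV from the workflow recommendation above.

```python
def apply_penalties(matches, penalties):
    """Subtract each TM's penalty from its match scores and sort best first.

    matches:   list of (tm_name, score, target_text) tuples
    penalties: dict mapping tm_name to a penalty in percentage points
    """
    adjusted = [
        (tm_name, max(score - penalties.get(tm_name, 0), 0), text)
        for tm_name, score, text in matches
    ]
    return sorted(adjusted, key=lambda m: m[1], reverse=True)


ranked = apply_penalties(
    [("TM_TRA", 101, "draft translation"), ("TM_REV", 100, "reviewed translation")],
    {"TM_TRA": 2},
)
```

Here the 101% match from TM_TRA drops to 99%, so the 100% match from TM_REV is listed first.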
  • 101% match – In-context match
Figure: CAT pane showing matches
An in-context match is a segment that exactly matches a segment already stored in the translation memory, including its context. A TU with a 101% match is always overwritten in the TM when the segment is confirmed in the editor. There are three types of in-context match that can be set for jobs (see TM Match Context and Optimization):
  1. Preceding and following segment - the default setting.
  2. ID context - based on the segment's key (available for specific file formats only).
  3. No context - only the source and target will be searched and saved in TM.
In-context matches are displayed in Memsource as 101% matches. To speed up translation, it is advisable to pre-translate 101% matches and even have them marked as confirmed. If you can rely on your translation memory, 101% matches no longer need to be checked by a translator.
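
To make the default context type (preceding and following segment) concrete, here is a minimal lookup sketch. It assumes each stored unit remembers its neighbouring source segments. How Memsource actually stores and compares context is not described in this article, so the data layout and the `context_match` function are assumptions for illustration.

```python
def context_match(tm, previous, source, following):
    """Return (score, target) for an exact lookup, or None if nothing matches.

    tm is a list of dicts with the keys: previous, source, following, target.
    """
    exact = [tu for tu in tm if tu["source"] == source]
    for tu in exact:
        # Same text and same neighbours: in-context match.
        if tu["previous"] == previous and tu["following"] == following:
            return 101, tu["target"]
    if exact:
        # Same text, different neighbours: plain exact match.
        return 100, exact[0]["target"]
    return None
```

The same source text therefore scores 101% when its neighbours also match and 100% when they do not.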
  • 100% match
A 100% match, on the other hand, exactly matches a segment already stored in the translation memory, but not its context. It is advisable to have 100% matches checked by a translator, since they might require a slight change because of the different context.
  • Fuzzy matches
A fuzzy match can be any match between 99% and 1%. However, low-value matches may contain content that is not applicable. The minimum match value shown in the CAT pane is called the fuzzy match threshold. The Memsource default is 60% (i.e. matches from 1% to 59% are not shown). If needed, the minimum match rate can be changed in both editors under Tools -> Preferences.

Differences between source segment and TM match are shown in the CAT pane as well. 
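
The match categories and the threshold described so far can be summarized in a short sketch. The function names are hypothetical. The 60% default comes from the text above, while the similarity scores themselves are taken as given, since this article does not describe how they are computed.

```python
DEFAULT_FUZZY_THRESHOLD = 60  # default stated in the text above


def match_label(score):
    """Name the category a match score falls into."""
    if score >= 101:
        return "in-context"
    if score == 100:
        return "exact"
    return "fuzzy"


def visible_matches(scores, threshold=DEFAULT_FUZZY_THRESHOLD):
    """Keep matches at or above the threshold, best first."""
    return sorted((s for s in scores if s >= threshold), reverse=True)


shown = [(s, match_label(s)) for s in visible_matches([59, 101, 85, 12, 100, 60])]
```

With the default threshold, the 59% and 12% matches are dropped and the rest are listed from best to worst.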
  • Subsegment matches - S
Our subsegment feature is relatively simple: it is based on 100% matches appearing inside larger segments. A word or phrase inside a segment will appear as a subsegment match only if it has previously been translated as a separate segment. For example, suppose a translator has previously translated a segment containing only “ZUSATZABKOMMEN”. That segment would then be a subsegment match for the segment “ZUSATZABKOMMEN zum Abkommen zwischen der …”
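
The idea can be sketched as a plain substring search over previously translated segments. This is an assumption made for illustration: the actual feature may, for instance, respect word boundaries. The `subsegment_matches` helper and the English target text are hypothetical.

```python
def subsegment_matches(segment, tm):
    """Return TM entries whose source occurs inside a longer segment.

    tm maps previously translated source segments to their targets.
    Results are ordered longest source first.
    """
    hits = [
        (source, target)
        for source, target in tm.items()
        if source != segment and source in segment
    ]
    return sorted(hits, key=lambda hit: len(hit[0]), reverse=True)


example_tm = {
    "ZUSATZABKOMMEN": "SUPPLEMENTARY AGREEMENT",  # illustrative target text
    "Vertrag": "contract",
}
found = subsegment_matches("ZUSATZABKOMMEN zum Abkommen zwischen der", example_tm)
```

Only the entry that was stored as a segment of its own and also occurs in the longer sentence is reported.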

Figure: Differences in fuzzy matches