Translation Resources

Translation Memories Overview

A translation memory (TM) is a database of previously translated text and is a key component of CAT tools. Text is split into smaller segments (usually sentences or titles) during segmentation. The original segment and its translation are then saved into the translation memory as a translation unit.

A TM can have several target languages but only one source language and can be used in multiple projects simultaneously.
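
The structure described above can be pictured with a small data model. This is only an illustrative sketch: the class and method names (`TranslationMemory`, `TranslationUnit`, `save`, `lookup`) are hypothetical and are not part of any Memsource API. It shows one source language, several target languages, and translation units keyed by their source segment.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class TranslationUnit:
    """One source segment with its translations, keyed by target language."""
    source: str
    targets: Dict[str, str] = field(default_factory=dict)


@dataclass
class TranslationMemory:
    """Hypothetical in-memory TM: one source language, many target languages."""
    source_lang: str
    target_langs: Set[str]
    units: Dict[str, TranslationUnit] = field(default_factory=dict)

    def save(self, source: str, target_lang: str, target: str) -> None:
        # Only languages declared on the TM can receive translations.
        if target_lang not in self.target_langs:
            raise ValueError(f"{target_lang} is not a target language of this TM")
        unit = self.units.setdefault(source, TranslationUnit(source))
        unit.targets[target_lang] = target

    def lookup(self, source: str, target_lang: str) -> Optional[str]:
        # Exact lookup only; fuzzy matching is out of scope for this sketch.
        unit = self.units.get(source)
        return unit.targets.get(target_lang) if unit else None


tm = TranslationMemory("en", {"de", "cs"})
tm.save("Save the file.", "de", "Speichern Sie die Datei.")
tm.save("Save the file.", "cs", "Uložte soubor.")
```

Both translations end up in the same translation unit, which is why reusing a segment in a later project returns the stored target for whichever language is requested.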

There are two main benefits to using a TM:

  • A TM allows for the reuse of translations. This speeds up the translation process and reduces costs.

  • A TM helps to ensure translation consistency. This is important when a client has more than one translator working on a project.

While the use of translation memories is highly recommended, there are some limitations:

  • While there are no specific limits to how many target languages a TM can have or how many translation units can be saved in a TM, very large TMs (with millions of segments) slow down the performance of searches, pre-translation or analysis and can be difficult to maintain and edit.

  • File size is limited to 1 GB for exporting and importing. A project can have multiple TMs, so it is always better to have a few smaller, well-managed TMs than one very large one.

  • A maximum of 10 translation memories can be assigned to a project/language pair.

Translation Memory Quality

A translation memory (TM) is essential for producing consistent translations and can dramatically reduce translation costs. If a TM is not set up and maintained correctly, the result is inconsistent, poor-quality translations.

Follow these rules to improve TM quality:

  1. Choose trusted providers

    Have a group of trusted providers (linguists/vendors) who deliver high-quality output that is saved to the master TM. When working with a provider for the first time or with someone whose output quality varies, consider using a secondary working translation memory where they can commit segments and keep the master TM in read-only mode. Use the master TM in Read and Write mode in later workflow steps where review is performed.

    Preventing content of questionable quality from being included in a TM is easier than removing it later.

    Suggested TM configuration:

    [Image: tm-vendor.png]

  2. Add context information to source files

    Context information allows linguists to better understand content they are translating and improves the quality of the translation. There are different options for providing context such as attaching assets as reference files to projects or adding them on the segment level. For file formats with context key and notes properties, information can be displayed on segment level in a CAT tool. Some editors can display animations and graphics from attached external links.

  3. Lock segments with high-quality matches

    Pre-translating content from translation memory and locking high-score matches (context matches) prevents unwanted changes in the TM. Excluding locked segments from analysis and quotes shared with a provider reduces translation volume and costs.

  4. Perform quality assurance and spell checks before confirming to TM

    Misspellings, missing tags, and incorrect punctuation are easily overlooked. Automated Quality Assurance (QA) checks help with this. Advanced QA checks can also verify that the correct terminology has been used, which supports translation consistency. Some tools enable segment-level QA, which prevents the provider from confirming segments and saving them into the TM while quality assurance errors remain. If segment-level QA is not available (and the check is performed at the end of the localization process), use the working TM approach.

  5. Perform linguistic quality assurance (LQA) evaluation

    LQA evaluation is used to measure and qualify the translations and errors produced. It evaluates translation quality and provides constructive feedback to the provider.

  6. Update your TM with any changes made outside of your translation management system

    If linguistic edits take place in the native format or in a content management system, they are not saved to the TM and will be overwritten by future submissions of the same content unless the TM is updated. In such a scenario, update the TM manually.

  7. Close the feedback loop

    Discuss the quality of delivered translations with the provider and allow them to see the changes made to their work. It is important to clarify expectations and review detected issues to avoid encountering them again in the future.

Create a Translation Memory

To create a Translation Memory, follow these steps:

  1. Translation Memories can be created from three places:

    • Click the plus icon next to Translation Memories in the panel menu.

    • Click New from the Translation Memories page.

    • Click Create New from the Translation Memories table on a project page.

    The Create Translation Memory page opens.

  2. Provide a Name.

    Translation Memories can be used for multiple projects so the name does not need to be specific for a project.

  3. Provide a Source Language.

    The original language of your document.

    Only one source language can be selected per translation memory.

  4. Provide a Target Language.

    The languages to be translated into.

    There can be an unlimited number of target languages in a translation memory, but a maximum of 10-15 languages is recommended. Fewer than 30 languages is still manageable, but more than 50 languages causes the TM to become slow and hard to work with.

  5. Provide business information and a note if applicable.

  6. Click Create.

    If created from a project page, the new TM is added to the list on that page.

    If created elsewhere, the new TM page opens.

Translation Memory Page

Clicking on a TM from the Translation Memories panel or a project page opens the translation memory page.

Depending on user rights:

  • The attributes of a translation memory can be edited.

  • A TM can be deleted. This moves the TM to the Recycle Bin where it will remain for 30 days.

  • Content can be searched for individual entries requiring editing.

  • Content can be imported from other TMs or other CAT tools.

  • Content of a TM can be exported for editing in another CAT tool and imported back into Memsource.

  • Previously translated files can be aligned with their originals and imported to a new TM.

The Related Projects table presents all projects a specific TM is associated with.

Note that once added, target languages cannot be removed from a TM as long as: 1.) there is a single entry in the TM (even if only in the other target languages), or 2.) the TM is used in an existing project, or 3.) the TM is used in any Project Template.

Edit Translation Memory Attributes

The attributes of a translation memory allow users to group, filter, and sort TMs. Attributes can also be used to restrict or allow access for guests, restricted Project Managers, or users with limited rights.

Attributes do not apply to translation units stored in that TM. 

To edit translation memory attributes, follow these steps:

  1. From the Translation Memories page, or a project page, click on a TM to be edited.

    The translation memory page opens.

  2. Click Edit.

    The Edit Translation Memory page opens.

  3. Make required changes to TM attributes.

  4. Click Save.

    The project page opens with the updated TM.

Assign a Translation Memory to a Project

In order to use a translation memory for analysis, pre-translation, or actual translation in a Memsource Editor, the TM must be assigned to a project.

Multiple TMs can be assigned to one project and a single TM can be assigned to multiple projects. There can be up to 10 TMs assigned to each project per language and Workflow step.

To assign a translation memory to a project, follow these steps:

  1. From a project page, click Select from the Translation Memories table.

    The Select TMs window opens.

  2. Select either All or a specific language from the dropdown list, choose whether the TM will be assigned to all workflow steps, and click Continue.

    The Select TMs for All Target Languages or Select TMs for Target Language page opens.

  3. Select TM(s) with these options:

    The first selected TM is automatically selected for Write mode. This can be changed if required. Filter can be used to search for TMs by name, ID number or client name.

    • Read

      Analysis, Pre-translate and the Editor can access the content of a TM selected as Read. Unchecking the Read checkbox unassigns the TM from the project. TMs with reversed source and target languages can only be assigned as Read-only.

    • Write

      Any segments confirmed in an editor or uploaded are saved into the TM.

      A Write TM is not required. A maximum of two Write TMs can be set per language and Workflow step in a project.

    • Source Language

      Source language of saved TMs is displayed.

    • Target Languages

      Shows all target languages associated with a TM whether used in current project or not.

    • Penalty (%)

      Set the penalty percentage for TM matches in Analysis, Pre-translate and an editor.

  4. Click Save.

    The project page opens with the assigned TMs listed in the Translation Memories table.

If a TM is no longer needed or relevant, it can be unassigned. Repeat Steps 1 and 2 above to open the overview of TMs assigned to the project, clear the checkboxes for the TM to be unassigned, and click Save. The TM no longer appears in the project overview.

Relevant TMs are displayed first and ordering is based on the Client, Domain, Subdomain and user rights of the Project Manager.
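
The assignment limits stated in this section (up to 10 TMs and up to two Write TMs per language and Workflow step) can be expressed as a simple check. The following is a sketch only: `TmAssignment` and `validate_assignments` are hypothetical names used for illustration and do not correspond to Memsource functions.

```python
from dataclasses import dataclass
from typing import List

MAX_TMS = 10        # per project, target language and workflow step
MAX_WRITE_TMS = 2   # per target language and workflow step


@dataclass
class TmAssignment:
    """One TM assigned to a project for a single language and workflow step."""
    name: str
    write: bool = False  # every assigned TM is readable; Write is optional


def validate_assignments(assignments: List[TmAssignment]) -> List[str]:
    """Return a list of problems; an empty list means the setup is within limits."""
    problems = []
    if len(assignments) > MAX_TMS:
        problems.append(f"more than {MAX_TMS} TMs assigned")
    if sum(a.write for a in assignments) > MAX_WRITE_TMS:
        problems.append(f"more than {MAX_WRITE_TMS} TMs in Write mode")
    return problems


setup = [TmAssignment("master", write=True), TmAssignment("working", write=True)]
print(validate_assignments(setup))
```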

Translation Memory Match Context

When working with 101% matches from a TM, the previous and following segments provide context that can be saved with each segment.

Context is used to determine if the match in TM is:

  • 101%

    An in-context match.

  • 100%

    Source text is a match, but context of the new text is different.

This becomes important when the context of the segment results in two different translations of the same original text.

Example:

In Czech, a female 'Project manager' is translated differently than a male 'Project manager'.

If surrounding segments create context that can be used to identify the difference, both translations are saved to the Translation Memory and are presented as a 101% match when the same context is provided.
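
The difference between a 101% and a 100% match can be sketched as follows. This is a simplified illustration, not Memsource's matching algorithm: `StoredSegment` and `match_score` are hypothetical names, only exact source matches are considered, and the surrounding sentences are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoredSegment:
    source: str
    target: str
    prev: Optional[str] = None  # source text of the previous segment
    next: Optional[str] = None  # source text of the following segment


def match_score(stored: StoredSegment, source: str,
                prev: Optional[str], next_: Optional[str]) -> int:
    """Return 101 for an in-context match, 100 for an exact match, else 0."""
    if stored.source != source:
        return 0  # fuzzy matching is out of scope for this sketch
    if stored.prev == prev and stored.next == next_:
        return 101
    return 100


# Two translations of the same source text, saved with different context.
male = StoredSegment("Project manager", "Projektový manažer",
                     prev="Mr. Novak", next="Prague office")
female = StoredSegment("Project manager", "Projektová manažerka",
                       prev="Ms. Novak", next="Prague office")

# A new segment whose neighbours match the second entry.
for entry in (male, female):
    score = match_score(entry, "Project manager", "Ms. Novak", "Prague office")
    print(entry.target, score)
```

With matching neighbours, only the entry saved in the same context is offered as 101%; the other entry is still a 100% match on the source text alone.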

Context Types

The type of context which will be saved with the segment to the translation memory is set in File Import Settings when the job is imported. Every file can be imported with different settings.

A translation memory can contain segments with different types of context:

  • Automatic

    Context type will be selected automatically based on the file type.

    • Files imported with the context type Segment Key: ANDROID_STRING, CHROME_JSON, DESKTOP_ENTRY, DTD, JAVA PROPERTIES, JOOMLA_INI, JSON, MAC_STRINGS, MOZILLA_PROPERTIES, PHP, PLIST, PO (gettext), RESJSON, RESX, TS, XML_PROPERTIES, YAML

    • Other formats will be imported with the context type Previous and next segment.

  • Previous and Next Segments

    Both the previous and next segment will be saved as context.

  • Segment Key

    The segment key or the segment ID will be saved as context. This can be specified for the above mentioned Segment key file formats and also customized for: CSV, XML, Multilingual XML and Multilingual MS Excel files.

    In some file formats, the segment key is more important than context (YAML, JSON, etc.).

  • No Context

    If context can be ignored, no context will be saved and the translation will always be overwritten by the newest version.

    No Context is also applied when the provided context is not found.

    Example:

    A key is specified in XML/JSON but the key is not found for the given segment.
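
The selection rules above can be summarized in a short sketch. The format identifiers are copied from the list in this section and treated as plain strings, which is an assumption; `automatic_context_type` and `resolve_context` are hypothetical helper names, not Memsource settings.

```python
from typing import Optional

# File types listed above as being imported with the Segment Key context type.
SEGMENT_KEY_FORMATS = {
    "ANDROID_STRING", "CHROME_JSON", "DESKTOP_ENTRY", "DTD", "JAVA PROPERTIES",
    "JOOMLA_INI", "JSON", "MAC_STRINGS", "MOZILLA_PROPERTIES", "PHP", "PLIST",
    "PO", "RESJSON", "RESX", "TS", "XML_PROPERTIES", "YAML",
}


def automatic_context_type(file_type: str) -> str:
    """Pick the context type the Automatic setting would use for a file type."""
    if file_type.upper() in SEGMENT_KEY_FORMATS:
        return "Segment Key"
    return "Previous and Next Segments"


def resolve_context(context_type: str, segment_key: Optional[str]) -> str:
    """Fall back to No Context when a segment key is expected but not found."""
    if context_type == "Segment Key" and not segment_key:
        return "No Context"
    return context_type


print(automatic_context_type("YAML"))
print(automatic_context_type("DOCX"))
print(resolve_context("Segment Key", None))
```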

Optimize Translation Memory Matches

The translation memory match can be further optimized for the imported jobs in the File Import Settings:

  • Previous OR next segment context as 101%

    If context matches in either the previous or the next segment, it will be offered as a 101% match. Default requires both the previous and next segment to match.

  • Ignore tag metadata (enabled by default)

    If the tag metadata in the job is different from the tag metadata in the TM, the difference will be ignored. The tag metadata from the job's original segment will be automatically added to the job's translated segment.

    Example:

    If the source stored in the TM contains the tag {%1} and the new source contains {%d}, the formal difference between the tags is ignored.

  • Penalize multiple 101% TM matches by 1%

    When more than one 101% match with a different target (translation) is found, all 101% matches are displayed as 100% with an arrow signaling the penalization.

    • In Pre-translate

      If the Pre-translation threshold is set to 101%, segments with multiple matches will not be pre-translated.

    • In Analysis:

      Segments with multiple 101% matches will be counted as 100% matches.
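
Two of the options above, matching on the previous OR next segment and penalizing multiple 101% matches, can be sketched as follows. This is an illustration of the described behavior only: the function names and the `either_side` parameter are hypothetical, and matches are represented as simple (score, target) tuples.

```python
from typing import List, Optional, Tuple


def context_score(stored_prev: Optional[str], stored_next: Optional[str],
                  prev: Optional[str], next_: Optional[str],
                  either_side: bool = False) -> int:
    """Score an exact source match: 101 when context matches, otherwise 100.

    By default both neighbours must match; with either_side=True one is enough.
    """
    prev_ok = stored_prev == prev
    next_ok = stored_next == next_
    in_context = (prev_ok or next_ok) if either_side else (prev_ok and next_ok)
    return 101 if in_context else 100


def penalize_multiple_101(matches: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Lower all 101% matches to 100% when they offer different targets."""
    targets_101 = {target for score, target in matches if score == 101}
    if len(targets_101) <= 1:
        return matches
    return [(100 if score == 101 else score, target) for score, target in matches]


print(context_score("A", "B", "A", "X"))                    # default: both must match
print(context_score("A", "B", "A", "X", either_side=True))  # previous OR next
print(penalize_multiple_101([(101, "Datei"), (101, "Akte"), (95, "Ordner")]))
```

After penalization no match reaches 101%, which is consistent with such segments being skipped when the Pre-translation threshold is 101% and counted as 100% matches in Analysis.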

Batch Update Translation Memories

Individual TM entries can be modified directly in the Memsource UI, but large-scale editing must be done outside of it. Appending the word update to the segment ID in the exported XLSX file triggers an update on import.

To batch update a translation memory in a spreadsheet editor, follow these steps:

  1. Export a TM to XLSX and ensure it is formatted correctly for import.

  2. Open the file in an editor.

  3. Insert two additional columns between the *ID* column and the first language column.

  4. Keeping the ID information in column A, remove the *ID* column label and place it in column C.

  5. Fill the cells in column B with the word update.

  6. In the first data cell of the ID column (C2), enter the formula =(A2&"|"&B2) and press Enter.

    Cell C2 is populated with the ID from cell A2 and the word update (from cell B2) separated by |.

  7. Copy the formula to the rest of the *ID* column.

    All *ID* column cells are populated with the appended ID information.

  8. Make required modifications to the segments.

  9. Save the file.

  10. Import the file back to Memsource.

    All segments with the appended ID are updated.

Appending the ID with the word delete instead of update will delete those segments on import.
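
The string produced by the spreadsheet formula in step 6 can also be generated in a script. The sketch below only illustrates the ID format (original ID, a | separator, then update or delete); it works on plain lists rather than a real XLSX file, the IDs and row contents are invented, and `append_action` and `mark_rows` are hypothetical helpers.

```python
from typing import List

ALLOWED_ACTIONS = {"update", "delete"}


def append_action(segment_id: str, action: str = "update") -> str:
    """Build the appended ID, equivalent to the formula =(A2&"|"&B2)."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return f"{segment_id}|{action}"


def mark_rows(rows: List[List[str]], action: str = "update") -> List[List[str]]:
    """Append the action to the ID in the first cell of every data row.

    The first row is treated as the header and left unchanged.
    """
    header, *body = rows
    return [header] + [[append_action(row[0], action)] + row[1:] for row in body]


exported = [
    ["ID", "en", "de"],
    ["abc123", "Save the file.", "Datei speichern."],
    ["def456", "Open the file.", "Datei öffnen."],
]
for row in mark_rows(exported):
    print(row)
```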

Translation Unit Metadata

A translation unit (TU) is a source segment and the target segments for all languages grouped together and saved into a translation memory.

When a segment/TU is saved, pre-defined attributes or metadata are automatically added to the source and/or target of the TU in the translation memory.

Memsource TU Attribute Metadata

These attributes can be saved with the target segment or source segment as indicated:

  • Created (date/time) - target segment

  • Created by - target segment

    Only Memsource Usernames are supported.

  • Last modified (date/time) - target segment

  • Last modified by - target segment

    Only Memsource Usernames are supported.

  • Project - target segment

  • Client - target segment

    This is the Client as set in the project when the translation was created.

  • Domain - target segment

    This is the Domain as set in the project when the translation was created.

  • Subdomain - target segment

    This is the Subdomain as set in the project when the translation was created.

  • File - target segment

    Name of the file where the translation was created.

  • Context - source segment

    This is the Previous segment, Next segment or Segment key, depending on the File Import Settings when the job was created.

Translation Memory attributes (Domain, Client, Business Unit, etc.) have no effect on translation unit metadata.

TMX Format

For importing and exporting the content of a translation memory using the TMX format, the following metadata is supported:

  • Properties in the source TUV element:

    <prop type="context_prev">Text of the previous segment</prop>
    <prop type="context_next">Text of the following segment</prop>
    <prop type="x-context_seg_key">Context Key</prop>*

    *For context based on a segment key

  • Properties in target TUV element:

    <prop type="created_at">1322746823589</prop>
    <prop type="created_by">Some name</prop>
    <prop type="modified_at">1323854662890</prop>
    <prop type="modified_by">Some name</prop>
    <prop type="project">Project name</prop>
    <prop type="client">59131</prop>
    <prop type="domain">6678</prop>
    <prop type="subdomain">5370</prop>
    <prop type="filename">File name</prop>
    <prop type="aligned">false</prop>
    <prop type="reviewed">false</prop>
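
Properties like these can be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a single target TUV element; the surrounding `<tuv>` and `<seg>` markup and the sample text are illustrative, and interpreting `created_at` and `modified_at` as Unix epoch milliseconds is an assumption based on the sample values.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Illustrative target TUV; the prop values are the samples shown above.
TUV_XML = """
<tuv xml:lang="de">
  <prop type="created_at">1322746823589</prop>
  <prop type="created_by">Some name</prop>
  <prop type="modified_at">1323854662890</prop>
  <prop type="modified_by">Some name</prop>
  <prop type="project">Project name</prop>
  <prop type="client">59131</prop>
  <prop type="filename">File name</prop>
  <prop type="reviewed">false</prop>
  <seg>Beispieltext</seg>
</tuv>
"""


def read_props(tuv_xml: str) -> dict:
    """Collect the prop elements of one TUV into a {type: text} dictionary."""
    tuv = ET.fromstring(tuv_xml.strip())
    return {p.get("type"): (p.text or "").strip() for p in tuv.findall("prop")}


def to_datetime(millis: str) -> datetime:
    """Convert a millisecond timestamp to a UTC datetime (assumed epoch-based)."""
    return datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc)


props = read_props(TUV_XML)
print(props["created_by"], to_datetime(props["created_at"]).date())
```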

XLSX Format

For importing and exporting the content of a translation memory using the XLSX format, the following metadata is supported:

  • ID - Memsource internal ID

  • {source language code} - for example 'en' or 'en_us'

  • prev - text of the previous segment

  • next - text of the following segment

  • seg_key - text of the context key

  • mdata - metadata of Memsource tags

  • {target language code} - for example 'en' or 'en_us'

  • created_by - Memsource Username

  • created_at - in format 2017.07.07 14:39:52,000

  • modified_by - Memsource Username

  • modified_at - in format 2017.07.07 14:39:52,000

  • client - Memsource ID (number)

  • project - Memsource ID (number)

  • domain - Memsource ID (number)

  • subdomain - Memsource ID (number)

  • note - text (external use only, not visible in Memsource)

  • reviewed - true/false (external use only, not visible in Memsource)

  • aligned - true/false (external use only, not visible in Memsource)

  • filename - the name of the original file (test.docx)

  • mdata - metadata of Memsource tags

The order of the columns reflects the attributes of the source segment and the attributes of the target segment.
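
When processing an exported XLSX file in a script, the created_at and modified_at values need to be parsed from the format shown above. The following sketch uses only the Python standard library; the helper names are hypothetical, and it assumes the three digits after the comma are milliseconds.

```python
from datetime import datetime

# Matches values such as 2017.07.07 14:39:52,000
XLSX_TIME_FORMAT = "%Y.%m.%d %H:%M:%S,%f"


def parse_xlsx_time(value: str) -> datetime:
    """Parse a created_at / modified_at cell value into a datetime."""
    return datetime.strptime(value, XLSX_TIME_FORMAT)


def format_xlsx_time(moment: datetime) -> str:
    """Format a datetime back into the same layout, with milliseconds."""
    millis = moment.microsecond // 1000
    return moment.strftime("%Y.%m.%d %H:%M:%S,") + f"{millis:03d}"


parsed = parse_xlsx_time("2017.07.07 14:39:52,000")
print(parsed.isoformat())
print(format_xlsx_time(parsed))
```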
