Robeco, The Investments Engineers

19-04-2023 · Insight

Quant Chart: Lost in translation

Do you like watching old Asian movies from the 60s and 70s? Perhaps you’re a connoisseur of kung fu or monster movies, popular genres from that era. If you are, and you live in a Western country, those movies will have been translated from their original Chinese or Japanese into a Western language, most likely English.

    Authors

  • Mike Chen - Head of Next Gen Research

  • Matthias Hanauer - Researcher

  • Nick Mutsaers - Researcher

If so, you may have noticed that when the actors speak, their mouths move for far longer than it takes to say the English translation, and you might even have wondered what you were missing. Of course, everyone knows that a lot of context and information – actors’ performances, accents, nuances and local cultural references – gets lost in translation when a movie is dubbed. But have you ever wondered whether investment information also gets lost in translation?

Natural Language Processing (NLP), an application of artificial intelligence, is a popular tool that is revolutionizing quantitative finance and being applied to many types of texts. However, most NLP tools are developed for texts in English. Since English is not the only language spoken around the world¹, a popular approach to process non-English texts is to translate them into English, and then apply English NLP models to the translated texts.

In recent research, Robeco discovered that just like in those old Asian movies, the above-described approach based on translated text also results in some information (alpha) being lost in translation. When a local-language-based NLP model is applied to the local-language text, additional information (alpha) can be revealed and therefore harvested.

Take, for example, Chinese investment texts. The left-hand chart in Figure 1 shows the performance of factors built from Chinese and English-based NLP engines. The good news is that both are positive, so not all information is lost in translation. However, the right-hand chart in Figure 1 shows that, of the top-quintile stocks from the Chinese NLP model, only 50% would be classified in the top two quintiles under the English NLP model.

Figure 1: English translation versus Chinese original NLP output

Source: I/B/E/S, Refinitiv, Orbit Financial Technology, Robeco. The left panel of the figure displays the return spread between the top and bottom quintile portfolios based on the NLP sentiment score using the Chinese and the English language. The right panel displays the similarity in stock classification between the two signals. More specifically, it shows the percentage of top English NLP stocks classified in the corresponding quintiles based on the Chinese language. The investment universe consists of MSCI China A index constituents. The portfolios are equally weighted and rebalanced monthly. Both panels cover the sample period from January 2013 to December 2022.
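
To make the two panels concrete, here is a minimal Python sketch of the calculations the note describes: sorting stocks into quintiles on a sentiment score, taking the equal-weighted top-minus-bottom return spread, and measuring how one signal's top quintile is spread across the other signal's quintiles. It illustrates the general technique only and is not Robeco's implementation. The function names, the stock identifiers and all numbers are hypothetical toy values, and a real backtest would repeat the sort every month to mirror the monthly rebalancing.

```python
from statistics import mean


def quintile_labels(scores):
    """Map each stock to a quintile from 1 (lowest score) to 5 (highest score)."""
    ranked = sorted(scores, key=scores.get)  # stock ids in ascending score order
    n = len(ranked)
    return {stock: (i * 5) // n + 1 for i, stock in enumerate(ranked)}


def top_minus_bottom(scores, next_returns):
    """Equal-weighted return of quintile 5 minus that of quintile 1."""
    labels = quintile_labels(scores)
    top = [next_returns[s] for s, q in labels.items() if q == 5]
    bottom = [next_returns[s] for s, q in labels.items() if q == 1]
    return mean(top) - mean(bottom)


def top_quintile_overlap(scores_a, scores_b):
    """Share of signal A's top-quintile stocks that fall in each quintile of signal B."""
    labels_a = quintile_labels(scores_a)
    labels_b = quintile_labels(scores_b)
    top_a = [s for s, q in labels_a.items() if q == 5]
    return {q: sum(labels_b[s] == q for s in top_a) / len(top_a) for q in range(1, 6)}


# Hypothetical toy data: ten stocks, two sentiment signals, one month of returns.
scores_cn = {f"S{i}": float(i) for i in range(10)}
scores_en = dict(scores_cn, S7=8.0, S8=7.0)  # the translated signal swaps two ranks
next_returns = {f"S{i}": i / 100 for i in range(10)}

print(top_minus_bottom(scores_cn, next_returns))
print(top_quintile_overlap(scores_cn, scores_en))
```

With a single sort per month, chaining the monthly spreads gives a series like the left panel, while the overlap shares correspond to the bars in the right panel.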

The imperfect overlap shows that the two models select different stocks. As with those old Asian movies from the 60s and 70s, information may be lost in translation. To fully grasp the nuances of a movie’s dialogue, it is worth watching the film in the original language, if possible. And to fully understand what is being communicated in an investment text, it may be worth reading the text in its original local language.

As technology advances, so do the opportunities for quantitative investors. By incorporating more data and leveraging advanced modelling techniques, we can develop deeper insights and enhance decision-making.

Footnote

1 English is spoken natively by only about 400 million people around the world, or ~5% of the global population.

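Important information

This document was prepared for informational purposes and is either an English-language document prepared by Robeco Institutional Asset Management B.V. or a translation of that English-language document by Robeco Japan Company Limited. It is not intended as a solicitation or recommendation to buy or sell any individual financial instrument mentioned in it. The information contained herein is believed to be sufficiently reliable, but its accuracy and completeness are not guaranteed. Opinions and outlooks are based solely on our judgment as of the date of preparation and are subject to change without notice. Investment performance, market trends, opinions and the like relate to a past point in time or a past period, and past performance does not guarantee or indicate future investment results. In addition, the investment policies and strategies described may not be suitable for all investors. This document is not intended to provide legal, tax or accounting advice. When entering into a contract, please consult a professional as necessary and make the final decision yourself. The valuation of assets under management fluctuates with the prices of the securities held, movements in financial market prices and interest rates, and the creditworthiness of the issuers of the securities held as reflected in their financial condition. Investments in foreign-currency-denominated assets are also affected by exchange-rate movements. All profits and losses arising from the investment belong to the investor. Accordingly, neither the invested principal nor any particular investment result is guaranteed, and losses may exceed the invested principal. Fees or remuneration for the financial instruments business we conduct vary with the type of contract concluded and the amount of contracted assets, so they may not be stated in this document and may be presented separately. For the specific amounts and calculation methods of fees or remuneration, please contact our representative. The rights to this document and to the information and products described in it belong to us. Accordingly, please refrain from reproducing or otherwise distributing all or part of it without our written consent. Trade name: Robeco Japan Company Limited, Financial Instruments Business Operator, Director-General of the Kanto Local Finance Bureau (Kinsho) No. 2780. Member association: Japan Investment Advisers Association.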