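The Database of Neologisms in Modern Chinese (《現代漢語新詞資料庫》) collects and organizes modern Chinese neologisms, with particular attention to etymological research and the practical use of historical sources. By modern Chinese neologisms we mean not only new terms that arose within Chinese itself but also words that may derive from Japanese. The word "may" signals that, despite the preliminary results scholarship has achieved, we still expect new discoveries. The role of Japanese loanwords in modern Chinese has been a winding one: at first they were ignored, and later they came to be taken for granted. Both views are open to question. Japanese loanwords have indeed had a far-reaching influence on the Chinese lexicon, but the paths by which they spread and the intellectual impact they triggered are not as self-evident as we tend to assume. Whether the object is an individual etymology or the overall history of transmission, understanding Japanese loanwords still requires interdisciplinary integration. Japanese loanwords are not only a linguistic phenomenon; they also involve the cross-cultural exchange of knowledge and the transmission of concepts. For scholars of East Asian modernization over the past century, a grasp of conceptual history is essential, and Japanese loanwords played a key role in that process. Since they first entered the Chinese lexicon some 120 years ago, research on Japanese loanwords has produced results, but these results do not yet seem to have been tested against historical fact. So far there are few studies that verify claims about Japanese loanwords against concrete historical sources. This database aims to foster direct dialogue and mutual verification between academic research and historical sources, in order to show what Japanese loanwords really are.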
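The Chinese long held the conviction that their language was especially pure and full of mystery; after all, China had been East Asia's political leader and moral authority since antiquity and had deeply influenced the cultures of neighboring countries. By the late Qing, however, as the Chinese gradually realized that even a small neighboring country could influence Chinese in return, this realization seems to have posed a serious challenge to their self-identity and provoked bitter self-mockery. In this context, neologisms were often treated as a convenient scapegoat. In his book "Blind Men, Blind Horses" (《盲人瞎馬》), Peng Wenzu (彭文祖) attacks new terms from the preface to the last page: "Alas! Little do people know that new terms act like ghosts and demons, harming the state and bringing calamity on the people, opening the way to omens of national ruin and racial extinction, to an extent beyond all reckoning" (「吁嗟乎,殊不知新名詞之為鬼為祟,害國殃民以啓亡國亡種之兆,至於不可紀極也。」). This reflects a deep cultural anxiety: the belief that the introduction of new words might foreshadow a crisis of the state and the nation.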
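In fact, cultural and linguistic exchange is not limited to vocabulary, nor did it begin with Japanese loanwords. It goes back to the translation of Buddhist scriptures and the introduction of the religion, and it includes the deep imprint that foreign rulers left on Chinese, which is why Northern Mandarin and the southern dialects still differ markedly in their phonological development. Moreover, in the late Ming, Western ideas and words arrived with the Jesuits. Yet outside influence on the Chinese lexicon reached an unprecedented peak from the early 1900s onward.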
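With the return of Chinese students from Japan, large numbers of Japanese loanwords poured into the Chinese lexicon. The modern Japanese dictionaries the students brought home seemed harmless at first glance, but Chinese society, long dormant, had developed an intense curiosity and thirst for new knowledge. Every newly introduced word brought a new idea, and every new idea added to the pressure to modernize, pushing society to transform itself and explore new possibilities. It is easy to see that the introduction of the word "revolution" (革命) not only made change possible but made it inevitable: the word was a breakthrough at the level of thought, and the force it released directly shaped the founding of the Republic of China and the developments that followed.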
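Research on Chinese neologisms has continued since the early 20th century; the publication of the *Xinerya* (《新爾雅》) in 1903 marks an important beginning for the field. However, because Western linguistic methods were still lacking and because of various political and social constraints, the study of Japanese loanwords did not begin in earnest until after the war.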
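After the war, scholarly interest in Chinese neologisms grew. Scholars began trying to identify which words came from Japanese and to explore which factors might help build a practical classification system. At first, Chinese scholars' responses were shaped by the Western loanword framework, which identifies loanwords chiefly by their degree of phonetic similarity, using a set of principles of sound change. This approach led early scholars of modern Chinese, such as Wang Li (王力), to take an indifferent or even dismissive attitude toward Japanese loanwords. In addition, because Japan had inherited Chinese characters since the Tang dynasty, some scholars conflated characters with words and held that Japanese loanwords were essentially native to Chinese. This view ignored the semantic changes the loanwords had undergone and underestimated the key role of the creative process.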
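But Chinese linguistics did not stand still. Especially in historical lexicology, more and more scholars came to accept that, because Chinese characters were widely used as East Asia's shared written medium (a "scripta franca"), words tended to spread through writing rather than speech. Relying on phonetic similarity alone to identify loanwords was therefore increasingly questioned, and scholars began to treat new character combinations (word forms), morpheme-by-morpheme comparison, and semantic change as important criteria for identifying loanwords. Moreover, the differences in meaning between modern and classical Chinese are now widely recognized, and many so-called backloans (回歸詞, homographs carrying new meanings) can in fact be regarded as new words.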
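From the postwar period to 2000, research on Japanese-made Chinese words (和製漢語) developed its own terminology and methodology. Milestones were Wang Lida's (王立達) 1958 study of loanwords and the "Dictionary of Chinese Loanwords" (《漢語外來詞詞典》), published in 1984; these works formally recognized the importance of the Japanese loanwords found among Japanese-made Chinese words. Research in this period built a distinctive framework, adopting methods different from those of the West and adapted to the properties of Chinese characters. During this period the scholarly community reached a rough consensus on dividing Japanese loanwords into three types: backloans (回歸詞, e.g. 經濟 "economy"), transliterations (音譯詞, e.g. 雷達, transliterated from English 'radar'), and calques (意譯詞, e.g. 蜜月, translated from English 'honeymoon'). Backloans are words that already existed in classical Chinese texts but were given new meanings in Japan; transliterations are new words that reflect the pronunciation of a foreign word; and calques, also called loan translations, are words that copy the morpheme structure of the source word in part or in full.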
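In contemporary Chinese linguistics, the status of Japanese loanwords is largely undisputed. However, because classification methods have not been unified and scholars differ in their command of the historical sources, views on many individual words remain contested. This is especially true of loanword subtypes such as hybrids, for example loanwords combined with Chinese morphemes, or words coined in direct imitation of a Japanese word. Recently, Shen Guowei (沈國威) proposed the new concept of "activated words" (激活詞): Chinese words that were once little used but came into frequent use under Japanese influence.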
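Yet for all its innovations, research on loanwords in modern Chinese has internal limitations. The main problem is that most studies stay within the confines of Chinese and rarely touch on Japanese, let alone other languages, as if loanwords had never crossed seas, mountains, or steppes. Apart from Chen Liwei (陳力衛), who publishes most of his work in Japanese, and a few scholars teaching abroad, most loanword studies seldom address so-called deep etymology. This contrasts sharply with Western etymology, which not only spans many languages (breadth) but also tries to trace words and even morphemes back to their Indo-European roots (depth). By way of comparison, if the classical Chinese period is likened to the Roman period, then the reconstructed Indo-European period lies several thousand years before the Xia dynasty. The difference between Eastern and Western etymology may stem from the long-standing mutual non-interference between Chinese etymology (word-based) and the Chinese study of script (character-based). This separation has given each field its own methods and emphases and may, to some degree, have held back the development of Chinese etymology.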
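Against this background, many issues remain to be explored, and this is why the database was developed. Here we briefly outline the points still under scholarly debate. The most crucial is the question of authorship (造詞者) in historical lexicology. In a sense it is understandable that the question has been neglected, since it arises almost exclusively with East Asian neologisms and much less in the context of Western loanwords. The core problem is that Western concepts and their words are hard to record directly in Chinese characters: a change of writing system is unavoidable, and because Chinese characters themselves carry meaning, the problem becomes more complicated still.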
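Specifically, even with the most direct form of borrowing, transliteration, the new word must undergo considerable phonetic adaptation, and Chinese characters are not the best tool for recording sound. Characters can be pressed into service to represent pure sound, as when 'Gypsy' is transliterated as 吉卜賽人, but the result often feels awkward. In practice, Chinese usually avoids this translation strategy. Furthermore, from the mid-19th century until before the war, transliteration was often regarded as an inelegant way of translating (compare Yan Fu's famous remark 「一名之立,旬月踟躕」, "to establish a single term, I hesitated for weeks and months"). It was also seen as an admission that countries such as China or Japan lacked the corresponding ideas and words, an admission that seemed to signal inferiority and was therefore unacceptable and intolerable.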
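Since transliteration was not the preferred borrowing method, only two others remained: calquing (意譯, also called 仿造), which imitates the structure of the source word morpheme by morpheme; and conceptual remaking, which expresses the foreign concept with a custom-made morpheme structure. These methods allowed foreign words to enter the East Asian lexicon in a more elegant and culturally acceptable way. To Chinese speakers without specialist training, such loanwords rarely arouse suspicion and feel like ordinary Chinese vocabulary.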
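However, calquing can run into trouble when the source word has an unusual or irregular structure. Examples include words from less familiar languages, such as 'kangaroo' (from an Aboriginal language of northeastern Australia) or 'giraffe' (from an unknown African language into Arabic *zarafa* and then Italian *giraffa*); words with a complex Latin or Ancient Greek morpheme structure (such as 'metamorphose'); and words whose meaning has shifted through metaphor and lost its direct link to the original morpheme structure, such as biological 'cell', which originally meant a small room (Latin *cella*). Biological 'cell' was not calqued as 小室 or 小間; instead an entirely new term, 細胞, was created. Scientific terms such as the names of chemical elements ('hydrogen', literally "water-former") and modern technical vocabulary ('satellite', from Latin *satelles*, "attendant") pose challenges as well, mainly because the literal meaning often differs markedly from the extended meaning, so that a literal translation into Chinese makes no sense.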
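In such cases the most demanding borrowing method was used: creating a new word form on the basis of the foreign word's meaning. Scholars who play down meaning and emphasize word form generally do not accept conceptual remaking as a valid borrowing method. They argue that semantic borrowing is hard to prove and that two words almost never have exactly the same meaning, least of all across languages. That is true. But when we study how words are converted between different writing systems, it cannot be denied that the formation of a new word is often a conscious, purposeful act of translation. The meaning is borrowed; the word form is newly coined. So although conceptual remaking is the most tenuous of all loanword types, we still count it as borrowing.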
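Conceptual remaking requires the translator to understand the nuances and complexities of the target language in depth and to make many fine adjustments. From around the time of the Meiji Restoration, Japan, through the collective effort of a knowledge industry, used Chinese characters to coin new words and translated Western knowledge on a large scale, including modern vocabulary and concepts. It was the outstanding learning and linguistic sensitivity of these early Japanese scholars that made possible some 2,000 to 3,000 new words, carefully crafted for the modern age. This was not only a linguistic achievement; it also built a cultural bridge that helped bring Western concepts into East Asian languages in a meaningful and accessible way.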
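While the major contribution of Japanese scholars should be acknowledged, we should also keep a fair view of history: not all new words were created by the Japanese themselves. Besides the Chinese translation terms from Buddhist scriptures mentioned above, another important group of new words, created by Western missionaries in the second half of the 19th century, must not be overlooked. In dictionaries, textbooks, essays, and religious texts, the missionaries produced many important historical documents in Chinese. These documents were vital to education and mission work, and they are naturally among the valuable research materials we draw on today.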
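Although many Western works were coldly received in China at the time, or were even lost without a trace, some were taken to Japan and became important reference works for translating Western knowledge into Japanese. Examples are the early scientific journal *Liuhe Congtan* (《六合叢談》) of Alexander Wylie (偉烈亞力, 1815-1887) and the "English and Chinese Dictionary" (《英華字典》) compiled by Wilhelm Lobscheid (羅存德, 1822-1893) between 1866 and 1869. The Japanese bought up the publisher's entire stock of the dictionary and distributed it to translation offices throughout the country, and its influence on Japanese scholarship cannot be ignored. This historical situation produced a unique phenomenon: Chinese words originally created by Westerners were absorbed into Japanese and then returned to Chinese, mistaken for Japanese loanwords. This borrowing path highlights the complex reality of cultural and linguistic exchange and deepens our understanding of how closely etymology is interwoven with the historical interactions among China, Japan, and the West. The Western sources of these so-called Japanese loanwords in Chinese are discussed in the author's article "Western Origins of Japanese Loanwords in Chinese: Academic Evaluation and Lexical Resource Construction" (《漢語中的西源和製漢語:學術評析與辭典資料建置》).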
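Given this historical background, the key question is how to distinguish Chinese words created by Westerners from genuine Chinese neologisms. In the current standard framework, words created by Westerners are usually called "missionary translation terms" (傳教士譯詞) or something similar. Since the main criterion of this classification is the word's author, we propose using authorship as the basic yardstick for assessing the nature of a loanword. The concept helps distinguish loanwords created by Japanese, Chinese, and Westerners, and its criterion is objective and clear. It also overcomes an awkward criterion common in earlier research, namely judging whether a word has "Chinese characteristics". As we now recognize, a word may display fully Chinese characteristics and still be a loanword, because it was created not by Chinese but by Westerners or Japanese who deliberately made the new word conform to the principles of Chinese word formation.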
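This way of establishing authorship requires, to some extent, establishing the origin of each word. We therefore distinguish word types mainly by the earliest record (最早書證): did the word first appear in a Chinese text written by a Western author, in a bilingual dictionary, in a text by a Chinese author, or in Japanese sources? By tracing a word back to its first record we can infer the identity of its author and so tie together author, language, and the creation of the word. Although we cannot entirely escape the need to understand how the word's meaning evolved historically, the earliest record provides a more objective basis for assessing loanword status.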
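When the earliest record is used to determine authorship, the record itself bears a heavy burden of proof, so assessing the reliability of the historical record becomes essential. Specifically, we need to assess whether the earliest record currently in the database for a given word really is the word's earliest attestation, and how likely it is that future research will uncover an earlier one. Absolute certainty is out of reach, but by analyzing the distribution of records for all words in the database between 1600 and 1920, and examining the distribution of records for each individual word, we can apply the statistical method of kernel density estimation to calculate a specific likelihood. Kernel density estimation is often used in time-series analysis, so it can supply, for each word, a figure that tells us how much confidence the earliest record deserves. In some cases we find an earliest record that differs from the one accepted in scholarship and on that basis judge that the word is or is not a loanword; whether that judgment is persuasive again depends on the reliability of the earliest record.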
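This method marks a shift away from the subjective judgments common in earlier loanword classification. In our database we acknowledge that the history of a loanword's reception is complex and may be affected by new discoveries. As new data are added, the classification of some words may change, but this also helps improve the overall robustness of the classification.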
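In recent years, the systematic organization of near-synonyms has also been an important development in loanword etymology. In principle, near-synonyms can be organized into so-called synsets, of which there are two main kinds: synchronic synsets contain contemporary words of similar meaning, while diachronic synsets include earlier words or words with earlier meanings. The diachronic study of near-synonyms is especially valuable because, in the initial confusion when a new term is taking shape, no consensus on a particular word form yet exists and several near-synonyms may appear side by side, which offers a glimpse of the thinking behind different attempts at word formation. Over time a consensus emerges: some terms become mainstream and others are gradually dropped. The length of this process varies from term to term; some need several generations to reach consensus, while others establish themselves within a few years. In this respect, "Modern and Contemporary Chinese Etymology" (《近現代漢語辭源》), compiled by Huang Heqing (黃河清), makes an important contribution to the study of historical synsets.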
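The database covers both synchronic and diachronic synsets, and in addition we have built our own diachronic synsets. Although including synsets was planned from the start, building our own diachronic synsets was not part of the original intention. It came about naturally because our resources include many bilingual dictionaries, and so the database now has a fairly rare set of diachronic multilingual synsets that capture the evolution and interconnection of concepts across languages and periods.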
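The database includes about 600,000 modern Chinese terms from online resources of Taiwan's Ministry of Education, covering a wide range of academic fields.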
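The academic research section of the database draws on the following studies:
- "The Chinese Loanword Dictionary" (《漢語外來語詞典》), Cen Qixiang (1984)
- "List of Japanese-Made Chinese Words" (〈和製漢語一目覽〉), in 《漢字百科大事典》, Sato Takeyoshi (1996)
- "Seeds of Aesthetic Consciousness: The Influence of Japanese-Made Chinese Words on Modern Chinese Literature" (《美意識的種子—和製漢詞對中國現代文學的影響》), Zhou Shenglai (2016)
- "A Study of Japanese Loanwords from the Late Qing and Early Republic and from the Reform and Opening-Up Era and Their Sinicization" (《清末民初和改革開放從來的日源借詞及其漢化研究》), Qu Zirui (2016)
- "Intertwined Cultural History: A Draft History of Early Missionary Sinology" (《交錯的文化史 – 早期傳教士漢學研究史稿》), Zhang Xiping (2017)
- "Modern and Contemporary Chinese Etymology" (《近現代漢語辭源》), Huang Heqing (2020)
- "Studies on Loanwords in Modern Chinese" (《現代漢語外來詞研究》), Gao Mingkai and Liu Zhengtan (1958)
- "Words Borrowed from Japanese in Modern Chinese" (《現代漢語中從日語借來的詞彙》), Wang Lida (1958)
- "A History of Chinese Students in Japan" (《中國人留學日本史》), Saneto Keishu (1981)
- "Dictionary of Chinese Loanwords" (《漢語外來詞詞典》), Gao Mingkai, Liu Zhengtan, Mai Yongqian, and Shi Youwei (1984)
- "Translingual Practice" (《跨語際實踐》), Liu He (1995)
- "An Etymological Dictionary of Modern Chinese Neologisms" (《近現代漢語新詞詞源詞典》) (2001)
- "Studies in the History of Ideas: The Formation of Important Modern Chinese Political Terms" (《觀念史研究 - 中國現代重要政治術語的形成》), Jin Guantao and Liu Qingfeng (2010)
- "Research on the Vocabulary of Beijing Mandarin Textbooks in Meiji-Era Japan" (《日本明治時期北京官話課本詞匯研究》), Chen Ming'e (2014)
- "A Study of Japanese Loanwords in the Xiandai Hanyu Cidian" (《現代漢語詞典中的日語藉詞研究》), Morita Satoshi (2016)
- "A History of Loanwords in Ming-Qing Chinese" (《明清漢語外來詞史研究》), Zhao Ming (2016)
- "Xinhua Loanword Dictionary" (《新華外來詞詞典》), Shi Youwei (2019)
- "A Study of Modern Chinese Two-Character Words: Language Contact and the Modern Evolution of Chinese" (《漢語近代二字詞研究:語言接觸與漢語的近代演化》), Shen Guowei (2019a)
- "Establishing a Single Term Takes Weeks and Months of Hesitation: A Study of Yan Fu's Translated Terms" (《一名之立旬月踟躕:嚴復譯詞研究》), Shen Guowei (2019b)
- "A Study of Modern Sino-Japanese Lexical Exchange: The Creation, Reception, and Sharing of New Kanji Words" (《近代中日詞彙交流研究:漢字新詞的創制、容受與共享》), Shen Guowei (2020)
- "Lexical Studies of the Late Qing and Early Republic" (《清末民初詞彙研究》), Zhang Ye (2019)
- "From East to East: Word Concepts between Modern China and Japan" (《東往東來–近代中日之間的語詞概念》), Chen Liwei (2019)
- "The Trajectory of Modern Sino-Japanese Lexical Exchange" (《近代中日詞彙交流的軌跡》), Zhu Jingwei (2020)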
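As for search, the database offers a variety of search options that let users find and compare words, including unfamiliar ones, according to their own interests and intuitions, thereby encouraging scholars to explore the contexts and paths of borrowing independently.
As for semantics, we provide translations of each Chinese term in several languages, chiefly translations from actual historical sources such as the various English-Chinese dictionaries. In addition, given the importance of the Chinese translation terms from Buddhist scriptures, we provide data from the following three Buddhist dictionaries:
- "Dictionary of Buddhist Studies" (《佛學大辭典》), compiled by Ding Fubao (1922)
- "Pentaglot Dictionary of Buddhist Terms" (《五譯合璧集要》), Raghu Vira (1961)
- "A Glossary of Kumārajīva's Translation of the Lotus Sutra", Seishi Karashima (2001)
These three dictionaries are publicly accessible at https://glossaries.dila.edu.tw.
All Japanese vocabulary, translations, text sources, and date data come from the Chunagon (中納言) database of the National Institute for Japanese Language and Linguistics and are used under license.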
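The database is intended as an integrated platform for research on neologisms in modern Chinese. At present we believe that scholarship still has room for discovery, and needs sustained work, in the following areas:
- Integration of basic data
- Dialogue between scholarship and historical sources
- Integration of Chinese translation terms from Buddhist scriptures (Sanskrit terms) with Japanese loanwords
- Comparison of early and modern academic research
- Analysis of Chinese texts by Western authors and the new words they contain
- Unification and integration of the various earlier classifications and criteria
- Further development and integration of objective methods for classifying loanwords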
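At present, access to three functions of the database is restricted:
- Generating word lists: generating word lists by specific criteria (such as scholar, time, or language).
- Research data: generating word lists for specific scholars.
- Historical sources: querying the word lists of individual historical sources and the related image files.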
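The ultimate aim of the database is to bring about dialogue and mutual verification between academic evaluation and historical sources. We provide several quantitative measures, chiefly statistics on whether scholarship generally regards a word as a loanword and on how strong the modern scholarly consensus is for each individual word. Most importantly, academic evaluation can then be compared directly with the historical evidence. In some cases the historical evidence supports the academic classification; in others it points to a different classification. All this information can easily be consulted in the summary for each word.
The database was established under the sponsorship and guidance of Assistant Research Fellow Chen Jianshou (陳建守) of the Institute of Modern History, Academia Sinica. We are also very grateful to the Institute's Digital Humanities team for its long-term technical support, and we will continue to enrich the database's content. If you find any shortcomings, please contact us.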
The Database of Neologisms in Modern Chinese focuses on the etymology and development of new terms in Chinese, particularly those generally discussed as potential Japanese loanwords. "Potential" here refers to the fact that research on these words and concepts is still ongoing. The role of Japanese loanwords in modern Chinese has a complex history: they were long downplayed and later taken for granted. Both positions are questionable. Japanese loanwords do constitute a significant part of the Chinese lexicon, but their status cannot simply be taken for granted. Neither the individual etymologies nor their collective history of transmission is well understood. For anyone concerned with the modernization processes that unfolded over the last century in East Asia, understanding the conceptual history is crucial, and Japanese loanwords play a major role in this context. Since their influx into the Chinese lexicon 120 years ago, the debate on Japanese loanwords has undergone significant changes and spurred a variety of research. However, these studies have not yet been tested against historical facts. This database aims to reconcile two perspectives: academic research and actual historical sources.
The Chinese long held the belief that their language was especially pure, authentic, and original, since Chinese culture had profoundly influenced its neighbors over extended periods. It was thus particularly painful to confront the reality that the Chinese language had been influenced by its smaller neighbors just as much as the other way around. This influence began with the introduction of Buddhist vocabulary and religion, and it continued as non-native rulers and cultures left a profound imprint on the Chinese language. It was not limited to absorbed loanwords but went so far that the northern dialect underwent significant phonological changes compared to its southern variants. Then, with the arrival of the Jesuits at the end of the Ming Dynasty, Western ideas and words started to enter the Chinese lexicon. However, the most substantial and dramatic impact on Chinese vocabulary came from the influx of Japanese loanwords, which began with Chinese students returning from Japan from 1900 onward. Bringing back a modern Japanese dictionary to a China dormant for centuries might have seemed harmless at first glance. However, each new word introduced a new idea, and every new idea increased the pressure to modernize, to transform society, and to explore new possibilities. A revolution became not only possible but inevitable once the word and the idea of revolution suddenly became accessible.
Research into Chinese neologisms has been ongoing since the early 20th century, notably marked by the publication of the Xinerya (《新爾雅》) in 1903. However, due to the absence of Western linguistic methodologies and various political and societal factors, the study of Japanese loanwords in Chinese did not evolve into a legitimate research field until after the war.
After the war, there was renewed interest in Chinese neologisms, with efforts to gain a clear understanding of which terms were borrowed from Japanese and which factors might support a useful categorization system. Initially, the Chinese response was heavily influenced by the Western loanword framework, which primarily identifies loanwords through phonetic similarities. This approach led early scholars of modern Chinese, such as Wang Li, to dismiss the idea that Japanese loanwords could be considered loanwords at all. The adoption of Chinese characters by the Japanese centuries earlier further contributed to minimizing or outright rejecting the significance and status of Japanese loanwords in Chinese. The role of semantic change in these loanwords was not fully acknowledged in these studies.
But Chinese linguistics did not stop developing. Particularly in historical lexicology, it became more widely accepted over time that due to the extensive use of Chinese script as a "scripta franca" in East Asia, words were often transferred via writing rather than spoken language. Consequently, the emphasis on phonetic similarity began to diminish, and factors such as new character combinations (word form), comparing words morpheme-by-morpheme, and finally semantic change came to be recognized as important criteria for identifying loanwords. Furthermore, comprehensive comparisons between the modern and classical meanings of Chinese words revealed that many so-called backloans (homographs that have acquired new meanings) could in fact be considered new words, as their modern meanings are partly or entirely distinct from their classical counterparts.
From the post-war period up to 2000, research on Japanese-made Chinese words (和製漢語) developed its own terminology and methodology. Milestones in this regard were Wang Lida's work on loanwords in 1958 and the publication of the "Dictionary of Chinese Foreign Words" in 1984, which formally recognized the status of Japanese-made Chinese words as Japanese loanwords. Studies during that period established distinctive frameworks that applied methods different from their Western counterparts and were adapted to problems unique to Chinese characters. During this period, the academic community reached a certain consensus on the classification of Japanese loanwords, broadly dividing them into three categories: **backloans** (回歸詞), **transliterations** (音譯詞), and **calques** (意譯詞). Backloans are words that originally existed in ancient Chinese texts but were given new meanings by the Japanese, such as "經濟" (economy); transliterations are new terms that reflect the pronunciation of a foreign word, such as "雷達" (radar); calques, also known as loan translations, are words that copy the morpheme structure of the source word, either partially or wholly, such as "蜜月" (honeymoon).
In contemporary Chinese linguistics, the status of early modern Japanese loanwords is largely uncontested. However, differences of opinion persist on a word-by-word basis, mostly because scholars have different access to historical sources. This is particularly true of certain subtypes of loanwords: mixed types, where loanwords are combined with native morphemes, and cases where genuinely new Chinese terms have been created on the model of Japanese words. Shen Guowei recently introduced the notion that frequency of use also plays a role: some words never truly vanished from the Chinese lexicon but became more popular due to Japanese influence.
However, modern studies also have their limitations. The most fundamental problem of all loanword studies in the Chinese language is that they are confined within the borders of the Chinese language and only in rare cases include Japanese, as if loanwords would have trouble crossing water, mountains, or the steppe. With the exception of Chen Liwei, who mostly publishes in Japanese, loanword studies do not consider deep etymologies. This is a stark contrast to Western etymology, where words are traced back many steps up to Indo-European roots.
In this context, there remain many topics to explore, which is why the current database has been developed. Here, we will briefly outline areas of ongoing academic debate. Most importantly, the issue of authorship in historical lexicology has not been fully acknowledged. This oversight is somewhat understandable, given that the issue is almost unique to East Asian neologisms and is far less pertinent in the Western context. At the heart of the issue is the fact that Western ideas, and the words that express them, cannot easily be written in languages whose script is based on some variant of Chinese characters.
Even when the most direct borrowing method, phonetic transliteration, is applied, the new words undergo significant phonetic adaptation, and Chinese characters are not the best tool for writing phonetically. Although characters can be used to represent sound alone, this is a rather awkward choice, and in practice Chinese typically avoids the strategy. Furthermore, in the mid-19th century, phonetic transliteration was often viewed as a low-quality translation method. It was also seen as an admission that equivalent ideas and words were lacking in, for example, China or Japan, an acknowledgment associated with inferiority that was unthinkable and intolerable.
Since phonetic transliteration was not, and still is not, a favored borrowing technique, only two methods of linguistic transmission remained viable: calquing, which involves imitating a foreign word morpheme by morpheme, and conceptual remaking, where the foreign idea is expressed using a custom-made morpheme construction. These methods allow for a more nuanced and culturally acceptable integration of foreign terms into the East Asian linguistic landscape. More often than not, to the untrained native eye, these loanwords go undetected and feel like a normal part of the lexicon.
However, calquing becomes difficult when the source word has an opaque morpheme structure. This is the case when the word derives from a language little known at the time, as with ‘kangaroo’ or ‘giraffe’; when it has a complex morpheme structure from Latin or Greek; or when its meaning has shifted metaphorically and is no longer tied directly to its morphemic components, as with (biological) ‘cell’, which originally meant a small room (Lat. *cella*). (Biological) ‘cell’ was not calqued as ‘小室’ or ‘小間’ but rendered by an entirely new construction, *xibao* (細胞). Furthermore, specialized vocabularies in the natural sciences, e.g. the names of chemical elements (‘hydrogen’, literally ‘water-former’), or in modern technology (‘satellite’, from Lat. *satelles*, an ‘attendant’), also presented challenges, as the literal meaning often diverged significantly from the extended meaning.
In these cases, the most challenging and definitive method of borrowing was employed: creating a new word form based solely on and triggered by the foreign word's meaning. Authors who downplay the importance of meaning and stress the notion of word form generally do not accept conceptual remakes as loanwords. Their argument is that semantic borrowing is hard to prove, not least because semantic equivalence is almost never possible between two words, let alone between two words in different languages. Although this is true, in the case of written borrowing between languages with different script systems, if the creation of a new word can be credibly associated with foreign sources and is documented as a translation, it can be assumed to be a motivated creation. Although conceptual remakes are the weakest form of borrowings, they are, as we think, loanwords.
Conceptual remaking requires, on the part of the translator, a deep understanding of the intricacies and nuanced details in the target language, alongside numerous iterations of fine-tuning. Starting shortly before and during the Meiji Restoration, there was a significant intellectual effort to create new words using Chinese characters, explicitly intended to convey Western modern meanings. The remarkable knowledge and linguistic sensitivity of these early Japanese scholars led to the creation of approximately 2,000 to 3,000 new words, meticulously crafted for use in our modern era. These efforts were not only a linguistic achievement but also a cultural bridge that helped integrate Western concepts into East Asian languages in a way that was both meaningful and accessible.
Notwithstanding the contributions of Japanese scholars, it's crucial to acknowledge that not all new terms were solely their own creation. Apart from the integration of Chinese neologisms stemming from translations of Buddhist texts—a subject we already briefly mentioned—another significant subset of new words originated from Western missionaries during the latter half of the 19th century. These missionaries produced a range of materials, including dictionaries, textbooks, essays, and religious texts, all written in Chinese with the aim of educating and evangelizing.
While many Western contributions were mostly ignored by their Chinese counterparts at the time or lost to history, some of their works made their way to Japan, where they became essential reference materials for translating Western knowledge into Japanese, for example Wylie’s *Liuhe Congtan* (六合叢談). This historical circumstance led to the unique phenomenon where Western-made Chinese words were first incorporated into Japanese and then, disguised as Japanese loanwords, re-entered the Chinese lexicon. This pathway of lexical transmission underscores a complex layer of cultural and linguistic exchange, where the origin of words is intertwined with historical interactions among China, Japan, and the West. These so-called Western sources of Japanese loanwords in Chinese have been thoroughly investigated in the study "Western Origins of Japanese Loanwords in Chinese: Academic Evaluation and Lexical Resource Construction" by the author.
Given this historical context, the critical question is how to differentiate Western-made Chinese words from genuine Chinese neologisms. Within the current standard framework, words created by Westerners are typically labeled "missionary words" or something similar. We suggest that the primary criterion for this classification is authorship, which serves as a fundamental measure for assessing the nature of loanwords. The notion of authorship helps distinguish between Japanese, Chinese, and Western-made loanwords on the basis of historical documentation. It also eliminates a somewhat awkward criterion often employed in earlier studies, which judged the loanword status of a word by whether it had Chinese characteristics. As we understand today, a word can be formed exactly like a native Chinese word and nevertheless be a loanword, because it was constructed by non-native translators, be they Westerners or Japanese, with the intent that it be used as a legitimate Chinese word.
This approach necessitates proving, to some extent, the origin of each word by identifying in which type of source it first appeared: in Chinese texts written by Western authors, in bilingual dictionaries, in Chinese texts by Chinese authors, or in Japanese sources. By tracing a word back to its earliest record, we can infer its authorship and associate both the author and the language with the word's creation. While this method does not relieve us from the need to understand the extent of any meaning changes that have occurred over time, using first historical records provides a more objective basis for evaluating loanwordness.
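The earliest-record rule described above can be illustrated with a minimal Python sketch. It is not the database's implementation: the `Record` class, the source-type labels, the mapping from source type to authorship label, and the sample attestations are all hypothetical, introduced only to show the idea of reading authorship off the first attested source.

```python
from dataclasses import dataclass

# Hypothetical labels mirroring the four kinds of source named in the text.
# The authorship label assigned to each source type is an assumption.
AUTHORSHIP_BY_SOURCE = {
    "western_author_chinese_text": "Western",
    "bilingual_dictionary": "dictionary compiler",
    "chinese_author_text": "Chinese",
    "japanese_source": "Japanese",
}


@dataclass(frozen=True)
class Record:
    word: str
    year: int
    source_type: str  # one of the keys of AUTHORSHIP_BY_SOURCE


def earliest_record(records):
    """Return the record with the smallest year (the first one wins on ties)."""
    return min(records, key=lambda r: r.year)


def infer_authorship(records):
    """Infer (year, authorship label) from the source type of the earliest record."""
    first = earliest_record(records)
    return first.year, AUTHORSHIP_BY_SOURCE[first.source_type]


# Made-up attestations for a placeholder word, for illustration only.
records = [
    Record("example", 1903, "chinese_author_text"),
    Record("example", 1872, "japanese_source"),
    Record("example", 1866, "bilingual_dictionary"),
]
print(infer_authorship(records))  # (1866, 'dictionary compiler')
```

Because the result depends entirely on which record is earliest, adding a single older attestation can change the inferred authorship; any such classification is therefore provisional.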
Given the significant burden of proof that the first historical record bears in establishing authorship, it becomes essential to assess the reliability of historical records. Specifically, we need to evaluate how likely it is that the currently earliest record for a given word in the database truly represents the first occurrence of the word, and conversely, how probable it is that an even earlier record might be discovered in later research. Although absolute certainty is unachievable, by analyzing the distribution of records for all words in the database from 1600 to 1920, and examining the distribution of records for each individual word, we can calculate a specific likelihood using Kernel Density Estimation, a method often used in analyzing time series. This likelihood tells us, for each word, how convincing our assessment of its authorship is.
This method marks a departure from previous approaches, where non-explicit processes or subjective judgments formed the basis for classification. In our database, we acknowledge that the historical borrowing of words is complex and subject to ongoing discovery. As new data are added to the database, the classification of some words may change, but this also enhances the overall robustness of our classifications.
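The text does not spell out how the Kernel Density Estimation is set up, so the following Python sketch shows only one possible reading, under stated assumptions. It uses a Gaussian kernel with a hand-picked bandwidth, and it takes as a rough reliability indicator the share of the corpus density that lies between 1600 and a word's earliest record: the more source material predates the first attestation without containing the word, the less likely an earlier record becomes. The function names, the bandwidth, and the sample years are hypothetical.

```python
import math


def kde_pdf(samples, x, bandwidth):
    """Gaussian kernel density estimate at point x."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm


def kde_cdf(samples, x, bandwidth):
    """Probability mass of the Gaussian KDE lying at or below x."""
    scale = bandwidth * math.sqrt(2)
    return sum(0.5 * (1 + math.erf((x - s) / scale)) for s in samples) / len(samples)


def coverage_before(corpus_years, earliest, bandwidth=10.0, start=1600):
    """Share of the corpus density between `start` and a word's earliest record.

    Assumption: a high value means many sources predate the earliest record,
    which makes the discovery of an even earlier attestation less likely.
    """
    return kde_cdf(corpus_years, earliest, bandwidth) - kde_cdf(corpus_years, start, bandwidth)


# Made-up years of sources in the corpus and of one word's records.
corpus_years = [1620, 1700, 1815, 1822, 1847, 1866, 1872, 1884, 1895, 1903, 1908, 1915]
word_years = [1866, 1884, 1903]

earliest = min(word_years)
print(round(coverage_before(corpus_years, earliest), 3))
print(round(kde_pdf(word_years, earliest, 10.0), 5))  # density of the word's own records at that year
```

A fixed bandwidth of ten years is an arbitrary choice made for readability; a real analysis would need to justify the bandwidth and would probably weight sources by size.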
In recent years, another significant development in the field of loanword etymology research has been the systematic organization of synonyms. In principle, synonyms can be organized into so-called synsets. There are two basic types: synchronic synsets contain words that share a similar meaning according to their current reading; and diachronic synsets, collecting older word forms or words with older meanings that are closely related. Diachronic research of synonyms is especially interesting because it can shed light on the initial period of confusion during the early formation of new terms, during which there was no consensus yet on a specific word form, and multiple synonyms might appear simultaneously. Over time, a consensus gradually formed, and certain terms emerged as mainstream, while others were gradually phased out. This process of vocabulary development varies in duration among different terms, some taking generations to reach a consensus, while others establish their status within a few years. In this regard, a significant contribution comes from Huang Heqing, whose work "Modern and Contemporary Chinese Etymology" is an important study on historical synsets.
We include both synchronic and diachronic synsets, as well as our own diachronic set. Although the inclusion of synsets was an integral part of the database from the beginning, the building of our diachronic synset was not a main focus of the undertaking. It developed rather naturally from the fact that we included many bilingual dictionaries in our resources. In this respect, our synset represents a rare type of diachronic multilingual synset, capturing the evolution and interconnections of concepts across different languages and time periods.
In total, the academic research section of this database includes the following research:
- "The Chinese Loanword Dictionary", Cen Qixiang (1984)
- "List of Hezhihanyu", Sato Takeyoshi (1996)
- "Nurturing Esthetics", Zhou Shenglai (2016)
- "A Study of Japanese Loanwords from the Late Qing Dynasty to the Early Republic of China and the Reform and Opening Up and Their Sinicization", Qu Zirui (2016)
- "Intertwined Cultural History – Manuscripts of Early Missionary Sinology Studies", Zhang Xiping (2017)
- "Modern Etymological Dictionary", Huang Heqing (2020)
- "Modern Chinese loanword research", Gao and Liu (1958)
- "Words Borrowed from Japanese in Modern Chinese", Wang Lida (1958)
- "A History of Chinese Students in Japan", Saneto Keishu (1981)
- "The Chinese Loanword Dictionary", Liu, Gao, Mai and Shi (1984)
- "Translingual Practice", Liu He (1995)
- "The Etymological Dictionary of Modern Chinese Neologisms" (2001)
- "Conceptual History Research", Jin and Liu (2010)
- "Research on the vocabulary of Beijing Mandarin textbooks during the Meiji period in Japan", Chen Ming’e (2014)
- "The study of Japanese loanwords in Xiandai hanyu cidian", Morita Satoshi (2016)
- "The History of Loanwords in Ming-Qing Chinese", Zhao Ming (2016)
- "Xinhua Loanword Dictionary", Shi Youwei (2019)
- "Research on Modern Chinese 2-character words", Shen Guowei (2019a)
- "Research on Yan Fu’s translations", Shen Guowei (2019b)
- "History of Sino-Japanese linguistic exchanges", Shen Guowei (2020)
- "Lexical Studies During the End of the Qing and Beginning of the Republican Era", Zhang Ye (2019)
- "From East to East — Lexical concept between China and Japan in Modern Times", Chen Liwei (2019)
- "The Trajectory of Vocabulary Exchange between China and Japan in Modern Times", Zhu Jingwei (2020)
Besides modern studies, we also include 8 earlier studies:
- "Xinerya", Wang Rongbao and Ye Lan (1903)
- "New explanation of words", Liang Qichao (1904)
- "Blind men blind horses - New terms", Peng Wenzu (1915)
- "Etymology of new terms", Zhou Shangfu (1917)
- "Etymology of Japanese words", Liu Zihe (1919)
- "Development of terminology translated from Japanese", Yu Yousun (1935)
- "Sources of new terms", Wang Yunwu (1944)
Furthermore, the database includes a list of around 600,000 modern Chinese terminologies across a large range of academic fields, compiled from online sources of the Ministry of Education in Taiwan.
In terms of search capabilities, the database offers a variety of search options to enable users to find and compare words according to their own interests and intuitions, thereby encouraging them to explore the borrowing history of words independently.
In terms of semantics, we provide translations for each Chinese term in different languages, mainly including actual historical source translations, such as those obtained from a variety of English Chinese dictionaries. Furthermore, we recognize the importance of Buddhist Chinese translation terms, thus providing definitions from three Buddhist dictionaries:
- "Buddhist Dictionary"
- "A Dictionary of Chinese Buddhist Terms"
- Seishi Karashima's "A Glossary of Kumārajīva's Translation of the Lotus Sutra"
These three dictionaries can be publicly accessed at https://glossaries.dila.edu.tw.
All Japanese vocabulary, translations, text sources, and dates are authorized data from the "Chunagon" database of the National Institute for Japanese Language and Linguistics.
To collect these related studies as comprehensively as possible, we have established this database, aiming to provide an integrated platform for research on neologisms in modern Chinese. Currently, we believe there is still potential for more discoveries, and a need for continuing academic work, in the following areas:
- Integration of Basic Data
- Dialogue Between Academia and Historical Sources
- Integration of Chinese Terms from Buddhist Texts (Sanskrit Terms) Within the Realm of Japanese Loanwords
- Comparison of Early Academic Research with Modern Academic Research
- Analysis and Integration of Western-Translated Chinese Texts and Words
- Better Integration of Various Academic Approaches to Loanwords
- Integration of Objective Approaches to Classifying Loanwords
Currently, the database has three features with limited access:
- Generating word lists: Generating word lists based on specific criteria such as scholars, time, language, etc.
- Research data: Generating word lists for specific scholars.
- Historical documents: Querying individual historical document word lists and related images.
Ultimately, this database serves as a dialogue between academic evaluation and historical facts. We provide a quantified judgement whether a word is generally considered a loanword by the academic community, and statistical data reflecting the degree of modern academic agreement on the classification of each word. We then contrast this academic evaluation with historical evidence. In some cases, historical evidence supports the academic classification, while in others, it suggests a different classification. This is all easily accessible in the summary of each word.
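The comparison described above can be sketched in a few lines of Python. The exact statistics the database computes are not documented here, so this is an assumption-laden illustration: scholarly verdicts are reduced to a yes/no vote per study, agreement is measured as the share held by the majority view, and the study names and verdicts are invented.

```python
def academic_summary(verdicts):
    """Summarize scholarly verdicts (study name -> True if classified as a loanword)."""
    if not verdicts:
        raise ValueError("at least one verdict is required")
    share = sum(verdicts.values()) / len(verdicts)
    return {
        "loanword_share": share,            # fraction of studies voting "loanword"
        "agreement": max(share, 1 - share),  # size of the majority, between 0.5 and 1.0
        "majority_loanword": share > 0.5,    # an exact tie counts as "not a loanword"
    }


def compare_with_history(verdicts, historical_loanword):
    """Add a flag telling whether the historical evidence supports the majority view."""
    summary = academic_summary(verdicts)
    summary["history_supports_majority"] = summary["majority_loanword"] == historical_loanword
    return summary


# Invented verdicts from four hypothetical studies on one word.
verdicts = {"study_a": True, "study_b": True, "study_c": False, "study_d": True}
print(compare_with_history(verdicts, historical_loanword=False))
```

In this made-up case three of four studies treat the word as a loanword while the historical evidence points the other way, which is exactly the kind of disagreement a word summary is meant to expose.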
The database was established under the sponsorship and guidance of Assistant Research Fellow Chen Jianshou at the Institute of Modern History, Academia Sinica. We are also very grateful for the long-term technical support of the Digital Humanities team at the Institute of Modern History, and we will continue to enrich the database's content. If you have any questions, please contact us.