A fool. Independent researcher in NLP, particularly sequence labeling. Expat. Instrumental music lover. Skeptic & pragmatist. Atheist. PhD in computer science.

TRANSFER LEARNING

For NTCIR-15 FinNum-2 and DialEval-1 Tasks

Disclaimer: this post won’t use any author’s or editorial we. The overconfident title and content are opinionated. The work described here is a team effort, but my words don’t express the view of my employer and co-authors.

On December 14, 2020, the NTCIR-15 conference concluded successfully. I led two teams working on two of the conference’s shared tasks, DialEval-1 and FinNum-2. My teams’ work took first place in DialEval-1 and second place in FinNum-2. …


TRANSFER LEARNING

A brief introduction and a simple approach

DialEval-1 consists of two subtasks: Nugget Detection (ND) and Dialogue Quality (DQ). Both aim to evaluate customer-helpdesk dialogues automatically. The ND subtask classifies whether each customer or helpdesk turn is a nugget, that is, a turn that helps solve the problem; the DQ subtask assigns quality scores to each dialogue on three criteria: task accomplishment, customer satisfaction, and efficiency. The official evaluation covers 18 runs submitted by 7 teams. IMTKU (Dept. …


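Outside the elite schools: climbing a steep slope, braving strong winds and vertigo, looking up at an uncertain future

In a tangled, crisscrossing web

Want to run your own life on XDDDD (eXtreme Due Day Driven Development)¹? Then bear with this person’s tales of inglorious bygone days. About “losing at the starting line,” I know a thing or two.

Up until I finished high school and squeezed into a private university’s computer science department, my story was this: born in the poorest part of Taiwan, the last to have its railway electrified, and raised in a “taro and sweet potato” (ō͘-á han-chî) family of military personnel and teachers. If anything still put me ahead of those worse off, it was probably just that wicked 18% preferential interest rate^H^H. In my junior year, the year of the capstone project, I joined what was reputedly the most grueling lab in the department, thinking I would at least get a chance to test into a graduate program at an elite school and launder my credentials. Only when my advisor strongly objected to our attending cram schools after class did I begin to realize that I had been busy reciting code-farmer mantras all along and could not even till my own field properly, let alone perform for a living. Other people’s kids might be about to join big companies (reliable sources say a certain 上 t 下 m 殺 v company hires only from NTU, NTHU, and NCTU), so what would become of my field if it never got to taste San Diego Jinkela?

(What follows should let you guess how long in the tooth, and how wobbly, I have become.)

That year, Google had only just become useful (I had been using Infoseek before), the most popular mobile phone was still the Nokia 6150, and Steve Jobs had just completed his prince’s revenge; nobody yet knew that a product from the iCEO’s line would one day change the way everyone walks. As for GPUs, they were for playing WoW, of course! BTW, ATI was probably better value than nVidia (though I liked a different genre of games, so there was no need to spend that money). Back to the code-farming craft: Java had just taken off, Python was still little known, and the most widely used machine learning model was the Hidden Markov model rather than not-yet-deep-enough neural networks. Where could one go to find the secret manuals? The phrase behind the acronym MOOCs had only just appeared, and Joel, who talks about software, had not yet truly met the horror of coding, so “stack overflow” had only its textbook meaning. As for me, I was smug about having read quite a few books from 施威銘研究室.

Then the boss said we were going to work on J2EE, J2ME, and a heap of J-this and E∨¬E-that for which no Chinese-language books existed. Meanwhile, I was doing well just to make sure I would not flunk a pile of courses. What a mess XDDDD

In an interwoven, crisscrossing web

Looking it over again and again: with no climax in sight, was there nothing left to blame but the bed? Then I recalled the bed called high school, where a group of teachers was very keen on English education. Fine, English was indeed an obvious bottleneck, but it never felt like the root of the problem. I remember reading biographies of great figures as a child that told of certain British and American scientists teaching themselves Russian, French, and so on in order to learn the most advanced knowledge. Apart from not watching salmon struggle upstream and being endowed with far more than their share of wisdom, do such great figures offer anything else worth emulating?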

by courtesy of https://imgur.com/Ru1BtMP

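When I was a junior, the Internet was not yet convenient enough; put the other way, anxiety over information overload and escape from freedom of choice were not yet excuses one could voice. Even today, among English-language materials that are already hard to follow, picking out the best, or at the very least weeding out the problematic, is not necessarily easy. If I had to put a deadline on this obstacle, I would hope… Of course, limits are cruel, and time should be spent where it counts; that was my thinking. Facing a tangled mess too painful to look at unless it is cut and sorted, could I pass right through it the way “The Amazing” Yen does?

Viewers can, of course, see through the trick of “historical documentaries” at a glance. Among the formidable people of the real world there is an old saying: prediction is difficult, especially about the future. Looking back, this is almost an eternal theme of learning, and unsupervised learning and reinforcement learning are naturally no exception. The mission of universities and professors differs from that of primary and secondary schools: guidance need not mean supervision, and reinforcement depends on one’s luck. Seen through the classic model of the slot machine, there seems to be a single thread running through it all.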

A carpet of leaves illuminated by the moon, around an empty grave.

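If anyone claims a surefire way to beat the slot machine (or any similar situation), it is probably something like “I lost my keys and am searching under the streetlight; they may not have fallen here, but this is where the light is.” On the other hand, it also resembles that often-misquoted line from Sun Tzu’s The Art of War: a hundred battles bring at most no peril, not a hundred victories. Either way, know yourself and know the other.

1. Identify your most important goal.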


I want to | When → So I can: better to teach someone to fish than to give them a fish (授人以魚不如授人以漁)

この job story は Zeals Advent Calendar のために書かれています。
I write this job story for Zeals Advent Calendar.
這則 job story 是為 Zeals Advent Calendar 而寫的。

Take-Home Messages

  1. 情報非対称性を解消:コミュニケーション壁を下げる
    Eliminate information asymmetry: lower the barriers to communication
    消弭資訊不對稱:降低溝通門檻
  2. オープンソースソフトウェアと無料のクラウドサービスを利用する
    Utilize open-source software and free cloud services
    善用開源軟體與免費雲端服務
    • huggingface, fastai, AllenNLP, etc.
    • Google Colab, Paperspace Gradient, etc.
  3. コードだけでなく、すべてのためにレビューを求める
    Seek reviews for everything, not just for code
    不光是程式碼,任何事都該徵詢審閱

§1 Talk Is Cheap, but……

IMAO,
実用的な話はちょっと優れている場合があります。
actionable talk can be slightly better.
實用的嘴砲倒有可能好點。

何かを民主化するなどの善意のために、誰もが参加することが難しい場合は矛盾したものになるでしょう。
For such a good intention as “democratizing something,” it would be an oxymoron if it is difficult for anyone to participate.
以「民主化某事」這般良善意圖而言,若人人參與並非易事就自相矛盾了。

(So the talk…


Usability: effectiveness, efficiency, and satisfaction.

TL;DR and for what you may not see in this post.

  • Know (what are) self and others (part-1);
  • Control PRNG states, not just their seed numbers (part-2);
  • Uncover side effects, e.g.:
    • Mixed Precision, mind the gap of the dynamic loss scale [1][2];
    • cuDNN, avoid using its built-in dropout;
    • Non-determinism/instability of multi-core/process/thread usage.

This series of posts shares my thoughts on how to get deterministic outcomes of machine learning experiments at an affordable cost. As usual, most of the content is only my murmur. Feel free to skip it and check the Jupyter notebook for this series directly. It tries Universal Language Model Fine-tuning for Text Classification with a…
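One frequent source of the multi-core/thread non-determinism listed above is that floating-point addition is not associative, so a reduction whose accumulation order varies between runs can return slightly different sums. The sketch below is only an illustration with the standard library; the helper name `sum_left_to_right` is hypothetical and nothing here comes from the series' notebook.

```python
def sum_left_to_right(values):
    """Accumulate floats strictly in the order given (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]

# Same numbers, two accumulation orders, as a parallel reduction might
# produce when worker scheduling changes between runs.
forward = sum_left_to_right(values)
backward = sum_left_to_right(reversed(values))

print(forward == backward)              # False: the rounding differs
print(abs(forward - backward) < 1e-15)  # True: the gap is tiny but real
```

A gap this small looks harmless, yet over many training steps it can be enough to make two runs with identical seeds drift apart.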


A rough translation of 如何策略性地撰寫論文草稿 (How to Draft a Paper Strategically) by Dr. Wallace

A week ago, I received a newsletter about paper drafting. It came from one of the biggest paper editing agencies in Taiwan. Although the author, Dr. Steve Wallace, is an American, he writes his newsletters in Mandarin Chinese. So I took the liberty of roughly translating his points into English here.

One may find many “DOs” and “DON’Ts” online or via friends. Whether they are effective varies from case to case. We think academic paper authors should develop various strategies and resources to find the best approach for themselves. Every opportunity and task deserves a different measure. …


Hold Every Random State Accountable

TL;DR and for what you may not see in this post.

  • Know (what are) self and others (part-1);
  • Control PRNG states, not just their seed numbers;
  • Uncover side effects (part-3), e.g.:
    • Mixed Precision, mind the gap of the dynamic loss scale [1][2];
    • cuDNN, avoid using its built-in dropout;
    • Non-determinism/instability of multi-core/process/thread usage.

This series of posts shares my thoughts on how to get deterministic outcomes of machine learning experiments at an affordable cost. As usual, most of the content is only my murmur. Feel free to skip it and check the Jupyter notebook for this series directly. It tries Universal Language Model Fine-tuning for Text Classification with a…
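To make "control PRNG states, not just their seed numbers" concrete, here is a minimal sketch using only Python's standard `random` module as a stand-in. It is an assumption-laden illustration, not the code from the series' notebook: the helper `draw` is hypothetical, and libraries such as NumPy or PyTorch keep their own generator states that would need the same treatment.

```python
import random


def draw(n, rng):
    """Draw n floats from the given generator (hypothetical helper)."""
    return [rng.random() for _ in range(n)]


rng = random.Random(42)
_ = draw(3, rng)               # something earlier consumed random numbers

checkpoint = rng.getstate()    # snapshot of the full generator state
first = draw(5, rng)

rng.setstate(checkpoint)       # rewind to the snapshot
replay = draw(5, rng)

# Re-seeding restarts the stream from its beginning instead of resuming
# from the point where the snapshot was taken.
reseeded = draw(5, random.Random(42))

print(first == replay)    # True: restoring the state replays the same draws
print(first == reseeded)  # False: the seed alone lands at a different position
```

The design point is that a seed only fixes where a stream starts; once any code path consumes a different number of draws, only a saved state can bring an experiment back to the same position.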


A Bigger, Longer, & Uncut Preamble

TL;DR and for what you may not see in this post.

  1. Know (what are) self and others;
  2. Control PRNG states, not just their seed numbers (part-2);
  3. Uncover side effects (part-3), e.g.:
    • Mixed Precision, mind the gap of the dynamic loss scale [1][2];
    • cuDNN, avoid using its built-in dropout;
    • Non-determinism/instability of multi-core/process/thread usage.

This series of posts shares my thoughts on how to get deterministic outcomes of machine learning experiments at an affordable cost. As usual, most of the content is only my murmur. Feel free to skip it and check the Jupyter notebook for this series directly. It tries Universal Language Model Fine-tuning for Text Classification with a…


Rename and repurpose this blog

This blog was merely a personal notepad. Now I would like to conduct some experiments with it over the next couple of months, starting with redefining it.

According to https://www.dictionary.com/e/acronyms/imao/, “in my arrogant opinion is usually used with a sense of irony or self-deprecation.” IMAO, and in a far-fetched way, it can also stand for ī-máo (一毛) in Mandarin Chinese, meaning one single hair and, figuratively, something insignificant, such as a cent. In that sense, i-mao is just half of “my two cents.”

As insignificant as i-mao can be, one of my favorite stories has made quite…
