GLUE TASK

메린지 2022. 12. 15. 22:46

GLUE, General Language Understanding Evaluation

A benchmark for evaluating the performance of NLU models.

Its nine tasks fall into three broad categories.
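
The grouping can be written down as a small lookup table. This is only a sketch for orientation: the names `GLUE_TASKS` and `category_of` are made up for this post and do not come from any library.

```python
# Hypothetical lookup table: GLUE task names grouped by the three categories.
GLUE_TASKS = {
    "single_sentence": ["CoLA", "SST-2"],
    "similarity_and_paraphrase": ["MRPC", "QQP", "STS-B"],
    "inference": ["MNLI", "QNLI", "WNLI", "RTE"],
}


def category_of(task: str) -> str:
    """Return the category a task belongs to, or raise KeyError if unknown."""
    for category, tasks in GLUE_TASKS.items():
        if task in tasks:
            return category
    raise KeyError(task)


total = sum(len(tasks) for tasks in GLUE_TASKS.values())
print(total)               # 9
print(category_of("QQP"))  # similarity_and_paraphrase
```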

1. Single Sentence

- CoLA ( The Corpus of Linguistic Acceptability )

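* Description: a task that tests a model's linguistic competence by classifying whether a sentence is linguistically acceptable (grammatical)

* Labeling: grammatical sentence = 1, ungrammatical sentence = 0

(Example)

- They can sing. (Classification = 1, acceptable)
- many evidence was provided. (Classification = 0, unacceptable)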

- SST-2 ( Stanford Sentiment Treebank )

* Description: a task that predicts the sentiment of movie reviews

* Labeling: by the sentiment the sentence expresses, positive = 1, negative = 0

(Example)

- that loves its characters and communicates something rather beautiful about human nature. (Classification = 1, positive)

- contains no wit , only labored gags. (Classification = 0, negative)
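
Both single-sentence tasks are binary classification, so the integer labels can be decoded with a plain mapping. A minimal sketch, assuming the labeling listed above; `LABEL_NAMES` and `decode` are illustrative names, not part of any GLUE tooling.

```python
# Hypothetical label tables following the labeling described above.
LABEL_NAMES = {
    "CoLA": {1: "acceptable", 0: "unacceptable"},
    "SST-2": {1: "positive", 0: "negative"},
}


def decode(task: str, label: int) -> str:
    """Map an integer class label to a readable name for the given task."""
    return LABEL_NAMES[task][label]


# Examples taken from the lists above: (task, sentence, label).
examples = [
    ("CoLA", "They can sing.", 1),
    ("CoLA", "many evidence was provided.", 0),
    ("SST-2", "contains no wit , only labored gags.", 0),
]

for task, sentence, label in examples:
    print(f"{task}: {sentence!r} -> {decode(task, label)}")
```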

2. SIMILARITY AND PARAPHRASE

- MRPC ( Microsoft Research Paraphrase Corpus )

* Description: a corpus of sentence pairs extracted from online news; the task is to predict whether the two sentences in a pair have the same meaning

* Labeling: same meaning = 1, different meaning = 0

(Example)

- Those who only had surgery lived an average of 46 months. / For those who got surgery alone, median survival was 41 months. (Classification = 1, same)

- Friday, Stanford blanked the Gamecocks 8-0. / Stanford has a team full of such players this season. (Classification = 0, different)

- QQP ( Quora Question Pairs )

* Description: a corpus of question pairs extracted from the question-and-answer website Quora; the task is to predict whether the two questions in a pair have the same meaning

* Labeling: same meaning = 1, different meaning = 0

(Example)

- How do you start a bakery? / How can one start a bakery business? (Classification = 1, similar)

- What are natural numbers? / What is a least natural number? (Classification = 0, not similar)

- STS-B ( Semantic Textual Similarity Benchmark )

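* Description: sentence pairs drawn from news headlines, video and image captions, and natural language inference data; the task is to predict how similar in meaning the two sentences are, as a score from 0 to 5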

* Labeling: a real-valued similarity score for the sentence pair (higher = more similar), not a class label
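
Because STS-B asks for a score rather than a class, predictions are usually compared with the gold scores through a correlation measure instead of accuracy. The sketch below computes the Pearson correlation with only the standard library. The `pearson` helper is hypothetical, and the scores are made-up numbers for illustration, not values from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 items)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up gold and predicted similarity scores (illustration only).
gold = [5.0, 3.2, 0.5, 4.1]
pred = [4.6, 2.9, 1.0, 3.8]
print(round(pearson(gold, pred), 3))
```

A value close to 1 means the predicted scores rise and fall together with the gold scores, even if their absolute values differ.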

3. INFERENCE

- MNLI ( Multi-Genre Natural Language Inference Corpus )

* Description: given a premise sentence and a hypothesis sentence, predict whether the premise entails the hypothesis, contradicts it, or neither

* Labeling: premise entails the hypothesis = 0, neutral = 1, contradiction = 2

(Example)

- How do you know? All this is their information again. / This information belongs to them. (Classification = 0, entailment)

- yeah well you're a student right / Well you're a mechanics student right? (Classification = 1, neutral)

- Vrenna and I both fought him and he nearly took us. / Neither Vrenna nor myself have ever fought him. (Classification = 2, contradiction)
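
MNLI is a three-way classification, so a model typically outputs three scores and the largest one decides the label. A minimal sketch, assuming the label order listed above (0, 1, 2); `MNLI_LABELS` and `predict_label` are illustrative names, and the logits are made-up numbers.

```python
# Index in this list = integer label id, following the labeling above.
MNLI_LABELS = ["entailment", "neutral", "contradiction"]


def predict_label(logits):
    """Return (label_id, label_name) for the highest of three scores."""
    if len(logits) != len(MNLI_LABELS):
        raise ValueError("expected one score per MNLI class")
    best = max(range(len(logits)), key=lambda i: logits[i])
    return best, MNLI_LABELS[best]


print(predict_label([0.2, 1.7, -0.4]))  # (1, 'neutral')
```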

- QNLI ( Question-answering NLI, derived from the Stanford Question Answering Dataset )

* Description: data taken from Wikipedia; the task is to predict whether the context sentence (taken from a Wikipedia paragraph) contains the answer to the question

* Labeling: the answer is contained = 0, not contained = 1

(Example)

- How many alumni does Olin Business School have worldwide? / Olin has a network of more than 16,000 alumni worldwide. (Classification = 0, contained)

- Who did the children work beside? / In many cases, men worked from home. (Classification = 1, not contained)

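- WNLI ( Winograd NLI, derived from the Winograd Schema Challenge )

* Description: predict whether a sentence in which the pronoun has been replaced by a candidate referent is entailed by the original sentence

* Labeling: the substituted sentence is entailed by the original = 1, not entailed = 0

(Example)

- Steve follows Fred's example in everything. He influences him hugely. / Steve influences him hugely. (Classification = 0, not entailed)

- I couldn't put the pot on the shelf because it was too tall. / The pot was too tall. (Classification = 1, entailed)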

- RTE ( Recognizing Textual Entailment )

* Description: data built from news and Wikipedia; the task is to predict whether one sentence of a pair entails the other. The dataset resembles MNLI, but instead of a three-way classification it is a binary classification between Entailment and Not-Entailment

* Labeling: the pair is not entailed = 1, entailed = 0

(Example)

- The German technology was employed to build Shanghai's existing maglev line, the first in the world to be used commercially. / Maglev is commercially used. (Classification = 0, entailed)

- No weapons of Mass Destruction Found in Iraq Yet. / Weapons of Mass Destruction Found in Iraq. (Classification = 1, not entailed)
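
The difference between MNLI and RTE can be illustrated by collapsing the three MNLI classes into two. This is only a sketch of the relationship between the label sets, assuming the label ids listed in this post (entailment = 0 in both tasks); `MNLI_TO_RTE` and `to_binary` are hypothetical names, and this is not how the RTE data was actually built.

```python
# MNLI ids: 0 = entailment, 1 = neutral, 2 = contradiction.
# RTE ids:  0 = entailment, 1 = not entailment.
# Neutral and contradiction both count as "not entailment".
MNLI_TO_RTE = {0: 0, 1: 1, 2: 1}


def to_binary(mnli_label: int) -> int:
    """Collapse a three-way MNLI label into a two-way RTE-style label."""
    return MNLI_TO_RTE[mnli_label]


print([to_binary(label) for label in (0, 1, 2)])  # [0, 1, 1]
```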


I want to get good at NLP: https://github.com/HyeLynnKIM
