yoooniverse

CS224N Lecture 1. Word Vectors

Ykl 2022. 9. 29. 00:53

CS224n: Natural Language Processing with Deep Learning

Stanford / Winter 2021

Course page

https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1214/

https://www.youtube.com/playlist?list=PLoROMvodv4rOSH4v6133s9LFPRHjEmbmJ

Stanford CS224N: Natural Language Processing with Deep Learning | Winter 2021

note) For natural language processing, go to cs224n; for image processing, go to cs231n.

Lecture 1 - Intro & Word Vectors

The meaning of the word

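Language is inherently the product of a social system: it is created and understood by people.

In other words, a computer cannot understand human language the same way humans do.

The key point, therefore, is to build systems that understand language through computation.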

Traditional, common NLP solution: dictionary synonyms

cons) lack of nuance, difficulty keeping up with new meanings of words

1 word = 1 one-hot vector: as the number of words grows, the amount of vector data to process grows in proportion.
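To make the one-hot idea concrete, here is a minimal sketch in plain Python. The three-word vocabulary and the helper names `one_hot` and `dot` are made up for illustration. It shows two things: each vector is as long as the vocabulary, and the dot product of any two different words is 0, so the vectors carry no notion of similarity.

```python
def one_hot(index, vocab_size):
    """Return a vector of zeros with a single 1 at position `index`."""
    vec = [0] * vocab_size
    vec[index] = 1
    return vec


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical toy vocabulary (an assumption, not taken from the notes)
vocab = ["hotel", "motel", "banana"]
vectors = {word: one_hot(i, len(vocab)) for i, word in enumerate(vocab)}

# Every vector has one dimension per vocabulary word
vector_length = len(vectors["banana"])

# Different words are always orthogonal, however related they are
hotel_motel = dot(vectors["hotel"], vectors["motel"])
hotel_hotel = dot(vectors["hotel"], vectors["hotel"])
```

Adding one more word to `vocab` makes every vector one element longer, which is the growth problem described above.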

😀 Improvement

New idea: representing words by their context (with word vectors)

Introducing the algorithm used to build vectors that take context into account: word2vec

word2vec : a framework for learning word vectors

(1) Data likelihood: how good the model is at predicting words in the context of other words

Reading the likelihood from left to right:

a product over each word taken as the center word

a product over each word in a window around it, of the probability of predicting that context word given the center word

Goal) maximize the likelihood of the context we see around center words
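The likelihood described above can be written out as follows. This uses the standard word2vec notation, which these notes do not spell out, so treat the symbols as assumptions: $w_t$ is the word at position $t$, $m$ is the window size, $T$ is the number of words in the corpus, and $\theta$ stands for all the word vectors being learned.

```latex
L(\theta) = \prod_{t=1}^{T} \; \prod_{\substack{-m \le j \le m \\ j \ne 0}} P(w_{t+j} \mid w_t; \theta)
```

The outer product runs over every position as the center word, and the inner product runs over the context positions inside the window around it.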

(2) Objective function

also known as the cost function or loss function

: the average negative log likelihood

T: number of words in the corpus

Why the minus sign (-) is used: so that maximizing the likelihood becomes minimizing the objective function
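Written as a formula, the average negative log likelihood looks like this. The notation is the standard word2vec one and is an assumption here: $L(\theta)$ is the data likelihood from (1), $w_t$ is the word at position $t$, $m$ is the window size, and $\theta$ is the set of word vectors.

```latex
J(\theta) = -\frac{1}{T} \log L(\theta)
          = -\frac{1}{T} \sum_{t=1}^{T} \sum_{\substack{-m \le j \le m \\ j \ne 0}} \log P(w_{t+j} \mid w_t; \theta)
```

Taking the log turns the products into sums, dividing by $T$ gives an average, and the minus sign flips maximization into minimization.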

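What word2vec aims for: maximizing the conditional probability that a context word (o) appears given a center word (c).

That is, predicting the surrounding words well from the center word.

For the mathematical theory, see the post linked below. It is explained really well!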
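Here is a rough sketch of how that conditional probability and the resulting loss could be computed. It assumes the usual word2vec setup, which is not written out in these notes: each word has a center vector (in `V`) and a context vector (in `U`), and P(o | c) is a softmax over dot products. The function names, the tiny vectors, and the toy corpus are all made up for illustration.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax_prob(o, c, U, V):
    """P(o | c) = exp(u_o . v_c) / sum over w of exp(u_w . v_c)."""
    scores = [dot(u_w, V[c]) for u_w in U]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    return exps[o] / sum(exps)


def center_context_pairs(corpus, m):
    """Yield (center, context) word ids for every position and window offset j != 0."""
    for t, center in enumerate(corpus):
        for j in range(-m, m + 1):
            if j != 0 and 0 <= t + j < len(corpus):
                yield center, corpus[t + j]


def average_neg_log_likelihood(corpus, m, U, V):
    """Sum of -log P(o | c) over all pairs, divided by T (the corpus length)."""
    total = 0.0
    for c, o in center_context_pairs(corpus, m):
        total -= math.log(softmax_prob(o, c, U, V))
    return total / len(corpus)


# Hypothetical toy setup: 3 words, 2-dimensional vectors, arbitrary values
V = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.1]]   # center vectors v_w
U = [[0.2, 0.0], [-0.1, 0.1], [0.0, 0.3]]   # context vectors u_w
corpus = [0, 1, 2, 1, 0]                    # word ids in order

probs = [softmax_prob(o, 0, U, V) for o in range(len(U))]
loss = average_neg_log_likelihood(corpus, 1, U, V)
pairs = list(center_context_pairs([0, 1, 2], 1))
```

Training would then adjust `U` and `V` to push `loss` down, which is the same as making the observed context words more probable given their center words.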

https://ratsgo.github.io/from%20frequency%20to%20semantics/2017/03/11/embedding/

The Amazing Magic of Counting Frequencies: Word2Vec, Glove, Fasttext · ratsgo's blog

(Below are handwritten notes that probably only I can read..)
