13 - Named Entity Recognition
ucla | CS 162 | 2024-03-11 02:33
Named Entity Problem
- anything that can be referred to with a proper name
- 4 common tags: Person (PER), Location (LOC), Organization (ORG), Geo-Political Entity (GPE)
- but also dates, times, prices
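- e.g. in "Jane Villanueva works for United Airlines", "Jane Villanueva" is PER and "United Airlines" is ORG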
- the same word can take different NER tags depending on its sense (semantic meaning), e.g. "Washington" can be a person, a location, or a geo-political entity
BIO Tagging
- set up as a sequence tagging problem
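- use BIO tagging (B = beginning of an entity, I = inside an entity, O = outside any entity) to mark where each entity starts and ends, so that consecutive separate entities can be told apart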
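As a minimal sketch of the tagging scheme above, the following converts entity spans into one BIO tag per token. The function name `spans_to_bio`, the `(start, end, label)` span format with an exclusive end, and the example sentence are assumptions made for illustration, not something from the lecture.

```python
def spans_to_bio(tokens, spans):
    """Convert entity spans to one BIO tag per token.

    spans: list of (start, end, label), end exclusive, assumed non-overlapping.
    """
    tags = ["O"] * len(tokens)           # everything is outside by default
    for start, end, label in spans:
        tags[start] = f"B-{label}"       # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"       # remaining tokens of the same entity
    return tags


# Two PER entities back to back: only the B- tag separates them.
tokens = ["Speakers", ":", "Jane", "Villanueva", "Tim", "Wagner"]
spans = [(2, 4, "PER"), (4, 6, "PER")]
bio_tags = spans_to_bio(tokens, spans)
for token, tag in zip(tokens, bio_tags):
    print(f"{token}\t{tag}")
```

Without the B/I distinction, the four name tokens would all carry the same PER tag and the boundary between the two people would be lost.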
Algorithms
- HMM, MEMM/CRF (conditional random fields)
- neural sequence models (RNNs, LSTMs, Transformers)
- Neural CRF models
- LLMs (BERT, fine-tuned, etc.)
Neural Sequence Tagging
RNNs alone
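- each input token gets one BIO tag as output
- works, but each tag is predicted on its own, so interdependencies between output tags are completely ignored (e.g. nothing stops I-PER from directly following O)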
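To illustrate the weakness noted above, here is a small sketch in which tags are chosen independently per token and the result is then checked for BIO consistency. The scores are hand-picked toy numbers standing in for the outputs of a tagger, and the helper names `argmax_tags` and `invalid_transitions` are hypothetical.

```python
def argmax_tags(scores, tagset):
    """Pick the highest-scoring tag for each token independently."""
    return [tagset[max(range(len(tagset)), key=row.__getitem__)] for row in scores]


def invalid_transitions(tags):
    """Return positions where an I-X tag does not continue an entity of type X."""
    bad = []
    prev = "O"
    for i, tag in enumerate(tags):
        if tag.startswith("I-"):
            label = tag[2:]
            if prev not in (f"B-{label}", f"I-{label}"):
                bad.append(i)
        prev = tag
    return bad


TAGSET = ["O", "B-PER", "I-PER"]
# Hypothetical per-token scores: one row per token, one column per tag.
scores = [
    [2.0, 0.1, 0.3],  # highest score: O
    [0.2, 0.4, 1.5],  # highest score: I-PER, although the previous tag is O
    [0.1, 0.2, 1.1],  # highest score: I-PER
]
predicted = argmax_tags(scores, TAGSET)
bad_positions = invalid_transitions(predicted)
print(predicted, bad_positions)
```

Because each row is decoded in isolation, the model has no way to know that the second token's I-PER has no preceding B-PER.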
MEMM w/ RNNs (LSTMs)
- concat hidden states with (previous) tags
- thus each tag has an embedding (hidden state)
- then softmax over a linear transformation of these embeddings (done with a learned weight matrix) to get the probs for each tag, using the LSTM params
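The concat, linear transformation, and softmax steps above can be sketched in plain Python as follows. This is only an illustration under several assumptions: the hidden states are given as toy numbers rather than produced by an LSTM, the weight matrix `W` and the one-hot tag embeddings are hand-picked rather than learned, and decoding is greedy left to right (the notes do not say which decoding method was used).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def greedy_memm_decode(hidden_states, tag_embeddings, W, tagset, start_tag="O"):
    """Tag tokens left to right, feeding each predicted tag into the next step."""
    prev = start_tag
    tags, probs = [], []
    for h in hidden_states:
        features = h + tag_embeddings[prev]   # concatenate hidden state and previous-tag embedding
        p = softmax(matvec(W, features))      # distribution over tags for this token
        prev = tagset[max(range(len(tagset)), key=p.__getitem__)]
        tags.append(prev)
        probs.append(p)
    return tags, probs


TAGSET = ["O", "B-PER", "I-PER"]
# Toy one-hot tag embeddings (assumed; a real model would learn dense ones).
TAG_EMB = {"O": [1.0, 0.0, 0.0], "B-PER": [0.0, 1.0, 0.0], "I-PER": [0.0, 0.0, 1.0]}
# Toy hidden states; the first dimension loosely means "looks like a name".
hidden_states = [[-1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
# One row per tag; columns are [h0, h1, prev=O, prev=B-PER, prev=I-PER].
W = [
    [-2.0, 0.0, 0.0, 0.0, 0.0],    # O
    [1.0, 0.0, 1.0, -1.0, -1.0],   # B-PER: favoured after O
    [1.0, 0.0, -5.0, 1.0, 1.0],    # I-PER: strongly penalised after O
]
tags, probs = greedy_memm_decode(hidden_states, TAG_EMB, W, TAGSET)
print(tags)
```

The second and third tokens have identical hidden states, so the different tags they receive come entirely from the previous-tag input, which is the dependency the RNN-only tagger lacks.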
MEMM w/ Transformers
- concat contextualized embeddings (transformer outputs) with the embeddings of the previous tags to create contextualized tag representations
- then apply a linear transformation (learned weight matrix)
- then softmax to get the probs of each tag, using the transformer params
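In symbols, the steps above could be written as follows. The notation is assumed here rather than taken from the lecture: $z_i$ is the transformer output for token $i$, $e(y_{i-1})$ is the embedding of the previous tag, $W$ is the weight matrix of the linear layer, and $[\cdot\,;\cdot]$ denotes concatenation.

```latex
u_i = [\, z_i \,;\, e(y_{i-1}) \,], \qquad
P(y_i = k \mid x_{1:n},\, y_{i-1})
  = \operatorname{softmax}(W u_i)_k
  = \frac{\exp\big((W u_i)_k\big)}{\sum_{k'} \exp\big((W u_i)_{k'}\big)}
```

The LSTM variant has the same form, with the LSTM hidden state in place of $z_i$.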
Pretrained LLMs for NER
- e.g. take pretrained BERT and fine-tune it for seq tagging
- then use it to create output embeddings, one per token, and make preds as BIO tags
- then softmax on the output embeddings to get probs of tags