13 - Named Entity Recognition

ucla | CS 162 | 2024-03-11 02:33


Named Entity Problem

  • a named entity is anything that can be referred to with a proper name
  • 4 common tags: Person (PER), Location (LOC), Organization (ORG), Geo-Political Entity (GPE)
  • in practice the term also covers dates, times, and prices, even though these are not proper names

    Type Ambiguity

  • the same word can take different NER tags depending on its sense (semantic meaning) in context, e.g. "Washington" can be a person, a location, or a geo-political entity

    BIO Tagging

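  • NER is set up as a sequence tagging problem: one tag per token
  • BIO tagging marks entity boundaries: B-X begins an entity of type X, I-X continues it, and O marks tokens outside any entity, so two consecutive entities stay separate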
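
The scheme above can be illustrated with a small sketch that turns entity spans into BIO tags. The function name `spans_to_bio`, the span format (token indices, end exclusive), and the example sentence are assumptions made for illustration, not part of the lecture.

```python
def spans_to_bio(tokens, spans):
    """Convert entity spans into one BIO tag per token.

    spans: list of (start, end, label) token indices, end exclusive.
    """
    tags = ["O"] * len(tokens)              # default: outside any entity
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token begins the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens are inside it
    return tags


# Two PER entities directly next to each other (hypothetical sentence).
tokens = ["She", "sent", "Mary", "Smith", "John", "Doe", "'s", "address"]
spans = [(2, 4, "PER"), (4, 6, "PER")]
bio_tags = spans_to_bio(tokens, spans)

for token, tag in zip(tokens, bio_tags):
    print(f"{token}\t{tag}")
```

The B-PER on "John" is what separates the second person from the first; with plain per-token type labels the four tokens would look like a single entity.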

    Algorithms

  • HMMs, MEMMs (maximum-entropy Markov models), CRFs (conditional random fields)
  • neural sequence models (RNNs, LSTMs, Transformers)
  • neural CRF models
  • pretrained language models (e.g. BERT, fine-tuned for tagging)

Neural Sequence Tagging

RNNs alone

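  • the RNN reads the tokens and outputs one BIO tag per input token
  • this works, but each tag is predicted independently, so dependencies between output tags are ignored (e.g. nothing prevents I-LOC from following B-PER)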
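
A minimal sketch of why independent per-token prediction can go wrong. The per-token scores below are hand-picked toy numbers standing in for the output layer of an RNN, and the helper names (`tag_independently`, `is_valid_bio`) are hypothetical.

```python
import math

TAGS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def tag_independently(logits_per_token):
    """Pick the most probable tag for each token, ignoring neighbouring tags."""
    tags = []
    for logits in logits_per_token:
        probs = softmax(logits)
        best = max(range(len(TAGS)), key=lambda k: probs[k])
        tags.append(TAGS[best])
    return tags


def is_valid_bio(tags):
    """I-X is only allowed directly after B-X or I-X."""
    prev = "O"
    for tag in tags:
        if tag.startswith("I-") and prev not in (f"B-{tag[2:]}", f"I-{tag[2:]}"):
            return False
        prev = tag
    return True


# Toy scores for three tokens (assumed values, not from a trained model).
logits = [
    [0.1, 2.0, 0.3, 1.0, 0.0],  # token 1: B-PER scores highest
    [0.2, 0.1, 1.4, 0.3, 1.5],  # token 2: I-LOC narrowly beats I-PER
    [2.5, 0.1, 0.2, 0.1, 0.3],  # token 3: O scores highest
]
predicted = tag_independently(logits)
print(predicted, is_valid_bio(predicted))
```

Because token 2 is tagged without looking at the B-PER chosen for token 1, the result contains I-LOC right after B-PER, which is not a well-formed BIO sequence.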

    MEMM w/ RNNs (LSTMs)

  • concatenate each hidden state with the embedding of the previous tag
  • this means every tag needs its own embedding vector
  • apply a linear transformation Wp to the concatenated vector, then softmax to get the probability of each tag; the distribution is parameterized by Wp and the LSTM parameters
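
The steps above can be sketched as a greedy left-to-right decoder. Everything numeric here is an assumption: the hidden states and Wp are random stand-ins for a trained LSTM, tag embeddings are simplified to one-hot vectors, and a hypothetical start pseudo-tag is used for the first position. Greedy decoding is chosen for brevity; Viterbi or beam search could be used instead.

```python
import math
import random

TAGS = ["O", "B-PER", "I-PER"]
START = len(TAGS)  # index of a hypothetical <start> pseudo-tag for position 0


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def one_hot(index, size):
    return [1.0 if k == index else 0.0 for k in range(size)]


def memm_greedy_decode(hidden_states, W_p):
    """At each step, score tags from [h_i ; e(y_{i-1})] and keep the best one."""
    prev = START
    tags, all_probs = [], []
    for h in hidden_states:
        features = h + one_hot(prev, len(TAGS) + 1)  # concatenation
        probs = softmax(matvec(W_p, features))       # softmax(W_p [h_i ; e(y_{i-1})])
        prev = max(range(len(TAGS)), key=lambda k: probs[k])
        tags.append(TAGS[prev])
        all_probs.append(probs)
    return tags, all_probs


random.seed(0)
hidden_size = 4
# Random stand-ins for LSTM outputs and a learned W_p (shape illustration only).
hidden_states = [[random.uniform(-1, 1) for _ in range(hidden_size)] for _ in range(5)]
W_p = [[random.uniform(-1, 1) for _ in range(hidden_size + len(TAGS) + 1)] for _ in TAGS]

decoded_tags, decoded_probs = memm_greedy_decode(hidden_states, W_p)
print(decoded_tags)
```

Unlike the RNN-alone setup, the score for each tag now depends on the tag chosen at the previous position, so a trained Wp can learn to penalize transitions such as O followed by I-PER.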

    MEMM w/ Transformers

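  • concatenate each contextualized embedding (transformer output) with the embedding of the previous tag, giving a representation that reflects both the context and the previous tag
  • apply the linear transformation Wp
  • then softmax to get the probability of each tag, now parameterized by Wp and the transformer parameters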
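
Written out, the three steps above correspond roughly to the following local model. The symbols are assumptions introduced here: z_i is the transformer output for token i, e(y_{i-1}) is the embedding of the previous tag, and [ ; ] denotes concatenation.

```latex
P(y_i \mid x_{1:n}, y_{i-1})
  = \operatorname{softmax}\big( W_p \, [\, z_i \,;\, e(y_{i-1}) \,] \big),
\qquad
P(y_{1:n} \mid x_{1:n})
  = \prod_{i=1}^{n} P(y_i \mid x_{1:n}, y_{i-1})
```

The same form applies to the LSTM variant with the hidden state h_i in place of z_i.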

    Pretrained LLMs for NER

  • e.g. take a pretrained BERT and fine-tune it for sequence tagging
  • BERT produces one output embedding per token, and a BIO tag is predicted from each
  • a linear layer followed by softmax over each output embedding gives the probability of each tag
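
Once a BIO tag has been predicted for every token, the tags are typically turned back into entity spans. A minimal sketch of that post-processing step follows; the function name `bio_to_spans` and the lenient handling of a stray I-X tag (treated as starting a new entity) are assumptions, not something the lecture prescribes.

```python
def bio_to_spans(tags):
    """Collect (start, end, label) spans, end exclusive, from a BIO tag sequence."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" flushes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.append((start, i, label))  # close the open entity
            # B-X (or a stray I-X) opens a new entity; O opens nothing.
            start, label = (i, tag[2:]) if tag != "O" else (None, None)
    return spans


tokens = ["She", "sent", "Mary", "Smith", "John", "Doe", "'s", "address"]
predicted_tags = ["O", "O", "B-PER", "I-PER", "B-PER", "I-PER", "O", "O"]

entity_spans = bio_to_spans(predicted_tags)
print(entity_spans)  # [(2, 4, 'PER'), (4, 6, 'PER')]
for start, end, label in entity_spans:
    print(label, " ".join(tokens[start:end]))
```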