Part of Speech (POS) Tagging in NLP

Part of Speech tagging assigns a word class (noun, verb, adjective, and so on) to each word in a sentence. It is a foundational step in many NLP tasks because the tags expose the syntactic structure of a sentence and help disambiguate its meaning.

Why is POS Tagging Important?

POS tags disambiguate words whose role depends on context ("book" can be a noun or a verb), and they feed downstream steps such as named entity recognition and dependency parsing, both of which appear in the examples below.

POS Tags in NLTK

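NLTK's default English tagger (pos_tag) returns Penn Treebank tags, e.g., NN for a singular noun, VB for a base-form verb, and JJ for an adjective. Plural and inflected forms get longer tags that share the same prefix, such as NNS, VBZ, and JJR.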
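
For quick reference, the tags that show up most often in the examples can be kept in a small lookup table. The sketch below is hand-written and covers only a subset of the tagset; the names `PENN_TAGS` and `describe_tag` are illustrative assumptions, not part of NLTK.

```python
# A small, hand-written subset of the Penn Treebank tagset (not exhaustive).
PENN_TAGS = {
    "NN": "noun, singular or mass",
    "NNS": "noun, plural",
    "NNP": "proper noun, singular",
    "NNPS": "proper noun, plural",
    "VB": "verb, base form",
    "VBD": "verb, past tense",
    "VBG": "verb, gerund or present participle",
    "VBN": "verb, past participle",
    "VBP": "verb, non-3rd person singular present",
    "VBZ": "verb, 3rd person singular present",
    "JJ": "adjective",
    "JJR": "adjective, comparative",
    "JJS": "adjective, superlative",
    "RB": "adverb",
    "DT": "determiner",
    "IN": "preposition or subordinating conjunction",
}


def describe_tag(tag):
    """Return a short description for a tag, or a fallback for tags outside the subset."""
    return PENN_TAGS.get(tag, "unknown tag (not in this subset)")


for tag in ("NN", "VBZ", "JJ", "XYZ"):
    print(f"{tag}: {describe_tag(tag)}")
```

NLTK also ships its own tag documentation through nltk.help.upenn_tagset, which needs an additional data download before it can be used.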

10 Examples of POS Tagging with NLTK and spaCy

Example 1: Basic POS Tagging


import nltk
from nltk.tokenize import word_tokenize
from nltk import pos_tag

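# One-time downloads of the tagger model and the tokenizer data.
# If a LookupError names a different resource (recent NLTK releases ask for
# 'averaged_perceptron_tagger_eng' and 'punkt_tab'), download that name instead.
nltk.download('averaged_perceptron_tagger')
nltk.download('punkt')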

text = "The quick brown fox jumps over the lazy dog."
tokens = word_tokenize(text)
tagged = pos_tag(tokens)
print(tagged)

Example 2: Tagging Multiple Sentences


# Reuses word_tokenize and pos_tag imported in Example 1.
sentences = [
    "I love Python programming.",
    "She is reading a book."
]

for sent in sentences:
    tokens = word_tokenize(sent)
    print(pos_tag(tokens))

Example 3: Extracting Nouns from Text


# tagged is the result from Example 1. The 'NN' prefix matches NN, NNS, NNP, and NNPS.
nouns = [word for word, pos in tagged if pos.startswith('NN')]
print("Nouns:", nouns)

Example 4: Extracting Verbs from Text


# The 'VB' prefix matches every verb form: VB, VBD, VBG, VBN, VBP, and VBZ.
verbs = [word for word, pos in tagged if pos.startswith('VB')]
print("Verbs:", verbs)

Example 5: Filtering Adjectives


# Match by prefix so comparatives (JJR) and superlatives (JJS) are included too.
adjectives = [word for word, pos in tagged if pos.startswith('JJ')]
print("Adjectives:", adjectives)
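
Examples 3 to 5 repeat the same pattern with a different tag prefix, so the filter can be factored into one helper. This is a minimal sketch: the function name `words_with_tag_prefix` is an assumption, and the sample list is tagged by hand for illustration, so a real tagger's output may differ.

```python
def words_with_tag_prefix(tagged, prefix):
    """Return the words whose POS tag starts with the given prefix."""
    return [word for word, tag in tagged if tag.startswith(prefix)]


# Hand-tagged sample for illustration only (not produced by a tagger).
sample = [
    ("The", "DT"), ("dogs", "NNS"), ("were", "VBD"), ("chasing", "VBG"),
    ("a", "DT"), ("faster", "JJR"), ("cat", "NN"), (".", "."),
]

print("Nouns:", words_with_tag_prefix(sample, "NN"))       # ['dogs', 'cat']
print("Verbs:", words_with_tag_prefix(sample, "VB"))       # ['were', 'chasing']
print("Adjectives:", words_with_tag_prefix(sample, "JJ"))  # ['faster']
```

Note that the "NN" prefix also matches proper nouns (NNP, NNPS); pass a longer prefix or compare tags exactly when that is not wanted.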

Example 6: POS Tags for Named Entity Recognition Preprocessing


text = "Barack Obama was born in Hawaii."
tokens = word_tokenize(text)
tagged = pos_tag(tokens)
# Match NNP (singular) and NNPS (plural) proper nouns.
proper_nouns = [word for word, pos in tagged if pos.startswith('NNP')]
print("Proper Nouns (Candidates for NER):", proper_nouns)
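
The list above treats "Barack" and "Obama" as separate candidates, although they name one entity. One possible refinement is to join runs of adjacent proper-noun tokens into a single candidate. The sketch below is a simple heuristic, not a substitute for a trained NER model; the function name `proper_noun_spans` is an assumption and the sample is tagged by hand.

```python
from itertools import groupby


def proper_noun_spans(tagged):
    """Join runs of adjacent proper-noun tokens into single candidate strings."""
    spans = []
    # groupby splits the sequence into consecutive runs that share the same key,
    # here: "is this token tagged NNP or NNPS?"
    for is_proper, group in groupby(tagged, key=lambda pair: pair[1].startswith("NNP")):
        if is_proper:
            spans.append(" ".join(word for word, _ in group))
    return spans


# Hand-tagged sample for illustration only (not produced by a tagger).
sample = [
    ("Barack", "NNP"), ("Obama", "NNP"), ("was", "VBD"), ("born", "VBN"),
    ("in", "IN"), ("Hawaii", "NNP"), (".", "."),
]

print(proper_noun_spans(sample))  # ['Barack Obama', 'Hawaii']
```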

Example 7: POS Tagging with spaCy


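import spacy

# Install the model first: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("The quick brown fox jumps over the lazy dog.")
for token in doc:
    # pos_ is the coarse Universal POS tag; tag_ holds the fine-grained tag.
    print(f"{token.text}: {token.pos_}")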

Example 8: Visualizing POS Tags Frequency


from collections import Counter
import matplotlib.pyplot as plt

# tagged is the most recent NLTK result (the sentence from Example 6 if run in order).
tag_counts = Counter(tag for word, tag in tagged)
print(tag_counts)

plt.bar(tag_counts.keys(), tag_counts.values())
plt.title("POS Tag Frequency")
plt.xlabel("POS Tags")
plt.ylabel("Count")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
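
Counting fine-grained tags can spread one word class across several bars (NN, NNS, NNP, and so on). A possible variation is to collapse tags into coarse groups by prefix before counting. The sketch below assumes a hypothetical helper named `coarse_tag`, uses only four groups chosen for illustration, and runs on a hand-tagged sample rather than tagger output.

```python
from collections import Counter

# Prefix-to-group pairs chosen for illustration; extend as needed.
COARSE_GROUPS = (("NN", "noun"), ("VB", "verb"), ("JJ", "adjective"), ("RB", "adverb"))


def coarse_tag(tag):
    """Collapse a Penn Treebank tag into a coarse group based on its prefix."""
    for prefix, group in COARSE_GROUPS:
        if tag.startswith(prefix):
            return group
    return "other"


# Hand-tagged sample for illustration only (not produced by a tagger).
sample = [
    ("Barack", "NNP"), ("Obama", "NNP"), ("was", "VBD"), ("born", "VBN"),
    ("in", "IN"), ("Hawaii", "NNP"), (".", "."),
]

coarse_counts = Counter(coarse_tag(tag) for _, tag in sample)
print(coarse_counts)
```

The resulting counter can be passed to plt.bar in the same way as tag_counts.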

Example 9: Combining POS Tags with Dependency Parsing


import spacy
nlp = spacy.load("en_core_web_sm")

doc = nlp("The quick brown fox jumps over the lazy dog.")
for token in doc:
    # Show each token's POS tag next to its dependency label and syntactic head.
    print(f"{token.text} ({token.pos_}) --> {token.dep_} --> {token.head.text}")

Example 10: POS Tagging in Different Languages


import spacy

# Install the French model first: python -m spacy download fr_core_news_sm
nlp_fr = spacy.load("fr_core_news_sm")

doc = nlp_fr("Le renard brun rapide saute par-dessus le chien paresseux.")
for token in doc:
    print(f"{token.text}: {token.pos_}")