SLM for Text Tasks
Build practical NLP applications with small language models
Build production-ready NLP applications using small language models. Learn how to leverage SLMs for classification, extraction, summarization, and structured output generation with minimal resource requirements.
TL;DR
SLMs handle NLP tasks (classification, NER, extraction, summarization) surprisingly well with good prompting. Key patterns: request JSON output explicitly, use a low temperature (0.1) for consistency, extract answers with a regex fallback, and cache repeated calls. On structured tasks, a well-prompted 3B model can approach GPT-3.5 quality at a small fraction of the cost.
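The "extract answers with regex fallback" pattern mentioned above can be factored into one small helper; the helper name below is my own, and every example in this guide inlines a simpler find/rfind version of the same idea:

```python
import json
import re

def extract_json(content: str) -> dict:
    """Pull the first JSON object out of a model response.

    Tries a cheap find/rfind slice first, then falls back to a regex that
    tolerates chatty text around the object. The non-greedy fallback does
    not handle nested objects; it is a last resort, not a JSON parser.
    """
    start = content.find("{")
    end = content.rfind("}") + 1
    if start != -1 and end > start:
        try:
            return json.loads(content[start:end])
        except json.JSONDecodeError:
            pass
    # Regex fallback: try each brace-delimited span until one parses
    for match in re.finditer(r"\{.*?\}", content, re.DOTALL):
        try:
            return json.loads(match.group())
        except json.JSONDecodeError:
            continue
    return {}

# Example: the model wrapped its JSON in conversational filler
reply = 'Sure! Here is the result: {"sentiment": "positive", "confidence": 0.9} Hope that helps.'
print(extract_json(reply)["sentiment"])  # positive
```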
Project Overview
| Aspect | Details |
|---|---|
| Difficulty | Beginner |
| Time | 2-3 hours |
| Prerequisites | Local SLM Setup |
| What You'll Build | Multi-task NLP pipeline with text classification, NER, and extraction |
What You'll Learn
- Text classification and sentiment analysis with SLMs
- Named entity recognition (NER) techniques
- Information extraction from unstructured text
- Structured output generation with Pydantic
- Prompt engineering for reliable results
- Building reusable NLP pipelines
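One pattern from the TL;DR is worth showing up front: at temperature 0.1, repeated inputs produce near-identical outputs, so repeated calls can be cached. A minimal sketch with functools.lru_cache, where `_fake_model` is a stand-in for a real ollama.chat call:

```python
from functools import lru_cache

calls = {"count": 0}

def _fake_model(prompt: str) -> str:
    """Stand-in for ollama.chat; counts how often it is actually invoked."""
    calls["count"] += 1
    return '{"sentiment": "positive"}'

@lru_cache(maxsize=1024)
def classify_cached(text: str) -> str:
    # The cache key is the raw text, which is fine while the prompt template is fixed;
    # bump the cache (or key on the full prompt) if the template changes
    return _fake_model(f"Classify: {text}")

classify_cached("Great product!")
classify_cached("Great product!")  # served from cache, no second model call
print(calls["count"])  # 1
```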
Architecture Overview
┌─────────────────────────────────────────────────────────────────────────────┐
│ NLP Pipeline Architecture │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────────────┐ │
│ │ Input Layer │ │
│ │ ┌─────────┐ ┌───────┐ │ │
│ │ │Raw Text │ │ Batch │ │ │
│ │ └────┬────┘ └───┬───┘ │ │
│ └──────┼──────────┼─────┘ │
│ └────┬─────┘ │
│ ▼ │
│ ┌───────────────────────┐ ┌───────────────────────┐ │
│ │ NLP Pipeline │ │ SLM Backend │ │
│ │ │ │ │ │
│ │ ┌─────────────┐ │ │ ┌─────────────────┐ │ │
│ │ │Preprocessor │ │ │ │ Phi-3 Mini │ │ │
│ │ └──────┬──────┘ │ │ │ (classification)│ │ │
│ │ │ │ │ └─────────────────┘ │ │
│ │ ┌─────┼─────┐ │ │ ┌─────────────────┐ │ │
│ │ │ │ │ │ │ │ Qwen2.5 3B │ │ │
│ │ ▼ ▼ ▼ │ ◄── │ │ (extraction) │ │ │
│ │ ┌─────┐┌────┐┌─────┐ │ │ └─────────────────┘ │ │
│ │ │Class││ NER││Extract│ │ │ ┌─────────────────┐ │ │
│ │ └──┬──┘└──┬─┘└──┬───┘ │ │ │ Gemma 2 2B │ │ │
│ │ └──────┼─────┘ │ │ │ (generation) │ │ │
│ │ ▼ │ │ └─────────────────┘ │ │
│ │ ┌───────────────┐ │ │ │ │
│ │ │ Structure │ │ └───────────────────────┘ │
│ │ │ Generator │ │ │
│ │ └───────┬───────┘ │ │
│ └───────────┼───────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ Output Layer │ │
│ │ ┌──────────────┐ ┌───────────────┐ ┌─────────────┐ │ │
│ │ │Structured JSON│ │Classifications│ │ Entities │ │ │
│ │ └──────────────┘ └───────────────┘ └─────────────┘ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Project Setup
Install Dependencies
# Create project directory
mkdir slm-text-tasks && cd slm-text-tasks
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install ollama pydantic instructor langchain-ollama rich
Pull Required Models
# Recommended models for text tasks
ollama pull phi3:mini # 2.3GB - Great for classification
ollama pull qwen2.5:3b # 2.0GB - Excellent for extraction
ollama pull gemma2:2b # 1.6GB - Fast for simple tasks
Part 1: Text Classification
Basic Classifier
Start with a simple text classifier using prompt engineering.
# classifier.py
import ollama
from enum import Enum
from typing import Optional
from pydantic import BaseModel, Field
class Sentiment(str, Enum):
POSITIVE = "positive"
NEGATIVE = "negative"
NEUTRAL = "neutral"
class Category(str, Enum):
TECHNOLOGY = "technology"
BUSINESS = "business"
SPORTS = "sports"
ENTERTAINMENT = "entertainment"
POLITICS = "politics"
HEALTH = "health"
SCIENCE = "science"
OTHER = "other"
class ClassificationResult(BaseModel):
"""Structured classification output."""
sentiment: Sentiment
category: Category
confidence: float = Field(ge=0.0, le=1.0)
reasoning: Optional[str] = None
def classify_text(
text: str,
model: str = "phi3:mini",
include_reasoning: bool = False
) -> ClassificationResult:
"""Classify text for sentiment and category."""
prompt = f"""Analyze the following text and classify it.
Text: {text}
Respond with ONLY a JSON object in this exact format:
{{
"sentiment": "positive" or "negative" or "neutral",
"category": one of ["technology", "business", "sports", "entertainment", "politics", "health", "science", "other"],
"confidence": number between 0 and 1,
"reasoning": "brief explanation" (only if reasoning requested)
}}
{"Include brief reasoning for your classification." if include_reasoning else "Do not include reasoning."}
JSON response:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1} # Low temperature for consistency
)
# Parse response
import json
content = response["message"]["content"]
# Extract JSON from response
try:
# Try to locate a JSON object in the response
start = content.find("{")
end = content.rfind("}") + 1
if start == -1 or end <= start:
raise ValueError("no JSON object found in response")
data = json.loads(content[start:end])
return ClassificationResult(**data)
except (json.JSONDecodeError, ValueError) as e:
# Fall back to neutral defaults; pydantic's ValidationError subclasses ValueError
return ClassificationResult(
sentiment=Sentiment.NEUTRAL,
category=Category.OTHER,
confidence=0.5,
reasoning=f"Failed to parse: {e}"
)
# Example usage
if __name__ == "__main__":
texts = [
"Apple's new M4 chip delivers unprecedented performance gains in AI workloads.",
"The team's devastating loss marks their fifth consecutive defeat this season.",
"Scientists discover high microbial activity under Antarctic ice sheet.",
]
for text in texts:
result = classify_text(text, include_reasoning=True)
print(f"Text: {text[:50]}...")
print(f" Sentiment: {result.sentiment.value}")
print(f" Category: {result.category.value}")
print(f" Confidence: {result.confidence:.2f}")
print(f" Reasoning: {result.reasoning}")
print()
Multi-Label Classification
Handle texts that belong to multiple categories.
# multi_label_classifier.py
from typing import List
from pydantic import BaseModel, Field
import ollama
class MultiLabelResult(BaseModel):
"""Multi-label classification result."""
labels: List[str]
scores: dict[str, float]
primary_label: str
TOPIC_LABELS = [
"artificial_intelligence",
"machine_learning",
"data_science",
"software_engineering",
"cloud_computing",
"cybersecurity",
"web_development",
"mobile_development",
"devops",
"blockchain"
]
def multi_label_classify(
text: str,
labels: List[str] = TOPIC_LABELS,
threshold: float = 0.5,
model: str = "phi3:mini"
) -> MultiLabelResult:
"""Classify text with multiple labels."""
labels_str = ", ".join(labels)
prompt = f"""Analyze this text and assign relevance scores to each topic.
Text: {text}
Available topics: {labels_str}
For each topic, assign a score from 0.0 to 1.0 based on relevance.
Only include topics with score > 0.3.
Respond with JSON:
{{
"scores": {{"topic_name": score, ...}},
"primary_label": "most relevant topic"
}}
JSON:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1}
)
import json
content = response["message"]["content"]
try:
start = content.find("{")
end = content.rfind("}") + 1
data = json.loads(content[start:end])
scores = data.get("scores", {})
# Filter by threshold
filtered_labels = [l for l, s in scores.items() if s >= threshold]
return MultiLabelResult(
labels=filtered_labels,
scores=scores,
primary_label=data.get("primary_label", filtered_labels[0] if filtered_labels else "unknown")
)
except Exception as e:
return MultiLabelResult(
labels=[],
scores={},
primary_label="unknown"
)
# Example
if __name__ == "__main__":
text = """
We implemented a new CI/CD pipeline using GitHub Actions that automatically
deploys our machine learning models to Kubernetes. The pipeline includes
security scanning and runs our test suite before deployment.
"""
result = multi_label_classify(text)
print(f"Labels: {result.labels}")
print(f"Primary: {result.primary_label}")
print(f"Scores: {result.scores}")
Part 2: Named Entity Recognition
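A caveat before the extractor below: SLMs return entity strings, not character offsets, and a single str.find maps every repeated mention back to the first occurrence. Aligning duplicates takes successive finds with a moving start index; this helper name is my own:

```python
from typing import List, Tuple

def align_mentions(text: str, mention: str) -> List[Tuple[int, int]]:
    """Return (start, end) spans for every occurrence of mention in text."""
    spans = []
    idx = text.find(mention)
    while idx != -1:
        spans.append((idx, idx + len(mention)))
        # Resume the search just past the previous hit
        idx = text.find(mention, idx + len(mention))
    return spans

text = "Apple sued Samsung. Apple won."
print(align_mentions(text, "Apple"))  # [(0, 5), (20, 25)]
```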
Custom NER with SLMs
Extract named entities without requiring specialized NER models.
# ner_extractor.py
from typing import List, Optional
from pydantic import BaseModel
import ollama
class Entity(BaseModel):
"""A named entity extracted from text."""
text: str
type: str
start_idx: Optional[int] = None
end_idx: Optional[int] = None
class NERResult(BaseModel):
"""NER extraction result."""
entities: List[Entity]
original_text: str
ENTITY_TYPES = [
"PERSON",
"ORGANIZATION",
"LOCATION",
"DATE",
"TIME",
"MONEY",
"PRODUCT",
"EVENT",
"TECHNOLOGY",
"EMAIL",
"PHONE"
]
def extract_entities(
text: str,
entity_types: List[str] = ENTITY_TYPES,
model: str = "qwen2.5:3b"
) -> NERResult:
"""Extract named entities from text."""
types_str = ", ".join(entity_types)
prompt = f"""Extract all named entities from the following text.
Text: {text}
Entity types to find: {types_str}
For each entity found, provide:
- text: the exact text of the entity
- type: the entity type from the list above
Respond with JSON:
{{
"entities": [
{{"text": "entity text", "type": "ENTITY_TYPE"}},
...
]
}}
Only include entities you are confident about. JSON:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1}
)
import json
content = response["message"]["content"]
try:
start = content.find("{")
end = content.rfind("}") + 1
data = json.loads(content[start:end])
entities = []
for ent in data.get("entities", []):
entity = Entity(
text=ent["text"],
type=ent["type"]
)
# Locate the entity in the source text (first occurrence only; repeated mentions share it)
idx = text.find(ent["text"])
if idx != -1:
entity.start_idx = idx
entity.end_idx = idx + len(ent["text"])
entities.append(entity)
return NERResult(entities=entities, original_text=text)
except Exception as e:
return NERResult(entities=[], original_text=text)
def highlight_entities(ner_result: NERResult) -> str:
"""Create highlighted text with entity annotations."""
from rich.console import Console
from rich.text import Text
text = Text(ner_result.original_text)
# Color map for entity types
colors = {
"PERSON": "cyan",
"ORGANIZATION": "green",
"LOCATION": "yellow",
"DATE": "magenta",
"MONEY": "red",
"PRODUCT": "blue",
"TECHNOLOGY": "bright_cyan",
}
# Sort by position (reverse) to avoid offset issues
sorted_entities = sorted(
[e for e in ner_result.entities if e.start_idx is not None],
key=lambda x: x.start_idx,
reverse=True
)
for entity in sorted_entities:
color = colors.get(entity.type, "white")
text.stylize(f"bold {color}", entity.start_idx, entity.end_idx)
console = Console()
console.print(text)
# Print legend
print("\nEntities found:")
for entity in ner_result.entities:
print(f" [{entity.type}] {entity.text}")
return str(text)
# Example usage
if __name__ == "__main__":
sample_text = """
On January 15, 2024, Anthropic announced Claude 3, their latest AI assistant.
The San Francisco-based company, founded by Dario Amodei, raised $750 million
in Series C funding. Microsoft and Google are also investing heavily in AI.
Contact press@anthropic.com for more information.
"""
result = extract_entities(sample_text.strip())
highlight_entities(result)
Domain-Specific NER
Customize entity extraction for specific domains.
# domain_ner.py
from typing import List, Dict
from pydantic import BaseModel
import ollama
class DomainEntity(BaseModel):
text: str
type: str
attributes: Dict[str, str] = {}
class DomainNERResult(BaseModel):
domain: str
entities: List[DomainEntity]
# Domain-specific entity configurations
DOMAIN_CONFIGS = {
"medical": {
"entity_types": ["SYMPTOM", "DISEASE", "MEDICATION", "DOSAGE", "BODY_PART", "PROCEDURE", "DOCTOR"],
"examples": [
{"text": "100mg", "type": "DOSAGE"},
{"text": "headache", "type": "SYMPTOM"},
{"text": "ibuprofen", "type": "MEDICATION"}
]
},
"legal": {
"entity_types": ["CASE_NUMBER", "COURT", "JUDGE", "PARTY", "STATUTE", "DATE", "JURISDICTION"],
"examples": [
{"text": "Case No. 2024-CV-123", "type": "CASE_NUMBER"},
{"text": "Supreme Court", "type": "COURT"}
]
},
"financial": {
"entity_types": ["COMPANY", "TICKER", "AMOUNT", "CURRENCY", "PERCENTAGE", "METRIC", "DATE"],
"examples": [
{"text": "AAPL", "type": "TICKER"},
{"text": "$1.2B", "type": "AMOUNT"}
]
},
"ecommerce": {
"entity_types": ["PRODUCT", "BRAND", "PRICE", "SKU", "CATEGORY", "FEATURE", "SIZE", "COLOR"],
"examples": [
{"text": "iPhone 15 Pro", "type": "PRODUCT"},
{"text": "Apple", "type": "BRAND"}
]
}
}
def domain_ner(
text: str,
domain: str,
model: str = "qwen2.5:3b"
) -> DomainNERResult:
"""Extract domain-specific entities."""
config = DOMAIN_CONFIGS.get(domain, {
"entity_types": ["ENTITY"],
"examples": []
})
types_str = ", ".join(config["entity_types"])
examples_str = "\n".join([
f' - "{ex["text"]}" -> {ex["type"]}'
for ex in config["examples"]
])
prompt = f"""You are a {domain} domain expert. Extract all relevant entities.
Text: {text}
Entity types for {domain} domain: {types_str}
Examples:
{examples_str}
Extract entities with any relevant attributes. Respond with JSON:
{{
"entities": [
{{"text": "entity", "type": "TYPE", "attributes": {{"key": "value"}}}},
...
]
}}
JSON:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1}
)
import json
content = response["message"]["content"]
try:
start = content.find("{")
end = content.rfind("}") + 1
data = json.loads(content[start:end])
entities = [
DomainEntity(
text=e["text"],
type=e["type"],
attributes=e.get("attributes", {})
)
for e in data.get("entities", [])
]
return DomainNERResult(domain=domain, entities=entities)
except Exception:
return DomainNERResult(domain=domain, entities=[])
# Example
if __name__ == "__main__":
medical_text = """
Patient presents with persistent headache and fever of 101.5F for 3 days.
Prescribed Tylenol 500mg every 6 hours. Follow up with Dr. Smith in 1 week
if symptoms persist. Consider CT scan if headache worsens.
"""
result = domain_ner(medical_text, "medical")
print(f"Domain: {result.domain}")
for entity in result.entities:
attrs = f" ({entity.attributes})" if entity.attributes else ""
print(f" [{entity.type}] {entity.text}{attrs}")
Part 3: Information Extraction
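The extractor below builds its prompt from the schema Pydantic emits, so it helps to see what model_json_schema() actually returns (shape described here per pydantic v2):

```python
from typing import Optional
from pydantic import BaseModel

class ContactInfo(BaseModel):
    name: Optional[str] = None
    email: Optional[str] = None

schema = ContactInfo.model_json_schema()
print(sorted(schema["properties"].keys()))  # ['email', 'name']

# Optional[str] fields are emitted as anyOf [string, null] rather than a plain
# "type" key, which is why the extractor falls back to "any" for such fields
for field, info in schema["properties"].items():
    print(field, info.get("type", "any"))
```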
Structured Data Extraction
Extract structured information from unstructured text using SLMs.
# info_extractor.py
from typing import List, Optional
from pydantic import BaseModel
import ollama
class ContactInfo(BaseModel):
"""Extracted contact information."""
name: Optional[str] = None
email: Optional[str] = None
phone: Optional[str] = None
company: Optional[str] = None
title: Optional[str] = None
address: Optional[str] = None
class EventInfo(BaseModel):
"""Extracted event information."""
name: str
date: Optional[str] = None
time: Optional[str] = None
location: Optional[str] = None
organizer: Optional[str] = None
description: Optional[str] = None
class ProductInfo(BaseModel):
"""Extracted product information."""
name: str
price: Optional[str] = None
features: List[str] = []
specifications: dict = {}
brand: Optional[str] = None
category: Optional[str] = None
def extract_structured_data(
text: str,
schema: type[BaseModel],
model: str = "phi3:mini"
) -> BaseModel:
"""Extract structured data based on a Pydantic schema."""
# Get schema JSON for prompt
schema_json = schema.model_json_schema()
# Create field descriptions
fields_desc = []
for field_name, field_info in schema_json.get("properties", {}).items():
field_type = field_info.get("type", "any")
description = field_info.get("description", "")
fields_desc.append(f"- {field_name} ({field_type}): {description}")
fields_str = "\n".join(fields_desc)
prompt = f"""Extract structured information from the text.
Text: {text}
Extract the following fields:
{fields_str}
Respond with a JSON object containing the extracted fields.
Use null for fields you cannot find. JSON:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1}
)
import json
content = response["message"]["content"]
try:
start = content.find("{")
end = content.rfind("}") + 1
data = json.loads(content[start:end])
return schema(**data)
except Exception:
# Fall back to an unvalidated empty instance; schema() would fail on required fields
return schema.model_construct()
# Specialized extractors
def extract_contact(text: str, model: str = "phi3:mini") -> ContactInfo:
"""Extract contact information from text."""
return extract_structured_data(text, ContactInfo, model)
def extract_event(text: str, model: str = "phi3:mini") -> EventInfo:
"""Extract event information from text."""
return extract_structured_data(text, EventInfo, model)
def extract_product(text: str, model: str = "phi3:mini") -> ProductInfo:
"""Extract product information from text."""
return extract_structured_data(text, ProductInfo, model)
# Example
if __name__ == "__main__":
# Test contact extraction
contact_text = """
Hi, I'm Sarah Johnson, the VP of Engineering at TechCorp Inc.
You can reach me at sarah.johnson@techcorp.com or call
(555) 123-4567. Our office is at 123 Innovation Drive,
San Francisco, CA 94105.
"""
contact = extract_contact(contact_text)
print("Contact Info:")
print(f" Name: {contact.name}")
print(f" Email: {contact.email}")
print(f" Phone: {contact.phone}")
print(f" Company: {contact.company}")
print(f" Title: {contact.title}")
print()
# Test event extraction
event_text = """
Join us for the Annual AI Conference 2024!
Date: March 15-17, 2024
Location: Moscone Center, San Francisco
Hosted by the AI Research Foundation
Three days of cutting-edge AI presentations and workshops.
"""
event = extract_event(event_text)
print("Event Info:")
print(f" Name: {event.name}")
print(f" Date: {event.date}")
print(f" Location: {event.location}")
print(f" Organizer: {event.organizer}")
Relationship Extraction
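The (subject, predicate, object) triples extracted below drop straight into a graph structure. A minimal adjacency-list sketch, independent of any model call:

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def build_adjacency(triples: List[Tuple[str, str, str]]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each subject to its outgoing (predicate, object) edges."""
    graph = defaultdict(list)
    for subj, pred, obj in triples:
        graph[subj].append((pred, obj))
    return dict(graph)

triples = [
    ("Elon Musk", "ceo_of", "Tesla"),
    ("Elon Musk", "ceo_of", "SpaceX"),
    ("Tesla", "located_in", "Austin"),
]
graph = build_adjacency(triples)
print(graph["Elon Musk"])  # [('ceo_of', 'Tesla'), ('ceo_of', 'SpaceX')]
```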
Extract relationships between entities.
# relationship_extractor.py
from typing import List, Tuple
from pydantic import BaseModel
import ollama
class Relationship(BaseModel):
"""A relationship between two entities."""
subject: str
predicate: str
object: str
confidence: float = 1.0
class RelationshipGraph(BaseModel):
"""Graph of extracted relationships."""
relationships: List[Relationship]
entities: List[str]
def to_triples(self) -> List[Tuple[str, str, str]]:
"""Convert to list of (subject, predicate, object) triples."""
return [(r.subject, r.predicate, r.object) for r in self.relationships]
def extract_relationships(
text: str,
relationship_types: List[str] = None,
model: str = "qwen2.5:3b"
) -> RelationshipGraph:
"""Extract entity relationships from text."""
types_hint = ""
if relationship_types:
types_hint = f"\nFocus on these relationship types: {', '.join(relationship_types)}"
prompt = f"""Extract relationships between entities in this text.
Text: {text}
{types_hint}
For each relationship, identify:
- subject: the entity performing/having the relationship
- predicate: the relationship type (e.g., "works_at", "founded", "located_in")
- object: the entity being related to
Respond with JSON:
{{
"relationships": [
{{"subject": "...", "predicate": "...", "object": "..."}},
...
]
}}
JSON:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1}
)
import json
content = response["message"]["content"]
try:
start = content.find("{")
end = content.rfind("}") + 1
data = json.loads(content[start:end])
relationships = [
Relationship(**r) for r in data.get("relationships", [])
]
# Extract unique entities
entities = set()
for r in relationships:
entities.add(r.subject)
entities.add(r.object)
return RelationshipGraph(
relationships=relationships,
entities=list(entities)
)
except Exception:
return RelationshipGraph(relationships=[], entities=[])
def visualize_graph(graph: RelationshipGraph) -> str:
"""Create a simple text visualization of the relationship graph."""
output = ["Relationship Graph:", "=" * 40]
for rel in graph.relationships:
output.append(f" {rel.subject} --[{rel.predicate}]--> {rel.object}")
output.append("")
output.append(f"Entities ({len(graph.entities)}): {', '.join(graph.entities)}")
return "\n".join(output)
# Example
if __name__ == "__main__":
text = """
Elon Musk is the CEO of Tesla and SpaceX. Tesla is headquartered in
Austin, Texas. SpaceX was founded in 2002 and operates from Hawthorne,
California. Musk also acquired Twitter in 2022 and renamed it to X.
"""
graph = extract_relationships(text)
print(visualize_graph(graph))
print("\nTriples:")
for triple in graph.to_triples():
print(f" {triple}")
Part 4: Text Summarization
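Two small mechanics in the summarizer below are easy to misread: str.lstrip strips any run of the listed characters, not a prefix string, and the compression ratio is just a word-count quotient. Both in miniature:

```python
# lstrip removes leading characters from the set, so "1) " falls away in one call.
# Caveat: a point that legitimately starts with a digit (e.g. "2024 saw...")
# would lose its leading digits too.
line = "1) Models are cheap to run."
clean = line.lstrip("•-*0123456789.) ").strip()
print(clean)  # Models are cheap to run.

# Compression ratio: summary words over original words
original = "Artificial intelligence has transformed many industries around the world recently."
summary = "AI transformed industries."
ratio = len(summary.split()) / len(original.split())
print(f"{ratio:.1%}")  # 30.0%
```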
Multi-Strategy Summarization
Implement different summarization strategies for various use cases.
# summarizer.py
from typing import List, Literal
from pydantic import BaseModel
from enum import Enum
import ollama
class SummaryType(str, Enum):
EXTRACTIVE = "extractive"
ABSTRACTIVE = "abstractive"
BULLET_POINTS = "bullet_points"
HEADLINE = "headline"
TL_DR = "tldr"
class Summary(BaseModel):
"""Summarization result."""
original_length: int
summary_length: int
compression_ratio: float
summary_type: SummaryType
content: str
key_points: List[str] = []
def summarize(
text: str,
summary_type: SummaryType = SummaryType.ABSTRACTIVE,
max_length: int = 150,
model: str = "phi3:mini"
) -> Summary:
"""Summarize text using specified strategy."""
strategy_prompts = {
SummaryType.EXTRACTIVE: f"""Extract the most important sentences from this text verbatim.
Select 2-3 key sentences that capture the main points.
Text: {text}
Important sentences:""",
SummaryType.ABSTRACTIVE: f"""Write a concise summary of this text in your own words.
Keep it under {max_length} words.
Text: {text}
Summary:""",
SummaryType.BULLET_POINTS: f"""Summarize this text as 3-5 bullet points.
Each point should be one clear, complete thought.
Text: {text}
Bullet points:""",
SummaryType.HEADLINE: f"""Write a single headline (under 15 words) that captures the main point.
Text: {text}
Headline:""",
SummaryType.TL_DR: f"""Write a TL;DR (Too Long; Didn't Read) summary in 1-2 sentences.
Text: {text}
TL;DR:"""
}
prompt = strategy_prompts[summary_type]
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.3}
)
summary_text = response["message"]["content"].strip()
# Extract key points for bullet point summaries
key_points = []
if summary_type == SummaryType.BULLET_POINTS:
for line in summary_text.split("\n"):
line = line.strip()
if line.startswith(("•", "-", "*", "1", "2", "3", "4", "5")):
# Remove bullet/number prefix
clean = line.lstrip("•-*0123456789.) ").strip()
if clean:
key_points.append(clean)
original_len = len(text.split())
summary_len = len(summary_text.split())
return Summary(
original_length=original_len,
summary_length=summary_len,
compression_ratio=summary_len / original_len if original_len > 0 else 0,
summary_type=summary_type,
content=summary_text,
key_points=key_points
)
def multi_summarize(text: str, model: str = "phi3:mini") -> dict:
"""Generate multiple summary types for the same text."""
return {
summary_type.value: summarize(text, summary_type, model=model)
for summary_type in SummaryType
}
# Example
if __name__ == "__main__":
article = """
Artificial intelligence has made remarkable strides in recent years,
transforming industries from healthcare to finance. Machine learning
models can now diagnose diseases, predict market trends, and even
generate creative content. However, these advances come with significant
challenges including bias in training data, lack of explainability,
and concerns about job displacement. Researchers are working on
developing more transparent and fair AI systems. Governments worldwide
are beginning to implement regulations to ensure AI is developed and
deployed responsibly. The next decade will be crucial in determining
how AI shapes our society and economy.
"""
print("=" * 60)
print("SUMMARIZATION EXAMPLES")
print("=" * 60)
for stype in SummaryType:
result = summarize(article, stype)
print(f"\n{stype.value.upper()}:")
print(f" {result.content}")
print(f" [Compression: {result.compression_ratio:.1%}]")
Part 5: Intent Detection and Slot Filling
Conversational Intent Detection
Build an intent classifier for conversational AI applications.
# intent_detector.py
from typing import List, Dict, Optional
from pydantic import BaseModel
import ollama
class Intent(BaseModel):
"""Detected user intent."""
name: str
confidence: float
slots: Dict[str, str] = {}
class IntentResult(BaseModel):
"""Intent detection result."""
primary_intent: Intent
alternative_intents: List[Intent] = []
raw_input: str
# Define intents and their slots
INTENT_SCHEMA = {
"book_flight": {
"description": "User wants to book a flight",
"slots": ["origin", "destination", "date", "passengers", "class"]
},
"check_weather": {
"description": "User wants weather information",
"slots": ["location", "date"]
},
"set_reminder": {
"description": "User wants to set a reminder",
"slots": ["task", "datetime", "recurrence"]
},
"play_music": {
"description": "User wants to play music",
"slots": ["song", "artist", "genre", "playlist"]
},
"order_food": {
"description": "User wants to order food",
"slots": ["item", "quantity", "restaurant", "delivery_address"]
},
"get_directions": {
"description": "User wants navigation help",
"slots": ["origin", "destination", "mode"]
},
"general_question": {
"description": "User has a general question",
"slots": ["topic"]
},
"chitchat": {
"description": "Casual conversation",
"slots": []
}
}
def detect_intent(
user_input: str,
intent_schema: dict = INTENT_SCHEMA,
model: str = "phi3:mini"
) -> IntentResult:
"""Detect intent and extract slots from user input."""
# Build intent descriptions for prompt
intent_desc = "\n".join([
f"- {name}: {info['description']} (slots: {', '.join(info['slots'])})"
for name, info in intent_schema.items()
])
prompt = f"""Analyze this user input and determine the intent.
User input: "{user_input}"
Available intents:
{intent_desc}
Respond with JSON:
{{
"intent": "intent_name",
"confidence": 0.0-1.0,
"slots": {{"slot_name": "extracted_value", ...}},
"alternatives": [{{"intent": "name", "confidence": 0.0-1.0}}]
}}
Extract slot values from the user input when present. JSON:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1}
)
import json
content = response["message"]["content"]
try:
start = content.find("{")
end = content.rfind("}") + 1
data = json.loads(content[start:end])
primary = Intent(
name=data.get("intent", "general_question"),
confidence=data.get("confidence", 0.5),
slots=data.get("slots", {})
)
alternatives = [
Intent(name=alt["intent"], confidence=alt["confidence"])
for alt in data.get("alternatives", [])
]
return IntentResult(
primary_intent=primary,
alternative_intents=alternatives,
raw_input=user_input
)
except Exception:
return IntentResult(
primary_intent=Intent(name="general_question", confidence=0.5),
raw_input=user_input
)
# Example
if __name__ == "__main__":
test_inputs = [
"Book me a flight from New York to London next Friday for 2 people",
"What's the weather like in Tokyo tomorrow?",
"Remind me to call mom at 5pm",
"Play some jazz music",
"How do I get to the airport from downtown?",
"Hey, how's it going?"
]
print("Intent Detection Results")
print("=" * 60)
for input_text in test_inputs:
result = detect_intent(input_text)
print(f"\nInput: {input_text}")
print(f"Intent: {result.primary_intent.name} ({result.primary_intent.confidence:.0%})")
if result.primary_intent.slots:
print(f"Slots: {result.primary_intent.slots}")
Part 6: Building a Complete NLP Pipeline
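One robustness pattern the pipeline below omits: retrying when the model returns unparseable JSON. A stubbed sketch of the pattern; the `flaky_model` stand-in (my own, not part of any library) fails once, then succeeds:

```python
import json
from typing import Callable

def call_with_retry(model_fn: Callable[[str], str], prompt: str, retries: int = 2) -> dict:
    """Call model_fn and parse its JSON reply, retrying on parse failure."""
    for attempt in range(retries + 1):
        content = model_fn(prompt)
        try:
            start = content.find("{")
            end = content.rfind("}") + 1
            return json.loads(content[start:end])
        except (json.JSONDecodeError, ValueError):
            continue  # try again; real code might also log or nudge the prompt
    return {}

attempts = []
def flaky_model(prompt: str) -> str:
    attempts.append(prompt)
    return "Sorry, I can't." if len(attempts) == 1 else '{"intent": "chitchat"}'

print(call_with_retry(flaky_model, "Detect intent: hi"))  # {'intent': 'chitchat'}
```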
Unified Pipeline
Combine all components into a reusable pipeline.
# nlp_pipeline.py
from typing import List, Dict, Any, Optional
from pydantic import BaseModel
from enum import Enum
import ollama
from dataclasses import dataclass
import time
class TaskType(str, Enum):
CLASSIFY = "classify"
NER = "ner"
EXTRACT = "extract"
SUMMARIZE = "summarize"
INTENT = "intent"
ALL = "all"
@dataclass
class PipelineConfig:
"""Pipeline configuration."""
model: str = "phi3:mini"
temperature: float = 0.1
tasks: Optional[List[TaskType]] = None
def __post_init__(self):
if self.tasks is None:
self.tasks = [TaskType.ALL]
class PipelineResult(BaseModel):
"""Complete pipeline result."""
text: str
processing_time_ms: float
results: Dict[str, Any]
class NLPPipeline:
"""Unified NLP pipeline using SLMs."""
def __init__(self, config: PipelineConfig = None):
self.config = config or PipelineConfig()
def _call_model(self, prompt: str) -> str:
"""Make a model call with consistent settings."""
response = ollama.chat(
model=self.config.model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": self.config.temperature}
)
return response["message"]["content"]
def _parse_json(self, content: str) -> dict:
"""Parse JSON from model response."""
import json
try:
start = content.find("{")
end = content.rfind("}") + 1
if start != -1 and end > start:
return json.loads(content[start:end])
except ValueError: # includes json.JSONDecodeError
pass
return {}
def classify(self, text: str) -> dict:
"""Run classification."""
prompt = f"""Classify this text.
Text: {text}
JSON response with sentiment (positive/negative/neutral),
category, and confidence (0-1):"""
result = self._call_model(prompt)
return self._parse_json(result)
def extract_entities(self, text: str) -> dict:
"""Run NER."""
prompt = f"""Extract named entities (PERSON, ORG, LOCATION, DATE, etc).
Text: {text}
JSON with "entities" list containing text and type:"""
result = self._call_model(prompt)
return self._parse_json(result)
def extract_info(self, text: str) -> dict:
"""Run information extraction."""
prompt = f"""Extract key information as structured data.
Text: {text}
JSON with relevant fields (names, dates, amounts, etc):"""
result = self._call_model(prompt)
return self._parse_json(result)
def summarize(self, text: str) -> dict:
"""Run summarization."""
prompt = f"""Summarize in 2-3 sentences.
Text: {text}
JSON with "summary" and "key_points" list:"""
result = self._call_model(prompt)
return self._parse_json(result)
def detect_intent(self, text: str) -> dict:
"""Run intent detection."""
prompt = f"""Detect the intent and extract slots.
Text: {text}
JSON with "intent", "confidence", and "slots":"""
result = self._call_model(prompt)
return self._parse_json(result)
def process(self, text: str) -> PipelineResult:
"""Run the complete pipeline."""
start_time = time.time()
results = {}
tasks = self.config.tasks
if TaskType.ALL in tasks:
tasks = [TaskType.CLASSIFY, TaskType.NER, TaskType.EXTRACT,
TaskType.SUMMARIZE, TaskType.INTENT]
for task in tasks:
if task == TaskType.CLASSIFY:
results["classification"] = self.classify(text)
elif task == TaskType.NER:
results["entities"] = self.extract_entities(text)
elif task == TaskType.EXTRACT:
results["extraction"] = self.extract_info(text)
elif task == TaskType.SUMMARIZE:
results["summary"] = self.summarize(text)
elif task == TaskType.INTENT:
results["intent"] = self.detect_intent(text)
processing_time = (time.time() - start_time) * 1000
return PipelineResult(
text=text,
processing_time_ms=processing_time,
results=results
)
def batch_process(self, texts: List[str]) -> List[PipelineResult]:
"""Process multiple texts."""
return [self.process(text) for text in texts]
# Example usage
if __name__ == "__main__":
# Create pipeline
pipeline = NLPPipeline(PipelineConfig(
model="phi3:mini",
tasks=[TaskType.CLASSIFY, TaskType.NER, TaskType.SUMMARIZE]
))
text = """
Microsoft announced today that CEO Satya Nadella will present the company's
new AI strategy at the Build 2024 conference in Seattle. The presentation
will cover Azure AI services and the integration of GPT-4 across Office 365.
Analysts expect this to drive significant growth in cloud revenue.
"""
result = pipeline.process(text.strip())
print("NLP Pipeline Results")
print("=" * 60)
print(f"Processing time: {result.processing_time_ms:.0f}ms")
print()
for task_name, task_result in result.results.items():
print(f"{task_name.upper()}:")
for key, value in task_result.items():
print(f" {key}: {value}")
        print()

FastAPI Application
Expose the pipeline as a REST API.
# api.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List, Optional
import uvicorn
from nlp_pipeline import NLPPipeline, PipelineConfig, TaskType
app = FastAPI(
title="SLM Text Tasks API",
description="NLP tasks powered by small language models",
version="1.0.0"
)
# Initialize pipeline
pipeline = NLPPipeline(PipelineConfig(model="phi3:mini"))
class TextRequest(BaseModel):
text: str
tasks: Optional[List[str]] = None
class ClassifyRequest(BaseModel):
text: str
include_reasoning: bool = False
class NERRequest(BaseModel):
text: str
entity_types: Optional[List[str]] = None
class ExtractRequest(BaseModel):
text: str
schema_type: str = "auto" # auto, contact, event, product
class SummarizeRequest(BaseModel):
text: str
style: str = "abstractive" # extractive, abstractive, bullet_points, headline, tldr
class IntentRequest(BaseModel):
text: str
class BatchRequest(BaseModel):
texts: List[str]
tasks: Optional[List[str]] = None
@app.get("/health")
async def health_check():
"""Health check endpoint."""
return {"status": "healthy", "model": pipeline.config.model}
@app.post("/analyze")
async def analyze_text(request: TextRequest):
"""Run full NLP analysis on text."""
try:
if request.tasks:
task_types = [TaskType(t) for t in request.tasks]
pipeline.config.tasks = task_types
else:
pipeline.config.tasks = [TaskType.ALL]
result = pipeline.process(request.text)
return result.model_dump()
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))
@app.post("/classify")
async def classify_text(request: ClassifyRequest):
"""Classify text sentiment and category."""
result = pipeline.classify(request.text)
return result
@app.post("/ner")
async def extract_entities(request: NERRequest):
"""Extract named entities from text."""
result = pipeline.extract_entities(request.text)
return result
@app.post("/extract")
async def extract_info(request: ExtractRequest):
"""Extract structured information from text."""
result = pipeline.extract_info(request.text)
return result
@app.post("/summarize")
async def summarize_text(request: SummarizeRequest):
"""Summarize text."""
result = pipeline.summarize(request.text)
return result
@app.post("/intent")
async def detect_intent(request: IntentRequest):
"""Detect user intent."""
result = pipeline.detect_intent(request.text)
return result
@app.post("/batch")
async def batch_analyze(request: BatchRequest):
"""Analyze multiple texts."""
    try:
        if request.tasks:
            pipeline.config.tasks = [TaskType(t) for t in request.tasks]
        else:
            # Reset to ALL so a previous request's task list does not leak in
            pipeline.config.tasks = [TaskType.ALL]
results = pipeline.batch_process(request.texts)
return [r.model_dump() for r in results]
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))
if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

Test the API
# Start the server
python api.py
# Test endpoints
curl -X POST http://localhost:8000/classify \
-H "Content-Type: application/json" \
-d '{"text": "The product quality exceeded my expectations!"}'
curl -X POST http://localhost:8000/ner \
-H "Content-Type: application/json" \
-d '{"text": "Tim Cook announced new Apple products at WWDC in San Jose."}'
curl -X POST http://localhost:8000/analyze \
-H "Content-Type: application/json" \
  -d '{"text": "Book a flight to Paris for next Monday", "tasks": ["intent", "ner"]}'

Model Selection Guide
| Task | Recommended Model | Why |
|---|---|---|
| Classification | phi3:mini | Fast, good at structured output |
| NER | qwen2.5:3b | Better entity boundary detection |
| Extraction | phi3:mini | Strong instruction following |
| Summarization | gemma2:2b | Natural language generation |
| Intent | phi3:mini | Good at classification + slots |
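The table above can be written down as a small routing map. The single-model `PipelineConfig` used in this tutorial takes one model, so a per-task mapping like this is a hypothetical extension, not part of the pipeline's API:

```python
# Per-task model routing, mirroring the recommendation table above.
# The mapping and pick_model helper are illustrative additions.
TASK_MODELS = {
    "classify": "phi3:mini",
    "ner": "qwen2.5:3b",
    "extract": "phi3:mini",
    "summarize": "gemma2:2b",
    "intent": "phi3:mini",
}

def pick_model(task: str, default: str = "phi3:mini") -> str:
    """Return the recommended model for a task, falling back to a default."""
    return TASK_MODELS.get(task, default)
```

One pipeline instance per model (keyed by `pick_model(task)`) keeps each task on its best-suited backend without changing the per-task methods.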
Prompt Engineering Tips
Be Specific About Format
# Good - explicit format
prompt = """
Respond with ONLY a JSON object:
{"sentiment": "positive/negative/neutral", "confidence": 0.0-1.0}
"""
# Bad - ambiguous format
prompt = "What is the sentiment? Return JSON."

Use Few-Shot Examples
# Few-shot for consistent output
prompt = """
Extract entities. Examples:
"Apple released iPhone" -> [{"text": "Apple", "type": "ORG"}, {"text": "iPhone", "type": "PRODUCT"}]
"Tim Cook visited Berlin" -> [{"text": "Tim Cook", "type": "PERSON"}, {"text": "Berlin", "type": "LOCATION"}]
Now extract from: {text}
"""

Handle Edge Cases
def safe_extract(text: str) -> dict:
    """Extract with fallback handling."""
    if not text or len(text.strip()) < 5:
        return {"error": "Text too short"}
    if len(text) > 10000:
        # Truncate long text (or split into chunks and merge the results)
        text = text[:10000]
    # ... extraction logic

Performance Optimization
Batch Processing
import asyncio

async def batch_classify(texts: List[str], batch_size: int = 5):
    """Process texts in batches for better throughput."""
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        # Run the batch concurrently; classify_async is assumed to wrap
        # a non-blocking call to the model backend
        batch_results = await asyncio.gather(
            *[classify_async(text) for text in batch]
        )
        results.extend(batch_results)
    return results

Caching
from functools import lru_cache

@lru_cache(maxsize=1000)
def classify_with_cache(text: str) -> dict:
    """Cache classification results.

    lru_cache already keys on the text argument, so hashing the
    text separately adds nothing; cache on the text directly.
    """
    return pipeline.classify(text)

Exercises
- Custom Entity Types: Add support for extracting custom entity types specific to your domain (e.g., programming languages, frameworks, APIs)
- Confidence Calibration: Implement a calibration layer that adjusts model confidence scores based on validation data
- Multi-Language Support: Extend the pipeline to handle multiple languages with automatic language detection
- Active Learning: Build a feedback loop where low-confidence predictions are flagged for human review
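For the active-learning exercise, a starting-point sketch is a triage step between prediction and response. The 0.7 threshold and the in-memory queue are assumptions; a real system would persist flagged items and feed corrections back into evaluation or fine-tuning:

```python
# Sketch only: route low-confidence predictions to a human review queue.
# REVIEW_THRESHOLD and the in-memory list are illustrative choices.
REVIEW_THRESHOLD = 0.7
review_queue: list = []

def triage(text: str, prediction: dict) -> dict:
    """Accept confident predictions; flag the rest for human review."""
    confidence = prediction.get("confidence", 0.0)
    if confidence < REVIEW_THRESHOLD:
        review_queue.append({"text": text, "prediction": prediction})
        prediction = {**prediction, "needs_review": True}
    return prediction
```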
Key Concepts Recap
| Concept | What It Is | Why It Matters |
|---|---|---|
| Text Classification | Assign labels (sentiment, category) to text | Foundation for routing, filtering, analysis |
| NER | Named Entity Recognition - extract people, orgs, dates | Structured data from unstructured text |
| Information Extraction | Pull specific fields (email, phone, price) | Automate data entry from documents |
| Structured Output | Force JSON responses with Pydantic schemas | Reliable parsing, type safety |
| Low Temperature | temperature=0.1 for deterministic output | Consistent, reproducible results |
| JSON Extraction | Use regex to find {...} in response | Robust parsing when model adds text |
| Intent Detection | Classify user intent + extract slots | Build conversational AI systems |
| Relationship Extraction | Find (subject, predicate, object) triples | Build knowledge graphs from text |
| Few-Shot Prompting | Include examples in prompt | Guide model to exact output format |
| Batch Processing | Process multiple texts together | Higher throughput, better efficiency |
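The JSON-extraction pattern from the recap can be sketched in a few lines: grab the outermost `{...}` span with a regex, then let `json.loads` validate it. The function name is illustrative:

```python
import json
import re
from typing import Optional

def extract_json(response: str) -> Optional[dict]:
    """Pull the first {...} block out of a model response.

    SLMs often wrap JSON in prose ("Sure! Here is the result: {...}").
    A greedy regex over the outermost braces recovers the object;
    json.loads validates it. Returns None if nothing parseable is found.
    """
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```

The greedy `.*` with `re.DOTALL` spans from the first `{` to the last `}`, so nested objects and multi-line responses survive intact.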
Next Steps
- SLM Evaluation & Benchmarking - Measure and compare model performance
- SLM Fine-tuning - Customize models for your specific tasks
- SLM-Powered RAG - Combine SLMs with retrieval systems