SLM for Text Tasks
Build practical NLP applications with small language models
Build production-ready NLP applications using small language models. Learn how to leverage SLMs for classification, extraction, summarization, and structured output generation with minimal resource requirements.
TL;DR
SLMs handle NLP tasks (classification, NER, extraction, summarization) surprisingly well with good prompting. Key patterns: request JSON output explicitly, use a low temperature (0.1) for consistency, extract answers with a regex fallback, and cache repeated calls. On structured tasks, a well-prompted 3B model can approach GPT-3.5 quality at a small fraction of the cost.
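The "extract answers with regex fallback" pattern mentioned above can be factored into one small helper; the helper name below is my own, and every example in this guide inlines a simpler find/rfind version of the same idea:

```python
import json
import re

def extract_json(content: str) -> dict:
    """Pull the first JSON object out of a model response.

    Tries a cheap find/rfind slice first, then falls back to a regex that
    tolerates chatty text around the object. The non-greedy fallback does
    not handle nested objects; it is a last resort, not a JSON parser.
    """
    start = content.find("{")
    end = content.rfind("}") + 1
    if start != -1 and end > start:
        try:
            return json.loads(content[start:end])
        except json.JSONDecodeError:
            pass
    # Regex fallback: try each brace-delimited span until one parses
    for match in re.finditer(r"\{.*?\}", content, re.DOTALL):
        try:
            return json.loads(match.group())
        except json.JSONDecodeError:
            continue
    return {}

# Example: the model wrapped its JSON in conversational filler
reply = 'Sure! Here is the result: {"sentiment": "positive", "confidence": 0.9} Hope that helps.'
print(extract_json(reply)["sentiment"])  # positive
```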
Project Overview
| Aspect | Details |
|---|---|
| Difficulty | Beginner |
| Time | 2-3 hours |
| Prerequisites | Local SLM Setup |
| What You'll Build | Multi-task NLP pipeline with text classification, NER, and extraction |
What You'll Learn
- Text classification and sentiment analysis with SLMs
- Named entity recognition (NER) techniques
- Information extraction from unstructured text
- Structured output generation with Pydantic
- Prompt engineering for reliable results
- Building reusable NLP pipelines
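One pattern from the TL;DR is worth showing up front: at temperature 0.1, repeated inputs produce near-identical outputs, so repeated calls can be cached. A minimal sketch with functools.lru_cache, where `_fake_model` is a stand-in for a real ollama.chat call:

```python
from functools import lru_cache

calls = {"count": 0}

def _fake_model(prompt: str) -> str:
    """Stand-in for ollama.chat; counts how often it is actually invoked."""
    calls["count"] += 1
    return '{"sentiment": "positive"}'

@lru_cache(maxsize=1024)
def classify_cached(text: str) -> str:
    # The cache key is the raw text, which is fine while the prompt template is fixed;
    # bump the cache (or key on the full prompt) if the template changes
    return _fake_model(f"Classify: {text}")

classify_cached("Great product!")
classify_cached("Great product!")  # served from cache, no second model call
print(calls["count"])  # 1
```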
Architecture Overview
┌─────────────────────────────────────────────────────────────────────────────┐
│ NLP Pipeline Architecture │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────────────┐ │
│ │ Input Layer │ │
│ │ ┌─────────┐ ┌───────┐ │ │
│ │ │Raw Text │ │ Batch │ │ │
│ │ └────┬────┘ └───┬───┘ │ │
│ └──────┼──────────┼─────┘ │
│ └────┬─────┘ │
│ ▼ │
│ ┌───────────────────────┐ ┌───────────────────────┐ │
│ │ NLP Pipeline │ │ SLM Backend │ │
│ │ │ │ │ │
│ │ ┌─────────────┐ │ │ ┌─────────────────┐ │ │
│ │ │Preprocessor │ │ │ │ Phi-3 Mini │ │ │
│ │ └──────┬──────┘ │ │ │ (classification)│ │ │
│ │ │ │ │ └─────────────────┘ │ │
│ │ ┌─────┼─────┐ │ │ ┌─────────────────┐ │ │
│ │ │ │ │ │ │ │ Qwen2.5 3B │ │ │
│ │ ▼ ▼ ▼ │ ◄── │ │ (extraction) │ │ │
│ │ ┌─────┐┌────┐┌─────┐ │ │ └─────────────────┘ │ │
│ │ │Class││ NER││Extract│ │ │ ┌─────────────────┐ │ │
│ │ └──┬──┘└──┬─┘└──┬───┘ │ │ │ Gemma 2 2B │ │ │
│ │ └──────┼─────┘ │ │ │ (generation) │ │ │
│ │ ▼ │ │ └─────────────────┘ │ │
│ │ ┌───────────────┐ │ │ │ │
│ │ │ Structure │ │ └───────────────────────┘ │
│ │ │ Generator │ │ │
│ │ └───────┬───────┘ │ │
│ └───────────┼───────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ Output Layer │ │
│ │ ┌──────────────┐ ┌───────────────┐ ┌─────────────┐ │ │
│ │ │Structured JSON│ │Classifications│ │ Entities │ │ │
│ │ └──────────────┘ └───────────────┘ └─────────────┘ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Project Setup
Install Dependencies
# Create project directory
mkdir slm-text-tasks && cd slm-text-tasks
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install ollama pydantic instructor langchain-ollama rich
Pull Required Models
# Recommended models for text tasks
ollama pull phi3:mini # 2.3GB - Great for classification
ollama pull qwen2.5:3b # 2.0GB - Excellent for extraction
ollama pull gemma2:2b # 1.6GB - Fast for simple tasks
Part 1: Text Classification
Basic Classifier
Start with a simple text classifier using prompt engineering.
# classifier.py
import ollama
from enum import Enum
from typing import Optional
from pydantic import BaseModel, Field
class Sentiment(str, Enum):
POSITIVE = "positive"
NEGATIVE = "negative"
NEUTRAL = "neutral"
class Category(str, Enum):
TECHNOLOGY = "technology"
BUSINESS = "business"
SPORTS = "sports"
ENTERTAINMENT = "entertainment"
POLITICS = "politics"
HEALTH = "health"
SCIENCE = "science"
OTHER = "other"
class ClassificationResult(BaseModel):
"""Structured classification output."""
sentiment: Sentiment
category: Category
confidence: float = Field(ge=0.0, le=1.0)
reasoning: Optional[str] = None
def classify_text(
text: str,
model: str = "phi3:mini",
include_reasoning: bool = False
) -> ClassificationResult:
"""Classify text for sentiment and category."""
prompt = f"""Analyze the following text and classify it.
Text: {text}
Respond with ONLY a JSON object in this exact format:
{{
"sentiment": "positive" or "negative" or "neutral",
"category": one of ["technology", "business", "sports", "entertainment", "politics", "health", "science", "other"],
"confidence": number between 0 and 1,
"reasoning": "brief explanation" (only if reasoning requested)
}}
{"Include brief reasoning for your classification." if include_reasoning else "Do not include reasoning."}
JSON response:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1} # Low temperature for consistency
)
# Parse response
import json
content = response["message"]["content"]
# Extract JSON from response
try:
# Try to locate a JSON object in the response
start = content.find("{")
end = content.rfind("}") + 1
if start == -1 or end <= start:
raise ValueError("no JSON object found in response")
data = json.loads(content[start:end])
return ClassificationResult(**data)
except (json.JSONDecodeError, ValueError) as e:
# Fall back to neutral defaults; pydantic's ValidationError subclasses ValueError
return ClassificationResult(
sentiment=Sentiment.NEUTRAL,
category=Category.OTHER,
confidence=0.5,
reasoning=f"Failed to parse: {e}"
)
# Example usage
if __name__ == "__main__":
texts = [
"Apple's new M4 chip delivers unprecedented performance gains in AI workloads.",
"The team's devastating loss marks their fifth consecutive defeat this season.",
"Scientists discover high microbial activity under Antarctic ice sheet.",
]
for text in texts:
result = classify_text(text, include_reasoning=True)
print(f"Text: {text[:50]}...")
print(f" Sentiment: {result.sentiment.value}")
print(f" Category: {result.category.value}")
print(f" Confidence: {result.confidence:.2f}")
print(f" Reasoning: {result.reasoning}")
print()
Multi-Label Classification
Handle texts that belong to multiple categories.
# multi_label_classifier.py
from typing import List
from pydantic import BaseModel, Field
import ollama
class MultiLabelResult(BaseModel):
"""Multi-label classification result."""
labels: List[str]
scores: dict[str, float]
primary_label: str
TOPIC_LABELS = [
"artificial_intelligence",
"machine_learning",
"data_science",
"software_engineering",
"cloud_computing",
"cybersecurity",
"web_development",
"mobile_development",
"devops",
"blockchain"
]
def multi_label_classify(
text: str,
labels: List[str] = TOPIC_LABELS,
threshold: float = 0.5,
model: str = "phi3:mini"
) -> MultiLabelResult:
"""Classify text with multiple labels."""
labels_str = ", ".join(labels)
prompt = f"""Analyze this text and assign relevance scores to each topic.
Text: {text}
Available topics: {labels_str}
For each topic, assign a score from 0.0 to 1.0 based on relevance.
Only include topics with score > 0.3.
Respond with JSON:
{{
"scores": {{"topic_name": score, ...}},
"primary_label": "most relevant topic"
}}
JSON:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1}
)
import json
content = response["message"]["content"]
try:
start = content.find("{")
end = content.rfind("}") + 1
data = json.loads(content[start:end])
scores = data.get("scores", {})
# Filter by threshold
filtered_labels = [l for l, s in scores.items() if s >= threshold]
return MultiLabelResult(
labels=filtered_labels,
scores=scores,
primary_label=data.get("primary_label", filtered_labels[0] if filtered_labels else "unknown")
)
except Exception as e:
return MultiLabelResult(
labels=[],
scores={},
primary_label="unknown"
)
# Example
if __name__ == "__main__":
text = """
We implemented a new CI/CD pipeline using GitHub Actions that automatically
deploys our machine learning models to Kubernetes. The pipeline includes
security scanning and runs our test suite before deployment.
"""
result = multi_label_classify(text)
print(f"Labels: {result.labels}")
print(f"Primary: {result.primary_label}")
print(f"Scores: {result.scores}")
Part 2: Named Entity Recognition
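A caveat before the extractor below: SLMs return entity strings, not character offsets, and a single str.find maps every repeated mention back to the first occurrence. Aligning duplicates takes successive finds with a moving start index; this helper name is my own:

```python
from typing import List, Tuple

def align_mentions(text: str, mention: str) -> List[Tuple[int, int]]:
    """Return (start, end) spans for every occurrence of mention in text."""
    spans = []
    idx = text.find(mention)
    while idx != -1:
        spans.append((idx, idx + len(mention)))
        # Resume the search just past the previous hit
        idx = text.find(mention, idx + len(mention))
    return spans

text = "Apple sued Samsung. Apple won."
print(align_mentions(text, "Apple"))  # [(0, 5), (20, 25)]
```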
Custom NER with SLMs
Extract named entities without requiring specialized NER models.
# ner_extractor.py
from typing import List, Optional
from pydantic import BaseModel
import ollama
class Entity(BaseModel):
"""A named entity extracted from text."""
text: str
type: str
start_idx: Optional[int] = None
end_idx: Optional[int] = None
class NERResult(BaseModel):
"""NER extraction result."""
entities: List[Entity]
original_text: str
ENTITY_TYPES = [
"PERSON",
"ORGANIZATION",
"LOCATION",
"DATE",
"TIME",
"MONEY",
"PRODUCT",
"EVENT",
"TECHNOLOGY",
"EMAIL",
"PHONE"
]
def extract_entities(
text: str,
entity_types: List[str] = ENTITY_TYPES,
model: str = "qwen2.5:3b"
) -> NERResult:
"""Extract named entities from text."""
types_str = ", ".join(entity_types)
prompt = f"""Extract all named entities from the following text.
Text: {text}
Entity types to find: {types_str}
For each entity found, provide:
- text: the exact text of the entity
- type: the entity type from the list above
Respond with JSON:
{{
"entities": [
{{"text": "entity text", "type": "ENTITY_TYPE"}},
...
]
}}
Only include entities you are confident about. JSON:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1}
)
import json
content = response["message"]["content"]
try:
start = content.find("{")
end = content.rfind("}") + 1
data = json.loads(content[start:end])
entities = []
for ent in data.get("entities", []):
entity = Entity(
text=ent["text"],
type=ent["type"]
)
# Locate the entity in the source text (first occurrence only; repeated mentions share it)
idx = text.find(ent["text"])
if idx != -1:
entity.start_idx = idx
entity.end_idx = idx + len(ent["text"])
entities.append(entity)
return NERResult(entities=entities, original_text=text)
except Exception as e:
return NERResult(entities=[], original_text=text)
def highlight_entities(ner_result: NERResult) -> str:
"""Create highlighted text with entity annotations."""
from rich.console import Console
from rich.text import Text
text = Text(ner_result.original_text)
# Color map for entity types
colors = {
"PERSON": "cyan",
"ORGANIZATION": "green",
"LOCATION": "yellow",
"DATE": "magenta",
"MONEY": "red",
"PRODUCT": "blue",
"TECHNOLOGY": "bright_cyan",
}
# Sort by position (reverse) to avoid offset issues
sorted_entities = sorted(
[e for e in ner_result.entities if e.start_idx is not None],
key=lambda x: x.start_idx,
reverse=True
)
for entity in sorted_entities:
color = colors.get(entity.type, "white")
text.stylize(f"bold {color}", entity.start_idx, entity.end_idx)
console = Console()
console.print(text)
# Print legend
print("\nEntities found:")
for entity in ner_result.entities:
print(f" [{entity.type}] {entity.text}")
return str(text)
# Example usage
if __name__ == "__main__":
sample_text = """
On January 15, 2024, Anthropic announced Claude 3, their latest AI assistant.
The San Francisco-based company, founded by Dario Amodei, raised $750 million
in Series C funding. Microsoft and Google are also investing heavily in AI.
Contact press@anthropic.com for more information.
"""
result = extract_entities(sample_text.strip())
highlight_entities(result)
Domain-Specific NER
Customize entity extraction for specific domains.
# domain_ner.py
from typing import List, Dict
from pydantic import BaseModel
import ollama
class DomainEntity(BaseModel):
text: str
type: str
attributes: Dict[str, str] = {}
class DomainNERResult(BaseModel):
domain: str
entities: List[DomainEntity]
# Domain-specific entity configurations
DOMAIN_CONFIGS = {
"medical": {
"entity_types": ["SYMPTOM", "DISEASE", "MEDICATION", "DOSAGE", "BODY_PART", "PROCEDURE", "DOCTOR"],
"examples": [
{"text": "100mg", "type": "DOSAGE"},
{"text": "headache", "type": "SYMPTOM"},
{"text": "ibuprofen", "type": "MEDICATION"}
]
},
"legal": {
"entity_types": ["CASE_NUMBER", "COURT", "JUDGE", "PARTY", "STATUTE", "DATE", "JURISDICTION"],
"examples": [
{"text": "Case No. 2024-CV-123", "type": "CASE_NUMBER"},
{"text": "Supreme Court", "type": "COURT"}
]
},
"financial": {
"entity_types": ["COMPANY", "TICKER", "AMOUNT", "CURRENCY", "PERCENTAGE", "METRIC", "DATE"],
"examples": [
{"text": "AAPL", "type": "TICKER"},
{"text": "$1.2B", "type": "AMOUNT"}
]
},
"ecommerce": {
"entity_types": ["PRODUCT", "BRAND", "PRICE", "SKU", "CATEGORY", "FEATURE", "SIZE", "COLOR"],
"examples": [
{"text": "iPhone 15 Pro", "type": "PRODUCT"},
{"text": "Apple", "type": "BRAND"}
]
}
}
def domain_ner(
text: str,
domain: str,
model: str = "qwen2.5:3b"
) -> DomainNERResult:
"""Extract domain-specific entities."""
config = DOMAIN_CONFIGS.get(domain, {
"entity_types": ["ENTITY"],
"examples": []
})
types_str = ", ".join(config["entity_types"])
examples_str = "\n".join([
f' - "{ex["text"]}" -> {ex["type"]}'
for ex in config["examples"]
])
prompt = f"""You are a {domain} domain expert. Extract all relevant entities.
Text: {text}
Entity types for {domain} domain: {types_str}
Examples:
{examples_str}
Extract entities with any relevant attributes. Respond with JSON:
{{
"entities": [
{{"text": "entity", "type": "TYPE", "attributes": {{"key": "value"}}}},
...
]
}}
JSON:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1}
)
import json
content = response["message"]["content"]
try:
start = content.find("{")
end = content.rfind("}") + 1
data = json.loads(content[start:end])
entities = [
DomainEntity(
text=e["text"],
type=e["type"],
attributes=e.get("attributes", {})
)
for e in data.get("entities", [])
]
return DomainNERResult(domain=domain, entities=entities)
except Exception:
return DomainNERResult(domain=domain, entities=[])
# Example
if __name__ == "__main__":
medical_text = """
Patient presents with persistent headache and fever of 101.5F for 3 days.
Prescribed Tylenol 500mg every 6 hours. Follow up with Dr. Smith in 1 week
if symptoms persist. Consider CT scan if headache worsens.
"""
result = domain_ner(medical_text, "medical")
print(f"Domain: {result.domain}")
for entity in result.entities:
attrs = f" ({entity.attributes})" if entity.attributes else ""
print(f" [{entity.type}] {entity.text}{attrs}")
Part 3: Information Extraction
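The extractor below builds its prompt from the schema Pydantic emits, so it helps to see what model_json_schema() actually returns (shape described here per pydantic v2):

```python
from typing import Optional
from pydantic import BaseModel

class ContactInfo(BaseModel):
    name: Optional[str] = None
    email: Optional[str] = None

schema = ContactInfo.model_json_schema()
print(sorted(schema["properties"].keys()))  # ['email', 'name']

# Optional[str] fields are emitted as anyOf [string, null] rather than a plain
# "type" key, which is why the extractor falls back to "any" for such fields
for field, info in schema["properties"].items():
    print(field, info.get("type", "any"))
```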
Structured Data Extraction
Extract structured information from unstructured text using SLMs.
# info_extractor.py
from typing import List, Optional
from pydantic import BaseModel
import ollama
class ContactInfo(BaseModel):
"""Extracted contact information."""
name: Optional[str] = None
email: Optional[str] = None
phone: Optional[str] = None
company: Optional[str] = None
title: Optional[str] = None
address: Optional[str] = None
class EventInfo(BaseModel):
"""Extracted event information."""
name: str
date: Optional[str] = None
time: Optional[str] = None
location: Optional[str] = None
organizer: Optional[str] = None
description: Optional[str] = None
class ProductInfo(BaseModel):
"""Extracted product information."""
name: str
price: Optional[str] = None
features: List[str] = []
specifications: dict = {}
brand: Optional[str] = None
category: Optional[str] = None
def extract_structured_data(
text: str,
schema: type[BaseModel],
model: str = "phi3:mini"
) -> BaseModel:
"""Extract structured data based on a Pydantic schema."""
# Get schema JSON for prompt
schema_json = schema.model_json_schema()
# Create field descriptions
fields_desc = []
for field_name, field_info in schema_json.get("properties", {}).items():
field_type = field_info.get("type", "any")
description = field_info.get("description", "")
fields_desc.append(f"- {field_name} ({field_type}): {description}")
fields_str = "\n".join(fields_desc)
prompt = f"""Extract structured information from the text.
Text: {text}
Extract the following fields:
{fields_str}
Respond with a JSON object containing the extracted fields.
Use null for fields you cannot find. JSON:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1}
)
import json
content = response["message"]["content"]
try:
start = content.find("{")
end = content.rfind("}") + 1
data = json.loads(content[start:end])
return schema(**data)
except Exception:
# Fall back to an unvalidated empty instance; schema() would fail on required fields
return schema.model_construct()
# Specialized extractors
def extract_contact(text: str, model: str = "phi3:mini") -> ContactInfo:
"""Extract contact information from text."""
return extract_structured_data(text, ContactInfo, model)
def extract_event(text: str, model: str = "phi3:mini") -> EventInfo:
"""Extract event information from text."""
return extract_structured_data(text, EventInfo, model)
def extract_product(text: str, model: str = "phi3:mini") -> ProductInfo:
"""Extract product information from text."""
return extract_structured_data(text, ProductInfo, model)
# Example
if __name__ == "__main__":
# Test contact extraction
contact_text = """
Hi, I'm Sarah Johnson, the VP of Engineering at TechCorp Inc.
You can reach me at sarah.johnson@techcorp.com or call
(555) 123-4567. Our office is at 123 Innovation Drive,
San Francisco, CA 94105.
"""
contact = extract_contact(contact_text)
print("Contact Info:")
print(f" Name: {contact.name}")
print(f" Email: {contact.email}")
print(f" Phone: {contact.phone}")
print(f" Company: {contact.company}")
print(f" Title: {contact.title}")
print()
# Test event extraction
event_text = """
Join us for the Annual AI Conference 2024!
Date: March 15-17, 2024
Location: Moscone Center, San Francisco
Hosted by the AI Research Foundation
Three days of cutting-edge AI presentations and workshops.
"""
event = extract_event(event_text)
print("Event Info:")
print(f" Name: {event.name}")
print(f" Date: {event.date}")
print(f" Location: {event.location}")
print(f" Organizer: {event.organizer}")
Relationship Extraction
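The (subject, predicate, object) triples extracted below drop straight into a graph structure. A minimal adjacency-list sketch, independent of any model call:

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def build_adjacency(triples: List[Tuple[str, str, str]]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each subject to its outgoing (predicate, object) edges."""
    graph = defaultdict(list)
    for subj, pred, obj in triples:
        graph[subj].append((pred, obj))
    return dict(graph)

triples = [
    ("Elon Musk", "ceo_of", "Tesla"),
    ("Elon Musk", "ceo_of", "SpaceX"),
    ("Tesla", "located_in", "Austin"),
]
graph = build_adjacency(triples)
print(graph["Elon Musk"])  # [('ceo_of', 'Tesla'), ('ceo_of', 'SpaceX')]
```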
Extract relationships between entities.
# relationship_extractor.py
from typing import List, Tuple
from pydantic import BaseModel
import ollama
class Relationship(BaseModel):
"""A relationship between two entities."""
subject: str
predicate: str
object: str
confidence: float = 1.0
class RelationshipGraph(BaseModel):
"""Graph of extracted relationships."""
relationships: List[Relationship]
entities: List[str]
def to_triples(self) -> List[Tuple[str, str, str]]:
"""Convert to list of (subject, predicate, object) triples."""
return [(r.subject, r.predicate, r.object) for r in self.relationships]
def extract_relationships(
text: str,
relationship_types: List[str] = None,
model: str = "qwen2.5:3b"
) -> RelationshipGraph:
"""Extract entity relationships from text."""
types_hint = ""
if relationship_types:
types_hint = f"\nFocus on these relationship types: {', '.join(relationship_types)}"
prompt = f"""Extract relationships between entities in this text.
Text: {text}
{types_hint}
For each relationship, identify:
- subject: the entity performing/having the relationship
- predicate: the relationship type (e.g., "works_at", "founded", "located_in")
- object: the entity being related to
Respond with JSON:
{{
"relationships": [
{{"subject": "...", "predicate": "...", "object": "..."}},
...
]
}}
JSON:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1}
)
import json
content = response["message"]["content"]
try:
start = content.find("{")
end = content.rfind("}") + 1
data = json.loads(content[start:end])
relationships = [
Relationship(**r) for r in data.get("relationships", [])
]
# Extract unique entities
entities = set()
for r in relationships:
entities.add(r.subject)
entities.add(r.object)
return RelationshipGraph(
relationships=relationships,
entities=list(entities)
)
except Exception:
return RelationshipGraph(relationships=[], entities=[])
def visualize_graph(graph: RelationshipGraph) -> str:
"""Create a simple text visualization of the relationship graph."""
output = ["Relationship Graph:", "=" * 40]
for rel in graph.relationships:
output.append(f" {rel.subject} --[{rel.predicate}]--> {rel.object}")
output.append("")
output.append(f"Entities ({len(graph.entities)}): {', '.join(graph.entities)}")
return "\n".join(output)
# Example
if __name__ == "__main__":
text = """
Elon Musk is the CEO of Tesla and SpaceX. Tesla is headquartered in
Austin, Texas. SpaceX was founded in 2002 and operates from Hawthorne,
California. Musk also acquired Twitter in 2022 and renamed it to X.
"""
graph = extract_relationships(text)
print(visualize_graph(graph))
print("\nTriples:")
for triple in graph.to_triples():
print(f" {triple}")
Part 4: Text Summarization
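Two small mechanics in the summarizer below are easy to misread: str.lstrip strips any run of the listed characters, not a prefix string, and the compression ratio is just a word-count quotient. Both in miniature:

```python
# lstrip removes leading characters from the set, so "1) " falls away in one call.
# Caveat: a point that legitimately starts with a digit (e.g. "2024 saw...")
# would lose its leading digits too.
line = "1) Models are cheap to run."
clean = line.lstrip("•-*0123456789.) ").strip()
print(clean)  # Models are cheap to run.

# Compression ratio: summary words over original words
original = "Artificial intelligence has transformed many industries around the world recently."
summary = "AI transformed industries."
ratio = len(summary.split()) / len(original.split())
print(f"{ratio:.1%}")  # 30.0%
```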
Multi-Strategy Summarization
Implement different summarization strategies for various use cases.
# summarizer.py
from typing import List, Literal
from pydantic import BaseModel
from enum import Enum
import ollama
class SummaryType(str, Enum):
EXTRACTIVE = "extractive"
ABSTRACTIVE = "abstractive"
BULLET_POINTS = "bullet_points"
HEADLINE = "headline"
TL_DR = "tldr"
class Summary(BaseModel):
"""Summarization result."""
original_length: int
summary_length: int
compression_ratio: float
summary_type: SummaryType
content: str
key_points: List[str] = []
def summarize(
text: str,
summary_type: SummaryType = SummaryType.ABSTRACTIVE,
max_length: int = 150,
model: str = "phi3:mini"
) -> Summary:
"""Summarize text using specified strategy."""
strategy_prompts = {
SummaryType.EXTRACTIVE: f"""Extract the most important sentences from this text verbatim.
Select 2-3 key sentences that capture the main points.
Text: {text}
Important sentences:""",
SummaryType.ABSTRACTIVE: f"""Write a concise summary of this text in your own words.
Keep it under {max_length} words.
Text: {text}
Summary:""",
SummaryType.BULLET_POINTS: f"""Summarize this text as 3-5 bullet points.
Each point should be one clear, complete thought.
Text: {text}
Bullet points:""",
SummaryType.HEADLINE: f"""Write a single headline (under 15 words) that captures the main point.
Text: {text}
Headline:""",
SummaryType.TL_DR: f"""Write a TL;DR (Too Long; Didn't Read) summary in 1-2 sentences.
Text: {text}
TL;DR:"""
}
prompt = strategy_prompts[summary_type]
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.3}
)
summary_text = response["message"]["content"].strip()
# Extract key points for bullet point summaries
key_points = []
if summary_type == SummaryType.BULLET_POINTS:
for line in summary_text.split("\n"):
line = line.strip()
if line.startswith(("•", "-", "*", "1", "2", "3", "4", "5")):
# Remove bullet/number prefix
clean = line.lstrip("•-*0123456789.) ").strip()
if clean:
key_points.append(clean)
original_len = len(text.split())
summary_len = len(summary_text.split())
return Summary(
original_length=original_len,
summary_length=summary_len,
compression_ratio=summary_len / original_len if original_len > 0 else 0,
summary_type=summary_type,
content=summary_text,
key_points=key_points
)
def multi_summarize(text: str, model: str = "phi3:mini") -> dict:
"""Generate multiple summary types for the same text."""
return {
summary_type.value: summarize(text, summary_type, model=model)
for summary_type in SummaryType
}
# Example
if __name__ == "__main__":
article = """
Artificial intelligence has made remarkable strides in recent years,
transforming industries from healthcare to finance. Machine learning
models can now diagnose diseases, predict market trends, and even
generate creative content. However, these advances come with significant
challenges including bias in training data, lack of explainability,
and concerns about job displacement. Researchers are working on
developing more transparent and fair AI systems. Governments worldwide
are beginning to implement regulations to ensure AI is developed and
deployed responsibly. The next decade will be crucial in determining
how AI shapes our society and economy.
"""
print("=" * 60)
print("SUMMARIZATION EXAMPLES")
print("=" * 60)
for stype in SummaryType:
result = summarize(article, stype)
print(f"\n{stype.value.upper()}:")
print(f" {result.content}")
print(f" [Compression: {result.compression_ratio:.1%}]")
Part 5: Intent Detection and Slot Filling
Conversational Intent Detection
Build an intent classifier for conversational AI applications.
# intent_detector.py
from typing import List, Dict, Optional
from pydantic import BaseModel
import ollama
class Intent(BaseModel):
"""Detected user intent."""
name: str
confidence: float
slots: Dict[str, str] = {}
class IntentResult(BaseModel):
"""Intent detection result."""
primary_intent: Intent
alternative_intents: List[Intent] = []
raw_input: str
# Define intents and their slots
INTENT_SCHEMA = {
"book_flight": {
"description": "User wants to book a flight",
"slots": ["origin", "destination", "date", "passengers", "class"]
},
"check_weather": {
"description": "User wants weather information",
"slots": ["location", "date"]
},
"set_reminder": {
"description": "User wants to set a reminder",
"slots": ["task", "datetime", "recurrence"]
},
"play_music": {
"description": "User wants to play music",
"slots": ["song", "artist", "genre", "playlist"]
},
"order_food": {
"description": "User wants to order food",
"slots": ["item", "quantity", "restaurant", "delivery_address"]
},
"get_directions": {
"description": "User wants navigation help",
"slots": ["origin", "destination", "mode"]
},
"general_question": {
"description": "User has a general question",
"slots": ["topic"]
},
"chitchat": {
"description": "Casual conversation",
"slots": []
}
}
def detect_intent(
user_input: str,
intent_schema: dict = INTENT_SCHEMA,
model: str = "phi3:mini"
) -> IntentResult:
"""Detect intent and extract slots from user input."""
# Build intent descriptions for prompt
intent_desc = "\n".join([
f"- {name}: {info['description']} (slots: {', '.join(info['slots'])})"
for name, info in intent_schema.items()
])
prompt = f"""Analyze this user input and determine the intent.
User input: "{user_input}"
Available intents:
{intent_desc}
Respond with JSON:
{{
"intent": "intent_name",
"confidence": 0.0-1.0,
"slots": {{"slot_name": "extracted_value", ...}},
"alternatives": [{{"intent": "name", "confidence": 0.0-1.0}}]
}}
Extract slot values from the user input when present. JSON:"""
response = ollama.chat(
model=model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": 0.1}
)
import json
content = response["message"]["content"]
try:
start = content.find("{")
end = content.rfind("}") + 1
data = json.loads(content[start:end])
primary = Intent(
name=data.get("intent", "general_question"),
confidence=data.get("confidence", 0.5),
slots=data.get("slots", {})
)
alternatives = [
Intent(name=alt["intent"], confidence=alt["confidence"])
for alt in data.get("alternatives", [])
]
return IntentResult(
primary_intent=primary,
alternative_intents=alternatives,
raw_input=user_input
)
except Exception:
return IntentResult(
primary_intent=Intent(name="general_question", confidence=0.5),
raw_input=user_input
)
# Example
if __name__ == "__main__":
test_inputs = [
"Book me a flight from New York to London next Friday for 2 people",
"What's the weather like in Tokyo tomorrow?",
"Remind me to call mom at 5pm",
"Play some jazz music",
"How do I get to the airport from downtown?",
"Hey, how's it going?"
]
print("Intent Detection Results")
print("=" * 60)
for input_text in test_inputs:
result = detect_intent(input_text)
print(f"\nInput: {input_text}")
print(f"Intent: {result.primary_intent.name} ({result.primary_intent.confidence:.0%})")
if result.primary_intent.slots:
print(f"Slots: {result.primary_intent.slots}")
Part 6: Building a Complete NLP Pipeline
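One robustness pattern the pipeline below omits: retrying when the model returns unparseable JSON. A stubbed sketch of the pattern; the `flaky_model` stand-in (my own, not part of any library) fails once, then succeeds:

```python
import json
from typing import Callable

def call_with_retry(model_fn: Callable[[str], str], prompt: str, retries: int = 2) -> dict:
    """Call model_fn and parse its JSON reply, retrying on parse failure."""
    for attempt in range(retries + 1):
        content = model_fn(prompt)
        try:
            start = content.find("{")
            end = content.rfind("}") + 1
            return json.loads(content[start:end])
        except (json.JSONDecodeError, ValueError):
            continue  # try again; real code might also log or nudge the prompt
    return {}

attempts = []
def flaky_model(prompt: str) -> str:
    attempts.append(prompt)
    return "Sorry, I can't." if len(attempts) == 1 else '{"intent": "chitchat"}'

print(call_with_retry(flaky_model, "Detect intent: hi"))  # {'intent': 'chitchat'}
```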
Unified Pipeline
Combine all components into a reusable pipeline.
# nlp_pipeline.py
from typing import List, Dict, Any, Optional
from pydantic import BaseModel
from enum import Enum
import ollama
from dataclasses import dataclass
import time
class TaskType(str, Enum):
CLASSIFY = "classify"
NER = "ner"
EXTRACT = "extract"
SUMMARIZE = "summarize"
INTENT = "intent"
ALL = "all"
@dataclass
class PipelineConfig:
"""Pipeline configuration."""
model: str = "phi3:mini"
temperature: float = 0.1
tasks: Optional[List[TaskType]] = None
def __post_init__(self):
if self.tasks is None:
self.tasks = [TaskType.ALL]
class PipelineResult(BaseModel):
"""Complete pipeline result."""
text: str
processing_time_ms: float
results: Dict[str, Any]
class NLPPipeline:
"""Unified NLP pipeline using SLMs."""
def __init__(self, config: PipelineConfig = None):
self.config = config or PipelineConfig()
def _call_model(self, prompt: str) -> str:
"""Make a model call with consistent settings."""
response = ollama.chat(
model=self.config.model,
messages=[{"role": "user", "content": prompt}],
options={"temperature": self.config.temperature}
)
return response["message"]["content"]
def _parse_json(self, content: str) -> dict:
"""Parse JSON from model response."""
import json
try:
start = content.find("{")
end = content.rfind("}") + 1
if start != -1 and end > start:
return json.loads(content[start:end])
except ValueError: # includes json.JSONDecodeError
pass
return {}
def classify(self, text: str) -> dict:
"""Run classification."""
prompt = f"""Classify this text.
Text: {text}
JSON response with sentiment (positive/negative/neutral),
category, and confidence (0-1):"""
result = self._call_model(prompt)
return self._parse_json(result)
def extract_entities(self, text: str) -> dict:
"""Run NER."""
prompt = f"""Extract named entities (PERSON, ORG, LOCATION, DATE, etc).
Text: {text}
JSON with "entities" list containing text and type:"""
result = self._call_model(prompt)
return self._parse_json(result)
def extract_info(self, text: str) -> dict:
"""Run information extraction."""
prompt = f"""Extract key information as structured data.
Text: {text}
JSON with relevant fields (names, dates, amounts, etc):"""
result = self._call_model(prompt)
return self._parse_json(result)
def summarize(self, text: str) -> dict:
"""Run summarization."""
prompt = f"""Summarize in 2-3 sentences.
Text: {text}
JSON with "summary" and "key_points" list:"""
result = self._call_model(prompt)
return self._parse_json(result)
def detect_intent(self, text: str) -> dict:
"""Run intent detection."""
prompt = f"""Detect the intent and extract slots.
Text: {text}
JSON with "intent", "confidence", and "slots":"""
result = self._call_model(prompt)
return self._parse_json(result)
def process(self, text: str) -> PipelineResult:
"""Run the complete pipeline."""
start_time = time.time()
results = {}
tasks = self.config.tasks
if TaskType.ALL in tasks:
tasks = [TaskType.CLASSIFY, TaskType.NER, TaskType.EXTRACT,
TaskType.SUMMARIZE, TaskType.INTENT]
for task in tasks:
if task == TaskType.CLASSIFY:
results["classification"] = self.classify(text)
elif task == TaskType.NER:
results["entities"] = self.extract_entities(text)
elif task == TaskType.EXTRACT:
results["extraction"] = self.extract_info(text)
elif task == TaskType.SUMMARIZE:
results["summary"] = self.summarize(text)
elif task == TaskType.INTENT:
results["intent"] = self.detect_intent(text)
processing_time = (time.time() - start_time) * 1000
return PipelineResult(
text=text,
processing_time_ms=processing_time,
results=results
)
def batch_process(self, texts: List[str]) -> List[PipelineResult]:
"""Process multiple texts."""
return [self.process(text) for text in texts]
# Example usage
if __name__ == "__main__":
# Create pipeline
pipeline = NLPPipeline(PipelineConfig(
model="phi3:mini",
tasks=[TaskType.CLASSIFY, TaskType.NER, TaskType.SUMMARIZE]
))
text = """
Microsoft announced today that CEO Satya Nadella will present the company's
new AI strategy at the Build 2024 conference in Seattle. The presentation
will cover Azure AI services and the integration of GPT-4 across Office 365.
Analysts expect this to drive significant growth in cloud revenue.
"""
result = pipeline.process(text.strip())
print("NLP Pipeline Results")
print("=" * 60)
print(f"Processing time: {result.processing_time_ms:.0f}ms")
print()
for task_name, task_result in result.results.items():
print(f"{task_name.upper()}:")
for key, value in task_result.items():
print(f" {key}: {value}")
        print()

FastAPI Application
Expose the pipeline as a REST API.
# api.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import List, Optional
import uvicorn
from nlp_pipeline import NLPPipeline, PipelineConfig, TaskType
app = FastAPI(
title="SLM Text Tasks API",
description="NLP tasks powered by small language models",
version="1.0.0"
)
# Initialize pipeline
pipeline = NLPPipeline(PipelineConfig(model="phi3:mini"))
class TextRequest(BaseModel):
text: str
tasks: Optional[List[str]] = None
class ClassifyRequest(BaseModel):
text: str
include_reasoning: bool = False
class NERRequest(BaseModel):
text: str
entity_types: Optional[List[str]] = None
class ExtractRequest(BaseModel):
text: str
schema_type: str = "auto" # auto, contact, event, product
class SummarizeRequest(BaseModel):
text: str
style: str = "abstractive" # extractive, abstractive, bullet_points, headline, tldr
class IntentRequest(BaseModel):
text: str
class BatchRequest(BaseModel):
texts: List[str]
tasks: Optional[List[str]] = None
@app.get("/health")
async def health_check():
"""Health check endpoint."""
return {"status": "healthy", "model": pipeline.config.model}
@app.post("/analyze")
async def analyze_text(request: TextRequest):
"""Run full NLP analysis on text."""
try:
if request.tasks:
task_types = [TaskType(t) for t in request.tasks]
pipeline.config.tasks = task_types
else:
pipeline.config.tasks = [TaskType.ALL]
result = pipeline.process(request.text)
return result.model_dump()
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))
@app.post("/classify")
async def classify_text(request: ClassifyRequest):
"""Classify text sentiment and category."""
result = pipeline.classify(request.text)
return result
@app.post("/ner")
async def extract_entities(request: NERRequest):
"""Extract named entities from text."""
result = pipeline.extract_entities(request.text)
return result
@app.post("/extract")
async def extract_info(request: ExtractRequest):
"""Extract structured information from text."""
result = pipeline.extract_info(request.text)
return result
@app.post("/summarize")
async def summarize_text(request: SummarizeRequest):
"""Summarize text."""
result = pipeline.summarize(request.text)
return result
@app.post("/intent")
async def detect_intent(request: IntentRequest):
"""Detect user intent."""
result = pipeline.detect_intent(request.text)
return result
@app.post("/batch")
async def batch_analyze(request: BatchRequest):
"""Analyze multiple texts."""
    try:
        if request.tasks:
            pipeline.config.tasks = [TaskType(t) for t in request.tasks]
        else:
            # Reset to ALL so a previous request's task list does not leak in
            pipeline.config.tasks = [TaskType.ALL]
results = pipeline.batch_process(request.texts)
return [r.model_dump() for r in results]
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))
if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

Test the API
# Start the server
python api.py
# Test endpoints
curl -X POST http://localhost:8000/classify \
-H "Content-Type: application/json" \
-d '{"text": "The product quality exceeded my expectations!"}'
curl -X POST http://localhost:8000/ner \
-H "Content-Type: application/json" \
-d '{"text": "Tim Cook announced new Apple products at WWDC in San Jose."}'
curl -X POST http://localhost:8000/analyze \
-H "Content-Type: application/json" \
  -d '{"text": "Book a flight to Paris for next Monday", "tasks": ["intent", "ner"]}'

Model Selection Guide
| Task | Recommended Model | Why |
|---|---|---|
| Classification | phi3:mini | Fast, good at structured output |
| NER | qwen2.5:3b | Better entity boundary detection |
| Extraction | phi3:mini | Strong instruction following |
| Summarization | gemma2:2b | Natural language generation |
| Intent | phi3:mini | Good at classification + slots |
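The table above can be written down as a small routing map. The single-model `PipelineConfig` used in this tutorial takes one model, so a per-task mapping like this is a hypothetical extension, not part of the pipeline's API:

```python
# Per-task model routing, mirroring the recommendation table above.
# The mapping and pick_model helper are illustrative additions.
TASK_MODELS = {
    "classify": "phi3:mini",
    "ner": "qwen2.5:3b",
    "extract": "phi3:mini",
    "summarize": "gemma2:2b",
    "intent": "phi3:mini",
}

def pick_model(task: str, default: str = "phi3:mini") -> str:
    """Return the recommended model for a task, falling back to a default."""
    return TASK_MODELS.get(task, default)
```

One pipeline instance per model (keyed by `pick_model(task)`) keeps each task on its best-suited backend without changing the per-task methods.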
Prompt Engineering Tips
Be Specific About Format
# Good - explicit format
prompt = """
Respond with ONLY a JSON object:
{"sentiment": "positive/negative/neutral", "confidence": 0.0-1.0}
"""
# Bad - ambiguous format
prompt = "What is the sentiment? Return JSON."

Use Few-Shot Examples
# Few-shot for consistent output
prompt = """
Extract entities. Examples:
"Apple released iPhone" -> [{"text": "Apple", "type": "ORG"}, {"text": "iPhone", "type": "PRODUCT"}]
"Tim Cook visited Berlin" -> [{"text": "Tim Cook", "type": "PERSON"}, {"text": "Berlin", "type": "LOCATION"}]
Now extract from: {text}
"""

Handle Edge Cases
def safe_extract(text: str) -> dict:
    """Extract with fallback handling."""
    if not text or len(text.strip()) < 5:
        return {"error": "Text too short"}
    if len(text) > 10000:
        # Truncate long text (or split into chunks and merge the results)
        text = text[:10000]
    # ... extraction logic

Performance Optimization
Batch Processing
import asyncio

async def batch_classify(texts: List[str], batch_size: int = 5):
    """Process texts in batches for better throughput."""
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        # Run the batch concurrently; classify_async is assumed to wrap
        # a non-blocking call to the model backend
        batch_results = await asyncio.gather(
            *[classify_async(text) for text in batch]
        )
        results.extend(batch_results)
    return results

Caching
from functools import lru_cache

@lru_cache(maxsize=1000)
def classify_with_cache(text: str) -> dict:
    """Cache classification results.

    lru_cache already keys on the text argument, so hashing the
    text separately adds nothing; cache on the text directly.
    """
    return pipeline.classify(text)

Exercises
- Custom Entity Types: Add support for extracting custom entity types specific to your domain (e.g., programming languages, frameworks, APIs)
- Confidence Calibration: Implement a calibration layer that adjusts model confidence scores based on validation data
- Multi-Language Support: Extend the pipeline to handle multiple languages with automatic language detection
- Active Learning: Build a feedback loop where low-confidence predictions are flagged for human review
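For the active-learning exercise, a starting-point sketch is a triage step between prediction and response. The 0.7 threshold and the in-memory queue are assumptions; a real system would persist flagged items and feed corrections back into evaluation or fine-tuning:

```python
# Sketch only: route low-confidence predictions to a human review queue.
# REVIEW_THRESHOLD and the in-memory list are illustrative choices.
REVIEW_THRESHOLD = 0.7
review_queue: list = []

def triage(text: str, prediction: dict) -> dict:
    """Accept confident predictions; flag the rest for human review."""
    confidence = prediction.get("confidence", 0.0)
    if confidence < REVIEW_THRESHOLD:
        review_queue.append({"text": text, "prediction": prediction})
        prediction = {**prediction, "needs_review": True}
    return prediction
```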
Key Concepts Recap
| Concept | What It Is | Why It Matters |
|---|---|---|
| Text Classification | Assign labels (sentiment, category) to text | Foundation for routing, filtering, analysis |
| NER | Named Entity Recognition - extract people, orgs, dates | Structured data from unstructured text |
| Information Extraction | Pull specific fields (email, phone, price) | Automate data entry from documents |
| Structured Output | Force JSON responses with Pydantic schemas | Reliable parsing, type safety |
| Low Temperature | temperature=0.1 for deterministic output | Consistent, reproducible results |
| JSON Extraction | Use regex to find {...} in response | Robust parsing when model adds text |
| Intent Detection | Classify user intent + extract slots | Build conversational AI systems |
| Relationship Extraction | Find (subject, predicate, object) triples | Build knowledge graphs from text |
| Few-Shot Prompting | Include examples in prompt | Guide model to exact output format |
| Batch Processing | Process multiple texts together | Higher throughput, better efficiency |
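The JSON-extraction pattern from the recap can be sketched in a few lines: grab the outermost `{...}` span with a regex, then let `json.loads` validate it. The function name is illustrative:

```python
import json
import re
from typing import Optional

def extract_json(response: str) -> Optional[dict]:
    """Pull the first {...} block out of a model response.

    SLMs often wrap JSON in prose ("Sure! Here is the result: {...}").
    A greedy regex over the outermost braces recovers the object;
    json.loads validates it. Returns None if nothing parseable is found.
    """
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```

The greedy `.*` with `re.DOTALL` spans from the first `{` to the last `}`, so nested objects and multi-line responses survive intact.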
Next Steps
- SLM Evaluation & Benchmarking - Measure and compare model performance
- SLM Fine-tuning - Customize models for your specific tasks
- SLM-Powered RAG - Combine SLMs with retrieval systems