OpenSift Architecture

Pipeline Overview

OpenSift follows a four-stage pipeline:

graph TD
    A[User Query] --> B[Query Planner]
    B -->|search queries + criteria| C[Search Adapters]
    C -->|raw results| D[Evidence Verifier]
    D -->|assessments| E[Result Classifier]
    E --> F[Structured Response]

    W[WisModel LLM] -.->|powers| B
    W -.->|powers| D

Stage 1: Query Planner

Takes a natural-language question and uses the LLM to generate:

  • Search queries — 2–4 precise keyword phrases for the search backend
  • Screening criteria — 1–4 quantified rules, each with type, description, and weight
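The planner's output can be pictured as a small data structure. The sketch below is illustrative only: the class names `Criterion` and `QueryPlan`, the `validate` helper, and the example type labels (`"topic"`, `"time"`) are assumptions, not the actual models in `src/opensift/models/`. It uses plain dataclasses to stay dependency-free.

```python
from dataclasses import dataclass, field


@dataclass
class Criterion:
    # The three attributes named above; the allowed values are assumptions.
    type: str          # hypothetical labels such as "topic" or "time"
    description: str
    weight: float


@dataclass
class QueryPlan:
    search_queries: list[str] = field(default_factory=list)
    criteria: list[Criterion] = field(default_factory=list)

    def validate(self) -> None:
        """Check the counts stated above: 2-4 queries and 1-4 criteria."""
        if not 2 <= len(self.search_queries) <= 4:
            raise ValueError("expected 2-4 search queries")
        if not 1 <= len(self.criteria) <= 4:
            raise ValueError("expected 1-4 screening criteria")


plan = QueryPlan(
    search_queries=["perovskite solar cell efficiency", "perovskite photovoltaics stability"],
    criteria=[
        Criterion(type="topic", description="Reports perovskite cell efficiency", weight=0.7),
        Criterion(type="time", description="Published in 2023 or later", weight=0.3),
    ],
)
plan.validate()  # raises ValueError if the counts fall outside the stated ranges
```

Validating the counts right after generation is one way to catch a malformed LLM response before any search backend is called.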

Stage 2: Search Adapters

Dispatches the generated queries to one or more search backends via the adapter pattern. Results are normalized to a standard schema.
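A minimal sketch of the adapter pattern described above, assuming a hypothetical `SearchAdapter` base class and a hypothetical normalized `SearchResult` schema. The real interface in `adapters/base/` may differ, and the in-memory backend exists only to make the example runnable.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SearchResult:
    # Hypothetical normalized schema; the real field set is not specified here.
    title: str
    content: str
    source: str


class SearchAdapter(ABC):
    """Common interface every backend adapter implements (assumed shape)."""

    @abstractmethod
    def search(self, query: str) -> list[SearchResult]:
        ...


class InMemoryAdapter(SearchAdapter):
    """Toy backend for illustration: substring match over a list of dicts."""

    def __init__(self, name: str, docs: list[dict]):
        self.name = name
        self.docs = docs

    def search(self, query: str) -> list[SearchResult]:
        q = query.lower()
        # Map the backend's own field names ("body") onto the shared schema.
        return [
            SearchResult(title=d["title"], content=d["body"], source=self.name)
            for d in self.docs
            if q in d["body"].lower()
        ]


def dispatch(queries: list[str], adapters: list[SearchAdapter]) -> list[SearchResult]:
    """Send every query to every adapter and collect the normalized results."""
    results: list[SearchResult] = []
    for adapter in adapters:
        for query in queries:
            results.extend(adapter.search(query))
    return results


adapters = [InMemoryAdapter("demo", [{"title": "A", "body": "Perovskite cells"}])]
hits = dispatch(["perovskite", "graphene"], adapters)
```

Because each adapter returns the same `SearchResult` type, the later stages never need to know which backend produced a result.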

Stage 3: Evidence Verifier

Verifies each search result against each criterion using the LLM:

| Assessment | Meaning |
| --- | --- |
| Support | Criterion clearly met, with cited evidence |
| Somewhat Support | Partially relevant but not fully met |
| Reject | Clearly does not meet the criterion |
| Insufficient Information | Not enough info to judge |
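The result-by-criterion structure of this stage can be sketched as a nested loop. Everything here is illustrative: the `Assessment` enum, the `verify` signature, and `keyword_judge` (a trivial stand-in for the LLM call) are assumptions, not the project's actual verifier.

```python
from enum import Enum
from typing import Callable


class Assessment(Enum):
    SUPPORT = "support"
    SOMEWHAT_SUPPORT = "somewhat_support"
    REJECT = "reject"
    INSUFFICIENT_INFORMATION = "insufficient_information"


def verify(
    results: list[str],
    criteria: list[str],
    judge: Callable[[str, str], Assessment],
) -> list[dict[str, Assessment]]:
    """Return one {criterion: assessment} mapping per search result."""
    return [{c: judge(r, c) for c in criteria} for r in results]


def keyword_judge(result: str, criterion: str) -> Assessment:
    # Stand-in for the LLM: an empty result cannot be judged,
    # otherwise a substring match decides between Support and Reject.
    if not result.strip():
        return Assessment.INSUFFICIENT_INFORMATION
    if criterion.lower() in result.lower():
        return Assessment.SUPPORT
    return Assessment.REJECT


assessments = verify(["Perovskite solar cells", ""], ["perovskite"], keyword_judge)
```

With N results and M criteria this makes N × M judgments, which is worth keeping in mind when each judgment is an LLM call.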

Stage 4: Result Classifier

Classifies each result automatically based on its verification assessments:

| Classification | Rule |
| --- | --- |
| Perfect | All criteria are Support |
| Partial | At least one non-time criterion is Support or Somewhat Support |
| Reject | All criteria are Reject, or only time criteria pass |
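The rules above can be read as an ordered check: Perfect first, then Partial, otherwise Reject. The sketch below is one possible reading, not the code in `classifier.py`. It assumes each criterion carries a type string with `"time"` marking time criteria, and it sends cases the rules do not mention (for example, every criterion being Insufficient Information, or an empty list) to Reject.

```python
from enum import Enum


class Assessment(Enum):
    SUPPORT = "support"
    SOMEWHAT_SUPPORT = "somewhat_support"
    REJECT = "reject"
    INSUFFICIENT_INFORMATION = "insufficient_information"


# Assessments that count as "passing" for the Partial rule.
PASSING = {Assessment.SUPPORT, Assessment.SOMEWHAT_SUPPORT}


def classify(assessments: list[tuple[str, Assessment]]) -> str:
    """Classify one result from its (criterion_type, assessment) pairs."""
    # Perfect: every criterion is Support.
    if assessments and all(a is Assessment.SUPPORT for _, a in assessments):
        return "perfect"
    # Partial: at least one non-time criterion is Support or Somewhat Support.
    if any(t != "time" and a in PASSING for t, a in assessments):
        return "partial"
    # Reject: everything else, including "only time criteria pass".
    # Assumption: uncovered cases also fall through to here.
    return "reject"


label = classify([("topic", Assessment.SOMEWHAT_SUPPORT), ("time", Assessment.REJECT)])
```

Checking Perfect before Partial matters, because a result whose criteria are all Support also satisfies the Partial rule.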

Project Structure

opensift/
├── src/opensift/
│   ├── core/                     # Core AI pipeline
│   │   ├── engine.py             # Orchestrator (Plan → Search → Verify → Classify)
│   │   ├── planner/planner.py    # Query planning
│   │   ├── verifier/verifier.py  # Result verification
│   │   ├── classifier.py         # Classification
│   │   └── llm/                  # LLM client + prompt templates
│   ├── adapters/                 # Search backend adapters (pluggable)
│   │   ├── base/                 # Abstract interface
│   │   ├── atomwalker/           # AtomWalker academic search
│   │   ├── elasticsearch/        # Elasticsearch
│   │   ├── opensearch/           # OpenSearch
│   │   ├── solr/                 # Apache Solr
│   │   ├── meilisearch/          # MeiliSearch
│   │   └── wikipedia/            # Wikipedia
│   ├── models/                   # Data models (Pydantic)
│   ├── client/                   # Python SDK
│   ├── api/                      # REST API (FastAPI)
│   ├── config/                   # Config management
│   └── observability/            # Logging
├── tests/
│   ├── unit/                     # Unit tests (mocked)
│   └── integration/              # Integration tests (Docker)
├── deployments/docker/           # Docker Compose files
├── docs/                         # Documentation (this site)
└── pyproject.toml

Data Flow

sequenceDiagram
    participant U as User
    participant API as REST API
    participant E as Engine
    participant P as Planner
    participant S as Search Adapter
    participant V as Verifier
    participant C as Classifier

    U->>API: POST /v1/search
    API->>E: execute(query, options)
    E->>P: plan(query)
    P-->>E: search_queries + criteria
    E->>S: search(queries)
    S-->>E: raw results
    E->>V: verify(results, criteria)
    V-->>E: assessments
    E->>C: classify(assessments)
    C-->>E: perfect / partial / reject
    E-->>API: structured response
    API-->>U: JSON response
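The call sequence in the diagram can be sketched as an orchestrator that wires four pluggable stages together. This is a simplified illustration: the `Engine` constructor, the callable signatures, and the response shape are assumptions, and the real `engine.py` also accepts `options`, which this sketch omits. The lambdas are stubs standing in for the LLM-backed planner and verifier and for a real search backend.

```python
from typing import Any, Callable


class Engine:
    """Runs Plan -> Search -> Verify -> Classify in order (assumed shape)."""

    def __init__(
        self,
        plan: Callable[[str], tuple[list[str], list[str]]],
        search: Callable[[list[str]], list[str]],
        verify: Callable[[list[str], list[str]], list[dict[str, str]]],
        classify: Callable[[dict[str, str]], str],
    ):
        self.plan = plan
        self.search = search
        self.verify = verify
        self.classify = classify

    def execute(self, query: str) -> dict[str, Any]:
        search_queries, criteria = self.plan(query)   # Stage 1
        results = self.search(search_queries)         # Stage 2
        assessments = self.verify(results, criteria)  # Stage 3
        # Stage 4: one classification per result (hypothetical response shape).
        return {
            "query": query,
            "criteria": criteria,
            "results": [
                {"result": r, "assessments": a, "classification": self.classify(a)}
                for r, a in zip(results, assessments)
            ],
        }


engine = Engine(
    plan=lambda q: ([q, f"{q} survey"], ["on topic"]),
    search=lambda queries: [f"doc for {q}" for q in queries],
    verify=lambda results, criteria: [{c: "support" for c in criteria} for _ in results],
    classify=lambda a: "perfect" if all(v == "support" for v in a.values()) else "reject",
)
response = engine.execute("graph databases")
```

Passing the stages in as callables keeps the orchestrator testable with stubs like these, without any LLM or search backend running.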