FindPy v0.1
// ABOUT

FindPy is the analyst's assistant.

A swarm of specialized AI agents that autonomously plan and conduct OSINT investigations, with cryptographic provenance on every claim. Sovereign by default. Built for the Indian Air Force.

THE PROBLEM

Open-source intelligence has outgrown the analyst. A single geopolitical event spawns thousands of articles, hundreds of social posts in a dozen languages, and floods of imagery — much of it now AI-generated or part of coordinated influence operations. Traditional tools scrape and dashboard. They don't reason, they don't verify, and they leave the analyst alone with raw data and a deadline.

THE THESIS

FindPy decomposes an analyst's plain-language question into a directed graph of tasks dispatched to 11 specialized agents, including web crawlers, social listeners, satellite STAC search, image forensics, a deepfake jury, a geolocator, source-credibility scoring, narrative genealogy, and coordinated-behavior detection. The agents collaborate via a shared property graph, and the planner can re-plan when partial results land.
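To make the "directed graph of tasks" idea concrete, here is a minimal sketch of dependency-ordered dispatch using Python's standard `graphlib`. The task names, the `plan` dictionary, and `run_plan` are illustrative assumptions, not FindPy's actual planner API, and re-planning on partial results is not shown.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: task name -> set of tasks it depends on.
plan = {
    "web_crawl": set(),
    "social_listen": set(),
    "stac_search": set(),
    "image_forensics": {"web_crawl", "social_listen"},
    "credibility": {"web_crawl", "social_listen"},
    "synthesize": {"image_forensics", "credibility", "stac_search"},
}

def run_plan(plan, run_task):
    """Dispatch tasks in dependency order; return the waves that were executed."""
    sorter = TopologicalSorter(plan)
    sorter.prepare()  # raises graphlib.CycleError if the plan is not a DAG
    waves = []
    while sorter.is_active():
        # Every task whose dependencies have all been marked done.
        ready = sorted(sorter.get_ready())
        for task in ready:
            run_task(task)      # a real system would dispatch these concurrently
            sorter.done(task)   # unblocks downstream tasks for the next wave
        waves.append(ready)
    return waves

executed = []
waves = run_plan(plan, executed.append)
```

Each wave contains tasks that could run in parallel; a planner that re-plans would rebuild the remaining graph between waves instead of fixing it up front.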

Every source is content-hashed and HMAC-signed at ingest. Every claim cites its sources. A single endpoint re-verifies every envelope and re-hashes every artifact byte — a defensible chain of custody before action.
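As a rough illustration of hash-then-sign at ingest and re-verification at audit, the sketch below uses only the standard library. The envelope fields (`url`, `sha256`), the function names, and the hard-coded key are assumptions made for this example; FindPy's real envelope schema and key handling are not described here.

```python
import hashlib
import hmac
import json

# Assumption: a demo key; real key management is out of scope for this sketch.
SECRET_KEY = b"demo-key-not-for-production"

def canonical(envelope: dict) -> bytes:
    """Serialize deterministically so the same envelope always signs the same bytes."""
    return json.dumps(envelope, sort_keys=True, separators=(",", ":")).encode("utf-8")

def sign_source(url: str, content: bytes, key: bytes = SECRET_KEY) -> dict:
    """Hash the artifact bytes, then HMAC-sign the canonical envelope."""
    envelope = {"url": url, "sha256": hashlib.sha256(content).hexdigest()}
    signature = hmac.new(key, canonical(envelope), hashlib.sha256).hexdigest()
    return {"envelope": envelope, "signature": signature}

def verify_source(record: dict, content: bytes, key: bytes = SECRET_KEY) -> bool:
    """Re-check the signature over the envelope and re-hash the artifact bytes."""
    expected = hmac.new(key, canonical(record["envelope"]), hashlib.sha256).hexdigest()
    sig_ok = hmac.compare_digest(expected, record["signature"])
    actual_hash = hashlib.sha256(content).hexdigest()
    hash_ok = hmac.compare_digest(actual_hash, record["envelope"]["sha256"])
    return sig_ok and hash_ok

record = sign_source("https://example.org/report", b"original bytes")
```

With this shape, `verify_source(record, b"original bytes")` succeeds, while a changed byte in the artifact or an edited envelope field makes it fail.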

WHAT MAKES IT DIFFERENT

  • Agentic, not pipeline. Most OSINT tools are linear scrape→clean→dashboard pipelines. FindPy dispatches agents that read and write a shared evidence graph, and the planner can re-plan after seeing partial results.
  • Verifiable evidence. Every Source carries a content hash + HMAC signature over a canonical envelope. The audit endpoint re-verifies all signatures and re-hashes all bytes — defensible up the chain of command.
  • Real forensic jury. Five real algorithms vote, each with an explainable rationale: ELA, JPEG-ghost recompression, NOAA sun-position shadow physics, GAN-fingerprint heuristic, and amplification-pattern analysis (the OSINT-grade signal that catches influence ops even when pixels look clean).
  • Sovereign by default. Hot-swappable LLM layer (Ollama / vLLM / mock). Air-gap mode switch disables all outbound network. No hosted API in the demo path.
  • IAF-vertical. Aerial-domain gazetteer, Sentinel-2 STAC change-detection wired against Copernicus, demo scenarios built around airbase change-detection rather than celebrity news.
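One way a jury of detectors could combine into a single explainable verdict is a confidence-weighted vote, sketched below. The `Vote` structure, the weighting rule, the 0.5 threshold, and the sample numbers are all assumptions for illustration; the actual aggregation FindPy uses is not specified here.

```python
from dataclasses import dataclass

@dataclass
class Vote:
    juror: str          # which detector cast the vote
    manipulated: bool   # the detector's verdict
    confidence: float   # 0.0 to 1.0
    rationale: str      # human-readable reason

def tally(votes: list[Vote]) -> dict:
    """Confidence-weighted majority; keeps the rationale of the winning side."""
    weight_for = sum(v.confidence for v in votes if v.manipulated)
    weight_against = sum(v.confidence for v in votes if not v.manipulated)
    total = weight_for + weight_against
    score = weight_for / total if total else 0.0
    flagged = score > 0.5
    return {
        "verdict": "likely manipulated" if flagged else "no manipulation found",
        "score": round(score, 3),
        "rationale": [f"{v.juror}: {v.rationale}" for v in votes if v.manipulated == flagged],
    }

# Made-up sample votes, one per juror named in the list above.
votes = [
    Vote("ela", True, 0.8, "error levels differ around the tail section"),
    Vote("jpeg_ghost", True, 0.6, "region recompressed at a different quality"),
    Vote("shadow_physics", False, 0.4, "shadow azimuth consistent with claimed time"),
    Vote("gan_fingerprint", False, 0.3, "no periodic spectral artifacts"),
    Vote("amplification", True, 0.9, "burst of near-identical reposts"),
]
result = tally(votes)
```

Returning the winning side's rationale alongside the score is what keeps the verdict explainable rather than a bare number.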

HOW IT WORKS

  1. Analyst types a question in plain English.
  2. Planner LLM decomposes it into a DAG of tasks.
  3. Agents fan out, ingest sources, sign every artifact at ingest.
  4. Image agents extract pHash + EXIF; sat-imagery agent calls STAC.
  5. Credibility scorer rates every source on four factors.
  6. Deepfake jury votes on every image; CIB (coordinated inauthentic behavior) agent looks for clusters.
  7. Synthesizer writes a brief — every claim ends with an evidence ID.
  8. Audit endpoint can re-verify every signature on demand.
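Step 7 implies a checkable property: every claim in the brief ends with an evidence ID that exists in the graph. A minimal sketch of such a check follows, assuming one claim per line and a hypothetical ID format like `[E-0001]`; neither assumption comes from FindPy itself.

```python
import re

# Hypothetical evidence-ID format: a trailing tag such as [E-0001].
EVIDENCE_TAG = re.compile(r"\[(E-\d+)\]\s*$")

def uncited_claims(brief: str, known_ids: set[str]) -> list[str]:
    """Return claims that lack a trailing evidence ID or cite an unknown one."""
    problems = []
    for line in brief.splitlines():
        claim = line.strip()
        if not claim:
            continue  # skip blank lines between claims
        match = EVIDENCE_TAG.search(claim)
        if match is None or match.group(1) not in known_ids:
            problems.append(claim)
    return problems

brief = """
New hardened shelters appear on the eastern apron. [E-0001]
Runway resurfacing began in March.
Fuel storage capacity doubled. [E-0099]
"""
problems = uncited_claims(brief, known_ids={"E-0001", "E-0002"})
```

Here the second claim is flagged for having no citation and the third for citing an ID that is not in the evidence set.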

WHO IT IS FOR

  • Defence intelligence. ATO-friendly architecture, air-gap mode, signed evidence, IAF aerial-domain vertical. Designed as an input feeder to a broader intelligence architecture.
  • Counter-disinformation. CIB clustering, narrative genealogy, and amplification-pattern detection make influence-op tradecraft visible at the account-network level.
  • Investigative journalism. Provenance-first design, image forensics with explainable verdicts, exportable brief with citations.
  • Security research. Open-source, plug-in agent contract, evidence-graph schema that maps cleanly to Neo4j.

THE STACK

  • LLM ............. Ollama / vLLM / OpenAI-compatible / mock
  • Reasoning ....... Qwen2.5-72B (production), qwen2.5:7b (dev)
  • Embeddings ...... hashing-trick (dev) → BGE-M3 (production)
  • Evidence graph .. SQLite (dev) → Neo4j (production)
  • Backend ......... FastAPI + WebSocket pub/sub
  • Frontend ........ Next.js 14 + Tailwind + React Flow + Leaflet
  • Sat imagery ..... Sentinel-2 via Copernicus Earth Search STAC
  • Forensics ....... pure PIL + numpy (no model weights required)
  • Signing ......... HMAC-SHA256 (dev) → Ed25519 (production)
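For readers unfamiliar with the "hashing-trick" embeddings listed for dev mode, the sketch below shows the general technique: each token is hashed to a bucket and a sign, so no vocabulary or model weights are needed. The function names, whitespace tokenization, SHA-256 as the hash, and the 256-dimension default are assumptions; FindPy's dev embedder may differ.

```python
import hashlib
import math

def hash_embed(text: str, dim: int = 256) -> list[float]:
    """Signed feature hashing over whitespace tokens, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim   # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0        # sign reduces collision bias
        vec[index] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec       # empty text stays all zeros

def cosine(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

v = hash_embed("airbase")
```

The result is deterministic and order-insensitive, which is enough for dev-time similarity checks, but it carries no semantic knowledge, which is the gap a learned model such as BGE-M3 fills in production.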

STATUS

v0.1 — prototype. 11 agents working end-to-end. Real forensic jury. Real Sentinel-2 STAC discovery. Real evidence-graph audit. 23/23 tests passing. Frontend live, backend on Fly.io Mumbai region. Designed for an ADITI 4.0 / iDEX submission to the Indian Air Force's Problem Statement 18.