Tools & Resources
Everything you need to monitor, investigate, and take action for peace, justice, and non-violent social change.
OSINT & Investigation Tools
Tools and methods to track conflicts, misinformation, and escalation risk.
Event data on conflicts & civilian harm
Why: Identifies where violence and civilian impact occur
Example: Use ACLED to find incident clusters by date and region
Global news & narrative tracking
Why: Detects shifts in media narratives that may precede escalation
Example: Use GDELT to spot narrative changes before an escalation
Geolocation & verification workflows
Why: Provides techniques to verify imagery and media
Example: Follow Bellingcat workflows to geolocate a photo
Automated public footprint gatherer
Modular reconnaissance framework
Why: Provides repeatable module-based OSINT searches
Example: Run Recon-ng modules to collect data on company officers and corporate filings
Collects public emails, subdomains and hosts
Why: Helps map a company’s public internet footprint
Example: Enumerate hosts and public email addresses for a defense contractor
Real-time dashboard: global conflicts, country instability, displacement & live intel feeds
Why: Aggregates ACLED, UNHCR, NASA FIRMS and live intelligence streams into one geopolitical interface
Example: Track live military escalation, humanitarian crises and regional risk scores across 180+ countries
Visual relationship & network mapping
Why: Makes complex relationships between people and organizations easy to visualise
Example: Create a graph linking a lobbying firm, contractors and donors
Search for exposed devices and services
Why: Reveals exposed infrastructure and potential operational systems
Example: Check for exposed servers or services linked to supplier logistics
Extracts metadata from public documents (PDF, DOC, etc.)
Why: Metadata can show authorship, server paths and software versions
Example: Find document author or internal paths that corroborate timelines
Geolocation & public social-post mapping
Why: Verifies location claims by mapping content on public accounts
Example: Map field photos posted publicly to confirm date and place
Enriches feeds and threat intel from public sources
Why: Aggregates and enriches lists for easier analysis
Example: Add sanctions or registry info to a supplier list for context
Enumerates subdomains and public hostnames
Why: Finds previously unknown public services related to an organisation
Example: Discover hidden portals hosting procurement files
Search engine for hosts and TLS certificates
Why: Shows hosting relationships and infrastructure connections
Example: Identify shared hosting between agencies and contractors
Metadata extraction for documents
Why: Reveals hidden items in attachments (authors, paths)
Example: Extract metadata from a tender PDF to trace origin
Community datasets & tool collections for offline use
Why: Packages tools and datasets for privacy-respecting analysis
Example: Set up a local tool collection for training or workshops
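As a rough illustration of the "incident clusters by date and region" idea from the event-data entry above, the sketch below buckets events by region and ISO week and keeps the busy buckets. The records and field names (`event_date`, `region`, `fatalities`) are invented for the example and are not the real ACLED export schema; the `min_events` threshold is an arbitrary assumption.

```python
from collections import Counter
from datetime import date

# Hypothetical records; field names are assumptions, not the ACLED schema.
EVENTS = [
    {"event_date": "2024-03-04", "region": "North", "fatalities": 2},
    {"event_date": "2024-03-05", "region": "North", "fatalities": 0},
    {"event_date": "2024-03-07", "region": "North", "fatalities": 5},
    {"event_date": "2024-03-06", "region": "South", "fatalities": 1},
    {"event_date": "2024-03-20", "region": "North", "fatalities": 0},
]

def weekly_clusters(events, min_events=3):
    """Count events per (region, ISO year, ISO week) and keep busy buckets."""
    counts = Counter()
    for ev in events:
        iso = date.fromisoformat(ev["event_date"]).isocalendar()
        counts[(ev["region"], iso[0], iso[1])] += 1
    return {key: n for key, n in counts.items() if n >= min_events}

clusters = weekly_clusters(EVENTS)
```

A cluster found this way is only a prompt for closer reading of the underlying reports, not a finding in itself.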
📄 Document Processing & OCR Tools
Extract text from scanned documents, images, and PDFs with open-source OCR engines. Essential for archival research, leaked document analysis, and historical record digitization.
🔤 Open-Source OCR Engines
- Tesseract 5 — Industry standard OCR; supports 100+ languages; command-line and library
- PaddleOCR — Multilingual, fast, pre-trained models; easy Python integration
- EasyOCR — High accuracy, 80+ languages, PyTorch-based, simple API
- Dots OCR (RedNote HiLab) — Lightweight, efficient document processing, optimized for Chinese/multilingual text
- DeepSeek-OCR-2 — Advanced vision language model for document analysis, structured data extraction, table recognition
🛠 Text Processing Pipeline
- Rich — Format and display OCR output with syntax highlighting
- Python Requests + PIL/Pillow — Batch download images/PDFs, resize, preprocess
- OCRmyPDF — Embed OCR text layer into PDFs while preserving original layout
- Kraken — High-accuracy OCR for historical documents and manuscripts
💾 Data Organization
- SQLite FTS5 — Full-text search on extracted text; fast queries
- Elasticsearch — Distributed search engine for large document collections
- Weaviate — Vector database for semantic document search
Workflow Example: Download leaked PDFs → OCR with DeepSeek/PaddleOCR → Extract entities (names, amounts, dates) → Store in SQLite FTS5 → Search and cross-reference → Export findings as structured CSV/JSON.
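The later stages of this workflow (entity extraction, indexing, search) might look roughly like the sketch below. It assumes the OCR step has already produced plain text, uses two invented sample pages, and relies on simple regular expressions rather than a real entity recognizer. It also assumes your Python build ships SQLite with the FTS5 extension compiled in, which is common but not guaranteed.

```python
import re
import sqlite3

# Hypothetical OCR output; in practice this comes from an OCR engine.
PAGES = {
    "contract_01.pdf": "Payment of $1,250,000 approved on 2021-03-14 for Acme Logistics.",
    "memo_07.pdf": "Meeting held 2021-04-02. No payment was discussed.",
}

DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")      # ISO dates only
AMOUNT_RE = re.compile(r"\$\d[\d,]*(?:\.\d+)?")     # dollar amounts only

def extract_entities(text):
    """Pull ISO dates and dollar amounts out of a block of text."""
    return {"dates": DATE_RE.findall(text), "amounts": AMOUNT_RE.findall(text)}

def build_index(pages):
    """Load page text into an in-memory FTS5 table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(filename UNINDEXED, body)")
    conn.executemany("INSERT INTO docs VALUES (?, ?)", pages.items())
    return conn

def search(conn, query):
    """Return filenames whose body matches the full-text query."""
    rows = conn.execute(
        "SELECT filename FROM docs WHERE docs MATCH ? ORDER BY rank", (query,)
    )
    return [r[0] for r in rows]
```

For example, `search(build_index(PAGES), "logistics")` returns only the contract file, and the extracted entities can be written out with the standard `csv` or `json` modules.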
Public APIs & Open Data Resources
Build Your Own Solutions
Instead of relying on corporate data services, use free and open-source APIs to collect, analyze, and visualize data directly. These resources are maintained by communities and trusted organizations.
📊 Data & Analytics
- ACLED — Armed conflict events and patterns globally
- GDELT — News narratives, geopolitical signals, escalation monitoring
- UNOSAT — Satellite imagery for infrastructure and humanitarian impact
- SIPRI — Military spending, arms transfers, military budgets
🔍 Investigation & OSINT
- Bellingcat — Geolocation, image verification, open-source methods
- Shodan — Internet device discovery and mapping
- SpiderFoot — Automated reconnaissance and OSINT automation
- theHarvester — Email, subdomain, and host discovery
🛠 Developer APIs
- Public APIs directory — Thousands of free APIs for every purpose
- GitHub — Open-source repositories, collaborative development
- OpenWeather, GeoNames — Free geolocation and environmental data
📋 Corporate & Lobbying Data
- LobbyFacts — EU lobbying register with company and spending data
- Transparency International — Corruption, government spending databases
- SIPRI Arms Database — Weapons transfers by country, company
🌍 Environmental & Procurement
- Copernicus (EU, with ESA) — Free satellite data for climate and environment
- TradeHub / COMTRADE — UN trade statistics
- National transparency portals — Public procurement in ES, IT, FR, DE
Why Open-Source? The Corporate Digital Collapse
Corporate platforms are losing their value. Here's why your alternative matters.
The problem: For over a decade, corporate digital services relied on a closed model: proprietary platforms, opaque pricing, licensing lock-in, and centralized control. This worked because there were no credible alternatives at scale.
That condition no longer exists. Large language models (LLMs) combined with mature open-source ecosystems are structurally undermining the value proposition of big tech products.
→ How corporate digital services decay to zero
- 1. Compression by LLMs: Search (replaced by ChatGPT), customer support (chatbots), analytics (self-service BI), content production—features that cost $5K+/month now cost $20/month or $0.
- 2. Open-source defeats proprietary: Every corporate tool has an open equivalent gaining adoption. Metabase eats Tableau. Mattermost replaces Slack. Users vote with their data.
- 3. Institutional distrust accelerates collapse: US Big Tech companies are weaponized by state and capital: surveillance, censorship, regulatory capture, geopolitical coercion. When alternatives exist, defection is inevitable.
- 4. Community-driven wins on speed: Open-source teams move faster than corporations. Ollama shipped in weeks. OpenAI spent years securing capital.
What this means for you:
- Evaluate alternatives: Before buying a corporate tool, check if an open-source equivalent exists.
- Build your own stack: Combine specialized open tools instead of relying on one corporate platform.
- Support interoperable systems: Prefer tools that export data, use open standards, and don't lock you in.
- Join communities, not platforms: Open-source projects are controlled by users, not shareholders.
Open-Source Alternatives to Corporate Products
| Corporate Service | Open-Source Alternative | Benefits |
| --- | --- | --- |
| Google Workspace (Gmail, Docs) | Nextcloud, OnlyOffice, LibreOffice | Self-hosted, no data extraction, portable |
| Zoom | Jitsi Meet, Element (Matrix), BigBlueButton | End-to-end encrypted, federated, community-run |
| Slack | Mattermost, Rocket.Chat, Zulip | Transparent, full data control, export anytime |
| Salesforce/HubSpot (CRM) | Odoo, SuiteCRM, Pimcore | Customizable, no vendor lock-in, lower cost |
| ChatGPT (Proprietary AI) | Ollama, Llama, GPT4All (local models) | Run locally, no data sent to servers, free |
| Tableau/PowerBI (Analytics) | Metabase, Apache Superset, Grafana | Dashboard control, direct SQL, community support |
Power Networks: The Epstein Files Investigation
Why This Matters
In 2025, over 60,000 pages of Epstein-related documents were officially released by the U.S. House Oversight Committee and federal courts. These documents expose institutional failures, complicity networks, and power dynamics that enabled decades of abuse.
📊 Complete Analysis Guide
Download the comprehensive guide: document acquisition workflow, full-text search database setup, entity extraction, co-occurrence analysis, and reproducible methodology for investigating power networks.
📥 Download Complete Guide (200 KB)
Includes: JMail.world email database, text extraction pipeline (Python + OCR), SQLite FTS5 database, entity mapping, co-occurrence analysis, timeline reconstruction.
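To give a sense of what co-occurrence analysis means in practice, here is a minimal sketch that counts how often two entities appear in the same document. The document names and entity labels are placeholders invented for illustration, and the sketch assumes entity extraction has already been done; it is not taken from the guide itself.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: each document reduced to the set of entities it mentions.
DOC_ENTITIES = {
    "doc_a": {"Person A", "Company X", "Bank Y"},
    "doc_b": {"Person A", "Company X"},
    "doc_c": {"Person B", "Bank Y"},
}

def cooccurrence(doc_entities):
    """Count, for every entity pair, the number of documents mentioning both."""
    pairs = Counter()
    for entities in doc_entities.values():
        # Sorting makes (a, b) and (b, a) land on the same key.
        for a, b in combinations(sorted(entities), 2):
            pairs[(a, b)] += 1
    return pairs

top = cooccurrence(DOC_ENTITIES).most_common(1)
```

A high count shows only that two names appear together often. It says nothing about the nature of the relationship, so each pair still needs to be read in context.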
Reclaim Your Digital Life: Legitimate Alternatives
Why reclaim digital infrastructure?
- Corporate control = data weaponization: Google, Amazon, Meta, Apple track everything. Your data is sold to state agencies (NSA, GCHQ, law enforcement).
- Algorithmic manipulation: Corporate platforms optimize for engagement, polarization, and compliance. Your feed is a behavioral control tool.
- Arbitrary enforcement: Sudden bans, suspensions, content removal without due process. No appeal. No transparency.
🌐 Browser & Search — Reclaim Privacy
- Chrome (Google tracking) → Brave (built-in ad-blocking), Firefox (open-source), Ungoogled Chromium (privacy fork).
- Google Search → DuckDuckGo (no IP logging), SearXNG (privacy metasearch).
- YouTube → Invidious (YouTube without tracking/algorithm).
💬 Social Media & Communication
- Instagram, TikTok → Mastodon (federated, no algorithm), Pixelfed (photo sharing).
- Twitter/X → Mastodon (open, decentralized).
- Facebook → Diaspora (federated), Signal (encrypted messaging).
🎬 Media Centers
- Netflix, Disney+ → Jellyfin (self-hosted media), Kodi (media center).
- Spotify → Subsonic, Airsonic (self-hosted music).
All of these are legal, production-ready, and often superior to corporate alternatives. You keep your data. No account bans. No algorithm.
Campaign Templates & Materials
Download petitions, MP email templates, briefing PDFs and media assets
Petition templates
Download ready-to-send petitions demanding diplomacy, ceasefires and humanitarian prioritization.
MP / Representative email templates
Copy-and-paste email templates to contact MPs and local representatives.
Briefing PDFs for journalists and policymakers
One-page briefing templates for journalists and policymakers.
How to use templates in 10 minutes
- Open the most relevant template (petition, mp-email, briefing).
- Edit the first line to add target, dates and a short ask—avoid jargon.
- Add 1–2 evidence lines and link to the briefing or monitoring log.
- Run OSINT safety checklist: citations, image provenance, doxxing checks.
- Publish, then log the link and key metrics (signatures, replies, mentions).
Quick MP message example (copy-and-paste):
Dear [MP Name],
I'm a constituent and I urge you to request an independent audit of contract [#] due to concerns about [brief impact]. Please ask the minister to pause any further disbursements until the review is complete.
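If you send the same message to several representatives, a small helper can fill the bracketed placeholders and refuse to produce a message that still has a blank in it. This is a minimal sketch: the function name `fill_template` is made up here, and any values you pass in are your own.

```python
import re

TEMPLATE = (
    "Dear [MP Name],\n"
    "I'm a constituent and I urge you to request an independent audit of "
    "contract [#] due to concerns about [brief impact]."
)

def fill_template(template, values):
    """Replace each [placeholder]; raise if one has no value supplied."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder [{key}]")
        return values[key]
    return re.sub(r"\[([^\]]+)\]", substitute, template)
```

Calling it with a dictionary keyed by `"MP Name"`, `"#"` and `"brief impact"` returns the finished text. A missing key raises an error instead of sending a message with an unfilled bracket.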
See the briefing template for evidence structure. When in doubt, consult the OSINT safety resources before publishing.
Toolkit: Investigation, Whistleblowing & Real Impact
- OSINT & Investigative Journalism: Bellingcat How-To, Lighthouse Reports, OCCRP for image verification, arms tracking, satellite analysis, collaborative investigations.
- Secure Whistleblowing: GlobaLeaks, DDoSecrets, SecureDrop for anonymous reporting of abuse and corruption.
- Scraping & Data Leaks: Open-source tools for extracting procurement, lobbying, arms export data. Cross-check with LobbyFacts and Transparency International.
- Legal Support & International Complaints: FIDH, Amnesty International, Transparency International for templates to file with ECHR, UN, and free legal aid.
- Public Dashboards & Campaigns: Datawrapper, Flourish, Google Sheets for publishing findings and building public pressure.
AI & Automation for Campaigns
Use AI and open-source tools to automate research, data cleaning, and coordination. Always verify AI outputs with human fact-checking.
⚙️ Practical AI Workflows
Don't let AI do your thinking for you. Use it to save time on boring stuff so you can focus on what matters: understanding your issue, talking to people, organizing action.
📖 Read the full resource → AI for Campaigns: A Practical Resource for Activists
Real use cases:
- Summarizing reports: Feed a 100-page UN report to an LLM, get a 1-page summary with key facts. Then verify it against the original.
- Finding patterns: Train a model on news articles to detect when escalation language appears. Helps you spot when things are getting worse.
- Cleaning messy data: You have a spreadsheet of corporate contracts in 5 different formats? Use data tools to standardize and cross-check.
- Generating outreach: AI can help personalize emails to MPs at scale. But YOU write the message, YOU verify the facts.
- Organizing evidence: Use version-controlled databases to track who said what, when. Tamper-proof record keeping.
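The "cleaning messy data" case above can often be handled without AI at all. The sketch below normalizes dates and amounts that arrive in a few different formats. The list of formats is an assumption for illustration (note that it reads `05/03/2023` as day first), and the amount parser assumes commas are thousands separators; adjust both to your own data.

```python
from datetime import datetime

# Assumed input formats; extend this tuple to match your spreadsheet.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y", "%B %d, %Y")

def normalize_date(raw):
    """Return an ISO date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None  # leave unparseable values for manual review

def normalize_amount(raw):
    """Strip currency markers and thousands separators, return a float."""
    cleaned = raw.replace("EUR", "").replace("€", "").replace(",", "").strip()
    return float(cleaned)
```

Returning `None` for unknown date formats, rather than guessing, keeps questionable rows visible so a person can check them.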
Open-Source LLMs: Run Models Locally (Updated 2026)
The frontier of open models has exploded. All run locally with Ollama or LM Studio — no cloud, no logging, no surveillance risk.
🧠 General Purpose Frontier
- Gemma 3 (Google, 2025) — 1B–27B, multimodal, best-in-class open weights
- Llama 4 Scout / Maverick (Meta, 2025) — 17B–400B+, natively multimodal, open
- Mistral Small 3.1 / Mixtral 8x22B — Efficient MoE, Apache 2.0 license
- Qwen 3 (Alibaba, 2025) — 0.6B–235B, multilingual, open weights
- DeepSeek-R1 / V3 — 671B MoE, state-of-the-art reasoning, open weights
- Phi-4 (Microsoft, 14B) — Tiny but powerful for constrained hardware
- Command R+ (Cohere, open weights) — Best for RAG and document retrieval
🏥 Specialized / Domain Models
- MedGemma 1.5 (Google DeepMind, 2025) — Medical reasoning, radiology, clinical text; 4B & 27B open weights
- BioMistral 7B — Medical domain fine-tuned, multilingual biomedical QA
- InternVL 2.5 — Top multimodal open model; document, chart, image understanding
- SeaLLM 3 — Southeast Asian languages; ideal for multilingual activist work
- CodeGemma — Code generation and analysis, open weights by Google
- MiniCPM 3.0 — Runs on smartphones (2–4GB VRAM); edge deployment
🔊 Vision, Audio & Multimodal
- Whisper Large v3 — Best open speech-to-text; 100+ languages; transcribe testimonies
- CogVLM 2 — Open visual language model; image analysis and verification
- Whisper.cpp — Local whisper inference on CPU; no GPU needed
- SeamlessM4T v2 (Meta) — Real-time speech translation, 100+ languages
- Stable Diffusion 3.5 — Open image generation for campaign materials
Run everything locally: ollama pull gemma3:27b · ollama pull medgemma:4b · ollama pull deepseek-r1 · ollama pull qwen3:14b · ollama pull llama4:scout
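Once a model is pulled, Ollama serves a local REST API, so scripts can query it without any cloud service. The sketch below uses only the standard library and assumes Ollama is running on its default port (11434); the helper names `build_request` and `generate` are made up for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(model, prompt):
    """Build a non-streaming generate request for a locally pulled model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(model, prompt, timeout=120):
    """Send the request and return the generated text (needs Ollama running)."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

For example, `generate("qwen3:14b", "Summarize this paragraph: ...")` returns the model's answer as a string. Because the request goes to `localhost`, the prompt stays on your own machine.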
2025–2026 AI Frameworks & Infrastructure
Production-grade open tools to build AI pipelines, agents, and applications — without cloud lock-in.
🚀 Inference & Serving
- vLLM — Fast, production-grade LLM serving; 10x faster than naïve inference
- Ollama — One-command local LLM runner; REST API included
- LM Studio — GUI for running local models, GGUF format
- llama.cpp — CPU inference for quantized models; runs on Raspberry Pi
- TGI (Hugging Face) — Enterprise inference server, open source
🤖 Agents & Orchestration
- LangChain — Composable LLM pipelines; document loaders, retrievers, tools
- LlamaIndex — RAG (Retrieval-Augmented Generation) over your own documents
- CrewAI (2024–2025) — Multi-agent orchestration; define roles and tasks in Python
- AutoGen 0.4 (Microsoft) — Multi-agent conversation framework, async-first
- DSPy — Program LLMs with algorithms, not prompts; automatic optimization
- smolagents (🤗 HF, 2025) — Minimal agent framework; code agents, tool use
🗄 Vector DBs & RAG Infrastructure
- ChromaDB — Easiest local vector DB; Python-native; zero config
- Qdrant — Fast, production vector search; Rust-based, self-hosted
- Weaviate — Multimodal vector DB; images, text, audio
- pgvector — Postgres extension for vector search; no extra infra needed
- LanceDB — Embedded vector DB, no server; works inside Python scripts
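All of these vector databases do the same core job: store one vector per document and return the documents whose vectors are closest to the query vector. The toy sketch below shows that idea with cosine similarity. Its `embed` function is a deliberately crude bag-of-words stand-in for a real embedding model, and none of this is the API of any database listed above.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def top_k(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

# Hypothetical document snippets.
DOCS = [
    "arms export licence granted to contractor",
    "city council approves new bike lanes",
    "ministry publishes procurement tender for vehicles",
]
```

A real setup would swap `embed` for a learned embedding model, which is what allows matches on meaning rather than on shared words.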
🔬 Fine-Tuning & Training
- LLaMA Factory — Fine-tune any open LLM on your data; GUI + CLI
- Unsloth — 2x faster fine-tuning, 70% less VRAM; supports Gemma, Llama, Mistral
- Axolotl — Flexible fine-tuning framework; YAML config, multi-GPU
- TRL (HF) — RLHF, DPO, PPO training for open models
- easy-dataset — Generate fine-tuning datasets from documents automatically
📊 Evaluation & Observability
- LiteLLM — Unified API for 100+ models (OpenAI-compatible); proxy + cost tracking
- OpenHands (formerly OpenDevin) — Open-source AI coding agent; runs locally
- RAGAS — Evaluate RAG pipeline quality automatically
- Langfuse — Open-source LLM observability; tracing, evals, cost tracking
- MLflow — Experiment tracking, model registry, reproducibility
🌐 Web & App Frameworks
- Gradio — Build AI demos in minutes; shareable UIs for models
- Streamlit — Data apps and dashboards in Python; no frontend skills needed
- Open WebUI — ChatGPT-like UI for local Ollama models; multi-user, plugins
- AnythingLLM — Full local AI workspace: RAG, agents, multi-model, no cloud
- LobeChat — Open-source ChatGPT UI; plugins, local model support
Data Validation & Cleaning
OpenRefine — Hands-on data cleaning without coding. Dolt — Version-controlled databases.
Track changes in contracts, spending, corporate records. See who changed what, when. Build an auditable trail.
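To show what an auditable trail can mean at its simplest, here is a hash-chained log: each entry stores the hash of the previous one, so a later edit to any earlier record breaks the chain. This is a different and much simpler technique than Dolt's versioning, shown only as an illustration. It makes tampering detectable, not impossible, and the record fields are whatever you choose.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash used before the first entry

def _digest(prev_hash, record):
    """Hash the previous hash together with a canonical JSON form of the record."""
    body = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append_entry(log, record):
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash, "hash": _digest(prev_hash, record)})
    return log

def verify(log):
    """Recompute every hash; return False if any entry was altered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Publishing or separately storing the latest hash lets others confirm later that the log they see is the one you had at that point.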
Workflow Automation
Apache Airflow — Schedule and monitor data pipelines. Run reports automatically.
Example: Every morning, download latest arms export data, cross-check against your target list, flag new contracts.
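The cross-check step in that example is plain Python that a scheduler such as Airflow would simply call once a day. The sketch below covers only that step, with invented contract records and field names (`id`, `supplier`); downloading the data and wiring the function into a scheduled task are left out.

```python
def flag_new_contracts(latest, seen_ids, targets):
    """Return contracts not seen before whose supplier is on the target list."""
    wanted = {name.casefold() for name in targets}  # case-insensitive match
    return [
        c for c in latest
        if c["id"] not in seen_ids and c["supplier"].casefold() in wanted
    ]

# Hypothetical data for illustration.
TARGETS = ["Acme Defence Ltd"]
SEEN_IDS = {"C-001"}
LATEST = [
    {"id": "C-001", "supplier": "Acme Defence Ltd"},
    {"id": "C-002", "supplier": "ACME DEFENCE LTD"},
    {"id": "C-003", "supplier": "Other Supplier GmbH"},
]

flagged = flag_new_contracts(LATEST, SEEN_IDS, TARGETS)
```

Exact name matching will miss spelling variants and subsidiaries, so treat an empty result as "nothing matched", not as "nothing happened".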
Sentiment & Narrative Analysis
Detect propaganda, escalation language, and disinformation in real time.
Use case: Monitor official statements and news feeds. When language shifts toward war rhetoric, alert your network. Get ahead of escalation.
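A very simple version of this monitoring compares how often escalation terms appear in the latest text against a recent baseline. The sketch below is a keyword heuristic, not a trained model. The term list and the `factor` threshold are illustrative assumptions that would need tuning and human review.

```python
import re

# Illustrative term list; a real one needs care, context and local languages.
ESCALATION_TERMS = {"strike", "retaliation", "mobilization", "ultimatum"}

def escalation_rate(text):
    """Share of words in the text that are on the escalation term list."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(w in ESCALATION_TERMS for w in words) / len(words)

def should_alert(history, latest, factor=2.0):
    """Alert when the latest rate is at least `factor` times the baseline mean."""
    baseline = sum(history) / len(history) if history else 0.0
    return latest > 0 and latest >= factor * baseline
```

Keyword counts cannot tell a threat from a report about a threat, so an alert should send someone to read the source, not trigger a public claim.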
Collaborative Investigation
Tools for teams to organize evidence securely and share findings responsibly.
Bellingcat methodology • Shared spreadsheets with access controls • Encrypted document stores.
Safety & Ethics: Using AI Responsibly
Never trust AI alone. It's a tool. It makes mistakes. It can hallucinate. It can be biased.
Rules:
- ✓ Verify every claim against primary sources
- ✓ Use local models in high-risk countries (no cloud API logging)
- ✓ Keep datasets encrypted and version-controlled
- ✓ Train models only on verified data
- ✗ Never expose sources in plaintext to commercial LLM APIs
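One practical way to apply the last rule is to redact identifying details before any text leaves your machine. The sketch below masks email addresses, phone-like numbers, and a list of names you supply. The patterns are rough heuristics that will miss cases, so it supplements manual review and does not replace it.

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")  # rough: 9+ digit-like characters

def redact(text, names=()):
    """Mask emails, phone-like numbers and the given names in the text."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    for name in names:
        text = re.sub(re.escape(name), "[NAME]", text, flags=re.IGNORECASE)
    return text
```

For example, `redact("Contact Jane Roe at jane.roe@example.org", names=["Jane Roe"])` returns `"Contact [NAME] at [EMAIL]"`. Context can still identify a source even after names are removed, so for sensitive material a local model remains the safer choice.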
See Privacy & Security resources for threat modeling and safe AI use in restricted contexts.