In November 2025, a synthetic audio clip of a sitting MP was used to spread election misinformation in a state assembly by-election. The clip circulated on WhatsApp for 11 hours before being debunked. By then, it had been forwarded 2.3 lakh times.
This is the problem FakeReaper is built to solve — not in 11 hours, but in 0.3 seconds.
The Detection Challenge
Deepfake detection is hard because generation keeps improving. Every time a new detection model ships, the generation community adapts. This arms race has a structural problem: single-model detectors have a shelf life.
Our core architectural insight: no single detector is reliable across all modalities and all generation methods. The solution is a multi-model ensemble with dynamic weighting, where each constituent model's confidence score determines how much it contributes to the final verdict.
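Dynamic weighting can be sketched in a few lines. This is an illustrative toy, not FakeReaper's internal scorer: the `(p_fake, confidence)` pairs and the confidence-weighted average are assumptions chosen to show the idea that an uncertain model contributes less to the final verdict.

```python
def ensemble_verdict(predictions):
    """predictions: list of (p_fake, confidence) pairs, one per model.

    Each model's vote is weighted by its own confidence, so a model
    that is unsure of its output pulls the final probability less.
    """
    total_weight = sum(conf for _, conf in predictions)
    if total_weight == 0:
        return 0.5  # no model is confident; fall back to an uninformative prior
    return sum(p * conf for p, conf in predictions) / total_weight

# Example: three models disagree; the confident ones dominate.
preds = [(0.92, 0.9), (0.40, 0.2), (0.85, 0.7)]
print(round(ensemble_verdict(preds), 3))  # → 0.835
```

The weak middle vote (0.40 at confidence 0.2) barely moves the result, which is the behaviour a dynamically weighted ensemble is after.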
The Architecture
The pipeline for a single media item:
- Ingestion: Media file received via API. Format normalised (MP4 → frames + audio track; image → normalised tensor).
- Routing: Content-aware router identifies which specialist models are relevant.
- Parallel inference: Each relevant model runs simultaneously on GPU.
- Feature fusion: 128-dimensional feature vectors from each model are concatenated and passed to the meta-classifier.
- Calibration: The raw output probability is adjusted against a held-out validation set, so that a reported confidence of, say, 0.9 corresponds to roughly 90% empirical accuracy.
- Response: JSON payload with verdict, confidence, and per-model breakdown.
The Specialist Models
We run seven specialist models in the ensemble:
- ViT-Fake-v3: Vision transformer fine-tuned on 2.1M labelled synthetic images (FaceForensics++, DFDC, our proprietary Indian deepfake dataset).
- FreqNet: Frequency-domain analyser. Catches GAN and diffusion artifacts invisible in pixel space.
- LipSync-Check: Audio-visual synchronisation analyser. Flags desynchronised lip movements characteristic of video deepfakes.
- VoiceGuard: Audio deepfake detector. Trained on 800K voice samples including Whisper-generated and ElevenLabs synthetic audio.
- NairaText-Detect: LLM-generated text detector. Calibrated for Indian English and Hinglish patterns.
- StegoScan: Steganographic watermark detector for content signed with C2PA or similar provenance standards.
- Temporal-Coherence: Video-specific model checking for frame-level inconsistencies in lighting, shadow, and texture.
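The content-aware router from the pipeline maps each detected modality to the specialists above. The routing table below is an assumption for illustration (the model names are real, the assignments are my reading of their descriptions):

```python
# Hypothetical routing table: modality → relevant specialist models.
ROUTES = {
    "image": ["ViT-Fake-v3", "FreqNet", "StegoScan"],
    "audio": ["VoiceGuard", "StegoScan"],
    "video": ["ViT-Fake-v3", "FreqNet", "LipSync-Check",
              "VoiceGuard", "Temporal-Coherence", "StegoScan"],
    "text":  ["NairaText-Detect"],
}

def route(modalities):
    """Return the deduplicated, order-preserving list of models to run."""
    selected = []
    for modality in modalities:
        for model in ROUTES.get(modality, []):
            if model not in selected:
                selected.append(model)
    return selected

print(route(["video"]))           # video engages both visual and audio specialists
print(route(["image", "text"]))   # mixed-media post: image + caption models
```

Routing keeps latency down: a text post never touches the GPU-heavy video models.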
One API Call
The entire pipeline is exposed as a single REST endpoint. A detection request looks like this:
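A minimal sketch of such a request and response, assuming a hypothetical endpoint URL, field names, and response shape (the real API contract may differ):

```python
import json

ENDPOINT = "https://api.fakereaper.example/v1/detect"  # hypothetical URL

request_payload = {
    "media_url": "https://cdn.example.com/clip.mp4",  # media to analyse
    "modalities": ["video", "audio"],  # optional hint for the router
}

# A real client would POST this JSON to ENDPOINT with any HTTP library.
print(json.dumps(request_payload, indent=2))

# Illustrative response: verdict, calibrated confidence, per-model breakdown.
example_response = {
    "verdict": "synthetic",
    "confidence": 0.97,
    "models": {
        "ViT-Fake-v3": {"p_fake": 0.98, "confidence": 0.95},
        "LipSync-Check": {"p_fake": 0.91, "confidence": 0.88},
    },
}
```

The per-model breakdown matters in practice: a "synthetic" verdict driven by LipSync-Check tells a moderator to look at the mouth region, not the whole frame.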
The Indian Deepfake Dataset
One differentiator we're proud of: we built and maintain a proprietary dataset of Indian deepfake samples — synthesised faces, voices, and video clips of Indian public figures. Every major benchmark dataset (DFDC, FaceForensics++, DGM4) consists almost exclusively of Western faces and voices.
A detector trained only on those datasets performs materially worse on Indian faces and Indian accents. Our dataset corrects for this. It currently contains 140,000 synthetic Indian face images, 22,000 synthetic voice clips in 12 Indian languages, and 8,000 synthetic video segments.
What's Next
We are in conversations with two state governments about deploying FakeReaper as a real-time monitoring layer for election-related social media content. We are also building a WhatsApp integration — forward any suspicious media to a number and receive a verdict in seconds. No app download required.