Latest News Feed

Real-time AI and ML news from Google News, Reddit, and Twitter/X

Architecture & Workflow

How the AI News Tracker processes and generates content

Data Pipeline Overview

The AI News Tracker runs a five-stage data acquisition and processing pipeline with two tiers of security guardrails (pre-LLM content filtering and prompt-level guardrails):

Stage 1: Data Acquisition

Three Feed Sources:

  • Feed A (Real-Time): Google News API for AI/tech articles
  • Feed B (Community): Reddit (r/MachineLearning, r/artificial)
  • Feed C (Expert): Twitter/X expert accounts

Stage 2: Pre-LLM Sanitization

Strategy 1 - Content Filtering:

  • Remove HTML/Markdown formatting
  • Strip boilerplate text (copyright, footers)
  • Truncate to a per-source maximum token count (Google News: 2000, Reddit: 2000, Twitter/X: 1000)
  • Validate against domain allowlist
  • Detect injection patterns
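
The filtering steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the allowlisted domains, the injection patterns, the boilerplate regex, and the function names are all assumptions, and whitespace-separated words stand in for real model tokens. Only the per-source limits come from the list above.

```python
import html
import re
from urllib.parse import urlparse

# Per-source limits from the list above (words used as a rough token proxy).
MAX_TOKENS = {"google_news": 2000, "reddit": 2000, "twitter": 1000}

# Hypothetical allowlist and patterns, for illustration only.
ALLOWED_DOMAINS = {"news.google.com", "reddit.com", "x.com", "twitter.com"}
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"you are now\b", re.I),
    re.compile(r"system prompt", re.I),
]
BOILERPLATE = re.compile(r"^(copyright|©|all rights reserved).*$", re.I | re.M)


def strip_markup(text: str) -> str:
    """Remove HTML tags and the most common Markdown markers."""
    text = html.unescape(text)                             # decode entities first
    text = re.sub(r"<[^>]+>", " ", text)                   # HTML tags
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)   # [label](url) -> label
    text = re.sub(r"^\s{0,3}(#{1,6}|>)\s+", "", text, flags=re.M)  # headings, quotes
    return re.sub(r"[*`]+", "", text)                      # emphasis, code ticks


def is_allowed(url: str) -> bool:
    """True if the URL's host is an allowlisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


def sanitize(text: str, url: str, source: str):
    """Return cleaned text, or None if the item should be dropped."""
    if not is_allowed(url):
        return None
    cleaned = BOILERPLATE.sub("", strip_markup(text))
    if any(p.search(cleaned) for p in INJECTION_PATTERNS):
        return None
    words = cleaned.split()
    return " ".join(words[: MAX_TOKENS[source]])
```

Dropping a suspicious item outright, rather than trying to repair it, is a deliberate simplification here; a real pipeline might instead log the item for review.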

Stage 3: LLM Processing (Claude)

Strategy 2 - Prompt-Level Guardrails:

  • System prompts with clear persona and constraints
  • Topic filtering (allowlist ML/AI topics, blocklist finance/politics)
  • Input validation for escape sequences and JSON manipulation
  • Output validation against JSON schema
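
As a rough sketch of the topic filter and output validation described above, the following uses only the Python standard library. The keyword lists, the system prompt wording, and the three-field output shape (`title`, `summary`, `tags`) are assumptions made for illustration; the source does not specify them, and a full JSON Schema validator could replace the hand-written checks.

```python
import json
import re

# Hypothetical persona prompt; the real wording is not given in the source.
SYSTEM_PROMPT = (
    "You are an AI/ML news summarizer. Only discuss machine learning and AI. "
    "Respond with a single JSON object and nothing else."
)

# Hypothetical keyword lists standing in for the allowlist and blocklist.
ALLOWED_TOPICS = {"machine learning", "ai", "neural network", "llm"}
BLOCKED_TOPICS = {"stock", "election", "crypto"}

# Assumed output shape: field name -> required Python type.
REQUIRED_FIELDS = {"title": str, "summary": str, "tags": list}


def _pattern(terms):
    """Compile a case-insensitive whole-word matcher for a set of terms."""
    body = "|".join(re.escape(t) for t in sorted(terms))
    return re.compile(r"\b(?:" + body + r")\b", re.I)


_ALLOWED_RE = _pattern(ALLOWED_TOPICS)
_BLOCKED_RE = _pattern(BLOCKED_TOPICS)


def on_topic(text: str) -> bool:
    """Blocklist wins over allowlist; text matching neither is rejected."""
    if _BLOCKED_RE.search(text):
        return False
    return bool(_ALLOWED_RE.search(text))


def validate_output(raw: str) -> dict:
    """Parse model output and check it against the assumed field schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    extra = set(data) - set(REQUIRED_FIELDS)
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    for field, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected):
            raise ValueError(f"field {field!r} missing or not {expected.__name__}")
    return data
```

Whole-word matching avoids false positives such as "ai" inside "said"; rejecting unexpected fields keeps injected content from riding along in the output.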

Stage 4: Content Generation

Parallel Processing:

  • Article summarization (150-300 words)
  • Video idea generation
  • Thumbnail generation via Leonardo API
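
One way to run the three generation tasks side by side is a thread pool, sketched below. The three worker functions are placeholders invented for this example: they do not call Claude or the Leonardo API, and the article fields (`title`, `text`) are assumed. Only the 150-300 word summary range comes from the list above.

```python
from concurrent.futures import ThreadPoolExecutor

SUMMARY_MIN_WORDS, SUMMARY_MAX_WORDS = 150, 300  # range stated above


def summarize(article: dict) -> str:
    """Placeholder: truncates the text instead of calling an LLM."""
    return " ".join(article["text"].split()[:SUMMARY_MAX_WORDS])


def video_ideas(article: dict) -> list:
    """Placeholder: derives idea titles from the article title."""
    return [f"Explainer: {article['title']}", f"Hot take: {article['title']}"]


def thumbnail(article: dict) -> dict:
    """Placeholder: returns a pending job record, no image API is called."""
    return {"status": "pending", "prompt": article["title"]}


def summary_length_ok(summary: str) -> bool:
    """Check the summary against the 150-300 word range."""
    return SUMMARY_MIN_WORDS <= len(summary.split()) <= SUMMARY_MAX_WORDS


def generate(article: dict) -> dict:
    """Run all three tasks concurrently and collect results by name."""
    tasks = {"summary": summarize, "video_ideas": video_ideas, "thumbnail": thumbnail}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn, article) for name, fn in tasks.items()}
        # result() re-raises any exception from the worker thread.
        return {name: future.result() for name, future in futures.items()}
```

Threads suit this stage because the real tasks would mostly wait on network calls; CPU-bound work would call for processes instead.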

Stage 5: Data Merge & Display

Final Output:

  • Unified feed.json with all metadata
  • Web UI display with filtering/sorting
  • API endpoints for external consumption
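
The merge step might look something like the sketch below. The item fields (`url`, `published`, `tags`) and the top-level keys of the merged object are assumptions for illustration; the actual feed.json schema is not shown in the source. Sorting relies on `published` being an ISO 8601 string in a single, consistent format.

```python
import json
from datetime import datetime, timezone


def merge_feeds(*feeds):
    """Combine per-source item lists, de-duplicate by URL, newest first."""
    by_url = {}
    for feed in feeds:
        for item in feed:
            by_url.setdefault(item["url"], item)  # first occurrence wins
    items = sorted(by_url.values(), key=lambda i: i["published"], reverse=True)
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "count": len(items),
        "items": items,
    }


def filter_by_tag(feed: dict, tag: str) -> list:
    """Return only the items carrying the given tag."""
    return [item for item in feed["items"] if tag in item.get("tags", [])]


def write_feed(feed: dict, path: str = "feed.json") -> None:
    """Serialize the merged feed for the web UI and API consumers."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(feed, fh, indent=2, ensure_ascii=False)
```

For example, `write_feed(merge_feeds(google_items, reddit_items, twitter_items))` would produce a single file, where the three argument names are hypothetical lists from the acquisition stage.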

Automation & Scheduling

n8n Workflow: Webhook-triggered orchestration

[Figure: n8n workflow diagram]

  • POST /webhook/run-pipeline triggers full pipeline
  • Cron scheduling every 6 hours via bash script
  • Execution logs and error handling
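
For illustration, the webhook trigger could be called from a small script like the one below, using only the standard library. The base URL assumes n8n is listening on its default local port 5678, and the empty JSON payload, function names, and script path in the cron comment are hypothetical; only the POST method and the /webhook/run-pipeline path come from the list above.

```python
import json
import urllib.request

# Assumption: a local n8n instance on its default port.
N8N_BASE_URL = "http://localhost:5678"


def build_trigger_request(base_url: str = N8N_BASE_URL, payload=None):
    """Build the POST request without sending it."""
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/webhook/run-pipeline",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_pipeline(base_url: str = N8N_BASE_URL, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError / URLError on failure, which a
    calling script can catch and log.
    """
    request = build_trigger_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# A crontab entry that fires every 6 hours (script path is hypothetical):
#   0 */6 * * * /path/to/run_pipeline.sh
```

Separating request construction from sending keeps the URL and headers easy to check without a running n8n instance.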

Security Features

  • ✓ Pre-LLM content sanitization guards against injection attacks
  • ✓ Prompt-level guardrails enforce topic focus
  • ✓ Domain allowlist blocks content from untrusted sources
  • ✓ Output validation ensures data integrity
  • ✓ Rate limiting and API key isolation (via .env)
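
The last item can be illustrated with a small sliding-window limiter and a minimal .env reader. Both are sketches under assumptions: the class and function names are invented here, the parser handles only plain KEY=VALUE lines (no quoting or `export`), and the key names in the usage note are hypothetical.

```python
import time
from collections import deque


def load_env(path: str = ".env") -> dict:
    """Read KEY=VALUE pairs, skipping blank lines and # comments."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


class RateLimiter:
    """Allow at most max_calls within any window of period seconds."""

    def __init__(self, max_calls: int, period: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock          # injectable for testing
        self._calls = deque()        # timestamps of accepted calls

    def allow(self) -> bool:
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            self._calls.append(now)
            return True
        return False
```

In use, each external service would get its own key and limiter, for example `keys = load_env()` followed by lookups such as `keys["LLM_API_KEY"]` and `keys["IMAGE_API_KEY"]` (hypothetical names), so that a leaked or throttled key affects only one stage.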

Raw Output Feed

Complete feed.json data structure for API consumption
