Dual-Stage Toxic Moderation App

Production-ready multi-modal content moderation with LlamaGuard and DistilBERT+LoRA

Week 2
Duration: 1 week
Dual-Stage, LlamaGuard, DistilBERT+LoRA, BLIP

Project Overview

Understanding the challenge and solution approach

Problem Statement

Build a production-ready dual-stage moderation system that can analyze both text and images for harmful content, providing comprehensive safety assessments with real-time processing capabilities for digital platforms.

Solution Approach

Implemented a dual-stage pipeline that combines LlamaGuard for hard filtering with DistilBERT+LoRA for fine-grained 9-class classification, uses BLIP captioning to bring images into the same text pipeline, and is delivered as a modular Streamlit application.

Expected Outcome

A deployed dual-stage moderation system that detects harmful content in both text and images with high accuracy and real-time performance, pairing the overall safe/unsafe verdict with fine-grained category labels.

Quick Links & How to Run

Live demo, repo and local setup for reproducing the Week 2 app

How to run locally

Reproduce the Streamlit app locally using the project requirements and a local copy of the fine-tuned model.

pip install -r requirements.txt
# copy .env.example to .env and add your OpenRouter API key
# place model weights under the path used by the app (see README)
streamlit run app_streamlit.py

Model files used in Week 2 are expected at: C:/Users/NightPrince/OneDrive/Desktop/Cellula-Internship/Week1/peft-distilbert-toxic-classifier/last-checkpoint/ — update paths in the app if needed.
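
For a quick sanity check outside the Streamlit app, the checkpoint can be loaded directly with transformers and peft. The following is a minimal sketch, assuming a distilbert-base-uncased base with a 9-label head; the relative path is a placeholder for your local copy of the Week 1 weights.

# sketch: load the fine-tuned checkpoint outside the app for a quick sanity check
# (the DistilBERT base model name and the relative path below are assumptions; adjust to your setup)
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel
import torch

CHECKPOINT = "peft-distilbert-toxic-classifier/last-checkpoint"  # local copy of the Week 1 weights

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
base = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=9)
model = PeftModel.from_pretrained(base, CHECKPOINT)
model.eval()

inputs = tokenizer("example comment to score", return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1).squeeze()
print(probs)  # nine class probabilities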

Week 2 Highlights

What was achieved during the second week of the internship

  • Modular dual-stage moderation pipeline: LlamaGuard (hard filter) followed by DistilBERT+LoRA (9-class classifier).
  • Multi-modal support via BLIP image captioning to handle images through the same text pipeline.
  • Production-ready Streamlit UI with clear feedback, class probabilities, and error handling.
  • Addressed class imbalance using SMOTE, oversampling, and class weights (see the sketch after this list); documented experiments in reports.
  • Deployed a public demo on Hugging Face Spaces for evaluation and sharing.
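
On the class-imbalance point, the weighted-loss side can be sketched with sklearn's utilities. This is an illustrative sketch with a toy label array, not the project's actual training code.

# sketch: derive per-class weights from label frequencies for a weighted loss
# (the toy label array and the 0-8 class id range are placeholders, not the project's data)
import numpy as np
import torch
from sklearn.utils.class_weight import compute_class_weight

train_labels = np.array([0, 0, 0, 0, 1, 2, 2, 3, 4, 5, 6, 7, 8])  # hypothetical, heavily skewed toward class 0
weights = compute_class_weight(class_weight="balanced", classes=np.arange(9), y=train_labels)

loss_fn = torch.nn.CrossEntropyLoss(weight=torch.tensor(weights, dtype=torch.float))
# SMOTE-style oversampling (imblearn's SMOTE) was explored separately on the training features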

Week 2 File Structure (excerpt)

Week2/
├── app_streamlit.py         # Main Streamlit app
├── pipeline/                # Modular pipeline
│   ├── blip_caption.py
│   ├── llama_guard.py
│   └── toxic_classifier.py
├── requirements.txt
├── .env.example
├── README.md
└── internship_week2_report.html

System Architecture

How the multi-modal safety system works

Dual-Stage Architecture

Pre-processing Stage

  • Accepts raw text and image inputs
  • Applies tokenization and image preprocessing
  • Handles different input formats (text, image, URL)

Analysis Stage

  • LlamaGuard for text safety analysis
  • BLIP for image captioning and content analysis
  • Text enters the pipeline directly; images are first captioned by BLIP, then scored by the same text pipeline

Safety Assessment

  • Combines results from both stages
  • Generates a final safety score
  • Outputs safety labels (e.g., "Safe", "Unsafe")
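
A minimal sketch of that combination step follows, with illustrative function, class, and field names rather than the app's actual code.

# sketch: merge the Stage 1 verdict with Stage 2 class probabilities (names and threshold are assumptions)
def combine(stage1_verdict, stage2_probs, threshold=0.5):
    # stage1_verdict: "safe" / "unsafe" string from LlamaGuard
    # stage2_probs: dict mapping the 9 toxicity class names to probabilities from DistilBERT+LoRA
    if stage1_verdict == "unsafe":
        return {"label": "Unsafe", "reason": "blocked by LlamaGuard hard filter"}
    flagged = {cls: p for cls, p in stage2_probs.items() if p >= threshold}
    return {"label": "Unsafe" if flagged else "Safe", "flagged": flagged, "probabilities": stage2_probs}

print(combine("safe", {"insult": 0.82, "threat": 0.05, "obscene": 0.07}))  # truncated example probabilities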

Modular Pipeline

Input Processing

Accepts text and image inputs through Streamlit interface

Text Analysis

LlamaGuard analyzes text for safety violations

Image Analysis

BLIP generates captions and analyzes image content (see the captioning sketch after this walkthrough)

Safety Assessment

Combines results for comprehensive safety evaluation
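
The Image Analysis step hinges on BLIP captioning. The following is a minimal sketch with the Hugging Face transformers API, assuming the Salesforce/blip-image-captioning-base checkpoint and a local image file; the app may use a different variant.

# sketch: caption an image with BLIP so it can be moderated as text
# (the blip-image-captioning-base checkpoint and file name are assumptions)
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
caption = processor.decode(out[0], skip_special_tokens=True)
print(caption)  # this caption is then scored by the same LlamaGuard -> DistilBERT+LoRA text pipeline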

Methodology & Implementation

Step-by-step approach to building the multi-modal safety system

  1. Dual-Stage Architecture Design: Designed a modular dual-stage pipeline with Stage 1 (LlamaGuard hard filter) and Stage 2 (DistilBERT+LoRA fine-grained classification) for comprehensive content moderation.
  2. Stage 1 - LlamaGuard Integration: Integrated LlamaGuard via the OpenRouter API for instant hard filtering of legally or ethically unsafe content, constraining responses to 'safe' or 'unsafe' for reliability (a request sketch follows this list).
  3. Stage 2 - DistilBERT+LoRA Implementation: Deployed the fine-tuned DistilBERT model with PEFT-LoRA for nuanced 9-class toxic content classification, addressing class imbalance through SMOTE and weighted loss functions.
  4. Multi-Modal BLIP Integration: Implemented BLIP model for image captioning, enabling visual content moderation by converting images to text and processing through the same dual-stage pipeline.
  5. Modular Pipeline Development: Built a clean, modular architecture with separate components for BLIP captioning, LlamaGuard filtering, and toxic classification, ensuring maintainability and extensibility.
  6. Production-Ready Streamlit App: Created a robust Streamlit application with comprehensive error handling, real-time processing, professional UI, and support for both text and image inputs.
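
For the Stage 1 call in step 2, OpenRouter exposes an OpenAI-compatible chat endpoint. Below is a minimal request sketch; the model slug, environment variable name, and prompt framing are assumptions rather than the app's exact code (see pipeline/llama_guard.py for the real implementation).

# sketch: Stage 1 hard filter via OpenRouter's OpenAI-compatible endpoint
# (model slug, env var name, and prompt framing are assumptions)
import os
import requests

def llama_guard_verdict(text):
    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={
            "model": "meta-llama/llama-guard-3-8b",  # assumed slug; use the Llama Guard model available on OpenRouter
            "messages": [{"role": "user", "content": text}],
        },
        timeout=30,
    )
    resp.raise_for_status()
    answer = resp.json()["choices"][0]["message"]["content"].lower()
    return "unsafe" if "unsafe" in answer else "safe"  # normalize to the two values Stage 2 expects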

Results & Performance

Key metrics and achievements from the multi-modal safety system

  • Dual-Stage: moderation pipeline
  • 9 Categories: toxic classification
  • Multi-Modal: text & image support
  • Production-Ready: live demo available

Key Achievements

  • Successfully implemented a production-ready dual-stage moderation system with real-time processing
  • Created a modular, extensible pipeline architecture with clean separation of concerns
  • Integrated multiple state-of-the-art models (LlamaGuard, DistilBERT+LoRA, BLIP) into a unified system
  • Built a comprehensive Streamlit application with professional UI and robust error handling
  • Deployed live demo on Hugging Face Spaces for public access and testing

Technical Stack

Technologies and frameworks used in the project

LlamaGuard, DistilBERT+LoRA, BLIP, Streamlit, OpenRouter API, PEFT, Transformers, Python, Modular Pipeline, Multi-Modal


Challenges & Solutions

Key obstacles encountered and how they were overcome

  • Dual-Stage Pipeline Integration: Successfully integrated LlamaGuard (Stage 1) and DistilBERT+LoRA (Stage 2) by creating a unified interface and handling different API formats and model outputs.
  • Multi-Modal Processing: Implemented BLIP image captioning and developed algorithms to effectively combine text and image analysis results for comprehensive safety evaluation.
  • Modular Architecture Design: Designed a clean, modular pipeline with separate components for BLIP captioning, LlamaGuard filtering, and toxic classification, ensuring maintainability and extensibility.
  • Production Deployment: Optimized model loading, inference pipelines, and error handling to achieve real-time performance while maintaining accuracy for production deployment on Hugging Face Spaces.
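
On the deployment point, a common pattern is Streamlit's resource caching plus UI-level error reporting so reruns don't reload the models. This is a minimal sketch under those assumptions; the stand-in classifier and function names are illustrative, not the app's actual code.

# sketch: cache heavyweight models across Streamlit reruns and surface errors in the UI
# (function names and the stand-in classifier are illustrative, not the app's actual code)
import streamlit as st

@st.cache_resource  # load once per process; reused on every rerun instead of reloading per interaction
def load_classifier():
    from transformers import pipeline
    return pipeline("text-classification", model="distilbert-base-uncased")

clf = load_classifier()
text = st.text_area("Text to moderate")
if st.button("Check") and text:
    try:
        st.json(clf(text))
    except Exception as exc:  # report failures in the UI instead of crashing the app
        st.error(f"Moderation failed: {exc}")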