
Chamika J e5e56e48f8 feat(spam-moderation): implement spam detection and moderation for team invitations and signups

- Integrated SpamDetector utility to check for spam patterns in team names and user names during signup and invitation processes.
- Enhanced TeamMembersController to log and block obvious spam invitations while allowing suspicious ones for review.
- Updated passport-local-signup strategy to flag high-risk signups and log details for admin review.
- Added moderation routes to handle spam-related actions and integrated rate limiting for invitation requests.
- Improved frontend components to provide real-time spam warnings during organization name input, enhancing user feedback.

2025-07-31 15:52:08 +05:30


Worklenz Spam Protection System Guide

Overview

This guide documents the spam protection system implemented in Worklenz to prevent abuse of user invitations and registrations.

System Components

1. Spam Detection (`/worklenz-backend/src/utils/spam-detector.ts`)

The core spam detection engine analyzes text for suspicious patterns:

Flag-First Policy: Suspicious content is flagged for review, not blocked
Selective Blocking: Only extremely obvious spam (score > 80) gets blocked
URL Detection: Identifies links, shortened URLs, and suspicious domains
Spam Phrases: Detects common spam tactics (urgent, click here, win prizes)
Cryptocurrency Spam: Identifies blockchain/crypto compensation scams
Formatting Issues: Excessive capitals, special characters, emojis
Fake Name Detection: Generic names (test, demo, fake, spam)
Whitelist Support: Legitimate business names bypass all checks
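Context-Aware: Common words inside legitimate names (e.g. "Free Range Marketing", "Check Point Systems") are not treated as spam, which reduces false positives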
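
The flag-first policy above can be illustrated with a small scoring function. This is a minimal sketch, not the contents of spam-detector.ts: the rule patterns, weights, whitelist entry, and the names analyzeText, SpamRule, and SpamResult are all assumptions made for illustration. Only the block threshold (score > 80) comes from this guide; treating every non-zero score as "flag" is also an assumption.

```typescript
// Illustrative sketch only: patterns, weights, and the whitelist entry are
// assumptions, not the actual rules in spam-detector.ts.
type SpamAction = "allow" | "flag" | "block";

interface SpamResult {
  score: number; // capped at 100
  reasons: string[];
  action: SpamAction;
}

interface SpamRule {
  reason: string;
  weight: number;
  matches: (text: string) => boolean;
}

const BLOCK_THRESHOLD = 80; // from the guide: only score > 80 is blocked

const WHITELIST = new Set(["microsoft corporation"]); // hypothetical entry

// True when a text with at least 8 letters is more than 70% uppercase.
function mostlyUppercase(text: string): boolean {
  const letters = text.replace(/[^a-zA-Z]/g, "");
  const upper = letters.replace(/[^A-Z]/g, "");
  return letters.length >= 8 && upper.length / letters.length > 0.7;
}

const RULES: SpamRule[] = [
  {
    reason: "Contains URL",
    weight: 40,
    matches: (t) => /https?:\/\/|www\.|\b[a-z0-9-]+\.(com|net|ly)\//i.test(t),
  },
  {
    reason: "Contains spam phrase",
    weight: 30,
    matches: (t) => /\b(urgent|click here|win)\b/i.test(t),
  },
  {
    reason: "Cryptocurrency reference",
    weight: 30,
    matches: (t) => /\b(blockchain|crypto|bitcoin)\b/i.test(t),
  },
  { reason: "Excessive capitals", weight: 20, matches: mostlyUppercase },
];

function analyzeText(text: string): SpamResult {
  // Whitelisted names bypass all checks.
  if (WHITELIST.has(text.trim().toLowerCase())) {
    return { score: 0, reasons: [], action: "allow" };
  }
  const hits = RULES.filter((rule) => rule.matches(text));
  const score = Math.min(100, hits.reduce((sum, rule) => sum + rule.weight, 0));
  // Flag-first: anything suspicious is flagged; only extreme scores are blocked.
  const action: SpamAction =
    score > BLOCK_THRESHOLD ? "block" : score > 0 ? "flag" : "allow";
  return { score, reasons: hits.map((rule) => rule.reason), action };
}
```

With these illustrative weights, a single matching rule only flags the content; several rules have to match together before the score crosses the block threshold.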

2. Rate Limiting (`/worklenz-backend/src/middleware/rate-limiter.ts`)

Prevents volume-based attacks:

Invite Limits: 5 invitations per 15 minutes per user
Organization Creation: 3 attempts per hour
In-Memory Store: Fast rate limit checking without database queries
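
An in-memory limiter of this kind can be sketched as a per-user log of request timestamps. The guide does not say which algorithm rate-limiter.ts uses, so the sliding-window log below, the class name InMemoryRateLimiter, and the tryAcquire method are assumptions; only the limits (5 invitations per 15 minutes) are taken from the text above.

```typescript
// Hypothetical sliding-window limiter; not the actual rate-limiter.ts code.
class InMemoryRateLimiter {
  private readonly maxRequests: number;
  private readonly windowMs: number;
  // Maps a key (e.g. a user id) to the timestamps of its recent requests.
  private readonly hits = new Map<string, number[]>();

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true and records the request if the key is under its limit.
  tryAcquire(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have fallen out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.maxRequests) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}

// Limits from the guide: 5 invitations per 15 minutes per user.
const inviteLimiter = new InMemoryRateLimiter(5, 15 * 60 * 1000);
```

Because the state lives in a Map inside one process, the counts reset on restart and are not shared between server instances.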

3. Frontend Validation

Real-time feedback as users type:

/worklenz-frontend/src/components/account-setup/organization-step.tsx
/worklenz-frontend/src/components/admin-center/overview/organization-name/organization-name.tsx
/worklenz-frontend/src/components/settings/edit-team-name-modal.tsx

4. Backend Enforcement

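Enforces spam checks at the API level:

Team Members Controller: Validates organization and owner names before sending invites; obvious spam is blocked, while suspicious invitations are logged and allowed through for review
Signup Process: Blocks obvious spam during registration and flags high-risk signups for admin review
Logging: All blocked attempts are sent to Slack via the winston logger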

5. Database Schema

-- Teams table: Simple status field
ALTER TABLE teams ADD COLUMN status VARCHAR(20) DEFAULT 'active';

-- Moderation history tracking
CREATE TABLE team_moderation (
    id UUID PRIMARY KEY,
    team_id UUID REFERENCES teams(id),
    status VARCHAR(20), -- 'flagged', 'suspended', 'restored'
    reason TEXT,
    moderator_id UUID,
    created_at TIMESTAMP,
    expires_at TIMESTAMP -- For temporary suspensions
);

-- Spam detection logs
CREATE TABLE spam_logs (
    id UUID PRIMARY KEY,
    team_id UUID,
    content_type VARCHAR(50),
    original_content TEXT,
    spam_score INTEGER,
    spam_reasons JSONB,
    action_taken VARCHAR(50),
    created_at TIMESTAMP DEFAULT NOW() -- Referenced by the 24-hour monitoring query
);
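
To show how a detection result might be written to spam_logs, here is a sketch that builds a parameterized INSERT. The column names come from the schema above. The SpamLogEntry type, the buildSpamLogInsert helper, and the sample values for content_type and action_taken are hypothetical, and the { text, values } shape assumes a client that accepts queries in the form used by node-postgres.

```typescript
// Hypothetical helper; the backend's real persistence code may differ.
interface SpamLogEntry {
  id: string; // UUID generated by the caller
  teamId: string | null;
  contentType: string; // e.g. "team_name" (illustrative value)
  originalContent: string;
  spamScore: number;
  spamReasons: string[];
  actionTaken: string; // e.g. "flagged" or "blocked" (illustrative values)
}

interface ParameterizedQuery {
  text: string;
  values: unknown[];
}

function buildSpamLogInsert(entry: SpamLogEntry): ParameterizedQuery {
  return {
    text:
      "INSERT INTO spam_logs " +
      "(id, team_id, content_type, original_content, spam_score, spam_reasons, action_taken) " +
      "VALUES ($1, $2, $3, $4, $5, $6::jsonb, $7)",
    values: [
      entry.id,
      entry.teamId,
      entry.contentType,
      entry.originalContent,
      entry.spamScore,
      JSON.stringify(entry.spamReasons), // serialized for the JSONB column
      entry.actionTaken,
    ],
  };
}
```

Passing the user-supplied content as a bound parameter rather than concatenating it into the SQL string matters here, since the logged text is by definition untrusted.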

Admin Tools

API Endpoints

GET  /api/moderation/flagged-organizations - View flagged teams
POST /api/moderation/flag-organization - Manually flag a team
POST /api/moderation/suspend-organization - Suspend a team
POST /api/moderation/unsuspend-organization - Restore a team
GET  /api/moderation/scan-spam - Scan for spam in existing data
GET  /api/moderation/stats - View moderation statistics
POST /api/moderation/bulk-scan - Bulk scan and auto-flag
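
A typed lookup table keeps the methods and paths above in one place for an admin client. This is a sketch under assumptions: the helper prepareModerationRequest, the key names, and the JSON request body are hypothetical, and the guide does not document the request or response shapes or how these routes are authenticated.

```typescript
// Methods and paths are from the list above; everything else is illustrative.
const MODERATION_ENDPOINTS = {
  flaggedOrganizations: { method: "GET", path: "/api/moderation/flagged-organizations" },
  flagOrganization: { method: "POST", path: "/api/moderation/flag-organization" },
  suspendOrganization: { method: "POST", path: "/api/moderation/suspend-organization" },
  unsuspendOrganization: { method: "POST", path: "/api/moderation/unsuspend-organization" },
  scanSpam: { method: "GET", path: "/api/moderation/scan-spam" },
  stats: { method: "GET", path: "/api/moderation/stats" },
  bulkScan: { method: "POST", path: "/api/moderation/bulk-scan" },
} as const;

type ModerationEndpoint = keyof typeof MODERATION_ENDPOINTS;

interface PreparedRequest {
  url: string;
  init: { method: "GET" | "POST"; headers: Record<string, string>; body?: string };
}

// Builds the URL and options for a call; the body is only attached to POSTs.
function prepareModerationRequest(
  baseUrl: string,
  endpoint: ModerationEndpoint,
  body?: unknown
): PreparedRequest {
  const { method, path } = MODERATION_ENDPOINTS[endpoint];
  const init: PreparedRequest["init"] = {
    method,
    headers: { Accept: "application/json" },
  };
  if (method === "POST" && body !== undefined) {
    init.headers["Content-Type"] = "application/json";
    init.body = JSON.stringify(body);
  }
  // Trim trailing slashes so the base URL and path join cleanly.
  return { url: baseUrl.replace(/\/+$/, "") + path, init };
}
```

The result can be passed to fetch(url, init). Any body fields, such as a team identifier for the suspend endpoint, would need to be checked against the moderation routes themselves.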

Slack Notifications

The system sends structured alerts to Slack for:

🚨 Spam Detected (score > 30)
🔥 High Risk Content (known spam domains)
🛑 Blocked Attempts (invitations/signups)
⚠️ Rate Limit Exceeded

Example Slack notification:

{
  "alert_type": "high_risk_content",
  "team_name": "CLICK LINK: gclnk.com/spam",
  "user_email": "spammer@example.com",
  "spam_score": 95,
  "reasons": ["Contains suspicious URLs", "Contains monetary references"],
  "timestamp": "2024-01-15T10:30:00Z"
}
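
The payload above can be described with a small type and builder. This is a sketch, not the logger's actual code: the names SpamAlert, buildSpamAlert, and shouldAlert are hypothetical. The field names mirror the example notification, and the alert threshold (score > 30) is the one listed above.

```typescript
// Field names follow the example notification; the helpers are illustrative.
interface SpamAlert {
  alert_type: string;
  team_name: string;
  user_email: string;
  spam_score: number;
  reasons: string[];
  timestamp: string; // ISO 8601, UTC
}

interface SpamAlertInput {
  alertType: string; // only "high_risk_content" appears in this guide
  teamName: string;
  userEmail: string;
  spamScore: number;
  reasons: string[];
}

// The guide lists "Spam Detected" alerts for scores above 30.
function shouldAlert(spamScore: number): boolean {
  return spamScore > 30;
}

function buildSpamAlert(input: SpamAlertInput, now: Date = new Date()): SpamAlert {
  return {
    alert_type: input.alertType,
    team_name: input.teamName,
    user_email: input.userEmail,
    spam_score: input.spamScore,
    reasons: [...input.reasons],
    // Drop milliseconds to match the format of the example payload.
    timestamp: now.toISOString().replace(/\.\d{3}Z$/, "Z"),
  };
}
```

Passing the clock in as a parameter keeps the builder deterministic, which makes the payload format easy to check in a unit test.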

Testing the System

Test Spam Patterns

These will be FLAGGED for review (flag-first approach):

Suspicious Words: "Free Software Solutions" (flagged but allowed)
URLs: "Visit our site: bit.ly/win-prize" (flagged but allowed)
Cryptocurrency: "🔔 $50,000 BLOCKCHAIN COMPENSATION" (flagged but allowed)
Urgency: "URGENT! Click here NOW!!!" (flagged but allowed)
Generic Names: "Test Company", "Demo Organization" (flagged but allowed)
Excessive Numbers: "Company12345" (flagged but allowed)
Single Emoji: "Great Company 💰" (flagged but allowed)

BLOCKED Patterns (zero-tolerance - score > 80):

Known Spam Domains: "CLICK LINK: gclnk.com/spam"
Extreme Scam Patterns: "🔔CHECK $213,953 BLOCKCHAIN COMPENSATION URGENT🔔"
Obvious Spam URLs: Content with bit.ly/scam patterns

Whitelisted (Will NOT be flagged):

Legitimate Business: "Microsoft Corporation", "Free Software Company"
Standard Suffixes: "ABC Solutions Inc", "XYZ Consulting LLC"
Tech Companies: "DataTech Services", "The Design Studio"
Context-Aware: "Free Range Marketing", "Check Point Systems"
Legitimate "Test": "TestDrive Automotive" (not generic)
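
The patterns above lend themselves to a table-driven check. The sketch below is a test harness only: findMismatches and PATTERN_CASES are hypothetical names, the cases are a subset of the examples listed in this section, and the classify callback stands in for an adapter around the real detector, whose API is not shown in this guide.

```typescript
// Expected outcomes are taken from the flagged, blocked, and whitelisted lists.
type Expected = "allow" | "flag" | "block";

interface PatternCase {
  input: string;
  expected: Expected;
}

const PATTERN_CASES: PatternCase[] = [
  { input: "Free Software Solutions", expected: "flag" },
  { input: "Visit our site: bit.ly/win-prize", expected: "flag" },
  { input: "Test Company", expected: "flag" },
  { input: "CLICK LINK: gclnk.com/spam", expected: "block" },
  { input: "Microsoft Corporation", expected: "allow" },
  { input: "TestDrive Automotive", expected: "allow" },
];

// Runs every case through the classifier and describes each disagreement.
function findMismatches(
  classify: (text: string) => Expected,
  cases: PatternCase[] = PATTERN_CASES
): string[] {
  const mismatches: string[] = [];
  for (const testCase of cases) {
    const actual = classify(testCase.input);
    if (actual !== testCase.expected) {
      mismatches.push(
        `${JSON.stringify(testCase.input)}: expected ${testCase.expected}, got ${actual}`
      );
    }
  }
  return mismatches;
}
```

An empty result means the classifier agrees with every listed example. Re-running the same table after a threshold or pattern change is a quick way to catch new false positives.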

Expected Behavior

Suspicious Signup: Flagged in logs, user allowed to proceed
Obvious Spam Signup: Blocked with user-friendly message
Suspicious Invitations: Flagged in logs, invitation sent
Obvious Spam Invitations: Blocked with support contact suggestion
Frontend: Shows warning message for suspicious content
Logger: Sends Slack notification for all suspicious activity
Database: Records all activity in spam_logs table

Database Migration

Run these SQL scripts in order:

1. spam_protection_tables.sql - Creates new schema
2. fix_spam_protection_constraints.sql - Fixes notification_settings constraints

Configuration

Environment Variables

No additional environment variables are required. The system uses the existing:

COOKIE_SECRET - For session management
Database connection settings

Adjusting Thresholds

In spam-detector.ts:

const isSpam = score >= 50; // Adjust threshold here

In rate-limiter.ts:

inviteRateLimit(5, 15 * 60 * 1000) // 5 requests per 15 minutes

Monitoring

Check Spam Statistics

SELECT * FROM moderation_dashboard;
SELECT COUNT(*) FROM spam_logs WHERE created_at > NOW() - INTERVAL '24 hours';

View Rate Limit Events

SELECT * FROM rate_limit_log WHERE blocked = true ORDER BY created_at DESC;

Troubleshooting

Issue: Legitimate users blocked

Check spam_logs for their content
Adjust spam patterns or scoring threshold
Whitelist specific domains if needed

For notification_settings constraint errors, run the fix script: fix_spam_protection_constraints.sql

Issue: Slack notifications not received

Check winston logger configuration
Verify log levels in logger.ts
Ensure Slack webhook is configured

Future Enhancements

Machine Learning: Train on spam_logs data
IP Blocking: Geographic or reputation-based blocking
CAPTCHA Integration: For suspicious signups
Email Verification: Stronger email validation
Allowlist Management: Pre-approved domains

Security Considerations

Logs contain sensitive data - ensure proper access controls
Rate limit data stored in memory - consider Redis for scaling
Spam patterns should be regularly updated
Monitor for false positives and adjust accordingly
