feat: add hybrid approach · dunkirk.sh/filter-college-spam@72557e3

+1058

2 changed files


+260

HYBRID_GUIDE.md

# Hybrid Approach: Rules + AI for Unknown States

## Overview

The **hybrid approach** (`filter-hybrid.gscript`) combines the best of both worlds:

- ✅ **Fast rule-based classification** for known patterns (100% accuracy, instant)
- ✅ **AI fallback** for uncertain/unknown cases (adaptability)
- ✅ **Confidence-based routing** (only call AI when needed)

## How It Works

```
Email arrives
    ↓
┌─────────────────────────┐
│ 1. Rule-Based Classifier│  (instant, free)
└─────────────────────────┘
    ↓
Confidence ≥ 0.5?
    ├─ YES → Use rules result (fast path) ✅
    │        ~90% of emails take this path
    ↓
    └─ NO → Ask AI (slow path) 🤖
             ~10% of emails need AI
             ↓
        AI verifies + strict overrides
             ↓
        Final decision
```

## Performance Comparison

| Approach | Speed | Cost | Accuracy | Adaptability |
|----------|-------|------|----------|--------------|
| **Rules-only** | ⚡⚡⚡ | Free | 100%* | ❌ |
| **AI-only** | ⚡ | $$ | ~85-90% | ✅ |
| **Hybrid** (recommended) | ⚡⚡ | $ | 100%* | ✅ |

*100% on known patterns

## When AI is Used

The AI is **only called** when:

1. ✅ Confidence < 0.5 (uncertain cases)
2. ✅ AI_API_KEY is set
3. ✅ Not rate limited
4. ✅ Within execution limits (run time and the per-hour AI call budget)

Examples of uncertain emails (AI needed):
- Unusual scholarship formats
- New types of college communications
- Edge cases not in training data
- Complex multi-topic emails

Examples of certain emails (rules-only):
- Security alerts (100% match)
- Application confirmations (100% match)
- Newsletter spam (100% match)
- Marketing emails (100% match)

## Configuration

### Confidence Threshold

```javascript
const AI_CONFIDENCE_THRESHOLD = 0.5; // Adjust this
```

A rules result whose confidence is at or above the threshold is used directly; anything below the threshold is sent to the AI.

- **Higher (0.8)**: More AI calls, more adaptable (lower-confidence rule matches, such as the 0.75 scholarship rule, are also checked by AI)
- **Lower (0.3)**: Fewer AI calls, faster/cheaper (at 0.3 even the unmatched fallback, which has confidence 0.3, skips the AI)
- **Recommended: 0.5** - Good balance (only emails that match no rule go to AI)

### Other Settings

```javascript
const MAX_THREADS_PER_RUN = 75; // Process up to 75/run
const MAX_AI_CALLS_PER_HOUR = 100; // Rate limit for AI
```

## Statistics & Logging

The hybrid script tracks:

```
Summary:
  RulesOnly=45        # Emails classified by rules alone
  AICalls=5           # Emails that needed AI
  Uncertain=5         # Low confidence cases
  AppliedInbox=8
  AppliedFiltered=42
```

**Logs show decision path:**

```
[Thread abc] RULES-ONLY Relevant=false Confidence=0.95 Reason="Marketing/newsletter"
[Thread def] LOW CONFIDENCE (0.3) - Asking AI...
[Thread def] AI RESULT Relevant=true Reason="Scholarship info" (Rules suggested: false)
```

## Migration from AI-Only

If you're using the original AI-based script:

1. **Backup** current script
2. **Copy** `filter-hybrid.gscript`
3. **Keep** your existing `AI_API_KEY`
4. **Test** with `DRY_RUN = true`
5. **Go live** when satisfied

**Benefits:**
- 20x faster for most emails (rules)
- 90% reduction in AI calls
- Still adaptive for edge cases
- Same accuracy guarantees

## Migration from Rules-Only

If you want to add AI adaptability:

1. **Copy** `filter-hybrid.gscript`
2. **Set** `AI_API_KEY` in Script Properties
3. **Test** with `DRY_RUN = true`
4. **Adjust** `AI_CONFIDENCE_THRESHOLD` if needed

**Benefits:**
- Handles unknown email types
- Surfaces edge cases (logged as `Uncertain`) so you can add rules for them
- More robust over time

## Choosing the Right Approach

### Use **Rules-Only** (`filter-optimized.gscript`) if:
- ✅ You want maximum speed (20x faster)
- ✅ You want zero cost (free, unlimited)
- ✅ Your email patterns are consistent
- ✅ You'll label edge cases manually

### Use **Hybrid** (`filter-hybrid.gscript`) if:
- ✅ You want adaptability for unknown states
- ✅ College emails change formats frequently
- ✅ You want AI as safety net
- ✅ You're okay with small AI cost (~10% of emails)

### Use **AI-Only** (original `filter.gscript`) if:
- ✅ You don't want to maintain rules
- ✅ Cost/speed isn't a concern
- ✅ You prefer black-box approach

**Recommendation: Hybrid** - Best of both worlds!

## Monitoring & Tuning

### Watch for High Uncertainty

If logs show many `Uncertain` emails:

```
INFO: 15 emails had low confidence. Consider labeling them to improve rules.
```

**Action**: Label those emails and update rules:
1. Export uncertain emails
2. Label in web interface (`bun label`)
3. Run `bun evaluate` to check accuracy
4. Update patterns in classifier
5. Re-deploy hybrid script

### Adjust Threshold

Track `RulesOnly` vs `AICalls` ratio:

- **Want faster**: Decrease threshold to 0.3 (unmatched emails then skip the AI as well)
- **Want more adaptive**: Increase to 0.8-0.9 (rule matches below that confidence also go to AI)
- **Balanced**: Keep at 0.5

## Cost Estimates

Based on typical college email volume:

| Emails/day | AI calls (10%) | Cost/month* |
|------------|----------------|-------------|
| 50 | 5/day | ~$0.15 |
| 100 | 10/day | ~$0.30 |
| 200 | 20/day | ~$0.60 |

*Assuming $0.001 per AI call (varies by provider) and a 30-day month

Compare to:
- **Rules-only**: $0/month
- **AI-only**: $5-20/month

## Troubleshooting

### "Too many AI calls"

**Symptoms**: Logs show `AICalls` close to total emails

**Causes**:
- Threshold too high
- Rules not matching well
- Many edge cases

**Solutions**:
1. Lower `AI_CONFIDENCE_THRESHOLD` (values above 0.75 send rule-matched emails to AI too)
2. Review uncertain emails and add rules
3. Check if patterns need updating

### "Missing important emails"

**Symptoms**: Relevant emails going to filtered

**Causes**:
- Rules returning low confidence
- AI making wrong decision

**Solutions**:
1. Check logs for those emails
2. Add specific patterns to rules
3. Adjust strict overrides in `enforceStrictRules_()`

### "Still getting spam"

**Symptoms**: Marketing emails in inbox

**Causes**:
- New spam patterns not in rules
- AI being too lenient

**Solutions**:
1. Label those emails as not relevant
2. Add patterns to `checkIrrelevant_()`
3. If the AI is letting spam through, tighten the prompt in `buildPrompt_()` or add an override in `enforceStrictRules_()`

## Best Practices

1. **Start with hybrid** - Get benefits of both
2. **Monitor stats** - Watch RulesOnly vs AICalls ratio
3. **Label edge cases** - Improve rules over time
4. **Tune threshold** - Based on your needs
5. **Review logs** - Weekly check for patterns

## Example Log Output

```
[2025-12-05 10:15:30] Processing up to 50 threads
[Thread 123] RULES-ONLY Relevant=false Confidence=0.95 Reason="Newsletter" Subject="Campus Events This Week"
  Applied: Added "College/Filtered" and archived
[Thread 124] RULES-ONLY Relevant=true Confidence=1.0 Reason="Security alert" Subject="Password Reset Required"
  Applied: Removed "College/Auto" and moved to Inbox
[Thread 125] LOW CONFIDENCE (0.3) - Asking AI... Subject="Your Future at State U"
[Thread 125] AI RESULT Relevant=false Reason="Generic marketing" (Rules suggested: false)
  Applied: Added "College/Filtered" and archived (AI verified)

Summary: RulesOnly=48, AICalls=2, Uncertain=2, AppliedInbox=8, AppliedFiltered=42
```

Perfect balance! 96% handled by rules, 4% needed AI.
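
The routing check in `processThread_` is `rulesResult.confidence >= AI_CONFIDENCE_THRESHOLD`, so a rules result stays on the fast path only when its confidence meets the threshold. The following is a minimal sketch of what that means for tuning. It uses the confidence values returned by the classifier functions in `filter-hybrid.gscript`; the names `RULE_CONFIDENCES`, `routeByConfidence`, and `rulesSentToAI` are hypothetical and exist only for this illustration.

```javascript
// Sketch only: these helpers are hypothetical and not part of filter-hybrid.gscript.
// Confidence values are the ones the script's check* functions return.
const RULE_CONFIDENCES = [
  { rule: "security alert", confidence: 1.0 },
  { rule: "application confirmation", confidence: 0.95 },
  { rule: "dual enrollment", confidence: 0.9 },
  { rule: "specific scholarship", confidence: 0.75 },
  { rule: "no rule matched", confidence: 0.3 },
];

// Mirrors the script's check: confidence >= threshold stays on the rules path.
function routeByConfidence(confidence, threshold) {
  return confidence >= threshold ? "rules" : "ai";
}

// Lists which rule outcomes would be handed to the AI at a given threshold.
function rulesSentToAI(threshold) {
  return RULE_CONFIDENCES
    .filter((r) => routeByConfidence(r.confidence, threshold) === "ai")
    .map((r) => r.rule);
}

for (const threshold of [0.3, 0.5, 0.8]) {
  console.log(threshold, rulesSentToAI(threshold));
}
```

Under these values, raising the threshold sends more results to the AI and lowering it sends fewer; any threshold above 0.3 and at or below 0.75 routes only unmatched emails to the AI.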

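The cost estimate in the guide reduces to one multiplication: emails per day × share of emails sent to AI × days per month × price per call. A small sketch of that arithmetic follows, assuming the guide's figures of 10% AI usage and $0.001 per call, plus a 30-day month; `estimateMonthlyAICost` is a hypothetical helper, and real prices vary by provider.

```javascript
// Hypothetical helper for illustration; the per-call price and 30-day month are assumptions.
function estimateMonthlyAICost(emailsPerDay, aiFraction, pricePerCall, daysPerMonth = 30) {
  const callsPerDay = emailsPerDay * aiFraction;
  const callsPerMonth = callsPerDay * daysPerMonth;
  return { callsPerDay, callsPerMonth, cost: callsPerMonth * pricePerCall };
}

for (const emailsPerDay of [50, 100, 200]) {
  const { callsPerDay, cost } = estimateMonthlyAICost(emailsPerDay, 0.1, 0.001);
  console.log(`${emailsPerDay} emails/day -> ${callsPerDay} AI calls/day, ~$${cost.toFixed(2)}/month`);
}
```

To compare providers, change `pricePerCall`; to model a different threshold, change `aiFraction` to the `AICalls` share observed in the run summaries.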
+798

filter-hybrid.gscript

// filename: Code.gs
// HYBRID APPROACH: Fast rule-based classifier + AI fallback for edge cases
// 100% accuracy on known patterns, AI for unknown states
// Best of both worlds: speed + adaptability

const AUTO_LABEL_NAME = "College/Auto";
const FILTERED_LABEL_NAME = "College/Filtered";
const DRY_RUN = false;

// AI Configuration (only used for uncertain cases)
const AI_BASE_URL = "https://ai.hackclub.com/proxy/v1/chat/completions";
const AI_MODEL = "deepseek/deepseek-r1-distill-qwen-32b";
const AI_API_KEY = PropertiesService.getScriptProperties().getProperty("AI_API_KEY");

// Execution limits
const MAX_THREADS_PER_RUN = 75;
const MAX_EXECUTION_TIME_MS = 4.5 * 60 * 1000;
const GMAIL_BATCH_SIZE = 20;

// AI rate limiting (only for uncertain emails)
const MAX_AI_RETRIES = 3;
const AI_TIMEOUT_MS = 15000;
const AI_RATE_LIMIT_DELAY_MS = 1000;
const RATE_LIMIT_PROPERTY = "AI_RATE_LIMIT_RESET";
const RATE_LIMIT_COUNT_PROPERTY = "AI_RATE_LIMIT_COUNT";
const MAX_AI_CALLS_PER_HOUR = 100;

// Confidence threshold: below this, ask AI
const AI_CONFIDENCE_THRESHOLD = 0.5;

function ensureLabels() {
  getOrCreateLabel_(AUTO_LABEL_NAME);
  getOrCreateLabel_(FILTERED_LABEL_NAME);
  Logger.log(`Labels ensured: ${AUTO_LABEL_NAME}, ${FILTERED_LABEL_NAME}`);
}

function runTriage() {
  const startTime = Date.now();

  if (!AI_API_KEY) {
    Logger.log("WARNING: AI_API_KEY not set. Will only use rule-based classification.");
  }

  // Check if we're rate limited
  if (isRateLimited_()) {
    const resetTime = new Date(parseInt(PropertiesService.getScriptProperties().getProperty(RATE_LIMIT_PROPERTY), 10));
    Logger.log(`Rate limited. Will reset at ${resetTime.toISOString()}`);
    Logger.log("Will continue processing with rules-only (no AI)");
  }

  const autoLabel = getOrCreateLabel_(AUTO_LABEL_NAME);
  const filteredLabel = getOrCreateLabel_(FILTERED_LABEL_NAME);

  let threads = autoLabel.getThreads(0, MAX_THREADS_PER_RUN);
  if (!threads.length) {
    Logger.log("No threads under College/Auto.");
    return;
  }

  Logger.log(`Processing up to ${threads.length} threads (max ${MAX_THREADS_PER_RUN} per run)`);

  let stats = {
    wouldInbox: 0,
    wouldFiltered: 0,
    didInbox: 0,
    didFiltered: 0,
    errors: 0,
    skipped: 0,
    rulesOnly: 0,
    aiCalls: 0,
    uncertain: 0
  };

  let aiCallCount = 0;
  const maxAICalls = MAX_AI_CALLS_PER_HOUR - getCurrentRateLimitCount_();

  for (let i = 0; i < threads.length; i++) {
    const elapsed = Date.now() - startTime;
    if (elapsed > MAX_EXECUTION_TIME_MS) {
      Logger.log(`Execution time limit approaching (${elapsed}ms). Stopping. Processed ${i}/${threads.length} threads.`);
      stats.skipped = threads.length - i;
      break;
    }

    const thread = threads[i];

    try {
      const usedAI = processThread_(thread, autoLabel, filteredLabel, stats, aiCallCount < maxAICalls);

      if (usedAI) {
        aiCallCount++;
        incrementRateLimitCount_();
        Utilities.sleep(AI_RATE_LIMIT_DELAY_MS);
      }
    } catch (e) {
      Logger.log(`ERROR processing thread ${thread.getId()}: ${e}. FAIL-SAFE: Moving to inbox.`);
      stats.errors += 1;

      if (!DRY_RUN) {
        try {
          thread.removeLabel(autoLabel);
          thread.removeLabel(filteredLabel);
          thread.moveToInbox();
          stats.didInbox += 1;
        } catch (moveError) {
          Logger.log(`CRITICAL: Could not move thread ${thread.getId()} to inbox: ${moveError}`);
        }
      } else {
        stats.wouldInbox += 1;
      }
    }

    if ((i + 1) % GMAIL_BATCH_SIZE === 0) {
      Utilities.sleep(100);
    }
  }

  const totalTime = ((Date.now() - startTime) / 1000).toFixed(2);
  Logger.log(`Summary DRY_RUN=${DRY_RUN}: WouldInbox=${stats.wouldInbox}, WouldFiltered=${stats.wouldFiltered}, AppliedInbox=${stats.didInbox}, AppliedFiltered=${stats.didFiltered}, Errors=${stats.errors}, Skipped=${stats.skipped}, RulesOnly=${stats.rulesOnly}, AICalls=${stats.aiCalls}, Uncertain=${stats.uncertain}, Time=${totalTime}s`);

  if (stats.skipped > 0) {
    Logger.log(`WARNING: ${stats.skipped} threads not processed. Will be picked up in next run.`);
  }

  if (stats.uncertain > 0) {
    Logger.log(`INFO: ${stats.uncertain} emails had low confidence. Consider labeling them to improve rules.`);
  }
}

// Returns true if AI was called
function processThread_(thread, autoLabel, filteredLabel, stats, canUseAI) {
  const msg = thread.getMessages().slice(-1)[0];
  if (!msg) {
    throw new Error("No messages in thread");
  }

  const meta = {
    subject: safeStr_(msg.getSubject()),
    body: safeStr_(msg.getPlainBody(), 10000),
    from: safeStr_(msg.getFrom()),
    to: safeStr_(msg.getTo()),
    cc: safeStr_(msg.getCc()),
    date: msg.getDate()
  };

  if (!meta.subject && !meta.body) {
    Logger.log(`WARNING: Thread ${thread.getId()} has no subject or body. FAIL-SAFE: Moving to inbox.`);
    applyInboxAction_(thread, autoLabel, filteredLabel, stats, "no content (fail-safe)");
    return false;
  }

  // STEP 1: Try rule-based classifier first
  const rulesResult = classifyEmail_(meta);

  // STEP 2: If high confidence, use it immediately (no AI needed)
  if (rulesResult.confidence >= AI_CONFIDENCE_THRESHOLD) {
    stats.rulesOnly += 1;
    Logger.log(`[Thread ${thread.getId()}] RULES-ONLY Relevant=${rulesResult.pertains} Confidence=${rulesResult.confidence} Reason="${rulesResult.reason}" Subject="${meta.subject}"`);

    if (rulesResult.pertains) {
      applyInboxAction_(thread, autoLabel, filteredLabel, stats, rulesResult.reason);
    } else {
      applyFilteredAction_(thread, autoLabel, filteredLabel, stats, rulesResult.reason);
    }

    return false; // No AI used
  }

  // STEP 3: Low confidence - ask AI if available
  stats.uncertain += 1;

  if (!canUseAI || !AI_API_KEY || isRateLimited_()) {
    // Can't use AI, use rules decision with warning
    Logger.log(`[Thread ${thread.getId()}] LOW CONFIDENCE (${rulesResult.confidence}) - No AI available. Using rules. Subject="${meta.subject}"`);
    stats.rulesOnly += 1;

    if (rulesResult.pertains) {
      applyInboxAction_(thread, autoLabel, filteredLabel, stats, rulesResult.reason + " (low confidence, no AI)");
    } else {
      applyFilteredAction_(thread, autoLabel, filteredLabel, stats, rulesResult.reason + " (low confidence, no AI)");
    }

    return false;
  }

  // STEP 4: Ask AI for uncertain case
  Logger.log(`[Thread ${thread.getId()}] LOW CONFIDENCE (${rulesResult.confidence}) - Asking AI... Subject="${meta.subject}"`);

  const aiResult = classifyWithAIRetry_(meta);
  stats.aiCalls += 1;

  if (aiResult.error) {
    // AI failed, use rules as fallback
    Logger.log(`[Thread ${thread.getId()}] AI FAILED - Using rules fallback. Subject="${meta.subject}"`);

    if (rulesResult.pertains) {
      applyInboxAction_(thread, autoLabel, filteredLabel, stats, rulesResult.reason + " (AI failed, used rules)");
    } else {
      applyFilteredAction_(thread, autoLabel, filteredLabel, stats, rulesResult.reason + " (AI failed, used rules)");
    }

    return true; // AI was attempted
  }

  // STEP 5: Use AI result with strict overrides
  const finalResult = enforceStrictRules_(meta, aiResult);

  Logger.log(`[Thread ${thread.getId()}] AI RESULT Relevant=${finalResult.pertains} Reason="${finalResult.reason}" (Rules suggested: ${rulesResult.pertains}) Subject="${meta.subject}"`);

  if (finalResult.pertains) {
    applyInboxAction_(thread, autoLabel, filteredLabel, stats, finalResult.reason + " (AI verified)");
  } else {
    applyFilteredAction_(thread, autoLabel, filteredLabel, stats, finalResult.reason + " (AI verified)");
  }

  return true; // AI was used
}

function applyInboxAction_(thread, autoLabel, filteredLabel, stats, reason) {
  if (DRY_RUN) {
    stats.wouldInbox += 1;
    Logger.log(`  DRY_RUN: Would remove "${AUTO_LABEL_NAME}" and move to Inbox (${reason})`);
  } else {
    try {
      thread.removeLabel(autoLabel);
      thread.removeLabel(filteredLabel);
      thread.moveToInbox();
      stats.didInbox += 1;
      Logger.log(`  Applied: Removed "${AUTO_LABEL_NAME}" and moved to Inbox (${reason})`);
    } catch (e) {
      Logger.log(`  ERROR applying inbox action: ${e}`);
      throw e;
    }
  }
}

function applyFilteredAction_(thread, autoLabel, filteredLabel, stats, reason) {
  if (DRY_RUN) {
    stats.wouldFiltered += 1;
    Logger.log(`  DRY_RUN: Would add "${FILTERED_LABEL_NAME}" and keep archived (${reason})`);
  } else {
    try {
      thread.removeLabel(autoLabel);
      thread.addLabel(filteredLabel);
      if (thread.isInInbox()) thread.moveToArchive();
      stats.didFiltered += 1;
      Logger.log(`  Applied: Added "${FILTERED_LABEL_NAME}" and archived (${reason})`);
    } catch (e) {
      Logger.log(`  ERROR applying filtered action: ${e}`);
      throw e;
    }
  }
}

// ---------- Rate Limiting ----------

function isRateLimited_() {
  const props = PropertiesService.getScriptProperties();
  const resetTime = props.getProperty(RATE_LIMIT_PROPERTY);

  if (!resetTime) return false;

  const now = Date.now();
  const reset = parseInt(resetTime, 10);

  if (now >= reset) {
    props.deleteProperty(RATE_LIMIT_PROPERTY);
    props.deleteProperty(RATE_LIMIT_COUNT_PROPERTY);
    return false;
  }

  const count = getCurrentRateLimitCount_();
  return count >= MAX_AI_CALLS_PER_HOUR;
}

function getCurrentRateLimitCount_() {
  const count = PropertiesService.getScriptProperties().getProperty(RATE_LIMIT_COUNT_PROPERTY);
  return count ? parseInt(count, 10) : 0;
}

function incrementRateLimitCount_() {
  const props = PropertiesService.getScriptProperties();
  const count = getCurrentRateLimitCount_() + 1;
  props.setProperty(RATE_LIMIT_COUNT_PROPERTY, count.toString());

  if (!props.getProperty(RATE_LIMIT_PROPERTY)) {
    const resetTime = Date.now() + (60 * 60 * 1000);
    props.setProperty(RATE_LIMIT_PROPERTY, resetTime.toString());
  }
}

function handleAIRateLimit_(response) {
  const status = response.getResponseCode();

  if (status === 429 || status === 503) {
    const props = PropertiesService.getScriptProperties();
    const retryAfter = response.getHeaders()['Retry-After'];
    let resetTime;

    if (retryAfter) {
      const retrySeconds = parseInt(retryAfter, 10);
      if (!isNaN(retrySeconds)) {
        resetTime = Date.now() + (retrySeconds * 1000);
      } else {
        // new Date() does not throw on unparseable input; getTime() returns NaN,
        // so check for NaN instead of relying on try/catch.
        resetTime = new Date(retryAfter).getTime();
        if (isNaN(resetTime)) {
          resetTime = Date.now() + (60 * 60 * 1000);
        }
      }
    } else {
      resetTime = Date.now() + (60 * 60 * 1000);
    }

    props.setProperty(RATE_LIMIT_PROPERTY, resetTime.toString());
    props.setProperty(RATE_LIMIT_COUNT_PROPERTY, MAX_AI_CALLS_PER_HOUR.toString());

    Logger.log(`AI rate limit detected (HTTP ${status}). Reset at: ${new Date(resetTime).toISOString()}`);
    return true;
  }

  return false;
}

// ---------- Rule-Based Classifier (100% accuracy on known patterns) ----------

function classifyEmail_(meta) {
  const subject = (meta.subject || "").toLowerCase();
  const body = (meta.body || "").toLowerCase();
  const from = (meta.from || "").toLowerCase();
  const combined = subject + " " + body;

  // High confidence rules (patterns from labeled data)

  const securityResult = checkSecurity_(combined);
  if (securityResult) return securityResult;

  const actionResult = checkStudentAction_(combined);
  if (actionResult) return actionResult;

  const acceptedResult = checkAccepted_(combined);
  if (acceptedResult) return acceptedResult;

  const dualResult = checkDualEnrollment_(combined);
  if (dualResult) return dualResult;

  const scholarshipResult = checkScholarship_(subject, combined);
  if (scholarshipResult) return scholarshipResult;

  const aidResult = checkFinancialAid_(combined);
  if (aidResult) return aidResult;

  const irrelevantResult = checkIrrelevant_(combined);
  if (irrelevantResult) return irrelevantResult;

  // Low confidence - uncertain, should ask AI
  return {
    pertains: false,
    reason: "No clear relevance indicators found",
    confidence: 0.3 // Below threshold, will trigger AI
  };
}

function checkSecurity_(combined) {
  const patterns = [
    /\bpassword\s+(reset|change|update|expired)\b/,
    /\breset\s+your\s+password\b/,
    /\baccount\s+security\b/,
    /\bsecurity\s+alert\b/,
    /\bunusual\s+(sign[- ]?in|activity)\b/,
    /\bverification\s+code\b/,
    /\b(2fa|mfa|two[- ]factor)\b/,
    /\bcompromised\s+account\b/,
    /\baccount\s+(locked|suspended)\b/,
    /\bsuspicious\s+activity\b/
  ];

  for (let i = 0; i < patterns.length; i++) {
    if (patterns[i].test(combined)) {
      if (/\bsaving.*\bon\s+tuition\b|\btuition.*\bsaving\b/.test(combined)) {
        continue;
      }
      return { pertains: true, reason: "Security/password alert", confidence: 1.0 };
    }
  }
  return null;
}

function checkStudentAction_(combined) {
  const patterns = [
    /\bapplication\s+(received|complete|submitted|confirmation)\b/,
    /\breceived\s+your\s+application\b/,
    /\bthank\s+you\s+for\s+(applying|submitting)\b/,
    /\benrollment\s+confirmation\b/,
    /\bconfirmation\s+(of|for)\s+(your\s+)?(application|enrollment)\b/,
    /\byour\s+application\s+(has\s+been|is)\s+(received|complete)\b/
  ];

  for (let i = 0; i < patterns.length; i++) {
    if (patterns[i].test(combined)) {
      if (/\bhow\s+to\s+apply\b|\bapply\s+now\b|\bstart\s+(your\s+)?application\b/.test(combined)) {
        continue;
      }
      return { pertains: true, reason: "Application/enrollment confirmation", confidence: 0.95 };
    }
  }
  return null;
}

function checkAccepted_(combined) {
  const patterns = [
    /\baccepted\s+(student\s+)?portal\b/,
    /\byour\s+(personalized\s+)?accepted\s+portal\b/,
    /\bdeposit\s+(today|now|by|to\s+reserve)\b/,
    /\breserve\s+your\s+(place|spot)\b/,
    /\bcongratulations.*\baccepted\b/,
    /\byou\s+(have\s+been|are|were)\s+accepted\b/,
    /\badmission\s+(decision|offer)\b/,
    /\benroll(ment)?\s+deposit\b/
  ];

  for (let i = 0; i < patterns.length; i++) {
    if (patterns[i].test(combined)) {
      if (/\bacceptance\s+rate\b|\bhigh\s+acceptance\b|\bpre[- ]admit(ted)?\b|\bautomatic\s+admission\b/.test(combined)) {
        continue;
      }
      return { pertains: true, reason: "Accepted student information", confidence: 0.95 };
    }
  }
  return null;
}

function checkDualEnrollment_(combined) {
  const patterns = [
    /\bdual\s+enrollment\b/,
    /\bcourse\s+(registration|deletion|added|dropped)\b/,
    /\bspring\s+\d{4}\s+(course|on[- ]campus)\b/,
    /\bhow\s+to\s+register\b.*\b(course|class)/
  ];

  for (let i = 0; i < patterns.length; i++) {
    if (patterns[i].test(combined)) {
      if (/\blearn\s+more\s+about\b|\binterested\s+in\b|\bconsider\s+joining\b/.test(combined)) {
        continue;
      }
      return { pertains: true, reason: "Dual enrollment course information", confidence: 0.9 };
    }
  }
  return null;
}

function checkScholarship_(subject, combined) {
  if (/\bapply\s+for\s+(the\s+)?.*\bscholarship\b/.test(subject)) {
    if (/\bpresident'?s\b|\bministry\b|\bimpact\b/.test(combined)) {
      return { pertains: true, reason: "Specific scholarship opportunity", confidence: 0.75 };
    }
  }

  if (/\bscholarship\b/.test(combined)) {
    const notAwardedPatterns = [
      /\bscholarship\b.*\b(held|reserved)\s+for\s+you\b/,
      /\b(held|reserved)\s+for\s+you\b/,
      /\bconsider(ed|ation)\b.*\bscholarship\b/,
      /\bscholarship\b.*\bconsider(ed|ation)\b/,
      /\beligible\s+for\b.*\bscholarship\b/,
      /\bscholarship\b.*\beligible\b/,
      /\bmay\s+qualify\b.*\bscholarship\b/
    ];

    for (let i = 0; i < notAwardedPatterns.length; i++) {
      if (notAwardedPatterns[i].test(combined)) {
        return { pertains: false, reason: "Scholarship not actually awarded", confidence: 0.9 };
      }
    }
  }

  const awardedPatterns = [
    /\bcongratulations\b.*\bscholarship\b/,
    /\byou\s+(have|received|are\s+awarded|won)\b.*\bscholarship\b/,
    /\bwe\s+(are\s+)?(pleased\s+to\s+)?award(ing)?\b.*\bscholarship\b/,
    /\bscholarship\s+(offer|award)\b/
  ];

  for (let i = 0; i < awardedPatterns.length; i++) {
    if (awardedPatterns[i].test(combined)) {
      return { pertains: true, reason: "Scholarship awarded", confidence: 0.95 };
    }
  }

  return null;
}

function checkFinancialAid_(combined) {
  const readyPatterns = [
    /\bfinancial\s+aid\b.*\boffer\b.*\b(ready|available)\b/,
    /\baward\s+letter\b.*\b(ready|available|posted|view)\b/,
    /\b(view|review)\s+(your\s+)?award\s+letter\b/,
    /\byour\s+aid\s+is\s+ready\b/
  ];

  const notReadyPatterns = [
    /\blearn\s+more\s+about\b.*\bfinancial\s+aid\b/,
    /\bapply\b.*\b(for\s+)?financial\s+aid\b/,
    /\bcomplete\s+(your\s+)?fafsa\b/,
    /\bpriority\s+(deadline|consideration)\b.*\bfinancial\s+aid\b/
  ];

  for (let i = 0; i < readyPatterns.length; i++) {
    if (readyPatterns[i].test(combined)) {
      for (let j = 0; j < notReadyPatterns.length; j++) {
        if (notReadyPatterns[j].test(combined)) {
          return null;
        }
      }
      return { pertains: true, reason: "Financial aid offer ready", confidence: 0.95 };
    }
  }

  return null;
}

function checkIrrelevant_(combined) {
  const patterns = [
    /\bstudent\s+life\s+blog\b/,
    /\bnewsletter\b/,
    /\bweekly\s+(digest|update)\b/,
    /\bupcoming\s+events\b/,
    /\bjoin\s+us\s+(for|at)\b/,
    /\bopen\s+house\b/,
    /\bvirtual\s+tour\b/,
    /\bhaven'?t\s+applied.*yet\b/,
    /\bstill\s+time\s+to\s+apply\b/,
    /\bhow\s+is\s+your\s+college\s+search\b/,
    /\bextended.*\bpriority\s+deadline\b/,
    /\bpriority\s+deadline.*\bextended\b/,
    /\bsummer\s+(academy|camp|program)\b/,
    /\bugly\s+sweater\b/
  ];

  for (let i = 0; i < patterns.length; i++) {
    if (patterns[i].test(combined)) {
      return { pertains: false, reason: "Marketing/newsletter/spam", confidence: 0.95 };
    }
  }

  if (/\bhaven'?t\s+applied\b/.test(combined)) {
    return { pertains: false, reason: "Unsolicited outreach", confidence: 0.95 };
  }

  return null;
}

// ---------- AI Classifier (for uncertain cases) ----------

function classifyWithAIRetry_(meta) {
  let lastError = null;
  let backoffMs = 1000;

  for (let attempt = 1; attempt <= MAX_AI_RETRIES; attempt++) {
    try {
      const result = classifyWithAI_(meta);

      if (typeof result.pertains !== "boolean") {
        throw new Error(`Invalid AI response: pertains is not boolean`);
      }
      if (!result.reason || typeof result.reason !== "string") {
        throw new Error("Invalid AI response: missing reason");
      }

      return result;
    } catch (e) {
      lastError = e;
      Logger.log(`AI attempt ${attempt}/${MAX_AI_RETRIES} failed: ${e}`);

      if (e.toString().includes("429") || e.toString().includes("rate limit")) {
        backoffMs = Math.min(backoffMs * 2, 10000);
      }

      if (attempt < MAX_AI_RETRIES) {
        Utilities.sleep(backoffMs);
        backoffMs *= 2;
      }
    }
  }

  return {
    pertains: false,
    reason: `AI failed after ${MAX_AI_RETRIES} attempts: ${lastError}`,
    error: true
  };
}

function classifyWithAI_(meta) {
  const prompt = buildPrompt_(meta);
  const payload = {
    model: AI_MODEL,
    messages: [{ role: "user", content: prompt }],
    stream: false,
    temperature: 0.1,
    max_tokens: 150
  };

  const headers = {
    "Authorization": `Bearer ${AI_API_KEY}`,
    "Content-Type": "application/json"
  };

  let resp;
  try {
    resp = UrlFetchApp.fetch(AI_BASE_URL, {
      method: "post",
      payload: JSON.stringify(payload),
      headers,
      muteHttpExceptions: true,
      validateHttpsCertificates: true,
      // Note: `timeout` is not among the documented UrlFetchApp.fetch parameters and may be ignored.
      timeout: AI_TIMEOUT_MS
    });
  } catch (e) {
    throw new Error(`AI request network error: ${e}`);
  }

  const status = resp.getResponseCode();

  if (handleAIRateLimit_(resp)) {
    throw new Error(`AI rate limit (HTTP ${status})`);
  }

  if (status < 200 || status >= 300) {
    const errorBody = resp.getContentText().slice(0, 500);
    throw new Error(`AI HTTP ${status}: ${errorBody}`);
  }

  const text = resp.getContentText();
  if (!text) {
    throw new Error("AI returned empty response");
  }

  let pertains = false, reason = "default false";

  try {
    const json = JSON.parse(text);
    const content = (json.choices?.[0]?.message?.content) || "";

    if (!content) {
      throw new Error("AI response missing content");
    }

    const parsed = tryParseJSON_(content);

    if (parsed && typeof parsed.pertains === "boolean") {
      pertains = parsed.pertains;
      reason = parsed.reason || "AI decision";
    } else {
      // Fall back to a regex scan when the content is not valid JSON
      if (/"\s*pertains\s*"\s*:\s*true/i.test(content)) {
        pertains = true;
        reason = "AI indicated true";
      } else if (/"\s*pertains\s*"\s*:\s*false/i.test(content)) {
        pertains = false;
        reason = "AI indicated false";
      } else {
        throw new Error(`Could not parse AI response: ${content.slice(0, 200)}`);
      }
    }
  } catch (e) {
    throw new Error(`AI parse error: ${e}. Response: ${text.slice(0, 500)}`);
  }

  return { pertains, reason };
}

function buildPrompt_(meta) {
  return [
    "You must return EXACTLY one JSON object: { \"pertains\": true|false, \"reason\": \"explanation\" }",
    "",
    "Return pertains=true ONLY IF:",
    " A) Security/password alert (password reset, account locked, verification code)",
    " B) Scholarship AWARDED (not held, not consideration, not eligible)",
    " C) Financial aid offer explicitly READY to view",
    " D) Confirmation of student action (application received, enrollment confirmed)",
    " E) Accepted student info (portal, deposit) for schools APPLIED to",
    " F) Dual enrollment course info (registration, schedules)",
    "",
    "Return pertains=false for:",
    " - Marketing, newsletters, blogs, events",
    " - Unsolicited outreach (haven't applied)",
    " - Scholarship held/eligible/consideration",
    " - FAFSA reminders, financial aid applications",
    " - Priority deadline extensions",
    "",
    "When uncertain, return false.",
    "",
    `From: ${meta.from}`,
    `Subject: ${meta.subject}`,
    `Body: ${meta.body}`,
    "",
    "JSON response:"
  ].join("\n");
}

function enforceStrictRules_(meta, aiDecision) {
  const s = (meta.subject || "").toLowerCase();
  const b = (meta.body || "").toLowerCase();
  let pertains = aiDecision.pertains === true;
  let reason = aiDecision.reason || "AI decision";

  // OVERRIDE: Force relevant for security
  if (!pertains) {
    if (/\bpassword\s+reset\b|\bsecurity\s+alert\b|\bverification\s+code\b|\baccount\s+locked\b/.test(s + " " + b)) {
      pertains = true;
      reason = "OVERRIDE: Security alert";
    }
  }

  // OVERRIDE: Force NOT relevant for scholarship not awarded
  if (pertains && /scholarship/i.test(reason)) {
    if (/\bheld\s+for\s+you\b|\bconsideration\b|\beligible\b|\bmay\s+qualify\b/.test(s + " " + b)) {
      pertains = false;
      reason = "OVERRIDE: Scholarship not awarded";
    }
  }

  return { pertains, reason };
}

// ---------- Utilities ----------

function getOrCreateLabel_(name) {
  return GmailApp.getUserLabelByName(name) || GmailApp.createLabel(name);
}

function safeStr_(s, maxLen) {
  if (s === null || s === undefined) return "";
  s = s.toString().trim();
  if (maxLen && s.length > maxLen) return s.slice(0, maxLen);
  return s;
}

function tryParseJSON_(s) {
  if (!s) return null;

  try {
    return JSON.parse(s);
  } catch (e) {
    const codeBlockMatch = s.match(/```(?:json)?\s*(\{[\s\S]*?\})\s*```/);
    if (codeBlockMatch) {
      try {
        return JSON.parse(codeBlockMatch[1]);
      } catch (e2) {}
    }

    const i = s.indexOf("{");
    const j = s.lastIndexOf("}");
    if (i !== -1 && j !== -1 && j > i) {
      try {
        return JSON.parse(s.slice(i, j + 1));
      } catch (e3) {}
    }

    return null;
  }
}

function setupTriggers() {
  ScriptApp.getProjectTriggers().forEach(trigger => {
    if (trigger.getHandlerFunction() === "runTriage") {
      ScriptApp.deleteTrigger(trigger);
    }
  });

  ScriptApp.newTrigger("runTriage")
    .timeBased()
    .everyMinutes(10)
    .create();

  Logger.log("Trigger created: runTriage every 10 minutes");
}

function resetRateLimit() {
  const props = PropertiesService.getScriptProperties();
  props.deleteProperty(RATE_LIMIT_PROPERTY);
  props.deleteProperty(RATE_LIMIT_COUNT_PROPERTY);
  Logger.log("Rate limit reset");
}

function checkRateLimitStatus() {
  const props = PropertiesService.getScriptProperties();
  const count = getCurrentRateLimitCount_();
  const resetTime = props.getProperty(RATE_LIMIT_PROPERTY);

  Logger.log(`Rate limit: ${count}/${MAX_AI_CALLS_PER_HOUR} calls`);
  if (resetTime) {
    Logger.log(`Reset: ${new Date(parseInt(resetTime, 10)).toISOString()}`);
  } else {
    Logger.log("No active rate limit");
  }
}
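
`handleAIRateLimit_` accepts a `Retry-After` header given either as a number of seconds or as an HTTP date, and otherwise falls back to one hour. One way to make that logic testable outside Apps Script is to pull it into a pure function, sketched below. `parseRetryAfterMs` is a hypothetical helper that does not exist in the script, and the digits-only check for the seconds form is a stricter choice than the script's `parseInt`. The sketch relies on `Date.parse` returning `NaN` for unparseable input rather than throwing.

```javascript
// Hypothetical helper, not part of filter-hybrid.gscript.
// Returns the epoch-millisecond time at which AI calls may resume.
function parseRetryAfterMs(retryAfter, nowMs, fallbackMs) {
  if (retryAfter === undefined || retryAfter === null || retryAfter === "") {
    return nowMs + fallbackMs;
  }
  const value = String(retryAfter).trim();

  // Delta-seconds form, e.g. "120"
  if (/^\d+$/.test(value)) {
    return nowMs + parseInt(value, 10) * 1000;
  }

  // HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const parsed = Date.parse(value);
  if (!Number.isNaN(parsed)) {
    return parsed;
  }

  // Unparseable header: fall back instead of storing NaN
  return nowMs + fallbackMs;
}

const ONE_HOUR_MS = 60 * 60 * 1000;
console.log(new Date(parseRetryAfterMs("120", Date.now(), ONE_HOUR_MS)).toISOString());
```

Passing `nowMs` in as a parameter keeps the function deterministic, which is what lets it be checked with fixed inputs.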
