Technical Documentation: AI Calendar Tools
Architecture Overview
Core Components
main.py
├── OpenAI Client Setup (get_openai_client)
├── PDF Text Extraction (extract_text_from_pdf)
├── School Calendar Analyzer
│ ├── Two-Pass Analysis System
│ ├── Date Merging Logic
│ └── Date Inference Engine
├── Parenting Plan Analyzer
│ ├── Basic Analysis (SYSTEM_PROMPT)
│ └── Enhanced Analysis with Form Snapshot (ENHANCED_PARENTING_PLAN_PROMPT)
└── Drafting Audit Report Generator
AI Integration
The system uses OpenAI's GPT-4o model via two possible configurations:
- User API Key (OPENAI_API_KEY) - Direct connection to OpenAI, works in all environments
- Replit Integration (AI_INTEGRATIONS_OPENAI_API_KEY + AI_INTEGRATIONS_OPENAI_BASE_URL) - Works in both development and production
Priority: User API key is checked first, then Replit integration as fallback.
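The priority rule can be sketched as a pure function over environment variables. The helper name `resolve_openai_config` and its return shape are hypothetical and used only for illustration; the real get_openai_client() presumably builds a client object from whichever pair is selected.

```python
import os
from typing import Mapping, Optional


def resolve_openai_config(env: Optional[Mapping[str, str]] = None) -> Optional[dict]:
    """Pick the API key / base URL pair to use (hypothetical helper)."""
    env = os.environ if env is None else env

    # 1. A user-supplied key is checked first and connects to OpenAI directly.
    user_key = env.get("OPENAI_API_KEY")
    if user_key:
        return {"source": "user", "api_key": user_key, "base_url": None}

    # 2. Otherwise fall back to the Replit integration.
    #    No "localhost" check here: the base URL differs between dev and production.
    replit_key = env.get("AI_INTEGRATIONS_OPENAI_API_KEY")
    replit_url = env.get("AI_INTEGRATIONS_OPENAI_BASE_URL")
    if replit_key and replit_url:
        return {"source": "replit", "api_key": replit_key, "base_url": replit_url}

    # 3. Nothing configured.
    return None
```

Keeping the selection logic free of network calls makes it easy to test both environments without real credentials.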
Troubleshooting: AI Features Hanging in Production
Symptom: Features like "Extracting calendar dates..." or "Analyzing parenting plan..." spin indefinitely in production but work in development.
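Root Cause: The get_openai_client() function may contain a condition that only accepts the Replit AI integration when the base URL contains "localhost". That is true in development but not in production, so the client never initializes there.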
Fix (December 2024): Ensure the Replit integration check in get_openai_client() does NOT require "localhost" in the base URL. The condition should be:
# CORRECT - works in both dev and production:
if replit_api_key and replit_base_url:
# WRONG - only works in development:
if replit_api_key and replit_base_url and "localhost" in replit_base_url:
Why: In development, the base URL contains "localhost". In production, Replit provides a different proxy URL (e.g., https://proxy.replit.com/...). Both are valid.
Verification: Test with curl http://localhost:5000/test_openai - should return {"success": true}.
PDF Processing
Text extraction uses a two-tier approach:
- Primary: pdfplumber for native text extraction
- Fallback: pytesseract + pdf2image for OCR on scanned documents
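A minimal sketch of the two-tier approach follows. The function names, the `min_chars` threshold, and the rule "fall back when the primary tier yields too little text" are assumptions; the source does not say what triggers the OCR fallback.

```python
from typing import Callable


def extract_with_pdfplumber(pdf_path: str) -> str:
    """Primary tier: read the native text layer with pdfplumber."""
    import pdfplumber  # imported lazily so the fallback logic can be tested without it

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() can return None for pages without a text layer
        return "\n".join((page.extract_text() or "") for page in pdf.pages)


def extract_with_ocr(pdf_path: str) -> str:
    """Fallback tier: rasterize pages with pdf2image, then OCR with pytesseract."""
    import pytesseract
    from pdf2image import convert_from_path

    return "\n".join(
        pytesseract.image_to_string(image) for image in convert_from_path(pdf_path)
    )


def extract_text_two_tier(
    pdf_path: str,
    primary: Callable[[str], str] = extract_with_pdfplumber,
    fallback: Callable[[str], str] = extract_with_ocr,
    min_chars: int = 50,  # assumed threshold, not taken from the source
) -> str:
    """Use the primary extractor; fall back to OCR if it fails or returns little text."""
    try:
        text = primary(pdf_path)
    except Exception:
        text = ""
    if len(text.strip()) >= min_chars:
        return text
    return fallback(pdf_path)
```

Passing the extractors as parameters keeps the decision logic testable without installing Tesseract or Poppler.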
School Calendar Analyzer
Two-Pass Analysis System
The school calendar analyzer uses a two-pass system to maximize accuracy: the AI extracts raw dates, then deterministic Python code merges and normalizes them.
Pass 1: Raw Date Extraction (AI)
Function: extract_raw_calendar_dates(text)
Prompt: SCHOOL_CALENDAR_RAW_EXTRACTION_PROMPT
The AI extracts every marked date from the calendar with:
- Date and optional end date (for ranges)
- Label (exact text from calendar)
- Category (holiday, break, teacher_day, student_holiday, early_release, other)
- isStudentDayOff flag
- Visual indicator description (shading, colors, etc.)
Pass 2: Merge and Normalize (Python)
Function: merge_and_normalize_breaks(raw_result)
Deterministic Python logic that:
- Parses dates and filters for student days off
- Sorts entries chronologically
- Merges adjacent dates (gap ≤ 1 day, or Friday-Monday patterns with gap ≤ 3)
- Handles month boundary merging (end of month to start of next)
- Normalizes break names based on month and label content
Key Merging Logic
# Merge conditions (lines ~1707-1724 in main.py):
should_merge = False
# Adjacent or same day
if gap_days <= 1:
    should_merge = True
# Weekend spanning (Friday to Monday patterns)
elif gap_days <= 3:
    if current_end.weekday() == 4 and entry_start.weekday() == 0:  # Fri to Mon
        should_merge = True
    # ... additional weekend patterns
# Month boundary (late month to early next month)
if current_end.month != entry_start.month:
    if current_end.day >= 28 and entry_start.day <= 5:
        if gap_days <= 5:
            should_merge = True
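The conditions above can be wrapped into a small, self-contained sketch. It assumes `gap_days` is the number of days from the end of the current range to the start of the next entry, and it implements only the three conditions shown; the elided "additional weekend patterns" are not reproduced. The function names are illustrative.

```python
from datetime import date


def should_merge(current_end: date, entry_start: date) -> bool:
    """Return True if the next entry should be folded into the current range."""
    gap_days = (entry_start - current_end).days  # assumed definition of the gap
    # Adjacent or same day
    if gap_days <= 1:
        return True
    # Friday-to-Monday weekend span
    if gap_days <= 3 and current_end.weekday() == 4 and entry_start.weekday() == 0:
        return True
    # Month boundary: late in one month to early in the next
    if current_end.month != entry_start.month:
        if current_end.day >= 28 and entry_start.day <= 5 and gap_days <= 5:
            return True
    return False


def merge_ranges(ranges):
    """Merge (start, end) date tuples after sorting them chronologically."""
    merged = []
    for start, end in sorted(ranges):
        if merged and should_merge(merged[-1][1], start):
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged
```

With the Gwinnett County 2025-26 entries (Dec 22-31, Jan 1, Jan 2) as input, this sketch yields a single range from Dec 22 to Jan 2.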
Break Naming Convention
Function: get_break_name(month, label, is_multi_day)
| Month | Standard Name | Notes |
|---|---|---|
| October | Fall Break | Any multi-day break in October |
| November | Thanksgiving Break | Around Thanksgiving Day |
| December | Christmas Break | ANY break in December, even if labeled "Winter Break" |
| February | Winter Break | Usually around Presidents Day |
| March/April | Spring Break | Multi-day breaks in these months |
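The table can be read as a lookup keyed on month. The sketch below keeps the documented signature, but the exact conditions are assumptions: the Thanksgiving keyword check, the multi-day requirement for February, and returning the original label for anything the table does not cover.

```python
def get_break_name(month: int, label: str, is_multi_day: bool) -> str:
    """Map a calendar entry to its standard break name (illustrative sketch)."""
    text = label.lower()
    if month == 12:
        # ANY December break is Christmas Break, even if labeled "Winter Break"
        return "Christmas Break"
    if month == 11 and (is_multi_day or "thanksgiving" in text):
        return "Thanksgiving Break"
    if month == 10 and is_multi_day:
        return "Fall Break"
    if month == 2 and is_multi_day:
        return "Winter Break"
    if month in (3, 4) and is_multi_day:
        return "Spring Break"
    # Assumed fallback: keep the calendar's own label
    return label
```

Checking December first ensures a calendar's "Winter Break" label never leaks through for a December range.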
Student Holiday Detection
A date is marked as a student day off if any of these conditions are true:
- isStudentDayOff == true in the AI response
- Label or notes contain "student holiday" or "student day off"
- Category is teacher_day/teacher_planning AND (isStudentDayOff OR label contains "student holiday")
Key principle: A break ends on the last day BEFORE school resumes. Any Student Holiday following a break extends that break.
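The detection conditions can be expressed as one predicate over a raw entry. This is a sketch: the `notes` key and the dictionary shape are assumptions based on the field names mentioned above.

```python
def is_student_day_off(entry: dict) -> bool:
    """Return True if any documented student-day-off condition holds."""
    label = (entry.get("label") or "").lower()
    notes = (entry.get("notes") or "").lower()  # assumed key name
    category = entry.get("category") or ""

    # Condition 1: the AI flagged the day explicitly
    flagged = entry.get("isStudentDayOff") is True
    # Condition 2: label or notes mention a student holiday
    phrases = ("student holiday", "student day off")
    mentioned = any(p in label or p in notes for p in phrases)
    # Condition 3: teacher day that is also a student holiday.
    # Logically implied by conditions 1 and 2, kept to mirror the documented list.
    teacher_day = category in ("teacher_day", "teacher_planning") and (
        flagged or "student holiday" in label
    )
    return flagged or mentioned or teacher_day
```

A teacher planning day without the flag or the phrase is not a day off under these rules, which is why a "(Student Holiday)" suffix lost during PDF parsing matters.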
CRITICAL: Christmas Break Extension into January
Problem: This is the most common bug - failing to append January student holidays to Christmas Break.
Example from Gwinnett County 2025-26:
December 22-31: Winter Break (School Holidays)
January 1: Winter Break (School Holidays)
January 2: Teacher Planning/Staff Development [#8-9] (Student Holiday) <-- THIS MUST BE INCLUDED
CORRECT: Christmas Break = Dec 22 - Jan 2
WRONG: Christmas Break = Dec 22 - Jan 1 (missing Jan 2)
Root Cause: In PDF parsing, "(Student Holiday)" may appear on a separate line from "Teacher Planning", causing the AI to miss it.
Fix locations:
- SCHOOL_CALENDAR_RAW_EXTRACTION_PROMPT - Rules 11-14 explicitly handle January dates
- SCHOOL_CALENDAR_SYSTEM_PROMPT - "CHRISTMAS BREAK EXTENSION INTO JANUARY" section
- merge_and_normalize_breaks() - Python merge logic handles adjacent days
DO NOT MODIFY these sections without understanding the full extraction + merge pipeline.
Parenting Plan Analyzer
Two Analysis Modes
Basic Analysis (No Form Snapshot)
Function: analyze_with_openai(text)
Prompt: SYSTEM_PROMPT
Extracts scheduling information without school calendar context.
Enhanced Analysis (With Form Snapshot)
Function: analyze_with_openai(text, form_snapshot)
Prompt: ENHANCED_PARENTING_PLAN_PROMPT
When the frontend provides a form snapshot (including school calendar dates), the AI can:
- Apply date correction rules from the parenting plan
- Compute adjusted date ranges based on provisions like "begins when school dismisses"
- Return corrected dateFields with reasoning
Form Snapshot Structure
{
  "dateFields": [
    {
      "name": "christmas_break_start_even",
      "label": "Christmas Break Start (Even Years)",
      "currentValue": "2026-12-21"
    },
    // ... more date fields
  ],
  "schoolCalendar": {
    "schoolName": "Gwinnett County",
    "holidays": [...]
  }
}
Response Structure
{
  "parentA": "Mother's Name",
  "parentB": "Father's Name",
  "weeklySchedule": { ... },
  "holidaySchedule": {
    "christmasBreak": {
      "evenYears": { "option": "Parent A", "reasoning": "..." },
      "oddYears": { "option": "Parent B", "reasoning": "..." }
    },
    // ... other holidays
  },
  "summerSchedule": { ... },
  "detectedRules": [
    {
      "breakName": "Christmas Break",
      "provision": "begins when school dismisses for the break",
      "effect": "Start date adjusted to Friday before official start"
    }
  ],
  "correctedDateFields": [
    {
      "name": "christmas_break_start_even",
      "originalValue": "2026-12-21",
      "correctedValue": "2026-12-18",
      "reasoning": "School dismisses Friday Dec 18 for break starting Dec 21"
    }
  ],
  "shortSummary": "...",
  "longSummary": "...",
  "confidence": "high|medium|low"
}
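To illustrate how correctedDateFields relates to the form snapshot, here is a sketch that applies corrections by field name. The helper name, the `aiAdjusted` marker, and the check that originalValue still matches the current value are assumptions; the source does not describe how the frontend consumes the response.

```python
import copy


def apply_corrected_date_fields(form_snapshot: dict, response: dict) -> dict:
    """Return a copy of the snapshot with AI date corrections applied (sketch)."""
    corrections = {c["name"]: c for c in response.get("correctedDateFields", [])}
    updated = copy.deepcopy(form_snapshot)  # leave the caller's snapshot untouched
    for field in updated.get("dateFields", []):
        fix = corrections.get(field["name"])
        # Assumed safeguard: only apply if the AI saw the value we still hold
        if fix and fix.get("originalValue") == field.get("currentValue"):
            field["currentValue"] = fix["correctedValue"]
            field["aiAdjusted"] = True  # hypothetical marker for the UI
    return updated
```

Matching on `name` rather than position means the response can list corrections in any order.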
Date Correction UI
Fields with corrected values receive the CSS class ai-adjusted which triggers a pulse animation to draw attention to AI-corrected values.
Date Inference System
Function: infer_missing_years(calendar_result)
Purpose
School calendars typically show only one academic year. This function infers dates for adjacent years to support 24-month calendar generation.
Inference Methods
- Federal Holidays - Uses official rules (e.g., MLK Day = 3rd Monday in January)
- Template-Based Pattern Matching - For breaks like Fall Break and Winter Break (February), preserves exact weekday + week + duration pattern
- Christmas Break - Maintains relative position to Dec 25
- Thanksgiving - Anchors to 4th Thursday in November
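The "Nth weekday of a month" rule behind the federal holiday and Thanksgiving anchors can be sketched as follows. The signature matches get_nth_weekday_of_month as listed in this document, but the body is an illustration, not the project's code.

```python
from datetime import date, timedelta


def get_nth_weekday_of_month(year: int, month: int, weekday: int, n: int) -> date:
    """Return the nth occurrence (1-based) of weekday (Mon=0 .. Sun=6) in a month."""
    first = date(year, month, 1)
    # Days from the 1st to the first occurrence of the requested weekday
    offset = (weekday - first.weekday()) % 7
    result = first + timedelta(days=offset + 7 * (n - 1))
    if result.month != month:
        raise ValueError(f"{year}-{month:02d} has no occurrence #{n} of weekday {weekday}")
    return result


# MLK Day: 3rd Monday in January
mlk_2026 = get_nth_weekday_of_month(2026, 1, 0, 3)
# Thanksgiving: 4th Thursday in November
thanksgiving_2026 = get_nth_weekday_of_month(2026, 11, 3, 4)
```

For 2026 these evaluate to January 19 and November 26.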
Template-Based Break Inference (Critical Algorithm)
Function: infer_break_by_pattern(target_year, source_holiday)
What Gets Preserved
- Nth Full Week - Which full week of the month (e.g., 2nd full Monday-Sunday week of October)
- Weekday Offset - Which day of the week the break starts (e.g., Thursday = weekday 3)
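- Duration - The offset in days from the first to the last day of the break, computed as (source_end - source_start).days (e.g., 4 for a break that covers five calendar days)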
Algorithm
- Extract source start weekday: source_weekday = source_start.weekday() (Mon=0, Sun=6)
- Determine which Nth full week the source falls in: nth_full_week = get_nth_full_week(source_start)
- Calculate duration: duration = (source_end - source_start).days
- Find the Monday of that Nth full week in the target year: week_monday = get_nth_full_week_start(target_year, month, nth_full_week)
- Apply weekday offset: inferred_start = week_monday + timedelta(days=source_weekday)
- Apply duration: inferred_end = inferred_start + timedelta(days=duration)
Example: Fall Break
| Source (2025) | Pattern Extracted | Inferred (2026) |
|---|---|---|
| Oct 9-13, 2025 | Thursday of 1st full week, 4 days | Oct 8-12, 2026 |
Why Oct 8? October 2025 starts on a Wednesday, so its first full Monday-Sunday week starts Monday Oct 6, and Oct 9 is the Thursday of that week (the 1st full week, not the 2nd). October 2026 starts on a Thursday, so its first full week starts Monday Oct 5, and the Thursday of that week is Oct 8. Adding the 4-day duration gives Oct 12. Using the 2nd full week instead (starting Monday Oct 12, 2026) would give Thursday Oct 15, which does not match the source pattern.
Example: Winter Break (February)
| Source (2026) | Pattern Extracted | Inferred (2027) |
|---|---|---|
| Feb 12-16, 2026 | Thursday of 2nd full week, 4 days | Feb 11-15, 2027 |
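The algorithm can be sketched end to end as below. Two assumptions are baked in: full weeks are counted from the first Monday of the month, and the function takes plain start/end dates rather than the project's source_holiday object (whose structure is not documented here). The name `infer_break_dates` is hypothetical.

```python
from datetime import date, timedelta


def first_monday(year: int, month: int) -> date:
    """First Monday of the month, i.e. the start of its first full week."""
    first = date(year, month, 1)
    return first + timedelta(days=(0 - first.weekday()) % 7)


def get_nth_full_week(dt: date) -> int:
    """1-based index of the full week containing dt (0 if dt precedes the first Monday)."""
    return (dt - first_monday(dt.year, dt.month)).days // 7 + 1


def get_nth_full_week_start(year: int, month: int, nth: int) -> date:
    """Monday of the nth full week of the month."""
    return first_monday(year, month) + timedelta(weeks=nth - 1)


def infer_break_dates(target_year: int, source_start: date, source_end: date):
    """Project a break onto target_year, preserving week index, weekday, and duration."""
    source_weekday = source_start.weekday()           # Mon=0, Sun=6
    nth_full_week = get_nth_full_week(source_start)
    duration = (source_end - source_start).days
    week_monday = get_nth_full_week_start(target_year, source_start.month, nth_full_week)
    inferred_start = week_monday + timedelta(days=source_weekday)
    inferred_end = inferred_start + timedelta(days=duration)
    return inferred_start, inferred_end
```

Under these assumptions the sketch reproduces both inferred ranges in the example tables: Oct 8-12, 2026 from Oct 9-13, 2025, and Feb 11-15, 2027 from Feb 12-16, 2026.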
Key Functions
- get_nth_weekday_of_month(year, month, weekday, n) - Gets nth occurrence of weekday
- get_nth_full_week(dt) - Determines which full week a date falls in
- get_nth_full_week_start(year, month, nth) - Gets Monday of the Nth full week in a month
- infer_break_by_pattern(target_year, source_holiday) - Applies template to infer break dates
- get_federal_holiday_date(name, year) - Returns official federal holiday dates
- infer_christmas_break(year, source_holiday) - Special handling for Christmas Break
Marking Inferred Dates
Inferred entries include "inferred": true in their JSON and are displayed with a star symbol in the UI.
Visual Shading Extraction (Fallback)
Function: add_missing_breaks_from_shading(calendar_result, shading_info)
Purpose
Some school calendar PDFs have complex layouts (e.g., two-column formats) that produce garbled text when extracted. This function supplements text-based extraction by detecting visually shaded day cells in the calendar grid.
Two-Column Layout Handling
Function: extract_shading_from_calendar(pdf_path)
- Month headers are assigned to left or right page halves based on x-position
- Left half: x < page_width / 2
- Right half: x >= page_width / 2
- Day cells are associated with the nearest month header in their half
Weekend-Aware Gap Bridging (Critical Algorithm)
Business Rule: When reconstructing break date ranges from detected shaded days, gaps between detected days are ONLY bridged if ALL gap days fall on a weekend (Saturday or Sunday).
Rationale
Children don't attend school on weekends. If school is closed Friday and Monday, it's logically one continuous break (the weekend is implicitly included). However, if we detect Monday and Thursday as shaded, the gap includes Tuesday and Wednesday—actual school days—so those should NOT be merged.
Helper Function
from datetime import datetime

def is_weekend_gap(year, month, day1, day2):
    """Check if all days between day1 and day2 (exclusive) are weekends."""
    if day2 <= day1 + 1:
        return True  # Consecutive days, no gap
    for d in range(day1 + 1, day2):
        try:
            dt = datetime(year, month, d)
            if dt.weekday() < 5:  # Mon=0, Sun=6; < 5 means weekday
                return False
        except ValueError:
            # Day number does not exist in this month (e.g., Feb 30): do not bridge
            return False
    return True
Algorithm
- Group detected shaded days by month
- Sort days within each month
- Determine the year for each month from schoolYear (Jan-Jun = second year, Jul-Dec = first year)
- Build consecutive runs:
- If next day is immediately consecutive (d == end + 1) → extend run
- If gap contains ONLY weekend days → extend run
- If gap contains ANY weekday → start new run
- Create breaks from runs with at least 2 days
Example: February 2026
| Detected Days | Day of Week | Gap Analysis |
|---|---|---|
| Feb 6 | Friday | — |
| Feb 10 | Tuesday | Gap (7, 8, 9) = Sat, Sun, Mon → includes weekday → NEW RUN |
| Feb 12 | Thursday | Gap (11) = Wed → weekday → NEW RUN |
| Feb 13 | Friday | Consecutive → extend run |
| Feb 16 | Monday | Gap (14, 15) = Sat, Sun → all weekend → BRIDGE |
Result: Winter Break detected as February 12-16, 2026 (Thursday through Monday, bridging the weekend)
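The run-building steps can be sketched as follows, reusing the weekend-gap check. The names `year_for_month`, `build_runs`, and `breaks_from_runs` are illustrative, and two points are assumptions: the school year is passed as its first calendar year, and "at least 2 days" is read as at least two detected shaded days.

```python
from datetime import datetime


def is_weekend_gap(year, month, day1, day2):
    """True if every day strictly between day1 and day2 is a Saturday or Sunday."""
    if day2 <= day1 + 1:
        return True
    for d in range(day1 + 1, day2):
        try:
            if datetime(year, month, d).weekday() < 5:
                return False
        except ValueError:
            return False
    return True


def year_for_month(first_year, month):
    """Jan-Jun belong to the second year of the school year, Jul-Dec to the first."""
    return first_year + 1 if month <= 6 else first_year


def build_runs(year, month, days):
    """Group shaded day numbers into runs, bridging only weekend-only gaps."""
    runs = []
    for d in sorted(set(days)):
        if runs and is_weekend_gap(year, month, runs[-1][-1], d):
            runs[-1].append(d)   # consecutive day or weekend-only gap
        else:
            runs.append([d])     # gap contains a weekday: start a new run
    return runs


def breaks_from_runs(runs, min_days=2):
    """Keep runs with at least min_days detected days as (start_day, end_day)."""
    return [(run[0], run[-1]) for run in runs if len(run) >= min_days]


year = year_for_month(2025, 2)  # February of the 2025-26 school year
runs = build_runs(year, 2, [6, 10, 12, 13, 16])
winter_break = breaks_from_runs(runs)
```

For the February 2026 example this produces three runs (6; 10; 12, 13, 16) and a single break from the 12th to the 16th.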
Shading Detection Notes
- Detects cells with gray/dark background colors
- Aggregates individual digit characters into complete day numbers
- Associates each day number with the correct month based on column position
- This is a fallback mechanism—primary extraction is text-based AI analysis
Early Release Days (Critical Exclusion)
Business Rule: Early Release days (e.g., "Early Release for High School Exams") are days when students are still in school but dismissed early. These should NEVER be merged with adjacent breaks.
Example
| Date | Label | Is Day Off? |
|---|---|---|
| Dec 17-19 | Early Release for High School Exams | NO - students still in school |
| Dec 22-31 | Winter Break (School Holidays) | YES - Christmas Break |
Result: Christmas Break starts December 22, NOT December 17.
Implementation
- AI prompt explicitly instructs: Early Release = isStudentDayOff: false and category: "early_release"
- Merge logic skips entries with "early release" in the label (continue statement)
- Early Release entries are not added to the date_entries list that feeds the merge algorithm
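The exclusion can be sketched as a filter that runs before merging. The entry shape and the helper name `build_date_entries` are assumptions; only the "early release" label check and the `continue` are taken from the description above, and the extra category check is an added safeguard.

```python
def build_date_entries(raw_entries):
    """Collect entries for the merge step, skipping Early Release days (sketch)."""
    date_entries = []
    for entry in raw_entries:
        label = (entry.get("label") or "").lower()
        # Students are still in school on Early Release days: never merge them
        if "early release" in label or entry.get("category") == "early_release":
            continue
        date_entries.append(entry)
    return date_entries
```

Because Early Release days never reach the merge list, the adjacency rules cannot pull Dec 17-19 into a break that starts Dec 22.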
Date Range Boundary Rules (Critical)
Business Rule: When a calendar entry shows a date range like "6-10 Spring Break", the extraction must use ONLY those dates (April 6-10), even if other numbers appear in nearby text (e.g., "13 Students Return").
Problem Scenario
| Calendar Text | Incorrect Extraction | Correct Extraction |
|---|---|---|
| "6-10 Spring Break (School Holidays) ... 13 Students Return" | April 6-13 (wrong!) | April 6-10 (correct) |
The "13" from "Students Return" is unrelated—it indicates when school resumes, not the break end date.
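In the source this rule is enforced through the AI prompt. As an illustration only, a deterministic check for the same boundary could look like the sketch below; the function `parse_hyphen_range` is hypothetical and handles only ranges that stay within one month.

```python
import re
from typing import Optional, Tuple

# A day range at the very start of the entry, e.g. "6-10 Spring Break"
_RANGE_AT_START = re.compile(r"^\s*(\d{1,2})\s*-\s*(\d{1,2})\b")


def parse_hyphen_range(entry_text: str) -> Optional[Tuple[int, int]]:
    """Return (start_day, end_day) from a leading hyphen-bound range, else None."""
    match = _RANGE_AT_START.match(entry_text)
    if not match:
        return None
    start, end = int(match.group(1)), int(match.group(2))
    # Later numbers in the text (e.g. "13 Students Return") are deliberately ignored
    return (start, end) if start <= end else None
```

Anchoring the pattern to the start of the entry is what keeps the trailing "13" out of the range.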
AI Prompt Rules (19-22)
- Rule 19: Only use the hyphen-bound date range from the label
- Rule 20: Do NOT extend ranges by including unrelated numbers from nearby entries
- Rule 21: Use EXACTLY the stated start/end days when hyphen notation is present (e.g., "22-31", "6-10")
- Rule 22: When shading shows specific days, use ONLY those shaded days as the range
Validation Priority
- Explicit hyphen-bound range in text (e.g., "6-10") → use exactly those days
- Visual shading confirmation → cross-reference with text extraction
- If text and shading conflict → prefer shading (visual truth over OCR errors)
Current/Ongoing Break Handling (Critical)
Business Rule: When extracting a school calendar during an ongoing break (e.g., running the analyzer on Dec 26 during Christmas break), the system may produce three occurrences of the same break type, two of which share the same even/odd parity and therefore compete for the same form slot.
Problem Scenario
| Uploaded Calendar | Inferred Breaks | Issue |
|---|---|---|
| 2026-27 School Year (Christmas Dec 21-31, 2026) | Dec 15, 2025 - Jan 1, 2026 (odd); Dec 20-27, 2027 (odd) | Both are odd years → one overwrites the other |
Solution: Consolidated Current Break Fields
- Three hidden form fields store the ongoing break: currentBreakStart, currentBreakEnd, currentBreakType
- Only one break can be active at a time, so a single set of fields suffices
- applyExtractedDates() checks if today falls within a break's date range
- If a slot conflict occurs (same even/odd parity), the ongoing break is stored in "current" fields
- handleCurrentBreak() uses currentBreakType to look up the correct parent selector
Flow
- Extract holidays from uploaded calendar
- Infer missing years (backward and forward)
- For each break, check if today falls within the date range
- If two breaks share the same even/odd slot, store the current break separately
- During calendar generation, apply both even/odd breaks AND current breaks
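The slot-conflict logic appears to live in frontend functions, so the following Python sketch only models the decision. It assumes parity is taken from the break's start year (consistent with the table above, where a break starting in December 2025 is labeled odd); the function names are hypothetical.

```python
from datetime import date


def parity_of(d: date) -> str:
    """Even/odd slot for a break, based on its start year (assumption)."""
    return "even" if d.year % 2 == 0 else "odd"


def assign_break_slots(breaks, today: date) -> dict:
    """Place (start, end) breaks of one type into even, odd, and current slots."""
    slots = {"even": None, "odd": None, "current": None}
    for brk in sorted(breaks):
        start, end = brk
        parity = parity_of(start)
        # Does another break of this type compete for the same even/odd slot?
        conflict = any(o != brk and parity_of(o[0]) == parity for o in breaks)
        if conflict and start <= today <= end:
            slots["current"] = brk   # ongoing break is stored separately
        else:
            slots[parity] = brk      # later breaks overwrite earlier ones
    return slots
```

With the three Christmas breaks from the problem scenario and today set to Dec 26, 2025, the ongoing 2025 break lands in the current slot and the 2027 break keeps the odd slot.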
API Endpoints
POST /extract_school_calendar
Analyzes uploaded school calendar PDF.
| Input | PDF file (multipart form) |
|---|---|
| Process | Extract text → Two-pass analysis → Infer missing years |
| Output | JSON with holidays array, omittedHolidays, metadata |
POST /analyze_document
Analyzes uploaded parenting plan PDF.
| Input | PDF file + optional formSnapshot (JSON string) |
|---|---|
| Process | Extract text → AI analysis (basic or enhanced) |
| Output | JSON with scheduling information, date corrections, summaries |
POST /generate_audit_report
Generates drafting audit of parenting plan.
| Input | PDF file |
|---|---|
| Process | Extract text → Audit analysis with specialized prompt |
| Output | JSON with findings categorized by severity |
Key Conventions
Break Naming Rules (Critical)
- Christmas Break = ANY break in December (even if document says "Winter Break")
- Winter Break = Breaks in February only
- Fall Break = October breaks (even if labeled differently)
Date Formatting
All dates use YYYY-MM-DD format (ISO 8601).
Break End Date Logic
A break ends on the last day BEFORE school resumes. If a Teacher Planning Day (Student Holiday) follows a break, it extends that break.
File Size Limits
PDF uploads are limited to 10MB.
Error Handling
API endpoints track the current step and return failed_at_step in error responses for debugging.
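The step-tracking pattern can be sketched without any web framework. The `run_pipeline` helper and the `error` key are assumptions; only `failed_at_step` comes from the description above.

```python
def run_pipeline(steps, payload):
    """Run (name, callable) steps in order and report where a failure occurred."""
    current_step = None
    try:
        for name, func in steps:
            current_step = name          # remember the step before running it
            payload = func(payload)
        return {"success": True, "result": payload}
    except Exception as exc:
        return {
            "success": False,
            "error": str(exc),           # assumed key name
            "failed_at_step": current_step,
        }
```

Recording the step name before each call means the error response identifies the stage (for example text extraction versus AI analysis) even when the exception message is generic.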
Protected Files
Per user preferences, do not modify without explicit approval:
- auth.py
- payments.py
- models.py
- config.py
- extensions.py