SBPI Prediction Pipeline — Full Stack Execution

April 2, 2026  |  ShurAI Internal Technical Report

Session Overview

In a single session, we executed the complete SBPI prediction pipeline from infrastructure recovery through live prediction generation. Six interdependent steps were completed: config deployment, critical source code recovery, knowledge graph expansion, W14 prediction generation, signal weight optimization, and brand intelligence card creation. The pipeline is now fully operational with a dual-config test locked for W14 evaluation.

Key Metrics

  • 4,052 RDF triples in Oxigraph (up from 1,672)
  • 105 W14 predictions locked (5 methods × 21 companies)
  • 4 weeks of data loaded (W10–W13)
  • 66.7% signal weight optimization score (Optuna TPE best score)
  • 21 brand intel cards (deployed and live)
  • 4 live sites deployed (Cloudflare Pages)

Pipeline Steps Completed

1. Infrastructure Recovery

Decompiled the missing sbpi_to_rdf.py from bytecode. Worked around the Oxigraph store lock by loading data through SPARQL INSERT DATA batches.

2. Knowledge Graph Expansion

Loaded 4 weeks of data (W10–W13), growing the store from 1,672 to 4,052 triples (+142%).

3. Prediction Pipeline Upgrade

Added kg_optimized as 5th prediction method. Dual-config test locked for W14.

4. W14 Predictions Generated

105 predictions locked across 5 methods and 21 companies.

5. Signal Weight Optimization

The Optuna TPE optimizer (30 trials) achieved a best appropriate_rate of 66.7% on 75 synthetic labels.

6. Brand Intelligence Cards

21 sortable company intelligence cards deployed to sbpi-brand-intel.pages.dev.

Infrastructure Recovery

Two critical infrastructure issues were resolved before the pipeline could execute.

sbpi_to_rdf.py Recovery

The source file for the RDF ETL pipeline was missing. Only a compiled .pyc existed in __pycache__. An agent decompiled the bytecode, recovering all 7 functions and 63 module-level names with exact fidelity. The reconstructed script was verified to generate 1,033 triples from W13 data.

A null-safety patch was applied for archive files with missing previous_composite fields.
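
A minimal sketch of what such a null-safety guard could look like. The record layout and the helper names safe_delta and score_fields are assumptions for illustration; only the previous_composite field name comes from the report, and the recovered script may handle this differently.

```python
from typing import Optional


def safe_delta(composite: float, previous: Optional[float]) -> Optional[float]:
    """Return the week-over-week delta, or None when no previous score exists."""
    if previous is None:
        return None
    return round(composite - previous, 2)


def score_fields(record: dict) -> dict:
    """Extract score fields from one company record (hypothetical layout)."""
    composite = float(record["composite"])
    # Some archive files omit previous_composite (or store null); .get covers both.
    raw_previous = record.get("previous_composite")
    previous = float(raw_previous) if raw_previous is not None else None
    return {
        "composite": composite,
        "previous": previous,
        "delta": safe_delta(composite, previous),
    }


# An archive record without previous_composite yields no delta instead of raising.
print(score_fields({"composite": 82.75}))
print(score_fields({"composite": 65.40, "previous_composite": 62.25}))
```

Returning None rather than 0.0 keeps "no previous week" distinguishable from "no change", so downstream code can skip the previous and delta triples for that record.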

Recovered Functions

    # 7 functions recovered from bytecode decompilation
    def load_state_file(path)
    def state_to_graph(state, week_label)
    def link_weeks(graph, weeks)
    def validate_graph(graph)
    def load_to_oxigraph(graph, endpoint)
    def run_sample_queries(endpoint)
    def main()

CLI Interface

    # Available command-line options
    --current          # Process current week only
    --all              # Process all available weeks
    --validate         # Run graph validation checks
    --serve            # Start local Oxigraph server
    --output-turtle    # Export as Turtle format

Oxigraph Store Lock Resolution

The Oxigraph server (PID 2674) held an exclusive lock on the store, blocking the standard pyoxigraph file-access pattern.

Resolution Path

  1. Generated N-Triples via rdflib from the state files
  2. Loaded via SPARQL INSERT DATA batches (50 triples per batch) through the HTTP endpoint on port 7878
  3. The /store POST endpoint accepted data (HTTP 201) but didn't persist
  4. SPARQL INSERT DATA was the working path that persisted triples reliably
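
The batching approach above can be sketched with the standard library alone. This is an illustration, not the session's actual loader: the localhost host name and the /update path are assumptions (the report only states port 7878 and that SPARQL INSERT DATA over HTTP worked), and the function names are hypothetical.

```python
import urllib.request
from typing import Iterator, List

BATCH_SIZE = 50  # triples per INSERT DATA request, as in the session
UPDATE_URL = "http://localhost:7878/update"  # assumed host and path


def batches(lines: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` N-Triples lines."""
    for start in range(0, len(lines), size):
        yield lines[start:start + size]


def insert_data_update(triples: List[str]) -> str:
    """Wrap N-Triples lines (each ending in ' .') in one INSERT DATA update."""
    return "INSERT DATA {\n" + "\n".join(triples) + "\n}"


def post_update(update: str, url: str = UPDATE_URL) -> int:
    """POST one SPARQL update and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def load_ntriples(text: str) -> int:
    """Send every non-empty, non-comment N-Triples line in batches."""
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    for batch in batches(lines):
        post_update(insert_data_update(batch))
    return len(lines)
```

At 50 triples per batch, the 1,033 triples generated from W13 would take 21 requests. One caveat of this design: blank node labels are scoped to a single update, so a blank node that spans two batches would be stored as two different nodes.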

Prediction Pipeline Upgrade

The prediction experiment script was upgraded from 4 methods to 5, adding the Optuna-tuned kg_optimized configuration for a dual-config head-to-head test.

Method Comparison

| Method | Description | W14 Predictions |
|---|---|---|
| persistence | Predicts no change (delta = 0) | 21 STABLE |
| naive_momentum | Predicts delta = last week's delta | 8 UP, 8 STABLE, 5 DOWN |
| mean_reversion | Predicts reversion toward tier midpoint | 21 UP |
| kg_default | Original hardcoded thresholds (untuned) | 21 STABLE |
| kg_optimized | Optuna-tuned 12-parameter config | 13 UP, 8 STABLE |
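
The three baseline methods in the table can be sketched as follows. This is a simplified reading of the one-line descriptions, not the experiment script itself: the function signatures are hypothetical, and reusing 0.5 as the direction threshold and 0.1 as the reversion rate for the baselines is an assumption.

```python
from typing import Sequence


def direction(delta: float, threshold: float = 0.5) -> str:
    """Map a predicted delta to UP / DOWN / STABLE with a symmetric dead band."""
    if delta > threshold:
        return "UP"
    if delta < -threshold:
        return "DOWN"
    return "STABLE"


def persistence(history: Sequence[float]) -> float:
    """Predict no change from the latest composite score."""
    return 0.0


def naive_momentum(history: Sequence[float]) -> float:
    """Predict that next week's delta equals last week's delta."""
    if len(history) < 2:
        return 0.0  # not enough history to compute a delta
    return history[-1] - history[-2]


def mean_reversion(history: Sequence[float], tier_midpoint: float,
                   rate: float = 0.1) -> float:
    """Predict a move of `rate` times the gap to the tier midpoint."""
    return rate * (tier_midpoint - history[-1])


# Illustrative inputs only; these are not scores from the knowledge graph.
history = [60.0, 62.5]
for name, delta in [
    ("persistence", persistence(history)),
    ("naive_momentum", naive_momentum(history)),
    ("mean_reversion", mean_reversion(history, tier_midpoint=70.0)),
]:
    print(f"{name}: {delta:+.2f} {direction(delta)}")
```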

Dual-Config Test

kg_optimized uses the Optuna TPE-optimized parameters from best-config.json (69.9% training accuracy on W10–W12 data). It imports load_best_config() and predict_with_config() from kg_interface_optimizer.py and falls back to the default config if best-config.json is missing.
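
The fallback behavior might look roughly like the sketch below. It does not reproduce load_best_config() from kg_interface_optimizer.py, whose signature the report does not show; load_config_or_default and the flat JSON layout are assumptions, and the default values are the four defaults listed in this report.

```python
import json
from pathlib import Path
from typing import Tuple

# Defaults taken from the "Key Optimized Parameters" table in this report.
DEFAULT_CONFIG = {
    "direction_threshold": 0.5,
    "confidence_base": 0.6,
    "mean_reversion_rate": 0.1,
    "anomaly_contributes": False,
}


def load_config_or_default(path: str = "best-config.json") -> Tuple[dict, str]:
    """Return (config, source); source is 'optimized' or 'default'."""
    config_path = Path(path)
    if not config_path.exists():
        # Missing file: run with the untuned defaults instead of failing.
        return dict(DEFAULT_CONFIG), "default"
    with config_path.open(encoding="utf-8") as handle:
        tuned = json.load(handle)
    # Tuned values override defaults; keys absent from the file keep defaults.
    return {**DEFAULT_CONFIG, **tuned}, "optimized"


config, source = load_config_or_default()
print(source, config["direction_threshold"])
```

Returning the source label alongside the config makes it possible to record which configuration actually produced a locked prediction, which matters for a head-to-head test.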

Key Optimized Parameters

| Parameter | Default | Optimized | Change |
|---|---|---|---|
| direction_threshold | 0.500 | 1.295 | +159% |
| confidence_base | 0.600 | 0.443 | −26% |
| mean_reversion_rate | 0.100 | 0.257 | +157% |
| anomaly_contributes | false | true | enabled |
| divergence_weight |  | 0.180 | new |
| tier_proximity_weight |  | 0.096 | new |
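
The Change column for the numeric rows is a relative change against the default. A short check, using only values from the table (the helper name percent_change is illustrative):

```python
def percent_change(default: float, optimized: float) -> int:
    """Relative change from default to optimized, rounded to a whole percent."""
    return round(100.0 * (optimized - default) / default)


rows = {
    "direction_threshold": (0.500, 1.295),
    "confidence_base": (0.600, 0.443),
    "mean_reversion_rate": (0.100, 0.257),
}
for name, (default, optimized) in rows.items():
    print(f"{name}: {percent_change(default, optimized):+d}%")
```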

Sample Prediction Comparison — Amazon W14

| Method | Direction | Delta | Confidence |
|---|---|---|---|
| kg_default | STABLE | 0.00 | 0.50 |
| kg_optimized | UP | +1.99 | 0.95 |
| naive_momentum | DOWN | −2.60 | 0.55 |
| mean_reversion | UP | +0.78 | 0.45 |

Knowledge Graph Expansion

  • Previous triples: 1,672
  • Current triples: 4,052
  • Growth: +142%

Data Loaded — 4 Weeks

| Week | Score Records | Status |
|---|---|---|
| W10-2026 | 16 | Archive |
| W11-2026 | 17 | Archive |
| W12-2026 | 21 | Archive |
| W13-2026 | 21 | Current |

Triple Composition Per Company

| Node type | Approx. triples | Contents |
|---|---|---|
| Company entity | ~10 | type, slug, name, vertical, geography, roles |
| ScoreRecord | ~8 | type, forCompany, forWeek, composite, previous, delta, tier |
| DimensionScore | ~15 | 5 dimensions × 3 triples each |
| Signal | ~3–4 | type, text, URL |
| Attestation | ~6 | type, confidence, source, provenance |
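
To make the ScoreRecord row concrete, here is a sketch that emits N-Triples lines for one record in plain Python. The http://example.org/sbpi# namespace, the IRI patterns, the slug, and the tier label are all hypothetical; only the property names (forCompany, forWeek, composite, previous, delta, tier) come from the list above, and the real script builds its graph with rdflib rather than string formatting.

```python
from typing import List, Optional

SBPI = "http://example.org/sbpi#"  # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def iri(value: str) -> str:
    """Format an absolute IRI as an N-Triples term."""
    return f"<{value}>"


def decimal(value: float) -> str:
    """Format a number as an xsd:decimal literal with two decimals."""
    return f'"{value:.2f}"^^<{XSD_DECIMAL}>'


def score_record_triples(slug: str, week: str, composite: float,
                         previous: Optional[float], tier: str) -> List[str]:
    """Return N-Triples lines for one ScoreRecord."""
    record = iri(f"{SBPI}score/{slug}/{week}")
    triples = [
        (record, iri(RDF_TYPE), iri(SBPI + "ScoreRecord")),
        (record, iri(SBPI + "forCompany"), iri(f"{SBPI}company/{slug}")),
        (record, iri(SBPI + "forWeek"), iri(f"{SBPI}week/{week}")),
        (record, iri(SBPI + "composite"), decimal(composite)),
        (record, iri(SBPI + "tier"), f'"{tier}"'),
    ]
    if previous is not None:
        # previous and delta are only emitted when a prior score exists.
        triples.append((record, iri(SBPI + "previous"), decimal(previous)))
        triples.append((record, iri(SBPI + "delta"), decimal(composite - previous)))
    return [f"{s} {p} {o} ." for s, p, o in triples]


for line in score_record_triples("jiohotstar", "W13-2026", 65.40, 62.25, "tier-x"):
    print(line)
```

Under these assumptions a full record yields 7 triples and a record without a previous score yields 5, which is one way a per-company triple count can vary between archive weeks.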

SPARQL Verification — W13 Top 5

| Rank | Company | Composite Score |
|---|---|---|
| 1 | DramaBox | 82.75 |
| 2 | ReelShort | 81.20 |
| 3 | Disney | 77.10 |
| 4 | iQiYi | 67.30 |
| 5 | JioHotstar | 65.40 |

Biggest W13 Movers

| Company | Delta |
|---|---|
| Amazon | +4.05 |
| JioHotstar | +3.15 |
| COL/BeLive | +2.70 |
| Both Worlds | +2.65 |
| GammaTime | +2.35 |

21 distinct companies confirmed across the knowledge graph via SPARQL query.
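
A verification query of this kind could be shaped like the sketch below. The prefix, property names, and week IRI are hypothetical stand-ins, since the report does not show the ontology; the parser assumes the standard SPARQL 1.1 JSON results layout, and the sample payload reuses the first two rows of the W13 table above.

```python
import json
from typing import List, Tuple

# Hypothetical vocabulary; the real ontology IRIs are not shown in the report.
TOP5_QUERY = """
PREFIX sbpi: <http://example.org/sbpi#>
SELECT ?name ?composite WHERE {
  ?record a sbpi:ScoreRecord ;
          sbpi:forCompany ?company ;
          sbpi:forWeek <http://example.org/sbpi#week/W13-2026> ;
          sbpi:composite ?composite .
  ?company sbpi:name ?name .
}
ORDER BY DESC(?composite)
LIMIT 5
"""


def parse_rows(payload: str) -> List[Tuple[str, float]]:
    """Extract (name, composite) pairs from a SPARQL JSON results document."""
    data = json.loads(payload)
    return [
        (binding["name"]["value"], float(binding["composite"]["value"]))
        for binding in data["results"]["bindings"]
    ]


sample = json.dumps({
    "head": {"vars": ["name", "composite"]},
    "results": {"bindings": [
        {"name": {"type": "literal", "value": "DramaBox"},
         "composite": {"type": "literal", "value": "82.75"}},
        {"name": {"type": "literal", "value": "ReelShort"},
         "composite": {"type": "literal", "value": "81.20"}},
    ]},
})
print(parse_rows(sample))
```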

Signal Weight Optimization

Track C: Signal Weighting Research Program  |  Experiment 3

Research Question

What parameters control the threshold between "this signal warrants a mitigation recommendation" and "this signal is expected volatility within a functioning strategy"?

Optimizer Configuration

  • Optimizer: Optuna TPE
  • Trials: 30
  • Best score: 66.7%
  • Synthetic labels: 75
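
One way to picture the objective being maximized is as the share of synthetic labels on which a flagging rule agrees with the label. The sketch below is an assumption-heavy illustration: LabeledSignal, flags_signal, the volatility field, and the dampening formula are hypothetical, and only the parameter names materiality_threshold and volatility_dampener come from this report.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LabeledSignal:
    delta: float        # week-over-week composite change
    volatility: float   # hypothetical 0..1 measure of historical volatility
    should_flag: bool   # synthetic label: does this warrant a recommendation?


def flags_signal(signal: LabeledSignal, materiality_threshold: float,
                 volatility_dampener: float) -> bool:
    """Flag when the dampened absolute delta reaches the materiality threshold."""
    dampened = abs(signal.delta) * (1.0 - volatility_dampener * signal.volatility)
    return dampened >= materiality_threshold


def appropriate_rate(signals: List[LabeledSignal], materiality_threshold: float,
                     volatility_dampener: float) -> float:
    """Fraction of signals where the rule's decision matches the label."""
    matches = sum(
        flags_signal(s, materiality_threshold, volatility_dampener) == s.should_flag
        for s in signals
    )
    return matches / len(signals)


# Three made-up labels, only to exercise the functions.
labels = [
    LabeledSignal(delta=3.0, volatility=0.0, should_flag=True),
    LabeledSignal(delta=3.0, volatility=1.0, should_flag=False),
    LabeledSignal(delta=1.0, volatility=0.0, should_flag=True),
]
print(appropriate_rate(labels, 2.000, 0.300))  # default parameters
print(appropriate_rate(labels, 2.362, 0.658))  # optimized parameters
```

In an Optuna study, a function like appropriate_rate would be wrapped in an objective that draws each parameter with trial.suggest_float and is passed to study.optimize with n_trials=30 and direction="maximize". As plain arithmetic, a rate of 66.7% on 75 labels corresponds to 50 matching labels.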

Optimized Parameters

| Parameter | Default | Optimized | Change |
|---|---|---|---|
| materiality_threshold | 2.000 | 2.362 | +18% |
| structural_change_weight | 1.500 | 2.286 | +52% |
| competitor_action_weight | 1.300 | 1.806 | +39% |
| tier_proximity_sensitivity | 3.000 | 5.902 | +97% |
| multi_dimension_threshold | 2 | 2 | +0 |
| trajectory_window | 4 | 8 | +4 weeks |
| volatility_dampener | 0.300 | 0.658 | +119% |

Interpretation

The optimizer learned to raise the materiality threshold (+18%), weight structural changes more heavily (+52%), extend the trajectory lookback window from 4 to 8 weeks, and increase the volatility dampener significantly (+119%) — meaning historically volatile companies get reduced urgency.

Tier proximity sensitivity nearly doubled (+97%), meaning the system becomes much more alert when companies approach tier boundaries.

Trial Distribution

  • Mean: 62.2%
  • Std dev: 2.4%
  • 95% CI: 61.3% – 63.2%
  • Best: 66.7%

Objective: appropriate_rate (maximized), plotted on a 0% to 100% scale.
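
For reference, a normal-approximation confidence interval for the mean can be computed from the summary statistics as sketched below. The report does not say how its interval was derived (a t-interval or bootstrap would differ slightly), so this is an assumption; with the rounded mean and standard deviation it reproduces the lower bound and lands within about 0.1 of the reported upper bound.

```python
import math
from typing import Tuple


def normal_ci95(mean: float, std_dev: float, n: int) -> Tuple[float, float]:
    """95% confidence interval for the mean using the normal approximation."""
    half_width = 1.96 * std_dev / math.sqrt(n)
    return mean - half_width, mean + half_width


low, high = normal_ci95(mean=62.2, std_dev=2.4, n=30)
print(f"{low:.2f}% to {high:.2f}%")
```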

Deployed Deliverables

4 sites deployed during this session, all on Cloudflare Pages.

| Site | URL | Account | Content |
|---|---|---|---|
| W13 Editorial | microco-weekly-editorial-bja-8zm.pages.dev | weareshur | 8-tab weekly report with corrected framing |
| W13 Validation | sbpi-w13-validation-3y2.pages.dev | weareshur | Prediction validation with back-to-editorial nav |
| Brand Intel Cards | sbpi-brand-intel.pages.dev | weareshur | 21 sortable company intelligence cards |
| Autoresearch Status | sbpi-autoresearch-status.pages.dev | getsteady | 13-day pipeline activity report |

Editorial Corrections Applied

  • Changed "KG-augmented is broken, not wrong" to "KG-augmented — first tuning pass pending"
  • Changed "predicts stable for everything because the optimized config has not been deployed" to "ran with default (untuned) parameters; an optimized config deploys for W14"
  • Updated all validation report links to new weareshur URLs
  • Added back-to-editorial navigation on validation report