Reading Results

This page answers two questions:

Did the run work?
What did SemaTune actually change?

The snippets below are sanitized examples that keep the shape of real output. Paths use results/<run>/... in place of machine-specific absolute paths.

Optimization History Entry

{
  "iteration": 1,
  "parameters": {
    "min_granularity_ns": 3000000
  },
  "metrics": {
    "throughput": 1481.85,
    "goodput": 1481.85,
    "latency_p99": 2.79,
    "interval_reporting": true,
    "intervals_file": "results/sysbench_cpu_fixed/sysbench_cpu_windows/window_1/sysbench_intervals.json",
    "continuous_log_file": "results/sysbench_cpu_fixed/sysbench_cpu_windows/continuous_sysbench.log",
    "instructions_per_cycle": 0.42
  },
  "reward": 1481.85,
  "constraint_violated": false,
  "post_tuning_phase": false,
  "tuner_timing": {
    "parameters_applied": true,
    "target_iteration": 2,
    "justification": null,
    "token_metrics": null
  }
}

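Entries like this one sit in the file's history array, which the jq commands on this page read as .history[]. The top level of the file also includes best_parameters, best_reward, iterations, total_time, timestamp, and terminated_reason.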
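To see what changed from one window to the next, the history can be walked pairwise. The following is a minimal Python sketch, not part of SemaTune. It assumes the entries live under a top-level history key and that each entry carries iteration and parameters as in the sample above. parameter_changes and load_history are hypothetical helper names, and the values in the demo are illustrative.

```python
import json
from pathlib import Path


def parameter_changes(history):
    """Yield (iteration, name, old, new) for every parameter that differs
    from the previous history entry. The first entry has no predecessor."""
    previous = None
    for entry in history:
        current = entry.get("parameters") or {}
        if previous is not None:
            for name in sorted(set(previous) | set(current)):
                old, new = previous.get(name), current.get(name)
                if old != new:
                    yield entry.get("iteration"), name, old, new
        previous = current


def load_history(path):
    """Read one optimization_history_*.json file and return its entries."""
    doc = json.loads(Path(path).read_text(encoding="utf-8"))
    return doc.get("history", [])  # assumption: entries are stored under "history"


if __name__ == "__main__":
    # Illustrative entries shaped like the samples on this page.
    sample = [
        {"iteration": 1, "parameters": {"min_granularity_ns": 3000000}},
        {"iteration": 2, "parameters": {"min_granularity_ns": 10000000}},
    ]
    for change in parameter_changes(sample):
        print(change)
```

A parameter that appears or disappears between entries is reported with None on the missing side, so added and dropped knobs show up alongside changed values.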

LLM Response Index Entry

{
  "iteration": 2,
  "agent": "single",
  "provider": "gemini",
  "model": "gemma-4-31b-it",
  "log_file": "results/sysbench_cpu_llm/llm_api_logs/llm_api_iter0002_single_20260512_183912_877.txt",
  "response_text": "Analysis: ...\nConfig: {\"min_granularity_ns\": 10000000, \"converged\": false}",
  "response_parsed": {
    "min_granularity_ns": 10000000,
    "converged": false
  },
  "request_config": {
    "temperature": 0.2,
    "thinking_config": {
      "thinking_level": "HIGH"
    }
  }
}

If response_parsed is null, inspect the full log file named in log_file. The model may have returned text that fallback parsing could still use, even though the compact index did not record it as structured JSON.
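Finding those entries by hand gets tedious on long runs. The sketch below is one way to list them. It assumes the index is a JSONL stream with one record per line (as llm_api_logs/llm_responses.jsonl appears to be) and that each record has the iteration, log_file, and response_parsed keys shown above. unparsed_responses is a hypothetical helper name.

```python
import json


def unparsed_responses(lines):
    """Return (iteration, log_file) for every index record whose
    response_parsed value is null or absent."""
    missing = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the stream
        record = json.loads(line)
        if record.get("response_parsed") is None:
            missing.append((record.get("iteration"), record.get("log_file")))
    return missing


if __name__ == "__main__":
    # Illustrative records; a real run would pass an open file handle instead.
    sample_lines = [
        '{"iteration": 2, "log_file": "a.txt", "response_parsed": {"converged": false}}',
        '{"iteration": 3, "log_file": "b.txt", "response_parsed": null}',
    ]
    for iteration, log_file in unparsed_responses(sample_lines):
        print(f"iteration {iteration}: inspect {log_file}")
```

Because the function accepts any iterable of lines, an open file handle on the JSONL file can be passed directly.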

Sysbench Interval File

{
  "window_number": 1,
  "window_duration": 10,
  "report_interval": 1,
  "continuous_log_file": "results/sysbench_cpu_fixed/sysbench_cpu_windows/continuous_sysbench.log",
  "intervals": [
    {
      "elapsed_s": 1,
      "threads": 4,
      "events_per_second": 1482.31,
      "latency_percentile": 99,
      "latency_ms": 2.79,
      "latency_p99_ms": 2.79
    }
  ],
  "metrics": {
    "throughput": 1481.85,
    "events_total": 14818,
    "interval_count": 10
  }
}
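A quick way to sanity-check a window is to aggregate its per-interval rows and compare the result with the metrics block. The sketch below computes plain aggregates only. It is an assumption that these are useful comparison points, since how SemaTune itself derives metrics.throughput is not shown here. summarize_intervals is a hypothetical helper name.

```python
from statistics import mean


def summarize_intervals(window):
    """Aggregate the per-interval rows of one interval file.

    Returns the row count, the mean events_per_second, and the worst
    (largest) latency_p99_ms seen across the rows."""
    rows = window.get("intervals") or []
    if not rows:
        return {"rows": 0, "mean_events_per_second": None, "worst_latency_p99_ms": None}
    return {
        "rows": len(rows),
        "mean_events_per_second": mean(row["events_per_second"] for row in rows),
        "worst_latency_p99_ms": max(row["latency_p99_ms"] for row in rows),
    }


if __name__ == "__main__":
    # One row copied from the sample above.
    sample = {
        "intervals": [
            {"elapsed_s": 1, "events_per_second": 1482.31, "latency_p99_ms": 2.79}
        ],
        "metrics": {"throughput": 1481.85, "interval_count": 10},
    }
    print(summarize_intervals(sample))
```

The rows value can be compared against metrics.interval_count to spot a window whose interval list was cut short.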

TPCC Window Output

BenchBase windows write a summary JSON plus CSVs:

{
  "Throughput (requests/second)": 214.86,
  "Goodput (requests/second)": 214.03,
  "Measured Requests": 6446,
  "Latency Distribution": {
    "Average Latency (microseconds)": 74268.0,
    "95th Percentile Latency (microseconds)": 276111.0,
    "99th Percentile Latency (microseconds)": 597266.0
  }
}

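SemaTune converts BenchBase latencies from microseconds to milliseconds when it writes optimization history, so the 99th-percentile value above (597266.0 microseconds) is recorded as latency_p99: 597.266.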
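The same conversion is handy when comparing a BenchBase summary against a history entry by hand. This is a small sketch of the arithmetic only, using the key names from the summary above. benchbase_latency_ms is a hypothetical helper, not a SemaTune function.

```python
def benchbase_latency_ms(summary, key="99th Percentile Latency (microseconds)"):
    """Convert one BenchBase latency value from microseconds to milliseconds."""
    return summary["Latency Distribution"][key] / 1000.0


if __name__ == "__main__":
    # Values copied from the summary above.
    sample = {
        "Latency Distribution": {
            "Average Latency (microseconds)": 74268.0,
            "95th Percentile Latency (microseconds)": 276111.0,
            "99th Percentile Latency (microseconds)": 597266.0,
        }
    }
    print(benchbase_latency_ms(sample))
```

Passing a different key, such as "95th Percentile Latency (microseconds)", converts the other percentiles the same way.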

Useful `jq` Commands

Best reward:

jq '{best_reward, best_parameters, terminated_reason}' \
  results/<run>/optimization_history_*.json

Final parameters:

jq '.history[-1].parameters' results/<run>/optimization_history_*.json

Parameter trajectory:

jq '.history[] | {iteration, parameters}' results/<run>/optimization_history_*.json

Metrics and rewards:

jq '.history[] | {iteration, reward, post_tuning_phase, metrics}' \
  results/<run>/optimization_history_*.json

Failed or rejected proposals:

jq '.history[] | select(.tuner_timing.apply_failed == true or .tuner_request_skipped == true)' \
  results/<run>/optimization_history_*.json

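LLM justifications (read from the top-level llm_justification key, falling back to tuner_timing.justification, the location used in the sample entry on this page):

jq '.history[] |
  (.llm_justification // .tuner_timing.justification) as $why |
  select($why) |
  {iteration, llm_justification: $why, parameters}' \
  results/<run>/optimization_history_*.json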

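Token usage (read from the top-level token_metrics key, falling back to tuner_timing.token_metrics, the location used in the sample entry on this page):

jq '.history[] |
  (.token_metrics // .tuner_timing.token_metrics) as $tokens |
  select($tokens) |
  {iteration, token_metrics: $tokens}' \
  results/<run>/optimization_history_*.json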

Post-tuning windows only:

jq '.history[] | select(.post_tuning_phase == true) |
  {iteration, reward, parameters, metrics}' \
  results/<run>/optimization_history_*.json

Compact LLM response stream:

jq '{iteration, agent, provider, model, response_text}' \
  results/<run>/llm_api_logs/llm_responses.jsonl

Success Checklist

A successful run should have:

exactly one optimization_history_*.json file
one history entry per measured tuning or post-tuning window
a non-null value for the configured optimization_metric in each entry
target-specific window outputs
llm_api_logs/llm_responses.jsonl, for live LLM runs
a terminated_reason that matches how the run ended
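Parts of this checklist can be automated. The sketch below is a rough starting point under several assumptions: entries live under a top-level history key, the configured optimization_metric names a key inside each entry's metrics object (for example throughput), and check_run with its arguments is a hypothetical helper rather than a SemaTune command. It does not check target-specific window outputs, and it only confirms that terminated_reason is present. Whether the reason matches how the run ended still needs a human.

```python
import glob
import json
import os


def check_run(run_dir, optimization_metric, live_llm=False):
    """Return a list of problems found in run_dir; an empty list means
    every automated check passed."""
    problems = []
    histories = glob.glob(os.path.join(run_dir, "optimization_history_*.json"))
    if len(histories) != 1:
        problems.append(
            f"expected one optimization_history_*.json, found {len(histories)}"
        )
        return problems
    with open(histories[0], encoding="utf-8") as handle:
        doc = json.load(handle)

    entries = doc.get("history") or []  # assumption: entries are under "history"
    if not entries:
        problems.append("history is empty")
    for entry in entries:
        metrics = entry.get("metrics") or {}
        if metrics.get(optimization_metric) is None:
            problems.append(
                f"iteration {entry.get('iteration')}: {optimization_metric} is null"
            )

    if not doc.get("terminated_reason"):
        problems.append("terminated_reason is missing")

    index = os.path.join(run_dir, "llm_api_logs", "llm_responses.jsonl")
    if live_llm and not os.path.isfile(index):
        problems.append("llm_api_logs/llm_responses.jsonl is missing")
    return problems
```

Calling check_run("results/<run>", "throughput", live_llm=True) with a real run directory returns the list of failed checks, which can be printed or used as an exit condition in a script.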