Experiment: Global Schedule Speed (A01 Long v2 Series)
=======================================================

This experiment tests the effect of **global_schedule_speed** (and related
environment tweaks) on A01 single-map long training. Runs: **A01_as20_long_v2**,
**v2.1**, **v2.2**, **v2.3**, **v2.4**. Goal: determine whether a faster
schedule (e.g. **global_schedule_speed: 4**) helps break the **24.5 s** barrier
on A01.

Experiment Overview
-------------------

We compared five runs from the A01_as20_long_v2 series.
**global_schedule_speed** multiplies frame counts in the LR/gamma/epsilon
schedules, so higher values progress through the schedule faster (in terms of
environment steps). Some runs also changed **tm_engine_step_per_action**,
**n_zone_centers_in_inputs**, and
**n_zone_centers_extrapolate_before_start_of_map**, so the comparison mixes
schedule speed with environment resolution.

**Main hypothesis:** ``global_schedule_speed: 4`` may help reach sub-24.5 s on
A01 compared to 1 or 8.

Results
-------

**Important:** If run durations differed, interpret **by-time** tables using the
script's time axis (default **auto**, i.e. **cumulative training hours** when
logged, not raw wall minutes across merged TensorBoard chunks). Use
``scripts/analyze_experiment_by_relative_time.py`` and optionally
``scripts/audit_tensorboard_training_timeline.py``; see :doc:`index`.

**Key findings:**

- Final best A01 from saved run state (``save//accumulated_stats.joblib``):

  - ``A01_as20_long_v2``: **24.150s** (``24150`` ms)
  - ``A01_as20_long_v2.1``: **24.440s** (``24440`` ms)
  - ``A01_as20_long_v2.2``: **300.000s** (no successful finish recorded)
  - ``A01_as20_long_v2.3``: **300.000s** (no successful finish recorded)
  - ``A01_as20_long_v2.4``: **25.150s** (``25150`` ms)

- Ranking by final best A01 is therefore:
  **v2 (gss=4) > v2.1 (gss=8) > v2.4 (gss=1) >> v2.2/v2.3**.
- For TensorBoard comparisons, runs must be merged across suffix chunks
  (``run``, ``run_2``, ``run_3``, ...); otherwise best values can be
  under-reported.

Run Analysis
------------

- **A01_as20_long_v2**: **global_schedule_speed: 4**. Default env:
  tm_engine_step_per_action 5, n_zone_centers 40, batch 4096. Single map A01,
  long run (tensorboard_suffix_schedule up to 150M steps). Save:
  ``save\A01_as20_long_v2``.
- **A01_as20_long_v2.1**: **global_schedule_speed: 8**. Same env as v2. Save:
  ``save\A01_as20_long_v2.1``.
- **A01_as20_long_v2.2**: **global_schedule_speed: 8**. Env:
  tm_engine_step_per_action 1, n_zone_centers_in_inputs 200,
  n_zone_centers_extrapolate_before_start_of_map 100. Save:
  ``save\A01_as20_long_v2.2``.
- **A01_as20_long_v2.3**: **global_schedule_speed: 1**. Same env as v2.2
  (tm_engine_step 1, n_zone 200). Save: ``save\A01_as20_long_v2.3``.
- **A01_as20_long_v2.4**: **global_schedule_speed: 1**. Env:
  tm_engine_step_per_action 3, n_zone_centers 40. Save:
  ``save\A01_as20_long_v2.4``.

TensorBoard logs: ``tensorboard\A01_as20_long_v2``,
``tensorboard\A01_as20_long_v2.1``, ... (and suffix dirs ``_2``, ``_3``, ...
where applicable).

Reproduce the comparison::

    python scripts/analyze_experiment_by_relative_time.py A01_as20_long_v2 A01_as20_long_v2.1 A01_as20_long_v2.2 A01_as20_long_v2.3 A01_as20_long_v2.4 --interval-training-hours 0.25 --step_interval 1000000 --logdir tensorboard

Use ``--plot --output-dir docs/source/_static --prefix
exp_global_schedule_speed_v2`` to generate comparison plots. The script prints
per-run duration in **hours** (cumulative training) or **minutes** (wall),
depending on the axis chosen.

Detailed TensorBoard Metrics Analysis
-------------------------------------

**Methodology (by time and by steps):** Prefer cumulative-training-hour
checkpoints (``--time-axis auto``) or the BY STEP tables. Race times come from
the per-race ``Race/eval_race_time_*`` and ``Race/explo_race_time_*`` series;
scalars (loss, Q, GPU %) take the last logged value at each checkpoint.
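The checkpoint convention above (summarize all per-race times up to a checkpoint; take the last logged value for scalars) can be sketched in plain Python. The helper names, the DNF sentinel of 300,000 ms, and the toy series below are illustrative assumptions, not taken from ``analyze_experiment_by_relative_time.py`` itself:

```python
# Sketch of the checkpoint convention used in the metric tables.
# Hypothetical helpers, not the actual script's implementation.

def last_value_at(series, checkpoint_step):
    """Last value logged at or before the checkpoint; None if nothing was logged yet."""
    eligible = [v for step, v in series if step <= checkpoint_step]
    return eligible[-1] if eligible else None

def race_summary(series, checkpoint_step, dnf_ms=300_000):
    """Best/mean/finish-rate over per-race times (ms) up to a step checkpoint.

    Times at the DNF sentinel (assumed 300,000 ms) count as unfinished races.
    """
    times = [v for step, v in series if step <= checkpoint_step]
    finished = [t for t in times if t < dnf_ms]
    return {
        "best_ms": min(finished) if finished else None,
        "mean_ms": sum(finished) / len(finished) if finished else None,
        "finish_rate": len(finished) / len(times) if times else 0.0,
    }

# Toy Race/eval_race_time_trained_A01 series of (step, time_ms) pairs:
# one DNF, then two finishes.
eval_series = [(100_000, 300_000), (500_000, 26_400), (900_000, 24_150)]
print(race_summary(eval_series, 1_000_000))
```

At the 1M-step checkpoint this reports the best finished time (24,150 ms) and a 2/3 finish rate, which is the shape of the per-checkpoint rows in the BY STEP tables.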
The figures below use the same default as ``generate_experiment_plots.py``
(training hours on the X axis when the scalar exists).

**Runs v2 / v2.1 / v2.4:** v2 and v2.1 show **wall ≫ training** (~2.4–2.7×);
v2.4 is ~1× (short run). See the audit table in :doc:`time_axis_conventions`.
Fill the subsections below from the script output (**cumul_training_hours** or
BY STEP). Example command::

    python scripts/analyze_experiment_by_relative_time.py A01_as20_long_v2 A01_as20_long_v2.4 --interval-training-hours 0.25 --step_interval 1000000 --logdir tensorboard

A01 (per-race eval_race_time_trained_A01)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Report best/mean/std, finish rate, and first finish (min) at selected time
  checkpoints (e.g. 60 min, 120 min) and step checkpoints (e.g. 500k, 1M).
  Compare v2 (gss=4) vs v2.4 (gss=1) vs v2.1 (gss=8) over the common window.

.. image:: ../_static/exp_global_schedule_speed_v2_A01_best.jpg
   :alt: A01 eval best time by relative time (v2 vs v2.1 vs v2.4, global_schedule_speed)

Training Loss
~~~~~~~~~~~~~

- At the same relative-time and step checkpoints; compare across runs.

.. image:: ../_static/exp_global_schedule_speed_v2_loss.jpg
   :alt: Training loss by relative time (v2 vs v2.1 vs v2.4)

Average Q-values (RL/avg_Q_trained_A01)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- At the same checkpoints.

.. image:: ../_static/exp_global_schedule_speed_v2_avg_q.jpg
   :alt: Avg Q by relative time (v2 vs v2.1 vs v2.4)

GPU Utilization
~~~~~~~~~~~~~~~~

- ``Performance/learner_percentage_training`` over the common window.

Configuration Changes
---------------------

**Training** (``training`` in the config YAML):

- **global_schedule_speed**: 1 (v2.3, v2.4), 4 (v2), 8 (v2.1, v2.2).
- **run_name**: ``A01_as20_long_v2``, ``A01_as20_long_v2.1``, ...
  ``A01_as20_long_v2.4``.
- **batch_size**: 4096; **lr_schedule**, **gamma_schedule**,
  **tensorboard_suffix_schedule** shared across runs (schedule frame counts are
  multiplied by global_schedule_speed in code).
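To make the schedule mechanics concrete, here is a minimal, hypothetical sketch in which the accumulated frame count is multiplied by the speed factor before piecewise-linear interpolation of a ``(frame, value)`` schedule; higher speeds therefore reach later schedule values in fewer environment steps. The helper and the toy LR schedule are illustrative assumptions, not the project's actual code:

```python
# Hypothetical sketch of global_schedule_speed: the query frame is scaled by
# the speed factor before interpolating the schedule, so a higher speed
# progresses through the schedule faster in environment steps.
# Illustration only, not the project's actual implementation.

def schedule_value(schedule, frame, global_schedule_speed=1.0):
    """Piecewise-linear interpolation of [(frame, value), ...] at a scaled frame."""
    f = frame * global_schedule_speed
    if f <= schedule[0][0]:
        return schedule[0][1]
    for (f0, v0), (f1, v1) in zip(schedule, schedule[1:]):
        if f <= f1:
            return v0 + (v1 - v0) * (f - f0) / (f1 - f0)
    return schedule[-1][1]  # past the last breakpoint: hold the final value

# Toy LR schedule: hold 1e-3 until 1M frames, decay to 5e-5 by 3M frames.
lr_schedule = [(0, 1e-3), (1_000_000, 1e-3), (3_000_000, 5e-5)]

# At 1.5M env steps, speed 4 has already reached the end of the schedule,
# while speed 1 is still mid-decay.
print(schedule_value(lr_schedule, 1_500_000, global_schedule_speed=1))
print(schedule_value(lr_schedule, 1_500_000, global_schedule_speed=4))
```

This is also why by-step comparisons across different speeds compare runs at different *schedule* positions: at the same environment step, a gss=8 run is much deeper into its LR/gamma/epsilon decay than a gss=1 run.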
**Environment** (where different):

- **v2, v2.1**: tm_engine_step_per_action 5, n_zone_centers_in_inputs 40,
  n_zone_centers_extrapolate_before_start_of_map 20.
- **v2.2, v2.3**: tm_engine_step_per_action 1, n_zone_centers_in_inputs 200,
  n_zone_centers_extrapolate_before_start_of_map 100.
- **v2.4**: tm_engine_step_per_action 3, n_zone_centers_in_inputs 40,
  n_zone_centers_extrapolate_before_start_of_map 20.

Hardware
--------

- Document the GPU, number of collectors, and system if known (e.g. from run
  logs or the machine).

Conclusions
-----------

- **global_schedule_speed: 4** (v2) is the strongest setting in this series by
  final best A01 (**24.150s**), beating both gss=8 (best **24.440s**) and the
  gss=1 variants (best **25.150s**, in v2.4).
- v2.2/v2.3 (finer env) vs v2/v2.4 (coarser env) confound schedule speed with
  environment resolution; separate ablations would clarify.

Recommendations
---------------

- Use **global_schedule_speed: 4** when targeting sub-24.5 s on A01 with the
  current long-training setup.
- Re-run ``analyze_experiment_by_relative_time.py`` for the five runs to fill
  in exact durations and metric tables; use ``--plot`` to regenerate the
  comparison JPGs and embed them in this page (one metric per graph, with
  ``:alt:`` captions).

**Analysis tools:**

- By **relative time and by steps**:
  ``python scripts/analyze_experiment_by_relative_time.py A01_as20_long_v2
  A01_as20_long_v2.1 A01_as20_long_v2.4 --interval 5 --step_interval 1000000``
  (add ``--logdir ""`` if not run from the project root). Outputs both
  relative-time and BY STEP tables.
- With plots: add ``--plot --output-dir docs/source/_static --prefix
  exp_global_schedule_speed_v2``.
- Key metrics: per-race ``Race/eval_race_time_trained_A01`` and
  ``Race/explo_race_time_trained_A01``; scalars ``Training/loss``,
  ``RL/avg_Q_trained_A01``, ``Performance/learner_percentage_training``,
  ``alltime_min_ms_A01``.
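The suffix-chunk caveat noted under Results (merge ``run``, ``run_2``, ``run_3``, ... before computing best values) can be illustrated with a short sketch. The chunk lists stand in for per-chunk scalar streams; in practice the data would come from a TensorBoard event-file reader, which is omitted here:

```python
# Sketch of why suffix chunks must be merged before reporting a best value.
# Each inner list stands for the (step, time_ms) scalars logged under one
# TensorBoard chunk directory (run, run_2, run_3, ...). Toy data only.

def merge_chunks(chunks):
    """Concatenate per-chunk (step, value) lists and sort by step."""
    merged = [point for chunk in chunks for point in chunk]
    return sorted(merged)

# Toy A01 eval times (ms): the overall best sits in the *middle* chunk,
# so reading only the first or the last chunk under-reports it.
chunks = [
    [(100_000, 27_800), (500_000, 25_900)],    # run
    [(900_000, 24_150), (1_200_000, 24_600)],  # run_2
    [(1_600_000, 24_440)],                     # run_3
]
best_per_chunk = [min(v for _, v in chunk) for chunk in chunks]
best_merged = min(v for _, v in merge_chunks(chunks))
print(best_per_chunk, best_merged)
```

Here the merged best (24,150 ms) is better than the best seen in either the first or the last chunk alone, which is exactly the under-reporting failure mode the Results section warns about.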