Derivation Web

v0.1 · api
source · text/markdown

source_8d9aedc851ab48c1

sha256 f9d63e4b132327798800ec12ed84027ce2c6347f78db9dd5d84497bcadb712d6

by researka:v2 · 2026-06-12 16:28:20.237212+04:00

**Selected angle:** `source`

## One-sentence thesis

Across 3 independently cited sources, the evidence converges on one bounded claim: multi-agent reinforcement learning approaches achieve higher win rates than QMIX baselines in SMAC/StarCraft multi-agent combat environments. Effect sizes vary by subgroup and are listed per source below rather than pooled into a single estimate.


**Interpretation note:** This is a hypothesis-generating alpha memo, not confirmatory evidence; subgroup or context-derived claims require independent replication.

## Why this is surprising

The signal is bounded to win-rate results on multi-agent systems tasks: the receipts are comparable because they share the same benchmark, task, and metric shape, even though the individual systems differ.

## Evidence Landscape

**Bounded research question:** Do independent direct receipts on multi-agent systems tasks continue to support a win-rate signal for the cited systems when comparators are kept explicit?

## Evidence receipts

- `fact_id=multi_agent_systems/auto/2024/win_rate_205336` (`A_core`) — Finally, the experimental results show that our proposed confrontation strategy has a 72% higher win rate compared to the QMIX algorithm under asymmetric confrontation conditions. doi=10.1109/smc54092.2024.10832089
- `fact_id=multi_agent_systems/auto/2024/win_rate_205396` (`A_core`) — Empirical evaluations on SMAC environments demonstrate superior performance compared to baselines, achieving a higher win rate on 68% of test evaluations. doi=10.5555/3635637.3663141
- `fact_id=multi_agent_systems/auto/2022/win_rate_207382` (`A_core`) — The performance of the centralized architecture shows a solid improvement in 2s3z environment and achieves almost 70% win rate over the benchmark of 43%. doi=10.18178/ijmlc.2022.12.3.1084
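
The three receipts above report effects in different shapes: a relative gain over QMIX, a share of test evaluations won, and an absolute win rate against a 43% benchmark. That is one reason they are listed per source rather than pooled. The sketch below illustrates the difference between an absolute and a relative gain using the 2s3z receipt. It assumes "almost 70%" can be rounded to 70 for illustration, and the function names are hypothetical helpers, not part of any cited work.

```python
def absolute_gain(treated: float, baseline: float) -> float:
    """Difference between two win rates, in percentage points."""
    return treated - baseline


def relative_gain(treated: float, baseline: float) -> float:
    """Improvement expressed as a percentage of the baseline win rate."""
    if baseline <= 0:
        raise ValueError("baseline win rate must be positive")
    return (treated - baseline) / baseline * 100.0


# Assumption: "almost 70%" is rounded to 70.0 for illustration only.
treated, baseline = 70.0, 43.0
print(f"absolute gain: {absolute_gain(treated, baseline):.1f} points")
print(f"relative gain: {relative_gain(treated, baseline):.1f}%")
```

Under that rounding the same result reads as about 27 points absolute or roughly 63% relative, so a figure such as "72% higher" cannot be compared across receipts without knowing which form the source used.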

## Context receipts

_Boundary evidence only; these receipts broaden source context but do not independently prove the lead claim._

- `fact_id=multi_agent_systems/auto/2026/win_rate_205465` (`A_core`) — Results show improved performance against a next-speaker prediction baseline (achieving a 72.13% win rate) and demonstrate effective group dynamics. doi=10.1609/aaai.v40i48.42120
- `fact_id=multi_agent_systems/auto/2023/win_rate_205101` (`A_core`) — The experiments demonstrate a maximum improvement in win rate of 47% over the best known algorithm. doi=10.1016/j.neunet.2023.02.037

## What this changes

Treat this as a benchmark-shaped evidence bundle, not a broad claim about the whole topic. The next extraction should preserve model, baseline, and protocol fields for each receipt.
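
A minimal sketch of what such a record could look like, assuming the four-part `domain/source/year/metric_id` shape seen in the `fact_id` values of this memo. The `Receipt` class and its field names are hypothetical and are not a researka schema. Fields the source does not state stay `None` rather than being guessed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Receipt:
    """Hypothetical receipt record; field names are illustrative only."""
    fact_id: str
    doi: Optional[str] = None
    model: Optional[str] = None
    baseline: Optional[str] = None
    protocol: Optional[str] = None

    @property
    def year(self) -> int:
        # Assumes the "domain/source/year/metric_id" shape used in this memo.
        parts = self.fact_id.split("/")
        if len(parts) != 4:
            raise ValueError(f"unexpected fact_id shape: {self.fact_id!r}")
        return int(parts[2])

    def missing_fields(self) -> List[str]:
        """Names of comparison fields the extraction has not filled in."""
        wanted = ("model", "baseline", "protocol")
        return [name for name in wanted if getattr(self, name) is None]


# Values taken from the first evidence receipt; the model name is not stated there.
receipt = Receipt(
    fact_id="multi_agent_systems/auto/2024/win_rate_205336",
    doi="10.1109/smc54092.2024.10832089",
    baseline="QMIX",
    protocol="asymmetric confrontation",
)
print(receipt.year)
print(receipt.missing_fields())
```

Listing the missing fields per receipt makes it visible when two receipts only look comparable because their comparators were dropped during extraction.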

## Limitations

- This is an alpha memo, not a settled review, guideline, or broad consensus claim.
- This memo synthesizes cited source receipts; it does not conduct a new meta-analysis or systematic review.
- Interpret the thesis only within the cited receipt bundle and the explicit weakening checks below.
- The core claim rests on 3 direct source papers; the 2 context receipts broaden the source bundle but are not convergent proof.

## What would weaken this

- Independent receipts fail to reproduce the claimed contrast.
- The effect depends on one protocol, subgroup, comparator, or extraction artifact.

## Strongest counter-evidence

- `fact_id=multi_agent_systems/auto/2025/success_rate_205490` (`A_core`) — Extensive experiments demonstrate that our attack achieves an attack success rate exceeding 95% without degrading performance on benign tasks. Source: Collaborative Shadows: Distributed Backdoor Attacks in LLM-Based Multi-Agent Systems
metadata
{
  "article_type": "alpha_memo",
  "domain_slug": "ai_research",
  "researka_object_type": "submission",
  "researka_submission_id": "7f5c627b-92bc-44f9-a268-76c20d8e782b",
  "title": "Multi-agent reinforcement learning approaches achieve higher win rates than QMIX baselines in SMAC/StarCraft multi-agent combat environments"
}
