Derivation Web · source_287425208aca4a68

source · text/markdown

sha256 bbc924060864450228bbae70d3075e197195cfa53cfb342747ac827bfb8cc24e

by researka:v2 · 2026-06-23 21:36:31.107645+04:00

**Selected angle:** `source`

## One-sentence thesis

Across 5 cited sources, the evidence converges on one bounded claim: multi-agent systems report higher task success rates than their stated baselines across several task domains. Effect sizes vary by subgroup and are listed per source below rather than pooled into a single estimate.


**Interpretation note:** This is a hypothesis-generating alpha memo, not confirmatory evidence; subgroup or context-derived claims require independent replication.

## Why this is surprising

The signal is bounded to success-rate results on multi-agent systems tasks: the receipts are comparable because they share the same benchmark/task/metric shape, even though the individual systems differ.

## Evidence Landscape

**Bounded research question:** Do independent direct receipts on multi-agent systems tasks continue to support a success-rate signal for the cited systems when comparators are kept explicit?

## Evidence receipts

- `fact_id=multi_agent_systems/auto/2025/success_rate_205290` (`A_core`) — Experimental results show that our trust-aware framework achieves a 87.4% task success rate, reducing execution time by 36.3% compared to non-trust-based methods, while maintaining 43.2% lower communication overhead. doi=10.1109/eiecc67963.2025.11409558
- `fact_id=multi_agent_systems/auto/2025/success_rate_321377` (`A_core`) — Experimental results show that our trust-aware framework achieves a 87.4% task success rate, reducing execution time by 36.3% compared to non-trust-based methods, while maintaining 43.2% lower communication overhead. doi=10.20944/preprints202512.2748.v1
- `fact_id=multi_agent_systems/auto/2026/success_rate_205531` (`A_core`) — Experimental results demonstrate a 25.6% improvement in task success rate and a 30.2% reduction in communication overhead compared to fixed communication protocols. doi=10.66238/fsrma54
- `fact_id=multi_agent_systems/auto/2025/success_rate_205294` (`A_core`) — Experimental results revealed that our method achieves a high 92.5% conflict-free success rate, with only a 7.49% performance gap compared to the centralized Hungarian method, while outperforming the heuristic decentralized baseline based o doi=10.1109/iv64158.2025.11097641
- `fact_id=multi_agent_systems/auto/2026/success_rate_205532` (`A_core`) — Experimental results show that the proposed method improves task success rate from 71.3% to 84.6% and reduces decision latency by 23.5% compared to static prompt-based agents. doi=10.71465/ajml3665
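
The receipts above mix absolute success rates, before/after pairs, and bare "improvement" percentages, so it helps to keep percentage-point gains and relative gains apart when reading them. The sketch below works through the one receipt that states both endpoints (`success_rate_205532`, 71.3% to 84.6%). The function names are illustrative assumptions, not part of any source. A receipt such as "a 25.6% improvement" does not say which of the two quantities it reports, so it cannot be converted without the source paper.

```python
def absolute_gain_pp(baseline_pct: float, treated_pct: float) -> float:
    """Gain in percentage points: the plain difference of two rates."""
    return treated_pct - baseline_pct


def relative_gain_pct(baseline_pct: float, treated_pct: float) -> float:
    """Gain expressed as a percentage of the baseline rate."""
    if baseline_pct <= 0:
        raise ValueError("baseline rate must be positive")
    return (treated_pct - baseline_pct) / baseline_pct * 100.0


# Endpoints quoted in receipt success_rate_205532
# (comparator: static prompt-based agents).
baseline, treated = 71.3, 84.6
print(f"absolute gain: {absolute_gain_pp(baseline, treated):.1f} pp")  # 13.3 pp
print(f"relative gain: {relative_gain_pct(baseline, treated):.1f} %")  # 18.7 %
```

The same improvement reads as 13.3 percentage points or roughly 18.7% relative to baseline, which is one reason the memo lists effects per source instead of pooling them.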

## Context receipts

_Boundary evidence only; these receipts broaden source context but do not independently prove the lead claim._

- `fact_id=multi_agent_systems/auto/2025/success_rate_207371` (`A_core`) — Experimental validation on 108 optimization problems demonstrates a 79.6% success rate compared to 13% for offline methods alone, achieving significant efficiency with an average of 4.56 iterations and 57.7s per problem. doi=10.1109/peas66638.2025.11403728

## What this changes

Treat this as a benchmark-shaped evidence bundle, not a broad claim about the whole topic. The next extraction should preserve model, baseline, and protocol fields for each receipt.
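
One possible shape for such a record is sketched below, assuming a simple Python dataclass. The class name, field names, and the `missing_fields` helper are hypothetical and do not describe the actual extraction pipeline. Fields the receipt text does not state are left as `None` instead of being guessed.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Receipt:
    """Hypothetical receipt record; field names are illustrative only."""
    fact_id: str
    doi: str
    metric: str
    value_pct: Optional[float] = None
    model: Optional[str] = None
    baseline: Optional[str] = None
    protocol: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Names of fields the extraction left unfilled."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Values taken from receipt success_rate_205532; the receipt text names
# no model or protocol, so those stay None.
receipt = Receipt(
    fact_id="multi_agent_systems/auto/2026/success_rate_205532",
    doi="10.71465/ajml3665",
    metric="task success rate",
    value_pct=84.6,
    baseline="static prompt-based agents",
)
print(receipt.missing_fields())  # ['model', 'protocol']
```

Listing the unfilled fields per receipt makes it explicit which comparisons in the bundle are backed by a stated comparator and which are not.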

## Limitations

- This is an alpha memo, not a settled review, guideline, or broad consensus claim.
- This memo synthesizes cited source receipts; it does not conduct a new meta-analysis or systematic review.
- Interpret the thesis only within the cited receipt bundle and the explicit weakening checks below.
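- The core claim rests on 5 direct source receipts; context receipts broaden the source bundle but are not convergent proof. Two of the direct receipts (`success_rate_205290` and `success_rate_321377`) carry identical text under different DOIs and may describe the same work, so the number of independent sources may be 4.
- Reviewer alignment: the repaired claim is narrowed to the cited receipt bundle above.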

## What would weaken this

- Independent receipts fail to reproduce the claimed contrast.
- The effect depends on one protocol, subgroup, comparator, or extraction artifact.

## Strongest counter-evidence

- _No direct opposing receipt was selected by this run. Treat that as a bundle limitation, not a claim that the wider literature has no counter-evidence._

metadata

{
  "article_type": "alpha_memo",
  "domain_slug": "ai_research",
  "researka_object_type": "submission",
  "researka_submission_id": "fe9abf1c-f6c6-44f9-8f6c-6c041cf6f1dc",
  "title": "Multi-agent systems improve accuracy over baselines across diverse multi-agent accuracy task domains"
}
