Building the autonomy layer for future healthcare systems
Published or cited by












Where our team comes from
20+
Patents Filed
10+
Academic Papers Published
12
Papers Cited




































Featured publications and resources



20 Jun 2025
Whitepaper
The Consensus Mechanism: Toward Trustworthy AI Collaboration
Defines Sully’s Consensus Mechanism, a protocol enabling AI agents to reach verifiable agreement through structured proposal-and-critique cycles, weighted scoring, and reputation tracking. This mechanism underpins transparent, multi-agent decision-making across healthcare, legal, and knowledge domains.
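To make "weighted scoring" and "reputation tracking" concrete, here is a minimal sketch of one round in which agents score each other's proposals and votes are weighted by reputation. It is an illustration only, not the protocol defined in the whitepaper: the names `Critique`, `weighted_consensus`, and `update_reputation`, the [0, 1] score range, the neutral default weight of 1.0, and the update rule are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Critique:
    """One agent's score for one proposal (hypothetical structure)."""
    critic: str     # agent giving the score
    proposal: str   # identifier of the proposal being scored
    score: float    # rating in [0, 1], an assumed convention


def weighted_consensus(critiques, reputation):
    """Return (winning proposal, reputation-weighted mean score per proposal).

    Assumes reputations are positive; unknown agents get a neutral weight of 1.0.
    """
    if not critiques:
        raise ValueError("at least one critique is required")
    totals = defaultdict(float)
    weights = defaultdict(float)
    for c in critiques:
        w = reputation.get(c.critic, 1.0)
        totals[c.proposal] += w * c.score
        weights[c.proposal] += w
    scores = {p: totals[p] / weights[p] for p in totals}
    winner = max(scores, key=scores.get)
    return winner, scores


def update_reputation(reputation, critiques, winner, rate=0.1):
    """Raise the reputation of critics who rated the winner above 0.5 and
    lower it for those who rated it below 0.5 (an assumed, simplistic rule)."""
    updated = dict(reputation)
    for c in critiques:
        if c.proposal == winner:
            old = updated.get(c.critic, 1.0)
            updated[c.critic] = old + rate * (c.score - 0.5)
    return updated


# Example round with two agents and two candidate proposals.
critiques = [
    Critique("agent_a", "p1", 0.9),
    Critique("agent_b", "p1", 0.5),
    Critique("agent_a", "p2", 0.2),
    Critique("agent_b", "p2", 0.8),
]
reputation = {"agent_a": 2.0, "agent_b": 1.0}
winner, scores = weighted_consensus(critiques, reputation)
new_reputation = update_reputation(reputation, critiques, winner)
```

In this toy round `agent_a` carries twice the weight of `agent_b`, so its preference for `p1` decides the outcome. A real system would need further cycles of proposal and critique plus safeguards against agents gaming their own reputation, which this sketch leaves out.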



20 Jun 2025
Whitepaper
Scalable architecture for multi modal healthcare agents
Introduces Sully’s SuperAgent architecture — a composable ecosystem of isolated, self-contained agent packages. Each agent supports multimodal input (voice, web, phone, SMS) and integrates with common authentication, billing, and access layers.



20 Jun 2025
Whitepaper
QnA benchmarks LLM performance analysis
A large-scale benchmark across 12+ models and medical specialties. Finds O1 leading in overall accuracy (45%), with GPT-4.5-Preview dominating treatment tasks and O3-Mini excelling in lymphatic diagnosis. Recommends ensemble model use for real-world applications.

Research performance highlights
Improvement in clinical note quality with agentic workflows
17.3%
Lower processing cost per note using optimized models
50%
Decreased hallucinations using agents in real clinical practice
0.0328%
Increase in template compliance through automation
98%
Active research topics
Partner with Sully Labs
Work with our research team to design, test, and deploy next-generation multi-agent AI.
Book a 30-min call
Insights from our data
Model strengths vary by medical domain
Benchmarking 12+ models across medical specialties revealed clear domain strengths: O1 excelled in general medical Q&A, O3-Mini led in lymphatic diagnosis, and GPT-4.5-Preview dominated treatment and endocrine system tasks.



Agentic models deliver measurable gains
Agentic multi-agent workflows improved clinical note quality by 10–20% across five tested models — with GPT-OSS-120B achieving a 17.3% quality increase at half the baseline cost. This validates structured, multi-step reasoning pipelines for medical scribing tasks.



Press and Citations

Second Opinion Matters: Towards Adaptive Clinical AI via the Consensus of Expert Model Ensemble
Despite the growing clinical adoption of large language models (LLMs), current approaches heavily rely on single model architectures. To overcome risks of obsolescence and rigid dependence on single model systems, we present a novel framework, termed the Consensus Mechanism. Mimicking clinical triage and multidisciplinary clinical decision-making, the Consensus Mechanism implements an ensemble of specialized medical expert agents enabling improved clinical decision making while maintaining robust adaptability. This architecture enables the Consensus Mechanism to be optimized for cost, latency, or performance, purely based on its interior model configuration.
Amit Kumthekar
20th Jun 2025

Y Combinator-Backed Sully.ai Outperforms Tech Giants in Healthcare AI Benchmarks, Revolutionizing Clinical Operations
Sully.ai, a Y Combinator-backed healthcare AI startup, announced breakthrough performance results that position its agentic team ahead of industry leaders OpenAI, Anthropic, and Google in healthcare-specific AI benchmarks. The company’s superhuman medical agents are transforming clinical operations by reducing administrative burdens while accelerating patient diagnosis and care delivery.
Brand Push
17th Jul 2025

A spotlight on Sully.ai
How serial founders built a culture designed for former founders
All successful companies have at least one thing in common: they have effective people making decisions and getting things done. This sounds rather simple but if you ask any successful startup founder what is keeping them up at night—they will 99% of the time tell you something to do with people. Coordinating people. Hiring people. Empowering people.
Ben Lang
24th Apr 2025

Y Combinator-Backed Sully.ai Outperforms Tech Giants in Healthcare AI Benchmarks, Revolutionizing Clinical Operations
Sully.ai, a Y Combinator-backed healthcare AI startup, announced breakthrough performance results that position its agentic team ahead of industry leaders OpenAI, Anthropic, and Google in healthcare-specific AI benchmarks. The company’s superhuman medical agents are transforming clinical operations by reducing administrative burdens while accelerating patient diagnosis and care delivery.
GetNews - TGAM
17th Jul 2025

The consensus mechanism
Second opinion matters: Towards adaptive clinical AI via the consensus of expert model ensemble
Resources
© Sully AI 2025. All Rights Reserved.
Epic is a registered trademark of Epic Systems Corporation.
