AiA Feed
Filtered by #ai-agents
Roni Itkin, Noam Issachar, Yehonatan Keypur, Anpei Chen, Sagie Benaim

RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework

RAD-2 combines diffusion-based trajectory generation with RL-optimized reranking for autonomous driving motion planning. The framework achieves 56% collision rate reduction through temporal consistency in policy optimization and structured feedback mechanisms. Real-world deployment validates improved safety and smoothness in urban traffic scenarios.

4d
Roni Itkin, Noam Issachar, Yehonatan Keypur, Anpei Chen, Sagie Benaim

DR^{3}-Eval: Towards Realistic and Reproducible Deep Research Evaluation

DR^3-Eval is a new benchmark for evaluating deep research agents on complex multi-step tasks using a static corpus that simulates web complexity while remaining reproducible. It introduces a five-dimensional evaluation framework (Information Recall, Factual Accuracy, Citation Coverage, Instruction Following, Depth Quality) and reveals critical failure modes in retrieval robustness and hallucination control.

4d
Nivetha Purusothaman, Dr. William Cunningham

Agentic AI costs more than you budgeted. Here's why.

Agentic AI deployments often exceed budgets because teams focus on development costs while overlooking operating expenses: token usage, governance, evaluation infrastructure, security, and scaling all compound rapidly. Most enterprises don't model these hidden costs until they're already absorbing them in production. Accurate ROI requires forecasting the full total cost of ownership, not just initial build.

6d
Artificial Intelligence News — Newsletter on Deep Learning & AI

AI Weekly Issue #484: Legal risks, robotics, and developer tools—quick hits

AI Weekly Issue 484 covers regulatory risk, robotics adoption, and developer tools. Key stories: AI chat logs can be used as legal evidence in court, creating compliance concerns; Chery's humanoid robot at $42K signals automotive's robotics pivot with pricing expected to halve within a year. Anthropic's Claude Code Routines gained strong developer adoption (686 HN points) for automating repetitive workflows.

4d
Anthony Ha

Hightouch reaches $100M ARR fueled by marketing tools powered by AI

Hightouch, a data integration platform for marketing teams, reached $100M in annual recurring revenue after growing ARR by $70M over just 20 months, fueled by its newly launched AI agent platform. The milestone reflects surging enterprise demand for AI-powered marketing automation. Hightouch's rapid growth demonstrates how AI-driven tools are becoming essential infrastructure for modern marketing operations.

5d
Anthony Ha

Gitar, a startup that uses agents to secure code, emerges from stealth with $9 million

Gitar, a startup using AI agents to review code (including code generated by AI systems), emerged from stealth with $9 million in funding. The company tackles the growing challenge of securing AI-generated code alongside traditionally written code. As AI code generation becomes more prevalent in development workflows, automated agent-based review solutions offer timely security assessment.

5d
AiA Feed · Generated with AI, which can make mistakes.