Humans & Robots
Reduce your workload.
Physical World & Digital Interaction
It’s an interconnected world.
Virtual World & Security
Creating a Virtual World
Enhancing Physical World Experience
Distributed Verifiable Digital Ledger
Secure the Cyber Space
Tech News
When testing an AI model, it’s hard to tell if it is reasoning or just regurgitating answers from its training data. Xbench, a new benchmark developed by the Chinese venture capital firm HSG, or HongShan Capital Group, might help to sidestep that issue. That’s thanks to the way it evaluates models not only on the ability to pass arbitrary tests, like most other benchmarks, but also on the ability to execute real-world tasks, which is more unusual. It will be updated on a regular basis to try to keep it evergreen. This week the company is making part of its […]
A large language model (LLM) deployed to make treatment recommendations can be tripped up by nonclinical information in patient messages, like typos, extra white space, missing gender markers, or the use of uncertain, dramatic, and informal language, according to a study by MIT researchers. They found that making stylistic or grammatical changes to messages increases the likelihood an LLM will recommend that a patient self-manage their reported health condition rather than come in for an appointment, even when that patient should seek medical care. Their analysis also revealed that these nonclinical variations in text, which mimic how people really communicate, […]
May 8 AI Codecon was a huge success. We had amazing speakers and content. We also had over 9,000 live attendees and another 12,000 who signed up to be able to view the content later on the O’Reilly learning platform. (Here’s a post with video excerpts and some of my takeaways.) So we’re doing it again. The next AI Codecon is scheduled for September 9. Our focus this time is going to be on agentic coding. Now I know that Simon Willison and others have derided “agentic” as a marketing buzzword, and that no one can even agree on what […]