As Media Lab students in 2010, Karthik Dinakar SM ’12, PhD ’17 and Birago Jones SM ’12 teamed up for a class project to build a tool that would help content moderation teams at companies like Twitter (now X) and YouTube. The project generated a huge amount of excitement, and the researchers were invited to give a demonstration at a cyberbullying summit at the White House — they just had to get the thing working.

The day before the White House event, Dinakar spent hours trying to put together a working demo that could identify concerning posts on Twitter. Around 11 p.m., he called Jones to say he was giving up.

Then Jones decided to look at the data. It turned out Dinakar’s model was flagging the right types of posts, but the posters were using teenage slang terms and other indirect language that Dinakar didn’t pick up on. The problem wasn’t the model; it was the disconnect between Dinakar and the teens he was trying to help.

“We realized then, right before we got to the White House, that the people building these models should not be folks who are just machine-learning engineers,” Dinakar says. “They should be people who best understand their data.”

The insight led the researchers to develop point-and-click tools that allow nonexperts to build machine-learning models. Those tools became the basis for Pienso, which today is helping people build large language models for detecting misinformation, human trafficking, weapons sales, and more, without writing any code.

“These kinds of applications are important to us because our roots are in cyberbullying and understanding how to use AI for things that really help humanity,” says Jones.

Related work from others:  Latest from MIT : AI model identifies certain breast tumor stages likely to progress to invasive cancer

As for the early version of the system shown at the White House, the founders ended up collaborating with students at nearby schools in Cambridge, Massachusetts, to let them train the models.

“The models those kids trained were so much better and nuanced than anything I could’ve ever come up with,” Dinakar says. “Birago and I had this big ‘Aha!’ moment where we realized empowering domain experts — which is different from democratizing AI — was the best path forward.”

A project with purpose

Jones and Dinakar met as graduate students in the Software Agents research group of the MIT Media Lab. Their work on what became Pienso started in Course 6.864 (Natural Language Processing) and continued until they earned their master’s degrees in 2012.

It turned out 2010 wasn’t the last time the founders were invited to the White House to demo their project. The work generated a lot of enthusiasm, but the founders worked on Pienso part time until 2016, when Dinakar finished his PhD at MIT and deep learning began to explode in popularity.

“We’re still connected to many people around campus,” Dinakar says. “The exposure we had at MIT, the melding of human and computer interfaces, widened our understanding. Our philosophy at Pienso couldn’t be possible without the vibrancy of MIT’s campus.”

The founders also credit MIT’s Industrial Liaison Program (ILP) and Startup Accelerator (STEX) for connecting them to early partners.

One early partner was SkyUK. The company’s customer success team used Pienso to build models to understand their customer’s most common problems. Today those models are helping to process half a million customer calls a day, and the founders say they have saved the company over £7 million pounds to date by shortening the length of calls into the company’s call center.

Related work from others:  Latest from MIT : How well do explanation methods for machine-learning models work?

The difference between democratizing AI and empowering people with AI comes down to who understands the data best — you or a doctor or a journalist or someone who works with customers every day?” Jones says. “Those are the people who should be creating the models. That’s how you get insights out of your data.”

In 2020, just as Covid-19 outbreaks began in the U.S., government officials contacted the founders to use their tool to better understand the emerging disease. Pienso helped experts in virology and infectious disease set up machine-learning models to mine thousands of research articles about coronaviruses. Dinakar says they later learned the work helped the government identify and strengthen critical supply chains for drugs, including the popular antiviral remdesivir.

“Those compounds were surfaced by a team that did not know deep learning but was able to use our platform,” Dinakar says.

Building a better AI future

Because Pienso can run on internal servers and cloud infrastructure, the founders say it offers an alternative for businesses being forced to donate their data by using services offered by other AI companies.

“The Pienso interface is a series of web apps stitched together,” Dinakar explains. “You can think of it like an Adobe Photoshop for large language models, but in the web. You can point and import data without writing a line of code. You can refine the data, prepare it for deep learning, analyze it, give it structure if it’s not labeled or annotated, and you can walk away with fine-tuned, large language model in a matter of 25 minutes.”

Related work from others:  Latest from MIT Tech Review - The industrial metaverse: A game-changer for operational technology

Earlier this year, Pienso announced a partnership with GraphCore, which provides a faster, more efficient computing platform for machine learning. The founders say the partnership will further lower barriers to leveraging AI by dramatically reducing latency.

“If you’re building an interactive AI platform, users aren’t going to have a cup of coffee every time they click a button,” Dinakar says. “It needs to be fast and responsive.”

The founders believe their solution is enabling a future where more effective AI models are developed for specific use cases by the people who are most familiar with the problems they are trying to solve.

“No one model can do everything,” Dinakar says. “Everyone’s application is different, their needs are different, their data is different. It’s highly unlikely that one model will do everything for you. It’s about bringing a garden of models together and allowing them to collaborate with each other and orchestrating them in a way that makes sense — and the people doing that orchestration should be the people who understand the data best.”

Share via
Copy link
Powered by Social Snap