Meet the team developing an open source ChatGPT alternative

by Ana Lopez

At the risk of stating the obvious, AI-powered chatbots are hot right now.

The tools, which can write essays, emails, and more from a few text-based instructions, have caught the attention of tech enthusiasts and enterprises alike. OpenAI’s ChatGPT, perhaps the forerunner of the category, has an estimated 100 million-plus users. Through an API, brands like Instacart, Quizlet, and Snap have started building it into their respective platforms, further driving usage numbers.

But to the chagrin of some in the developer community, the organizations that build these chatbots remain part of a well-funded, well-equipped, and exclusive club. Anthropic, DeepMind and OpenAI – all of which have deep pockets – are among the few that have managed to develop their own modern chatbot technologies. The open source community, by contrast, has been stymied in its efforts to create one.

That’s largely because training the AI models that underlie the chatbots requires a huge amount of processing power, not to mention a large training dataset that must be painstakingly assembled. But a new, loosely affiliated group of researchers calling itself Together is striving to overcome those challenges and be the first to open source a ChatGPT-like system.

Together has already made progress. Last week it released trained models that any developer can use to create an AI-powered chatbot.

“Together is building an accessible platform for open foundation models,” Vipul Ved Prakash, the co-founder of Together, said in an email interview. “We think of what we build as part of AI’s ‘Linux moment’. We want to enable researchers, developers and companies to use and improve open source AI models with a platform that brings together data, models and computation.”

Prakash previously co-founded Cloudmark, a cybersecurity startup that Proofpoint bought in 2017 for $110 million. After Apple acquired Prakash’s next venture, social media search and analytics platform Topsy, in 2013, he remained as a senior director at Apple for five years before leaving to start Together.

Over the weekend, Together rolled out its first major project, OpenChatKit, a framework for creating both specialized and general AI-powered chatbots. The kit, available on GitHub, includes the aforementioned trained models and an “expandable” retrieval system that allows the models to pull information (e.g. current sports scores) from various sources and websites.

The base models came from EleutherAI, a nonprofit group of researchers that studies text-generating systems. But they were fine-tuned using Together’s computing infrastructure, the Together Decentralized Cloud, which pools hardware resources, including GPUs, from volunteers around the internet.

“Together developed the source repositories that allow anyone to replicate the model results, fine-tune their own model or integrate a retrieval system,” Prakash said. “Together also developed documentation and community processes.”

Beyond the training infrastructure, Together collaborated with other research organizations, including LAION (which helped develop Stable Diffusion) and technologist Huu Nguyen’s Ontocord, to create a training dataset for the models. Called the Open Instruction Generalist Dataset, it contains more than 40 million examples of questions and answers, follow-up questions, and more, designed to “teach” a model how to respond to various instructions (e.g., “Write an outline for a Civil War history paper”).

To solicit feedback, Together has released a demo that anyone can use to interact with the OpenChatKit models.

“The main motivation was to enable everyone to use OpenChatKit to improve the model and create more task-specific chat models,” added Prakash. “While large language models have shown an impressive ability to answer general questions, they tend to achieve much higher accuracy when tailored for specific applications.”

Prakash says the models can perform a range of tasks, including solving basic high school math problems, generating Python code, writing stories and summarizing documents. So how well do they stand up to testing? Well enough, in my experience, at least for basic things like writing plausible-sounding cover letters.


OpenChatKit can write cover letters, among other things. Image Credits: OpenChatKit

But there is a very clear limit. Keep chatting with the OpenChatKit models long enough and they start running into the same problems that ChatGPT and other recent chatbots have, such as parroting false information. I got the OpenChatKit models to give contradictory answers about whether the Earth is flat, for example, and to make a flat-out false statement about who won the 2020 US presidential election.


OpenChatKit (incorrectly) answering a question about the 2020 US presidential election. Image Credits: OpenChatKit

The OpenChatKit models are also weak in other, less alarming areas, such as context switching. If you change the subject in the middle of a conversation, they often get confused. Nor are they particularly adept at creative writing or coding, and they sometimes repeat their answers endlessly.

Prakash blames the training dataset, which he notes is being actively worked on. “It’s an area we’ll continue to improve and we’ve designed a process for the open community to actively participate in,” he said, referring to the demo.

The quality of OpenChatKit’s responses can leave something to be desired. (To be fair, ChatGPT’s aren’t dramatically better, depending on the prompt.) But Together is being proactive – or at least attempting to be proactive – when it comes to moderation.

While some chatbots along the lines of ChatGPT can be prompted to write biased or hateful text because of their training data, some of which comes from toxic sources, the OpenChatKit models are harder to coerce. I managed to get them to write a phishing email, but they wouldn’t be baited into more controversial territory, like endorsing the Holocaust or justifying why men make better CEOs than women.


OpenChatKit uses some moderation, as seen here. Image Credits: OpenChatKit

However, moderation is an optional feature of OpenChatKit – developers are not required to use it. While one of the models is designed “specifically as a guardrail” for the other, larger model – the one powering the demo – no filtering is applied to the larger model by default, Prakash said.

That’s different from the top-down approach favored by OpenAI, Anthropic, and others, which involves a combination of human and automated moderation and filtering at the API level. Prakash argues that this closed-door opacity could be more damaging in the long run than OpenChatKit’s lack of a mandatory filter.

“Like many dual-use technologies, AI can certainly be used in malicious contexts. This applies to open AI and to closed systems commercially available through APIs,” Prakash said. “We believe that a world in which the power of large generative AI models is held solely by a handful of large technology companies, with no one else able to control, inspect or understand them, poses a greater risk.”

To underscore Prakash’s point about open development, OpenChatKit includes a second training dataset, called OIG Moderation, which aims to address a range of chatbot moderation challenges, including bots that take on overly aggressive or depressive tones. (See: Bing Chat.) It was used to train the smaller of the two models in OpenChatKit, and Prakash says OIG Moderation can be applied to create other models that detect and filter out problematic text if developers choose to do so.

“We care deeply about AI security, but we believe security through obscurity is a bad approach in the long run. An open, transparent attitude is widely accepted as the default attitude in the world of computer security and cryptography, and we believe transparency will be critical if we are to build secure AI,” said Prakash. “Wikipedia is a great testament to how an open community can be a great solution to challenging moderation tasks at scale.”

I’m not so sure. For starters, Wikipedia isn’t exactly the gold standard: the site’s moderation process is famously opaque and territorial. Then there’s the fact that open source systems are often (and quickly) abused. Take the image-generating AI system Stable Diffusion as an example. Within days of its release, communities like 4chan were using the model, which also includes optional moderation tools, to create non-consensual pornographic deepfakes of famous actors.

The OpenChatKit license explicitly prohibits uses such as generating misinformation, promoting hate speech, spamming, and engaging in cyberbullying or harassment. But there’s nothing to stop malicious actors from ignoring both those terms and the moderation tools.

Anticipating the worst, some researchers have started sounding the alarm about open-access chatbots.

NewsGuard, a company that tracks online disinformation, found in a recent study that newer chatbots, especially ChatGPT, can be induced to write content that advances harmful health claims about vaccines, mimics propaganda and disinformation from China and Russia, and repeats the tone of partisan news outlets. According to the study, ChatGPT complied in about 80% of the cases when asked to write answers based on incorrect and misleading ideas.

In response to NewsGuard’s findings, OpenAI has improved ChatGPT’s backend content filters. Of course, that wouldn’t be possible with a system like OpenChatKit, which places the responsibility of keeping models up to date on developers.

Prakash stands by his argument.

“Many applications require customization and specialization, and we believe an open-source approach will better support a healthy diversity of approaches and applications,” he said. “The open models are getting better and we expect a strong increase in adoption.”
