Barun Saha

Alongside my professional work as a scientist, I build personal projects around Generative AI and Large Language Models. I have also taken part in several competitions and hackathons. Overall, I try to build a small solution or two that can help make our lives better.

Generative AI Solutions

CodePrompt

How does one measure the impact of GenAI? CodePrompt is a small, hand-crafted dataset built to assess the efficiency of code generation using the PaLM 2 LLM. CodePrompt consists of 30 coding problems in Python. These range from code generation to completion and troubleshooting errors.

paper dataset Try it →

(Banner generated using Bannerbear)

Slide Deck AI

We often spend a lot of time creating presentation slide decks. With SlideDeck AI, users and generative AI co-create a slide deck in a few steps using the instruction-tuned Mistral 7B model. An earlier version, built on Llama 2, won 3rd place in the Llama 2 Hackathon with Clarifai.

hackathon Try it →

Gemini Senpai

Gemini Senpai is a small, experimental AI assistant prototype built using Gemini's function calling. Currently, Gemini Senpai allows users to generate Python code and small Python applications spanning multiple modules. Build software with an AI assistant, collaboratively.

project Try it →

Sys2Doc

As scientists and engineers, we often draw diagrams that depict systems, for example, architecture, state machine, and flow diagrams. Writing their descriptions can be tedious, yet system documentation remains incomplete without them. With Sys2Doc, one can generate system documentation based on a given diagram of any system. Sys2Doc is powered by Gemini Pro Vision.

hackathon Try it →

RAG2Rich

Building a RAG app is easy. Optimizing it, however, is not necessarily so. With RAG2Rich, one can identify the optimal parameters and configurations using an answer "richness" score, which is based on the context relevance, answer relevance, and groundedness measures computed by TruLens. In other words, RAG2Rich offers a scientific approach to optimizing Retrieval-Augmented Generation systems. RAG2Rich is powered by Vertex AI and PaLM 2, among others.
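
To make the idea concrete, here is a minimal sketch of ranking candidate RAG configurations by a combined score. The way RAG2Rich actually combines the three measures is not stated here, so the unweighted mean below is an assumption, as are the `RagEvaluation` class, the `chunk_size` and `top_k` parameter names, and the sample scores. In practice, the three measures would come from TruLens rather than being typed in by hand.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RagEvaluation:
    """Scores for one RAG configuration, each assumed to lie in [0, 1]."""
    config: dict
    context_relevance: float
    answer_relevance: float
    groundedness: float


def richness(evaluation: RagEvaluation) -> float:
    # Assumption: an unweighted mean of the three measures.
    # The real RAG2Rich score may weight or combine them differently.
    return mean([
        evaluation.context_relevance,
        evaluation.answer_relevance,
        evaluation.groundedness,
    ])


def best_configuration(evaluations: list[RagEvaluation]) -> RagEvaluation:
    # Pick the configuration whose answers score highest on richness.
    return max(evaluations, key=richness)


# Hypothetical scores, for illustration only.
evaluations = [
    RagEvaluation({"chunk_size": 256, "top_k": 3}, 0.5, 0.75, 0.25),
    RagEvaluation({"chunk_size": 512, "top_k": 5}, 0.75, 0.75, 0.75),
]

best = best_configuration(evaluations)
print(best.config, richness(best))
```

Keeping the score in one small function makes it easy to swap in a weighted variant without touching the selection logic.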

hackathon Try it →

Poem2Pic

Poetry is food for the soul, and an image is worth a thousand words. Poem2Pic blends the two by generating an image based on a poem. In particular, Flan-T5, a large language model (LLM), is used to generate a very short summary of an input poem. The summary is then fed to Stable Diffusion to generate an image. The final image is displayed to the user.
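
The two-stage flow described above can be sketched as a small pipeline. This is only an illustration of the structure, not the actual Poem2Pic code: `poem_to_picture` is a hypothetical helper, and the two stand-in functions take the place of the Flan-T5 summarizer and the Stable Diffusion image generator so that the sketch runs without any model downloads.

```python
from typing import Any, Callable


def poem_to_picture(
    poem: str,
    summarize: Callable[[str], str],
    generate_image: Callable[[str], Any],
) -> tuple[str, Any]:
    """Summarize a poem, then use the summary as the image prompt."""
    if not poem.strip():
        raise ValueError("The poem is empty.")
    # Stage 1: condense the poem into a very short summary (Flan-T5 in Poem2Pic).
    summary = summarize(poem).strip()
    if not summary:
        raise ValueError("The summarizer returned an empty summary.")
    # Stage 2: generate an image from the summary (Stable Diffusion in Poem2Pic).
    image = generate_image(summary)
    return summary, image


# Stand-ins for the two models, for illustration only.
def first_line_summary(poem: str) -> str:
    return poem.strip().splitlines()[0]


def placeholder_image(prompt: str) -> str:
    return f"<image for: {prompt}>"


poem = "The river hums a silver tune\nBeneath a slowly rising moon"
summary, image = poem_to_picture(poem, first_line_summary, placeholder_image)
print(summary)
print(image)
```

Passing the two stages in as callables keeps the pipeline independent of any particular model, so either stage can be replaced without changing the flow.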

hackathon Try it →

Datasets

NLU Dataset for IBNs

A synthetically generated dataset for training a Natural Language Understanding (NLU) model to develop Intent-based Networks (IBNs). A set of sample intents that capture some common network operations, such as creating a flow between two endpoints, is provided. This dataset is primarily intended for use with Rasa but can be used with any other framework as well.

paper dataset Get it →

বাংলা সাহিত্য (Bangla Sahitya or Bengali Literature)

A non-exhaustive collection of Bengali literary works (poems, stories, novels, songs, essays, letters) by more than 35 authors. These works span the 8th to the 20th centuries. All of them are in the public domain.

dataset Get it →

বাংলা নির্দেশাবলী (Bangla Nirdeshabali or Bengali Instructions)

A tiny question-answer dataset in Bengali, created from scratch. Bangla Nirdeshabali covers different topics, such as literature, culture, and geography.

dataset Get it →

Bengali Poems

A subset of the Bangla Sahitya dataset. It contains Bengali poems (public domain).

dataset Get it →
