Externally organized talk - ChemChat | LLM-Powered Conversational Assistant for Material Science and Data Visualization
Tim Erdmann
IBM Research, San Jose

Mon., July 1, 2024, 3 p.m.
This seminar is held in person.
Room: BAR I89



In recent decades, remarkable advances have been made in computational chemistry and machine learning (ML), yielding a plethora of sophisticated tools and artificial intelligence (AI) models. Despite their potential, these resources have yet to be fully harnessed because of their steep learning curves and their tendency to operate in isolation. Furthermore, the need for programming and ML skills constitutes an access barrier for the target community, often experimental scientists. Concurrently, the advent of large language models (LLMs) such as (Chat)GPT has been revolutionizing various domains. Nevertheless, their efficacy in addressing chemistry-related challenges has been limited. In particular, these models lack knowledge of scientific workflows and the operations they employ, such as in drug discovery or polymer material development; they lack access to information sources providing up-to-date data; and they cannot reference accurately but instead tend to hallucinate in their responses. These circumstances raise questions about the credibility, trustworthiness, and applicability of LLMs in domain-specific fields. However, these shortcomings can be overcome by building an LLM-powered conversational application equipped with task- and domain-specific services and enabling the LLM to reason about their appropriate use based on provided instructions. This can be expected to significantly increase the utilization of existing cheminformatics tools and AI models and to contribute to scientific discovery and education overall.
Here, we present ChemChat, a web application and conversational assistant with a chatbot-driven user interface that is powered by non-GPT/OpenAI LLMs. Through the integration of existing cheminformatics tools and expert-developed AI models such as PubChem, CIRCA, RDKit, GT4SD, RXN, and MolFormer, as well as other knowledge sources, the application can assist chemists in tasks such as (I) molecule investigation, including identification, property calculation, prediction, and generation, (II) retrosynthesis, (III) data visualization, and (IV) literature research. The talk will focus on demonstrating use-case-specific capabilities in comparison to related applications, along with the architecture and workflow behind ChemChat.


Brief CV

Dr. Tim Erdmann holds a PhD in Polymer Chemistry from TU Dresden/CFAED (Cluster of Excellence ‘Center for Advancing Electronics Dresden’), with a specialization in the synthesis and characterization of semiconducting polymers, and joined IBM Research at the end of 2017 through a Feodor Lynen Postdoctoral Research Fellowship of the Humboldt Foundation. In early 2019, while working on conductive polymer-based sensors for volatile organic compounds (VOCs), he discovered his passion for programming and has since followed a self-guided learning path while working with Dr. Jim Hedrick and the team on organocatalytic polymerizations in flow reactors, carbonate monomer synthesis, upcycling of CO2, and automated sol-gel synthesis, partly involving AI model training. Since spring 2023, Tim has led the ChemChat project, an LLM-powered and cloud-deployed conversational assistant for material science and data visualization. In parallel, he is responsible for the Chemspeed synthesis robot, which was installed in spring 2024 and is planned to be integrated into a truly autonomous synthesis platform including AI optimization algorithms.