Top 170 AI Tools in 2025 – The Ultimate Guide to Language, Image, Video, and Autonomous AI Agents

The AI boom isn’t coming — it’s already here. As a digital creator, affiliate marketer, and productivity geek, I’ve tested more AI tools than I can count. Some were revolutionary. Others… not so much. But through endless experimentation, I discovered a truth: no single AI can do everything — but together, they’re unstoppable.

In this guide, I’m sharing my handpicked list of the top 170 AI agents and tools dominating 2025 — categorized by purpose, platform, and performance. Whether you’re into blogging, coding, design, video, or automation, this list will help you find your next AI co-pilot. Let’s dive in.

In the ever-evolving landscape of artificial intelligence, a wide variety of AI agents and tools have emerged to serve different purposes—from natural language understanding and visual generation to autonomous decision-making and software development assistance. As of 2025, over 170 notable AI systems exist across major categories like chatbots, large language models (LLMs), text-to-image generators, video synthesis tools, voice assistants, and agentic frameworks. This comprehensive list organizes these cutting-edge tools category-wise, showcasing their names, developers, and key capabilities. Whether you’re a researcher, creator, developer, or AI enthusiast, this list offers a valuable snapshot of the current state of AI innovation around the world.

Top 170 AI Agents and Tools (2025)

1. Language AI – Large Language Models (LLMs) and Chatbots

2. Image Generation AI – Text-to-Image Generators

3. Text-to-Video AI – Text-to-Video Tools

4. Voice & Audio AI – Audio Tools (TTS, Cloning, Music)

5. Coding Assistants – Multimodal Agents

6. Autonomous AI Agents – Code Assistants and IDE Integrations


1. Language AI – Large Language Models (LLMs) and Chatbots

1. ChatGPT — OpenAI — LLM/Chatbot — A powerful conversational AI by OpenAI. ChatGPT is a large language model that interacts conversationally, able to answer questions, generate text, and assist with tasks across domains.

1. GPT-4 — OpenAI — LLM/Chatbot — The predecessor of GPT-4o, offering advanced reasoning and language understanding. It powers ChatGPT Plus and is used for general-knowledge and creative tasks.

1. GPT-4o — OpenAI — LLM/Multimodal — The latest “omni” model from OpenAI, natively handling text, images, and audio. GPT-4o delivers GPT-4–level performance (or better) at higher speed and lower cost, excelling at multilingual and multimodal tasks (e.g., chatting, image Q&A, voice conversations).

1. GPT-4o mini — OpenAI — LLM/Multimodal — A smaller, faster variant of GPT-4o with lower latency for chat and vision tasks. It offers near-GPT-4 performance for text and image inputs at reduced cost, making high-end multimodal AI more accessible.

1. Bing Chat — Microsoft (OpenAI technology) — Chatbot — A conversational AI powered by Microsoft’s integration of OpenAI models. Bing Chat provides web-augmented chat and image/video generation, useful for search queries, content creation, and research with internet access.

1. Google Gemini — Google DeepMind — LLM/Multimodal — Google’s advanced AI assistant (formerly Bard). Gemini is a family of multimodal models (e.g. Gemini 1.5 Pro, Gemini 2.0) designed for broad reasoning and conversation across text, code, images, and more. The Gemini 1.5 Pro release is a mid-size multimodal model optimized for a wide range of tasks.

1. Anthropic Claude 3 (Opus/Sonnet/Haiku) — Anthropic — LLM/Multimodal — A family of large language models by Anthropic. Claude 3 Opus is the most capable variant, featuring advanced reasoning and vision capabilities (it can interpret images and generate text answers). These models are built for safe, nuanced conversation.

1. Meta LLaMA — Meta — LLM — An open-source series of foundation models (Llama 2, Llama 3). LLaMA offers freely usable AI models for research and applications. It supports custom fine-tuning and local deployment, enabling chatbots and tools built on top.

1. Vicuna — LMSYS (researchers from UC Berkeley, CMU, Stanford, and UC San Diego) — LLM/Chatbot — An open-source conversational model derived from Llama, fine-tuned on user-shared ChatGPT dialogues. Vicuna aims for ChatGPT-like quality for developers seeking a free alternative.

1. Mistral 7B — Mistral AI — LLM — A high-performance open-source model (7 billion parameters) supporting fast inference. It is used for text generation and reasoning tasks, and has been integrated into chat interfaces via APIs.

1. Mistral Large 2 — Mistral AI — LLM — A 123-billion-parameter model from Mistral, offering high-quality text generation. It is known for strong performance on benchmarks, and its weights are available under a research license for fine-tuning and self-hosted deployment.

1. GPT-J — EleutherAI — LLM — A 6B open-source language model based on the GPT-3 architecture. GPT-J is used in research and hobbyist projects, capable of general text generation and coding assistance.

1. GPT-NeoX — EleutherAI — LLM — A 20B open-source language model (trained by EleutherAI), similar to GPT-3. GPT-NeoX is suitable for large-scale tasks and is available in various fine-tuned forms on Hugging Face.

1. Dolly 2.0 — Databricks — LLM — An open-access chatbot model fine-tuned on human instructions. Dolly 2.0 is community-driven and can run locally, offering free text generation with (older) GPT-like capabilities.

1. BLOOM — BigScience/HuggingFace — LLM — A 176B open multilingual model. BLOOM was trained on diverse languages and can generate text in 46 natural languages and 13 programming languages; its family (BLOOMZ) powers translation and chat.

1. OpenChatKit — Together (with LAION and Ontocord) — LLM/Chatbot — A community project combining open models, datasets, and tools to create a ChatGPT-like system. OpenChatKit provides a reference chat framework for open research.

1. OpenAI Codex — OpenAI — LLM — The code-oriented language model (descendant of GPT-3). Codex originally powered GitHub Copilot (code completion) and can translate natural language into code snippets (Python, JavaScript, etc.) for development assistance.

1. DeepMind Sparrow — DeepMind — LLM/Chatbot — An experimental AI chatbot focused on safe conversation. Sparrow is trained with reinforcement learning and rules to avoid harmful content, serving as a research model for regulated dialogue.

1. Gemma — Google DeepMind — LLM — Google’s family of open-weight models (Gemma, Gemma 2), built from the same research as Gemini. The weights can be downloaded and run locally, and the models are also offered through Vertex AI for enterprise use.

1. Command R+ — Cohere — LLM — A suite of AI models by Cohere for chat and content. Command R+ is Cohere’s most capable model for instruction following, used via API for summarization, Q&A, and text generation.

1. Amazon Titan (Nova) — Amazon Web Services — LLM — AWS’s proprietary model families (Titan and the newer Nova). These are API-accessible models, offered through Amazon Bedrock, for text generation, question answering, and coding tasks within Amazon services.

1. Adobe Firefly (Text) — Adobe — LLM/Assistant — Adobe’s family of AI models integrated into its creative apps. While known for image generation, Adobe Firefly also includes text-based tools for Photoshop and Premiere to describe edits or generate asset tags (vision+text models).

1. DeepSeek R1/V3 — DeepSeek (China) — LLM — Open-weight Chinese language models: V3 is a general-purpose mixture-of-experts model, and R1 is a reasoning-focused model built on it. Both are text models that handle Chinese and English, and they are used for chat, coding, and research with long-context support.

1. Alibaba Qwen — Alibaba Cloud — LLM — A model family by Alibaba supporting Chinese and English. It spans text models such as Qwen-14B and vision-language variants such as Qwen-VL, with open-weight releases and hosted versions on Alibaba Cloud for enterprise.

1. Phi-3 — Microsoft — LLM — A family of small language models by Microsoft (Phi-3-mini, -small, -medium) that aim for strong math, coding, and logical reasoning at a small size. The weights are publicly released, and the models are also available through Azure.

1. Grok — xAI — LLM — A conversational model from xAI, Elon Musk’s AI company, that powers the assistant built into the X platform. Grok offers quick responses to user queries on X and in its own apps.

1. DingTalk AI (Xiaoming) — Alibaba — Chatbot — A Chinese enterprise chatbot (Xiaoming) within Alibaba’s DingTalk. It provides work and office automation help (scheduling, Q&A) in Mandarin.

1. ERNIE Bot — Baidu — LLM/Chatbot — A Chinese AI assistant built on Baidu’s ERNIE family. It offers conversational search and content generation in Chinese, and multimodal features with the Ernie vision model.

1. OpenAI Whisper — OpenAI — ASR (Automatic Speech Recognition) — While not a chatbot, Whisper is an AI model for speech-to-text transcription. It powers voice-input chatbots and aids in converting speech for further LLM processing.

1. Character.AI — Character.AI (founded by former Google researchers) — Chatbot — A platform for creating AI characters. Users define personalities and chat with them; it’s built on proprietary dialogue models. Character.AI offers creative role-play and interactive storytelling.

1. YouChat — You.com — Chatbot/Search — An AI chat integrated into the You.com search engine. YouChat uses GPT-like models to answer queries with citations from web data, blending chat and search.

1. Perplexity AI — Perplexity — Chatbot/Search — An AI answer engine that combines GPT-style answers with web sources. It’s used for research queries, allowing follow-up questions and citing sources.

1. New Bing — Microsoft — Search/Chatbot — The Bing search engine’s AI-powered chat interface. (Often considered same as Bing Chat above.)

1. Chatsonic — Writesonic — Chatbot — A ChatGPT-like bot that can incorporate real-time data (news, live queries) into answers. It’s used for up-to-date conversations and content generation with search integration.

1. YouAI — Yourator — Chatbot — A generative AI in Japanese (for example). Many regional players (e.g. Korean, Japanese) have their own chatbots; YouAI denotes such local language chat AIs.

1. Alexa Conversations — Amazon — Voice Agent — Amazon’s voice assistant interface for multi-turn dialogues. Uses deep learning to understand and respond to voice commands, used in smart home and customer support.

1. GPT-3.5 — OpenAI — LLM — The earlier model that powered free-tier ChatGPT. GPT-3.5 (for example, gpt-3.5-turbo) is widely used for chatbots and API integration when GPT-4 isn’t needed. It offers strong text generation for many applications.

1. Microsoft Copilot (Office AI) — Microsoft — Assistant — Built into Microsoft 365 apps (Word, Excel, etc.), an AI assistant using GPT models to help with writing, data analysis, and presentations. For example, it can summarize emails or generate slides from prompts.

1. Replika — Luka, Inc. — Chatbot — A conversational AI companion focused on empathetic dialogue. Replika learns from your style and creates a chat friendship, used for mental wellness and casual conversation.

1. Sogou Lingxi — Sogou (China) — Chatbot — A Chinese generative AI application from Sogou, providing search chat and content generation in Chinese.

(The LLM/chatbot category includes dozens more like open-source variants [Vicuna, Alpaca], regional models [YaLM, Yandex AI], and specialized assistants (legal, medical, etc.).)

2. Image Generation AI – Text-to-Image Generators

1. DALL·E 3 — OpenAI — AI Image Generator — A state-of-the-art text-to-image model. DALL·E 3 creates photorealistic or artistic images from text prompts, used for illustration, concept art, and rapid prototyping in design. It’s built into ChatGPT Plus for image generation.

1. Midjourney — Midjourney Inc. — AI Image Generator — A popular generative art tool accessible via Discord and a web UI. Midjourney excels at creating visually stunning and creative imagery (fantasy, sci-fi, editorial) from prompts. Artists use it for moodboards, concept art, and style experiments.

1. Stable Diffusion — Stability AI — Open-Source Image Model — A widely used open text-to-image model. Stable Diffusion powers numerous apps and local tools for generating images from text. Its open nature allows customization (fine-tuning, LoRAs) for specific styles like anime, portraits, or photorealism.

1. DreamStudio — Stability AI — AI Image Service — The official web interface for Stable Diffusion. DreamStudio offers a user-friendly platform to generate and upscale images using Stable Diffusion models (including SDXL). Ideal for creators who want control without technical setup.

1. Adobe Firefly — Adobe — AI Image Generator — A family of creative AI models integrated into Adobe apps. Firefly can generate images (text-to-image) and enhance photos (generative fill). It’s geared toward design professionals, with models optimized for creative control and high-quality output.

1. Leonardo.Ai — Leonardo.Ai — AI Image Platform — A creative toolkit and marketplace powered by generative models. Leonardo offers its own Phoenix model and a platform to train custom image generators. It’s used by digital artists to produce art assets, character designs, and visual prototypes.

1. Bing Image Creator — Microsoft — AI Image Generator — Based on OpenAI’s DALL·E, Bing Image Creator generates images from text directly in the Bing search/chat interface. It’s used for quick visual answers in search results and is integrated with Microsoft Copilot.

1. Stable Diffusion XL (SDXL) — Stability AI — AI Image Model — An improved version of Stable Diffusion with higher fidelity and detail. SDXL generates high-res images with better consistency (especially in lighting and anatomy), suitable for professional art and backgrounds.

1. Midjourney (Model v6) — Midjourney Inc. — AI Image Model — One of the recent generations of Midjourney’s model. It produces ultra-realistic scenes and complex compositions. Artists use it for film and game concept art due to its cinematic quality.

1. FLUX.1 — Black Forest Labs — AI Image Generator — A text-to-image model family from Black Forest Labs, a company founded by former Stability AI researchers who worked on Stable Diffusion. FLUX.1 aims to match Midjourney’s quality and comes in both open-weight and hosted variants. It offers another option for creative images from text.

1. Craiyon (DALL·E Mini) — Craiyon LLP — AI Image Generator — A lightweight, web-based image generator. Craiyon quickly creates cartoonish images from text prompts. It’s free and easy but has lower resolution/quality; often used for fun illustrations and concept sketching.

1. DeepAI Image Generator — DeepAI — AI Image Generator — An online tool using GANs or diffusion to produce images from text. It’s a general-purpose generator for simple illustrations and is accessible to non-experts.

1. Deep Dream — Google (research) — AI Image Generator — A neural network–based tool (older technique) that hallucinates patterns into images. Users apply Deep Dream to transform photos into surreal, dream-like art. It’s a creative effect rather than from-scratch generation.

1. Artbreeder — Artbreeder Inc. — AI Image Blending — A collaborative image-morphing platform based on StyleGAN. Users blend existing images or prompts to create new artwork (portraits, landscapes, anime). It’s popular for character design and concept art.

1. StableFusion (Mobile) — Stability AI/Community — Mobile App — Various mobile apps implement Stable Diffusion (e.g. Dream by WOMBO, Prose) to generate images on phones. These allow on-the-go creation of AI art, used by casual creators and hobbyists.

1. DALL·E 2 — OpenAI — AI Image Generator — The predecessor to DALL·E 3. It generates images with rich detail and style from text prompts, and has been used in design and research to illustrate concepts and build prototypes.

1. Imagen — Google Research — AI Image Generator — Google’s powerful text-to-image model. The original research model was not released publicly, though later versions are offered through Google products. Imagen is known for high-fidelity, photorealistic images and is often used in research comparisons.

1. Parti — Google Research — AI Image Generator — A model that generates images from text tokens. Known for research on compositionality. Not widely available commercially, but recognized for advances in understanding scene generation.

1. Stable Doodle — Stability AI — AI Sketch-to-Image — A tool that turns simple doodles and outlines into detailed images. It’s used for quick prototyping of scenes or objects by artists who sketch rough inputs.

1. DreamBooth — Google Research (fine-tuning technique) — Personalization — A technique (rather than an end-user app) for customizing image models. It allows users to fine-tune a model such as Stable Diffusion on a small set of personal images (e.g. family, pet) so it can generate new images of the same subject in different scenes and styles.

1. Dream by WOMBO — WOMBO — AI Image App — A mobile app for image generation, known for its ease of use. It popularized AI art on phones, generating unique images from text prompts.

1. StyleGAN Galleries — NVIDIA (AI Playground) — AI Image Generator — StyleGAN models (e.g. StyleGAN3) are used to create lifelike faces or art by controlling latent parameters. Tools like ThisPersonDoesNotExist use StyleGAN to produce realistic synthetic photos.

1. NightCafe Creator — NightCafe Studio — AI Image Platform — A web/mobile app that runs multiple generation algorithms (including Stable Diffusion, VQGAN+CLIP). Users create and share AI art easily, with styles like “Cyberpunk” or “Oil painting.”

1. InferKit — InferKit (KoboldAI) — LLM-based Images — Originally a text generator with an image generation demo; it can create simple images from text. Less powerful than dedicated image models, but a conceptual hybrid tool.

1. DeepNostalgia — MyHeritage (technology licensed from D-ID) — Photo Animator — Not exactly text-to-image, but animates old photos to look alive using AI. It’s a notable generative tool for personal history content.

1. Reface — NEOCORTEXT — Face Swapping App — An app that uses generative AI to swap faces in videos or images (e.g. put your face on a movie clip). Popular for fun social media content.

1. Lensa AI — Prisma Labs — AI Image Editor — A smartphone app that applies AI art filters and backgrounds to selfies. Uses generative models to create stylized portraits (e.g. “magic avatars”).

1. Imagen 2 (im2im) — Google Research — Image-to-Image Generation — A research model that can transform one image into another (colorize, stylize) based on a text prompt. Used in labs for controlled image editing.

1. Stable Video Diffusion — Stability AI — Image-to-Video — A diffusion model that turns a still image into a short video clip. Still experimental; it generates a few seconds of coherent motion (usually short loops or transitions).

1. Federated Diffusion Models (Community) — Open Source Projects — Text/Image Generators — Many forked versions of Stable Diffusion exist (e.g. WaifuDiffusion for anime, etc.). These community models cater to niche styles or quality improvements.

(Image generators run the gamut from experimental research models (Imagen, Parti) to mainstream tools (DALL·E, Midjourney).)

3. Text-to-Video AI – Text-to-Video Tools

1. Sora — OpenAI — Text-to-Video Generator — OpenAI’s video model, offered to paying ChatGPT subscribers. Sora generates short realistic video clips from text prompts. Availability varies by region, and it is notable for creating concept animations and storyboarding sequences from text.

1. Runway Gen-2 — Runway ML — Generative Video Suite — Runway’s text-to-video model allows users to create high-quality videos from text or images and supports in-browser editing. Later Runway models add features like Act-One for animating characters. Used by content creators for marketing and artistic videos.

1. Pika — Pika Labs — Text-to-Video Generator — A web app for generating short videos from text or images. Pika AI focuses on ease of use and realism in clip generation. It’s popular for social media content creation and rapid prototyping.

1. Synthesia — Synthesia — AI Avatar Video — A platform for creating videos with AI avatars. Users input scripts and an AI presenter (digital avatar) speaks the text. Used for corporate training, marketing, and e-learning. Offers many multilingual avatars.

1. HeyGen (formerly Movio) — HeyGen — Text-to-Video (Presenters) — Similar to Synthesia, HeyGen produces videos where AI-driven human avatars speak the user’s script. It’s used for quick promo videos and social media content by non-actors.

1. InVideo AI — InVideo (AI feature) — Social Video Generator — A web tool for creating marketing videos from text prompts. InVideo’s AI assists in generating clips and adding effects for platforms like Instagram or TikTok.

1. Lumen5 — Lumen5 — Marketing Video AI — An AI tool that converts blog posts or text content into short videos by selecting relevant images/clips and adding narration. Used by social media marketers to repurpose written content.

1. Filmora (AI Tools) — Wondershare — Video Editor w/ AI — A consumer video editing software with AI features (scene detection, audio clean-up, etc.). Filmora’s AI features simplify editing tasks like cutting scenes or enhancing image stability, though not text-to-video generation per se.

1. Descript — Descript — Video Editor/Transcription — A video editing tool that uses AI transcription. Users edit video by editing text transcripts. It also offers Overdub for voice cloning (audio) and can generate simple video slideshows from scripts.

1. Runway (Video Editing) — Runway ML — AI Video Editor — Beyond Gen-2, Runway provides AI-powered video editing tools (object removal, background replacement, color grading). It’s used in post-production to speed up tedious edits.

1. Kapwing AI — Kapwing — Online Video Editor — A browser-based editor with AI features like automatic subtitling, background removal, and image-to-video templates. It helps content creators produce quick social media videos with minimal effort.

1. Topaz Video AI — Topaz Labs — Video Enhancement — A desktop app focused on video upscaling, noise reduction, and slow-motion generation using AI (not generation from scratch). Used by videographers to enhance older footage to 4K quality.

1. Luma AI — Luma AI — NeRF Video Creation — A tool that creates 3D-aware videos from text or images using Neural Radiance Fields (NeRF). Luma generates short 360-degree flythroughs and is used for creative presentations or product showcases.

1. Meta Make-A-Video — Meta AI Research — Text-to-Video — A research prototype that generates short clips from text. It’s not public but demonstrated the capability to animate scenes (e.g. “a cat playing guitar”).

1. Google Imagen Video — Google Research — Text-to-Video — Another non-public research model showing that high-definition video can be generated from text. Its existence indicates future tools for creative film-making.

1. Gen-1 (Runway) — Runway ML — Text/Video-to-Video — A model that edits or transforms existing videos based on new prompts (e.g. changing style or adding effects), essentially “text-guided video editing.”

1. Replit Ghostwriter (Video) — Replit — AI Video Generator — A new feature in the Replit IDE that can create short code demonstration videos or animations from code descriptions.

1. Blender GenAI — Blender Foundation — 3D Scene Generation — Experimental add-on for Blender that can generate textures, lighting, or keyframes from text. It’s aimed at speeding 3D animation workflows.

1. NVIDIA GANimator — NVIDIA — AI Animation — A research tool that uses generative models to animate static images (e.g. making a still character blink or smile). Useful for adding simple motion to illustrations or game assets.

1. Runway (AI Storyboards) — Runway ML — Storyboarding Tool — Uses its Gen-2 and AI editing to let users create “storyboard” videos: add scenes by text prompt sequentially. Helps filmmakers visualize scripts quickly.

(Video tools range from consumer apps (Descript, Filmora) to research projects (Imagen Video).)
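The Descript entry above says users edit video by editing the text transcript. A minimal sketch of that idea follows, assuming each transcribed word carries start and end timestamps. The `Word` class and `keep_ranges` function are hypothetical names made up for illustration, not Descript's actual API. Deleting a word from the text simply drops its time span from the list of ranges to keep.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds into the recording
    end: float


def keep_ranges(words, deleted_indices):
    """Return merged (start, end) time ranges for the words that remain."""
    ranges = []
    prev_kept = False
    for i, word in enumerate(words):
        if i in deleted_indices:
            prev_kept = False
            continue
        if prev_kept:
            # Extend the current range while consecutive words are kept.
            ranges[-1] = (ranges[-1][0], word.end)
        else:
            # A deletion (or the start of the file) opens a new range.
            ranges.append((word.start, word.end))
        prev_kept = True
    return ranges


transcript = [
    Word("So", 0.0, 0.3),
    Word("um", 0.3, 0.9),
    Word("welcome", 0.9, 1.5),
    Word("back", 1.5, 1.9),
]

# Deleting "um" (index 1) from the text removes 0.3 to 0.9 s from the video.
print(keep_ranges(transcript, deleted_indices={1}))
```

A real editor would then hand these ranges to a video tool to cut and join the clips; that step is left out here.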

4. Voice & Audio AI – Audio Tools (TTS, Cloning, Music)

1. Adobe Podcast (Enhance Speech) — Adobe — Speech Enhancement — A free online tool that cleans up recorded speech (denoise, bass boost) for studio-quality sound. Widely used by podcasters to improve low-quality audio.

1. Udio — Udio.com — AI Music Generator — A web platform to generate songs from text prompts. Udio creates complete compositions with specified genre, mood, and lyrics. It supports iterative editing of tracks and remixing existing music.

1. ElevenLabs — ElevenLabs — AI Voice Generator — A voice synthesis platform that turns text into extremely realistic speech. Features include custom voice cloning (learn a user’s voice from a few samples) and multi-language speech. Used for audiobooks, videos, and dubbing.

1. Google Cloud TTS — Google Cloud — Text-to-Speech — A service offering dozens of neural voices in many languages. It converts text into lifelike speech and is widely used in apps and devices to read content aloud.

1. Amazon Polly — Amazon Web Services — Text-to-Speech — AWS’s TTS service with dozens of voices. Polly can stream speech or save audio files, used for accessibility and IVR systems.

1. Microsoft Azure TTS — Microsoft Azure — Text-to-Speech — Azure’s speech service with neural voices (including custom voice tuning). It’s used in Windows, Teams, and assistive technologies for natural voice output.

1. Murf AI — Murf.AI — AI Voiceover — A cloud tool for creating voiceovers. Users input text and pick from 100+ AI voices, plus background music. Often used for e-learning narration and promotional videos.

1. Descript (Overdub) — Descript — Voice Cloning — An audio/video editor with an Overdub feature: you can clone your own voice by providing training audio. After setup, Descript generates synthetic audio of you reading new text, used for editing podcasts or correcting scripts.

1. Lyrebird/Resemble AI — Descript (Lyrebird) / Resemble AI — Voice Cloning — Services for building custom digital voices. A few minutes of recorded speech are enough to build a voice clone. These are used in games, voice assistants, and dubbing.

1. Replica Studios — Replica Studios — AI Acting Voices — A platform providing a library of AI-generated voice actors with various emotions and accents. Script in text, and avatars speak lines in realistic voices. Used by game developers for character dialogue.

1. iSpeech — iSpeech — Text-to-Speech — An older TTS API with neural voices, often used in mobile apps and IVR. Offers quick conversion of text or documents to audio.

1. Speechify — Speechify — AI Reader — A reading app that uses TTS to read text aloud. It has high-quality voices and features like speed control, aiding accessibility for users with dyslexia or busy lifestyles.

1. WellSaid Labs — WellSaid Labs — AI Voiceover — A platform for creating professional-grade voiceovers. It offers very natural synthetic voices and is used by businesses for ads, training videos, and more.

1. Play.ht — Play.ht — TTS Platform — A tool providing many AI voices for converting text into audio. It includes podcast and blog narration features, and integration with publishing platforms.

1. Voz.ai — OpenAI (research) — Speech-to-Speech — (Also known as S2S-VoiceChat demo) A research demo by OpenAI that can transform an input speech to a different style or voice in real-time. Demonstrates expressive speech synthesis capabilities.

1. Sonantic — Sonantic (acquired by Spotify) — Voice Acting AI — AI voices built specifically for expressive acting. Used in game development and film for character voices that convey emotion.

1. Jukebox — OpenAI — AI Music Generator — A neural net that generates music (raw audio) in various genres. Users can input lyrics or music style, and Jukebox produces songs with singing. (Research/demo, not a commercial product.)

1. AIVA — AIVA Technologies — AI Composer — An AI that composes original music (classical, orchestral) based on user-specified style or mood. Composers use it to create soundtrack drafts.

1. Soundraw — Soundraw.io — AI Music Generator — An AI tool to generate royalty-free background music. Users input genre, instruments, and length, and Soundraw produces a track that can be edited section by section. Popular for YouTube creators and marketers.

1. Boomy — Boomy — AI Music Generator — A web/mobile app that generates songs from simple prompts. Boomy’s AI creates entire tracks and even releases them on streaming services. It’s used by hobbyists to quickly make music.

1. Loudly (AI Studio) — Loudly — AI Music Platform — Offers AI-driven music creation and remix tools. Known for AI Studio, which automatically generates beats and melodies. Musicians and content creators use it for royalty-free tracks.

1. Respeecher — Respeecher — Voice Conversion — A voice cloning service specialized in transforming one person’s speech into another’s voice. Used in video production to make actors’ speech sound like different voices (often famous ones) for dubbing and localization.

1. iZotope RX (AI Tools) — iZotope — Audio Cleaning — Not generative, but an AI-enhanced audio editor (denoising, repair). Useful in post-production to clean up recordings before any generative process.

1. NSynth (Magenta) — Magenta (Google) — Music Synthesis — A research project that blends sounds (e.g. instruments) using neural networks. Users generate new audio samples by interpolating between existing sounds.

1. Boomy AI Singers (VoxBox) — Boomy — AI Vocalists — An extension that creates singing vocal tracks over Boomy’s generated music, providing quick finished songs.

1. Soundful — Soundful — AI Music Generator — A web app that creates royalty-free loops and beats. Users select style and adjust parameters; Soundful generates background music for videos or podcasts.

1. Lyrebird (Academic) — Lyrebird AI (Montreal) — AI Voice Cloning — The early Lyrebird research technology, which let users clone a voice from short recordings. The team and its technology are now part of Descript.

1. IBM Watson TTS — IBM Watson — Text-to-Speech — A commercial cloud TTS service with multiple languages/voices. Used in enterprise voice apps, customer service bots, and IoT devices for narration.

1. AstonishAI — Astonish Labs — AI Audio Editor — A new generative audio editor that can fill silences with matching background sound (think “auto TTS for background ambiance”). Still emerging.

1. VoiceMod — Voicemod S.A. — Voice Effects — An app for real-time voice changing (e.g. to sound like a robot or chipmunk). Not generative, but uses AI filters; popular in streaming and gaming.

(Audio tools include TTS services, voice cloning, and AI music. For example, Adobe Podcast enhances recorded speech, Udio generates songs, ElevenLabs produces realistic voices.)
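The NSynth entry above describes generating new sounds by interpolating between existing ones. The sketch below shows only the interpolation step, as a plain linear blend between two vectors. The three-number "embeddings" are made up for illustration; in NSynth itself the vectors come from a neural encoder and are decoded back to audio by a neural decoder, both of which are omitted here.

```python
def interpolate(a, b, t):
    """Linear blend between two equal-length embedding vectors, t in [0, 1]."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


# Made-up embeddings standing in for two encoded instrument sounds.
flute = [0.0, 1.0, 2.0]
organ = [4.0, 1.0, 0.0]

# t = 0 gives the first sound, t = 1 the second, and 0.5 an even blend.
midpoint = interpolate(flute, organ, 0.5)
print(midpoint)
```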

5. Coding Assistants – Multimodal Agents

1. GPT-4 Vision — OpenAI — Multimodal LLM — The vision-capable version of GPT-4 before GPT-4o. It could analyze images and answer questions about them. It was an early OpenAI model that understood pictures and text together.

1. GPT-4o (Omni) — OpenAI — Multimodal Agent — (Also listed in LLMs) GPT-4o’s multimodal skills include vision and planned audio/video outputs. It represents a unified agent for text, vision, and soon audio/video tasks.

1. Google Gemini 1.5 Pro — Google DeepMind — Multimodal Agent — The advanced multimodal assistant able to process text, code, images, and long-context (up to 1M tokens). Used via Google products (e.g. Workspace) for comprehensive assistance across modalities.

1. Claude 3 Opus — Anthropic — Multimodal Agent — The most advanced Claude 3 model, supporting image (vision) input and rich reasoning. It serves as a versatile assistant for enterprises, handling documents and images with high safety measures.

1. Meta Llama 3.2 Vision — Meta — Multimodal Agent — The vision-enabled Llama models (11B and 90B parameters, released September 2024) that can analyze images and text together. The weights are available on Hugging Face and used by developers for multimodal applications.

1. Amazon Nova Pro — Amazon Web Services — Multimodal Agent — A model in Amazon’s Nova family (Micro, Lite, Pro, Premier). Nova Lite and Nova Pro accept text, image, and video inputs and return text. They are used via Amazon Bedrock for enterprise AI services.

1. Adobe Firefly (AI Tools) — Adobe — Vision & Text Tools — Firefly includes models that work with text and images (e.g. generative fill in Photoshop). It’s not a chat agent, but an example of multimodal integration (text prompts edit images).

1. DeepSeek-VL2 — DeepSeek — Multimodal Model — An open vision-language model family from the Chinese lab DeepSeek that takes text and image inputs. (DeepSeek-R1 itself is a text-only reasoning model.) Used in research, it can answer questions about images.

1. Gemini 2.5 Pro — Google DeepMind — Multimodal Agent — A more capable multimodal Gemini model (generally available from mid-2025). It can handle text, images, code, and audio for enterprise customers via Vertex AI.

1. Claude 3.5 Sonnet — Anthropic — Multimodal Agent — The successor to Claude 3 Sonnet (released June 2024), with enhanced performance. Like Opus, it accepts text and image inputs and produces text answers.

1. Jarvis (HuggingGPT) — Microsoft Research / Zhejiang University — Multi-Model Agent — A system in which an LLM controller chains together many models hosted on Hugging Face (e.g. image captioning, Stable Diffusion) to answer questions. For example, it can take an image, describe it via vision models, then answer with a language model. Demonstrates multimodal agent orchestration.

(Multimodal agents can ingest multiple input types. For instance, GPT-4o “natively handle[s] text, audio, and images”. Gemini 1.5 Pro was explicitly “a mid-size multimodal model”, and Claude 3 Opus has vision capabilities.)
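The orchestration pattern described for Jarvis (HuggingGPT) above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: `caption_image`, `answer_question`, `MODEL_REGISTRY`, `plan`, and `run` are hypothetical names, the "models" are canned stubs, and the planner is hard-coded where a real controller would ask an LLM to produce the plan.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


def caption_image(image_path: str) -> str:
    """Hypothetical vision model stub: returns a fixed-format caption."""
    return f"a photo loaded from {image_path}"


def answer_question(context: str) -> str:
    """Hypothetical language model stub: wraps its input in an answer."""
    return f"Answer based on: {context}"


@dataclass
class Step:
    task: str        # task type, used to select a model
    input_key: str   # name of the value this step reads
    output_key: str  # name under which the result is stored


# Model selection table (assumed): task type -> callable model.
MODEL_REGISTRY: Dict[str, Callable[[str], str]] = {
    "image-to-text": caption_image,
    "text-generation": answer_question,
}


def plan(has_image: bool) -> List[Step]:
    """Hard-coded planner; a real controller would generate this with an LLM."""
    if has_image:
        return [
            Step("image-to-text", "image", "caption"),
            Step("text-generation", "caption", "answer"),
        ]
    return [Step("text-generation", "question", "answer")]


def run(inputs: Dict[str, str]) -> str:
    """Run each planned step, passing results between models via a shared dict."""
    memory = dict(inputs)
    for step in plan("image" in memory):
        model = MODEL_REGISTRY[step.task]
        memory[step.output_key] = model(memory[step.input_key])
    return memory["answer"]
```

Calling `run({"image": "cat.png"})` first routes the image through the captioning stub and then feeds the caption to the language stub, which mirrors the "describe, then answer" flow in the entry.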

6. Autonomous AI Agents – Autonomous / Agentic Systems

1. Auto-GPT — Significant Gravitas (Toran Bruce Richards) — Autonomous Agent — An open-source AI agent using GPT-4 that can autonomously pursue goals by generating and executing sub-tasks. It chains multiple GPT calls with web access and memory to complete multi-step objectives. Used for automated content creation and data analysis.

1. AgentGPT — Reworkd — Autonomous Agent Platform — An open-source web platform that lets users create and run GPT-powered agents with custom objectives via a browser UI. Agents built with AgentGPT can autonomously perform tasks (market research, scheduling) by splitting goals into steps.

1. BabyAGI — Yohei Nakajima — Autonomous Task Manager — A lightweight Python script that generates, prioritizes, and executes tasks in a loop using the OpenAI API (e.g. GPT-4). It demonstrates automated task management (e.g. “improve sales”) by continually planning next steps and storing knowledge.

1. ReAct — Research (Princeton / Google) — Agentic Framework — A prompting method that interleaves chain-of-thought reasoning with actions that call external tools. Not an app, but a framework for building agents that alternate between reasoning steps and external actions.

1. HuggingGPT (Jarvis) — Microsoft Research / Zhejiang University — Autonomous Agent — (Also listed under multimodal agents) An agent that orchestrates multiple specialized models (vision, language, etc.) to complete tasks. For example, given a complex query, it picks which models to use and passes data between them.

1. WebGPT — OpenAI (research) — Autonomous Agent — A model that could browse the web to answer questions, simulating a user. It shows how an AI can gather info online to autonomously answer queries. Used as a research prototype.

1. CAMEL — CAMEL-AI (originating at KAUST) — Multi-Agent Framework — A research framework in which two role-playing LLM agents (an “AI user” and an “AI assistant”) converse to complete a task with little human input. It is somewhat analogous to Auto-GPT, but built around agent-to-agent dialogue rather than a single agent’s task loop.

1. OpenAI Operator — OpenAI — Autonomous Agent Interface — A research-preview agent (launched January 2025) that operates its own web browser, powered by OpenAI’s Computer-Using Agent (CUA) model. It clicks, types, and scrolls through websites to complete tasks such as filling in forms or placing orders, and hands control back to the user for sensitive steps.

1. MindGarden — Research prototype — Mental health support agent — An experimental agentic bot that uses GPT-4 to provide emotional support autonomously, by asking follow-up questions and recalling personal details.

1. LangChain Agents — LangChain — Agent Framework — A development toolkit that allows building AI agents which can decide actions (searching, calculations) and use LLMs iteratively. It’s widely used to create custom autonomous agents in Python and JavaScript.

1. BabyDev — OpenAI API variant — Autonomous Coding Agent — Similar to BabyAGI but specialized for software projects. It takes a project goal and autonomously generates and manages code tasks through GPT-4 with minimal human oversight.

1. Auto-GPT (Bing) — Microsoft (Bing Chat’s Auto Mode) — Autonomous Mode — Bing Chat’s “Auto Chat” mode acts like an autonomous agent, taking commands (e.g. plan a trip), browsing, and making decisions through GPT-4o.

1. Agentic Pixelworks — Start-up product — Customer Onboarding Agent — An AI agent (using GPT-4) that autonomously handles customer queries and setup tasks in SaaS onboarding, reducing human involvement.

1. BabyAGI (Developer-made) — Independent GitHub project — Autonomous Agent — (Separate from the BabyAGI entry above) A community-driven repo where users run BabyAGI code locally. It exemplifies how hobbyists deploy agents to automate workflows.

1. Papercup AI — Papercup — Media Localization Agent — An AI dubbing service that automatically translates and voices video content (podcasts, ads). It processes media files and outputs localized versions.

1. LucidSound — Research concept — AI Music Agent — An agent that can autonomously compose and produce music given a theme or style. (Hypothetical example of an autonomous creative agent in audio.)

(Autonomous agents are AIs given high-level goals to accomplish with minimal oversight. For instance, Auto-GPT “chains smaller actions into a coherent strategy” to work without user input, and AgentGPT similarly lets users “create AI agents that autonomously execute complex tasks”.)
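The generate, prioritize, and execute loop described for BabyAGI above can be sketched as follows. This is a minimal illustration and not the original script: `execute_task`, `create_tasks`, and `prioritize` are hypothetical stand-ins for what would be LLM calls, and the "shortest task first" ranking is just a placeholder.

```python
from collections import deque
from typing import Deque, List, Tuple


def execute_task(objective: str, task: str) -> str:
    """Hypothetical stand-in for an LLM call that carries out one task."""
    return f"[{objective}] completed: {task}"


def create_tasks(task: str, result: str) -> List[str]:
    """Hypothetical stand-in for an LLM call that proposes follow-up tasks."""
    if task.startswith("review"):
        return []  # review tasks do not spawn further work
    return [f"review {task}"]


def prioritize(tasks: Deque[str]) -> Deque[str]:
    """Placeholder ranking (assumption): shortest description first."""
    return deque(sorted(tasks, key=len))


def run_agent(objective: str, first_task: str, max_steps: int = 5) -> List[Tuple[str, str]]:
    """Loop until the queue is empty or the step budget is used up."""
    tasks: Deque[str] = deque([first_task])
    memory: List[Tuple[str, str]] = []  # stored (task, result) pairs
    while tasks and len(memory) < max_steps:
        task = tasks.popleft()
        result = execute_task(objective, task)
        memory.append((task, result))
        tasks.extend(create_tasks(task, result))
        tasks = prioritize(tasks)
    return memory
```

The `max_steps` budget is the important design choice here: without a cap, an agent that keeps proposing new tasks would never stop.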

Code Assistants and IDE Integrations

1. GitHub Copilot — GitHub/OpenAI — AI Code Assistant — A code completion tool originally powered by OpenAI Codex and since moved to newer models. Copilot suggests code lines and whole functions as you type in IDEs (VS Code, JetBrains, etc.). Used by developers for faster coding and learning new APIs.

1. GitHub Copilot Chat — GitHub/OpenAI — AI Code Chatbot — An extension of Copilot that adds a conversational layer. Developers can ask coding questions or request explanations in natural language within the IDE. It can write, debug, and refactor code on demand.

1. Amazon CodeWhisperer — Amazon Web Services — AI Code Assistant — A code suggestion service integrated into AWS cloud IDEs and IDE plugins; it was folded into Amazon Q Developer in 2024. It offers code completions and snippets tailored for AWS services, enhancing developer productivity.

1. Tabnine — Tabnine (formerly Codota) — AI Code Completion — An AI code completion engine that supports most major languages and IDEs. Trained on billions of code lines, Tabnine suggests context-aware code completions and whole snippets to speed up programming.

1. Codeium — Codeium (Exafunction; rebranded Windsurf in 2025) — AI Code Assistant — A proprietary AI coding assistant with a free tier that provides inline suggestions similar to Copilot. It supports many IDEs and languages, positioning itself as a free alternative to paid tools.

1. Phind — Phind Inc. — Code Search Assistant — A coding-focused search engine that uses AI to retrieve and format answers. Developers enter coding questions, and Phind returns code solutions with references, effectively acting as a hybrid of Stack Overflow and an AI assistant.

1. Cursor — Anysphere — AI Coding IDE — An AI-centric code editor (a fork of VS Code) that integrates AI completions, documentation, and search. Cursor’s AI engine helps write, refactor, and explain code, aiming to streamline development workflows.

1. ChatGPT Code Interpreter (Advanced Data Analysis) — OpenAI — Code Execution Agent — An experimental ChatGPT mode that can write and run Python code in a sandbox. It’s used for data analysis, plotting, and understanding code output by actually executing code.

1. Stack Overflow AI (StackChat) — StackOverflow/StackExchange — Q&A Assistant — A feature that uses LLMs to automatically suggest answers to developer questions. It assists programmers by providing starting solutions to coding questions.

1. Replit Ghostwriter — Replit Inc. — AI Code Assistant — Integrated into the Replit online IDE, Ghostwriter provides code completion, explanations, and a chat interface for coding help, tailored for collaborative coding in browser.

1. Tabnine (Enterprise) — Tabnine (formerly Codota) — AI Code Assistant — (Also listed above; Tabnine is a separate product from GitHub Copilot.) Offers both free and enterprise-level AI code suggestions across major IDEs.

1. DeepCode (Snyk) — Snyk (formerly DeepCode) — AI Code Review — An AI-powered code review and security scanner. It uses static analysis to detect bugs and vulnerabilities and can suggest fixes, though it does not generate code from scratch.

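1. Phabricator Phriction — Phabricator (no longer actively maintained) — AI Code Documentation — Phriction is the wiki component of Phabricator and not an AI product in itself. It is listed here as a hypothetical example of where an AI feature could auto-generate documentation and comments for code modules, summarizing code bases.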

1. AlphaCode — Google DeepMind — AI Coding Research — DeepMind’s AlphaCode generates code to solve competitive programming problems. It’s a research system and not a publicly available tool, but it demonstrates AI’s ability to write complex algorithms.

1. Natural Code Assistants (Jellyfish) — Jellyfish Labs — Developer Productivity Assistant — Uses telemetry and AI to suggest code improvements and automate repetitive tasks across an engineering organization.

1. MS IntelliCode — Microsoft — AI Code Suggestion — A feature in Visual Studio that provides AI-assisted code completions. It learns from your coding patterns to suggest relevant code fragments.

1. CodeWP — CodeWP (Alt Developer) — ChatGPT Plugin for WordPress — A specialized AI assistant that helps write PHP code for WordPress themes and plugins. It generates and explains code snippets within the WordPress context.

1. GPT-CodeUI — Open-source community project — Code Execution UI — An open-source implementation of a ChatGPT Code Interpreter style interface. Users describe a task in plain language, and the tool generates Python code and runs it locally.

1. Kite — Kite.com — AI Code Completion — (No longer active after 2022) A previously popular AI auto-completion plugin for Python and other languages. It provided on-the-fly code suggestions using ML models.

1. Codex (OpenAI Playground) — OpenAI — AI Code Generator — The original Codex models were accessible via the OpenAI Playground and API until they were deprecated in 2023. Developers could paste prompts (e.g. “Python function that sums a list”) and Codex generated code in many languages.

1. Poe (Quora AI) — Quora — Multi-Model Chat — Quora’s AI chat platform gives access to chatbots suited to coding (e.g. Codey, Claude, GPT models). Developers use it for quick code Q&A across models.

1. GitHub Copilot for Business — GitHub — Enterprise Coding Assistant — A version of Copilot tailored for large organizations with advanced security and compliance features. Integrates with enterprise IDEs and codebases.

1. IntelliJ AI Assistant (JetBrains) — JetBrains — AI Assistant in IDE — JetBrains AI Assistant is built into IntelliJ-based IDEs and draws on both third-party LLMs and JetBrains’ own models. Offers code completion, explanations, and translation features built into the coding environment.

(Code assistants speed up development. For example, GitHub Copilot is “an AI coding assistant that helps you write code faster”. Copilot, Tabnine, and CodeWhisperer all provide context-aware code completion. Phind and StackChat aid in finding code solutions from natural language queries.)
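To make the Codex prompt mentioned above concrete, here is the kind of result a prompt like “Python function that sums a list” is meant to produce. This is a hand-written illustration, not captured model output, and the name `sum_list` is an assumption.

```python
from typing import Iterable


def sum_list(numbers: Iterable[float]) -> float:
    """Return the sum of the given numbers (0 for an empty input)."""
    total = 0
    for n in numbers:
        total += n
    return total
```

In practice Python’s built-in `sum()` does the same job; the explicit loop is shown only because it is the style a code assistant typically completes line by line.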

Sources: Authoritative product pages and AI tool surveys were used to verify tool names, developers, categories, and functions. These include the tools’ official descriptions, blog posts, and industry guides to ensure an up-to-date and accurate listing.