Chat with AI Models
On Your Android Device

Connect to your Ollama server or run models fully on-device: Gemma 4, Qwen 3, DeepSeek R1 Distill, and Phi-4 Mini, all powered by Google LiteRT-LM. Responses stream in real time, and no data leaves your phone when you run locally.

Powerful Features

Everything you need for seamless AI conversations

📲

On-device AI with LiteRT-LM

Download and run Gemma 4, Gemma 3, Qwen 3, DeepSeek R1 Distill, and Phi-4 Mini directly on your phone via Google's LiteRT-LM runtime. No server, no network, no data leaving the device.

💬

Intuitive Chat Interface

Experience smooth, real-time conversations with AI models through our beautifully designed Material Design 3 interface.

🖼️

Image Support

Share images with vision-capable AI models! Smart compression automatically optimizes images for fast, efficient conversations.

📚

Persistent Chat History

Never lose a conversation! All your chats are saved automatically on your device in a local Room database.

⚙️

Flexible Configuration

Connect to multiple Ollama servers, switch between AI models, and customize model parameters to your needs.

🎨

Modern & Beautiful UI

Built with Jetpack Compose and Material Design 3, featuring smooth animations and full dark theme support.

🔒

Privacy-First

Your chat history is stored in a local database on your device. With an on-device model, nothing leaves your phone; with Ollama, messages go only to the server you configure.

See It In Action

Beautiful screenshots from the app

Chat Threads

Text Chat

Image Chat

Model Selection

Server Management

Thread Settings

Frequently Asked Questions

Everything you need to know

What is Ollama AI Chat?

Ollama AI Chat is an Android application for chatting with AI models such as Llama, Mistral, and more, directly from your Android device. It connects to your Ollama server or runs models on-device, and features a modern Material Design 3 interface with real-time streaming responses.

Do I need an Ollama server?

No. As of the latest release you can also run models fully on-device using Google's LiteRT-LM runtime, with no Ollama server required. Add a LiteRT (on-device) backend from the Servers screen, download one of the built-in bundles, and start chatting. If you prefer remote inference, Ollama is still fully supported; visit ollama.ai to learn more.

Which on-device models are available?

The Models screen ships with a curated catalog of .litertlm bundles pulled from the public litert-community Hugging Face organization: Gemma 3 270M (~304 MB), Gemma 3 1B (~584 MB), Qwen 3 0.6B (~614 MB), Qwen 2.5 1.5B Instruct (~1.6 GB), DeepSeek R1 Distill Qwen 1.5B (~1.83 GB), Gemma 4 E2B (~2.58 GB), Gemma 4 E4B (~3.65 GB), and Phi-4 Mini Instruct (~3.91 GB). Downloads resume after network drops, and a free-space check runs before each pull.
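As a rough illustration of the free-space check and resumable downloads mentioned above, here is a minimal Kotlin sketch. It is not the app's actual implementation: `ModelBundle`, `hasRoomFor`, `resumeRangeHeader`, the 10% safety margin, and the choice of decimal megabytes are all assumptions made for this example. Only the bundle names and approximate sizes come from the catalog above.

```kotlin
// Hypothetical catalog entry; sizes are the approximate figures quoted above,
// interpreted here as decimal megabytes (an assumption).
data class ModelBundle(val name: String, val sizeBytes: Long)

val MB = 1_000_000L

val catalog = listOf(
    ModelBundle("Gemma 3 270M", 304 * MB),
    ModelBundle("Gemma 3 1B", 584 * MB),
    ModelBundle("Qwen 3 0.6B", 614 * MB),
    ModelBundle("Qwen 2.5 1.5B Instruct", 1_600 * MB),
    ModelBundle("DeepSeek R1 Distill Qwen 1.5B", 1_830 * MB),
    ModelBundle("Gemma 4 E2B", 2_580 * MB),
    ModelBundle("Gemma 4 E4B", 3_650 * MB),
    ModelBundle("Phi-4 Mini Instruct", 3_910 * MB),
)

// Bytes still to fetch when resuming a partially downloaded bundle.
fun remainingBytes(bundle: ModelBundle, alreadyDownloaded: Long): Long =
    (bundle.sizeBytes - alreadyDownloaded).coerceAtLeast(0L)

// True if free space covers the remaining bytes plus a safety margin.
// The margin (10% of the bundle size) is an assumed value for this sketch.
fun hasRoomFor(bundle: ModelBundle, freeBytes: Long, alreadyDownloaded: Long = 0L): Boolean {
    val margin = bundle.sizeBytes / 10
    return freeBytes >= remainingBytes(bundle, alreadyDownloaded) + margin
}

// Value for a standard HTTP Range request header that resumes at the given byte offset.
fun resumeRangeHeader(alreadyDownloaded: Long): String = "bytes=$alreadyDownloaded-"
```

On a device, `freeBytes` could come from `File.usableSpace` on the download directory, and the `Range` header only helps if the host serving the bundle honors range requests.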

Which Android version do I need?

The app requires Android 13 (API level 33) or higher. Check that your device runs a compatible Android version before installing.

Is my data private?

Absolutely! Your chat history is stored in a local Room database on your device and never leaves your phone unless you choose to share it. When you connect to a remote Ollama server, the app supports both HTTP and HTTPS; use HTTPS when you need the connection to be encrypted.

Can I use multiple servers and models?

Yes! You can connect to multiple Ollama servers and switch between AI models instantly. You can also manage models from the app: pull new ones and delete those you no longer need.

Can I send images?

Yes! You can attach and send images in conversations with vision-capable AI models. The app automatically compresses images before sending, keeping conversations fast while maintaining quality.
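The exact compression strategy is not documented here, but one common step is to downscale large photos before encoding them. The Kotlin sketch below shows only that step, as an assumption about how such a feature might work. `PixelSize`, `fitWithin`, and the 1024-pixel limit are hypothetical names and values, not taken from the app.

```kotlin
// Hypothetical limit for the longer side; the app's real threshold is not documented here.
val MAX_SIDE_PX = 1024

data class PixelSize(val width: Int, val height: Int)

// Returns dimensions whose longer side is at most maxSide, preserving the
// aspect ratio. Images that already fit are returned unchanged (no upscaling).
fun fitWithin(original: PixelSize, maxSide: Int = MAX_SIDE_PX): PixelSize {
    require(original.width > 0 && original.height > 0) { "dimensions must be positive" }
    require(maxSide > 0) { "maxSide must be positive" }
    val longest = maxOf(original.width, original.height)
    if (longest <= maxSide) return original
    // Integer arithmetic rounded to the nearest pixel; each side stays at least 1 px.
    val w = ((original.width.toLong() * maxSide + longest / 2) / longest).toInt().coerceAtLeast(1)
    val h = ((original.height.toLong() * maxSide + longest / 2) / longest).toInt().coerceAtLeast(1)
    return PixelSize(w, h)
}
```

On Android, the resulting size could be passed to `Bitmap.createScaledBitmap`, followed by `Bitmap.compress` with a JPEG quality setting to reduce the payload further.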

Is the app open source?

Yes! The app is open source and licensed under the MIT License. You can view the source code, contribute, and report issues on GitHub. We welcome contributions!

How do I connect to my Ollama server?

Enter your Ollama server URL (e.g., http://192.168.1.100:11434) in the app settings. The app supports both HTTP and HTTPS connections. You can add multiple servers and switch between them as needed.
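For the curious, here is a minimal Kotlin sketch of how a server URL like the one above might be validated and turned into request URLs. The helper names (`normalizeServerUrl`, `modelsEndpoint`, `chatEndpoint`) are hypothetical and not taken from the app. The `/api/tags` and `/api/chat` paths are part of Ollama's documented REST API.

```kotlin
// Trims whitespace and trailing slashes, then accepts only http/https URLs
// that include a host. Returns null when the input is not usable.
fun normalizeServerUrl(input: String): String? {
    val trimmed = input.trim().trimEnd('/')
    val uri = try {
        java.net.URI(trimmed)
    } catch (e: java.net.URISyntaxException) {
        return null
    }
    val scheme = uri.scheme?.lowercase()
    if (scheme != "http" && scheme != "https") return null
    if (uri.host.isNullOrEmpty()) return null
    return trimmed
}

// Ollama endpoint that lists the models available on the server.
fun modelsEndpoint(baseUrl: String): String? =
    normalizeServerUrl(baseUrl)?.let { "$it/api/tags" }

// Ollama endpoint for chat requests (responses can be streamed).
fun chatEndpoint(baseUrl: String): String? =
    normalizeServerUrl(baseUrl)?.let { "$it/api/chat" }
```

An address typed without a scheme, such as 192.168.1.100:11434, is rejected by this sketch; a friendlier variant might prepend http:// before validating.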

Ready to Get Started?

Download Ollama AI Chat and start chatting with AI models on your Android device today!

Requires Android 13 (API 33) or higher