run llama on smartphone

Run Llama 3.2 3B on Your Smartphone: PocketPal AI Brings Advanced AI Models to iOS and Android

The PocketPal AI app has made a significant stride by integrating the Llama 3.2 3B model (Q4_K_M GGUF variant) for both iOS and Android devices. This update allows users to run advanced AI models directly on their smartphones, offering a new level of accessibility and functionality.

Key Features and Updates

  • Model Integration: The app now includes the Llama 3.2 3B model in the Q4 variant. The Q8 version is not included by default due to potential throttling issues, but users with sufficient device memory can import the GGUF file as a local model.
  • User Interface Enhancements: Based on user feedback, the app’s UI has been improved. Tabs have been renamed to “Downloaded” and “Available Models” to make the interface more intuitive.
  • Performance Metrics: Users have reported impressive performance metrics. One user achieved 11 tokens/sec, while another shared CPU usage details on an iPhone 14 running iOS 18.0. A user successfully ran a Mistral Nemo 12B model in Q4K on their 12GB RAM smartphone.

Technical Details

Run llama on phone
  • Inference Engine: The app uses llama.cpp for inference and llama.rn for React Native bindings.
  • Platform Compatibility: Currently, the app uses CPU on Android. The developer has hinted at the possibility of open-sourcing the app in the future.

User Feedback and Future Developments

The developer has acknowledged user feedback positively and is committed to making further improvements. The potential for open-sourcing the app in the future opens up exciting possibilities for community contributions and enhancements.

How to Get Started

  1. Download PocketPal AI: Available on both the App Store and Google Play Store. Playstore link – https://play.google.com/store/apps/details?id=com.pocketpalai and app store link – https://apps.apple.com/us/app/pocketpal-ai/id6502579498
  2. Select the Model: Choose the Llama 3.2 3B model from the available options.
  3. Import GGUF File (if needed): Users with sufficient device memory can import the GGUF file as a local model, ensuring to select the “llama32” chat template.

Conclusion

The PocketPal AI app is revolutionizing how users interact with advanced AI models on their smartphones. With continuous improvements and a commitment to user feedback, the app is poised to become a go-to solution for running AI models on mobile devices.


By integrating the Llama 3.2 3B model, PocketPal AI is setting new standards for mobile AI applications. Stay tuned for more updates and enhancements as the app continues to evolve.

As a software engineer passionate about AI and emerging technologies, I specialize in breaking down complex concepts and industry developments into practical insights. My blog delivers the latest AI tech news, hands-on tutorials, and implementation guides to over ~300 monthly readers, helping developers navigate the rapidly evolving world of artificial intelligence.

Comments

No comments yet. Why don’t you start the discussion?

Leave a Reply

Your email address will not be published. Required fields are marked *