Burmese GPT

The first foundational open-source Large Language Model for the Burmese/Myanmar language.

Overview

Burmese GPT is pre-trained on comprehensive Burmese text corpora and supports downstream fine-tuning for chatbots, NLP research, and creative writing. By targeting a low-resource language, the model makes modern language modeling accessible where open models have been scarce, specifically addressing the unique linguistic challenges of Burmese script.

Key Capabilities

  • Foundational Pre-training: Trained exclusively on high-quality Burmese text corpora.
  • Optimized Tokenizer: Features a custom tokenizer designed to efficiently handle Burmese script and syllables.
  • Downstream Ready: Can be easily fine-tuned using LoRA/QLoRA for specific tasks such as instruction following, Q&A, and sentiment analysis.
  • Open Source: Fully available to the community to spur local AI innovation in Myanmar.
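To illustrate why Burmese script needs tokenizer-level care, the sketch below implements a minimal rule-based syllable segmenter in the spirit of the well-known "sylbreak" approach. This is not the project's actual tokenizer; it is a simplified illustration that ignores edge cases such as kinzi and mixed-script text.

```python
import re

# Burmese syllable segmentation sketch (simplified "sylbreak"-style rule).
# A syllable boundary is placed before a consonant (U+1000-U+1021) unless
# it is stacked (preceded by the virama U+1039) or has its vowel killed
# (followed by asat U+103A) or begins a stack (followed by U+1039).
CONSONANT = "\u1000-\u1021"
VIRAMA = "\u1039"
ASAT = "\u103A"
BREAK = re.compile(f"(?<![{VIRAMA}])([{CONSONANT}])(?![{ASAT}{VIRAMA}])")

def segment_syllables(text: str) -> list[str]:
    """Insert a break before each syllable-initial consonant, then split."""
    return BREAK.sub(r" \1", text).split()

print(segment_syllables("မြန်မာ"))    # "Myanmar" -> ['မြန်', 'မာ']
print(segment_syllables("မန္တလေး"))  # "Mandalay" -> ['မန္တ', 'လေး']
```

Note how the stacked consonant in "မန္တလေး" stays inside one unit: the virama check prevents a spurious break in the middle of a consonant stack, which a naive character-level split would produce.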
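The LoRA fine-tuning path mentioned above can be sketched as a configuration using the Hugging Face PEFT library. The repo id `example/burmese-gpt`, the target module name, and all hyperparameters below are placeholder assumptions for illustration, not the project's published settings.

```python
# Illustrative LoRA fine-tuning configuration with Hugging Face PEFT.
# All names and values here are hypothetical, not the project's settings.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("example/burmese-gpt")  # hypothetical repo id

config = LoraConfig(
    r=16,                       # low-rank adapter dimension
    lora_alpha=32,              # adapter scaling factor
    target_modules=["c_attn"],  # attention projections (GPT-2-style naming, assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the small LoRA adapters are trainable
```

Because only the adapter weights are updated, this kind of setup lets tasks such as instruction following or sentiment analysis be trained on a single consumer GPU.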
