Democratize Production-Grade AI 

Bring leading AI to smaller hardware


Our Mission

We make open-source AI more accessible and help developers overcome infrastructure limitations. Specifically, we optimize open-source large language models to be 75% smaller while preserving more than 99% of their quality. This enables leading LLMs to be deployed on smaller hardware, run faster, and serve more users.

Accessible Production-Grade LLMs

The FP16 release of Alibaba's Qwen3 Next Instruct is 162.7 GB, while the cyankiwi INT4 version of the same model is only 49.2 GB.

cyankiwi reduces model size and KV cache usage by 75% while incurring less than 1% performance degradation.
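The 75% figure follows directly from the bit widths: FP16 stores each weight in 16 bits, INT4 in 4. A back-of-envelope sketch, assuming pure weight storage with a hypothetical 80-billion-parameter model and ignoring quantization metadata such as scales and zero points (real checkpoints like the 162.7 GB one above carry some overhead):

```python
def model_size_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB: parameters x bits, converted to bytes."""
    return n_params * bits_per_param / 8 / 1e9

n = 80e9  # hypothetical parameter count, for illustration only
fp16 = model_size_gb(n, 16)  # 16-bit floating-point weights
int4 = model_size_gb(n, 4)   # 4-bit integer weights
reduction = 1 - int4 / fp16
print(f"FP16: {fp16:.1f} GB, INT4: {int4:.1f} GB, reduction: {reduction:.0%}")
# → FP16: 160.0 GB, INT4: 40.0 GB, reduction: 75%
```

The same 4x ratio applies to a KV cache quantized from 16-bit to 4-bit entries, which is why size and cache savings move together.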