Microsoft Releases Phi-4-Reasoning-Vision-15B Model
Microsoft has unveiled Phi-4-reasoning-vision-15B, a compact open-weight multimodal AI model that excels at processing images and text. Despite being trained on just 200 billion tokens of multimodal data, the 15-billion-parameter model outperforms larger systems on tasks such as complex math and science problems, chart interpretation, and GUI navigation. It is available under a permissive license via Microsoft Foundry, Hugging Face, and GitHub, and emphasizes efficiency and practical applications.