
Hi, I'm Yichen Zhang
AI Researcher @ NYU Computer Engineering
Building intelligent systems through multimodal learning
Research Focus: Vision-Language Models, Reinforcement Learning, Robotics
🔬 Current Work
**🚀 FastVLM Research**
Fine-tuning large vision-language models on H100 GPUs for enhanced multimodal reasoning and real-world applications.
**🧬 VariantAI Platform**
Developing a genome-scale model integration platform that combines AI with biological data analysis for precision medicine.
**🤖 Vision-Language-Action Systems**
Exploring vision-language-action (VLA) architectures for robotic manipulation, enabling robots to understand and act on combined visual and linguistic inputs.
🎯 Research Interests
- Vision-Language Models (VLM/VLA): Multimodal understanding and reasoning
- Reinforcement Learning: Sample-efficient algorithms and policy optimization
- Embodied AI: Bridging perception, language, and action in robotics
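As a toy illustration of the policy-optimization interest above, here is a minimal REINFORCE sketch on a two-armed bandit. This is a generic teaching example, not code from any of the projects listed here; the reward means and learning rate are arbitrary choices, and nothing beyond NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-armed bandit: arm 1 pays 1.0 on average, arm 0 pays 0.2 (arbitrary toy values).
TRUE_MEANS = np.array([0.2, 1.0])

def pull(arm):
    # Stochastic reward: the arm's true mean plus small Gaussian noise.
    return TRUE_MEANS[arm] + 0.1 * rng.standard_normal()

# Policy: softmax over two logits, trained with the REINFORCE gradient.
logits = np.zeros(2)
lr = 0.1

for step in range(2000):
    # Numerically stable softmax over the current logits.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    arm = rng.choice(2, p=probs)
    reward = pull(arm)
    # Score-function gradient of log pi(arm): one_hot(arm) - probs.
    grad = -probs.copy()
    grad[arm] += 1.0
    logits += lr * reward * grad

best_arm = int(np.argmax(logits))  # the policy should come to favor arm 1
```

The policy concentrates on the higher-paying arm because its larger rewards scale the log-probability gradient more strongly; adding a baseline (e.g. a running reward average) is the usual next step for reducing the variance of these updates.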
🧩 Explore More
📫 Get In Touch
Interested in collaboration or want to discuss AI research? Feel free to reach out!
Email: jarviszhang.ai@gmail.com
GitHub: @JarvisZhang24
Posts
Getting Started with Vision-Language Models
A comprehensive introduction to Vision-Language Models (VLMs), covering key concepts, popular architectures, and practical implementation examples with code.

Hello Jekyll - Welcome to My Research Blog
Welcome to my research blog! 🎉