This repository provides a comprehensive exploration of the Swin Transformer architecture, developed as part of a Deep Learning course project. Our goal is to help others understand this powerful architecture through detailed explanations, visualizations, and practical implementations.
Team: Optical Flow
- Jinghan Gao
- Calvin Lo
- Xinyi Wang
Swin Transformer is a hierarchical vision transformer that uses shifted windows to model visual data efficiently. It addresses key limitations of earlier vision transformers by (see the sketch after this list):
- Computing self-attention within local, non-overlapping windows
- Building hierarchical feature maps by progressively merging patches
- Achieving computational complexity that is linear in image size, rather than quadratic as in global self-attention
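Below is a minimal, self-contained PyTorch sketch of these three ideas. It is illustrative rather than the official implementation: the names `window_partition` and `patch_merge` are ours, and it omits details such as the relative position bias, the attention mask that accompanies shifted windows, and the normalization and projection layers of the real model.

```python
import torch

def window_partition(x, window_size):
    """Split a (B, H, W, C) feature map into non-overlapping windows
    of shape (num_windows * B, window_size * window_size, C)."""
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size * window_size, C)

# Toy input: one 8x8 feature map with 32 channels, partitioned into 4x4 windows.
x = torch.randn(1, 8, 8, 32)
windows = window_partition(x, window_size=4)
print(windows.shape)  # torch.Size([4, 16, 32]): 4 windows of 16 tokens each

# Self-attention runs independently inside each window, so the cost grows
# with the number of windows (linear in image size) rather than quadratically
# in the total number of tokens.
attn = torch.nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
out, _ = attn(windows, windows, windows)

# Shifted windows: cyclically roll the map by half a window before
# partitioning, so the next block's windows straddle the previous block's
# window boundaries and information can flow between neighboring windows.
shifted = torch.roll(x, shifts=(-2, -2), dims=(1, 2))
shifted_windows = window_partition(shifted, window_size=4)

def patch_merge(x):
    """Concatenate each 2x2 neighborhood of tokens: (B, H, W, C) ->
    (B, H/2, W/2, 4C). The real model follows this with a 4C -> 2C linear layer."""
    return torch.cat([x[:, 0::2, 0::2], x[:, 1::2, 0::2],
                      x[:, 0::2, 1::2], x[:, 1::2, 1::2]], dim=-1)

print(patch_merge(x).shape)  # torch.Size([1, 4, 4, 128])
```

In the full architecture, consecutive Transformer blocks alternate between regular and shifted window partitions, and patch merging is applied between stages to build the feature hierarchy.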
The repository is organized as follows:

```
swin-transformer-explained/
├── docs/          # Documentation and tutorials
├── model/         # Swin Transformer model code
├── demos/         # Interactive demonstrations
├── presentation/  # Presentation materials
└── notebooks/     # Jupyter notebooks with examples
```
For detailed documentation, please visit our docs directory:
- Architecture Overview
- Understanding Swin Transformer
- Related Resources: additional documentation and configurations for the project
- Advanced Applications: applications of Swin Transformer across various fields
- Demo Tutorial: a walkthrough of the Swin Transformer demo implementation
- Presentation
- Theoretical Understanding
  - Detailed explanation of the architecture
  - Visual guides and diagrams
- Implementation
  - Step-by-step code walkthrough
  - Practical examples in the demos
- Demonstrations
  - Interactive visualizations
  - Real-world applications
  - Performance benchmarks
The following chart illustrates the state-of-the-art performance of Swin Transformer and its variants (such as SwinV2-G) on the COCO test-dev dataset. Swin Transformer models consistently achieve high box mAP scores, outperforming other object detection models of their time.
- Swin-L (HTC++, multi-scale): achieves a box mAP of around 60.
- Soft Teacher + Swin-L (HTC++, multi-scale): improves the box mAP further, to approximately 62.
- SwinV2-G (HTC++): sets a new benchmark with a box mAP of around 64.
These results demonstrate the effectiveness of Swin Transformer and SwinV2 on high-resolution images and dense prediction tasks. A short sketch of the IoU computation that underlies box mAP follows the source note below.
Source: COCO Object Detection Leaderboard
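For context on the metric: COCO box mAP averages detection precision over IoU (intersection-over-union) thresholds from 0.50 to 0.95. Here is a minimal sketch of the IoU computation that underlies it; the boxes and values are illustrative:

```python
import torch

def box_iou(a, b):
    """IoU between two boxes given as (x1, y1, x2, y2) tensors."""
    inter_x1, inter_y1 = torch.max(a[0], b[0]), torch.max(a[1], b[1])
    inter_x2, inter_y2 = torch.min(a[2], b[2]), torch.min(a[3], b[3])
    inter = (inter_x2 - inter_x1).clamp(min=0) * (inter_y2 - inter_y1).clamp(min=0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

pred = torch.tensor([10., 10., 50., 50.])  # hypothetical predicted box
gt = torch.tensor([15., 15., 55., 55.])    # hypothetical ground-truth box
print(box_iou(pred, gt))  # ~0.62: a hit at the 0.5 threshold, a miss at 0.75
```

A prediction only counts as correct at a given threshold if its IoU with a ground-truth box exceeds that threshold, so a higher mAP reflects tighter localization as well as better classification.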
This project was developed as part of the Computer Vision course at Northeastern University.
- Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
- Official Swin Transformer Implementation
- COCO Object Detection Leaderboard
If you find this project helpful, please consider citing:

```bibtex
@techreport{optical_flow_team2024,
  title       = {Understanding Swin Transformer: A Comprehensive Study},
  author      = {Gao, Jinghan and Lo, Calvin and Wang, Xinyi},
  institution = {University of British Columbia},
  type        = {Course Project},
  year        = {2024},
  note        = {Deep Learning Course Project}
}
```
This project is a comprehensive explanation and implementation of the Swin Transformer architecture, created by Team Optical Flow:
- Jinghan Gao
- Calvin Lo
- Xinyi Wang
For academic use, please cite both the original Swin Transformer paper and our educational materials.
## 🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## 📧 Contact
For questions and feedback, please open an issue in this repository or contact team members directly.
## 🙏 Acknowledgments
- [Microsoft Research Asia](https://www.microsoft.com/en-us/research/lab/microsoft-research-asia/) for the original Swin Transformer
- [Our course instructor] for guidance and support
- The PyTorch team for their excellent framework
---
Made with ❤️ by Team Optical Flow