CV
Classification
- AlexNet: ImageNet Classification with Deep Convolutional Neural Networks
- Very Deep Convolutional Networks for Large-Scale Image Recognition
- Going Deeper with Convolutions
- Densely Connected Convolutional Networks
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
- Rethinking the Inception Architecture for Computer Vision
- Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
- Deep Residual Learning for Image Recognition
- Label-embedding for attribute-based classification
- Image Super-Resolution Using Deep Convolutional Networks
- Bilinear CNN Models for Fine-Grained Visual Recognition
- Xception: Deep Learning with Depthwise Separable Convolutions
- A Review on Multi-Label Learning Algorithms
- Wide Residual Networks
- Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps
- Scene Classification Via pLSA
- XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
- Semi-Supervised Classification with Graph Convolutional Networks
Segmentation
- Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
- Instance-aware Semantic Segmentation via Multi-task Network Cascades
- ParseNet: Looking Wider to See Better
- Pyramid Scene Parsing Network
- Rethinking Atrous Convolution for Semantic Image Segmentation
- Learning Deconvolution Network for Semantic Segmentation
- Fully Convolutional Networks for Semantic Segmentation
- RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation
- SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
- U-Net: Convolutional Networks for Biomedical Image Segmentation
- Semantic Image Segmentation via Deep Parsing Network
- Learning to Refine Object Segments
- Simultaneous Detection and Segmentation
- Weakly- and Semi-Supervised Learning of a DCNN for Semantic Image Segmentation
- Yet Another Survey on Image Segmentation: Region and Boundary Information Integration
- Class-specific, top-down segmentation
- Learning to segment object candidates
Object Detection
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- DSSD: Deconvolutional Single Shot Detector
- Feature Pyramid Networks for Object Detection
- Focal Loss for Dense Object Detection
- Mask R-CNN
- SSD: Single Shot MultiBox Detector
- You Only Look Once: Unified, Real-Time Object Detection
- YOLO9000: Better, Faster, Stronger
- Training Region-based Object Detectors with Online Hard Example Mining
- R-FCN: Object Detection via Region-based Fully Convolutional Networks
- Deep Feature Pyramid Reconfiguration for Object Detection
- Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
- DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection
- Rich feature hierarchies for accurate object detection and semantic segmentation
- Recurrent convolutional neural network for object recognition
- Render for CNN: Viewpoint Estimation in Images Using CNNs Trained With Rendered 3D Model Views
- Learning to See by Moving
- FaceNet: A unified embedding for face recognition and clustering
- Edge Boxes: Locating Object Proposals from Edges
- Mining actionlet ensemble for action recognition with depth cameras
- Action Recognition with Improved Trajectories
- Joint Deep Learning for Pedestrian Detection
- Unsupervised learning of models for object recognition
- An Efficient Dense and Scale-Invariant Spatio-Temporal Interest Point Detector
- Learning a Sparse Representation for Object Detection
- 3D object proposals for accurate object class detection
- Fast R-CNN
Action Recognition
- Towards Understanding Action Recognition
- P-CNN: Pose-Based CNN Features for Action Recognition
- Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps
- Temporal Segment Networks: Towards Good Practices for Deep Action Recognition
Text Detection
- Text Detection and Recognition in Imagery: A Survey
- Reading Text in the Wild with Convolutional Neural Networks
- Character-level convolutional networks for text classification
Image Interpretation
- Long-term Recurrent Convolutional Networks for Visual Recognition and Description
- Ask Your Neurons: A Neural-Based Approach to Answering Questions About Images
- Deep visual-semantic alignments for generating image descriptions
ZSL/ZSD
- Zero-Shot Learning by Convex Combination of Semantic Embeddings
- Synthesized Classifiers for Zero-Shot Learning
- Latent Embeddings for Zero-shot Classification
- Zero-Shot Learning via Semantic Similarity Embedding
- Write a Classifier: Zero-Shot Learning Using Purely Textual Descriptions
- Unsupervised Domain Adaptation for Zero-Shot Learning
- Transductive Multi-view Embedding for Zero-Shot Recognition and Annotation
- Zero-shot recognition with unreliable attributes
Re-ID
- Efficient PSD Constrained Asymmetric Metric Learning for Person Re-Identification
- Person Re-Identification Using Kernel-Based Metric Learning Methods
Pose Estimation
- Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Video Classification
- Large-Scale Video Classification with Convolutional Neural Networks
- Beyond short snippets: Deep networks for video classification
Other
- Attention Is All You Need
- Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling
- Bidirectional recurrent neural networks
- Auto-Encoding Variational Bayes
- Visualizing and Understanding Convolutional Networks
- Dynamic Routing Between Capsules
- Deep Neural Decision Forests
- Convolutional Channel Features
- Dropout: A Simple Way to Prevent Neural Networks from Overfitting
- Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks
- Conditional Random Fields as Recurrent Neural Networks
- How transferable are features in deep neural networks?
- Recurrent Models of Visual Attention
- On the Properties of Neural Machine Translation: Encoder-Decoder Approaches
- Continuous control with deep reinforcement learning
- Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
- Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
- Sequence Level Training with Recurrent Neural Networks
- Stochastic Backpropagation and Approximate Inference in Deep Generative Models
- Learning Rich Features from RGB-D Images for Object Detection and Segmentation
- Gradient-based Hyperparameter Optimization through Reversible Learning