Multi-Label Contrastive Loss (MLCL)
Jaccard Similarity
Computation of the Jaccard similarity between multi-hot label vectors $\mathbf{y}_i$ and $\mathbf{y}_j$:
$$\text{Jaccard}(\mathbf{y}_i,\mathbf{y}_j)=\frac{\mathbf{y}_i^\top\mathbf{y}_j}{|\mathbf{y}_i|_1+|\mathbf{y}_j|_1-\mathbf{y}_i^\top\mathbf{y}_j}$$
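This formula can be checked with a small pure-Python sketch (the function name `jaccard` is our own; labels are multi-hot lists):

```python
def jaccard(y_i, y_j):
    """Jaccard similarity between two binary multi-hot label vectors."""
    inter = sum(a * b for a, b in zip(y_i, y_j))  # y_i^T y_j
    union = sum(y_i) + sum(y_j) - inter           # |y_i|_1 + |y_j|_1 - y_i^T y_j
    return inter / union if union > 0 else 0.0

# label sets {0, 2} vs {0, 1, 2}: intersection 2, union 3
print(jaccard([1, 0, 1, 0], [1, 1, 1, 0]))  # 2/3
```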
Weighting
The weight of a pair is its Jaccard similarity raised to a power $\beta$:
$$w_{i,j}=(\text{Jaccard}(\mathbf{y}_i,\mathbf{y}_j))^\beta$$
$$m_{i,j}^+=\begin{cases}
1,&\text{if }\text{Jaccard}(\mathbf{y}_i,\mathbf{y}_j)>c_{\text{threshold}}\\
0,&\text{otherwise}
\end{cases}$$
$$m_{i,j}^-=1-m_{i,j}^+$$
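The weights and both masks can be computed together over a batch; this is a sketch, and the helper name `masks_and_weights` plus the default values of `beta` and `c_threshold` are illustrative assumptions:

```python
def masks_and_weights(labels, beta=0.5, c_threshold=0.3):
    """Pairwise positive mask m+, negative mask m-, and Jaccard^beta
    weights for a batch of multi-hot label vectors."""
    n = len(labels)
    w = [[0.0] * n for _ in range(n)]
    m_pos = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            inter = sum(a * b for a, b in zip(labels[i], labels[j]))
            union = sum(labels[i]) + sum(labels[j]) - inter
            jac = inter / union if union else 0.0
            w[i][j] = jac ** beta                          # w_ij = Jaccard^beta
            m_pos[i][j] = 1 if jac > c_threshold else 0    # hard threshold
    m_neg = [[1 - m_pos[i][j] for j in range(n)] for i in range(n)]
    return m_pos, m_neg, w
```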
The similarity between feature embeddings $\mathbf{f}_i$ and $\mathbf{f}_j$, scaled by the temperature $\tau$, is:
$$\text{sim}(i,j)=\frac{\mathbf{f}_i\cdot\mathbf{f}_j}{\tau}$$
Log probabilities are computed as:
$$\log p_{i,j}=sim(i,j)-\log\sum_k e^{sim(i,k)}$$
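The log probabilities are a row-wise log-softmax of the scaled similarities; a minimal sketch (the function name `log_probs` and default `tau` are assumptions):

```python
import math

def log_probs(features, tau=0.1):
    """Row-wise log-softmax of temperature-scaled dot products:
    log p[i][j] = sim(i, j) - log sum_k exp(sim(i, k))."""
    n = len(features)
    sim = [[sum(a * b for a, b in zip(features[i], features[j])) / tau
            for j in range(n)] for i in range(n)]
    out = []
    for row in sim:
        m = max(row)  # subtract the row max for numerical stability
        lse = m + math.log(sum(math.exp(s - m) for s in row))
        out.append([s - lse for s in row])
    return out
```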
The positive contribution to the loss is:
$$\text{positive log probability}=\frac{\sum_{j\neq i}m_{i,j}^+\,w_{i,j}\,\log p_{i,j}}{\sum_{j\neq i}m_{i,j}^++\epsilon}$$
The negative contribution to the loss is:
$$\text{negative log probability}=\frac{\sum_{j\neq i}m_{i,j}^-\,\log p_{i,j}}{\sum_{j\neq i}m_{i,j}^-+\epsilon}$$
The total loss for the batch:
$$\mathcal{L}=-\frac{1}{N}\sum_{i=1}^N\left[\alpha\,\frac{\sum_{j\neq i}m_{i,j}^+\,w_{i,j}\,\log p_{i,j}}{\sum_{j\neq i}m_{i,j}^++\epsilon}+(1-\alpha)\,\frac{\sum_{j\neq i}m_{i,j}^-\,\log p_{i,j}}{\sum_{j\neq i}m_{i,j}^-+\epsilon}\right]+\lambda\mathcal{R}$$
Where:
$\alpha$: Weight balancing the positive-pair term against the negative-pair term.
$\lambda$: Weight of the regularization term.
$\epsilon$: Small constant that prevents division by zero when an anchor has no positives or no negatives.
$\mathcal{R}$ : Regularization term, defined as:
$$\mathcal{R}=\frac{1}{N}\sum_{i}|\mathbf{f}_i|_2$$
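Putting the pieces together, a minimal pure-Python sketch of the batch loss (no framework; the function name `mlcl_loss` and the default hyperparameter values are illustrative assumptions):

```python
import math

def mlcl_loss(features, labels, alpha=0.5, beta=0.5,
              c_threshold=0.3, tau=0.1, lam=0.01, eps=1e-8):
    """Sketch of the MLCL batch loss: count-normalized positive and
    negative log-probability terms plus the L2-norm regularizer R."""
    n = len(features)

    def jac(i, j):
        inter = sum(a * b for a, b in zip(labels[i], labels[j]))
        union = sum(labels[i]) + sum(labels[j]) - inter
        return inter / union if union else 0.0

    # log p[i][j] = sim(i, j) - log sum_k exp(sim(i, k))
    sim = [[sum(a * b for a, b in zip(features[i], features[j])) / tau
            for j in range(n)] for i in range(n)]
    logp = []
    for row in sim:
        m = max(row)
        lse = m + math.log(sum(math.exp(s - m) for s in row))
        logp.append([s - lse for s in row])

    total = 0.0
    for i in range(n):
        pos_sum = neg_sum = 0.0
        pos_cnt = neg_cnt = 0
        for j in range(n):
            if j == i:
                continue
            s = jac(i, j)
            if s > c_threshold:                    # positive pair
                pos_sum += (s ** beta) * logp[i][j]
                pos_cnt += 1
            else:                                  # negative pair
                neg_sum += logp[i][j]
                neg_cnt += 1
        total += (alpha * pos_sum / (pos_cnt + eps)
                  + (1 - alpha) * neg_sum / (neg_cnt + eps))
    # R = mean L2 norm of the feature embeddings
    reg = sum(math.sqrt(sum(f * f for f in feat)) for feat in features) / n
    return -total / n + lam * reg
```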
MultiSupConLoss
A variant, MultiSupConLoss, uses a single positive mask over the same Jaccard threshold:
$$m_{i,j}=\begin{cases}
1,&\text{if }\text{Jaccard}(\mathbf{y}_i,\mathbf{y}_j)>c_{\text{threshold}}\\
0,&\text{otherwise}
\end{cases}$$
Its weights are the Jaccard similarity normalized by the threshold:
$$w_{i,j}=\frac{\text{Jaccard}(\mathbf{y}_i,\mathbf{y}_j)}{c_{\text{threshold}}+\epsilon}$$
The similarity between contrastive features is computed as before:
$$\text{sim}(i,j)=\frac{\mathbf{f}_i\cdot\mathbf{f}_j}{\tau}$$
with log probabilities
$$\log p_{i,j}=\log\left(\frac{e^{\text{sim}(i,j)}}{\sum_{k=1}^{N}e^{\text{sim}(i,k)}}\right)$$
The MultiSupConLoss is then:
$$\text{loss}=-\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{N}m_{i,j}\,w_{i,j}\,\log p_{i,j}$$
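This variant can be sketched in the same style; the function name `multi_supcon_loss` and the default hyperparameter values are illustrative assumptions:

```python
import math

def multi_supcon_loss(features, labels, c_threshold=0.3, tau=0.1, eps=1e-8):
    """Sketch of MultiSupConLoss: hard Jaccard-threshold mask m and
    threshold-normalized weights w = Jaccard / (c_threshold + eps),
    applied to row-wise log-softmax probabilities."""
    n = len(features)
    # log p[i][j] = sim(i, j) - log sum_k exp(sim(i, k))
    sim = [[sum(a * b for a, b in zip(features[i], features[j])) / tau
            for j in range(n)] for i in range(n)]
    logp = []
    for row in sim:
        m = max(row)
        lse = m + math.log(sum(math.exp(s - m) for s in row))
        logp.append([s - lse for s in row])

    total = 0.0
    for i in range(n):
        for j in range(n):
            inter = sum(a * b for a, b in zip(labels[i], labels[j]))
            union = sum(labels[i]) + sum(labels[j]) - inter
            jac = inter / union if union else 0.0
            if jac > c_threshold:                     # mask m_ij = 1
                total += (jac / (c_threshold + eps)) * logp[i][j]
    return -total / n
```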