Official repository for "Orthogonal Projection Loss" (ICCV'21)

Overview

Orthogonal Projection Loss (ICCV'21)

Kanchana Ranasinghe, Muzammal Naseer, Munawar Hayat, Salman Khan, & Fahad Shahbaz Khan

Paper Link | Project Page | Video

Abstract: Deep neural networks have achieved remarkable performance on a range of classification tasks, with softmax cross-entropy (CE) loss emerging as the de-facto objective function. The CE loss encourages features of a class to have a higher projection score on the true class-vector compared to the negative classes. However, this is a relative constraint and does not explicitly force different class features to be well-separated. Motivated by the observation that ground-truth class representations in CE loss are orthogonal (one-hot encoded vectors), we develop a novel loss function termed “Orthogonal Projection Loss” (OPL) which imposes orthogonality in the feature space. OPL augments the properties of CE loss and directly enforces inter-class separation alongside intra-class clustering in the feature space through orthogonality constraints on the mini-batch level. As compared to other alternatives of CE, OPL offers unique advantages e.g., no additional learnable parameters, does not require careful negative mining and is not sensitive to the batch size. Given the plug-and-play nature of OPL, we evaluate it on a diverse range of tasks including image recognition (CIFAR-100), large-scale classification (ImageNet), domain generalization (PACS) and few-shot learning (miniImageNet, CIFAR-FS, tiered-ImageNet and Meta-dataset) and demonstrate its effectiveness across the board. Furthermore, OPL offers better robustness against practical nuisances such as adversarial attacks and label noise.

Citation

@InProceedings{Ranasinghe_2021_ICCV,
    author    = {Ranasinghe, Kanchana and Naseer, Muzammal and Hayat, Munawar and Khan, Salman and Khan, Fahad Shahbaz},
    title     = {Orthogonal Projection Loss},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {12333-12343}
}

Table of Contents

  1. Contributions
  2. Usage
  3. Pretrained Models
  4. Training
  5. Evaluation
  6. What Can You Do?
  7. Quantitative Results
  8. Qualitative Results

Contributions

  1. We propose a novel loss, OPL, that directly enforces inter-class separation and intra-class clustering via orthogonality constraints on the feature space with no additional learnable parameters.
  2. Our orthogonality constraints are efficiently formulated in comparison to existing methods, allowing mini-batch processing without the need for explicit calculation of singular values. This leads to a simple vectorized implementation of OPL directly integrating with CE.
  3. We conduct extensive evaluations on a diverse range of image classification tasks highlighting the discriminative ability of OPL. Further, our results on few-shot learning (FSL) and domain generalization (DG) datasets establish the transferability and generalizability of features learned with OPL. Finally, we establish the improved robustness of learned features to adversarial attacks and label noise.
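To make the mini-batch formulation in contribution 2 concrete, here is a deliberately unvectorized, framework-free sketch of the per-batch quantity. It assumes the loss has the form (1 - s) + gamma * |d|, where s is the mean cosine similarity over same-class pairs and d is the mean over different-class pairs; this form and the helper names (`l2_normalize`, `opl_sketch`) are illustrative assumptions, and `loss.py` in this repository remains the authoritative implementation.

```python
import math


def l2_normalize(v):
    # Scale a feature vector to unit length (assumes a non-zero vector).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def opl_sketch(features, labels, gamma=0.5):
    """Toy OPL over one mini-batch: (1 - s) + gamma * |d| (assumed form).

    s: mean cosine similarity over pairs of distinct samples sharing a label
    d: mean cosine similarity over pairs of samples with different labels
    """
    feats = [l2_normalize(f) for f in features]
    same, diff = [], []
    for i in range(len(feats)):
        for j in range(len(feats)):
            if i == j:
                continue  # a sample is never paired with itself
            cos = sum(a * b for a, b in zip(feats[i], feats[j]))
            (same if labels[i] == labels[j] else diff).append(cos)
    s = sum(same) / len(same) if same else 0.0
    d = sum(diff) / len(diff) if diff else 0.0
    return (1.0 - s) + gamma * abs(d)


# Same-class features aligned, different-class features orthogonal.
ideal = opl_sketch([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]], [0, 0, 1])
# All features collapsed onto one direction, regardless of class.
collapsed = opl_sketch([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]], [0, 0, 1])
print(ideal, collapsed)
```

In the first toy batch both terms vanish, while in the collapsed batch only the inter-class term remains, so the value equals gamma. No singular values or learnable parameters are involved, only pairwise dot products within the batch.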

Usage

Refer to requirements.txt for dependencies. Orthogonal Projection Loss (OPL) can be plugged in alongside any standard loss function, such as softmax cross-entropy (CE), as shown below. You may need to edit the forward function of your model so that it returns features (we use the penultimate feature maps) alongside the final logits. The default values for gamma and lambda are 0.5 and 1, respectively.

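import torch.nn.functional as F

from loss import OrthogonalProjectionLoss

ce_loss = F.cross_entropy
op_loss = OrthogonalProjectionLoss(gamma=0.5)  # gamma: default value
op_lambda = 1  # lambda: weight of the OPL term relative to CE

for inputs, targets in dataloader:
    # The model must return the penultimate features alongside the logits.
    features, logits = model(inputs)

    loss_op = op_loss(features, targets)
    loss_ce = ce_loss(logits, targets)

    # Total objective: CE plus the weighted OPL term.
    loss = loss_ce + op_lambda * loss_op
    loss.backward()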
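For reference, the objective optimized by the loop above can be written compactly. The outer sum mirrors `loss = loss_ce + op_lambda * loss_op`. The inner form of the OPL term is an assumption based on the paper's description, with s the mean same-class cosine similarity and d the mean different-class cosine similarity in a mini-batch, so treat `loss.py` as authoritative.

```latex
\mathcal{L} \;=\; \mathcal{L}_{\mathrm{CE}} \;+\; \lambda \, \mathcal{L}_{\mathrm{OPL}},
\qquad
\mathcal{L}_{\mathrm{OPL}} \;=\; (1 - s) \;+\; \gamma \, \lvert d \rvert
```

With the defaults above, \lambda = 1 and \gamma = 0.5.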

Pretrained Models

If you find our OPL pretrained models useful, please consider citing our work.

Training

Refer to the sub-folders for CIFAR-100 (cifar), ImageNet (imagenet), few-shot learning (rfs), and label noise (truncated_loss) training. The README.MD within each directory contains the training instructions for that task.

Evaluation

Refer to the relevant sub-folders (same as in Training above). You can find the pretrained models for these tasks on our releases page.

What Can You Do?

For future work, we hope to explore the following:

  • Test how OPL can be adapted for unsupervised representation learning
  • Test the performance of OPL on more architectures (e.g. vision transformers)
  • Test how OPL performs on class-imbalanced datasets

Quantitative Results

We present quantitative results for training with OPL (against a CE-only backbone) for various classification tasks.

Classification: ImageNet

| Method | ResNet-18 top-1 | ResNet-18 top-5 | ResNet-50 top-1 | ResNet-50 top-5 |
|---|---|---|---|---|
| CE (Baseline) | 69.91% | 89.08% | 76.15% | 92.87% |
| CE + OPL (ours) | 70.27% | 89.60% | 76.98% | 93.30% |
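The size of the improvement is easier to read as differences. The short sketch below recomputes the absolute gains of CE + OPL over the CE baseline from the numbers in the table above; the dictionary layout and the `gains` helper are illustrative, not part of the repository.

```python
# (top-1, top-5) accuracies in percent, copied from the ImageNet table above.
results = {
    "ResNet-18": {"CE": (69.91, 89.08), "CE + OPL": (70.27, 89.60)},
    "ResNet-50": {"CE": (76.15, 92.87), "CE + OPL": (76.98, 93.30)},
}


def gains(table):
    # Absolute accuracy gain (percentage points) of CE + OPL over CE.
    out = {}
    for backbone, rows in table.items():
        base, ours = rows["CE"], rows["CE + OPL"]
        out[backbone] = tuple(round(o - b, 2) for b, o in zip(base, ours))
    return out


imagenet_gains = gains(results)
print(imagenet_gains)
# → {'ResNet-18': (0.36, 0.52), 'ResNet-50': (0.83, 0.43)}
```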

Classification: Few Shot

| Method | New Loss | CIFAR-FS 1-shot | CIFAR-FS 5-shot | miniImageNet 1-shot | miniImageNet 5-shot | tiered-ImageNet 1-shot | tiered-ImageNet 5-shot |
|---|---|---|---|---|---|---|---|
| MAML | | 58.90±1.9 | 71.50±1.0 | 48.70±1.84 | 63.11±0.92 | 51.67±1.81 | 70.30±1.75 |
| Prototypical Networks | | 55.50±0.7 | 72.00±0.6 | 49.42±0.78 | 68.20±0.66 | 53.31±0.89 | 72.69±0.74 |
| Relation Networks | | 55.00±1.0 | 69.30±0.8 | 50.44±0.82 | 65.32±0.70 | 54.48±0.93 | 71.32±0.78 |
| Shot-Free | | 69.20±N/A | 84.70±N/A | 59.04±N/A | 77.64±N/A | 63.52±N/A | 82.59±N/A |
| MetaOptNet | | 72.60±0.7 | 84.30±0.5 | 62.64±0.61 | 78.63±0.46 | 65.99±0.72 | 81.56±0.53 |
| RFS | | 71.45±0.8 | 85.95±0.5 | 62.02±0.60 | 79.64±0.44 | 69.74±0.72 | 84.41±0.55 |
| RFS + OPL (Ours) | | 73.02±0.4 | 86.12±0.2 | 63.10±0.36 | 79.87±0.26 | 70.20±0.41 | 85.01±0.27 |
| NAML | | - | - | 65.42±0.25 | 75.48±0.34 | - | - |
| Neg-Cosine | | - | - | 63.85±0.81 | 81.57±0.56 | - | - |
| SKD | | 74.50±0.9 | 88.00±0.6 | 65.93±0.81 | 83.15±0.54 | 71.69±0.91 | 86.66±0.60 |
| SKD + OPL (Ours) | | 74.94±0.4 | 88.06±0.3 | 66.90±0.37 | 83.23±0.25 | 72.10±0.41 | 86.70±0.27 |

Classification: Label Noise

| Dataset | Method | Uniform | Class Dependent |
|---|---|---|---|
| CIFAR10 | TL | 87.62% | 82.28% |
| CIFAR10 | TL+OPL | 88.45% | 87.02% |
| CIFAR100 | TL | 62.64% | 47.66% |
| CIFAR100 | TL+OPL | 65.62% | 53.94% |
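Because the baselines here sit at very different accuracy levels, it can help to look at the share of baseline errors removed in addition to the raw gain. The sketch below derives both from the table above; the `summarize` helper and the error-reduction metric are our own illustrative choices, not figures reported by the authors.

```python
# Accuracies in percent, copied from the label-noise table above: (TL, TL+OPL).
label_noise = {
    ("CIFAR10", "Uniform"): (87.62, 88.45),
    ("CIFAR10", "Class Dependent"): (82.28, 87.02),
    ("CIFAR100", "Uniform"): (62.64, 65.62),
    ("CIFAR100", "Class Dependent"): (47.66, 53.94),
}


def summarize(acc_pairs):
    # For each setting: absolute gain in points and relative error reduction in %.
    summary = {}
    for key, (tl, tl_opl) in acc_pairs.items():
        gain = tl_opl - tl
        rel = 100.0 * gain / (100.0 - tl)  # share of baseline errors removed
        summary[key] = (round(gain, 2), round(rel, 1))
    return summary


noise_summary = summarize(label_noise)
for (dataset, noise), (gain, rel) in noise_summary.items():
    print(f"{dataset:9s} {noise:16s} +{gain:.2f} pts, {rel:.1f}% fewer errors")
```

On these numbers the gains are larger under class-dependent noise than under uniform noise for both datasets.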

Qualitative Results

We present some examples of qualitative improvements on ImageNet below.

Comments
  • The OPL loss doesn't decrease

    Thank you for your great work!

    I printed the OPL loss and found that it doesn't decrease. How does the OPL loss take effect in that case?

    Prints are like below: ############### print loss ######################################################### base_loss:4.280840873718262;op_loss:0.5367265939712524 Training Epoch: 2 [128/50000] Loss: 4.8176 LR: 0.100000 base_loss:4.107833385467529;op_loss:0.5339833498001099 Training Epoch: 2 [256/50000] Loss: 4.6418 LR: 0.100000 base_loss:4.187429904937744;op_loss:0.5304139852523804 Training Epoch: 2 [384/50000] Loss: 4.7178 LR: 0.100000 base_loss:4.127537727355957;op_loss:0.5252766013145447 Training Epoch: 2 [512/50000] Loss: 4.6528 LR: 0.100000 base_loss:4.084356307983398;op_loss:0.5540026426315308 Training Epoch: 2 [640/50000] Loss: 4.6384 LR: 0.100000 base_loss:4.1102681159973145;op_loss:0.5689339637756348 Training Epoch: 2 [768/50000] Loss: 4.6792 LR: 0.100000 base_loss:4.056018352508545;op_loss:0.5612157583236694 Training Epoch: 2 [896/50000] Loss: 4.6172 LR: 0.100000 base_loss:4.160975456237793;op_loss:0.5274627208709717 Training Epoch: 2 [1024/50000] Loss: 4.6884 LR: 0.100000 base_loss:4.068561553955078;op_loss:0.5268392562866211 Training Epoch: 2 [1152/50000] Loss: 4.5954 LR: 0.100000 base_loss:4.090016841888428;op_loss:0.5170165300369263 Training Epoch: 2 [1280/50000] Loss: 4.6070 LR: 0.100000 base_loss:4.1942949295043945;op_loss:0.5494662523269653 Training Epoch: 2 [1408/50000] Loss: 4.7438 LR: 0.100000 base_loss:4.133640289306641;op_loss:0.5417580604553223 Training Epoch: 2 [1536/50000] Loss: 4.6754 LR: 0.100000 base_loss:4.124314785003662;op_loss:0.5448318719863892 Training Epoch: 2 [1664/50000] Loss: 4.6691 LR: 0.100000 base_loss:4.153204917907715;op_loss:0.5473470687866211 Training Epoch: 2 [1792/50000] Loss: 4.7006 LR: 0.100000 base_loss:4.167095184326172;op_loss:0.569574236869812 Training Epoch: 2 [1920/50000] Loss: 4.7367 LR: 0.100000 base_loss:4.202641010284424;op_loss:0.5590764880180359 Training Epoch: 2 [2048/50000] Loss: 4.7617 LR: 0.100000 base_loss:4.2228193283081055;op_loss:0.5472540855407715 Training Epoch: 2 [2176/50000] Loss: 4.7701 
LR: 0.100000 base_loss:4.1299238204956055;op_loss:0.5477103590965271 Training Epoch: 2 [2304/50000] Loss: 4.6776 LR: 0.100000 base_loss:4.061820983886719;op_loss:0.5515830516815186 Training Epoch: 2 [2432/50000] Loss: 4.6134 LR: 0.100000 base_loss:4.058617115020752;op_loss:0.5518112182617188 Training Epoch: 2 [2560/50000] Loss: 4.6104 LR: 0.100000 base_loss:4.192203521728516;op_loss:0.5860923528671265 Training Epoch: 2 [2688/50000] Loss: 4.7783 LR: 0.100000 base_loss:3.96213436126709;op_loss:0.5402579307556152 Training Epoch: 2 [2816/50000] Loss: 4.5024 LR: 0.100000 base_loss:4.114309310913086;op_loss:0.5352994799613953 Training Epoch: 2 [2944/50000] Loss: 4.6496 LR: 0.100000 base_loss:4.110159873962402;op_loss:0.5759779214859009 Training Epoch: 2 [3072/50000] Loss: 4.6861 LR: 0.100000 base_loss:4.094690322875977;op_loss:0.5454019904136658 Training Epoch: 2 [3200/50000] Loss: 4.6401 LR: 0.100000 base_loss:4.0557379722595215;op_loss:0.5510457754135132 Training Epoch: 2 [3328/50000] Loss: 4.6068 LR: 0.100000 base_loss:4.225389003753662;op_loss:0.609066367149353 Training Epoch: 2 [3456/50000] Loss: 4.8345 LR: 0.100000 base_loss:4.351457595825195;op_loss:0.5496102571487427 Training Epoch: 2 [3584/50000] Loss: 4.9011 LR: 0.100000 base_loss:4.223308086395264;op_loss:0.5360826849937439 Training Epoch: 2 [3712/50000] Loss: 4.7594 LR: 0.100000 base_loss:4.35025691986084;op_loss:0.5664458274841309 Training Epoch: 2 [3840/50000] Loss: 4.9167 LR: 0.100000 base_loss:4.255465030670166;op_loss:0.5395272374153137 Training Epoch: 2 [3968/50000] Loss: 4.7950 LR: 0.100000 base_loss:4.104780673980713;op_loss:0.5505534410476685 Training Epoch: 2 [4096/50000] Loss: 4.6553 LR: 0.100000 base_loss:4.003930568695068;op_loss:0.5199926495552063 Training Epoch: 2 [4224/50000] Loss: 4.5239 LR: 0.100000 base_loss:4.060220718383789;op_loss:0.5441393852233887 Training Epoch: 2 [4352/50000] Loss: 4.6044 LR: 0.100000 base_loss:4.145919322967529;op_loss:0.5306097269058228 Training Epoch: 2 
[4480/50000] Loss: 4.6765 LR: 0.100000 base_loss:4.119339942932129;op_loss:0.5385489463806152 Training Epoch: 2 [4608/50000] Loss: 4.6579 LR: 0.100000 base_loss:4.11818265914917;op_loss:0.560119092464447 Training Epoch: 2 [4736/50000] Loss: 4.6783 LR: 0.100000 base_loss:4.3832106590271;op_loss:0.5470128059387207 Training Epoch: 2 [4864/50000] Loss: 4.9302 LR: 0.100000 base_loss:4.000164985656738;op_loss:0.528315544128418 Training Epoch: 2 [4992/50000] Loss: 4.5285 LR: 0.100000 base_loss:4.128720760345459;op_loss:0.5646262168884277 Training Epoch: 2 [5120/50000] Loss: 4.6933 LR: 0.100000 base_loss:4.194307804107666;op_loss:0.5486522912979126 Training Epoch: 2 [5248/50000] Loss: 4.7430 LR: 0.100000 base_loss:4.138046741485596;op_loss:0.5774385333061218 Training Epoch: 2 [5376/50000] Loss: 4.7155 LR: 0.100000 base_loss:4.057068824768066;op_loss:0.552772045135498 Training Epoch: 2 [5504/50000] Loss: 4.6098 LR: 0.100000 base_loss:4.105223655700684;op_loss:0.5938466787338257 Training Epoch: 2 [5632/50000] Loss: 4.6991 LR: 0.100000 base_loss:4.064273357391357;op_loss:0.5769248008728027 Training Epoch: 2 [5760/50000] Loss: 4.6412 LR: 0.100000 base_loss:4.177098274230957;op_loss:0.541038990020752 Training Epoch: 2 [5888/50000] Loss: 4.7181 LR: 0.100000 base_loss:4.053581714630127;op_loss:0.5154528021812439 Training Epoch: 2 [6016/50000] Loss: 4.5690 LR: 0.100000 base_loss:3.9969096183776855;op_loss:0.5439358949661255 Training Epoch: 2 [6144/50000] Loss: 4.5408 LR: 0.100000 base_loss:4.2833333015441895;op_loss:0.596412181854248 Training Epoch: 2 [6272/50000] Loss: 4.8797 LR: 0.100000 base_loss:4.224669933319092;op_loss:0.5579344034194946 Training Epoch: 2 [6400/50000] Loss: 4.7826 LR: 0.100000 base_loss:4.171744346618652;op_loss:0.539709210395813 Training Epoch: 2 [6528/50000] Loss: 4.7115 LR: 0.100000 base_loss:3.9602854251861572;op_loss:0.5358787775039673 Training Epoch: 2 [6656/50000] Loss: 4.4962 LR: 0.100000 base_loss:3.9493770599365234;op_loss:0.5226551294326782 
Training Epoch: 2 [6784/50000] Loss: 4.4720 LR: 0.100000 base_loss:4.003721714019775;op_loss:0.5647306442260742 Training Epoch: 2 [6912/50000] Loss: 4.5685 LR: 0.100000 base_loss:3.979095458984375;op_loss:0.5380873084068298 Training Epoch: 2 [7040/50000] Loss: 4.5172 LR: 0.100000 base_loss:4.058328151702881;op_loss:0.5032943487167358 Training Epoch: 2 [7168/50000] Loss: 4.5616 LR: 0.100000 base_loss:4.172536849975586;op_loss:0.5657532215118408 Training Epoch: 2 [7296/50000] Loss: 4.7383 LR: 0.100000 base_loss:3.997509717941284;op_loss:0.540406346321106 Training Epoch: 2 [7424/50000] Loss: 4.5379 LR: 0.100000 base_loss:4.1355462074279785;op_loss:0.5518960356712341 Training Epoch: 2 [7552/50000] Loss: 4.6874 LR: 0.100000 base_loss:3.9606270790100098;op_loss:0.5324451923370361 Training Epoch: 2 [7680/50000] Loss: 4.4931 LR: 0.100000 base_loss:4.090785026550293;op_loss:0.5301138758659363 Training Epoch: 2 [7808/50000] Loss: 4.6209 LR: 0.100000 base_loss:4.014397144317627;op_loss:0.5153027176856995 Training Epoch: 2 [7936/50000] Loss: 4.5297 LR: 0.100000 base_loss:3.9166464805603027;op_loss:0.5357871055603027 Training Epoch: 2 [8064/50000] Loss: 4.4524 LR: 0.100000 base_loss:4.113657474517822;op_loss:0.5453531742095947 Training Epoch: 2 [8192/50000] Loss: 4.6590 LR: 0.100000 base_loss:4.08429479598999;op_loss:0.5551325082778931 Training Epoch: 2 [8320/50000] Loss: 4.6394 LR: 0.100000 base_loss:4.230902671813965;op_loss:0.5798734426498413 Training Epoch: 2 [8448/50000] Loss: 4.8108 LR: 0.100000 base_loss:3.9448187351226807;op_loss:0.5061675310134888 Training Epoch: 2 [8576/50000] Loss: 4.4510 LR: 0.100000 base_loss:4.013118743896484;op_loss:0.5448779463768005 Training Epoch: 2 [8704/50000] Loss: 4.5580 LR: 0.100000 base_loss:4.005880355834961;op_loss:0.5329939126968384 Training Epoch: 2 [8832/50000] Loss: 4.5389 LR: 0.100000 base_loss:4.086043834686279;op_loss:0.5604828596115112 Training Epoch: 2 [8960/50000] Loss: 4.6465 LR: 0.100000 
base_loss:4.011306285858154;op_loss:0.5554200410842896 Training Epoch: 2 [9088/50000] Loss: 4.5667 LR: 0.100000 base_loss:4.126717567443848;op_loss:0.6163482666015625 Training Epoch: 2 [9216/50000] Loss: 4.7431 LR: 0.100000 base_loss:4.091407775878906;op_loss:0.5237964391708374 Training Epoch: 2 [9344/50000] Loss: 4.6152 LR: 0.100000 base_loss:4.059153079986572;op_loss:0.5792750120162964 Training Epoch: 2 [9472/50000] Loss: 4.6384 LR: 0.100000 base_loss:3.981687307357788;op_loss:0.5341448783874512 Training Epoch: 2 [9600/50000] Loss: 4.5158 LR: 0.100000 base_loss:4.207813262939453;op_loss:0.5748806595802307 Training Epoch: 2 [9728/50000] Loss: 4.7827 LR: 0.100000 base_loss:4.008450984954834;op_loss:0.5871504545211792 Training Epoch: 2 [9856/50000] Loss: 4.5956 LR: 0.100000 base_loss:4.080292701721191;op_loss:0.5415703058242798 Training Epoch: 2 [9984/50000] Loss: 4.6219 LR: 0.100000 base_loss:3.9237353801727295;op_loss:0.524863064289093 Training Epoch: 2 [10112/50000] Loss: 4.4486 LR: 0.100000 base_loss:4.118155002593994;op_loss:0.5795061588287354 Training Epoch: 2 [10240/50000] Loss: 4.6977 LR: 0.100000 base_loss:3.868434190750122;op_loss:0.5216013193130493 Training Epoch: 2 [10368/50000] Loss: 4.3900 LR: 0.100000 base_loss:4.077742576599121;op_loss:0.5491902828216553 Training Epoch: 2 [10496/50000] Loss: 4.6269 LR: 0.100000 base_loss:3.8872153759002686;op_loss:0.5644634962081909 Training Epoch: 2 [10624/50000] Loss: 4.4517 LR: 0.100000 base_loss:4.216022968292236;op_loss:0.5591652989387512 Training Epoch: 2 [10752/50000] Loss: 4.7752 LR: 0.100000 base_loss:3.9916350841522217;op_loss:0.5574780106544495 Training Epoch: 2 [10880/50000] Loss: 4.5491 LR: 0.100000 base_loss:3.9976184368133545;op_loss:0.5582561492919922 Training Epoch: 2 [11008/50000] Loss: 4.5559 LR: 0.100000 base_loss:3.9008119106292725;op_loss:0.5464879274368286 Training Epoch: 2 [11136/50000] Loss: 4.4473 LR: 0.100000 base_loss:3.9299840927124023;op_loss:0.5282109379768372 Training Epoch: 2 
[11264/50000] Loss: 4.4582 LR: 0.100000 base_loss:4.037593364715576;op_loss:0.546193540096283 Training Epoch: 2 [11392/50000] Loss: 4.5838 LR: 0.100000 base_loss:3.9935431480407715;op_loss:0.5289690494537354 Training Epoch: 2 [11520/50000] Loss: 4.5225 LR: 0.100000 base_loss:4.07742977142334;op_loss:0.5728453397750854 Training Epoch: 2 [11648/50000] Loss: 4.6503 LR: 0.100000 base_loss:4.042142391204834;op_loss:0.547771155834198 Training Epoch: 2 [11776/50000] Loss: 4.5899 LR: 0.100000 base_loss:4.176750659942627;op_loss:0.557599663734436 Training Epoch: 2 [11904/50000] Loss: 4.7344 LR: 0.100000 base_loss:4.099972724914551;op_loss:0.5662820339202881 Training Epoch: 2 [12032/50000] Loss: 4.6663 LR: 0.100000 base_loss:4.098907947540283;op_loss:0.5655167102813721 Training Epoch: 2 [12160/50000] Loss: 4.6644 LR: 0.100000 base_loss:4.045915603637695;op_loss:0.5533789396286011 Training Epoch: 2 [12288/50000] Loss: 4.5993 LR: 0.100000 base_loss:4.0994486808776855;op_loss:0.5776498317718506 Training Epoch: 2 [12416/50000] Loss: 4.6771 LR: 0.100000 base_loss:3.9360766410827637;op_loss:0.5429057478904724 Training Epoch: 2 [12544/50000] Loss: 4.4790 LR: 0.100000 base_loss:3.9477016925811768;op_loss:0.538610577583313 Training Epoch: 2 [12672/50000] Loss: 4.4863 LR: 0.100000 base_loss:4.210412502288818;op_loss:0.5811970233917236 Training Epoch: 2 [12800/50000] Loss: 4.7916 LR: 0.100000 base_loss:3.9044268131256104;op_loss:0.5374801754951477 Training Epoch: 2 [12928/50000] Loss: 4.4419 LR: 0.100000 base_loss:4.1393961906433105;op_loss:0.6037479639053345 Training Epoch: 2 [13056/50000] Loss: 4.7431 LR: 0.100000 base_loss:3.940371513366699;op_loss:0.532346785068512 Training Epoch: 2 [13184/50000] Loss: 4.4727 LR: 0.100000 base_loss:3.9259440898895264;op_loss:0.550885021686554 Training Epoch: 2 [13312/50000] Loss: 4.4768 LR: 0.100000 base_loss:3.8317489624023438;op_loss:0.5320054292678833 Training Epoch: 2 [13440/50000] Loss: 4.3638 LR: 0.100000 
base_loss:3.880911350250244;op_loss:0.5213427543640137 Training Epoch: 2 [13568/50000] Loss: 4.4023 LR: 0.100000 base_loss:4.073591232299805;op_loss:0.5471822023391724 Training Epoch: 2 [13696/50000] Loss: 4.6208 LR: 0.100000 base_loss:4.001219749450684;op_loss:0.5670425891876221 Training Epoch: 2 [13824/50000] Loss: 4.5683 LR: 0.100000 base_loss:3.9181933403015137;op_loss:0.5290877223014832 Training Epoch: 2 [13952/50000] Loss: 4.4473 LR: 0.100000 base_loss:3.7816948890686035;op_loss:0.5207613706588745 Training Epoch: 2 [14080/50000] Loss: 4.3025 LR: 0.100000 base_loss:3.9933664798736572;op_loss:0.560212254524231 Training Epoch: 2 [14208/50000] Loss: 4.5536 LR: 0.100000 base_loss:3.93005633354187;op_loss:0.5484799146652222 Training Epoch: 2 [14336/50000] Loss: 4.4785 LR: 0.100000 base_loss:3.836000919342041;op_loss:0.55451500415802 Training Epoch: 2 [14464/50000] Loss: 4.3905 LR: 0.100000 base_loss:4.100119590759277;op_loss:0.5685286521911621 Training Epoch: 2 [14592/50000] Loss: 4.6686 LR: 0.100000 base_loss:4.030941009521484;op_loss:0.5982041358947754 Training Epoch: 2 [14720/50000] Loss: 4.6291 LR: 0.100000 base_loss:3.987234592437744;op_loss:0.566533088684082 Training Epoch: 2 [14848/50000] Loss: 4.5538 LR: 0.100000 base_loss:4.078621864318848;op_loss:0.6090583205223083 Training Epoch: 2 [14976/50000] Loss: 4.6877 LR: 0.100000 base_loss:4.001223087310791;op_loss:0.5583722591400146 Training Epoch: 2 [15104/50000] Loss: 4.5596 LR: 0.100000 base_loss:3.9525036811828613;op_loss:0.5767878293991089 Training Epoch: 2 [15232/50000] Loss: 4.5293 LR: 0.100000 base_loss:4.11889123916626;op_loss:0.5561187863349915 Training Epoch: 2 [15360/50000] Loss: 4.6750 LR: 0.100000 base_loss:3.926288366317749;op_loss:0.5604448318481445 Training Epoch: 2 [15488/50000] Loss: 4.4867 LR: 0.100000 base_loss:4.002326011657715;op_loss:0.5613310933113098 Training Epoch: 2 [15616/50000] Loss: 4.5637 LR: 0.100000 base_loss:4.182285308837891;op_loss:0.6124283075332642 Training Epoch: 2 
[15744/50000] Loss: 4.7947 LR: 0.100000 base_loss:3.996242046356201;op_loss:0.5417048335075378 Training Epoch: 2 [15872/50000] Loss: 4.5379 LR: 0.100000 base_loss:4.020390510559082;op_loss:0.5404340028762817 Training Epoch: 2 [16000/50000] Loss: 4.5608 LR: 0.100000 base_loss:3.944962739944458;op_loss:0.554619312286377 Training Epoch: 2 [16128/50000] Loss: 4.4996 LR: 0.100000 base_loss:3.8157522678375244;op_loss:0.5304190516471863 Training Epoch: 2 [16256/50000] Loss: 4.3462 LR: 0.100000 base_loss:3.8531455993652344;op_loss:0.5176503658294678 Training Epoch: 2 [16384/50000] Loss: 4.3708 LR: 0.100000 base_loss:3.969573974609375;op_loss:0.533740222454071 Training Epoch: 2 [16512/50000] Loss: 4.5033 LR: 0.100000 base_loss:4.1333112716674805;op_loss:0.5763177871704102 Training Epoch: 2 [16640/50000] Loss: 4.7096 LR: 0.100000 base_loss:4.007053375244141;op_loss:0.5581700801849365 Training Epoch: 2 [16768/50000] Loss: 4.5652 LR: 0.100000 base_loss:4.020654678344727;op_loss:0.5500863790512085 Training Epoch: 2 [16896/50000] Loss: 4.5707 LR: 0.100000 base_loss:3.8734445571899414;op_loss:0.5279120206832886 Training Epoch: 2 [17024/50000] Loss: 4.4014 LR: 0.100000 base_loss:3.935228109359741;op_loss:0.5523131489753723 Training Epoch: 2 [17152/50000] Loss: 4.4875 LR: 0.100000 base_loss:4.027062892913818;op_loss:0.5717694163322449 Training Epoch: 2 [17280/50000] Loss: 4.5988 LR: 0.100000 base_loss:4.000626087188721;op_loss:0.5528971552848816 Training Epoch: 2 [17408/50000] Loss: 4.5535 LR: 0.100000 base_loss:3.934094190597534;op_loss:0.5325213074684143 Training Epoch: 2 [17536/50000] Loss: 4.4666 LR: 0.100000 base_loss:3.9861996173858643;op_loss:0.5766400098800659 Training Epoch: 2 [17664/50000] Loss: 4.5628 LR: 0.100000 base_loss:4.126750946044922;op_loss:0.5524164438247681 Training Epoch: 2 [17792/50000] Loss: 4.6792 LR: 0.100000 base_loss:4.030887126922607;op_loss:0.5731723308563232 Training Epoch: 2 [17920/50000] Loss: 4.6041 LR: 0.100000 
base_loss:3.911486864089966;op_loss:0.547804594039917 Training Epoch: 2 [18048/50000] Loss: 4.4593 LR: 0.100000 base_loss:3.832413673400879;op_loss:0.5421112775802612 Training Epoch: 2 [18176/50000] Loss: 4.3745 LR: 0.100000 base_loss:3.91367506980896;op_loss:0.5375984907150269 Training Epoch: 2 [18304/50000] Loss: 4.4513 LR: 0.100000 base_loss:3.917893409729004;op_loss:0.5319538116455078 Training Epoch: 2 [18432/50000] Loss: 4.4498 LR: 0.100000 base_loss:3.9348998069763184;op_loss:0.5671139359474182 Training Epoch: 2 [18560/50000] Loss: 4.5020 LR: 0.100000 base_loss:3.8741447925567627;op_loss:0.5473669767379761 Training Epoch: 2 [18688/50000] Loss: 4.4215 LR: 0.100000 base_loss:3.9020802974700928;op_loss:0.5262272357940674 Training Epoch: 2 [18816/50000] Loss: 4.4283 LR: 0.100000 base_loss:3.887955665588379;op_loss:0.5487526655197144 Training Epoch: 2 [18944/50000] Loss: 4.4367 LR: 0.100000 base_loss:3.846449136734009;op_loss:0.5653930902481079 Training Epoch: 2 [19072/50000] Loss: 4.4118 LR: 0.100000 base_loss:4.031688690185547;op_loss:0.5698837041854858 Training Epoch: 2 [19200/50000] Loss: 4.6016 LR: 0.100000 base_loss:3.9116621017456055;op_loss:0.5691394209861755 Training Epoch: 2 [19328/50000] Loss: 4.4808 LR: 0.100000 base_loss:3.9180822372436523;op_loss:0.5745803117752075 Training Epoch: 2 [19456/50000] Loss: 4.4927 LR: 0.100000 base_loss:3.887133836746216;op_loss:0.5276685953140259 Training Epoch: 2 [19584/50000] Loss: 4.4148 LR: 0.100000 base_loss:3.848511219024658;op_loss:0.5183668732643127 Training Epoch: 2 [19712/50000] Loss: 4.3669 LR: 0.100000 base_loss:3.9894495010375977;op_loss:0.5411995649337769 Training Epoch: 2 [19840/50000] Loss: 4.5306 LR: 0.100000 base_loss:3.8970415592193604;op_loss:0.5570001006126404 Training Epoch: 2 [19968/50000] Loss: 4.4540 LR: 0.100000 base_loss:4.106607913970947;op_loss:0.5823178291320801 Training Epoch: 2 [20096/50000] Loss: 4.6889 LR: 0.100000 base_loss:3.996297836303711;op_loss:0.5623360872268677 Training Epoch: 2 
[20224/50000] Loss: 4.5586 LR: 0.100000 base_loss:3.9712746143341064;op_loss:0.5528634786605835 Training Epoch: 2 [20352/50000] Loss: 4.5241 LR: 0.100000 base_loss:4.122766971588135;op_loss:0.5336371660232544 Training Epoch: 2 [20480/50000] Loss: 4.6564 LR: 0.100000 base_loss:3.965708017349243;op_loss:0.5651992559432983 Training Epoch: 2 [20608/50000] Loss: 4.5309 LR: 0.100000 base_loss:3.8252618312835693;op_loss:0.5347216129302979 Training Epoch: 2 [20736/50000] Loss: 4.3600 LR: 0.100000 base_loss:4.087770938873291;op_loss:0.5342988967895508 Training Epoch: 2 [20864/50000] Loss: 4.6221 LR: 0.100000 base_loss:3.8877315521240234;op_loss:0.539771556854248 Training Epoch: 2 [20992/50000] Loss: 4.4275 LR: 0.100000 base_loss:3.938214063644409;op_loss:0.5663728713989258 Training Epoch: 2 [21120/50000] Loss: 4.5046 LR: 0.100000 base_loss:3.9643566608428955;op_loss:0.5357565879821777 Training Epoch: 2 [21248/50000] Loss: 4.5001 LR: 0.100000 base_loss:4.123490333557129;op_loss:0.6017665863037109 Training Epoch: 2 [21376/50000] Loss: 4.7253 LR: 0.100000 base_loss:3.916621685028076;op_loss:0.5306718349456787 Training Epoch: 2 [21504/50000] Loss: 4.4473 LR: 0.100000 base_loss:3.9351696968078613;op_loss:0.5486545562744141 Training Epoch: 2 [21632/50000] Loss: 4.4838 LR: 0.100000 base_loss:3.9070043563842773;op_loss:0.5428156852722168 Training Epoch: 2 [21760/50000] Loss: 4.4498 LR: 0.100000 base_loss:3.899783134460449;op_loss:0.5319163203239441 Training Epoch: 2 [21888/50000] Loss: 4.4317 LR: 0.100000 base_loss:3.9853343963623047;op_loss:0.5571230053901672 Training Epoch: 2 [22016/50000] Loss: 4.5425 LR: 0.100000 base_loss:3.9992835521698;op_loss:0.5658630728721619 Training Epoch: 2 [22144/50000] Loss: 4.5651 LR: 0.100000 base_loss:3.716024875640869;op_loss:0.5126596093177795 Training Epoch: 2 [22272/50000] Loss: 4.2287 LR: 0.100000 base_loss:4.106863021850586;op_loss:0.5616646409034729 Training Epoch: 2 [22400/50000] Loss: 4.6685 LR: 0.100000 
base_loss:3.8402814865112305;op_loss:0.5228695869445801 Training Epoch: 2 [22528/50000] Loss: 4.3632 LR: 0.100000 base_loss:3.886693239212036;op_loss:0.5065106153488159 Training Epoch: 2 [22656/50000] Loss: 4.3932 LR: 0.100000 base_loss:4.062132358551025;op_loss:0.6078095436096191 Training Epoch: 2 [22784/50000] Loss: 4.6699 LR: 0.100000 base_loss:3.7729077339172363;op_loss:0.5214307308197021 Training Epoch: 2 [22912/50000] Loss: 4.2943 LR: 0.100000 base_loss:3.8444833755493164;op_loss:0.5350896120071411 Training Epoch: 2 [23040/50000] Loss: 4.3796 LR: 0.100000 base_loss:3.901613712310791;op_loss:0.5505872368812561 Training Epoch: 2 [23168/50000] Loss: 4.4522 LR: 0.100000 base_loss:4.030137538909912;op_loss:0.5822103023529053 Training Epoch: 2 [23296/50000] Loss: 4.6123 LR: 0.100000 base_loss:4.140347957611084;op_loss:0.5442243814468384 Training Epoch: 2 [23424/50000] Loss: 4.6846 LR: 0.100000 base_loss:4.010788917541504;op_loss:0.5474587678909302 Training Epoch: 2 [23552/50000] Loss: 4.5582 LR: 0.100000 base_loss:4.013339996337891;op_loss:0.568802535533905 Training Epoch: 2 [23680/50000] Loss: 4.5821 LR: 0.100000 base_loss:4.00309944152832;op_loss:0.5742208957672119 Training Epoch: 2 [23808/50000] Loss: 4.5773 LR: 0.100000 base_loss:3.830199956893921;op_loss:0.5263798236846924 Training Epoch: 2 [23936/50000] Loss: 4.3566 LR: 0.100000 base_loss:4.025938510894775;op_loss:0.5630514621734619 Training Epoch: 2 [24064/50000] Loss: 4.5890 LR: 0.100000 base_loss:3.835080862045288;op_loss:0.5731643438339233 Training Epoch: 2 [24192/50000] Loss: 4.4082 LR: 0.100000 base_loss:3.789602518081665;op_loss:0.5308221578598022 Training Epoch: 2 [24320/50000] Loss: 4.3204 LR: 0.100000 base_loss:3.897852659225464;op_loss:0.5399445295333862 Training Epoch: 2 [24448/50000] Loss: 4.4378 LR: 0.100000 base_loss:3.82706880569458;op_loss:0.5551329851150513 Training Epoch: 2 [24576/50000] Loss: 4.3822 LR: 0.100000 base_loss:3.956073045730591;op_loss:0.5589912533760071 Training Epoch: 2 
[24704/50000] Loss: 4.5151 LR: 0.100000 base_loss:3.8474905490875244;op_loss:0.5719531178474426 Training Epoch: 2 [24832/50000] Loss: 4.4194 LR: 0.100000 base_loss:3.898164987564087;op_loss:0.562261700630188 Training Epoch: 2 [24960/50000] Loss: 4.4604 LR: 0.100000 base_loss:4.061901092529297;op_loss:0.5722566843032837 Training Epoch: 2 [25088/50000] Loss: 4.6342 LR: 0.100000 base_loss:3.7705395221710205;op_loss:0.5475386381149292 Training Epoch: 2 [25216/50000] Loss: 4.3181 LR: 0.100000 base_loss:3.700868606567383;op_loss:0.5162238478660583 Training Epoch: 2 [25344/50000] Loss: 4.2171 LR: 0.100000 base_loss:4.059940338134766;op_loss:0.6250487565994263 Training Epoch: 2 [25472/50000] Loss: 4.6850 LR: 0.100000 base_loss:3.941880702972412;op_loss:0.5762021541595459 Training Epoch: 2 [25600/50000] Loss: 4.5181 LR: 0.100000 base_loss:3.712475299835205;op_loss:0.5045211911201477 Training Epoch: 2 [25728/50000] Loss: 4.2170 LR: 0.100000 base_loss:3.9507226943969727;op_loss:0.5714401006698608 Training Epoch: 2 [25856/50000] Loss: 4.5222 LR: 0.100000 base_loss:3.8116304874420166;op_loss:0.5507208108901978 Training Epoch: 2 [25984/50000] Loss: 4.3624 LR: 0.100000 base_loss:3.8392250537872314;op_loss:0.5401781797409058 Training Epoch: 2 [26112/50000] Loss: 4.3794 LR: 0.100000 base_loss:4.095599174499512;op_loss:0.5746181011199951 Training Epoch: 2 [26240/50000] Loss: 4.6702 LR: 0.100000 base_loss:3.993638038635254;op_loss:0.5610768795013428 Training Epoch: 2 [26368/50000] Loss: 4.5547 LR: 0.100000 base_loss:3.8439080715179443;op_loss:0.5643384456634521 Training Epoch: 2 [26496/50000] Loss: 4.4082 LR: 0.100000 base_loss:3.7558159828186035;op_loss:0.5430795550346375 Training Epoch: 2 [26624/50000] Loss: 4.2989 LR: 0.100000 base_loss:3.8994176387786865;op_loss:0.5508720874786377 Training Epoch: 2 [26752/50000] Loss: 4.4503 LR: 0.100000 base_loss:3.832751750946045;op_loss:0.5519804954528809 Training Epoch: 2 [26880/50000] Loss: 4.3847 LR: 0.100000 
base_loss:3.7795217037200928;op_loss:0.5214532613754272 Training Epoch: 2 [27008/50000] Loss: 4.3010 LR: 0.100000 base_loss:3.8258230686187744;op_loss:0.5615601539611816 Training Epoch: 2 [27136/50000] Loss: 4.3874 LR: 0.100000 base_loss:3.8854758739471436;op_loss:0.5499289631843567 Training Epoch: 2 [27264/50000] Loss: 4.4354 LR: 0.100000 base_loss:3.6913657188415527;op_loss:0.5313825011253357 Training Epoch: 2 [27392/50000] Loss: 4.2227 LR: 0.100000 base_loss:3.885385513305664;op_loss:0.5844192504882812 Training Epoch: 2 [27520/50000] Loss: 4.4698 LR: 0.100000 base_loss:3.8427481651306152;op_loss:0.5542815923690796 Training Epoch: 2 [27648/50000] Loss: 4.3970 LR: 0.100000 base_loss:3.936248779296875;op_loss:0.5676923990249634 Training Epoch: 2 [27776/50000] Loss: 4.5039 LR: 0.100000 base_loss:3.9283361434936523;op_loss:0.5396283864974976 Training Epoch: 2 [27904/50000] Loss: 4.4680 LR: 0.100000 base_loss:3.8383162021636963;op_loss:0.5481919050216675 Training Epoch: 2 [28032/50000] Loss: 4.3865 LR: 0.100000 base_loss:3.753767967224121;op_loss:0.5237563848495483 Training Epoch: 2 [28160/50000] Loss: 4.2775 LR: 0.100000 base_loss:3.7770819664001465;op_loss:0.5357043743133545 Training Epoch: 2 [28288/50000] Loss: 4.3128 LR: 0.100000 base_loss:3.775968074798584;op_loss:0.5484045743942261 Training Epoch: 2 [28416/50000] Loss: 4.3244 LR: 0.100000 base_loss:3.8435142040252686;op_loss:0.5724435448646545 Training Epoch: 2 [28544/50000] Loss: 4.4160 LR: 0.100000 base_loss:3.767258882522583;op_loss:0.511907696723938 Training Epoch: 2 [28672/50000] Loss: 4.2792 LR: 0.100000 base_loss:3.7392678260803223;op_loss:0.564441442489624 Training Epoch: 2 [28800/50000] Loss: 4.3037 LR: 0.100000 base_loss:3.997739553451538;op_loss:0.5390644669532776 Training Epoch: 2 [28928/50000] Loss: 4.5368 LR: 0.100000 base_loss:3.846623420715332;op_loss:0.5556312799453735 Training Epoch: 2 [29056/50000] Loss: 4.4023 LR: 0.100000 base_loss:3.877669334411621;op_loss:0.5602124929428101 Training Epoch: 
2 [29184/50000] Loss: 4.4379 LR: 0.100000 base_loss:3.759528398513794;op_loss:0.5330902338027954 Training Epoch: 2 [29312/50000] Loss: 4.2926 LR: 0.100000 base_loss:3.962939500808716;op_loss:0.5954258441925049 Training Epoch: 2 [29440/50000] Loss: 4.5584 LR: 0.100000 base_loss:3.9653871059417725;op_loss:0.5620343685150146 Training Epoch: 2 [29568/50000] Loss: 4.5274 LR: 0.100000 base_loss:3.8975536823272705;op_loss:0.553621768951416 Training Epoch: 2 [29696/50000] Loss: 4.4512 LR: 0.100000 base_loss:3.9335806369781494;op_loss:0.5649428367614746 Training Epoch: 2 [29824/50000] Loss: 4.4985 LR: 0.100000 base_loss:3.818086862564087;op_loss:0.5452444553375244 Training Epoch: 2 [29952/50000] Loss: 4.3633 LR: 0.100000 base_loss:3.938337802886963;op_loss:0.571191132068634 Training Epoch: 2 [30080/50000] Loss: 4.5095 LR: 0.100000 base_loss:3.6936147212982178;op_loss:0.5652775764465332 Training Epoch: 2 [30208/50000] Loss: 4.2589 LR: 0.100000 base_loss:3.6759190559387207;op_loss:0.5365843772888184 Training Epoch: 2 [30336/50000] Loss: 4.2125 LR: 0.100000 base_loss:3.935365676879883;op_loss:0.5915765166282654 Training Epoch: 2 [30464/50000] Loss: 4.5269 LR: 0.100000 base_loss:3.7919299602508545;op_loss:0.5455601215362549 Training Epoch: 2 [30592/50000] Loss: 4.3375 LR: 0.100000 base_loss:3.8465006351470947;op_loss:0.5511833429336548 Training Epoch: 2 [30720/50000] Loss: 4.3977 LR: 0.100000 base_loss:3.6641602516174316;op_loss:0.5380809307098389 Training Epoch: 2 [30848/50000] Loss: 4.2022 LR: 0.100000 base_loss:3.7215418815612793;op_loss:0.5309674739837646 Training Epoch: 2 [30976/50000] Loss: 4.2525 LR: 0.100000 base_loss:3.7070746421813965;op_loss:0.5367683172225952 Training Epoch: 2 [31104/50000] Loss: 4.2438 LR: 0.100000 base_loss:3.83191180229187;op_loss:0.5683099031448364 Training Epoch: 2 [31232/50000] Loss: 4.4002 LR: 0.100000 base_loss:3.687145471572876;op_loss:0.5676800012588501 Training Epoch: 2 [31360/50000] Loss: 4.2548 LR: 0.100000 
base_loss:3.8021647930145264;op_loss:0.5442438125610352 Training Epoch: 2 [31488/50000] Loss: 4.3464 LR: 0.100000 base_loss:3.9961986541748047;op_loss:0.5436975955963135 Training Epoch: 2 [31616/50000] Loss: 4.5399 LR: 0.100000 base_loss:3.748223066329956;op_loss:0.5534614324569702 Training Epoch: 2 [31744/50000] Loss: 4.3017 LR: 0.100000 base_loss:3.763866662979126;op_loss:0.5668948888778687 Training Epoch: 2 [31872/50000] Loss: 4.3308 LR: 0.100000 base_loss:3.8091788291931152;op_loss:0.5503913164138794 Training Epoch: 2 [32000/50000] Loss: 4.3596 LR: 0.100000 base_loss:3.8184268474578857;op_loss:0.554498553276062 Training Epoch: 2 [32128/50000] Loss: 4.3729 LR: 0.100000 base_loss:3.8836381435394287;op_loss:0.5511849522590637 Training Epoch: 2 [32256/50000] Loss: 4.4348 LR: 0.100000 base_loss:3.7631940841674805;op_loss:0.5355446338653564 Training Epoch: 2 [32384/50000] Loss: 4.2987 LR: 0.100000 base_loss:3.9454233646392822;op_loss:0.5683684945106506 Training Epoch: 2 [32512/50000] Loss: 4.5138 LR: 0.100000 base_loss:3.9010353088378906;op_loss:0.5177701711654663 Training Epoch: 2 [32640/50000] Loss: 4.4188 LR: 0.100000 base_loss:3.8644816875457764;op_loss:0.5810962915420532 Training Epoch: 2 [32768/50000] Loss: 4.4456 LR: 0.100000 base_loss:3.881481170654297;op_loss:0.5591676235198975 Training Epoch: 2 [32896/50000] Loss: 4.4406 LR: 0.100000 base_loss:3.805453062057495;op_loss:0.5270752906799316 Training Epoch: 2 [33024/50000] Loss: 4.3325 LR: 0.100000 base_loss:3.8777475357055664;op_loss:0.5881315469741821 Training Epoch: 2 [33152/50000] Loss: 4.4659 LR: 0.100000 base_loss:3.8695993423461914;op_loss:0.5827788710594177 Training Epoch: 2 [33280/50000] Loss: 4.4524 LR: 0.100000 base_loss:3.704249858856201;op_loss:0.5382960438728333 Training Epoch: 2 [33408/50000] Loss: 4.2425 LR: 0.100000 base_loss:3.7612247467041016;op_loss:0.5391104817390442 Training Epoch: 2 [33536/50000] Loss: 4.3003 LR: 0.100000 base_loss:3.8835256099700928;op_loss:0.5955924987792969 Training 
Epoch: 2 [33664/50000] Loss: 4.4791 LR: 0.100000 base_loss:3.7209033966064453;op_loss:0.5071792006492615 Training Epoch: 2 [33792/50000] Loss: 4.2281 LR: 0.100000 base_loss:3.689117431640625;op_loss:0.5536002516746521 Training Epoch: 2 [33920/50000] Loss: 4.2427 LR: 0.100000 base_loss:3.8811049461364746;op_loss:0.5463123917579651 Training Epoch: 2 [34048/50000] Loss: 4.4274 LR: 0.100000 base_loss:3.8837778568267822;op_loss:0.5643473863601685 Training Epoch: 2 [34176/50000] Loss: 4.4481 LR: 0.100000 base_loss:3.891619920730591;op_loss:0.5946428775787354 Training Epoch: 2 [34304/50000] Loss: 4.4863 LR: 0.100000 base_loss:3.7175278663635254;op_loss:0.5497081279754639 Training Epoch: 2 [34432/50000] Loss: 4.2672 LR: 0.100000 base_loss:3.782231092453003;op_loss:0.5552924871444702 Training Epoch: 2 [34560/50000] Loss: 4.3375 LR: 0.100000 base_loss:3.6383509635925293;op_loss:0.5464444160461426 Training Epoch: 2 [34688/50000] Loss: 4.1848 LR: 0.100000 base_loss:3.7457454204559326;op_loss:0.5486184358596802 Training Epoch: 2 [34816/50000] Loss: 4.2944 LR: 0.100000 base_loss:3.6042778491973877;op_loss:0.5284616947174072 Training Epoch: 2 [34944/50000] Loss: 4.1327 LR: 0.100000 base_loss:3.6705310344696045;op_loss:0.5567630529403687 Training Epoch: 2 [35072/50000] Loss: 4.2273 LR: 0.100000 base_loss:3.7394771575927734;op_loss:0.5675286054611206 Training Epoch: 2 [35200/50000] Loss: 4.3070 LR: 0.100000 base_loss:3.779526948928833;op_loss:0.5739312171936035 Training Epoch: 2 [35328/50000] Loss: 4.3535 LR: 0.100000 base_loss:3.9953362941741943;op_loss:0.5840990543365479 Training Epoch: 2 [35456/50000] Loss: 4.5794 LR: 0.100000 base_loss:3.669971227645874;op_loss:0.5484297275543213 Training Epoch: 2 [35584/50000] Loss: 4.2184 LR: 0.100000 base_loss:3.697075843811035;op_loss:0.5392510294914246 Training Epoch: 2 [35712/50000] Loss: 4.2363 LR: 0.100000 base_loss:3.703193187713623;op_loss:0.5987867116928101 Training Epoch: 2 [35840/50000] Loss: 4.3020 LR: 0.100000 
base_loss:3.948091506958008;op_loss:0.5643701553344727 Training Epoch: 2 [35968/50000] Loss: 4.5125 LR: 0.100000 base_loss:3.746151924133301;op_loss:0.5392628908157349 Training Epoch: 2 [36096/50000] Loss: 4.2854 LR: 0.100000 base_loss:3.6312782764434814;op_loss:0.5010913610458374 Training Epoch: 2 [36224/50000] Loss: 4.1324 LR: 0.100000 base_loss:3.8958747386932373;op_loss:0.5502843856811523 Training Epoch: 2 [36352/50000] Loss: 4.4462 LR: 0.100000 base_loss:3.9277234077453613;op_loss:0.5992369651794434 Training Epoch: 2 [36480/50000] Loss: 4.5270 LR: 0.100000 base_loss:3.767594814300537;op_loss:0.5715197324752808 Training Epoch: 2 [36608/50000] Loss: 4.3391 LR: 0.100000 base_loss:3.9161529541015625;op_loss:0.5501008033752441 Training Epoch: 2 [36736/50000] Loss: 4.4663 LR: 0.100000 base_loss:3.5899252891540527;op_loss:0.5299320220947266 Training Epoch: 2 [36864/50000] Loss: 4.1199 LR: 0.100000 base_loss:3.7980997562408447;op_loss:0.5714657306671143 Training Epoch: 2 [36992/50000] Loss: 4.3696 LR: 0.100000 base_loss:3.7315499782562256;op_loss:0.540511965751648 Training Epoch: 2 [37120/50000] Loss: 4.2721 LR: 0.100000 base_loss:3.830615282058716;op_loss:0.5773620009422302 Training Epoch: 2 [37248/50000] Loss: 4.4080 LR: 0.100000 base_loss:3.806159257888794;op_loss:0.5807817578315735 Training Epoch: 2 [37376/50000] Loss: 4.3869 LR: 0.100000 base_loss:3.804353952407837;op_loss:0.5602822303771973 Training Epoch: 2 [37504/50000] Loss: 4.3646 LR: 0.100000 base_loss:3.6165828704833984;op_loss:0.5466784238815308 Training Epoch: 2 [37632/50000] Loss: 4.1633 LR: 0.100000 base_loss:3.851067543029785;op_loss:0.5554298758506775 Training Epoch: 2 [37760/50000] Loss: 4.4065 LR: 0.100000 base_loss:3.4815421104431152;op_loss:0.5254203081130981 Training Epoch: 2 [37888/50000] Loss: 4.0070 LR: 0.100000 base_loss:3.733215808868408;op_loss:0.5669469833374023 Training Epoch: 2 [38016/50000] Loss: 4.3002 LR: 0.100000 base_loss:3.7020106315612793;op_loss:0.5734156966209412 Training 
Epoch: 2 [38144/50000] Loss: 4.2754 LR: 0.100000 base_loss:3.6346089839935303;op_loss:0.5675078630447388 Training Epoch: 2 [38272/50000] Loss: 4.2021 LR: 0.100000 base_loss:3.5478475093841553;op_loss:0.5357934236526489 Training Epoch: 2 [38400/50000] Loss: 4.0836 LR: 0.100000 base_loss:3.640148639678955;op_loss:0.5261424779891968 Training Epoch: 2 [38528/50000] Loss: 4.1663 LR: 0.100000 base_loss:3.7112765312194824;op_loss:0.5307673215866089 Training Epoch: 2 [38656/50000] Loss: 4.2420 LR: 0.100000 base_loss:3.5381064414978027;op_loss:0.5429272055625916 Training Epoch: 2 [38784/50000] Loss: 4.0810 LR: 0.100000 base_loss:3.752046585083008;op_loss:0.5622568130493164 Training Epoch: 2 [38912/50000] Loss: 4.3143 LR: 0.100000 base_loss:3.8045544624328613;op_loss:0.5921481847763062 Training Epoch: 2 [39040/50000] Loss: 4.3967 LR: 0.100000 base_loss:3.7374045848846436;op_loss:0.5568718910217285 Training Epoch: 2 [39168/50000] Loss: 4.2943 LR: 0.100000 base_loss:3.764887809753418;op_loss:0.5605908632278442 Training Epoch: 2 [39296/50000] Loss: 4.3255 LR: 0.100000 base_loss:3.6292269229888916;op_loss:0.5411778092384338 Training Epoch: 2 [39424/50000] Loss: 4.1704 LR: 0.100000 base_loss:3.6681225299835205;op_loss:0.5112259984016418 Training Epoch: 2 [39552/50000] Loss: 4.1793 LR: 0.100000 base_loss:3.6919984817504883;op_loss:0.552815318107605 Training Epoch: 2 [39680/50000] Loss: 4.2448 LR: 0.100000 base_loss:3.6374194622039795;op_loss:0.5640242099761963 Training Epoch: 2 [39808/50000] Loss: 4.2014 LR: 0.100000 base_loss:3.694801092147827;op_loss:0.5668648481369019 Training Epoch: 2 [39936/50000] Loss: 4.2617 LR: 0.100000 base_loss:3.7429771423339844;op_loss:0.5550873875617981 Training Epoch: 2 [40064/50000] Loss: 4.2981 LR: 0.100000 base_loss:3.6577112674713135;op_loss:0.5416204929351807 Training Epoch: 2 [40192/50000] Loss: 4.1993 LR: 0.100000 base_loss:3.5659420490264893;op_loss:0.5471091270446777 Training Epoch: 2 [40320/50000] Loss: 4.1131 LR: 0.100000 
base_loss:3.695629596710205;op_loss:0.5519151091575623 Training Epoch: 2 [40448/50000] Loss: 4.2475 LR: 0.100000 base_loss:3.5980539321899414;op_loss:0.5849438905715942 Training Epoch: 2 [40576/50000] Loss: 4.1830 LR: 0.100000 base_loss:3.7832765579223633;op_loss:0.5506361722946167 Training Epoch: 2 [40704/50000] Loss: 4.3339 LR: 0.100000 base_loss:3.767059326171875;op_loss:0.5836620330810547 Training Epoch: 2 [40832/50000] Loss: 4.3507 LR: 0.100000 base_loss:3.5711848735809326;op_loss:0.4968909025192261 Training Epoch: 2 [40960/50000] Loss: 4.0681 LR: 0.100000 base_loss:3.7396597862243652;op_loss:0.5382078886032104 Training Epoch: 2 [41088/50000] Loss: 4.2779 LR: 0.100000 base_loss:3.6316683292388916;op_loss:0.5416370630264282 Training Epoch: 2 [41216/50000] Loss: 4.1733 LR: 0.100000 base_loss:3.6659433841705322;op_loss:0.5240254402160645 Training Epoch: 2 [41344/50000] Loss: 4.1900 LR: 0.100000 base_loss:3.622417449951172;op_loss:0.5476852059364319 Training Epoch: 2 [41472/50000] Loss: 4.1701 LR: 0.100000 base_loss:3.5213141441345215;op_loss:0.5527032613754272 Training Epoch: 2 [41600/50000] Loss: 4.0740 LR: 0.100000 base_loss:3.851975917816162;op_loss:0.5474400520324707 Training Epoch: 2 [41728/50000] Loss: 4.3994 LR: 0.100000 base_loss:3.6676106452941895;op_loss:0.5726674795150757 Training Epoch: 2 [41856/50000] Loss: 4.2403 LR: 0.100000 base_loss:3.666172504425049;op_loss:0.5618478059768677 Training Epoch: 2 [41984/50000] Loss: 4.2280 LR: 0.100000 base_loss:3.665099859237671;op_loss:0.546825647354126 Training Epoch: 2 [42112/50000] Loss: 4.2119 LR: 0.100000 base_loss:3.8053016662597656;op_loss:0.5450616478919983 Training Epoch: 2 [42240/50000] Loss: 4.3504 LR: 0.100000 base_loss:3.467334270477295;op_loss:0.5174261331558228 Training Epoch: 2 [42368/50000] Loss: 3.9848 LR: 0.100000 base_loss:3.6633100509643555;op_loss:0.566246747970581 Training Epoch: 2 [42496/50000] Loss: 4.2296 LR: 0.100000 base_loss:3.712745428085327;op_loss:0.5483819246292114 Training Epoch: 
2 [42624/50000] Loss: 4.2611 LR: 0.100000 base_loss:3.606707811355591;op_loss:0.5496277809143066 Training Epoch: 2 [42752/50000] Loss: 4.1563 LR: 0.100000 base_loss:3.5225234031677246;op_loss:0.5447441339492798 Training Epoch: 2 [42880/50000] Loss: 4.0673 LR: 0.100000 base_loss:3.703758716583252;op_loss:0.5799238681793213 Training Epoch: 2 [43008/50000] Loss: 4.2837 LR: 0.100000 base_loss:3.8243813514709473;op_loss:0.5476846098899841 Training Epoch: 2 [43136/50000] Loss: 4.3721 LR: 0.100000 base_loss:3.8335187435150146;op_loss:0.5841560363769531 Training Epoch: 2 [43264/50000] Loss: 4.4177 LR: 0.100000 base_loss:3.772874355316162;op_loss:0.5466620922088623 Training Epoch: 2 [43392/50000] Loss: 4.3195 LR: 0.100000 base_loss:3.614379405975342;op_loss:0.5419776439666748 Training Epoch: 2 [43520/50000] Loss: 4.1564 LR: 0.100000 base_loss:3.5509157180786133;op_loss:0.5490822196006775 Training Epoch: 2 [43648/50000] Loss: 4.1000 LR: 0.100000 base_loss:3.785449743270874;op_loss:0.5515962839126587 Training Epoch: 2 [43776/50000] Loss: 4.3370 LR: 0.100000 base_loss:3.7325148582458496;op_loss:0.5258728265762329 Training Epoch: 2 [43904/50000] Loss: 4.2584 LR: 0.100000 base_loss:3.6273269653320312;op_loss:0.5088866949081421 Training Epoch: 2 [44032/50000] Loss: 4.1362 LR: 0.100000 base_loss:3.7741997241973877;op_loss:0.5788636803627014 Training Epoch: 2 [44160/50000] Loss: 4.3531 LR: 0.100000 base_loss:3.7089836597442627;op_loss:0.5462599992752075 Training Epoch: 2 [44288/50000] Loss: 4.2552 LR: 0.100000 base_loss:3.6435673236846924;op_loss:0.5522520542144775 Training Epoch: 2 [44416/50000] Loss: 4.1958 LR: 0.100000 base_loss:3.6178042888641357;op_loss:0.5518785715103149 Training Epoch: 2 [44544/50000] Loss: 4.1697 LR: 0.100000 base_loss:3.5295042991638184;op_loss:0.5132536888122559 Training Epoch: 2 [44672/50000] Loss: 4.0428 LR: 0.100000 base_loss:3.848250150680542;op_loss:0.5623078346252441 Training Epoch: 2 [44800/50000] Loss: 4.4106 LR: 0.100000 
base_loss:3.5371782779693604;op_loss:0.5307868719100952 Training Epoch: 2 [44928/50000] Loss: 4.0680 LR: 0.100000 base_loss:3.611419200897217;op_loss:0.5348109006881714 ############################################################

    opened by ChenJunzhi-buaa 2
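The log above pairs each `base_loss`/`op_loss` with the `Loss` reported for the same step. A minimal sketch for parsing such entries and checking that the reported total matches the sum of the two parts, using two entries copied from the log; the regular expression and the helper name `parse_log` are illustrative assumptions, not part of the repository:

```python
import re

# Two entries copied verbatim from the log above.
LOG = (
    "base_loss:3.7795217037200928;op_loss:0.5214532613754272 "
    "Training Epoch: 2 [27008/50000] Loss: 4.3010 LR: 0.100000 "
    "base_loss:3.8258230686187744;op_loss:0.5615601539611816 "
    "Training Epoch: 2 [27136/50000] Loss: 4.3874 LR: 0.100000"
)

# \s+ also matches line breaks, so entries wrapped across lines still parse.
PATTERN = re.compile(
    r"base_loss:(?P<base>[\d.]+);op_loss:(?P<op>[\d.]+)\s+"
    r"Training\s+Epoch:\s*(?P<epoch>\d+)\s*\[(?P<seen>\d+)/(?P<total>\d+)\]\s*"
    r"Loss:\s*(?P<loss>[\d.]+)\s*LR:\s*(?P<lr>[\d.]+)"
)

def parse_log(text):
    """Return one dict per logged training step (hypothetical helper)."""
    records = []
    for m in PATTERN.finditer(text):
        records.append({
            "epoch": int(m["epoch"]),
            "seen": int(m["seen"]),
            "base_loss": float(m["base"]),
            "op_loss": float(m["op"]),
            "loss": float(m["loss"]),
            "lr": float(m["lr"]),
        })
    return records

records = parse_log(LOG)
for r in records:
    total = r["base_loss"] + r["op_loss"]
    # The log prints Loss with four decimals, so allow a small rounding gap.
    print(r["seen"], f"{total:.4f}", r["loss"], abs(total - r["loss"]) < 1e-3)
```

For these two entries the parts add up to the reported `Loss` within rounding, which suggests the logged `op_loss` is already the weighted term that is added to the base loss.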
  • Large batch size

    Thanks for generously sharing your code!

    I have a small question, though. With a really large batch size, e.g., 1024, the inner product between features produces a huge matrix, which costs a lot of device memory (possibly running out of it...).

    How can you deal with this situation?

    opened by shanice-l 2
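To put rough numbers on the question above: the pairwise inner-product matrix has one entry per pair of samples, so its size grows with the square of the batch size. A small back-of-the-envelope sketch, assuming 4-byte (float32) entries; the function name is hypothetical, and the estimate covers a single B x B matrix only, not masks or any copies kept for backpropagation:

```python
def pairwise_matrix_bytes(batch_size, bytes_per_element=4):
    """Bytes needed for one batch_size x batch_size matrix (float32 by default)."""
    return batch_size * batch_size * bytes_per_element

for b in (128, 1024, 8192, 32768):
    mib = pairwise_matrix_bytes(b) / 2**20
    print(f"batch={b:>6}: {mib:,.2f} MiB per B x B float32 matrix")
```

Under these assumptions a batch of 1024 needs about 4 MiB per matrix, so the quadratic growth only becomes a practical concern at much larger batch sizes.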
  • Eq (2) and (3)

    Hi, thanks for your great work! I am curious whether s and d in Eq. (2) and Eq. (3) need to be averaged, because your pseudocode uses a mean operator while the equations use a sum.

    opened by Lilyo 1
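The difference between the two forms raised above can be made concrete with a toy batch. The sketch below is plain Python rather than the repository's code: it computes s (similarity over same-class pairs) and d (similarity over different-class pairs) both as sums and as means. The feature values, the helper names and the exclusion of self-pairs are assumptions made for illustration:

```python
import math

def unit(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def pair_sums(features, labels):
    """Sum and count cosine similarities over same-class and different-class pairs."""
    f = [unit(v) for v in features]
    s_sum = d_sum = 0.0
    s_count = d_count = 0
    for i in range(len(f)):
        for j in range(len(f)):
            if i == j:
                continue  # a sample is not paired with itself
            dot = sum(a * b for a, b in zip(f[i], f[j]))
            if labels[i] == labels[j]:
                s_sum += dot
                s_count += 1
            else:
                d_sum += dot
                d_count += 1
    return s_sum, s_count, d_sum, d_count

# Toy batch: two classes, two samples each (made-up values).
features = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 2.0]]
labels = [0, 0, 1, 1]

s_sum, s_count, d_sum, d_count = pair_sums(features, labels)
s_mean = s_sum / s_count
d_mean = d_sum / d_count

print(f"sum form : s={s_sum:.4f}, d={d_sum:.4f}")
print(f"mean form: s={s_mean:.4f}, d={d_mean:.4f}")
```

The two forms differ only by the number of pairs, but that factor changes with batch size and class balance. Averaging keeps s and d inside [-1, 1], whereas the summed s already exceeds 1 on this four-sample batch.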
  • About the model

    Such great work! However, I found that the model you use is a vanilla ResNet. The final block of ResNet uses ReLU as its activation function, which makes all output features non-negative. So d is non-negative, which means that none of the features can be orthogonal. Can you explain why this model is used? Thank you so much!

    opened by caixincx 0
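One way to examine the premise of this question is a small numerical check. It is offered as an observation, not as the authors' answer. Non-negative vectors cannot have a negative inner product, but they can still be exactly orthogonal when their non-zero entries fall on different coordinates. The vectors below are made up for illustration:

```python
import math

def relu(v):
    """Element-wise ReLU: negative entries become zero."""
    return [max(0.0, x) for x in v]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

a = relu([2.0, -1.0, 0.5, -3.0])   # active on coordinates 0 and 2
b = relu([-4.0, 1.5, -0.2, 2.0])   # active on coordinates 1 and 3
c = relu([1.0, 1.0, -1.0, -1.0])   # active on coordinates 0 and 1

print("cos(a, b) =", cosine(a, b))  # disjoint active coordinates
print("cos(a, c) =", cosine(a, c))  # shared active coordinate 0
```

Here `a` and `b` share no active coordinate, so their similarity is exactly zero, while `a` and `c` overlap and have a positive similarity. What ReLU rules out is negative similarity, not orthogonality.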
  • Take the absolute value of "d"

    Thanks for sharing such great work. Comparing the code with the paper, I found that the absolute-value operation on "d" is missing from the implementation here. Why was it removed? Thank you.

    opened by YahuanCong 0
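To illustrate what the absolute value changes, the sketch below computes the mean different-class similarity d on two made-up batches and compares a penalty on d with a penalty on |d|. It does not explain why the implementation differs from the paper. The weight `gamma`, the helper names and all feature values are hypothetical:

```python
import math

def unit(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def mean_diff_class_similarity(features, labels):
    """Mean cosine similarity over all pairs with different labels."""
    f = [unit(v) for v in features]
    total, count = 0.0, 0
    for i in range(len(f)):
        for j in range(len(f)):
            if labels[i] != labels[j]:
                total += sum(a * b for a, b in zip(f[i], f[j]))
                count += 1
    return total / count

labels = [0, 0, 1, 1]
gamma = 0.5  # hypothetical weight on the inter-class term

# Signed features: the two classes point in nearly opposite directions.
signed = [[1.0, 0.2], [0.9, -0.1], [-1.0, 0.1], [-0.8, -0.3]]
d_signed = mean_diff_class_similarity(signed, labels)

# Non-negative features, as produced after a ReLU.
nonneg = [[1.0, 0.2], [0.9, 0.1], [0.1, 1.0], [0.3, 0.8]]
d_nonneg = mean_diff_class_similarity(nonneg, labels)

for name, d in (("signed", d_signed), ("non-negative", d_nonneg)):
    print(f"{name:>12}: d={d:+.4f}  gamma*d={gamma * d:+.4f}  gamma*|d|={gamma * abs(d):+.4f}")
```

With signed features, a penalty on d keeps decreasing as classes become anti-parallel, while a penalty on |d| is smallest at exact orthogonality. With non-negative features d cannot be negative, so the two penalties coincide.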
  • About the visualization

    Wow, such great work! I really wonder how the visualization in Figure 2 was generated. I haven't seen this kind of plot before. Is there any code for this visualization? Thank you so much!

    opened by wwtt666 2
Releases(v1.0.0)
Owner
Kanchana Ranasinghe