🤖 A Python library for learning and evaluating knowledge graph embeddings

Overview

PyKEEN

PyKEEN (Python KnowlEdge EmbeddiNgs) is a Python package designed to train and evaluate knowledge graph embedding models (incorporating multi-modal information).

Installation

The latest stable version of PyKEEN can be downloaded and installed from PyPI with:

$ pip install pykeen

The latest version of PyKEEN can be installed directly from the source on GitHub with:

$ pip install git+https://github.com/pykeen/pykeen.git

More information about installation (e.g., development mode, Windows installation, Colab, Kaggle, extras) can be found in the installation documentation.

Quickstart

This example shows how to train a model on a dataset and test on another dataset.

The fastest way to get up and running is to use the pipeline function. It provides a high-level entry into the extensible functionality of this package. The following example shows how to train and evaluate the TransE model on the Nations dataset. By default, the training loop uses the stochastic local closed world assumption (sLCWA) training approach and evaluates with rank-based evaluation.

from pykeen.pipeline import pipeline

result = pipeline(
    model='TransE',
    dataset='nations',
)

The results are returned in an instance of the PipelineResult dataclass that has attributes for the trained model, the training loop, the evaluation, and more. See the tutorials on using your own dataset, understanding the evaluation, and making novel link predictions.

PyKEEN is extensible such that:

  • Each model has the same API, so anything from pykeen.models can be dropped in
  • Each training loop has the same API, so pykeen.training.LCWATrainingLoop can be dropped in
  • Triples factories can be generated by the user with pykeen.triples.TriplesFactory

The full documentation can be found at https://pykeen.readthedocs.io.

Implementation

Below are the models, datasets, training modes, evaluators, and metrics implemented in pykeen.

Datasets (27)

The following datasets are built into PyKEEN. The citation for each dataset corresponds to the paper describing the dataset, the first paper published using the dataset with knowledge graph embedding models, or the URL for the dataset if neither of the first two is available. If you want to use a custom dataset, see the Bring Your Own Dataset tutorial. If you have a suggestion for another dataset to include in PyKEEN, please let us know here.

Name Documentation Citation Entities Relations Triples
Clinical Knowledge Graph pykeen.datasets.CKG Santos et al., 2020 7617419 11 26691525
CN3l Family pykeen.datasets.CN3l Chen et al., 2017 3206 42 21777
CoDEx (large) pykeen.datasets.CoDExLarge Safavi et al., 2020 77951 69 612437
CoDEx (medium) pykeen.datasets.CoDExMedium Safavi et al., 2020 17050 51 206205
CoDEx (small) pykeen.datasets.CoDExSmall Safavi et al., 2020 2034 42 36543
ConceptNet pykeen.datasets.ConceptNet Speer et al., 2017 28370083 50 34074917
Countries pykeen.datasets.Countries Bouchard et al., 2015 271 2 1158
Commonsense Knowledge Graph pykeen.datasets.CSKG Ilievski et al., 2020 2087833 58 4598728
DB100K pykeen.datasets.DB100K Ding et al., 2018 99604 470 697479
DBpedia50 pykeen.datasets.DBpedia50 Shi et al., 2017 24624 351 34421
Drug Repositioning Knowledge Graph pykeen.datasets.DRKG gnn4dr/DRKG 97238 107 5874257
FB15k pykeen.datasets.FB15k Bordes et al., 2013 14951 1345 592213
FB15k-237 pykeen.datasets.FB15k237 Toutanova et al., 2015 14505 237 310079
Hetionet pykeen.datasets.Hetionet Himmelstein et al., 2017 45158 24 2250197
Kinships pykeen.datasets.Kinships Kemp et al., 2006 104 25 10686
Nations pykeen.datasets.Nations ZhenfengLei/KGDatasets 14 55 1992
OGB BioKG pykeen.datasets.OGBBioKG Hu et al., 2020 45085 51 5088433
OGB WikiKG pykeen.datasets.OGBWikiKG Hu et al., 2020 2500604 535 17137181
OpenBioLink pykeen.datasets.OpenBioLink Breit et al., 2020 180992 28 4563407
OpenBioLink pykeen.datasets.OpenBioLinkLQ Breit et al., 2020 480876 32 27320889
Unified Medical Language System pykeen.datasets.UMLS ZhenfengLei/KGDatasets 135 46 6529
WD50K (triples) pykeen.datasets.WD50KT Galkin et al., 2020 40107 473 232344
WK3l-120k Family pykeen.datasets.WK3l120k Chen et al., 2017 119748 3109 1375406
WK3l-15k Family pykeen.datasets.WK3l15k Chen et al., 2017 15126 1841 209041
WordNet-18 pykeen.datasets.WN18 Bordes et al., 2014 40943 18 151442
WordNet-18 (RR) pykeen.datasets.WN18RR Toutanova et al., 2015 40559 11 92583
YAGO3-10 pykeen.datasets.YAGO310 Mahdisoltani et al., 2015 123143 37 1089000

Models (30)

Name Reference Citation
CompGCN pykeen.models.CompGCN Vashishth et al., 2020
ComplEx pykeen.models.ComplEx Trouillon et al., 2016
ComplEx Literal pykeen.models.ComplExLiteral Kristiadi et al., 2018
ConvE pykeen.models.ConvE Dettmers et al., 2018
ConvKB pykeen.models.ConvKB Nguyen et al., 2018
CrossE pykeen.models.CrossE Zhang et al., 2019
DistMA pykeen.models.DistMA Shi et al., 2019
DistMult pykeen.models.DistMult Yang et al., 2014
DistMult Literal pykeen.models.DistMultLiteral Kristiadi et al., 2018
ER-MLP pykeen.models.ERMLP Dong et al., 2014
ER-MLP (E) pykeen.models.ERMLPE Sharifzadeh et al., 2019
HolE pykeen.models.HolE Nickel et al., 2016
KG2E pykeen.models.KG2E He et al., 2015
MuRE pykeen.models.MuRE Balažević et al., 2019
NTN pykeen.models.NTN Socher et al., 2013
PairRE pykeen.models.PairRE Chao et al., 2020
ProjE pykeen.models.ProjE Shi et al., 2017
QuatE pykeen.models.QuatE Zhang et al., 2019
RESCAL pykeen.models.RESCAL Nickel et al., 2011
R-GCN pykeen.models.RGCN Schlichtkrull et al., 2018
RotatE pykeen.models.RotatE Sun et al., 2019
SimplE pykeen.models.SimplE Kazemi et al., 2018
Structured Embedding pykeen.models.StructuredEmbedding Bordes et al., 2011
TorusE pykeen.models.TorusE Ebisu et al., 2018
TransD pykeen.models.TransD Ji et al., 2015
TransE pykeen.models.TransE Bordes et al., 2013
TransH pykeen.models.TransH Wang et al., 2014
TransR pykeen.models.TransR Lin et al., 2015
TuckER pykeen.models.TuckER Balažević et al., 2019
Unstructured Model pykeen.models.UnstructuredModel Bordes et al., 2014

Losses (7)

Name Reference Description
Binary cross entropy (after sigmoid) pykeen.losses.BCEAfterSigmoidLoss A module for the numerically unstable version of explicit Sigmoid + BCE loss.
Binary cross entropy (with logits) pykeen.losses.BCEWithLogitsLoss A module for the binary cross entropy loss.
Cross entropy pykeen.losses.CrossEntropyLoss A module for the cross entropy loss that evaluates the cross entropy after softmax output.
Margin ranking pykeen.losses.MarginRankingLoss A module for the margin ranking loss.
Mean square error pykeen.losses.MSELoss A module for the mean square error loss.
Self-adversarial negative sampling pykeen.losses.NSSALoss An implementation of the self-adversarial negative sampling loss function proposed by Sun et al., 2019.
Softplus pykeen.losses.SoftplusLoss A module for the softplus loss.
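
As a rough illustration of what two of these losses compute, the sketch below evaluates a pairwise margin ranking loss and a self-adversarial negative sampling loss for one positive score and its negatives in plain Python. It assumes that higher scores mean more plausible triples and that the negative weights are a softmax over temperature-scaled negative scores. The function names are hypothetical, and this is not the PyKEEN implementation.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def softmax(values):
    """Softmax over a list of floats, shifted by the maximum for stability."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def margin_ranking_loss(pos_score, neg_score, margin=1.0):
    """Zero once the positive outscores the negative by at least the margin."""
    return max(0.0, margin + neg_score - pos_score)


def nssa_loss(pos_score, neg_scores, margin=1.0, temperature=1.0):
    """Self-adversarial loss: higher-scoring negatives receive larger weights."""
    weights = softmax([temperature * s for s in neg_scores])
    neg_term = sum(w * log_sigmoid(-s - margin) for w, s in zip(weights, neg_scores))
    pos_term = log_sigmoid(margin + pos_score)
    return -pos_term - neg_term


well_separated = nssa_loss(5.0, [-5.0, -5.0])
badly_separated = nssa_loss(-5.0, [5.0, 5.0])
```

With a temperature of 0, the weights become uniform and the negative term reduces to a plain average over the negatives.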

Regularizers (5)

Name Reference Description
combined pykeen.regularizers.CombinedRegularizer A convex combination of regularizers.
lp pykeen.regularizers.LpRegularizer A simple L_p norm based regularizer.
no pykeen.regularizers.NoRegularizer A regularizer which does not perform any regularization.
powersum pykeen.regularizers.PowerSumRegularizer A simple x^p based regularizer.
transh pykeen.regularizers.TransHRegularizer A regularizer for the soft constraints in TransH.

Optimizers (6)

Name Reference Description
adadelta torch.optim.Adadelta Implements Adadelta algorithm.
adagrad torch.optim.Adagrad Implements Adagrad algorithm.
adam torch.optim.Adam Implements Adam algorithm.
adamax torch.optim.Adamax Implements Adamax algorithm (a variant of Adam based on infinity norm).
adamw torch.optim.AdamW Implements AdamW algorithm.
sgd torch.optim.SGD Implements stochastic gradient descent (optionally with momentum).

Training Loops (2)

Name Reference Description
lcwa pykeen.training.LCWATrainingLoop A training loop that uses the local closed world assumption training approach.
slcwa pykeen.training.SLCWATrainingLoop A training loop that uses the stochastic local closed world assumption training approach.
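
To make the local closed world assumption concrete, the sketch below builds LCWA-style training targets in plain Python: every (head, relation) pair seen in training gets a dense label vector over all entities, and every tail not observed for that pair is treated as a negative. The stochastic variant (sLCWA) would instead draw only a sample of such negatives per positive triple. The helper name is hypothetical, and the sketch covers tail-side targets only.

```python
from collections import defaultdict


def lcwa_targets(triples, num_entities):
    """Map each (head, relation) pair to a 0/1 label vector over all tails."""
    observed_tails = defaultdict(set)
    for head, relation, tail in triples:
        observed_tails[head, relation].add(tail)
    return {
        pair: [1.0 if entity in tails else 0.0 for entity in range(num_entities)]
        for pair, tails in observed_tails.items()
    }


# Triples are (head_id, relation_id, tail_id) over 3 entities and 2 relations.
triples = [(0, 0, 1), (0, 0, 2), (1, 1, 0)]
targets = lcwa_targets(triples, num_entities=3)
```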

Negative Samplers (3)

Name Reference Description
basic pykeen.sampling.BasicNegativeSampler A basic negative sampler.
bernoulli pykeen.sampling.BernoulliNegativeSampler An implementation of the Bernoulli negative sampling approach proposed by Wang et al., 2014.
pseudotyped pykeen.sampling.PseudoTypedNegativeSampler A sampler that accounts for which entities co-occur with a relation.
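
A basic negative sampler can be sketched as follows: for each positive triple, either the head or the tail is replaced by a different entity drawn uniformly at random. The function names are hypothetical, the 50/50 choice between head and tail is an assumption, and the sketch does not check whether a corrupted triple happens to be a known positive.

```python
import random


def corrupt(triple, num_entities, rng):
    """Replace the head or the tail with a different, uniformly chosen entity."""
    head, relation, tail = triple
    corrupt_head = rng.random() < 0.5
    original = head if corrupt_head else tail
    # Draw from num_entities - 1 options and skip over the original entity.
    replacement = rng.randrange(num_entities - 1)
    if replacement >= original:
        replacement += 1
    if corrupt_head:
        return (replacement, relation, tail)
    return (head, relation, replacement)


def sample_negatives(triple, num_entities, num_negs_per_pos, rng):
    """Draw several corrupted versions of one positive triple."""
    return [corrupt(triple, num_entities, rng) for _ in range(num_negs_per_pos)]


rng = random.Random(42)
positive = (3, 0, 7)
negatives = sample_negatives(positive, num_entities=10, num_negs_per_pos=5, rng=rng)
```

A Bernoulli-style sampler would replace the fixed 0.5 with a relation-specific probability of corrupting the head.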

Stoppers (2)

Name Reference Description
early pykeen.stoppers.EarlyStopper A harness for early stopping.
nop pykeen.stoppers.NopStopper A stopper that does nothing.
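
The idea behind early stopping can be sketched in a few lines: training stops once the validation metric has failed to improve by a relative margin for a given number of consecutive evaluations. The class below is a hypothetical, simplified illustration and not the PyKEEN EarlyStopper; the parameter names patience and relative_delta are borrowed for readability only.

```python
class EarlyStopperSketch:
    """Stop after `patience` evaluations without a sufficient improvement."""

    def __init__(self, patience, relative_delta, larger_is_better=True):
        self.patience = patience
        self.relative_delta = relative_delta
        self.larger_is_better = larger_is_better
        self.best = None
        self.stale_evaluations = 0

    def _improved(self, value):
        if self.best is None:
            return True
        # Require an improvement of at least relative_delta times the best value.
        threshold = abs(self.best) * self.relative_delta
        if self.larger_is_better:
            return value > self.best + threshold
        return value < self.best - threshold

    def should_stop(self, value):
        if self._improved(value):
            self.best = value
            self.stale_evaluations = 0
        else:
            self.stale_evaluations += 1
        return self.stale_evaluations >= self.patience


stopper = EarlyStopperSketch(patience=2, relative_delta=0.05)
history = [0.10, 0.20, 0.205, 0.206, 0.30]  # made-up validation metric values
stopped_at = None
for step, value in enumerate(history):
    if stopper.should_stop(value):
        stopped_at = step
        break
```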

Evaluators (2)

Name Reference Description
rankbased pykeen.evaluation.RankBasedEvaluator A rank-based evaluator for KGE models.
sklearn pykeen.evaluation.SklearnEvaluator An evaluator that uses a Scikit-learn metric.

Metrics (16)

Name Description
AUC-ROC The area under the ROC curve, on [0, 1]. Higher is better.
Adjusted Arithmetic Mean Rank (AAMR) The mean over all chance-adjusted ranks, on (0, 2). Lower is better.
Adjusted Arithmetic Mean Rank Index (AAMRI) The re-indexed adjusted mean rank (AAMR), on [-1, 1]. Higher is better.
Average Precision The area under the precision-recall curve, on [0, 1]. Higher is better.
Geometric Mean Rank (GMR) The geometric mean over all ranks, on [1, inf). Lower is better.
Harmonic Mean Rank (HMR) The harmonic mean over all ranks, on [1, inf). Lower is better.
Hits @ K The relative frequency of ranks not larger than a given k, on [0, 1]. Higher is better.
Inverse Arithmetic Mean Rank (IAMR) The inverse of the arithmetic mean over all ranks, on (0, 1]. Higher is better.
Inverse Geometric Mean Rank (IGMR) The inverse of the geometric mean over all ranks, on (0, 1]. Higher is better.
Inverse Median Rank The inverse of the median over all ranks, on (0, 1]. Higher is better.
Mean Rank (MR) The arithmetic mean over all ranks, on [1, inf). Lower is better.
Mean Reciprocal Rank (MRR) The inverse of the harmonic mean over all ranks, on (0, 1]. Higher is better.
Median Rank The median over all ranks, on [1, inf). Lower is better.
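
The rank-based metrics above can all be derived from the list of ranks assigned to the true entities. The following plain-Python sketch computes them for a small list of made-up ranks. For the adjusted variants, it assumes that the expected rank under random scoring is (c + 1) / 2 for c candidates. The function names are hypothetical and do not mirror the PyKEEN API.

```python
import math
import statistics


def rank_metrics(ranks, k=10):
    """Compute common rank-based metrics from 1-based ranks."""
    n = len(ranks)
    reciprocal_sum = sum(1.0 / r for r in ranks)
    return {
        "mean_rank": sum(ranks) / n,
        "median_rank": statistics.median(ranks),
        "geometric_mean_rank": math.exp(sum(math.log(r) for r in ranks) / n),
        "harmonic_mean_rank": n / reciprocal_sum,
        "mean_reciprocal_rank": reciprocal_sum / n,
        "hits_at_k": sum(r <= k for r in ranks) / n,
    }


def adjusted_mean_rank(ranks, num_candidates):
    """Mean rank divided by its expectation under random scoring."""
    expected = sum((c + 1) / 2 for c in num_candidates) / len(num_candidates)
    return (sum(ranks) / len(ranks)) / expected


def adjusted_mean_rank_index(ranks, num_candidates):
    """Re-indexed adjusted mean rank: 1 is optimal, 0 matches random scoring."""
    expected = sum((c + 1) / 2 for c in num_candidates) / len(num_candidates)
    mean_rank = sum(ranks) / len(ranks)
    return 1.0 - (mean_rank - 1.0) / (expected - 1.0)


ranks = [1, 2, 4]
metrics = rank_metrics(ranks, k=3)
amr = adjusted_mean_rank(ranks, num_candidates=[10, 10, 10])
amri = adjusted_mean_rank_index(ranks, num_candidates=[10, 10, 10])
```

Note that the mean reciprocal rank is exactly the inverse of the harmonic mean rank, as stated in the table.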

Trackers (7)

Name Reference Description
console pykeen.trackers.ConsoleResultTracker A class that directly prints to console.
csv pykeen.trackers.CSVResultTracker Tracking results to a CSV file.
json pykeen.trackers.JSONResultTracker Tracking results to a JSON lines file.
mlflow pykeen.trackers.MLFlowResultTracker A tracker for MLflow.
neptune pykeen.trackers.NeptuneResultTracker A tracker for Neptune.ai.
tensorboard pykeen.trackers.TensorBoardResultTracker A tracker for TensorBoard.
wandb pykeen.trackers.WANDBResultTracker A tracker for Weights and Biases.

Hyper-parameter Optimization

Samplers (3)

Name Reference Description
grid optuna.samplers.GridSampler Sampler using grid search.
random optuna.samplers.RandomSampler Sampler using random sampling.
tpe optuna.samplers.TPESampler Sampler using TPE (Tree-structured Parzen Estimator) algorithm.

Any sampler class extending the optuna.samplers.BaseSampler, such as their sampler implementing the CMA-ES algorithm, can also be used.

Experimentation

Reproduction

PyKEEN includes a set of curated experimental settings for reproducing past landmark experiments. They can be accessed and run like:

$ pykeen experiments reproduce tucker balazevic2019 fb15k

The three arguments are the model name, the reference, and the dataset. The output directory can be optionally set with -d.

Ablation

PyKEEN includes the ability to specify ablation studies using the hyper-parameter optimization module. They can be run like:

$ pykeen experiments ablation ~/path/to/config.json

Large-scale Reproducibility and Benchmarking Study

We used PyKEEN to perform a large-scale reproducibility and benchmarking study, which is described in our article:

@article{ali2020benchmarking,
  title={Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework},
  author={Ali, Mehdi and Berrendorf, Max and Hoyt, Charles Tapley and Vermue, Laurent and Galkin, Mikhail and Sharifzadeh, Sahand and Fischer, Asja and Tresp, Volker and Lehmann, Jens},
  journal={arXiv preprint arXiv:2006.13365},
  year={2020}
}

We have made all code, experimental configurations, results, and analyses that led to our interpretations available at https://github.com/pykeen/benchmarking.

Contributing

Contributions, whether filing an issue, making a pull request, or forking, are appreciated. See CONTRIBUTING.md for more information on getting involved.

Acknowledgements

Supporters

This project has been supported by several organizations (in alphabetical order):

Funding

The development of PyKEEN has been funded by the following grants:

Funding Body Program Grant
DARPA Automating Scientific Knowledge Extraction (ASKE) HR00111990009
German Federal Ministry of Education and Research (BMBF) Maschinelles Lernen mit Wissensgraphen (MLWin) 01IS18050D
German Federal Ministry of Education and Research (BMBF) Munich Center for Machine Learning (MCML) 01IS18036A
Innovation Fund Denmark (Innovationsfonden) Danish Center for Big Data Analytics driven Innovation (DABAI) Grand Solutions

Logo

The PyKEEN logo was designed by Carina Steinborn.

Citation

If you have found PyKEEN useful in your work, please consider citing our article:

@article{ali2021pykeen,
    author = {Ali, Mehdi and Berrendorf, Max and Hoyt, Charles Tapley and Vermue, Laurent and Sharifzadeh, Sahand and Tresp, Volker and Lehmann, Jens},
    journal = {Journal of Machine Learning Research},
    number = {82},
    pages = {1--6},
    title = {{PyKEEN 1.0: A Python Library for Training and Evaluating Knowledge Graph Embeddings}},
    url = {http://jmlr.org/papers/v22/20-825.html},
    volume = {22},
    year = {2021}
}
Comments
  • 💃 📦 Add BoxE

    Closes #613

    Following on the issue (https://github.com/pykeen/pykeen/issues/613), I have now written a preliminary version of binary BoxE (which I called BoxE-KG), and am sharing it as is as you suggested. It is based on an interaction (BoxEKGInteraction), and the file with my additions is under src/pykeen/models/unimodal: I instantiated boxes analogously to my official code, and set initializations consistently as well. I have also implemented the BoxE distance function and other helpers.

    ~I have tried to train the model, and this runs. However, I have had some erratic behavior with NSSALoss: When I run this in the forward function (to double-check, this is currently commented-out), I get sensible values. But when I pass this loss to the pipeline, the loss values are minuscule, starting at around 0.001! So I would appreciate some help understanding this. Otherwise, we can proceed from there to verify the model.~ <- This is all solved!!

    Dependencies:

    • [x] #623
    • [x] https://github.com/pykeen/pykeen/pull/624
    • [x] https://github.com/pykeen/pykeen/pull/626

    Test Code

    import numpy as np
    import torch
    
    from pykeen.losses import NSSALoss
    from pykeen.datasets import WN18RR
    from pykeen.models.unimodal.boxe_kg import BoxE
    from pykeen.pipeline import pipeline
    
    from torch.nn import functional
    
    # TODO: Align optimizer settings: Constant LR
    
    class NSSALossLogging(NSSALoss):
        def forward(
            self,
            pos_scores: torch.FloatTensor,
            neg_scores: torch.FloatTensor,
            neg_weights: torch.FloatTensor,
        ) -> torch.FloatTensor:
            # copy of NSSALoss.forward
            neg_loss = functional.logsigmoid(-neg_scores - self.margin)
            neg_loss = neg_weights * neg_loss
            neg_loss = self._reduction_method(neg_loss)
            print("-", -neg_loss.item())
    
            pos_loss = functional.logsigmoid(self.margin + pos_scores)
            pos_loss = self._reduction_method(pos_loss)
            print("+", -pos_loss.item())
    
            loss = -pos_loss - neg_loss
    
            if self._reduction_method is torch.mean:
                loss = loss / 2.0
    
            return loss
    
    
    
    def main():
        embedding_dim = 500
        unif_init_bound = 2 * np.sqrt(embedding_dim)
        init_kw = dict(a=-1 / unif_init_bound, b=1 / unif_init_bound)
        size_init_kw = dict(a=-1, b=1)
        dataset = WN18RR()
        triples_factory = dataset.training
        model = BoxE(
            triples_factory=triples_factory,
            embedding_dim=embedding_dim,
            norm_order=2,
            tanh_map=True,
            entity_initializer=torch.nn.init.uniform_,
            entity_initializer_kwargs=init_kw,
            relation_initializer=torch.nn.init.uniform_,
            relation_initializer_kwargs=init_kw,
            relation_size_initializer=torch.nn.init.uniform_,
            relation_size_initializer_kwargs=size_init_kw,
        )
    
        results = pipeline(
            random_seed=1000000,
            dataset=dataset,
            model=model,
            training_kwargs=dict(num_epochs=300, batch_size=512, checkpoint_name="tria.pt", checkpoint_frequency=100),
            loss=NSSALossLogging(margin=5, adversarial_temperature=0.0, reduction="sum"),
            lr_scheduler=torch.optim.lr_scheduler.ConstantLR,
            lr_scheduler_kwargs=dict(total_iters=0),
            training_loop="sLCWA",
            negative_sampler="basic",
            negative_sampler_kwargs=dict(num_negs_per_pos=150),
            result_tracker="json",
            result_tracker_kwargs=dict(name="test.json"),
            evaluation_kwargs=dict(batch_size=16),
            optimizer=torch.optim.Adam,
            optimizer_kwargs=dict(lr=0.001)   # Cancel out the thing
        )
    
    if __name__ == '__main__':
        main()
    
    💃 Model 💎 New Component
    opened by ralphabb 60
  • ๐Ÿฆ† ๐Ÿ Inductive LP framework

    ๐Ÿฆ† ๐Ÿ Inductive LP framework

    Closes #720.

    This PR brings the support of inductive link prediction in PyKEEN with new datasets and training loops.

    Quite a lot of things going on here and I had to take some design decisions, so I'll list important features and TODOs, and please feel free to edit the design decisions.

    1. Inductive setup means that at inference time (validation and test) we run predictions on a new graph comprised of new entities. In the seminal work of Teru et al., there is a training graph (on which we train a model) and an inductive inference graph. The inductive inference graph is then split into 3 parts: main graph, missing validation triples, and missing test triples. However, as we classified in the ISWC'21 paper, there exist other inductive scenarios where nodes might get added to the training graph as well. The main assumption: all relations must be seen in the transductive training graph, such that we can learn at least relation embeddings.

    2. A new InductiveDataset class now includes at least 4 factories:

    • transductive_training - on which we train and get the known relations
    • inductive_inference - the inference graph on which we are supposed to run a GNN model (or anything else)
    • inductive_validation - missing triples from inductive_inference to predict as validation triples
    • inductive_test - missing triples from inductive_inference to predict as test triples
    • I thought of further specializing this dataset into DisjointInductiveDataset where InductiveInference is totally disjoint from the TransductiveTraining and MixedInductiveDataset where InductiveInference might be an updated, bigger version of the TransductiveTraining graph. So far I kept the base class only
    3. Because there are 4 factories, I had to create some loading spaghetti (LazyInductiveDataset, DisjointInductivePathDataset, UnpackedRemoteDisjointInductiveDataset) in order to create the loaders for the 12 standard ILP datasets from Teru et al. The data splits exist right on GitHub, so it's the inductive version of the UnpackedRemoteDataset. Each of the 3 datasets (FB15k-237, WN18RR, NELL-995) comes in 4 versions that differ by the sizes of the training and inductive inference graphs. Each loader defaults to the v1 version. The downloading procedures create relevant subfolders in the PYKEEN_DATASETS home. I tried loading all of them and it works.

    4. Crucial part of the loading: all inductive splits (inference, validation, test) share the same relation index from the transductive training part, i.e., relation2id in the TriplesFactory creation process must come from the original training factory.
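
The shared relation index can be sketched in plain Python: the relation-to-ID mapping is built once from the transductive training triples and reused for the inductive triples, while the entity mapping is built separately for each graph. The helper names and toy triples below are hypothetical and not part of this PR.

```python
def build_index(labels):
    """Assign consecutive integer IDs to the sorted unique labels."""
    return {label: i for i, label in enumerate(sorted(set(labels)))}


def map_triples(triples, entity_to_id, relation_to_id):
    """Convert labeled triples to ID triples, rejecting unseen relations."""
    mapped = []
    for head, relation, tail in triples:
        if relation not in relation_to_id:
            raise KeyError(f"relation {relation!r} was not seen in transductive training")
        mapped.append((entity_to_id[head], relation_to_id[relation], entity_to_id[tail]))
    return mapped


training = [("a", "likes", "b"), ("b", "knows", "c")]
inference = [("x", "knows", "y"), ("y", "likes", "z")]

# The relation index comes from training only; entity indices are per graph.
relation_to_id = build_index(r for _, r, _ in training)
training_entity_to_id = build_index(e for h, _, t in training for e in (h, t))
inference_entity_to_id = build_index(e for h, _, t in inference for e in (h, t))

mapped_inference = map_triples(inference, inference_entity_to_id, relation_to_id)
```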

    Next steps are:

    1. Adding the Inductive Training Loop - where training instances are obtained from the transductive training factory
    2. Adding the ILP Evaluator - where evaluation instances are obtained from the inductive factories

    TODOs:

    • [x] InductiveDataset class
    • [x] Loading Helpers
    • [x] 12 ILP datasets from Teru et al
    • [x] Inductive Training Loop + InductiveSLCWA + Inductive LCWA
    • [x] Inductive Evaluator
    • [x] Inductive NodePiece
    • [x] Inductive LP MVP
    • [x] A restricted inductive evaluator to replicate the Teru et al setting to evaluate the model only on 50 randomly selected entities from the inference graph
    • [x] More generic model interfaces
    • [x] Add easy integration of NodePiece (featurizer) + GNN (graph encoder) + any interaction function (as link prediction decoder). Experimentally, it works even better than plain NodePiece + interaction

    Spin-off PRs

    • https://github.com/pykeen/pykeen/pull/729
    • https://github.com/pykeen/pykeen/pull/733
    • https://github.com/pykeen/pykeen/pull/734
    • https://github.com/pykeen/pykeen/pull/736
    • https://github.com/pykeen/pykeen/pull/743
    • https://github.com/pykeen/pykeen/pull/769
    opened by migalkin 35
  • 💃 🥤 Extract interaction function from models

    This is a replacement for #88 , where the merge target is master.

    @mali-git As discussed in today's call, I tried to draft an API for the interaction function. It is built around the most generic form of interaction function, which has one batch dimension, and then allows broadcasting over multiple entities/relations to meet the use cases for e.g. scoring all tail entities at once, but also supports, e.g. full CWA scores.

    This can likely also help for the fast LCWA @lvermue once envisioned 😉

    I did not define the methods to be static to allow for parametric interaction functions such as e.g. ER-MLP having some weights.

    Overview

    • One shared implementation for score_hrt / score_h / score_r / score_t in the base class, done in InteractionFunction.
    • One shared implementation of _score for all models sharing the same set of embeddings (e.g. TransE/DistMult/ERMLP -> one vector for each entity/relation, TransH -> additional vector for each entity, etc.)
    • A stateless functional form of the interaction function where all necessary states are passed from the outside. This is done in pykeen.nn.functional.
    • A stateful implementation of the interaction function encapsulating all shared parameters (e.g. weight matrices for ERMLP / ConvE, etc.), but delegating the actual interaction to the stateless version. This is done in pykeen.nn.modules.
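
The split between a stateless function and a stateful wrapper can be sketched in plain Python. The ER-MLP-like scorer below uses a single ReLU hidden layer, which is an assumption made for brevity. All names are hypothetical and the sketch does not reproduce the API drafted in this PR.

```python
def ermlp_interaction(h, r, t, hidden_weight, hidden_bias, output_weight):
    """Stateless form: every parameter is passed in from the outside."""
    x = [*h, *r, *t]  # concatenate head, relation, and tail vectors
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)  # ReLU activation
        for row, b in zip(hidden_weight, hidden_bias)
    ]
    return sum(w * hi for w, hi in zip(output_weight, hidden))


class ERMLPInteractionSketch:
    """Stateful form: owns the parameters, delegates to the stateless function."""

    def __init__(self, hidden_weight, hidden_bias, output_weight):
        self.hidden_weight = hidden_weight
        self.hidden_bias = hidden_bias
        self.output_weight = output_weight

    def score_hrt(self, h, r, t):
        return ermlp_interaction(
            h, r, t, self.hidden_weight, self.hidden_bias, self.output_weight
        )

    def score_t(self, h, r, all_t):
        # Score one (head, relation) pair against every candidate tail.
        return [self.score_hrt(h, r, t) for t in all_t]


interaction = ERMLPInteractionSketch(
    hidden_weight=[[1.0, 1.0, 1.0]], hidden_bias=[0.0], output_weight=[2.0]
)
single_score = interaction.score_hrt([1.0], [2.0], [3.0])
tail_scores = interaction.score_t([1.0], [2.0], [[3.0], [-10.0]])
```

Keeping the methods on an instance rather than making them static is what allows parametric interactions like this one to carry their own weights.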

    Tasks:

    • [x] What to do with interaction models where we have more than one vector for an entity/relation, such as e.g. TransH?
    • [x] Re-introduce slicing (In best case, on the generic level)
    • [x] Re-introduce regularization on generic level
    • [x] Fix R-GCN (or keep it broken for #110 ?)
    • [x] Add regularizer/constrainer directly to modules, recursively search for modules having regularize and accumulate value.
    • [x] Update to reshape in the generic Interaction
    • ~[ ] Update pipeline model composition~ bumped to https://github.com/pykeen/pykeen/pull/163

    Dependencies:

    • [x] #137
    enhancement 
    opened by mberr 23
  • How to upgrade PyKEEN<1.8.0 code that uses `EmbeddingSpecification`?

    Describe the bug

    It seems that EmbeddingSpecification is no longer under pykeen.nn.representation.

    How to reproduce

    from pykeen.nn.representation import EmbeddingSpecification

    Environment

    PyKEEN | 1.8.0

    Additional information

    No response

    question 
    opened by thtang 19
  • 🚧🔦 Update R-GCN configuration

    This PR updates the RGCN implementation and experiment configuration.

    In particular it

    • [x] fixes some errors in the old experiment configurations
    • [x] converts the JSON configurations to YAML (see also #612 ), and adds extensive comments to the fb15k version
    • [x] add gradient clipping (see also #607) , cf. here
    • [x] learns separate decompositions for forward and backward edges (aka "normal" and inverse relations), and does not include the self-loop in any of them, but rather learns one additional independent weight for it
    • [x] removes batch normalization (which is not part of the original model)
    • [x] adds FB15k237 configuration
    • [x] ~makes sure that the graph sampler is used for batch sampling rather than sampling individual triples~ solved in #614

    While the changes improve the results obtained in the reproduction setting (at least for fb15k), they cannot achieve the reported performance.

    Dependencies

    • [x] #607
    • [x] #612
    • [x] #614

    Related:

    • #603
    • https://github.com/MichSchli/RelationPrediction/issues/6
    • https://github.com/MichSchli/RelationPrediction/issues/10
    โ˜ ๏ธ R-GCN โ˜ ๏ธ 
    opened by mberr 18
  • Support for text on Literal models

    I am currently working on a project on knowledge graph embeddings and I wanted to test the efficiency of models which use literals to enrich the embeddings (i.e. LiteralE models). I have seen that your library already provides an implementation of the models DistMult and ComplEx which use information from numerical literals, and I was wondering if you have thought about also providing support for textual literals, such as in https://github.com/SmartDataAnalytics/LiteralE. The issue is that the repository mentioned above is no longer maintained and lacks utilities such as the hyper-parameter optimization that you already offer in your suite.

    enhancement 
    opened by sntcristian 18
  • Improve vectorization of novelty computation

    This PR improves the vectorization of novelty computation for predict_heads / predict_tails.

    Fixes #49

    Still to do:

    • [x] Provide fast implementation for scoring/sorting all possible triples (@mberr)
    • [x] Provide in documentation tutorial about making predictions (@cthoyt)
    opened by mberr 18
  • ValueError: need at least one array to concatenate in rank_based_evaluator.py

    Describe the bug

    When running a pipeline to train a model, it calls into the rank_based_evaluator.py script. There, the line c_ranks = np.concatenate([ranks_flat[side, rank_type] for side in sides]) raises the following error:

     File "<__array_function__ internals>", line 5, in concatenate
    ValueError: need at least one array to concatenate
    

    How to reproduce

    I haven't developed a script to make the error reproducible, but it happens when calling the pipeline function:

    results = pipeline(
        training=training,
        testing=testing,
        validation=validation,
        model=model,
        model_kwargs=dict(
            embedding_dim=embedding
        ),
        loss=loss,
        training_loop='sLCWA',
        negative_sampler='basic',
        result_tracker='tensorboard',
        result_tracker_kwargs=dict(
            experiment_path=logdir,
        ),
        training_kwargs=dict(
            num_epochs=epochs,
            batch_size=batch_size,
            sampler=sampler),
        stopper='early',
        stopper_kwargs=dict(
            frequency=5,
            patience=10,
            relative_delta=0.05,
            metric='adjusted_mean_rank_index',
        ),
        random_seed=42,
        device=device
    )
    

    Environment

    Unable to handle parameter in AutoSF: coefficients

    | Key             | Value                       |
    |-----------------|-----------------------------|
    | OS              | posix                       |
    | Platform        | Linux                       |
    | Release         | 3.10.0-1160.36.2.el7.x86_64 |
    | Time            | Mon Mar 7 17:27:26 2022     |
    | Python          | 3.8.12                      |
    | PyKEEN          | 1.7.1-dev                   |
    | PyKEEN Hash     | UNHASHED                    |
    | PyKEEN Branch   |                             |
    | PyTorch         | 1.10.2                      |
    | CUDA Available? | false                       |
    | CUDA Version    | 10.2                        |
    | cuDNN Version   | 7605                        |

    Additional information

    I suspect that it's related to a recent update, because I have another environment in which the code runs perfectly. The numpy version is the same in both environments. My current environment is the following:

     Name                    Version                   Build  Channel
    _libgcc_mutex             0.1                        main    conda-forge
    _openmp_mutex             4.5                       1_gnu  
    absl-py                   0.15.0             pyhd3eb1b0_0  
    aiohttp                   3.8.1            py38h7f8727e_0  
    aiosignal                 1.2.0              pyhd3eb1b0_0  
    alembic                   1.7.6                    pypi_0    pypi
    async-timeout             4.0.1              pyhd3eb1b0_0  
    attrs                     21.4.0             pyhd3eb1b0_0  
    autopage                  0.5.0                    pypi_0    pypi
    blas                      1.0                         mkl    conda-forge
    blinker                   1.4              py38h06a4308_0  
    bottleneck                1.3.2            py38heb32a55_1  
    brotlipy                  0.7.0           py38h497a2fe_1001    conda-forge
    bzip2                     1.0.8                h7b6447c_0  
    c-ares                    1.18.1               h7f8727e_0  
    ca-certificates           2021.10.8            ha878542_0    conda-forge
    cachetools                4.2.2              pyhd3eb1b0_0  
    certifi                   2021.10.8        py38h578d9bd_1    conda-forge
    cffi                      1.15.0           py38hd667e15_1  
    charset-normalizer        2.0.12             pyhd8ed1ab_0    conda-forge
    class-resolver            0.3.4                    pypi_0    pypi
    click                     8.0.4            py38h06a4308_0  
    click-default-group       1.2.2                    pypi_0    pypi
    cliff                     3.10.1                   pypi_0    pypi
    cmaes                     0.8.2                    pypi_0    pypi
    cmd2                      2.4.0                    pypi_0    pypi
    colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
    colorlog                  6.6.0                    pypi_0    pypi
    cryptography              35.0.0           py38ha5dfef3_0    conda-forge
    cudatoolkit               10.2.89              hfd86e86_1  
    dataclasses               0.8                pyh6d0b6a4_7  
    dataclasses-json          0.5.6                    pypi_0    pypi
    decorator                 4.4.2                      py_0    conda-forge
    docdata                   0.0.3                    pypi_0    pypi
    docrep                    0.3.2                    pypi_0    pypi
    ffmpeg                    4.3                  hf484d3e_0    pytorch
    freetype                  2.11.0               h70c0345_0  
    frozenlist                1.2.0            py38h7f8727e_0  
    giflib                    5.2.1                h7b6447c_0  
    gmp                       6.2.1                h2531618_2  
    gnutls                    3.6.15               he1e5248_0  
    google-auth               1.33.0             pyhd3eb1b0_0  
    google-auth-oauthlib      0.4.1                      py_2  
    googledrivedownloader     0.4                pyhd3deb0d_1    conda-forge
    greenlet                  1.1.2                    pypi_0    pypi
    grpcio                    1.42.0           py38hce63b2e_0  
    html5lib                  1.1                pyh9f0ad1d_0    conda-forge
    idna                      3.3                pyhd8ed1ab_0    conda-forge
    importlib-metadata        4.11.2                   pypi_0    pypi
    importlib-resources       5.4.0                    pypi_0    pypi
    inflect                   5.4.0                    pypi_0    pypi
    intel-openmp              2021.4.0          h06a4308_3561  
    isodate                   0.6.1              pyhd8ed1ab_0    conda-forge
    jinja2                    3.0.3              pyhd8ed1ab_0    conda-forge
    joblib                    1.1.0              pyhd8ed1ab_0    conda-forge
    jpeg                      9d                   h7f8727e_0  
    lame                      3.100                h7b6447c_0  
    lcms2                     2.12                 h3be6417_0  
    ld_impl_linux-64          2.35.1               h7274673_9  
    libblas                   3.9.0            12_linux64_mkl    conda-forge
    libcblas                  3.9.0            12_linux64_mkl    conda-forge
    libffi                    3.3                  he6710b0_2  
    libgcc-ng                 9.3.0               h5101ec6_17  
    libgfortran-ng            7.5.0               h14aa051_20    conda-forge
    libgfortran4              7.5.0               h14aa051_20    conda-forge
    libgomp                   9.3.0               h5101ec6_17  
    libiconv                  1.15                 h63c8f33_5  
    libidn2                   2.3.2                h7f8727e_0  
    liblapack                 3.9.0            12_linux64_mkl    conda-forge
    libpng                    1.6.37               hbc83047_0  
    libprotobuf               3.19.1               h4ff587b_0  
    libstdcxx-ng              9.3.0               hd4cf53a_17  
    libtasn1                  4.16.0               h27cfd23_0  
    libtiff                   4.2.0                h85742a9_0  
    libunistring              0.9.10               h27cfd23_0  
    libuv                     1.40.0               h7b6447c_0  
    libwebp                   1.2.2                h55f646e_0  
    libwebp-base              1.2.2                h7f8727e_0  
    lz4-c                     1.9.3                h295c915_1  
    mako                      1.1.6                    pypi_0    pypi
    markdown                  3.3.4            py38h06a4308_0  
    markupsafe                2.1.0                    pypi_0    pypi
    marshmallow               3.14.1                   pypi_0    pypi
    marshmallow-enum          1.5.1                    pypi_0    pypi
    mkl                       2021.4.0           h06a4308_640  
    mkl-service               2.4.0            py38h7f8727e_0  
    mkl_fft                   1.3.1            py38hd3c417c_0  
    mkl_random                1.2.2            py38h51133e4_0  
    more-click                0.0.6                    pypi_0    pypi
    more-itertools            8.12.0                   pypi_0    pypi
    multidict                 5.2.0            py38h7f8727e_2  
    mypy-extensions           0.4.3                    pypi_0    pypi
    ncurses                   6.3                  h7f8727e_2  
    nettle                    3.7.3                hbbd107a_1  
    networkx                  2.5.1              pyhd8ed1ab_0    conda-forge
    numexpr                   2.8.1            py38h6abb31d_0  
    numpy                     1.20.3                   pypi_0    pypi
    oauthlib                  3.1.0                      py_0  
    openh264                  2.1.1                h4ff587b_0  
    openssl                   1.1.1m               h7f8727e_0  
    optuna                    2.10.0                   pypi_0    pypi
    packaging                 21.3               pyhd3eb1b0_0  
    pandas                    1.3.5                    pypi_0    pypi
    pbr                       5.8.1                    pypi_0    pypi
    pillow                    9.0.1            py38h22f2fdc_0  
    pip                       21.2.4           py38h06a4308_0  
    prettytable               3.2.0                    pypi_0    pypi
    protobuf                  3.19.1           py38h295c915_0  
    pyasn1                    0.4.8              pyhd3eb1b0_0  
    pyasn1-modules            0.2.8                      py_0  
    pycparser                 2.21               pyhd8ed1ab_0    conda-forge
    pyg                       2.0.3           py38_torch_1.10.0_cu102    pyg
    pyjwt                     1.7.1                    py38_0  
    pykeen                    1.7.1.dev0               pypi_0    pypi
    pyopenssl                 22.0.0             pyhd8ed1ab_0    conda-forge
    pyparsing                 3.0.7              pyhd8ed1ab_0    conda-forge
    pyperclip                 1.8.2                    pypi_0    pypi
    pysocks                   1.7.1            py38h578d9bd_4    conda-forge
    pystow                    0.4.0                    pypi_0    pypi
    python                    3.8.12               h12debd9_0  
    python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
    python-louvain            0.15               pyhd8ed1ab_1    conda-forge
    python_abi                3.8                      2_cp38    conda-forge
    pytorch                   1.10.2          py3.8_cuda10.2_cudnn7.6.5_0    pytorch
    pytorch-cluster           1.5.9           py38_torch_1.10.0_cu102    pyg
    pytorch-mutex             1.0                        cuda    pytorch
    pytorch-scatter           2.0.9           py38_torch_1.10.0_cu102    pyg
    pytorch-sparse            0.6.12          py38_torch_1.10.0_cu102    pyg
    pytorch-spline-conv       1.2.1           py38_torch_1.10.0_cu102    pyg
    pytz                      2021.3             pyhd8ed1ab_0    conda-forge
    pyyaml                    6.0                      pypi_0    pypi
    rdflib                    6.1.1              pyhd8ed1ab_0    conda-forge
    readline                  8.1.2                h7f8727e_1  
    requests                  2.27.1             pyhd8ed1ab_0    conda-forge
    requests-oauthlib         1.3.0                      py_0  
    rexmex                    0.1.0                    pypi_0    pypi
    rsa                       4.7.2              pyhd3eb1b0_1  
    scikit-learn              1.0.2            py38h51133e4_1  
    scipy                     1.8.0                    pypi_0    pypi
    setuptools                58.0.4           py38h06a4308_0  
    six                       1.16.0             pyhd3eb1b0_1  
    sklearn                   0.0                      pypi_0    pypi
    sqlalchemy                1.4.32                   pypi_0    pypi
    sqlite                    3.37.2               hc218d9a_0  
    stevedore                 3.5.0                    pypi_0    pypi
    tabulate                  0.8.9                    pypi_0    pypi
    tensorboard               2.6.0                      py_1  
    tensorboard-data-server   0.6.0            py38hca6d32c_0  
    tensorboard-plugin-wit    1.6.0                      py_0  
    threadpoolctl             3.1.0              pyh8a188c0_0    conda-forge
    tk                        8.6.11               h1ccaba5_0  
    torchaudio                0.10.2               py38_cu102    pytorch
    torchvision               0.11.3               py38_cu102    pytorch
    tqdm                      4.63.0             pyhd8ed1ab_0    conda-forge
    typing-extensions         3.10.0.2             hd3eb1b0_0  
    typing-inspect            0.7.1                    pypi_0    pypi
    typing_extensions         3.10.0.2           pyh06a4308_0  
    urllib3                   1.26.8             pyhd8ed1ab_1    conda-forge
    wcwidth                   0.2.5                    pypi_0    pypi
    webencodings              0.5.1                      py_1    conda-forge
    werkzeug                  2.0.3              pyhd3eb1b0_0  
    wheel                     0.37.1             pyhd3eb1b0_0  
    wrapt                     1.13.3                   pypi_0    pypi
    xz                        5.2.5                h7b6447c_0  
    yacs                      0.1.8              pyhd8ed1ab_0    conda-forge
    yaml                      0.2.5                h516909a_0    conda-forge
    yarl                      1.6.3            py38h27cfd23_0  
    zipp                      3.7.0              pyhd3eb1b0_0  
    zlib                      1.2.11               h7f8727e_4  
    zstd                      1.4.9                haebb681_0 
    
    bug 
    opened by AlejandroTL 17
  • Adding Tensorboard Tracker

    This PR closes #383

    Adding tensorboard as a results tracker.

    What you need to test:

    1. pip install tensorboard
    2. Start the TensorBoard process with the default logging path: tensorboard --logdir=.data/pykeen/logs/tensorboard/

    Issues:

    • I am not happy with the current naming scheme for the experiment log dir.
    • The params are currently saved as text within the TensorBoard log. There is an add_hparams function, but I don't think it is quite what is needed here.
    • It would be great to distinguish between train and eval parameters when logging. It might also be worth considering if all the metrics actually need to be logged to tensorboard.
    • Tensorboard has great support for visualizing embeddings using the add_embedding function - would be great if the final embeddings (or a subset of them more realistically) could be added.
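
    One possible way to make the logged parameters usable as TensorBoard hyperparameters is to flatten the nested configuration into simple scalar values first, since add_hparams expects a flat dictionary. The following is only a minimal sketch and not the tracker's implementation; `flatten_params` and the example configuration are hypothetical names introduced for illustration.

```python
from typing import Any, Dict, Mapping


def flatten_params(params: Mapping[str, Any], prefix: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten a nested mapping into a single-level dict with joined keys."""
    flat: Dict[str, Any] = {}
    for key, value in params.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, Mapping):
            # Recurse into nested sections such as model or optimizer settings.
            flat.update(flatten_params(value, prefix=name, sep=sep))
        elif isinstance(value, (bool, int, float, str)):
            # Simple scalar values are kept unchanged.
            flat[name] = value
        else:
            # Anything else (lists, classes, None) is stored as its string form.
            flat[name] = str(value)
    return flat


# Hypothetical configuration, used only to demonstrate the helper.
config = {
    "model": {"name": "TransE", "embedding_dim": 50},
    "optimizer": {"name": "Adam", "lr": 0.001},
    "training": {"num_epochs": 5, "batch_sizes": [32, 64]},
}
flat_config = flatten_params(config)
```

    A dictionary of this shape could then be passed as the `hparam_dict` argument of `SummaryWriter.add_hparams`, alongside a dictionary of final metrics.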

    Tasks:

    • [x] add documentation and examples
    • [x] update setup.cfg with tensorboard deps
    ๐Ÿบ Tracker ๐Ÿ’Ž New Component 
    opened by sbonner0 17
  • Results of HPO

    What is your question

    I am using HPO to optimize the number of epochs, batch size, embedding dimension, negatives per positive, and learning rate. I obtained the best hyperparameters, i.e., those with the highest MRR. But when I build the model again with the same hyperparameters, I get a higher MRR than the one reported by HPO itself. Is this normal? Also, why do some trials have a much higher MRR in HPO but fail?

    How can I find the default value of the margin in the margin ranking loss (MarginRankingLoss) for each model, and is it the same for all models?

    Environment

    | Key             | Value                    |
    |-----------------|--------------------------|
    | OS              | posix                    |
    | Platform        | Linux                    |
    | Release         | 5.10.107+                |
    | Time            | Mon Jun 13 09:04:32 2022 |
    | Python          | 3.7.12                   |
    | PyKEEN          | 1.8.1                    |
    | PyKEEN Hash     | UNHASHED                 |
    | PyKEEN Branch   |                          |
    | PyTorch         | 1.11.0                   |
    | CUDA Available? | true                     |
    | CUDA Version    | 11.0                     |
    | cuDNN Version   | 8005                     |

    Issue Template Checks

    • [X] This is not a bug report (use a different issue template if it is)
    • [X] This is not a feature request (use a different issue template if it is)
    • [X] I've read the text explaining why including environment information is important and understand if I omit this information that my issue will be dismissed
    question 
    opened by Ahmed-fub 15
  • Kaggle notebooks are having trouble loading entrypoints

    I use the following code on Colab or Kaggle:

    !pip install pykeen -q
    from pykeen.datasets import OpenBioLink
    

    An exception occurs on the import line, with this error message:

    /usr/local/lib/python3.7/dist-packages/pykeen/datasets/__init__.py in <module>()
         75 }
         76 if not _DATASETS:
    ---> 77     raise RuntimeError('Datasets have been loaded with entrypoints since PyKEEN v1.0.5. Please reinstall.')
         78 
         79 #: A mapping of datasets' names to their classes
    
    RuntimeError: Datasets have been loaded with entrypoints since PyKEEN v1.0.5. Please reinstall.
    

    It works if I restart the kernel so that the Python environment is reloaded at least once after the package installation. However, that makes a Kaggle commit (Save version => Save & Run all (commit)) impossible, because I would have to restart manually and interactively, and the commit session is not interactive.

    Because of this, I cannot produce a saved notebook with an output file from PyKEEN. The only option left is to do everything in an interactive session, download the output file manually, and then upload it as a Kaggle dataset.

    Solution

    From https://github.com/pykeen/pykeen/issues/373#issuecomment-821060699:

    I just found the workaround. It should work on both Colab and Kaggle.

     from pkg_resources import require
     require('pykeen')
    

    https://colab.research.google.com/drive/1yFWDQ3OybultFHaNdi_gJCHpBJiB4Iem?usp=sharing
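
    To check whether the dataset entry points are visible to the running interpreter, they can be listed with the standard library. This is a diagnostic sketch only: the group name `pykeen.datasets` is an assumption about how PyKEEN registers its datasets, and `list_entry_point_names` is a hypothetical helper, not part of PyKEEN.

```python
from importlib import metadata
from typing import List


def list_entry_point_names(group: str) -> List[str]:
    """Return the sorted names of all entry points registered under a group."""
    entry_points = metadata.entry_points()
    if hasattr(entry_points, "select"):
        # Newer Python versions expose a selectable collection.
        selected = entry_points.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        selected = entry_points.get(group, [])
    return sorted(entry_point.name for entry_point in selected)


# Assumed group name; an empty list means no entry points are visible
# to this interpreter session.
dataset_names = list_entry_point_names("pykeen.datasets")
print(f"{len(dataset_names)} dataset entry points visible")
```

    If the list is empty right after `pip install` but populated after a kernel restart, the installation metadata was probably not yet visible to the running session, which would be consistent with the error above.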

    question 
    opened by jerryIsHere 15
  • Bring back lightning tests

    As https://github.com/Lightning-AI/lightning/pull/14117 has been fixed, and https://github.com/Lightning-AI/ecosystem-ci/pull/50 is regaining momentum, this PR brings back automatic PyTorch Lightning tests.

    opened by mberr 1
  • Saving Checkpoints to S3 bucket

    Problem Statement

    While training on AWS SageMaker, it is expensive to keep the checkpoints on the notebook. Being able to upload checkpoints directly to an S3 bucket would save time.

    Describe the solution you'd like

    Allow S3 bucket URLs for checkpoints:

    from pykeen.pipeline import pipeline
    
    result = pipeline(
        model='transe',
        dataset='nations',
        training_kwargs=dict(
            num_epochs=10,
            checkpoint_name='test_checkpoint.pt',
            checkpoint_directory='s3://bucket/checkpoints/',
        ),
    )
    

    Describe alternatives you've considered

    Uploading checkpoints at the end of training for backup.
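
    As a rough illustration of what accepting such URLs would involve, a checkpoint directory string could first be split into a bucket and a key prefix. This is a sketch under stated assumptions and not PyKEEN functionality: `CheckpointLocation` and `parse_checkpoint_directory` are hypothetical names, and the upload itself is not shown.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class CheckpointLocation(NamedTuple):
    """Where a checkpoint should end up (hypothetical helper type)."""

    bucket: Optional[str]  # None for local directories
    prefix: str  # key prefix inside the bucket, or the local path


def parse_checkpoint_directory(directory: str) -> CheckpointLocation:
    """Split an s3:// URL into bucket and key prefix; pass local paths through."""
    parsed = urlparse(directory)
    if parsed.scheme == "s3":
        return CheckpointLocation(bucket=parsed.netloc, prefix=parsed.path.lstrip("/"))
    return CheckpointLocation(bucket=None, prefix=directory)


remote = parse_checkpoint_directory("s3://bucket/checkpoints/")
local = parse_checkpoint_directory("/tmp/checkpoints")
```

    With such a split, a training loop could keep writing checkpoints to a local temporary directory and hand them to an S3 client only when `bucket` is not `None`.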

    Additional information

    No response

    Issue Template Checks

    • [X] This is not a bug report (use a different issue template if it is)
    • [X] This is not a question (use the discussions forum instead)
    enhancement 
    opened by mhmgad 0
  • Make tutorial for enabling different learning rates

    Problem Statement

    First of all, thanks for the great work with the library! It would be very useful to be able to specify different learning rates. Right now, when running a pipeline, an instance of the optimizer is created by passing all parameters in the model:

    https://github.com/pykeen/pykeen/blob/313055e8c846a52a35901f2746a43e5efdae1e3e/src/pykeen/pipeline/api.py#L1035-L1039

    However, in some cases we might also want to apply per-parameter options, for example:

    optim.SGD([
        {'params': model.base.parameters()},
        {'params': model.classifier.parameters(), 'lr': 1e-3}
    ], lr=1e-2, momentum=0.9)
    

    Describe the solution you'd like

    A possible solution could be an optional dictionary passed when creating the pipeline, e.g. optimizer_params. If it's not provided, then the pipeline would default to the above, otherwise the user could choose different learning rates for modules in a custom model:

    optimizer_instance = optimizer_resolver.make(
        optimizer,
        optimizer_kwargs,
        params=optimizer_params if optimizer_params else model_instance.get_grad_params(),
    )
    

    Describe alternatives you've considered

    I tried getting access to the optimizer via a TrainingCallback, and I considered modifying the learning rate for different modules in the pre_step method:

    from pykeen.training.callbacks import TrainingCallback

    class MultiLearningRateCallback(TrainingCallback):
        ...

        def pre_step(self, **kwargs):
            # Here we have access to the optimizer via self.optimizer
            ...
    

    The problem is that at this point the optimizer has already been initialized and has been assigned Parameters, which are difficult to map to the original modules.
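
    One way around this mapping problem is to build the parameter groups from parameter names before the optimizer is created. The sketch below is library-agnostic and uses plain strings as stand-ins for tensors; `build_param_groups` and the prefix convention are assumptions for illustration, not an existing PyKEEN API.

```python
from typing import Any, Dict, Iterable, List, Mapping, Tuple


def build_param_groups(
    named_parameters: Iterable[Tuple[str, Any]],
    lr_by_prefix: Mapping[str, float],
) -> List[Dict[str, Any]]:
    """Group parameters by the first matching name prefix."""
    groups: Dict[str, Dict[str, Any]] = {
        prefix: {"params": [], "lr": lr} for prefix, lr in lr_by_prefix.items()
    }
    # Parameters without a matching prefix use the optimizer-wide learning rate.
    default: Dict[str, Any] = {"params": []}
    for name, parameter in named_parameters:
        for prefix, group in groups.items():
            if name.startswith(prefix):
                group["params"].append(parameter)
                break
        else:
            default["params"].append(parameter)
    # Drop groups that received no parameters.
    return [group for group in [default, *groups.values()] if group["params"]]


# Stand-ins for (name, tensor) pairs; with PyTorch these would come from
# model.named_parameters().
fake_named_parameters = [
    ("base.weight", "w0"),
    ("base.bias", "b0"),
    ("classifier.weight", "w1"),
]
param_groups = build_param_groups(fake_named_parameters, {"classifier.": 1e-3})
```

    With PyTorch, a list of this shape could be passed as the first argument of an optimizer, as in the `optim.SGD` snippet quoted earlier in this issue.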

    Additional information

    No response

    Issue Template Checks

    • [X] This is not a bug report (use a different issue template if it is)
    • [X] This is not a question (use the discussions forum instead)
    documentation 
    opened by dfdazac 3
  • TypeError: missing required positional argument when using `CoreTriplesFactory.from_path_binary()`

    Describe the bug

    According to the documentation about saving a pipeline result to a directory:

    training_triples contains the training triples factory, including label-to-id mappings, if used. It has been saved via pykeen.triples.CoreTriplesFactory.to_path_binary(), and can re-loaded via pykeen.triples.CoreTriplesFactory.from_path_binary().

    But an error occurred when I tried to load the triples factory from a previously saved pipeline result.

    triplets_factory = CoreTriplesFactory.from_path_binary('RESCAL_FB15k237/training_triples')
    
    Traceback (most recent call last):
      File "/home/tzuwei/developing/kge-loss/main.py", line 32, in <module>
        triplets_factory = CoreTriplesFactory.from_path_binary(f'{save_path}/training_triples')
      File "/home/tzuwei/.pyenv/versions/loss/lib/python3.9/site-packages/pykeen/triples/triples_factory.py", line 816, in from_path_binary
        return cls(**cls._from_path_binary(path=path))
    TypeError: __init__() missing 2 required positional arguments: 'num_entities' and 'num_relations'
    

    It seems that the data returned by cls._from_path_binary does not contain num_entities and num_relations at line 816.

    https://github.com/pykeen/pykeen/blob/313055e8c846a52a35901f2746a43e5efdae1e3e/src/pykeen/triples/triples_factory.py#L793-L824
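
    The failure mode can be illustrated with a toy class that is constructed via `cls(**data)` from loaded data that lacks two required arguments. This is not PyKEEN's code: `ToyCoreFactory` is a hypothetical stand-in, and deriving the counts from the largest ID is an assumption that would undercount if the highest IDs never occur in the triples.

```python
class ToyCoreFactory:
    """Stand-in for a factory whose constructor needs explicit counts."""

    def __init__(self, mapped_triples, num_entities, num_relations):
        self.mapped_triples = mapped_triples
        self.num_entities = num_entities
        self.num_relations = num_relations

    @classmethod
    def from_loaded(cls, data):
        # Mirrors the failing pattern `cls(**data)` from the traceback.
        return cls(**data)


# Loaded data that only contains the (head, relation, tail) ID triples.
loaded = {"mapped_triples": [(0, 0, 1), (1, 1, 2)]}

try:
    ToyCoreFactory.from_loaded(loaded)
    error_message = None
except TypeError as error:
    error_message = str(error)

# Possible remedy in this toy setting: derive the counts from the largest IDs.
completed = dict(
    loaded,
    num_entities=1 + max(max(h, t) for h, _, t in loaded["mapped_triples"]),
    num_relations=1 + max(r for _, r, _ in loaded["mapped_triples"]),
)
factory = ToyCoreFactory.from_loaded(completed)
```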

    How to reproduce

    The following code reproduces the problem on my machine:

    from pykeen.pipeline import pipeline
    from pykeen.models import RESCAL
    from pykeen.datasets import FB15k237
    from pykeen.triples import CoreTriplesFactory
    
    result = pipeline(
        model=RESCAL,
        dataset=FB15k237,
    )
    result.save_to_directory('RESCAL_FB15k237')
    
    triplets_factory = CoreTriplesFactory.from_path_binary('RESCAL_FB15k237/training_triples')
    

    Environment

    Unable to handle parameter in CooccurrenceFilteredModel: base

    | Key             | Value                    |
    |-----------------|--------------------------|
    | OS              | posix                    |
    | Platform        | Linux                    |
    | Release         | 5.15.0-52-generic        |
    | Time            | Wed Dec 7 18:02:29 2022  |
    | Python          | 3.9.14                   |
    | PyKEEN          | 1.9.0                    |
    | PyKEEN Hash     | 1f526edb                 |
    | PyKEEN Branch   | master                   |
    | PyTorch         | 1.13.0+cu117             |
    | CUDA Available? | true                     |
    | CUDA Version    | 11.7                     |
    | cuDNN Version   | 8500                     |

    Additional information

    Thank you for providing this open source project. Your work is very helpful to me. :smile:

    Issue Template Checks

    • [X] This is not a feature request (use a different issue template if it is)
    • [X] This is not a question (use the discussions forum instead)
    • [X] I've read the text explaining why including environment information is important and understand if I omit this information that my issue will be dismissed
    bug 
    opened by uier 3
  • Validation Loss

    This extracts parts of the training loop related to calculating the epoch loss into a function to re-use it for calculating validation losses.

    Example:

    from pykeen.datasets import get_dataset
    from pykeen.pipeline import pipeline
    
    dataset = get_dataset(dataset="nations")
    pipeline(
        dataset=dataset,
        model="mure",
        training_kwargs=dict(
            callbacks="validation-loss",
            callback_kwargs=dict(triples_factory=dataset.validation),
        ),
        result_tracker="console",
    )
    
    opened by mberr 0
Releases(v1.9.0)
  • v1.9.0(Aug 4, 2022)

    The theme of this release of PyKEEN is centered on new and exciting representations to bring more kinds of data (text, image, scalar data) into training in an elegant way. Several of these contribute to new functionality for NodePiece.

    Training and Evaluation

    • Evaluation loop by @mberr in https://github.com/pykeen/pykeen/pull/768
    • Early stopping: Reload weights from best epoch by @mberr in https://github.com/pykeen/pykeen/pull/961
    • Update evaluator's evaluate to pass through kwargs by @mberr in https://github.com/pykeen/pykeen/pull/938
    • Fix epoch loss by @mberr in https://github.com/pykeen/pykeen/pull/1021

    Datasets

    • Add Global Biotic Interactions (GloBI) dataset by @cthoyt in https://github.com/pykeen/pykeen/pull/947
    • Fix dataset caching with inverse triples by @mberr in https://github.com/pykeen/pykeen/pull/1034

    Models

    New

    • Add co-occurrence filtered meta model by @mberr in https://github.com/pykeen/pykeen/pull/943

    Updates

    • Add LineaRE interaction by @mberr in https://github.com/pykeen/pykeen/pull/971
    • Update ERMLP to ERModel by @mberr in https://github.com/pykeen/pykeen/pull/869
    • Update ERMLP-E to ERModel by @mberr in https://github.com/pykeen/pykeen/pull/872
    • Update HolE to ER-Model by @mberr in https://github.com/pykeen/pykeen/pull/953
    • Update TransE to ER-Model by @mberr in https://github.com/pykeen/pykeen/pull/955
    • Update TransH to ER-Model by @mberr in https://github.com/pykeen/pykeen/pull/954
    • Update RESCAL to ER-Model by @mberr in https://github.com/pykeen/pykeen/pull/952
    • Phase out old-style model by @mberr in https://github.com/pykeen/pykeen/pull/865
    • Update ConvKB & SE to use einsum by @mberr in https://github.com/pykeen/pykeen/pull/978
    • Add efficient RGCN implementation by @mberr in https://github.com/pykeen/pykeen/pull/634
    • Move Nguyen's TransE configurations into correct directory by @PhaelIshall in https://github.com/pykeen/pykeen/pull/957

    Representations

    • Wikidata Textual Representations by @mberr in https://github.com/pykeen/pykeen/pull/966
    • Combined Representation by @mberr in https://github.com/pykeen/pykeen/pull/964
    • Add PartitionRepresentation by @mberr in https://github.com/pykeen/pykeen/pull/980
    • Generalize Text Encoders & add a simple one by @mberr in https://github.com/pykeen/pykeen/pull/969
    • Add transformed representation by @mberr in https://github.com/pykeen/pykeen/pull/984
    • Simple visual representations by @mberr in https://github.com/pykeen/pykeen/pull/965
    • Tensor Train Representation by @mberr in https://github.com/pykeen/pykeen/pull/989

    NodePiece

    • NodePiece: GPU-enabled BFS searcher by @migalkin in https://github.com/pykeen/pykeen/pull/990
    • NodePiece x METIS by @mberr in https://github.com/pykeen/pykeen/pull/988
    • NodePiece documentation on MetisAnchorTokenizer by @migalkin in https://github.com/pykeen/pykeen/pull/1026

    Documentation

    • Explicitly set Sphinx language by @mberr in https://github.com/pykeen/pykeen/pull/951
    • Add troubleshooting for loading old models by @jas-ho in https://github.com/pykeen/pykeen/pull/963
    • Update README by @cthoyt in https://github.com/pykeen/pykeen/pull/1039
    • Fix documentation build by @cthoyt in https://github.com/pykeen/pykeen/pull/946
    • Update docs and deprecations by @mberr in https://github.com/pykeen/pykeen/pull/979
    • Update docs about normalizers and constrainers by @mberr in https://github.com/pykeen/pykeen/pull/1047

    Loss

    • Add adversarially weighted BCE loss by @mberr in https://github.com/pykeen/pykeen/pull/958
    • New procedure for computing AdversarialBCEWithLogits by @migalkin in https://github.com/pykeen/pykeen/pull/997

    Predictions

    • Score multiple tails at once by @mberr in https://github.com/pykeen/pykeen/pull/949
    • Update Prediction Filtering by @mberr in https://github.com/pykeen/pykeen/pull/1048
    • Add inference_mode annotation to get_prediction_df() by @tatiana-iazykova in https://github.com/pykeen/pykeen/pull/1024
    • Fix the device in _safe_evaluate() by @migalkin in https://github.com/pykeen/pykeen/pull/1041

    Meta

    • Update GHA by @mberr in https://github.com/pykeen/pykeen/pull/959
    • Darglint forever by @cthoyt in https://github.com/pykeen/pykeen/pull/985

    Misc

    • Cast kwargs as strings in plot_er by @vsocrates in https://github.com/pykeen/pykeen/pull/945
    • Add utility to analyze degree distributions by @mberr in https://github.com/pykeen/pykeen/pull/857
    • Add max_id/shape verification by @mberr in https://github.com/pykeen/pykeen/pull/983
    • Use torch_ppr by @mberr in https://github.com/pykeen/pykeen/pull/995
    • Add ExtraReprMixin by @mberr in https://github.com/pykeen/pykeen/pull/994
    • Add prefix when tracking pipeline metrics by @mberr in https://github.com/pykeen/pykeen/pull/998
    • Update PyG version for CI by @mberr in https://github.com/pykeen/pykeen/pull/1025
    • Allow passing numpy.ndarray to CoreTriplesFactory by @mberr in https://github.com/pykeen/pykeen/pull/1029

    New Contributors

    • @PhaelIshall made their first contribution in https://github.com/pykeen/pykeen/pull/957
    • @jas-ho made their first contribution in https://github.com/pykeen/pykeen/pull/963
    • @tatiana-iazykova made their first contribution in https://github.com/pykeen/pykeen/pull/1024

    Full Changelog: https://github.com/pykeen/pykeen/compare/v1.8.2...v1.9.0

  • v1.8.2(May 24, 2022)

    Datasets

    • Add the PrimeKG dataset by @sbonner0 in https://github.com/pykeen/pykeen/pull/915
    • Extend EA datasets to allow loading a unified graph by @mberr in https://github.com/pykeen/pykeen/pull/871
    • Fix wk3l loading by @mberr in https://github.com/pykeen/pykeen/pull/907

    Lightning

    • PyTorch Lightning by @mberr in https://github.com/pykeen/pykeen/pull/905
    • PyTorch Lightning - Part 2 by @mberr in https://github.com/pykeen/pykeen/pull/917
    • Test Training with PyTorch Lightning by @mberr in https://github.com/pykeen/pykeen/pull/930

    Losses

    • Fix default loss of PairRE by @mberr in https://github.com/pykeen/pykeen/pull/925
    • Add InfoNCE loss by @mberr in https://github.com/pykeen/pykeen/pull/926
    • Update InfoNCE LCWA implementation by @mberr in https://github.com/pykeen/pykeen/pull/928

    Representations

    • Random Walk Positional Encoding by @mberr in https://github.com/pykeen/pykeen/pull/918
    • Weisfeiler-Lehman Features by @mberr in https://github.com/pykeen/pykeen/pull/920

    Other great stuff that isn't the previous commit (it's after 5PM)

    • Update scipy minimum version by @mberr in https://github.com/pykeen/pykeen/pull/891
    • Re-use optimized batch-size in evaluation callback by @mberr in https://github.com/pykeen/pykeen/pull/886
    • Fix complex initialization by @mberr in https://github.com/pykeen/pykeen/pull/888
    • Update BoxE reproducibility configurations by @mberr in https://github.com/pykeen/pykeen/pull/631
    • Improve loading of triples with nan strings by @SenJia in https://github.com/pykeen/pykeen/pull/883
    • Update flake8 ignores by @cthoyt in https://github.com/pykeen/pykeen/pull/897
    • Unique hashes in the NodePiece representation by @migalkin in https://github.com/pykeen/pykeen/pull/896
    • PyTorch Geometric Message Passing Representations by @mberr in https://github.com/pykeen/pykeen/pull/894
    • Fix directory path normalization by @mberr in https://github.com/pykeen/pykeen/pull/890
    • Implement more graph pair unification approaches by @mberr in https://github.com/pykeen/pykeen/pull/893
    • Backwards Compatibility for init phases by @mberr in https://github.com/pykeen/pykeen/pull/899
    • Update Docstring Coverage check by @mberr in https://github.com/pykeen/pykeen/pull/892
    • Class resolver type annotations by @mberr in https://github.com/pykeen/pykeen/pull/904
    • Move listing experiments from epilog to own command by @mberr in https://github.com/pykeen/pykeen/pull/903
    • Update hpo tutorial about grid search by @mberr in https://github.com/pykeen/pykeen/pull/902
    • Fix typo in prediction docs by @mberr in https://github.com/pykeen/pykeen/pull/912
    • Extract triple-independent information from CoreTriplesFactory by @mberr in https://github.com/pykeen/pykeen/pull/908
    • Increase Minimum Python Version to 3.8 by @mberr in https://github.com/pykeen/pykeen/pull/921
    • Extend save to directory doc by @mberr in https://github.com/pykeen/pykeen/pull/916
    • Maximize memory utilization for label based initialization by @mberr in https://github.com/pykeen/pykeen/pull/898
    • Rename inductive representation methods by @mberr in https://github.com/pykeen/pykeen/pull/929
    • Add missing device by @vsocrates in https://github.com/pykeen/pykeen/pull/936

    New Contributors

    • @SenJia made their first contribution in https://github.com/pykeen/pykeen/pull/883
    • @vsocrates made their first contribution in https://github.com/pykeen/pykeen/pull/936

    Full Changelog: https://github.com/pykeen/pykeen/compare/v1.8.1...v1.8.2

  • v1.8.1(Apr 20, 2022)

    PyKEEN 1.8.1 contains a few critical bug fixes along with some other cool updates.

    Evaluation

    • Weighted Rank-Based Metrics by @mberr in https://github.com/pykeen/pykeen/pull/837
    • Macro evaluation by @mberr in https://github.com/pykeen/pykeen/pull/850

    Inductive Models

    • NodePiece Anchor Searching via PPR by @mberr in https://github.com/pykeen/pykeen/pull/870

    Transductive Models

    • Update DistMult to ERModel by @mberr in https://github.com/pykeen/pykeen/pull/874
    • Update ProjE to ERModel by @mberr in https://github.com/pykeen/pykeen/pull/876
    • Update RotatE to ERModel by @mberr in https://github.com/pykeen/pykeen/pull/877
    • Update ConvE to ERModel by @mberr in https://github.com/pykeen/pykeen/pull/875
    • Update TuckER to ERModel by @mberr in https://github.com/pykeen/pykeen/pull/866
    • Upgrade TransR to ERModel by @mberr in https://github.com/pykeen/pykeen/pull/868

    New Datasets

    • Add ILPC datasets and inductive dataset resolver by @cthoyt in https://github.com/pykeen/pykeen/pull/848
    • Add aristo-v4 dataset by @mberr in https://github.com/pykeen/pykeen/pull/855

    Documentation

    • Update documentation to better reflect new-style models by @mberr in https://github.com/pykeen/pykeen/pull/879
    • Correct typos in "First Steps" tutorial by @andreasala98 in https://github.com/pykeen/pykeen/pull/846

    Bug Fixes

    • Fix arange dtype and clip variances by @mberr in https://github.com/pykeen/pykeen/pull/881
    • Fix pop_regularization_term by @mberr in https://github.com/pykeen/pykeen/pull/849
    • Fix numeric triples factory by @mberr in https://github.com/pykeen/pykeen/pull/862
    • Ensure reproducible splits for all datasets by @mberr in https://github.com/pykeen/pykeen/pull/856
    • Raise explicit error if no training batch was available by @mberr in https://github.com/pykeen/pykeen/pull/860
    • Fix TransformerEncoder tokens' device by @mberr in https://github.com/pykeen/pykeen/pull/861

    Misc

    • Resolve optimizer, LR-scheduler & tracker in training loop by @mberr in https://github.com/pykeen/pykeen/pull/852
    • Update default batch size HPO range by @mberr in https://github.com/pykeen/pykeen/pull/864
    • Use torch builtin broadcast by @mberr in https://github.com/pykeen/pykeen/pull/873

    New Contributors

    • @andreasala98 made their first contribution in https://github.com/pykeen/pykeen/pull/846

    Full Changelog: https://github.com/pykeen/pykeen/compare/v1.8.0...v1.8.1

  • v1.8.0(Mar 22, 2022)

    Among a ton of updates since the beginning of the year, PyKEEN v1.8.0 has three major themes:

    1. The introduction of the inductive link prediction pipeline and the NodePiece model. We highly suggest checking out An Open Challenge for Inductive Link Prediction on Knowledge Graphs to go along with this new pipeline and models.
    2. The introduction of new rank-based evaluation metrics to go along with A Unified Framework for Rank-based Evaluation Metrics for Link Prediction in Knowledge Graphs
    3. Major internal refactoring of negative sampling to better use PyTorch's data loaders and support multi-CPU generation (special thanks to @Koenkalle for help testing this)

    NodePiece and Inductive Link Prediction

    • Inductive LP framework by @migalkin in https://github.com/pykeen/pykeen/pull/722
    • Add mode parameter by @cthoyt in https://github.com/pykeen/pykeen/pull/769
    • Mixed tokenization for NodePiece by @mberr in https://github.com/pykeen/pykeen/pull/770
    • NodePiece with anchors by @mberr in https://github.com/pykeen/pykeen/pull/755
    • Precomputed Tokenization for NodePiece by @mberr in https://github.com/pykeen/pykeen/pull/822
    • Refactor NodePiece and improve documentation by @cthoyt in https://github.com/pykeen/pykeen/pull/833
    • NodePiece MixtureAnchorSelection unique anchor IDs fix + PageRank fix by @migalkin in https://github.com/pykeen/pykeen/pull/776
    • NodePiece experimental configs by @migalkin in https://github.com/pykeen/pykeen/pull/771
    • Attention edge weighting by @migalkin in https://github.com/pykeen/pykeen/pull/734

    Models

    New

    • Add (multi-)linear Tucker interaction by @mberr in https://github.com/pykeen/pykeen/pull/751
    • Soft inverse triples baseline by @mberr in https://github.com/pykeen/pykeen/pull/543

    Updated

    • Fix device for FixedModel by @mberr in https://github.com/pykeen/pykeen/pull/725
    • Unify usage of slice_size by @cthoyt in https://github.com/pykeen/pykeen/pull/729
    • Remove device from model by @cthoyt in https://github.com/pykeen/pykeen/pull/730
    • Cleanup model argument passing by @cthoyt in https://github.com/pykeen/pykeen/pull/762

    Training and Evaluation

    • Split early stopping logic from evaluation by @mberr in https://github.com/pykeen/pykeen/pull/355
    • Sampled Rank-Based Evaluator by @mberr in https://github.com/pykeen/pykeen/pull/733
    • Fix Checkpointing by @mberr in https://github.com/pykeen/pykeen/pull/740
    • Switch evaluator from dataclass to dict by @cthoyt in https://github.com/pykeen/pykeen/pull/780
    • Simplify evaluate by @mberr in https://github.com/pykeen/pykeen/pull/767
    • Store result tracker inside loop by @mberr in https://github.com/pykeen/pykeen/pull/793

    Callbacks

    • Evaluation callback by @mberr in https://github.com/pykeen/pykeen/pull/765
    • Early stopping via training callback by @mberr in https://github.com/pykeen/pykeen/pull/354

    Data and Datasets

    New

    • Add OpenEA datasets by @dobraczka in https://github.com/pykeen/pykeen/pull/784
    • Add the PharmKG8k dataset by @sbonner0 in https://github.com/pykeen/pykeen/pull/797
    • Add PharmKG full dataset by @sbonner0 in https://github.com/pykeen/pykeen/pull/806
    • Replace OGB's WikiKG by WikiKG2 by @mberr in https://github.com/pykeen/pykeen/pull/809

    Updates

    • Use pinned memory for training data loader by @mberr in https://github.com/pykeen/pykeen/pull/747
    • Add property for number of parameters by @mberr in https://github.com/pykeen/pykeen/pull/804
    • Refactor dataset utility code by @cthoyt in https://github.com/pykeen/pykeen/pull/830
    • Update dataset registration by @cthoyt in https://github.com/pykeen/pykeen/pull/832
    • Update dataset statistics by @cthoyt in https://github.com/pykeen/pykeen/pull/834
    • Ignore create_inverse_triples for caching hash digest by @mberr in https://github.com/pykeen/pykeen/pull/813
    • Use Figshare link for OpenEA dataset by @dobraczka in https://github.com/pykeen/pykeen/pull/838
    • Batch data loader by @mberr in https://github.com/pykeen/pykeen/pull/817
    • Save Training Triples Factory by @mali-git in https://github.com/pykeen/pykeen/pull/655
    • Negative sampling in data loader by @mberr in https://github.com/pykeen/pykeen/pull/417
    • Change serialization format by @mberr in https://github.com/pykeen/pykeen/pull/785
    • Cache dataset loading by @mberr in https://github.com/pykeen/pykeen/pull/569

    Metrics

    • Compute Candidate Set Sizes by @mberr in https://github.com/pykeen/pykeen/pull/732
    • Update rank data structure by @cthoyt in https://github.com/pykeen/pykeen/pull/758
    • Update metric key data structure by @cthoyt in https://github.com/pykeen/pykeen/pull/759
    • Reorganize metrics and expectation functions by @cthoyt in https://github.com/pykeen/pykeen/pull/763
    • Add improved indicator constructor by @cthoyt in https://github.com/pykeen/pykeen/pull/781
    • Improve metrics data structures by @cthoyt in https://github.com/pykeen/pykeen/pull/782
    • Class-Based Rank-Based Metrics by @mberr in https://github.com/pykeen/pykeen/pull/786
    • Add more adjusted metrics by @cthoyt in https://github.com/pykeen/pykeen/pull/814
    • Refactor derived metrics by @mberr in https://github.com/pykeen/pykeen/pull/835
    • Update value range & docstring of adjusted metrics by @mberr in https://github.com/pykeen/pykeen/pull/823
    • Add option to add all default rank-based metrics by @mberr in https://github.com/pykeen/pykeen/pull/827
    • Fix RankBasedMetricResults.iter_rows by @mberr in https://github.com/pykeen/pykeen/pull/792

    Prediction

    • Predict workflow with inverse relations by @mberr in https://github.com/pykeen/pykeen/pull/726

    Representations

    • Change interactions' shape by @mberr in https://github.com/pykeen/pykeen/pull/736
    • Update constrainer, initializer, and normalizer resolution by @mberr in https://github.com/pykeen/pykeen/pull/742
    • Only get representations for unique indices by @mberr in https://github.com/pykeen/pykeen/pull/743
    • Remove get in canonical shape by @mberr in https://github.com/pykeen/pykeen/pull/745
    • Fix dtype forwarding in Embedding by @mberr in https://github.com/pykeen/pykeen/pull/746
    • Move normalization to base representation by @mberr in https://github.com/pykeen/pykeen/pull/818
    • Unify representation module nomenclature by @mberr in https://github.com/pykeen/pykeen/pull/811
    • Resolve Representations by @mberr in https://github.com/pykeen/pykeen/pull/803

    Trackers

    • Add loss kwargs to ResultTracker by @Rodrigo-A-Pereira in https://github.com/pykeen/pykeen/pull/741
    • Fix typo in ConsoleTracker.log_metrics by @mberr in https://github.com/pykeen/pykeen/pull/787

    Fixes

    • Fix ValueError during size probing on GPU machines by @mberr in https://github.com/pykeen/pykeen/pull/821
    • Fix device error in training loop by @mberr in https://github.com/pykeen/pykeen/pull/774
    • Fix filterer's device by @mberr in https://github.com/pykeen/pykeen/pull/801
    • Make sure indices are moved to device by @mberr in https://github.com/pykeen/pykeen/pull/800

    Documentation, Typing, and Packaging

    • Goodbye to setup.py and Makefile for building the docs by @cthoyt in https://github.com/pykeen/pykeen/pull/761
    • Update Constants and Types by @mberr in https://github.com/pykeen/pykeen/pull/754
    • Update black by @cthoyt in https://github.com/pykeen/pykeen/pull/764
    • Add Python 3.10 support by @cthoyt in https://github.com/pykeen/pykeen/pull/831
    • Update argument passing and documentation by @cthoyt in https://github.com/pykeen/pykeen/pull/842
    • Typing Updates by @cthoyt in https://github.com/pykeen/pykeen/pull/760
    • Fix HPO doc by @mberr in https://github.com/pykeen/pykeen/pull/820
    • Extend documentation on subbatching and slicing by @mberr in https://github.com/pykeen/pykeen/pull/810

    Misc

    • Add list of available configurations to usage message of reproduction by @mberr in https://github.com/pykeen/pykeen/pull/753
    • Update class-resolver by @cthoyt in https://github.com/pykeen/pykeen/pull/775

    Full Changelog: https://github.com/pykeen/pykeen/compare/v1.7.0...v1.8.0

  • v1.7.0(Jan 11, 2022)

    New Models

    • Add BoxE by @ralphabb in https://github.com/pykeen/pykeen/pull/618
    • Add TripleRE by @mberr in https://github.com/pykeen/pykeen/pull/712
    • Add AutoSF by @mberr in https://github.com/pykeen/pykeen/pull/713
    • Add Transformer by @mberr in https://github.com/pykeen/pykeen/pull/714
    • Add Canonical Tensor Decomposition by @mberr in https://github.com/pykeen/pykeen/pull/663
    • Add (novel) Fixed Model by @cthoyt in https://github.com/pykeen/pykeen/pull/691
    • Add NodePiece model by @mberr in https://github.com/pykeen/pykeen/pull/621

    Updated Models

    • Update R-GCN configuration by @mberr in https://github.com/pykeen/pykeen/pull/610
    • Update ConvKB to ERModel by @cthoyt in https://github.com/pykeen/pykeen/pull/425
    • Update ComplEx to ERModel by @mberr in https://github.com/pykeen/pykeen/pull/639
    • Rename TranslationalInteraction to NormBasedInteraction by @mberr in https://github.com/pykeen/pykeen/pull/651
    • Fix generic slicing dimension by @mberr in https://github.com/pykeen/pykeen/pull/683
    • Rename UnstructuredModel to UM and StructuredEmbedding to SE by @cthoyt in https://github.com/pykeen/pykeen/pull/721
    • Allow to pass unresolved loss to ERModel's __init__ by @mberr in https://github.com/pykeen/pykeen/pull/717

    Representations and Initialization

    • Add low-rank embeddings by @mberr in https://github.com/pykeen/pykeen/pull/680
    • Add NodePiece representation by @mberr in https://github.com/pykeen/pykeen/pull/621
    • Add label-based initialization using a transformer (e.g., BERT) by @mberr in https://github.com/pykeen/pykeen/pull/638 and https://github.com/pykeen/pykeen/pull/652
    • Add label-based representation (e.g., to update language model using KGEM) by @mberr in https://github.com/pykeen/pykeen/pull/652
    • Remove literal representations (use label-based initialization instead) by @mberr in https://github.com/pykeen/pykeen/pull/679

    Training

    • Fix displaying previous epoch's loss by @mberr in https://github.com/pykeen/pykeen/pull/627
    • Fix kwargs transmission on MultiTrainingCallback by @Rodrigo-A-Pereira in https://github.com/pykeen/pykeen/pull/645
    • Extend Callbacks by @mberr in https://github.com/pykeen/pykeen/pull/609
    • Add gradient clipping by @mberr in https://github.com/pykeen/pykeen/pull/607
    • Fix negative score shape for sLCWA by @mberr in https://github.com/pykeen/pykeen/pull/624
    • Fix epoch loss for loss reduction != "mean" by @mberr in https://github.com/pykeen/pykeen/pull/623
    • Add sLCWA support for Cross Entropy Loss by @mberr in https://github.com/pykeen/pykeen/pull/704

    Inference

    • Add uncertainty estimate functions via MC dropout by @mberr in https://github.com/pykeen/pykeen/pull/688
    • Fix predict top k by @mberr in https://github.com/pykeen/pykeen/pull/690
    • Fix indexing in predict_* methods when using inverse relations by @mberr in https://github.com/pykeen/pykeen/pull/699
    • Move tensors to device for predict_* methods by @mberr in https://github.com/pykeen/pykeen/pull/658

    Trackers

    • Fix wandb logging by @mberr in https://github.com/pykeen/pykeen/pull/647
    • Add multi-result tracker by @mberr in https://github.com/pykeen/pykeen/pull/682
    • Add Python result tracker by @mberr in https://github.com/pykeen/pykeen/pull/681
    • Update file trackers by @cthoyt in https://github.com/pykeen/pykeen/pull/629

    Evaluation

    • Store rank count by @mberr in https://github.com/pykeen/pykeen/pull/672
    • Extend evaluate() for easier relation filtering by @mberr in https://github.com/pykeen/pykeen/pull/391
    • Rename sklearn evaluator and refactor evaluator code by @cthoyt in https://github.com/pykeen/pykeen/pull/708
    • Add additional classification metrics via rexmex by @cthoyt in https://github.com/pykeen/pykeen/pull/668

    Triples and Datasets

    • Add helper dataset with internal batching for Schlichtkrull sampling by @mberr in https://github.com/pykeen/pykeen/pull/616
    • Refactor splitting code and improve documentation by @mberr in https://github.com/pykeen/pykeen/pull/709
    • Switch np.loadtxt to pandas.read_csv by @mberr in https://github.com/pykeen/pykeen/pull/695
    • Add binary I/O to triples factories by @cthoyt in https://github.com/pykeen/pykeen/pull/665

    Torch Usage

    • Use torch.finfo to determine suitable epsilon values by @mberr in https://github.com/pykeen/pykeen/pull/626
    • Use torch.isin instead of own implementation by @mberr in https://github.com/pykeen/pykeen/pull/635
    • Switch to using torch.inference_mode instead of torch.no_grad by @sbonner0 in https://github.com/pykeen/pykeen/pull/604

    Miscellaneous

    • Add YAML experiment format by @mberr in https://github.com/pykeen/pykeen/pull/612
    • Add comparison with reproduction results during replication, if available by @mberr in https://github.com/pykeen/pykeen/pull/642
    • Adapt hello_world notebook to API changes by @dobraczka in https://github.com/pykeen/pykeen/pull/649
    • Add testing configuration for Jupyter notebooks by @mberr in https://github.com/pykeen/pykeen/pull/650
    • Add empty default loss_kwargs by @mali-git in https://github.com/pykeen/pykeen/pull/656
    • Optional extra config for reproduce by @mberr in https://github.com/pykeen/pykeen/pull/692
    • Store pipeline configuration in pipeline result by @mberr in https://github.com/pykeen/pykeen/pull/685
    • Fix upgrade to sequence by @mberr in https://github.com/pykeen/pykeen/pull/697
    • Fix pruner use in hpo_pipeline by @mberr in https://github.com/pykeen/pykeen/pull/724

    Housekeeping

    • Automatically lint with black by @cthoyt in https://github.com/pykeen/pykeen/pull/605
    • Documentation and style guide cleanup by @cthoyt in https://github.com/pykeen/pykeen/pull/606
  • v1.6.0(Oct 18, 2021)

    This release is only compatible with PyTorch 1.9+. Because of some changes, it is now non-trivial to support both PyTorch 1.9+ and earlier versions, so going forward PyKEEN will support the latest version of PyTorch and try its best to keep backwards compatibility.

    New Models

    • DistMA (https://github.com/pykeen/pykeen/pull/507)
    • TorusE (https://github.com/pykeen/pykeen/pull/510)
    • Frequency Baselines (https://github.com/pykeen/pykeen/pull/514)
    • Gated DistMult Literal (https://github.com/pykeen/pykeen/pull/591, thanks @Rodrigo-A-Pereira)

    New Datasets

    • WD50K (https://github.com/pykeen/pykeen/pull/511)
    • Wikidata5M (https://github.com/pykeen/pykeen/pull/528)
    • BioKG (https://github.com/pykeen/pykeen/pull/585, thanks @sbonner0)

    New Losses

    • Double Margin Loss (https://github.com/pykeen/pykeen/pull/539)
    • Focal Loss (https://github.com/pykeen/pykeen/pull/542)
    • Pointwise Hinge Loss (https://github.com/pykeen/pykeen/pull/540)
    • Soft Pointwise Hinge Loss (https://github.com/pykeen/pykeen/pull/540)
    • Pairwise Logistic Loss (https://github.com/pykeen/pykeen/pull/540)
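
As a rough illustration of three of the losses listed above, the following sketch uses their common textbook forms in plain Python. The function names, the default margin of 1.0, and the scalar (non-batched) signatures are assumptions made for this example; they are not taken from PyKEEN's implementation, which operates on tensors.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pointwise_hinge(score: float, label: int, margin: float = 1.0) -> float:
    """Hinge loss on a single triple; label is +1 (true) or -1 (corrupted)."""
    return max(0.0, margin - label * score)


def soft_pointwise_hinge(score: float, label: int, margin: float = 1.0) -> float:
    """Smooth variant: the hard max(0, .) is replaced by softplus."""
    return softplus(margin - label * score)


def pairwise_logistic(pos_score: float, neg_score: float) -> float:
    """Compares a positive triple's score against a negative triple's score."""
    return softplus(neg_score - pos_score)


if __name__ == "__main__":
    print(pointwise_hinge(0.5, 1))        # positive triple inside the margin
    print(soft_pointwise_hinge(0.5, 1))   # smooth upper bound of the value above
    print(pairwise_logistic(2.0, -1.0))   # small, since the positive outranks the negative
```

The pointwise variants score each triple against its own label, while the pairwise variant only looks at the difference between a positive and a negative score.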

    Added

    • Tutorial in using checkpoints when bringing your own data (https://github.com/pykeen/pykeen/pull/498)
    • Learning rate scheduling (https://github.com/pykeen/pykeen/pull/492)
    • Checkpoints include entity/relation maps (https://github.com/pykeen/pykeen/pull/498)
    • QuatE reproducibility configurations (https://github.com/pykeen/pykeen/pull/486)

    Changed

    • Reimplement SE (https://github.com/pykeen/pykeen/pull/521) and NTN (https://github.com/pykeen/pykeen/pull/522) with new-style models
    • Generalize pairwise loss and pointwise loss hierarchies (https://github.com/pykeen/pykeen/pull/540)
    • Update to use PyTorch 1.9 functionality (https://github.com/pykeen/pykeen/pull/489)
    • Generalize generator strategies in LCWA (https://github.com/pykeen/pykeen/pull/602)

    Fixed

    • FileNotFoundError on Windows/Anaconda (https://github.com/pykeen/pykeen/pull/503, thanks @Hao-666)
    • Fixed docstring for ComplEx interaction (https://github.com/pykeen/pykeen/pull/504)
    • Make DistMult the default interaction function for R-GCN (https://github.com/pykeen/pykeen/pull/548)
    • Fix gradient error in CompGCN buffering (https://github.com/pykeen/pykeen/pull/573)
    • Fix splitting of numeric triples factories (https://github.com/pykeen/pykeen/pull/594, thanks @Rodrigo-A-Pereira)
    • Fix determinism in splitting of triples factory (https://github.com/pykeen/pykeen/pull/500)
    • Fix documentation and improve HPO suggestion (https://github.com/pykeen/pykeen/pull/524, thanks @kdutia)
  • v1.5.0(Jun 13, 2021)

    New Metrics

    • Adjusted Arithmetic Mean Rank Index (https://github.com/pykeen/pykeen/pull/378)
    • Add harmonic, geometric, and median rankings (https://github.com/pykeen/pykeen/pull/381)
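
To make the metrics above concrete, here is a minimal sketch that summarizes a list of 1-based ranks using only the standard library. The function names are hypothetical, and the adjusted index follows the commonly cited definition, 1 - (MR - 1) / (E[MR] - 1), with the expected rank of a uniformly random scorer over n candidates taken as (n + 1) / 2; treat that formula as an assumption of this example rather than a statement about PyKEEN's code.

```python
import statistics
from typing import Dict, Sequence


def rank_summaries(ranks: Sequence[int]) -> Dict[str, float]:
    """Aggregate 1-based ranks with several notions of 'average rank'."""
    return {
        "arithmetic_mean_rank": statistics.fmean(ranks),
        "harmonic_mean_rank": statistics.harmonic_mean(ranks),
        "geometric_mean_rank": statistics.geometric_mean(ranks),
        "median_rank": statistics.median(ranks),
    }


def adjusted_mean_rank_index(
    ranks: Sequence[int], num_candidates: Sequence[int]
) -> float:
    """Rescale the mean rank so that 1 is optimal and 0 matches random scoring.

    num_candidates[i] is the number of candidates ranked for the i-th test triple.
    """
    mean_rank = statistics.fmean(ranks)
    expected_mean_rank = statistics.fmean([(n + 1) / 2 for n in num_candidates])
    return 1.0 - (mean_rank - 1.0) / (expected_mean_rank - 1.0)


if __name__ == "__main__":
    ranks = [1, 2, 4]
    print(rank_summaries(ranks))
    print(adjusted_mean_rank_index(ranks, num_candidates=[10, 10, 10]))
```

Unlike the raw mean rank, the adjusted index is comparable across datasets with different numbers of candidate entities.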

    New Trackers

    • Console Tracker (https://github.com/pykeen/pykeen/pull/440)
    • Tensorboard Tracker (https://github.com/pykeen/pykeen/pull/416; thanks @sbonner0)

    New Models

    • QuatE (https://github.com/pykeen/pykeen/pull/367)
    • CompGCN (https://github.com/pykeen/pykeen/pull/382)
    • CrossE (https://github.com/pykeen/pykeen/pull/467)
    • Reimplementation of LiteralE with arbitrary combination (g) function (https://github.com/pykeen/pykeen/pull/245)

    New Negative Samplers

    • Pseudo-typed Negative Sampler (https://github.com/pykeen/pykeen/pull/412)

    Datasets

    • Removed invalid datasets (OpenBioLink filtered sets; https://github.com/pykeen/pykeen/pull/439)
    • Added WK3l-15K (https://github.com/pykeen/pykeen/pull/403)
    • Added WK3l-120K (https://github.com/pykeen/pykeen/pull/403)
    • Added CN3l (https://github.com/pykeen/pykeen/pull/403)

    Added

    • Documentation on using PyKEEN in Google Colab and Kaggle (https://github.com/pykeen/pykeen/pull/379, thanks @jerryIsHere)
    • Pass custom training loops to pipeline (https://github.com/pykeen/pykeen/pull/334)
    • Compatibility layer for the fft module (https://github.com/pykeen/pykeen/pull/288)
    • Official Python 3.9 support, now that PyTorch has it (https://github.com/pykeen/pykeen/pull/223)
    • Utilities for dataset analysis (https://github.com/pykeen/pykeen/pull/16, https://github.com/pykeen/pykeen/pull/392)
    • Filtering of negative sampling now uses a bloom filter by default (https://github.com/pykeen/pykeen/pull/401)
    • Optional embedding dropout (https://github.com/pykeen/pykeen/pull/422)
    • Added more HPO suggestion methods and docs (https://github.com/pykeen/pykeen/pull/446)
    • Training callbacks (https://github.com/pykeen/pykeen/pull/429)
    • Class resolver for datasets (https://github.com/pykeen/pykeen/pull/473)
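
The list above mentions that filtering of negative samples now uses a bloom filter by default. The sketch below shows the general idea in plain Python: known triples are inserted into a bit array, and candidate negatives that appear to be known are dropped. The class name, the bit-array size, the number of hash functions, and the use of salted BLAKE2b hashes are all assumptions for illustration and do not describe PyKEEN's implementation.

```python
import hashlib
from typing import Iterable, Iterator, List, Tuple

Triple = Tuple[int, int, int]  # (head_id, relation_id, tail_id)


class TripleBloomFilter:
    """Approximate set membership: no false negatives, rare false positives."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, triple: Triple) -> Iterator[int]:
        data = ",".join(map(str, triple)).encode("utf-8")
        for seed in range(self.num_hashes):
            # A different salt per round simulates independent hash functions.
            digest = hashlib.blake2b(
                data, digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, triple: Triple) -> None:
        for pos in self._positions(triple):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, triple: Triple) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(triple)
        )


def filter_negatives(
    candidates: Iterable[Triple], known: TripleBloomFilter
) -> List[Triple]:
    """Keep only candidates that the filter does not report as known positives."""
    return [triple for triple in candidates if triple not in known]


if __name__ == "__main__":
    known = TripleBloomFilter()
    for triple in [(0, 0, 1), (1, 0, 2)]:
        known.add(triple)
    print(filter_negatives([(0, 0, 1), (5, 0, 6)], known))
```

The trade-off is that a small fraction of valid negatives may be discarded as false positives in exchange for constant memory and fast lookups.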

    Updated

    • R-GCN implementation now uses new-style models and is super idiomatic (https://github.com/pykeen/pykeen/pull/110)
    • Enable passing of interaction function by string in base model class (https://github.com/pykeen/pykeen/pull/384, https://github.com/pykeen/pykeen/pull/387)
    • Bump scipy requirement to 1.5.0+
    • Updated interfaces of models and negative samplers to enforce kwargs (https://github.com/pykeen/pykeen/pull/445)
    • Reorganize filtering, negative sampling, and remove triples factory from most objects (https://github.com/pykeen/pykeen/pull/400, https://github.com/pykeen/pykeen/pull/405, https://github.com/pykeen/pykeen/pull/406, https://github.com/pykeen/pykeen/pull/409, https://github.com/pykeen/pykeen/pull/420)
    • Update automatic memory optimization (https://github.com/pykeen/pykeen/pull/404)
    • Flexibly define positive triples for filtering (https://github.com/pykeen/pykeen/pull/398)
    • Completely reimplemented negative sampling interface in training loops (https://github.com/pykeen/pykeen/pull/427)
    • Completely reimplemented loss function in training loops (https://github.com/pykeen/pykeen/pull/448)
    • Forward-compatibility of embeddings in old-style models and updated docs on how to use embeddings (https://github.com/pykeen/pykeen/pull/474)

    Fixed

    • Regularizer passing in the pipeline and HPO (https://github.com/pykeen/pykeen/pull/345)
    • Saving results when using multimodal models (https://github.com/pykeen/pykeen/pull/349)
    • Add missing diagonal constraint on MuRE Model (https://github.com/pykeen/pykeen/pull/353)
    • Fix early stopper handling (https://github.com/pykeen/pykeen/pull/419)
    • Fixed saving results from pipeline (https://github.com/pykeen/pykeen/pull/428, thanks @kantholtz)
    • Fix OOM issues with early stopper and AMO (https://github.com/pykeen/pykeen/pull/433)
    • Fix ER-MLP functional form (https://github.com/pykeen/pykeen/pull/444)
  • v1.4.0(Mar 4, 2021)

    New Datasets

    • Countries (https://github.com/pykeen/pykeen/pull/314)
    • DB100K (https://github.com/pykeen/pykeen/issues/316)

    New Models

    • MuRE (https://github.com/pykeen/pykeen/pull/311)
    • PairRE (https://github.com/pykeen/pykeen/pull/309)
    • Monotonic affine transformer (https://github.com/pykeen/pykeen/pull/324)

    New Algorithms

    If you're interested in any of these, please get in touch with us regarding an upcoming publication.

    • Dataset Similarity (https://github.com/pykeen/pykeen/pull/294)
    • Dataset Deterioration (https://github.com/pykeen/pykeen/pull/295)
    • Dataset Remix (https://github.com/pykeen/pykeen/pull/296)

    Added

    • New-style models (https://github.com/pykeen/pykeen/pull/260) for direct usage of interaction modules
    • Ability to train pipeline() using an Interaction module rather than a Model (https://github.com/pykeen/pykeen/pull/326, https://github.com/pykeen/pykeen/pull/330).

    Changes

    • Lookup of assets is now mediated by the class_resolver package (https://github.com/pykeen/pykeen/pull/321, https://github.com/pykeen/pykeen/pull/327)
    • The docdata package is now used to parse structured information out of the model and dataset documentation in order to make a more informative README with links to citations (https://github.com/pykeen/pykeen/pull/303).

    Fixed

    • Fixed ComplEx's implementation (https://github.com/pykeen/pykeen/pull/313)
    • Fixed OGB's reuse of entity identifiers (https://github.com/pykeen/pykeen/pull/318, thanks @tgebhart)
  • v1.3.0(Feb 15, 2021)

    We skipped version 1.2.0 because we made an accidental release before this version was ready. We're only human, and are looking into improving our release workflow to live in CI/CD so something like this doesn't happen again. However, as an end user, this won't have an effect on you.

    New Datasets

    • CSKG (https://github.com/pykeen/pykeen/pull/249)
    • DBpedia50 (https://github.com/pykeen/pykeen/issues/278)

    New Trackers

    • General file-based Tracker (https://github.com/pykeen/pykeen/pull/254)
    • CSV Tracker (https://github.com/pykeen/pykeen/pull/254)
    • JSON Tracker (https://github.com/pykeen/pykeen/pull/254)

    Added

    • pykeen version command for more easily reporting your environment in issues (https://github.com/pykeen/pykeen/issues/251)
    • Functional forms of all interaction models (e.g., TransE, RotatE) (https://github.com/pykeen/pykeen/issues/238, pykeen.nn.functional documentation). These can be generally reused, even outside of the typical PyKEEN workflows.
    • Modular forms of all interaction models (https://github.com/pykeen/pykeen/issues/242, pykeen.nn.modules documentation). These wrap the functional forms of interaction models and store hyper-parameters such as the p value for the L_p norm in TransE.
    • The initializer, normalizer, and constrainer for the entity and relation embeddings are now exposed through the __init__() function of each KGEM class and can be configured. A future update will enable HPO on these as well (https://github.com/pykeen/pykeen/issues/282).
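
The split between functional and modular forms described above can be sketched in plain Python with TransE as the example. The functional form is a stateless function of the head, relation, and tail vectors, and the modular form wraps it and stores the hyper-parameter p. The names below and the use of plain lists instead of tensors are assumptions for illustration, not the contents of pykeen.nn.functional or pykeen.nn.modules.

```python
from typing import Sequence


def transe_interaction(
    h: Sequence[float], r: Sequence[float], t: Sequence[float], p: int = 2
) -> float:
    """Functional form: the negative L_p norm of h + r - t (higher is better)."""
    norm = sum(abs(hi + ri - ti) ** p for hi, ri, ti in zip(h, r, t)) ** (1.0 / p)
    return -norm


class TransEInteraction:
    """Modular form: stores the hyper-parameter p and delegates to the function."""

    def __init__(self, p: int = 2) -> None:
        self.p = p

    def __call__(
        self, h: Sequence[float], r: Sequence[float], t: Sequence[float]
    ) -> float:
        return transe_interaction(h, r, t, p=self.p)


if __name__ == "__main__":
    interaction = TransEInteraction(p=1)
    print(interaction([0.0, 0.0], [0.0, 0.0], [3.0, 4.0]))
```

Keeping the function stateless makes it reusable outside a training workflow, while the wrapper gives the hyper-parameters a single place to live.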

    Refactoring and Future Preparation

    This release contains a few big refactors. Most won't affect end users, but they matter if you're writing your own PyKEEN models. Many of them pave the way for a new interface that makes it much easier for researchers (who shouldn't have to understand the inner workings of PyKEEN) to build new models.

    • The regularizer has been refactored (https://github.com/pykeen/pykeen/issues/266, https://github.com/pykeen/pykeen/issues/274). It no longer accepts a torch.device when instantiated.
    • The pykeen.nn.Embedding class has been improved in several ways:
      • Embedding Specification class makes it easier to write new classes (https://github.com/pykeen/pykeen/issues/277)
      • Refactor to make shape of embedding explicit (https://github.com/pykeen/pykeen/issues/287)
      • Specification of complex datatype (https://github.com/pykeen/pykeen/issues/292)
    • Refactoring of the loss model class to provide a meaningful class hierarchy (https://github.com/pykeen/pykeen/issues/256, https://github.com/pykeen/pykeen/issues/262)
    • Refactoring of the base model class to provide a consistent interface (https://github.com/pykeen/pykeen/issues/246, https://github.com/pykeen/pykeen/issues/248, https://github.com/pykeen/pykeen/issues/253, https://github.com/pykeen/pykeen/issues/257). This allowed for simplification of the loss computation based on the new hierarchy and also new implementation of regularizer class.
    • More automated testing of typing with MyPy (https://github.com/pykeen/pykeen/issues/255) and automated checking of documentation with doctests (https://github.com/pykeen/pykeen/issues/291)

    Triples Loading

    We've made some improvements to the pykeen.triples.TriplesFactory to facilitate loading even larger datasets (https://github.com/pykeen/pykeen/issues/216). However, this required an interface change. This will affect any code that loads custom triples. If you're loading triples from a path, you should now use:

    from pykeen.triples import TriplesFactory

    path = ...  # path to your triples file
    # Old (doesn't work anymore)
    tf = TriplesFactory(path=path)

    # New
    tf = TriplesFactory.from_path(path)

    Predictions

    While refactoring the base model class, we excised the prediction functionality to a new module pykeen.models.predict (docs: https://pykeen.readthedocs.io/en/latest/reference/predict.html#functions). We also renamed some of the prediction functions inside the base model to make them more consistent, but we now recommend you use the functions from pykeen.models.predict instead.

    • Model.predict_heads() -> Model.get_head_prediction_df()
    • Model.predict_relations() -> Model.get_relation_prediction_df()
    • Model.predict_tails() -> Model.get_tail_prediction_df()
    • Model.score_all_triples() -> Model.get_all_prediction_df()

    Fixed

    • Do not create inverse triples for validation and testing factory (https://github.com/pykeen/pykeen/issues/270)
    • Treat nonzero applied to large tensor error as OOM for batch size search (https://github.com/pykeen/pykeen/issues/279)
    • Fix bug in loading ConceptNet (https://github.com/pykeen/pykeen/issues/290). If your experiments relied on this dataset, you should rerun them.
  • v1.1.0(Jan 20, 2021)

    New Datasets

    • CoDEx (https://github.com/pykeen/pykeen/pull/154)
    • DRKG (https://github.com/pykeen/pykeen/pull/156)
    • OGB (https://github.com/pykeen/pykeen/pull/159)
    • ConceptNet (https://github.com/pykeen/pykeen/pull/160)
    • Clinical Knowledge Graph (https://github.com/pykeen/pykeen/pull/209)

    New Trackers

    • Neptune.ai (https://github.com/pykeen/pykeen/pull/183)

    Added

    • Add MLFlow set tags function (https://github.com/pykeen/pykeen/pull/139; thanks @sunny1401)
    • Add score_t/h function for ComplEx (https://github.com/pykeen/pykeen/pull/150)
    • Add proper testing for literal datasets and literal models (https://github.com/pykeen/pykeen/pull/199)
    • Checkpoint functionality (https://github.com/pykeen/pykeen/pull/123)
    • Random triple generation (https://github.com/pykeen/pykeen/pull/201)
    • Make negative sampler corruption scheme configurable (https://github.com/pykeen/pykeen/pull/209)
    • Add predict with inverse triples pipeline (https://github.com/pykeen/pykeen/pull/208)
    • Add generalized p-norm to regularizer (https://github.com/pykeen/pykeen/pull/225)

    Changed

    • New harness for resetting parameters (https://github.com/pykeen/pykeen/pull/131)
    • Modularize embeddings (https://github.com/pykeen/pykeen/pull/132)
    • Update first steps documentation (https://github.com/pykeen/pykeen/pull/152; thanks @TobiasUhmann)
    • Switched testing to GitHub Actions (https://github.com/pykeen/pykeen/pull/165 and https://github.com/pykeen/pykeen/pull/194)
    • No longer support Python 3.6
    • Move automatic memory optimization (AMO) option out of model and into training loop (https://github.com/pykeen/pykeen/pull/176)
    • Improve hyper-parameter defaults and HPO defaults (https://github.com/pykeen/pykeen/pull/181 and https://github.com/pykeen/pykeen/pull/179)
    • Switch internal usage to ID-based triples (https://github.com/pykeen/pykeen/pull/193 and https://github.com/pykeen/pykeen/pull/220)
    • Optimize triples splitting algorithm (https://github.com/pykeen/pykeen/pull/187)
    • Generalize metadata storage in triples factory (https://github.com/pykeen/pykeen/pull/211)
    • Add drop_last option to data loader in training loop (https://github.com/pykeen/pykeen/pull/217)

    Fixed

    • Whitelist support in HPO pipeline (https://github.com/pykeen/pykeen/pull/124)
    • Improve evaluator instantiation (https://github.com/pykeen/pykeen/pull/125; thanks @kantholtz)
    • CPU fallback on AMO (https://github.com/pykeen/pykeen/pull/232)
    • Fix HPO save issues (https://github.com/pykeen/pykeen/pull/235)
    • Fix GPU issue in plotting (https://github.com/pykeen/pykeen/pull/207)
  • v1.0.5(Oct 21, 2020)

    Added

    • Added testing on Windows with AppVeyor and documentation for installation on Windows (https://github.com/pykeen/pykeen/pull/95)
    • Add ability to specify custom datasets in HPO and ablation studies (https://github.com/pykeen/pykeen/pull/54)
    • Add functions for plotting entities and relations (as well as an accompanying tutorial) (https://github.com/pykeen/pykeen/pull/99)

    Changed

    • Replaced BCE loss with BCEWithLogits loss (https://github.com/pykeen/pykeen/pull/109)
    • Store default HPO ranges in loss classes (https://github.com/pykeen/pykeen/pull/111)
    • Use entrypoints for datasets (https://github.com/pykeen/pykeen/pull/115) to allow registering of custom datasets
    • Improved WANDB results tracker (https://github.com/pykeen/pykeen/pull/117; thanks @kantholtz)
    • Reorganized ablation study generation and execution (https://github.com/pykeen/pykeen/pull/54)

    Fixed

    • Fixed bug in the initialization of ConvE (https://github.com/pykeen/pykeen/pull/100)
    • Fixed cross-platform issue with random integer generation (https://github.com/pykeen/pykeen/pull/98)
    • Fixed documentation build on ReadTheDocs (https://github.com/pykeen/pykeen/pull/104)
  • v1.0.4(Aug 25, 2020)

  • v1.0.3(Aug 13, 2020)

    Added

    • Side-specific evaluation (https://github.com/pykeen/pykeen/pull/44)
    • Grid Sampler (https://github.com/pykeen/pykeen/pull/52)
    • Weights & Biases Tracker (https://github.com/pykeen/pykeen/pull/68; thanks @migalkin)

    Changed

    • Update to Optuna 2.0 (https://github.com/pykeen/pykeen/pull/52)
    • Generalize specification of tracker (https://github.com/pykeen/pykeen/pull/39)

    Fixed

    • Fix bug in triples factory splitter (https://github.com/pykeen/pykeen/pull/59)
    • Device mismatch bug (https://github.com/pykeen/pykeen/pull/50)
  • v1.0.2(Jul 10, 2020)

    Added

    • Add default values for margin and adversarial temperature in NSSA loss (https://github.com/pykeen/pykeen/pull/29)
    • Added FTP uploader (https://github.com/pykeen/pykeen/pull/35)
    • Add AWS S3 uploader (https://github.com/pykeen/pykeen/pull/39)

    Changed

    • Improved MLflow support (https://github.com/pykeen/pykeen/pull/40)
    • Lots of improvements to documentation!

    Fixed

    • Fix triples factory splitting bug (https://github.com/pykeen/pykeen/pull/21)
    • Fix problem with tensors' device during prediction (https://github.com/pykeen/pykeen/pull/41)
    • Fix RotatE relation embeddings re-initialization (https://github.com/pykeen/pykeen/pull/26)
  • v1.0.1(Jul 2, 2020)

  • v0.0.26(Jun 22, 2020)

    This is the last release before the PyKEEN 1.0 release; be prepared for major changes.

    Note! If you've come this far looking for old releases of PyKEEN, we were unfortunately not able to retain them when we moved the code to this new organization. Please see PyPI for a more complete release history (https://pypi.org/project/pykeen/#history) or the Zenodo record associated with SmartDataAnalytics/PyKEEN.
