GraPE is a Rust/Python library for high-performance Graph Processing and Embedding.


GraPE

GraPE (Graph Processing and Embedding) is a fast graph processing and embedding library, designed to scale with big graphs and to run on both off-the-shelf laptop and desktop computers and High Performance Computing clusters of workstations.

The library is written in the Rust and Python programming languages and has been developed by AnacletoLAB (Dept. of Computer Science of the University of Milan), in collaboration with the RobinsonLab (Jackson Laboratory for Genomic Medicine) and the BPOP (Lawrence Berkeley National Laboratory).

GraPE is composed of two main modules: Ensmallen (ENabler of SMALL runtimE and memory Needs) and Embiggen (EMBeddInG GENerator), that run synergistically using parallel computation and efficient data structures.

Ensmallen efficiently executes graph processing operations, including large-scale first- and second-order random walks, while Embiggen uses the large number of random walks sampled by Ensmallen to compute effective node and edge embeddings. Besides being helpful for unsupervised exploratory analysis of graphs, the computed embeddings can be used for training any of the flexible neural models for edge and node label prediction provided by Embiggen itself.

The following figure shows the main relationships between Ensmallen and Embiggen modules:

images/link_prediction_model.png

Installation of GraPE

On most computers you can install GraPE with pip:

pip install grape

Since Ensmallen is written in Rust, on PyPI we distribute pre-compiled packages for Windows, Linux, and macOS, for Python versions 3.6, 3.7, 3.8, and 3.9 on x86_64 CPUs.

For the Linux binaries we follow Python's ManyLinux2010 (PEP 571) standard, which requires libc version >= 2.12. This version was released on 03/08/2010, so any Linux system from the last ten years should be compatible. To check your current libc version you can run ldd --version.
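
The same check can be sketched from Python with the standard library's `platform.libc_ver()`, which reports the C library the interpreter is linked against. The helper names `parse_version` and `libc_is_compatible` are illustrative assumptions, not part of GraPE.

```python
import platform

# Minimum libc version required by the ManyLinux2010 wheels, as stated above.
MINIMUM_LIBC = (2, 12)


def parse_version(version: str) -> tuple:
    """Turn a version string such as '2.31' into a tuple of ints, e.g. (2, 31)."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def libc_is_compatible(version: str, minimum: tuple = MINIMUM_LIBC) -> bool:
    """Return True when the given libc version string meets the minimum."""
    return parse_version(version) >= minimum


# On non-glibc systems libc_ver() returns empty strings, which this sketch
# simply reports as "not detected" instead of guessing.
libc_name, libc_version = platform.libc_ver()
if libc_version:
    print(libc_name, libc_version, "compatible:", libc_is_compatible(libc_version))
else:
    print("Could not detect a glibc version on this system.")
```

On Windows and macOS the check does not apply, since the libc requirement concerns only the Linux wheels.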

We also assume that the CPU has the following features: sse, sse2, ssse3, sse4_1, sse4_2, avx, avx2, bmi1, bmi2, popcnt. If these features are not present, you cannot use the PyPI pre-compiled binaries and you have to compile Ensmallen manually (Guide). On Linux you can check whether your CPU supports these features by running cat /proc/cpuinfo and ensuring that all of them are present under the flags section. While these features are not strictly required, they significantly speed up execution and should be supported by any x86_64 CPU from Intel's Haswell architecture (2013) onward.
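
As a rough sketch of the manual check described above, the snippet below parses the flags line of /proc/cpuinfo and lists which of the required features are missing. The function name `missing_cpu_features` is a hypothetical helper, not part of GraPE, and the sketch assumes the Linux x86_64 /proc/cpuinfo layout.

```python
from pathlib import Path

# Feature list copied from the paragraph above.
REQUIRED_FEATURES = (
    "sse", "sse2", "ssse3", "sse4_1", "sse4_2",
    "avx", "avx2", "bmi1", "bmi2", "popcnt",
)


def missing_cpu_features(cpuinfo_text: str, required=REQUIRED_FEATURES) -> list:
    """Return the required features absent from the first 'flags' line."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            break
    return [feature for feature in required if feature not in flags]


# Only meaningful on Linux; elsewhere the file does not exist.
cpuinfo_path = Path("/proc/cpuinfo")
if cpuinfo_path.exists():
    missing = missing_cpu_features(cpuinfo_path.read_text())
    print("Missing features:", ", ".join(missing) if missing else "none")
```

A CPU whose flags line lacks avx, avx2, bmi1 and bmi2 (typical of pre-2013 hardware) would be reported as missing exactly those four features.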

If your CPU doesn't support them, you will get a ValueError exception on import with the following message:

This library was compiled assuming that SIMD instruction commonly available in CPU hardware since 2013 are present
on the machine where this library is intended to run.
On the current machine, the flags <MISSING_FLAGS> are not available.
You could still compile Ensmallen on this machine and have a version of the library that can execute here, but the
library has been extensively designed to use SIMD instructions, so you would have a version slower than the one
provided on Pypi.

These requirements were chosen to provide a good tradeoff between compatibility and performance. If your system is not compatible, you can manually compile Ensmallen for any OS, libc version, and CPU architecture (such as Arm, AArch64, RiscV, Mips) supported by Rust and LLVM. Manually compiling Ensmallen might require more than half an hour and around 10 GB of RAM. If you encounter any error during installation or compilation, feel free to open an issue here on GitHub and we will help troubleshoot it.

Main functionalities of the library

  • Robust graph loading and automatic graph retrieval:

    • More than 13000 graphs directly available from the library for benchmarking
    • Support for multiple graph formats
    • Automatic human readable reports of format errors
    • Automatic human readable reports of the main graph characteristics
  • Random walks:

    • Exact and approximated first and second order random walks
    • Massive generation of sampled random walks for graph embedding
    • Automatic dispatching of 8 optimized random walk algorithms depending on the parameters of the random walk and the type (weighted/unweighted) of the graph
  • Node embedding models:

    • SkipGram
    • CBOW
    • GloVe
  • Edge and node prediction models:

    • Perceptron
    • Multi-Layer Perceptron
    • Deep Neural Networks
  • Preprocessing for node embedding and edge prediction:

    • Lazy generation of skip-grams from random walks
    • Lazy generation of balanced batches for edge prediction
    • GloVe co-occurrence matrix computation
  • Graph processing operations:

    • Optimized filtering by node, edge and components characteristics
    • Optimized algebraic set operations on graphs
    • Automatic generation of reports summarizing graph features in natural language
  • Graph algorithms:

    • Breadth and Depth-first search
    • Dijkstra, Tarjan's strongly connected component
    • Efficient Diameter computation, spanning arborescence and connected components
    • Approximated vertex cover, triads counting, transitivity, clustering coefficient and triangles counting
    • Betweenness and stress centrality, Closeness and harmonic centrality
  • Graph visualization tools: visualization of node and edge properties
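
To illustrate what "lazy generation of skip-grams from random walks" means in the list above, here is a minimal pure-Python sketch. It yields (center, context) pairs one at a time instead of materializing them all. The function name `iter_skipgrams` and the `window_size` parameter are assumptions made for illustration; this is not Embiggen's implementation.

```python
from typing import Iterable, Iterator, List, Tuple


def iter_skipgrams(
    walks: Iterable[List[int]], window_size: int = 2
) -> Iterator[Tuple[int, int]]:
    """Lazily yield (center, context) node pairs from random walks.

    Every node in a walk is paired with the nodes that lie at most
    `window_size` steps before or after it in the same walk.
    """
    for walk in walks:
        for position, center in enumerate(walk):
            start = max(0, position - window_size)
            stop = min(len(walk), position + window_size + 1)
            for context_position in range(start, stop):
                if context_position != position:
                    yield center, walk[context_position]


# Pairs are produced on demand, so memory use does not grow with the
# number of walks when the walks themselves come from a generator.
for pair in iter_skipgrams([[0, 1, 2]], window_size=1):
    print(pair)
```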

Tutorials

You can find tutorials covering various aspects of the GraPE library here. All tutorials are as self-contained as possible and can be immediately executed on COLAB.

To get started quickly after installing GraPE from PyPI as described above, you can try running the following example, which uses the SkipGram embedding model on the Cora graph:

from ensmallen.datasets.linqs import Cora
from ensmallen.datasets.linqs.parse_linqs import get_words_data
from embiggen.pipelines import compute_node_embedding
from embiggen.visualizations import GraphVisualization
import matplotlib.pyplot as plt

# Download and load the graph and its node features
graph, node_features = get_words_data(Cora())

# Compute a SkipGram node embedding, using a second-order random walk sampling
node_embedding, training_history = compute_node_embedding(
    graph,
    node_embedding_method_name="SkipGram",
    # Let's increase the probability of exploring the local neighbourhood
    return_weight=2.0,
    explore_weight=0.1
)

# Visualize the obtained node embeddings
visualizer = GraphVisualization(graph, node_embedding_method_name="SkipGram")
visualizer.fit_transform_nodes(node_embedding)

visualizer.plot_node_types()
plt.show()

You can see a tutorial detailing the above script here, and you can run it on COLAB from here.
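
To give an intuition for the return_weight and explore_weight parameters used in the script above, the following pure-Python sketch samples a second-order random walk in the style of node2vec. It assumes that return_weight and explore_weight play the role of the inverses of node2vec's p and q parameters. The function names and the dictionary-of-sets graph are illustrative assumptions and do not reflect Ensmallen's optimized implementation.

```python
import random
from typing import Dict, List, Optional, Set

Graph = Dict[int, Set[int]]


def transition_weights(
    graph: Graph,
    previous: Optional[int],
    current: int,
    return_weight: float = 1.0,
    explore_weight: float = 1.0,
) -> Dict[int, float]:
    """Unnormalized weights for stepping from `current`, given `previous`."""
    weights = {}
    for neighbor in sorted(graph[current]):
        if previous is None:
            weights[neighbor] = 1.0  # first step: plain first-order walk
        elif neighbor == previous:
            weights[neighbor] = return_weight  # go back where we came from
        elif neighbor in graph[previous]:
            weights[neighbor] = 1.0  # stay close to the previous node
        else:
            weights[neighbor] = explore_weight  # move away from it
    return weights


def second_order_walk(
    graph: Graph,
    start: int,
    length: int,
    return_weight: float = 1.0,
    explore_weight: float = 1.0,
    seed: int = 42,
) -> List[int]:
    """Sample one walk with up to `length` nodes starting from `start`."""
    rng = random.Random(seed)
    walk, previous = [start], None
    while len(walk) < length:
        current = walk[-1]
        weights = transition_weights(
            graph, previous, current, return_weight, explore_weight
        )
        if not weights:  # dead end: no neighbors to move to
            break
        step = rng.choices(list(weights), weights=list(weights.values()), k=1)[0]
        previous = current
        walk.append(step)
    return walk


toy_graph = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
print(second_order_walk(toy_graph, start=0, length=6,
                        return_weight=2.0, explore_weight=0.1))
```

With return_weight=2.0 and explore_weight=0.1, steps that return to the previous node or stay among its neighbors are strongly favored over steps that move away, which keeps the walk in the local neighbourhood.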

Documentation

Online documentation

The online documentation of the library is available here. Since Ensmallen is written in Rust, and PyO3 (the crate we use for the Python bindings) doesn't support typing, the documentation is generated from an empty skeleton package. This yields proper documentation, but you won't be able to see the source code in it.

Using the automatic method suggestions utility

To aid working with the library, GraPE provides an integrated recommender system that helps you either find a method or, if a method has been renamed for any reason, find its new name.

As an example, after having loaded the STRING Homo Sapiens graph, the function for computing the connected components can be retrieved by simply typing components as follows:

from ensmallen.datasets.string import HomoSapiens

graph = HomoSapiens()
graph.components

The code above will raise the following error, and will suggest methods with a similar or related name:

AttributeError                            Traceback (most recent call last)
<ipython-input-3-52fac30ac7f6> in <module>()
----> 2 graph.components

AttributeError: The method 'components' does not exists, did you mean one of the following?
* 'remove_components'
* 'connected_components'
* 'strongly_connected_components'
* 'get_connected_components_number'
* 'get_total_edge_weights'
* 'get_mininum_edge_weight'
* 'get_maximum_edge_weight'
* 'get_unchecked_maximum_node_degree'
* 'get_unchecked_minimum_node_degree'
* 'get_weighted_maximum_node_degree'

In our example the method we need for computing the graph components would be connected_components.
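
For intuition only, a suggestion mechanism of this kind can be sketched in plain Python with `difflib.get_close_matches` from the standard library. The `SuggestingGraph` class and the `suggest_methods` helper below are hypothetical stand-ins; GraPE's actual recommender is implemented differently and may rank candidates in another way.

```python
import difflib
from typing import List


def suggest_methods(name: str, candidates: List[str], limit: int = 5) -> List[str]:
    """Return up to `limit` candidate names that resemble `name`."""
    return difflib.get_close_matches(name, candidates, n=limit, cutoff=0.4)


class SuggestingGraph:
    """Toy object that suggests similar method names on a failed lookup."""

    def connected_components(self):
        return "connected components"

    def remove_components(self):
        return "graph without some components"

    def get_node_degrees(self):
        return "node degrees"

    def __getattr__(self, name):
        # __getattr__ runs only when the normal attribute lookup has failed.
        public = [attr for attr in dir(type(self)) if not attr.startswith("_")]
        suggestions = suggest_methods(name, public)
        message = f"The method '{name}' does not exist"
        if suggestions:
            message += ", did you mean one of the following?\n" + "\n".join(
                f"* '{suggestion}'" for suggestion in suggestions
            )
        raise AttributeError(message)


try:
    SuggestingGraph().components
except AttributeError as error:
    print(error)
```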

Now the easiest way to get the method documentation is to use Python's help as follows:

help(graph.connected_components)

The above prints:

connected_components(verbose) method of builtins.Graph instance
Compute the connected components building in parallel a spanning tree using [bader's algorithm](https://www.sciencedirect.com/science/article/abs/pii/S0743731505000882).

**This works only for undirected graphs.**

The returned quadruple contains:
- Vector of the connected component for each node.
- Number of connected components.
- Minimum connected component size.
- Maximum connected component size.

Parameters
----------
verbose: Optional[bool]
    Whether to show a loading bar or not.


Raises
-------
ValueError
    If the given graph is directed.
ValueError
    If the system configuration does not allow for the creation of the thread pool.

You can try to run the code described above on COLAB.
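
To make the returned quadruple concrete, here is a small sequential sketch that computes the same four values for an undirected graph using a union-find structure. It is not the parallel spanning-tree algorithm referenced in the help text. The function signature, which takes a node count and an edge list, is an assumption made for illustration, and the sketch assumes at least one node.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def connected_components(
    number_of_nodes: int, edges: Iterable[Tuple[int, int]]
) -> Tuple[List[int], int, int, int]:
    """Return (component id per node, number of components, min size, max size)."""
    parent = list(range(number_of_nodes))

    def find(node: int) -> int:
        # Follow parent pointers to the root, halving the path on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for source, destination in edges:
        root_source, root_destination = find(source), find(destination)
        if root_source != root_destination:
            # Attach the larger root id under the smaller one.
            parent[max(root_source, root_destination)] = min(
                root_source, root_destination
            )

    # Relabel the roots as consecutive component ids, in order of first appearance.
    component_ids = {}
    components = []
    for node in range(number_of_nodes):
        root = find(node)
        components.append(component_ids.setdefault(root, len(component_ids)))

    sizes = Counter(components)
    return components, len(sizes), min(sizes.values()), max(sizes.values())


print(connected_components(5, [(0, 1), (1, 2), (3, 4)]))
```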

Cite GraPE

Please cite the following paper if it was useful for your research:

@misc{cappelletti2021grape,
  title={GraPE: fast and scalable Graph Processing and Embedding},
  author={Luca Cappelletti and Tommaso Fontana and Elena Casiraghi and Vida Ravanmehr and Tiffany J. Callahan and Marcin P. Joachimiak and Christopher J. Mungall and Peter N. Robinson and Justin Reese and Giorgio Valentini},
  year={2021},
  eprint={2110.06196},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

If you believe that any example may be of help, do feel free to open a GitHub issue describing what we are missing in this tutorial.

Comments
  • TransE error:

    TransE error: "ValueError: One of the provided node embedding computed with the TransE method contains NaN values."

    When generating embeddings for KG-Microbe (KGX edge file from KG-Hub) using TransE, the following error was observed:

    ValueError Traceback (most recent call last) in ----> 1 embedding = model.fit_transform(kg)

    ~/Library/Python/3.7/lib/python/site-packages/cache_decorator/cache.py in wrapped(*args, **kwargs) 595 if not cache_enabled: 596 self.logger.info("The cache is disabled") --> 597 result = function(*args, **kwargs) 598 self._check_return_type_compatability(result, self.cache_path) 599 return result

    ~/Library/Python/3.7/lib/python/site-packages/embiggen/utils/abstract_models/abstract_embedding_model.py in fit_transform(self, graph, return_dataframe, verbose) 164 graph=graph, 165 return_dataframe=return_dataframe, --> 166 verbose=verbose 167 ) 168

    ~/Library/Python/3.7/lib/python/site-packages/embiggen/embedders/ensmallen_embedders/transe.py in _fit_transform(self, graph, return_dataframe, verbose) 112 embedding_method_name=self.model_name(), 113 node_embeddings= node_embedding, --> 114 edge_type_embeddings= edge_type_embedding, 115 ) 116

    ~/Library/Python/3.7/lib/python/site-packages/embiggen/utils/abstract_models/embedding_result.py in init(self, embedding_method_name, node_embeddings, edge_embeddings, node_type_embeddings, edge_type_embeddings) 76 if np.isnan(numpy_embedding).any(): 77 raise ValueError( ---> 78 f"One of the provided {embedding_list_name} " 79 f"computed with the {embedding_method_name} method " 80 "contains NaN values."

    ValueError: One of the provided node embedding computed with the TransE method contains NaN values.

    I am attaching a jupyter notebook to reproduce the problem. load_graph_and.ipynb.zip

    The input edge file is here: https://kg-hub.berkeleybop.io/kg-microbe/current/kg-microbe.tar.gz

    opened by realmarcin 7
  • Need documentation on how to use a knowledge graph in grape

    Hello, I have another question on how to import my data in grape. I think it is more a clarification on my method to import my KG.

    kg = Graph.from_csv(directed=True,
                           edge_path="sample_mabkg.tsv",
                           sources_column_number= 0,
                           edge_list_edge_types_column_number=1,edge_list_separator="|",
                           destinations_column_number=2, name="mAbKG", verbose=True, edge_list_header=True)
    

    However, I saw that there is also a node_path parameter, with properties similar to those of edge_path, so I don't know whether I used from_csv correctly. Could you please explain, knowing that I have a KG with typed edges and nodes? Below is an example of my data.

    Thank you for your answer

    Gaoussou

     node source|edge|node destination
    _:B4dff5e7d17225b25b13ad12737e49779|imgt:isDecidedBy|imgt:EC
    pubmed:2843774|dc:title|Selective killing of HIV-infected cells by recombinant human CD4-Pseudomonas exotoxin hybrid protein.
    imgt:Product_8e9250cf-276a-3282-954f-3791316ac5a6|rdf:type|obo:NCIT_C51980
    imgt:Segment_212_1|obo:BFO_0000050|imgt:Construct_212
    imgt:IgG4-kappa_1001|rdfs:label|IgG4-kappa_1001
    imgt:V-D-GENE|owl:sameAs|obo:SO_0000510
    imgt:Segment_536_1|rdf:type|imgt:Segment
    imgt:LRR13|rdf:type|imgt:RepeatLabel
    imgt:StudyProduct_c2bc9b3a-a15e-376f-bda5-f87089b3f54b|imgt:application_type|Therapeutic
    imgt:StudyProduct_54a14ca8-f916-338b-af18-d079beb598a4|imgt:development_technology|  Dyax human antibody phage display library 
    

    sample_mabkg.txt

    opened by gsanou 6
  • embiggen package error under Windoze

    The joy on installation on Windoze...

    Collecting embiggen>=0.11.9
      Downloading embiggen-0.11.38.tar.gz (154 kB)
         ---------------------------------------- 154.2/154.2 kB ? eta 0:00:00
      Preparing metadata (setup.py) ... error
      error: subprocess-exited-with-error
    
      × python setup.py egg_info did not run successfully.
      │ exit code: 1
      ╰─> [10 lines of output]
          Traceback (most recent call last):
            File "<string>", line 2, in <module>
            File "<pip-setuptools-caller>", line 34, in <module>
            File "C:\cygwin64\tmp\pip-install-37lyy1_b\embiggen_3ec9ca91df6044b1b2470bb84cb6184d\setup.py", line 54, in <module>
              long_description=readme(),
            File "C:\cygwin64\tmp\pip-install-37lyy1_b\embiggen_3ec9ca91df6044b1b2470bb84cb6184d\setup.py", line 12, in readme
              return f.read()
            File "C:\Users\richa\AppData\Local\Programs\Python\Python39\lib\encodings\cp1252.py", line 23, in decode
              return codecs.charmap_decode(input,self.errors,decoding_table)[0]
          UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 2: character maps to <undefined>
          [end of output]
    
      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: metadata-generation-failed
    
    × Encountered error while generating package metadata.
    ╰─> See above for output.
    
    note: This is an issue with the package mentioned above, not pip.
    hint: See above for details.
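
Judging only from the traceback above, a plausible cause is that setup.py reads the README with the platform default encoding (cp1252 on this machine), which cannot decode byte 0x8d. The sketch below reproduces that failure mode on made-up content and shows the usual remedy of passing an explicit encoding to `open`. The file name, the sample text, and the `read_readme` helper are assumptions for illustration, not the package's actual code.

```python
import tempfile
from pathlib import Path


def read_readme(path) -> str:
    """Read a README as UTF-8 regardless of the platform default encoding."""
    with open(path, encoding="utf-8") as readme_file:
        return readme_file.read()


# Hypothetical README content: the UTF-8 encoding of U+1F347 is the byte
# sequence F0 9F 8D 87, whose third byte (0x8d) is undefined in cp1252.
raw_bytes = "\U0001F347 GraPE".encode("utf-8")

try:
    raw_bytes.decode("cp1252")
    decodable_with_cp1252 = True
except UnicodeDecodeError as error:
    decodable_with_cp1252 = False
    print("cp1252 fails:", error)

with tempfile.TemporaryDirectory() as directory:
    readme_path = Path(directory) / "README.rst"
    readme_path.write_bytes(raw_bytes)
    readme_text = read_readme(readme_path)

print("utf-8 succeeds:", len(readme_text), "characters read")
```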
    
    
    opened by RichardBruskiewich 6
  • Bipartite Graph predict proba with undirected graph

    Hi. I noticed the performance metrics are not identical when using predict_proba_bipartite_graph_from_edge_node_types, when I swap the source and destination nodes. The graph used as input is an undirected graph, which I would expect would yield similar predictions for the same edge type regardless of which is source and destination nodes. Is this behavior intentional?

    Below are the version of the software I am running currently: grape==0.1.17 embiggen==0.11.27 ensmallen==0.8.14

    opened by arpelletier 6
  • ImportError: libgfortran-ed201abd.so.3.0.0: cannot open shared object file: No such file or directory

    In a fresh notebook, attempting to import grape yields an ImportError about a missing libgfortran-ed201abd.so.3.0.0.

    >>> !pip install grape -U
    >>> import grape
    /usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.8) or chardet (3.0.4) doesn't match a supported version!
      warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
    Output exceeds the [size limit](command:workbench.action.openSettings?[). Open the full output data [in a text editor](command:workbench.action.openLargeOutput?276a3afe-1b97-4f33-82e6-6df2db01934a)
    ---------------------------------------------------------------------------
    ImportError                               Traceback (most recent call last)
    /home/harry/kg-bioportal/data/merged/KG-Bioportal analysis.ipynb Cell 2' in <cell line: 1>()
    ----> [1](vscode-notebook-cell://wsl%2Bubuntu-20.04/home/harry/kg-bioportal/data/merged/KG-Bioportal%20analysis.ipynb#ch0000001vscode-remote?line=0) import grape
    
    File ~/.local/lib/python3.8/site-packages/grape/__init__.py:9, in <module>
          1 """GraPE main module.
          2 
          3 For now, this is a simple wrapper of GraPE main two sub-modules that for
       (...)
          6 These packages are mimed here by the two sub-directories, ensmallen and embiggen.
          7 """
    ----> 9 from embiggen import *
         10 from ensmallen import Graph
         13 def import_all(module_locals):
    
    File ~/.local/lib/python3.8/site-packages/embiggen/__init__.py:2, in <module>
          1 """Module with models for graph machine learning and visualization."""
    ----> 2 from embiggen.visualizations import GraphVisualizer
          3 from embiggen.utils import (
          4     EmbeddingResult,
          5     get_models_dataframe,
       (...)
          9     get_available_models_for_node_embedding,
         10 )
    ...
        691     'spherical_kn',
        692 ]
        694 from scipy._lib._testutils import PytestTester
    
    ImportError: libgfortran-ed201abd.so.3.0.0: cannot open shared object file: No such file or directory
    

    I've seen that this may be related to libraries packaged with numpy, as seen in the following: https://github.com/ContinuumIO/anaconda-issues/issues/445 https://github.com/numpy/numpy/issues/14348

    This may be environment-specific, of course.

    opened by caufieldjh 6
  • `Illegal instruction (core dumped)` on importing grape

    In another issue that may have something to do with our aging build server: When we import grape in this environment (see info below), we get only Illegal instruction (core dumped).

    cpuinfo output:

    processor       : 23
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 44
    model name      : Intel(R) Xeon(R) CPU           X5675  @ 3.07GHz
    stepping        : 2
    microcode       : 0x1f
    cpu MHz         : 1599.987
    cache size      : 12288 KB
    physical id     : 1
    siblings        : 12
    core id         : 10
    cpu cores       : 6
    apicid          : 53
    initial apicid  : 53
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 11
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida arat flush_l1d
    bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
    bogomips        : 6133.21
    clflush size    : 64
    cache_alignment : 64
    address sizes   : 40 bits physical, 48 bits virtual
    power management:
    
    opened by caufieldjh 4
  • Link to the two sub packages

    Hi, first of all, thanks for making such an amazing graph embedding resource!

    I'm wondering whether you can add some descriptions in the README clarifying that this repo is a thin wrapper of the two core packages embiggen and ensmallen and add links accordingly. I was a bit confused for a few minutes trying to find the source code and only came to realize it wraps the two libraries after looking at __init__.py.

    opened by RemyLau 4
  • pip install grape failure on support_luca>=1.0.2

    I am attempting to install grape using pip on Ubuntu 20.04.4 LTS with python 3.8.3.

    Most of the build/install appears to work just fine until I hit this error, providing a little additional context. I have also tried to install ensmallen directly with pip install ensmallen and I get the same error. Any advice you have would be appreciated.

    Requirement already satisfied: idna<3,>=2.5 in /home/corey/anaconda3/lib/python3.8/site-packages (from requests->bioregistry>=0.5.65->ensmallen>=0.8.21->grape) (2.10)
    Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /home/corey/anaconda3/lib/python3.8/site-packages (from requests->bioregistry>=0.5.65->ensmallen>=0.8.21->grape) (1.25.9)
    Requirement already satisfied: certifi>=2017.4.17 in /home/corey/anaconda3/lib/python3.8/site-packages (from requests->bioregistry>=0.5.65->ensmallen>=0.8.21->grape) (2020.6.20)
    Requirement already satisfied: chardet<4,>=3.0.2 in /home/corey/anaconda3/lib/python3.8/site-packages (from requests->bioregistry>=0.5.65->ensmallen>=0.8.21->grape) (3.0.4)
    Collecting typing-extensions>=3.7.4.3
      Using cached typing_extensions-4.3.0-py3-none-any.whl (25 kB)
    ERROR: Could not find a version that satisfies the requirement support_luca>=1.0.2 (from dict_hash>=1.1.25->cache_decorator>=2.1.11->ensmallen>=0.8.21->grape) (from versions: none)
    ERROR: No matching distribution found for support_luca>=1.0.2 (from dict_hash>=1.1.25->cache_decorator>=2.1.11->ensmallen>=0.8.21->grape)
    
    opened by amc-corey-cox 4
  • Graph visualization error

    Hello. I am trying the Using CBOW to embed Cora python notebook (linked) and after replacing "CBOWEnsmallen" with "DeepWalkCBOWEnsmallen", the first order embedding runs successfully but fails at the graph visualization. I get the following error:

    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    /tmp/ipykernel_3453/3695275499.py in <module>
    ----> 1 GraphVisualizer(
          2     graph,
          3     node_embedding_method_name="CBOW - First order"
          4 ).fit_and_plot_all(first_embedding)
    
    ~/anaconda3/lib/python3.9/site-packages/embiggen/visualizations/graph_visualizer.py in fit_and_plot_all(self, node_embedding, number_of_columns, show_letters, include_distribution_plots, **node_embedding_kwargs)
       4236         distribution_plot_methods_to_call = []
       4237 
    -> 4238         if not self._graph.has_constant_non_zero_node_degrees():
       4239             node_scatter_plot_methods_to_call.append(
       4240                 self.plot_node_degrees,
    
    AttributeError: The method 'has_constant_non_zero_node_degrees' does not exists, did you mean one of the following?
    * 'has_constant_edge_weights'
    * 'get_non_zero_subgraph_node_degrees'
    * 'has_nodes'
    * 'has_edges'
    * 'has_selfloops'
    * 'has_node_ontologies'
    * 'has_node_oddities'
    * 'get_node_degrees'
    * 'has_node_name'
    * 'has_node_types'
    

    Looks like the issue has to do with embiggen dependencies in the graph visualization. Below are the package versions I am using: embiggen==0.11.13 ensmallen==0.8.7 grape==0.1.9

    As well, I was not able to successfully run the second-order embeddings

    model = DeepWalkCBOWEnsmallen(
        return_weight=2.0,
        explore_weight=0.1
    )
    second_embedding = model.fit_transform(graph).get_node_embedding_from_index(0)
    

    The above code gives the below error:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    /tmp/ipykernel_3453/3112314827.py in <module>
    ----> 1 model = DeepWalkCBOWEnsmallen(
          2     return_weight=2.0,
          3     explore_weight=0.1
          4 )
          5 second_embedding = model.fit_transform(graph).get_node_embedding_from_index(0)
    
    TypeError: __init__() got an unexpected keyword argument 'return_weight'
    opened by arpelletier 4
  • Embedding model names not recognized; alternate suggestions are unexpected

    As of grape 0.1.9, node embedding model names have changed, such that a call to embiggen's AbstractModel.get_task_data(model_name, task_name) with one of the frequently used model names like CBOW or SkipGram throws a ValueError.

    I see from grape.get_available_models_for_node_embedding() that these now have more specific names like Node2Vec CBOW. No problem with being specific, but we'd still like to be able to specify CBOW, SkipGram, or GloVe in config definitions without having to verify the exact model names embiggen is expecting first. Could we use the short names as aliases to a default model, like CBOW will be understood as Node2Vec CBOW, etc?

    The name convention also appears to confuse the alternative suggests provided in the ValueError text, so we get suggestions like this:

    ValueError: The provided model name `CBOW` is not available. Did you mean BoxE?
    
    opened by caufieldjh 4
  • ValueError when trying to use external embedder like in pykeen and karateClub

    Hello, thank you for your amazing work. I'm a PhD student working on embeddings of biomedical data, particularly in immunogenetics, and I'm currently comparing tools to embed data. I found your work very interesting. I ran into some issues when I tried to use external models from PyKEEN and Karate Club. I got this message:

    ValueError: We have found an useless method in the class StubClass, implementing method HolE from library PyKEEN and task Node Embedding. It does not make sense to implement the `requires_positive_edge_weights` method when the `can_use_edge_weights` always returns False, as it is already handled in the root abstract model class.

    Also, for the visualization, when I ran

    from grape import GraphVisualizer
    visualizer = GraphVisualizer(kg.remove_disconnected_nodes())
    visualizer.fit_and_plot_all(embedding)

    I got this warning and no visualization:

    FutureWarning: The parameter `square_distances` has not effect and will be removed in version 1.3.

    Thank you in advance for your answer.
    Gaoussou
    opened by gsanou 3
  • Use case regarding Customer Analytics or Community detection?

    Thanks for that repo. It seems that you have integrated several tools / libraries / approaches under Grape's hood. Do you intend to create a tutorial for a customer analytics recommendation?

    Thanks in advance.

    opened by stkarlos 2
  • Parallelized Embedding

    Hey, I'm trying to process a directed graph with about 5 million nodes and 100 million edges. I've managed to load the graph from a CSV file, and I get a very nice Graph object (within 5 minutes). I'm now trying to embed the graph with grape.embedders.Node2VecSkipGramEnsmallen, but it doesn't seem to succeed; I've let it run for over 10 hours. To make it faster, I enabled the Graph's vector_source, vector_cumulative_node_degree and vector_reciprocal_sqrt_degrees. From your paper, it seems that the embedding process can be parallelized, but I can't find a way to do that. I'd appreciate it if you could describe which parts of the embedding process are parallelized, and how I can make it run in parallel. Thank you, Bruria.

    opened by bruriah1999 2
  • Getting figure to be inline

    matplotlib plots figures inline by default or if we write

    %matplotlib inline
    

    Some of the figures produced by GRAPE get put into "subwindows" in the Jupyter notebook, and one needs to scroll up and down to see the entire figure. GRAPE does not seem to be responsive to the inline magic command above either.

    For instance, for a certain figure to really appear inline, I need to make it much smaller

    visualizer = GraphVisualizer(sli_graph, automatically_display_on_notebooks=False)
    fig, ax, cap = visualizer.plot_node_degree_distribution()
    fig.set_figheight(3)
    fig.set_figwidth(3)
    

    even though the notebook could comfortably show (5,5) or even (8,8)

    opened by pnrobinson 2
  • Saving classifier models

    Could support for saving classifier models please be added? This came up while meeting with @LucaCappelletti94 recently but it's become relevant again in the course of updating neat-ml to use grape classifiers.

    Training classifiers isn't a major time commitment, but on our neat runs we've separated the process of training+testing vs. applying classifiers, so being unable to save or at least pickle the classifier object means we need to redo training for each model.

    opened by caufieldjh 4
  • Methods for generating node embeddings from word embeddings

    While updating NEAT to use the most recent grape release, @justaddcoffee and @hrshdhgd and I took a look at what we're using to generate node embeddings based on pretrained word embeddings like BERT etc. : https://github.com/Knowledge-Graph-Hub/NEAT/blob/main/neat/graph_embedding/graph_embedding.py

    We know we can run something like get_okapi_tfidf_weighted_textual_embedding() on a graph, but is there a more "on demand" way to run this in grape now for an arbitrary graph?

    opened by caufieldjh 10
Releases(0.0.6.dev1)
Owner
AnacletoLab
Computational Biology and Bioinformatics Lab - Dept. of Computer Science - UNIMI

Danijar Hafner 17 Sep 25, 2022
Minimal deep learning library written from scratch in Python, using NumPy/CuPy.

SmallPebble Project status: experimental, unstable. SmallPebble is a minimal/toy automatic differentiation/deep learning library written from scratch

Sidney Radcliffe 92 Dec 30, 2022
用强化学习DQN算法,训练AI模型来玩合成大西瓜游戏,提供Keras版本和PARL(paddle)版本

用强化学习玩合成大西瓜 代码地址:https://github.com/Sharpiless/play-daxigua-using-Reinforcement-Learning 用强化学习DQN算法,训练AI模型来玩合成大西瓜游戏,提供Keras版本、PARL(paddle)版本和pytorch版本

72 Dec 17, 2022
A Python package for performing pore network modeling of porous media

Overview of OpenPNM OpenPNM is a comprehensive framework for performing pore network simulations of porous materials. More Information For more detail

PMEAL 336 Dec 30, 2022
[CVPR'21] Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration

Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration This repository contains the implementation of our paper Locally Aware Pi

sfwang 70 Dec 19, 2022
Bio-Computing Platform Featuring Large-Scale Representation Learning and Multi-Task Deep Learning “螺旋桨”生物计算工具集

English | 简体中文 Latest News 2021.10.25 Paper "Docking-based Virtual Screening with Multi-Task Learning" is accepted by BIBM 2021. 2021.07.29 PaddleHeli

633 Jan 04, 2023
PyTorch implementations of Top-N recommendation, collaborative filtering recommenders.

PyTorch implementations of Top-N recommendation, collaborative filtering recommenders.

Yoonki Jeong 129 Dec 22, 2022
The official implementation of Equalization Loss v1 & v2 (CVPR 2020, 2021) based on MMDetection.

The Equalization Losses for Long-tailed Object Detection and Instance Segmentation This repo is official implementation CVPR 2021 paper: Equalization

Jingru Tan 129 Dec 16, 2022