GraPE is a Rust/Python library for high-performance Graph Processing and Embedding.

Last update: Dec 29, 2022

Overview

GraPE

GraPE (Graph Processing and Embedding) is a fast graph processing and embedding library, designed to scale with big graphs and to run on both off-the-shelf laptop and desktop computers and High Performance Computing clusters of workstations.

The library is written in the Rust and Python programming languages and has been developed by AnacletoLAB (Dept. of Computer Science of the University of Milan), in collaboration with the RobinsonLab (Jackson Laboratory for Genomic Medicine) and the BPOP (Lawrence Berkeley National Laboratory).

GraPE is composed of two main modules: Ensmallen (ENabler of SMALL runtimE and memory Needs) and Embiggen (EMBeddInG GENerator), that run synergistically using parallel computation and efficient data structures.

Ensmallen efficiently executes graph processing operations, including large-scale first- and second-order random walks, while Embiggen uses the large number of random walks sampled by Ensmallen to compute effective node and edge embeddings. Besides being helpful for unsupervised exploratory analysis of graphs, the computed embeddings can be used for training any of the flexible neural models for edge and node label prediction provided by Embiggen itself.

The following figure shows the main relationships between Ensmallen and Embiggen modules:

Installation of GraPE

For most computers you can just download it using pip:

pip install grape

Since Ensmallen is written in Rust, on PyPI we distribute pre-compiled packages for Windows, Linux and macOS, for Python versions 3.6, 3.7, 3.8 and 3.9 on x86_64 CPUs.

For the Linux binaries we follow Python's ManyLinux2010 (PEP 571) standard, which requires libc version >= 2.12. This version was released on 03/08/2010, so any Linux system from the last ten years should be compatible. To check your current libc version you can run ldd --version.
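The same check can also be done from Python. The following is a minimal sketch using only the standard library; the helper names parse_version and libc_is_compatible are hypothetical and are not part of GraPE.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.31' into a tuple of integers."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def libc_is_compatible(libc_name, libc_version, minimum=(2, 12)):
    """Return True when the C library is glibc at the minimum version or newer."""
    return libc_name == "glibc" and parse_version(libc_version) >= minimum


# platform.libc_ver() returns a (name, version) pair such as ('glibc', '2.31').
# When the C library cannot be identified, both entries are empty strings.
name, version = platform.libc_ver()
print("Detected C library:", name or "unknown", version)
print("Meets the libc >= 2.12 requirement:", libc_is_compatible(name, version))
```

On systems that do not use glibc, the check simply reports False. That only means the Linux requirement above does not apply as stated, not that GraPE cannot be installed.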

We also assume that the CPU has the following features: sse, sse2, ssse3, sse4_1, sse4_2, avx, avx2, bmi1, bmi2, popcnt. If these features are not present, you cannot use the PyPI pre-compiled binaries and you have to compile Ensmallen manually (Guide). On Linux you can check whether your CPU supports these features by running cat /proc/cpuinfo and ensuring that all of them are listed in the flags section. While these features are not strictly required, they significantly speed up execution and should be supported by any x86_64 CPU from Intel's Haswell architecture (2013) onwards.
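Reading the flags section by eye is error-prone, so the check can be scripted. This is a minimal sketch assuming a Linux system that exposes /proc/cpuinfo; the function name missing_cpu_flags is hypothetical and is not part of GraPE.

```python
# The features listed above, spelled as they appear in /proc/cpuinfo.
REQUIRED_FLAGS = (
    "sse", "sse2", "ssse3", "sse4_1", "sse4_2",
    "avx", "avx2", "bmi1", "bmi2", "popcnt",
)


def missing_cpu_flags(cpuinfo_text, required=REQUIRED_FLAGS):
    """Return the required flags that are absent from /proc/cpuinfo-style text."""
    available = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            available.update(value.split())
    return [flag for flag in required if flag not in available]


try:
    with open("/proc/cpuinfo", encoding="utf-8") as cpuinfo:
        missing = missing_cpu_flags(cpuinfo.read())
    print("Missing flags:", ", ".join(missing) or "none")
except OSError:
    print("/proc/cpuinfo is not available on this system (it is Linux-specific).")
```

If the list of missing flags is not empty, the pre-compiled binaries are not expected to work on that machine and Ensmallen has to be compiled manually.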

If your CPU doesn't support them, you will get a ValueError exception on import with the following message:

This library was compiled assuming that SIMD instruction commonly available in CPU hardware since 2013 are present
on the machine where this library is intended to run.
On the current machine, the flags <MISSING_FLAGS> are not available.
You could still compile Ensmallen on this machine and have a version of the library that can execute here, but the
library has been extensively designed to use SIMD instructions, so you would have a version slower than the one
provided on Pypi.

These requirements were chosen to provide a good tradeoff between compatibility and performance. If your system is not compatible, you can manually compile Ensmallen for any OS, libc version, and CPU architecture (such as Arm, AArch64, RiscV, Mips) supported by Rust and LLVM. Manually compiling Ensmallen might require more than half an hour and around 10 GB of RAM. If you encounter any error during the installation and/or compilation, feel free to open an issue on GitHub and we will help troubleshoot it.

Main functionalities of the library

Robust graph loading and automatic graph retrieval:
- More than 13000 graphs directly available from the library for benchmarking
- Support for multiple graph formats
- Automatic human readable reports of format errors
- Automatic human readable reports of the main graph characteristics
Random walks:
- Exact and approximated first and second order random walks
- Massive generation of sampled random walks for graph embedding
- Automatic dispatching of 8 optimized random walk algorithms depending on the parameters of the random walk and the type (weighted/unweighted) of the graph
Node embedding models:
- SkipGram
- CBOW
- GloVe
Edge and node prediction models:
- Perceptron
- Multi-Layer Perceptron
- Deep Neural Networks
Preprocessing for node embedding and edge prediction:
- Lazy generation of skip-grams from random walks
- Lazy generation of balanced batches for edge prediction
- GloVe co-occurrence matrix computation
Graph processing operations:
- Optimized filtering by node, edge and components characteristics
- Optimized algebraic set operations on graphs
- Automatic generation of reports summarizing graph features in natural language
Graph algorithms:
- Breadth and Depth-first search
- Dijkstra, Tarjan's strongly connected component
- Efficient Diameter computation, spanning arborescence and connected components
- Approximated vertex cover, triads counting, transitivity, clustering coefficient and triangles counting
- Betweenness and stress centrality, Closeness and harmonic centrality
Graph visualization tools: visualization of node and edge properties
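
As an illustration of what "lazy generation of skip-grams from random walks" means in the list above, here is a minimal pure-Python sketch. It is not the library's implementation: the function name lazy_skipgrams and the window_size parameter are assumptions made for this example. The point is only that a generator yields (central node, context node) pairs one at a time, so the full set of pairs never has to be held in memory.

```python
def lazy_skipgrams(walks, window_size=2):
    """Yield (central node, context node) pairs one at a time from an iterable of walks."""
    for walk in walks:
        for position, central in enumerate(walk):
            start = max(0, position - window_size)
            stop = min(len(walk), position + window_size + 1)
            for context_position in range(start, stop):
                if context_position != position:
                    yield central, walk[context_position]


# Two toy walks over node IDs; in practice these would come from a walk sampler.
walks = [[0, 1, 2, 3], [3, 2, 0]]
pairs = lazy_skipgrams(walks, window_size=1)

# Only the first pair is computed here; the others are produced on demand.
print(next(pairs))
print(sum(1 for _ in pairs), "further pairs were generated lazily")
```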

Tutorials

You can find tutorials covering various aspects of the GraPE library here. All tutorials are as self-contained as possible and can be immediately executed on COLAB.

If you want to get started quickly, after having installed GraPE from PyPI as described above, you can try running the following example using the SkipGram embedding model on the Cora graph:

from ensmallen.datasets.linqs import Cora
from ensmallen.datasets.linqs.parse_linqs import get_words_data
from embiggen.pipelines import compute_node_embedding
from embiggen.visualizations import GraphVisualization
import matplotlib.pyplot as plt

# Download and load the graph and its node features
graph, node_features = get_words_data(Cora())

# Compute a SkipGram node embedding, using a second-order random walk sampling
node_embedding, training_history = compute_node_embedding(
    graph,
    node_embedding_method_name="SkipGram",
    # Let's increase the probability of exploring the local neighbourhood
    return_weight=2.0,
    explore_weight=0.1
)

# Visualize the obtained node embeddings
visualizer = GraphVisualization(graph, node_embedding_method_name="SkipGram")
visualizer.fit_transform_nodes(node_embedding)

visualizer.plot_node_types()
plt.show()

You can see a tutorial detailing the above script here, and you can run it on COLAB from here.
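
To make the two weights used in the script above more concrete, here is a minimal pure-Python sketch of a second-order random walk. It is not Ensmallen's Rust implementation. It assumes, following the Node2Vec formulation, that return_weight plays the role of 1/p and explore_weight the role of 1/q; the function name second_order_walk and the adjacency-dictionary format are hypothetical.

```python
import random


def second_order_walk(adjacency, start, length,
                      return_weight=1.0, explore_weight=1.0, rng=None):
    """Sample one walk whose next step depends on the current and the previous node."""
    rng = rng or random.Random()
    walk = [start]
    previous = None
    while len(walk) < length:
        current = walk[-1]
        neighbours = adjacency[current]
        if not neighbours:
            break  # dead end: the walk cannot continue
        if previous is None:
            # The first step has no history, so it is a plain first-order step.
            chosen = rng.choice(neighbours)
        else:
            weights = []
            for candidate in neighbours:
                if candidate == previous:
                    weights.append(return_weight)   # step back to where we came from
                elif candidate in adjacency[previous]:
                    weights.append(1.0)             # stay close to the previous node
                else:
                    weights.append(explore_weight)  # move further away
            chosen = rng.choices(neighbours, weights=weights, k=1)[0]
        previous = current
        walk.append(chosen)
    return walk


# A small undirected toy graph, stored as node -> list of neighbours.
toy_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(second_order_walk(toy_graph, start=0, length=6,
                        return_weight=2.0, explore_weight=0.1,
                        rng=random.Random(42)))
```

With a high return weight and a low explore weight, as in the script above, the sampled walks tend to stay in the local neighbourhood of the starting node.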

Documentation

Online documentation

The online documentation of the library is available here. Since Ensmallen is written in Rust and PyO3 (the crate we use for the Python bindings) doesn't support typing, the documentation is obtained by generating an empty skeleton package. This gives proper documentation, but you won't be able to see the source code in it.

Using the automatic method suggestions utility

To aid working with the library, GraPE provides an integrated recommender system that helps you either find a method or, if a method has been renamed for any reason, find its new name.

As an example, after having loaded the STRING Homo Sapiens graph, the function for computing the connected components can be retrieved by simply typing components as follows:

from ensmallen.datasets.string import HomoSapiens

graph = HomoSapiens()
graph.components

The code above will raise the following error, and will suggest methods with a similar or related name:

AttributeError                            Traceback (most recent call last)
<ipython-input-3-52fac30ac7f6> in <module>()
----> 2 graph.components

AttributeError: The method 'components' does not exists, did you mean one of the following?
* 'remove_components'
* 'connected_components'
* 'strongly_connected_components'
* 'get_connected_components_number'
* 'get_total_edge_weights'
* 'get_mininum_edge_weight'
* 'get_maximum_edge_weight'
* 'get_unchecked_maximum_node_degree'
* 'get_unchecked_minimum_node_degree'
* 'get_weighted_maximum_node_degree'

In our example the method we need for computing the graph components would be connected_components.
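
For readers curious about how such a suggestion mechanism can work in general, here is a minimal sketch. It is not GraPE's implementation: it uses the standard library's difflib string similarity, and the class SuggestingGraph with its three placeholder methods is hypothetical.

```python
import difflib


class SuggestingGraph:
    """Hypothetical stand-in for a graph object with a few public methods."""

    def connected_components(self):
        return []

    def remove_components(self):
        return self

    def get_node_degrees(self):
        return []

    def __getattr__(self, name):
        # __getattr__ is only called after the normal attribute lookup has failed.
        candidates = [
            method for method in dir(type(self)) if not method.startswith("_")
        ]
        suggestions = difflib.get_close_matches(name, candidates, n=10, cutoff=0.5)
        hints = "".join(f"\n* '{suggestion}'" for suggestion in suggestions)
        raise AttributeError(
            f"The method '{name}' does not exist, "
            f"did you mean one of the following?{hints}"
        )


try:
    SuggestingGraph().components
except AttributeError as error:
    print(error)
```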

Now the easiest way to get the method documentation is to use Python's help as follows:

help(graph.connected_components)

The above returns:

connected_components(verbose) method of builtins.Graph instance
Compute the connected components building in parallel a spanning tree using [bader's algorithm](https://www.sciencedirect.com/science/article/abs/pii/S0743731505000882).

**This works only for undirected graphs.**

The returned quadruple contains:
- Vector of the connected component for each node.
- Number of connected components.
- Minimum connected component size.
- Maximum connected component size.

Parameters
----------
verbose: Optional[bool]
    Whether to show a loading bar or not.


Raises
-------
ValueError
    If the given graph is directed.
ValueError
    If the system configuration does not allow for the creation of the thread pool.

You can try to run the code described above on COLAB.
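
To clarify the shape of the returned quadruple, here is a minimal pure-Python sketch that computes the same four values for a small undirected graph. It uses a sequential breadth-first search, not the parallel spanning-tree algorithm cited in the documentation above, and the function signature (a node count plus a list of edges) is an assumption made for this example.

```python
from collections import Counter, deque


def connected_components(number_of_nodes, edges):
    """Return (component ID per node, number of components, min size, max size)."""
    neighbours = [[] for _ in range(number_of_nodes)]
    for source, destination in edges:
        neighbours[source].append(destination)
        neighbours[destination].append(source)

    component_of = [None] * number_of_nodes
    number_of_components = 0
    for root in range(number_of_nodes):
        if component_of[root] is not None:
            continue  # this node was already reached from an earlier root
        component_of[root] = number_of_components
        queue = deque([root])
        while queue:
            node = queue.popleft()
            for neighbour in neighbours[node]:
                if component_of[neighbour] is None:
                    component_of[neighbour] = number_of_components
                    queue.append(neighbour)
        number_of_components += 1

    sizes = Counter(component_of).values()
    return (
        component_of,
        number_of_components,
        min(sizes, default=0),
        max(sizes, default=0),
    )


# Five nodes split into a triangle-free path 0-1-2 and a single edge 3-4.
components, total, smallest, largest = connected_components(
    5, [(0, 1), (1, 2), (3, 4)]
)
print(components, total, smallest, largest)
```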

Cite GraPE

Please cite the following paper if it was useful for your research:

@misc{cappelletti2021grape,
  title={GraPE: fast and scalable Graph Processing and Embedding},
  author={Luca Cappelletti and Tommaso Fontana and Elena Casiraghi and Vida Ravanmehr and Tiffany J. Callahan and Marcin P. Joachimiak and Christopher J. Mungall and Peter N. Robinson and Justin Reese and Giorgio Valentini},
  year={2021},
  eprint={2110.06196},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

If you believe that any example may be of help, do feel free to open a GitHub issue describing what we are missing in this tutorial.

Comments

TransE error: "ValueError: One of the provided node embedding computed with the TransE method contains NaN values."

When generating embeddings for KG-Microbe (KGX edge file from KG-Hub) using TransE, the following error was observed:

ValueError Traceback (most recent call last) in
----> 1 embedding = model.fit_transform(kg)

~/Library/Python/3.7/lib/python/site-packages/cache_decorator/cache.py in wrapped(*args, **kwargs)
    595 if not cache_enabled:
    596 self.logger.info("The cache is disabled")
--> 597 result = function(*args, **kwargs)
    598 self._check_return_type_compatability(result, self.cache_path)
    599 return result

~/Library/Python/3.7/lib/python/site-packages/embiggen/utils/abstract_models/abstract_embedding_model.py in fit_transform(self, graph, return_dataframe, verbose)
    164 graph=graph,
    165 return_dataframe=return_dataframe,
--> 166 verbose=verbose
    167 )
    168

~/Library/Python/3.7/lib/python/site-packages/embiggen/embedders/ensmallen_embedders/transe.py in _fit_transform(self, graph, return_dataframe, verbose)
    112 embedding_method_name=self.model_name(),
    113 node_embeddings= node_embedding,
--> 114 edge_type_embeddings= edge_type_embedding,
    115 )
    116

~/Library/Python/3.7/lib/python/site-packages/embiggen/utils/abstract_models/embedding_result.py in __init__(self, embedding_method_name, node_embeddings, edge_embeddings, node_type_embeddings, edge_type_embeddings)
     76 if np.isnan(numpy_embedding).any():
     77 raise ValueError(
---> 78 f"One of the provided {embedding_list_name} "
     79 f"computed with the {embedding_method_name} method "
     80 "contains NaN values."

ValueError: One of the provided node embedding computed with the TransE method contains NaN values.

I am attaching a jupyter notebook to reproduce the problem. load_graph_and.ipynb.zip

The input edge file is here: https://kg-hub.berkeleybop.io/kg-microbe/current/kg-microbe.tar.gz

opened by realmarcin 7

Need documentation on how to use a knowledge graph in grape

Hello, I have another question on how to import my data in grape. I think it is more a clarification on my method to import my KG.

kg = Graph.from_csv(directed=True,
                       edge_path="sample_mabkg.tsv",
                       sources_column_number= 0,
                       edge_list_edge_types_column_number=1,edge_list_separator="|",
                       destinations_column_number=2, name="mAbKG", verbose=True, edge_list_header=True)

However, I saw that there is also a node_path parameter, with other properties like those for edge_path, so I don't know whether I used from_csv correctly. Can you please give me some explanation, knowing that I have a KG with typed edges and nodes? Below is an example of my data.

Thank you for your answer

Gaoussou

 node source|edge|node destination
_:B4dff5e7d17225b25b13ad12737e49779|imgt:isDecidedBy|imgt:EC
pubmed:2843774|dc:title|Selective killing of HIV-infected cells by recombinant human CD4-Pseudomonas exotoxin hybrid protein.
imgt:Product_8e9250cf-276a-3282-954f-3791316ac5a6|rdf:type|obo:NCIT_C51980
imgt:Segment_212_1|obo:BFO_0000050|imgt:Construct_212
imgt:IgG4-kappa_1001|rdfs:label|IgG4-kappa_1001
imgt:V-D-GENE|owl:sameAs|obo:SO_0000510
imgt:Segment_536_1|rdf:type|imgt:Segment
imgt:LRR13|rdf:type|imgt:RepeatLabel
imgt:StudyProduct_c2bc9b3a-a15e-376f-bda5-f87089b3f54b|imgt:application_type|Therapeutic
imgt:StudyProduct_54a14ca8-f916-338b-af18-d079beb598a4|imgt:development_technology|  Dyax human antibody phage display library

sample_mabkg.txt

opened by gsanou 6

embiggen package error under Windoze

The joy on installation on Windoze...

Collecting embiggen>=0.11.9
  Downloading embiggen-0.11.38.tar.gz (154 kB)
     ---------------------------------------- 154.2/154.2 kB ? eta 0:00:00
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [10 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "C:\cygwin64\tmp\pip-install-37lyy1_b\embiggen_3ec9ca91df6044b1b2470bb84cb6184d\setup.py", line 54, in <module>
          long_description=readme(),
        File "C:\cygwin64\tmp\pip-install-37lyy1_b\embiggen_3ec9ca91df6044b1b2470bb84cb6184d\setup.py", line 12, in readme
          return f.read()
        File "C:\Users\richa\AppData\Local\Programs\Python\Python39\lib\encodings\cp1252.py", line 23, in decode
          return codecs.charmap_decode(input,self.errors,decoding_table)[0]
      UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 2: character maps to <undefined>
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

opened by RichardBruskiewich 6

Bipartite Graph predict proba with undirected graph

Hi. I noticed the performance metrics are not identical when using predict_proba_bipartite_graph_from_edge_node_types, when I swap the source and destination nodes. The graph used as input is an undirected graph, which I would expect would yield similar predictions for the same edge type regardless of which is source and destination nodes. Is this behavior intentional?

Below are the version of the software I am running currently: grape==0.1.17 embiggen==0.11.27 ensmallen==0.8.14

opened by arpelletier 6

ImportError: libgfortran-ed201abd.so.3.0.0: cannot open shared object file: No such file or directory

In a fresh notebook, attempting to import grape yields an ImportError about a missing libgfortran-ed201abd.so.3.0.0.

>>> !pip install grape -U
>>> import grape
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.8) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
/home/harry/kg-bioportal/data/merged/KG-Bioportal analysis.ipynb Cell 2' in <cell line: 1>()
----> 1 import grape

File ~/.local/lib/python3.8/site-packages/grape/__init__.py:9, in <module>
      1 """GraPE main module.
      2 
      3 For now, this is a simple wrapper of GraPE main two sub-modules that for
   (...)
      6 These packages are mimed here by the two sub-directories, ensmallen and embiggen.
      7 """
----> 9 from embiggen import *
     10 from ensmallen import Graph
     13 def import_all(module_locals):

File ~/.local/lib/python3.8/site-packages/embiggen/__init__.py:2, in <module>
      1 """Module with models for graph machine learning and visualization."""
----> 2 from embiggen.visualizations import GraphVisualizer
      3 from embiggen.utils import (
      4     EmbeddingResult,
      5     get_models_dataframe,
   (...)
      9     get_available_models_for_node_embedding,
     10 )
...
    691     'spherical_kn',
    692 ]
    694 from scipy._lib._testutils import PytestTester

ImportError: libgfortran-ed201abd.so.3.0.0: cannot open shared object file: No such file or directory

I've seen that this may be related to libraries packaged with numpy, as seen in the following: https://github.com/ContinuumIO/anaconda-issues/issues/445 https://github.com/numpy/numpy/issues/14348

This may be environment-specific, of course.

opened by caufieldjh 6

`Illegal instruction (core dumped)` on importing grape

In another issue that may have something to do with our aging build server: When we import grape in this environment (see info below), we get only Illegal instruction (core dumped).

cpuinfo output:

processor       : 23
vendor_id       : GenuineIntel
cpu family      : 6
model           : 44
model name      : Intel(R) Xeon(R) CPU           X5675  @ 3.07GHz
stepping        : 2
microcode       : 0x1f
cpu MHz         : 1599.987
cache size      : 12288 KB
physical id     : 1
siblings        : 12
core id         : 10
cpu cores       : 6
apicid          : 53
initial apicid  : 53
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida arat flush_l1d
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 6133.21
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

opened by caufieldjh 4

Link to the two sub packages

Hi, first of all, thanks for making such an amazing graph embedding resource!

I'm wondering whether you can add some descriptions in the README clarifying that this repo is a thin wrapper of the two core packages embiggen and ensmallen and add links accordingly. I was a bit confused for a few minutes trying to find the source code and only came to realize it wraps the two libraries after looking at __init__.py.

opened by RemyLau 4

pip install grape failure on support_luca>=1.0.2

I am attempting to install grape using pip on Ubuntu 20.04.4 LTS with python 3.8.3.

Most of the build/install appears to work just fine until I hit this error, providing a little additional context. I have also tried to install ensmallen directly with pip install ensmallen and I get the same error. Any advice you have would be appreciated.

Requirement already satisfied: idna<3,>=2.5 in /home/corey/anaconda3/lib/python3.8/site-packages (from requests->bioregistry>=0.5.65->ensmallen>=0.8.21->grape) (2.10)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /home/corey/anaconda3/lib/python3.8/site-packages (from requests->bioregistry>=0.5.65->ensmallen>=0.8.21->grape) (1.25.9)
Requirement already satisfied: certifi>=2017.4.17 in /home/corey/anaconda3/lib/python3.8/site-packages (from requests->bioregistry>=0.5.65->ensmallen>=0.8.21->grape) (2020.6.20)
Requirement already satisfied: chardet<4,>=3.0.2 in /home/corey/anaconda3/lib/python3.8/site-packages (from requests->bioregistry>=0.5.65->ensmallen>=0.8.21->grape) (3.0.4)
Collecting typing-extensions>=3.7.4.3
  Using cached typing_extensions-4.3.0-py3-none-any.whl (25 kB)
ERROR: Could not find a version that satisfies the requirement support_luca>=1.0.2 (from dict_hash>=1.1.25->cache_decorator>=2.1.11->ensmallen>=0.8.21->grape) (from versions: none)
ERROR: No matching distribution found for support_luca>=1.0.2 (from dict_hash>=1.1.25->cache_decorator>=2.1.11->ensmallen>=0.8.21->grape)

opened by amc-corey-cox 4

Graph visualization error

Hello. I am trying the Using CBOW to embed Cora python notebook (linked) and after replacing "CBOWEnsmallen" with "DeepWalkCBOWEnsmallen", the first order embedding runs successfully but fails at the graph visualization. I get the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_3453/3695275499.py in <module>
----> 1 GraphVisualizer(
      2     graph,
      3     node_embedding_method_name="CBOW - First order"
      4 ).fit_and_plot_all(first_embedding)

~/anaconda3/lib/python3.9/site-packages/embiggen/visualizations/graph_visualizer.py in fit_and_plot_all(self, node_embedding, number_of_columns, show_letters, include_distribution_plots, **node_embedding_kwargs)
   4236         distribution_plot_methods_to_call = []
   4237 
-> 4238         if not self._graph.has_constant_non_zero_node_degrees():
   4239             node_scatter_plot_methods_to_call.append(
   4240                 self.plot_node_degrees,

AttributeError: The method 'has_constant_non_zero_node_degrees' does not exists, did you mean one of the following?
* 'has_constant_edge_weights'
* 'get_non_zero_subgraph_node_degrees'
* 'has_nodes'
* 'has_edges'
* 'has_selfloops'
* 'has_node_ontologies'
* 'has_node_oddities'
* 'get_node_degrees'
* 'has_node_name'
* 'has_node_types'

Looks like the issue has to do with embiggen dependencies in the graph visualization. Below are the package versions I am using: embiggen==0.11.13 ensmallen==0.8.7 grape==0.1.9

As well, I was not able to successfully run the second-order embeddings

model = DeepWalkCBOWEnsmallen(
    return_weight=2.0,
    explore_weight=0.1
)
second_embedding = model.fit_transform(graph).get_node_embedding_from_index(0)

The above code gives the below error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_3453/3112314827.py in <module>
----> 1 model = DeepWalkCBOWEnsmallen(
      2     return_weight=2.0,
      3     explore_weight=0.1
      4 )
      5 second_embedding = model.fit_transform(graph).get_node_embedding_from_index(0)

TypeError: __init__() got an unexpected keyword argument 'return_weight'

opened by arpelletier 4

Embedding model names not recognized; alternate suggestions are unexpected

As of grape 0.1.9, node embedding model names have changed, such that a call to embiggen's AbstractModel.get_task_data(model_name, task_name) with one of the frequently used model names like CBOW or SkipGram throws a ValueError.

I see from grape.get_available_models_for_node_embedding() that these now have more specific names like Node2Vec CBOW. No problem with being specific, but we'd still like to be able to specify CBOW, SkipGram, or GloVe in config definitions without having to verify the exact model names embiggen is expecting first. Could we use the short names as aliases to a default model, like CBOW will be understood as Node2Vec CBOW, etc?

The name convention also appears to confuse the alternative suggests provided in the ValueError text, so we get suggestions like this:

ValueError: The provided model name `CBOW` is not available. Did you mean BoxE?

opened by caufieldjh 4

ValueError when trying to use external embedder like in pykeen and karateClub

Hello, thank you for your amazing work. I'm a PhD student working on embeddings of biomedical data, particularly in immunogenetics, and I'm currently comparing tools to embed data. I found your work very interesting. I ran into some issues when trying to use external models from PyKEEN and Karate Club. I got this message:

ValueError: We have found an useless method in the class StubClass, implementing method HolE from library PyKEEN and task Node Embedding. It does not make sense to implement the `requires_positive_edge_weights` method when the `can_use_edge_weights` always returns False, as it is already handled in the root abstract model class.

Also, for the visualization, when I ran

from grape import GraphVisualizer
visualizer = GraphVisualizer(kg.remove_disconnected_nodes())
visualizer.fit_and_plot_all(embedding)

I got this warning and no visualization: FutureWarning: The parameter `square_distances` has not effect and will be removed in version 1.3.

Thank you in advance for your answer.

Gaoussou

opened by gsanou 3

Use case regarding Customer Analytics or Community detection?

Thanks for that repo. It seems that you have integrated several tools / libraries / approaches under Grape's hood. Do you intend to create a tutorial for a customer analytics recommendation?

Thanks in advance.

opened by stkarlos 2

Parallelized Embedding

Hey, I'm trying to process a directed graph with about 5 million nodes and 100 million edges. I've managed to load the graph from a CSV file, and I get a very nice Graph object (within 5 minutes). I'm now trying to embed the graph with grape.embedders.Node2VecSkipGramEnsmallen, but it doesn't seem to succeed: I've let it run for over 10 hours. To make it faster, I enabled the Graph's vector_source, vector_cumulative_node_degree and vector_reciprocal_sqrt_degrees. Reading your paper, it seems that the embedding process could be parallelized, but I can't find a way to do that. I'd appreciate it if you could describe which parts of the embedding process are parallelized and how I can make it run in parallel. Thank you, Bruria.

opened by bruriah1999 2

Getting figure to be inline

matplotlib plots figures inline by default or if we write

%matplotlib inline

Some of the figures produced by GRAPE get put into "subwindows" in the Jupyter notebook, and one needs to scroll up and down to see the entire figure. GRAPE does not seem to be responsive to the inline magic command above either.

For instance, in order for a certain figure to really appear inline, I need to make it much smaller

visualizer = GraphVisualizer(sli_graph, automatically_display_on_notebooks=False)
fig, ax, cap = visualizer.plot_node_degree_distribution()
fig.set_figheight(3)
fig.set_figwidth(3)

even though the notebook could comfortably show (5,5) or even (8,8)

opened by pnrobinson 2

Saving classifier models

Could support for saving classifier models please be added? This came up while meeting with @LucaCappelletti94 recently but it's become relevant again in the course of updating neat-ml to use grape classifiers.

Training classifiers isn't a major time commitment, but on our neat runs we've separated the process of training+testing vs. applying classifiers, so being unable to save or at least pickle the classifier object means we need to redo training for each model.

opened by caufieldjh 4

Methods for generating node embeddings from word embeddings

While updating NEAT to use the most recent grape release, @justaddcoffee and @hrshdhgd and I took a look at what we're using to generate node embeddings based on pretrained word embeddings like BERT etc. : https://github.com/Knowledge-Graph-Hub/NEAT/blob/main/neat/graph_embedding/graph_embedding.py

We know we can run something like get_okapi_tfidf_weighted_textual_embedding() on a graph, but is there a more "on demand" way to run this in grape now for an arbitrary graph?

opened by caufieldjh 10

Releases(0.0.6.dev1)

0.0.6.dev1(Dec 15, 2021)

New version of GraPE wrapping nightly versions of Embiggen and Ensmallen.

Owner

AnacletoLab

Computational Biology and Bioinformatics Lab - Dept. of Computer Science - UNIMI
