A PoC Corporation Relationship Knowledge Graph System on top of Nebula Graph.

Overview

Corp-Rel is a PoC of Corpartion Relationship Knowledge Graph System. It's built on top of the Open Source Graph Database: Nebula Graph with a dataset from nebula-shareholding-example.

corp-rel-capture.mov

Quick Start

First, please setup a Nebula Graph Cluster with data loaded from nebula-shareholding-example.

Then, clone this project:

git clone https://github.com/wey-gu/nebula-corp-rel-search.git
cd nebula-corp-rel-search

Start the backend:

python3 -m pip install -r requirements.txt
cd corp-rel-backend
export NG_ENDPOINTS="192.168.123.456:9669" # This should be your Nebula Graph Cluster GraphD Endpoint
python3 app.py

Start the frontend in another terminal:

npm install -g @vue/cli
cd nebula-corp-rel-search/corp-rel-frontend
vue serve src/main.js

Start a reverse Proxy to enable Corp-Rel Backend being served with same origin of Frontend:

For example below is a Nginx config to make :8081/ go to http://localhost:8080 and :8081/api go to http://192.168.123.456:5000/api.

http {
    include       mime.types;
    default_type  application/octet-stream;

    keepalive_timeout  65;

    server {
        listen       8081;
        server_name  localhost;
        # frontend
        location / {
            proxy_pass http://localhost:8080;
        }
        # backend
        location /api {
            proxy_pass http://192.168.123.456:5000/api;
        }
    }
#...

After above reverse proxy being configured, let's verify it via cURL:

curl --header "Content-Type: application/json" \
     --request POST \
     --data '{"entity": "c_132"}' \
     http://localhost:8081/api | jq

If it's properly responded, hen we could go to http://localhost:8081 from the web browser :).

Design Log

data from Backend Side

Backend should query node's relationship path as follow:

MATCH p=(v)-[e:hold_share|:is_branch_of|:reletive_with|:role_as*1..3]-(v2) \
WHERE id(v) IN ["c_132"] RETURN p LIMIT 100

An example of the query will be like this:

([email protected]) [shareholding]> MATCH p=(v)-[e:hold_share|:is_branch_of|:reletive_with|:role_as*1..3]-(v2) \
                           -> WHERE id(v) IN ["c_132"] RETURN p LIMIT 100
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| p                                                                                                                                                                                                                                        |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| <("c_132" :corp{name: "Chambers LLC"})<-[:[email protected] {share: 0.0}]-("c_245" :corp{name: "Thompson-King"})>                                                                                                                             |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| <("c_132" :corp{name: "Chambers LLC"})<-[:[email protected] {share: 3.0}]-("p_1039" :person{name: "Christian Miller"})>                                                                                                                       |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| <("c_132" :corp{name: "Chambers LLC"})<-[:[email protected] {share: 3.0}]-("p_1399" :person{name: "Sharon Gonzalez"})>                                                                                                                        |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| <("c_132" :corp{name: "Chambers LLC"})<-[:[email protected] {share: 9.0}]-("p_1767" :person{name: "Dr. David Vance"})>                                                                                                                        |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| <("c_132" :corp{name: "Chambers LLC"})<-[:[email protected] {share: 11.0}]-("p_1997" :person{name: "Glenn Reed"})>                                                                                                                            |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| <("c_132" :corp{name: "Chambers LLC"})<-[:[email protected] {share: 14.0}]-("p_2341" :person{name: "Jessica Baker"})>                                                                                                                         |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
...

Leveraging nebula2-python, we could have result in below data structure:

$ python3 -m pip install nebula2-python==2.5.0
$ ipython
In [1]: from nebula2.gclient.net import ConnectionPool
In [2]: from nebula2.Config import Config
In [3]: config = Config()
   ...: config.max_connection_pool_size = 10
   ...: # init connection pool
   ...: connection_pool = ConnectionPool()
   ...: # if the given servers are ok, return true, else return false
   ...: ok = connection_pool.init([('192.168.8.137', 9669)], config)
   ...: session = connection_pool.get_session('root', 'nebula')
[2021-10-13 13:44:24,242]:Get connection to ('192.168.8.137', 9669)

In [4]: resp = session.execute("use shareholding")
In [5]: query = '''
   ...: MATCH p=(v)-[e:hold_share|:is_branch_of|:reletive_with|:role_as*1..3]-(v2) \
   ...: WHERE id(v) IN ["c_132"] RETURN p LIMIT 100
   ...: '''
In [6]: resp = session.execute(query) # Note: after nebula graph 2.6.0, we could use execute_json as well

In [7]: resp.col_size()
Out[7]: 1

In [9]: resp.row_size()
Out[10]: 100

As we know the result is actually a nebula-python path type, they could be extracted as follow with .nodes() and .relationships():

In [11]: p=resp.row_values(22)[0].as_path()

In [12]: p.nodes()
Out[12]:
[("c_132" :corp{name: "Chambers LLC"}),
 ("p_4000" :person{name: "Colton Bailey"})]

In [13]: p.relationships()
Out[13]: [("p_4000")-[:role_as@0{role: "Editorial assistant"}]->("c_132")]

For relationships/edges, we could call its .edge_name(), .properties(), .start_vertex_id(), .end_vertex_id():

In [14]: rel=p.relationships()[0]

In [15]: rel
Out[15]: ("p_4000")-[:role_as@0{role: "Editorial assistant"}]->("c_132")

In [16]: rel.edge_name()
Out[16]: 'role_as'

In [17]: rel.properties()
Out[17]: {'role': "Editorial assistant"}

In [18]: rel.start_vertex_id()
Out[18]: "p_4000"

In [19]: rel.end_vertex_id()
Out[19]: "c_132"

And for nodes/vertices, we could call its .tags(), properties, get_id():

In [20]: node=p.nodes()[0]

In [21]: node.tags()
Out[21]: ['corp']

In [22]: node.properties('corp')
Out[22]: {'name': "Chambers LLC"}

In [23]: node.get_id()
Out[23]: "c_132"

Data visualization

For the frontend, we could create a view by leveraging vue-network-d3:

npm install vue-network-d3 --save
touch src/App.vue
touch src/main.js

In src/App.vue, we create a Network instance and fill in the nodeList, and linkList fetched from backend, in below example, we put fake data as:

nodes: [
        {"id": "c_132", "name": "Chambers LLC", "tag": "corp"},
        {"id": "p_4000", "name": "Colton Bailey", "tag": "person"}],
relationships: [
        {"source": "p_4000", "target": "c_132", "properties": { "role": "Editorial assistant" }, "edge": "role_as"}]

And the full example of src/App.vue will be:

<template>
  <div id="app">
    <network
      :nodeList="nodes"
      :linkList="relationships"
      :nodeSize="nodeSize"
      :linkWidth="linkWidth"
      :linkDistance="linkDistance"
      :linkTextFrontSize="linkTextFrontSize"
      :nodeTypeKey="nodeTypeKey"
      :linkTypeKey="linkTypeKey"
      :nodeTextKey="nodeTextKey"
      :linkTextKey="linkTextKey"
      :showNodeText="showNodeText"
      :showLinkText="showLinkText"
      >
    </network>
  </div>
</template>

<script>
import Network from "vue-network-d3";

export default {
  name: "app",
  components: {
    Network
  },
  data() {
    return {
      nodes: [
        {"id": "c_132", "name": "Chambers LLC", "tag": "corp"},
        {"id": "p_4000", "name": "Colton Bailey", "tag": "person"}
      ],
      relationships: [
        {"source": "p_4000", "target": "c_132", "properties": { "role": "Editorial assistant" }, "edge": "role_as"}
      ],
      nodeSize: 18,
      linkDistance: 120,
      linkWidth: 6,
      linkTextFrontSize: 20,
      nodeTypeKey: "tag",
      linkTypeKey: "edge",
      nodeTextKey: "name",
      linkTextKey: "properties",
      showNodeText: true,
      showLinkText: true
    };
  },
};
</script>

<style>
body {
  margin: 0;
}
</style>

Together with src/main.js:

import Vue from 'vue'
import App from './App.vue'

Vue.config.productionTip = false

new Vue({
  render: h => h(App),
}).$mount('#app')

Then we could run: vue serve src/main.js to have this renderred:

vue-network-d3-demo

The data construction in Back End:

Thus we shoud know that if the backend provides list of nodes and relationships in JSON as the following, things are perfectly connected!

Nodes:

[{"id": "c_132", "name": "Chambers LLC", "tag": "corp"},
 {"id": "p_4000", "name": "Colton Bailey", "tag": "person"}]

Relationships:

[{"source": "p_4000", "target": "c_132", "properties": { "role": "Editorial assistant" }, "edge": "role_as"},
 {"source": "p_1039", "target": "c_132", "properties": { "share": "3.0" }, "edge": "hold_share"}]

We could construct it as:

def make_graph_response(resp) -> dict:
    nodes, relationships = list(), list()
    for row_index in range(resp.row_size()):
        path = resp.row_values(row_index)[0].as_path()
        _nodes = [
            {
                "id": node.get_id(), "tag": node.tags()[0],
                "name": node.properties(node.tags()[0]).get("name", "")
                }
                for node in path.nodes()
        ]
        nodes.extend(_nodes)
        _relationships = [
            {
                "source": rel.start_vertex_id(),
                "target": rel.end_vertex_id(),
                "properties": rel.properties(),
                "edge": rel.edge_name()
                }
                for rel in path.relationships()
        ]
        relationships.extend(_relationships)
    return {"nodes": nodes, "relationships": relationships}

The Flask App

Then Let's create a Flask App to consume the HTTP API request and return the data designed as above.

from flask import Flask, jsonify, request



app = Flask(__name__)


@app.route("/")
def root():
    return "Hey There?"


@app.route("/api", methods=["POST"])
def api():
    request_data = request.get_json()
    entity = request_data.get("entity", "")
    if entity:
        resp = query_shareholding(entity)
        data = make_graph_response(resp)
    else:
        data = dict() # tbd
    return jsonify(data)


def parse_nebula_graphd_endpoint():
    ng_endpoints_str = os.environ.get(
        'NG_ENDPOINTS', '127.0.0.1:9669,').split(",")
    ng_endpoints = []
    for endpoint in ng_endpoints_str:
        if endpoint:
            parts = endpoint.split(":")  # we dont consider IPv6 now
            ng_endpoints.append((parts[0], int(parts[1])))
    return ng_endpoints

def query_shareholding(entity):
    query_string = (
        f"USE shareholding; "
        f"MATCH p=(v)-[e:hold_share|:is_branch_of|:reletive_with|:role_as*1..3]-(v2) "
        f"WHERE id(v) IN ['{ entity }'] RETURN p LIMIT 100"
    )
    session = connection_pool.get_session('root', 'nebula')
    resp = session.execute(query_string)
    return resp

And by starting this Flask App instance:

export NG_ENDPOINTS="192.168.8.137:9669"
python3 app.py

 * Serving Flask app 'app' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
[2021-10-13 18:30:17,574]: * Running on all addresses.
   WARNING: This is a development server. Do not use it in a production deployment.
[2021-10-13 18:30:17,574]: * Running on http://192.168.10.14:5000/ (Press CTRL+C to quit)

we could then query the API with cURL like this:

curl --header "Content-Type: application/json" \
     --request POST \
     --data '{"entity": "c_132"}' \
     http://192.168.10.14:5000/api | jq

{
  "nodes": [
    {
      "id": "c_132",
      "name": "\"Chambers LLC\"",
      "tag": "corp"
    },
    {
      "id": "c_245",
      "name": "\"Thompson-King\"",
      "tag": "corp"
    },
    {
      "id": "c_132",
      "name": "\"Chambers LLC\"",
      "tag": "corp"
    },
...
    }
  ],
  "relationships": [
    {
      "edge": "hold_share",
      "properties": "{'share': 0.0}",
      "source": "c_245",
      "target": "c_132"
    {
      "edge": "hold_share",
      "properties": "{'share': 9.0}",
      "source": "p_1767",
      "target": "c_132"
    },
    {
      "edge": "hold_share",
      "properties": "{'share': 11.0}",
      "source": "p_1997",
      "target": "c_132"
    },
...
    },
    {
      "edge": "reletive_with",
      "properties": "{'degree': 51}",
      "source": "p_7283",
      "target": "p_4723"
    }
  ]
}

Upstreams Projects

Owner
Wey Gu
Developer Advocate @vesoft-inc
Wey Gu
Exploring Simple Siamese Representation Learning

G-SimSiam A PyTorch implementation which refers to repo for the paper Exploring Simple Siamese Representation Learning by Xinlei Chen & Kaiming He Add

zhuyun 1 Dec 19, 2021
Official Chainer implementation of GP-GAN: Towards Realistic High-Resolution Image Blending (ACMMM 2019, oral)

GP-GAN: Towards Realistic High-Resolution Image Blending (ACMMM 2019, oral) [Project] [Paper] [Demo] [Related Work: A2RL (for Auto Image Cropping)] [C

Wu Huikai 402 Dec 27, 2022
Simple implementation of OpenAI CLIP model in PyTorch.

It was in January of 2021 that OpenAI announced two new models: DALL-E and CLIP, both multi-modality models connecting texts and images in some way. In this article we are going to implement CLIP mod

Moein Shariatnia 226 Jan 05, 2023
OOD Generalization and Detection (ACL 2020)

Pretrained Transformers Improve Out-of-Distribution Robustness How does pretraining affect out-of-distribution robustness? We create an OOD benchmark

littleRound 57 Jan 09, 2023
🔥RandLA-Net in Tensorflow (CVPR 2020, Oral & IEEE TPAMI 2021)

RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds (CVPR 2020) This is the official implementation of RandLA-Net (CVPR2020, Oral

Qingyong 1k Dec 30, 2022
A embed able annotation tool for end to end cross document co-reference

CoRefi CoRefi is an emebedable web component and stand alone suite for exaughstive Within Document and Cross Document Coreference Anntoation. For a de

PythicCoder 39 Dec 12, 2022
Event sourced bank - A wide-and-shallow example using the Python event sourcing library

Event Sourced Bank A "wide but shallow" example of using the Python event sourci

3 Mar 09, 2022
Applications using the GTN library and code to reproduce experiments in "Differentiable Weighted Finite-State Transducers"

gtn_applications An applications library using GTN. Current examples include: Offline handwriting recognition Automatic speech recognition Installing

Facebook Research 68 Dec 29, 2022
Code and datasets for TPAMI 2021

SkeletonNet This repository constains the codes and ShapeNetV1-Surface-Skeleton,ShapNetV1-SkeletalVolume and 2d image datasets ShapeNetRendering. Plea

34 Aug 15, 2022
[NeurIPS 2021] Low-Rank Subspaces in GANs

Low-Rank Subspaces in GANs Figure: Image editing results using LowRankGAN on StyleGAN2 (first three columns) and BigGAN (last column). Low-Rank Subspa

112 Dec 28, 2022
A criticism of a recent paper on buggy image downsampling methods in popular image processing and deep learning libraries.

A criticism of a recent paper on buggy image downsampling methods in popular image processing and deep learning libraries.

70 Jul 12, 2022
Answering Open-Domain Questions of Varying Reasoning Steps from Text

This repository contains the authors' implementation of the Iterative Retriever, Reader, and Reranker (IRRR) model in the EMNLP 2021 paper "Answering Open-Domain Questions of Varying Reasoning Steps

26 Dec 22, 2022
BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting

BOVText: A Large-Scale, Bilingual Open World Dataset for Video Text Spotting Updated on December 10, 2021 (Release all dataset(2021 videos)) Updated o

weijiawu 47 Dec 26, 2022
Simple Tensorflow implementation of Toward Spatially Unbiased Generative Models (ICCV 2021)

Spatial unbiased GANs — Simple TensorFlow Implementation [Paper] : Toward Spatially Unbiased Generative Models (ICCV 2021) Abstract Recent image gener

Junho Kim 16 Apr 15, 2022
Artstation-Artistic-face-HQ Dataset (AAHQ)

Artstation-Artistic-face-HQ Dataset (AAHQ) Artstation-Artistic-face-HQ (AAHQ) is a high-quality image dataset of artistic-face images. It is proposed

onion 105 Dec 16, 2022
The MATH Dataset

Measuring Mathematical Problem Solving With the MATH Dataset This is the repository for Measuring Mathematical Problem Solving With the MATH Dataset b

Dan Hendrycks 267 Dec 26, 2022
Python script to download the celebA-HQ dataset from google drive

download-celebA-HQ Python script to download and create the celebA-HQ dataset. WARNING from the author. I believe this script is broken since a few mo

133 Dec 21, 2022
A repo with study material, exercises, examples, etc for Devnet SPAUTO

MPLS in the SDN Era -- DevNet SPAUTO Get right to the study material: Checkout the Wiki! A lab topology based on MPLS in the SDN era book used for 30

Hugo Tinoco 67 Nov 16, 2022
Semantic-aware Grad-GAN for Virtual-to-Real Urban Scene Adaption

SG-GAN TensorFlow implementation of SG-GAN. Prerequisites TensorFlow (implemented in v1.3) numpy scipy pillow Getting Started Train Prepare dataset. W

lplcor 61 Jun 07, 2022
ObjectDrawer-ToolBox: a graphical image annotation tool to generate ground plane masks for a 3D object reconstruction system

ObjectDrawer-ToolBox is a graphical image annotation tool to generate ground plane masks for a 3D object reconstruction system, Object Drawer.

77 Jan 05, 2023