A high-level Python library for Quantum Natural Language Processing


lambeq

About

lambeq is a toolkit for quantum natural language processing (QNLP).

Documentation: https://cqcl.github.io/lambeq/

Getting started

Prerequisites

  • Python 3.7+

Installation

Direct pip install

The base lambeq can be installed with the command:

pip install lambeq

This does not include optional dependencies such as depccg and PyTorch, which have to be installed separately. In particular, depccg is required for lambeq.ccg2discocat.DepCCGParser.

To install lambeq with depccg, run instead:

pip install cython numpy
pip install lambeq[depccg]
depccg_en download

See below for further options.
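Because the optional dependencies are installed separately, it can help to check which of them are importable before running code that needs them. The snippet below is a minimal sketch using only the standard library; the helper names and the module list are illustrative assumptions, not part of lambeq.

```python
from importlib.util import find_spec

# Import names of the optional packages mentioned above (adjust as needed).
OPTIONAL_MODULES = ("depccg", "torch")


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


def report_optional_dependencies(modules=OPTIONAL_MODULES) -> dict:
    """Map each module name to whether it is available in this environment."""
    return {name: is_installed(name) for name in modules}


for name, found in report_optional_dependencies().items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

Run this in the same environment (virtualenv or conda env) in which lambeq was installed, since each environment has its own set of packages.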

Automatic installation (recommended)

This runs an interactive installer to help pick the installation destination and configuration.

  1. Run:
    bash <(curl 'https://cqcl.github.io/lambeq/install.sh')

Git installation

This requires Git to be installed.

  1. Download this repository:

    git clone https://github.com/CQCL/lambeq
  2. Enter the repository:

    cd lambeq
  3. Make sure pip is up-to-date:

    pip install --upgrade pip wheel
  4. (Optional) If installing the optional dependency depccg, the following packages must be installed before installing depccg:

    pip install cython numpy

    Further information can be found on the depccg homepage.

  5. Install lambeq from the local repository using pip:

    pip install --use-feature=in-tree-build .

    To include depccg, run instead:

    pip install --use-feature=in-tree-build .[depccg]

    To include all optional dependencies, run instead:

    pip install --use-feature=in-tree-build .[all]
  6. If using a pretrained depccg parser, download a pretrained model:

    depccg_en download

Usage

The docs/examples directory contains notebooks demonstrating usage of the various tools in lambeq.

Example - parsing a sentence into a diagram (see docs/examples/ccg2discocat.ipynb):

from lambeq.ccg2discocat import DepCCGParser

depccg_parser = DepCCGParser()
diagram = depccg_parser.sentence2diagram('This is a test sentence')
diagram.draw()

Note: all pre-trained depccg models apart from the basic one are broken, and depccg has not yet been updated to fix this. Therefore, it is recommended to just use the basic parser, as shown here.

Testing

Run all tests with the command:

pytest

Note: if you have installed in a virtual environment, remember to install pytest in the same environment using pip.

Building Documentation

To build the documentation, first install the required dependencies:

pip install -r docs/requirements.txt

then run the commands:

cd docs
make clean
make html

The docs will be under docs/_build.

To rebuild the rst files themselves, run:

sphinx-apidoc --force -o docs lambeq

License

Distributed under the Apache 2.0 license. See LICENSE for more details.

Citation

If you wish to attribute our work, please cite the accompanying paper:

@article{kartsaklis2021lambeq,
   title={lambeq: {A}n {E}fficient {H}igh-{L}evel {P}ython {L}ibrary for {Q}uantum {NLP}},
   author={Dimitri Kartsaklis and Ian Fan and Richie Yeung and Anna Pearson and Robin Lorenz and Alexis Toumi and Giovanni de Felice and Konstantinos Meichanetzidis and Stephen Clark and Bob Coecke},
   year={2021},
   journal={arXiv preprint arXiv:2110.04236},
}
Comments
  • No module named BobcatParser

    I'm trying to run the following code from your tutorials website, but I am unable to install BobcatParser. I am working in a Colab environment. Is there a dependency that I may be missing?

    from lambeq import BobcatParser
    
    parser = BobcatParser(root_cats=('NP', 'N'), verbose='text')
    
    raw_train_diagrams = parser.sentences2diagrams(train_data, suppress_exceptions=True)
    raw_val_diagrams = parser.sentences2diagrams(val_data, suppress_exceptions=True) 
    
    opened by alt-shreya 14
  • error during installation

    When I try installing using sh <(curl 'https://cqcl.github.io/lambeq/install.sh'), I get the following error:

    ERROR: Cannot install lambeq[depccg]==0.1.0, lambeq[depccg]==0.1.1 and lambeq[depccg]==0.1.2 because these package versions have conflicting dependencies.
    
    The conflict is caused by:
        lambeq[depccg] 0.1.2 depends on depccg==1.1.0; extra == "depccg"
        lambeq[depccg] 0.1.1 depends on depccg==1.1.0; extra == "depccg"
        lambeq[depccg] 0.1.0 depends on depccg==1.1.0; extra == "depccg"
    
    To fix this you could try to:
    1. loosen the range of package versions you've specified
    2. remove package versions to allow pip attempt to solve the dependency conflict
    
    ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
    
    

    I am on a 2020 MacBook Air (with the Apple M1 chip), using conda with python=3.8.11. Could any of that be causing the problem?

    opened by mithunpaul08 13
  • Problem with trainer.fit(), operands of different shape

    Hi, I am trying to run the quantum trainer algorithm. When running the following line:

    trainer.fit(train_dataset, val_dataset, evaluation_step=1, logging_step=100)

    I get the following error:

    ValueError                          Traceback (most recent call last)
    Input In [17], in <cell line: 1>()
    ----> 1 trainer.fit(train_dataset, val_dataset, evaluation_step=1, logging_step=100)
    
    File c:\python38\lib\site-packages\lambeq\training\trainer.py:365, in Trainer.fit(self, train_dataset, val_dataset, evaluation_step, logging_step)
        363 step += 1
        364 x, y_label = batch
    --> 365 y_hat, loss = self.training_step(batch)
        366 if (self.evaluate_on_train and
        367         self.evaluate_functions is not None):
        368     for metr, func in self.evaluate_functions.items():
    
    File c:\python38\lib\site-packages\lambeq\training\quantum_trainer.py:149, in QuantumTrainer.training_step(self, batch)
        133 def training_step(
        134         self,
        135         batch: tuple[list[Any], np.ndarray]) -> tuple[np.ndarray, float]:
        136     """Perform a training step.
        137 
        138     Parameters
       (...)
        147 
        148     """
    --> 149     y_hat, loss = self.optimizer.backward(batch)
        150     self.train_costs.append(loss)
        151     self.optimizer.step()
    
    File c:\python38\lib\site-packages\lambeq\training\spsa_optimizer.py:126, in SPSAOptimizer.backward(self, batch)
        124 self.model.weights = xplus
        125 y0 = self.model(diagrams)
    --> 126 loss0 = self.loss_fn(y0, targets)
        127 xminus = self.project(x - self.ck * delta)
        128 self.model.weights = xminus
    
    Input In [13], in <lambda>(y_hat, y)
    ----> 1 loss = lambda y_hat, y: -np.sum(y * np.log(y_hat)) / len(y)  # binary cross-entropy loss
          3 acc = lambda y_hat, y: np.sum(np.round(y_hat) == y) / len(y) / 2  # half due to double-counting
          4 eval_metrics = {"acc": acc}
    
    ValueError: operands could not be broadcast together with shapes (30,2) (30,)
    

    I have just fixed the .py file in the lib following #12. The algorithm raised an error before that as well; I can't recall exactly, but I don't think it was the same error.

    What can I do to solve this? Thank you for your time.

    opened by Stephenito 10
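The shapes in the error message above, (30,2) and (30,), suggest that the model returns one row of two class probabilities per sentence while the targets are a flat vector of class indices. A minimal NumPy sketch of one way to reconcile them is to one-hot encode the targets before computing the loss. The helper names `one_hot` and `cross_entropy` and the stand-in data are assumptions for illustration, not lambeq API.

```python
import numpy as np


def one_hot(labels, n_classes=2):
    """Turn class indices of shape (n,) into one-hot rows of shape (n, n_classes)."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(n_classes)[labels]


def cross_entropy(y_hat, y, eps=1e-12):
    """Mean cross-entropy; predictions and targets must have the same shape."""
    y_hat = np.asarray(y_hat, dtype=float)
    y = np.asarray(y, dtype=float)
    if y_hat.shape != y.shape:
        raise ValueError(f"shape mismatch: predictions {y_hat.shape}, targets {y.shape}")
    # eps guards against log(0) when a predicted probability is exactly zero.
    return -np.sum(y * np.log(y_hat + eps)) / len(y)


y_hat = np.full((30, 2), 0.5)           # stand-in for the model output, shape (30, 2)
flat_labels = np.zeros(30, dtype=int)   # class indices, shape (30,)

targets = one_hot(flat_labels)          # shape (30, 2), matches y_hat
loss = cross_entropy(y_hat, targets)
```

Multiplying `y_hat` by `flat_labels` directly reproduces the broadcasting error, because NumPy aligns shapes from the trailing axis and 2 does not match 30.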
  • Error when running Parser

    Below is the code I ran to test the parser:

    from lambeq import BobcatParser

    parser = BobcatParser()
    diagram = parser.sentence2diagram('This is a test sentence')
    diagram.draw()

    This is the error I received when running it:

    2022-05-22 20:19:08.041411: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
    2022-05-22 20:19:08.042271: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    Traceback (most recent call last):
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1342, in do_open
        h.request(req.get_method(), req.selector, req.data, headers,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1255, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1301, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1250, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1010, in _send_output
        self.send(msg)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 950, in send
        self.connect()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1424, in connect
        self.sock = self._context.wrap_socket(self.sock,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 500, in wrap_socket
        return self.sslsocket_class._create(
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1040, in _create
        self.do_handshake()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1309, in do_handshake
        self._sslobj.do_handshake()
    ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\Users\elmm\Desktop\CQM\QNLP Depression 2.py", line 3, in <module>
        parser = BobcatParser()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\bobcat_parser.py", line 258, in __init__
        download_model(model_name_or_path, model_dir, verbose)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\bobcat_parser.py", line 130, in download_model
        model_file, headers = urlretrieve(url, reporthook=progress_bar.update)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 239, in urlretrieve
        with contextlib.closing(urlopen(url, data)) as fp:
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 214, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 517, in open
        response = self._open(req, data)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 534, in _open
        result = self._call_chain(self.handle_open, protocol, protocol +
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 494, in _call_chain
        result = func(*args)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1385, in https_open
        return self.do_open(http.client.HTTPSConnection, req,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1345, in do_open
        raise URLError(err)
    urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)>

    How can I resolve this?

    opened by ACE07-Sev 9
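The traceback above shows the failure happening in Python's own urllib and ssl modules while the model is being downloaded, before any parsing starts. Whether the expired certificate sits on the server or in an outdated local CA store cannot be told from the traceback alone. The sketch below is one hedged way to test a URL outside lambeq with a verifying SSL context; `make_verifying_context` and `can_fetch` are hypothetical helper names, and the URL to probe is left to the reader because the download address does not appear in the report.

```python
import ssl
from urllib.request import urlopen


def make_verifying_context(cafile=None):
    """Build an SSL context that verifies server certificates.

    cafile is an optional path to a CA bundle, for example the value
    returned by certifi.where() if the certifi package is installed.
    """
    return ssl.create_default_context(cafile=cafile)


def can_fetch(url, context, timeout=10.0):
    """Try to open url; return (success, message) instead of raising."""
    try:
        with urlopen(url, context=context, timeout=timeout):
            return True, "ok"
    except Exception as exc:  # report any failure, including SSL errors
        return False, f"{type(exc).__name__}: {exc}"


context = make_verifying_context()
```

If `can_fetch` succeeds with a fresh CA bundle but fails with the default one, the local certificate store is the likely culprit; if both fail, the problem is more likely on the server side.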
  • Question : What does the Quantum_trainer output?

    Hi, I wish to use the Quantum_trainer for depression detection with a chatbot (the sentences would be the input to the QNLP module), and then classify whether the person has depression or not.

    May I ask what the input and the output are in the sample trainer for the quantum module? Does it do binary classification, or is that something I need to add as an additional layer?

    opened by ACE07-Sev 8
  • An Error while running Classical Pipeline Example given in docs/examples

    Hi @dimkart I hope you are doing well

    I am trying to run the code given here on my Google Colab account - https://github.com/CQCL/lambeq/blob/main/docs/examples/classical_pipeline.ipynb

    I am installing lambeq directly on Colab and it is picking up the latest version of DisCoPy

    But I am continuously getting an error like this. I have pasted the full stack trace here -

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    <ipython-input-11-84634b74856a> in <module>()
         39 dev_cost_fn, dev_costs, dev_accs = make_cost_fn(dev_pred_fn, dev_labels)
         40 
    ---> 41 result = train(train_cost_fn, x0, niter=20, callback=dev_cost_fn, optimizer_fn=torch.optim.AdamW, lr=0.1)
    
    10 frames
    <ipython-input-11-84634b74856a> in train(func, x0, niter, callback, optimizer_fn, lr)
          3     optimizer = optimizer_fn(x, lr=lr)
          4     for _ in range(niter):
    ----> 5         loss = func(x)
          6 
          7         optimizer.zero_grad()
    
    <ipython-input-11-84634b74856a> in cost_fn(params, **kwargs)
         16 def make_cost_fn(pred_fn, labels):
         17     def cost_fn(params, **kwargs):
    ---> 18         predictions = pred_fn(params)
         19 
         20         logits = predictions[:, 1] - predictions[:, 0]
    
    <ipython-input-10-dbb8534e3157> in predict(params)
          1 def make_pred_fn(circuits):
          2     def predict(params):
    ----> 3         return torch.stack([c.lambdify(*parameters)(*params).eval(contractor=tn.contractors.auto).array for c in circuits])
          4     return predict
          5 
    
    <ipython-input-10-dbb8534e3157> in <listcomp>(.0)
          1 def make_pred_fn(circuits):
          2     def predict(params):
    ----> 3         return torch.stack([c.lambdify(*parameters)(*params).eval(contractor=tn.contractors.auto).array for c in circuits])
          4     return predict
          5 
    
    /usr/local/lib/python3.7/dist-packages/discopy/tensor.py in eval(self, contractor)
        448         if contractor is None:
        449             return Functor(ob=lambda x: x, ar=lambda f: f.array)(self)
    --> 450         array = contractor(*self.to_tn()).tensor
        451         return Tensor(self.dom, self.cod, array)
        452 
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/contractors/opt_einsum_paths/path_contractors.py in auto(nodes, output_edge_order, memory_limit, ignore_edge_order)
        262         output_edge_order=output_edge_order,
        263         nbranch=1,
    --> 264         ignore_edge_order=ignore_edge_order)
        265   return greedy(nodes, output_edge_order, memory_limit, ignore_edge_order)
        266 
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/contractors/opt_einsum_paths/path_contractors.py in branch(nodes, output_edge_order, memory_limit, nbranch, ignore_edge_order)
        160   alg = functools.partial(
        161       opt_einsum.paths.branch, memory_limit=memory_limit, nbranch=nbranch)
    --> 162   return base(nodes, alg, output_edge_order, ignore_edge_order)
        163 
        164 
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/contractors/opt_einsum_paths/path_contractors.py in base(nodes, algorithm, output_edge_order, ignore_edge_order)
         86   path, nodes = utils.get_path(nodes_set, algorithm)
         87   for a, b in path:
    ---> 88     new_node = contract_between(nodes[a], nodes[b], allow_outer_product=True)
         89     nodes.append(new_node)
         90     nodes = utils.multi_remove(nodes, [a, b])
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/network_components.py in contract_between(node1, node2, name, allow_outer_product, output_edge_order, axis_names)
       2083     axes1 = [axes1[i] for i in ind_sort]
       2084     axes2 = [axes2[i] for i in ind_sort]
    -> 2085     new_tensor = backend.tensordot(node1.tensor, node2.tensor, [axes1, axes2])
       2086     new_node = Node(tensor=new_tensor, name=name, backend=backend)
       2087     # node1 and node2 get new edges in _remove_edges
    
    /usr/local/lib/python3.7/dist-packages/tensornetwork/backends/pytorch/pytorch_backend.py in tensordot(self, a, b, axes)
         44   def tensordot(self, a: Tensor, b: Tensor,
         45                 axes: Union[int, Sequence[Sequence[int]]]) -> Tensor:
    ---> 46     return torchlib.tensordot(a, b, dims=axes)
         47 
         48   def reshape(self, tensor: Tensor, shape: Tensor) -> Tensor:
    
    /usr/local/lib/python3.7/dist-packages/torch/functional.py in tensordot(a, b, dims, out)
       1032 
       1033     if out is None:
    -> 1034         return _VF.tensordot(a, b, dims_a, dims_b)  # type: ignore[attr-defined]
       1035     else:
       1036         return _VF.tensordot(a, b, dims_a, dims_b, out=out)  # type: ignore[attr-defined]
    
    RuntimeError: expected scalar type Float but found Double
    

    I was able to successfully carry out experiments using the Quantum Pipeline code on Google Colab without any issues, but for this one I am getting an error. I have tried to fix the issue by converting variables or some function outputs to float(), but I was unable to rectify it.

    Can you please help me fix this issue?

    Thank you so much!

    opened by srinjoyganguly 8
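PyTorch calls float32 "Float" and float64 "Double", so the RuntimeError above means tensors of both precisions met in one tensordot call. One plausible cause, which is an assumption since the traceback does not show how the parameters were created, is that the initial parameters come from NumPy, whose default floating-point dtype is float64. The sketch below only demonstrates that default and the explicit cast; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# NumPy creates float64 ("Double" in PyTorch terms) unless told otherwise.
params64 = rng.random(4)

# Casting once, up front, keeps every later operation in float32 ("Float").
params32 = params64.astype(np.float32)

print(params64.dtype, params32.dtype)
```

A tensor built from `params32` inherits float32, whereas one built from `params64` stays in double precision unless a dtype is passed explicitly when it is created.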
  • Add Japanese support to DepCCGParser

    Updated DepCCGParser to support Japanese. The sample code is as follows.

    1. Prepare depccg.

    pip install cython numpy depccg
    depccg_en download
    depccg_ja download
    

    2. Install Japanese fonts on Ubuntu.

    apt install -y fonts-migmix
    rm ~/.cache/matplotlib/fontlist-v330.json
    

    3. Set the matplotlib Japanese font in the jupyter notebook python code.

    import matplotlib
    from matplotlib.font_manager import FontProperties
    
    font_path = "/usr/share/fonts/truetype/migmix/migmix-1p-regular.ttf"
    font_prop = FontProperties(fname=font_path)
    matplotlib.rcParams["font.family"] = font_prop.get_name()
    

    4. Use sentence2diagram in the jupyter notebook python code.

    from lambeq import DepCCGParser
    from discopy import grammar
    
    parser = DepCCGParser(lang='ja')
    diagram = parser.sentence2diagram('これはテストの文です。')
    grammar.draw(diagram, figsize=(14,3), fontsize=12)
    

    5. Use ansatz in the jupyter notebook python code.

    from lambeq import AtomicType, IQPAnsatz
    
    # Define atomic types
    N = AtomicType.NOUN
    S = AtomicType.SENTENCE
    
    # Convert string diagram to quantum circuit
    ansatz = IQPAnsatz({N: 1, S: 1}, n_layers=2)
    discopy_circuit = ansatz(diagram)
    discopy_circuit.draw(figsize=(15,10))
    

    6. Use pytket in the jupyter notebook python code.

    from pytket.circuit.display import render_circuit_jupyter
    
    tket_circuit = discopy_circuit.to_tk()
    render_circuit_jupyter(tket_circuit)
    
    opened by KentaroAOKI 6
  • <unk> token feature in the forward() function

    Considering the necessity of the <unk> token for never-before-seen entities, how can I implement the <unk> token in the forward function so that the model can calculate probabilities for instances that contain unknown symbols? Based on my understanding and guidance from one of the moderators, I think it is supposed to go in the forward function.

    Could you kindly assist me in implementing this?

    opened by ACE07-Sev 6
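The general idea behind an <unk> token can be sketched independently of any model class: reserve one index for unknown symbols at training time and fall back to it at lookup time. The code below is a minimal pure-Python illustration under that assumption; `build_vocab` and `lookup_indices` are hypothetical helpers, not lambeq functions, and how the indices are consumed inside a model's forward() is left open.

```python
UNK = "<unk>"


def build_vocab(train_symbols):
    """Assign an index to every training symbol, reserving index 0 for UNK."""
    vocab = {UNK: 0}
    for symbol in train_symbols:
        # setdefault leaves existing entries alone, so repeats keep their index.
        vocab.setdefault(symbol, len(vocab))
    return vocab


def lookup_indices(symbols, vocab):
    """Map symbols to indices, sending unseen symbols to the UNK index."""
    unk_index = vocab[UNK]
    return [vocab.get(symbol, unk_index) for symbol in symbols]


vocab = build_vocab(["cat", "runs", "cat"])
indices = lookup_indices(["cat", "dog"], vocab)
print(indices)
```

For the shared UNK weights to be useful they need to be trained, for example by replacing rare training symbols with UNK so that the model sees it during fitting.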
  • inference

    I'm trying to run the Quantum pipeline using the JAX backend. To better show the results (how each sentence was classified into the two categories), is there an example of code that performs inference, as in classic NLP deep learning approaches (i.e. transformer-based approaches or similar)?

    stale 
    opened by nlpirate 6
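Assuming the trained model returns one row of class probabilities per sentence (as in outputs of the form [[0.14 0.86]]), per-sentence predictions can be read off with an argmax. The sketch below uses plain NumPy; `predict_labels` and the class names are illustrative assumptions rather than lambeq API.

```python
import numpy as np


def predict_labels(probs, class_names):
    """Return (class name, probability) for the most likely class of each row."""
    probs = np.asarray(probs, dtype=float)
    best = probs.argmax(axis=1)
    return [(class_names[i], float(row[i])) for i, row in zip(best, probs)]


# Stand-in for model output on two sentences; replace with real predictions.
model_output = [[0.2, 0.8], [0.9, 0.1]]
sentences = ["first sentence", "second sentence"]

for sentence, (label, p) in zip(sentences, predict_labels(model_output, ["class_0", "class_1"])):
    print(f"{sentence!r}: {label} (p={p:.2f})")
```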
  • WebParser error

    Hi, I'm trying to run the Web Parser but, very similar to the issue with the Bobcat Parser, it gives this error:

    Traceback (most recent call last):
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1342, in do_open
        h.request(req.get_method(), req.selector, req.data, headers,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1255, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1301, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1250, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1010, in _send_output
        self.send(msg)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 950, in send
        self.connect()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1424, in connect
        self.sock = self._context.wrap_socket(self.sock,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 500, in wrap_socket
        return self.sslsocket_class._create(
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1040, in _create
        self.do_handshake()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\ssl.py", line 1309, in do_handshake
        self._sslobj.do_handshake()
    ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\Users\elmm\Desktop\CQM\QNLP_test.py", line 6, in <module>
        new_diagram = parser.sentence2diagram('he was overtaken by the depression.')
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\ccg_parser.py", line 227, in sentence2diagram
        return self.sentences2diagrams(
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\ccg_parser.py", line 157, in sentences2diagrams
        trees = self.sentences2trees(sentences,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\web_parser.py", line 159, in sentences2trees
        raise e
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\text2diagram\web_parser.py", line 148, in sentences2trees
        with urlopen(url) as f:
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 214, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 517, in open
        response = self._open(req, data)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 534, in _open
        result = self._call_chain(self.handle_open, protocol, protocol +
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 494, in _call_chain
        result = func(*args)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1385, in https_open
        return self.do_open(http.client.HTTPSConnection, req,
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 1345, in do_open
        raise URLError(err)
    urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1123)>

    Process finished with exit code 1

    I really need to get this working because Bobcat fails in parsing the sentences into diagrams. Can someone please help me with this?

    opened by ACE07-Sev 4
  • Numpy int32 error

    Hi, I ran the Quantum_trainer and have been getting the error below, and I can't seem to find where the issue is.

    Traceback (most recent call last):
      File "C:\Users\elmm\Desktop\CQM\QNLP Depression.py", line 103, in <module>
        trainer.fit(train_dataset, val_dataset, logging_step=12)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\trainer.py", line 365, in fit
        y_hat, loss = self.training_step(batch)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\quantum_trainer.py", line 149, in training_step
        y_hat, loss = self.optimizer.backward(batch)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\spsa_optimizer.py", line 125, in backward
        y0 = self.model(diagrams)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\model.py", line 59, in __call__
        return self.forward(*args, **kwds)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\tket_model.py", line 131, in forward
        return self.get_diagram_output(x)
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\tket_model.py", line 103, in get_diagram_output
        seed=self._randint()
      File "C:\Users\elmm\AppData\Local\Programs\Python\Python39\lib\site-packages\lambeq\training\tket_model.py", line 71, in _randint
        return np.random.randint(low, high)
      File "mtrand.pyx", line 746, in numpy.random.mtrand.RandomState.randint
      File "_bounded_integers.pyx", line 1334, in numpy.random._bounded_integers._rand_int32
    ValueError: low is out of bounds for int32

    opened by ACE07-Sev 4
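The last frames of the traceback show `np.random.randint(low, high)` being dispatched to a 32-bit routine (`_rand_int32`), which is what NumPy 1.x does by default on Windows. Passing an explicit 64-bit dtype avoids depending on the platform default. The sketch below illustrates this; the bounds are an assumption chosen for illustration, since the values lambeq passes are not shown in the report, and `random_seed_int64` is a hypothetical helper.

```python
import numpy as np

# Assumed bounds spanning the signed 64-bit range (illustrative only).
LOW = -(1 << 63)
HIGH = (1 << 63) - 1


def random_seed_int64(low=LOW, high=HIGH):
    """Draw an integer in [low, high) using an explicit 64-bit dtype."""
    return int(np.random.randint(low, high, dtype=np.int64))


seed = random_seed_int64()
print(seed)
```

Without `dtype=np.int64`, the same call succeeds on platforms whose default integer is 64-bit and fails with the ValueError above where it is 32-bit.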
  • Error with Bobcat Parser

    Hi,

    I am getting an error that I find difficult to understand. I am trying to parse the sentence "Senate Advances Bill To Approve Keystone Pipeline Despite Obamas Veto Threat", and can't figure out why it is giving me the error below. Does BobcatParser throw an error if it doesn't recognize a word? I am unclear what I would need to do to fix the sentence.

    from lambeq import BobcatParser

    parser = BobcatParser()
    diagram = parser.sentence2diagram('Senate Advances Bill To Approve Keystone Pipeline Despite Obamas Veto Threat')
    diagram.draw()

    Traceback (most recent call last):
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/text2diagram/bobcat_parser.py", line 382, in sentences2trees
        trees.append(self._build_ccgtree(result[0]))
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/bobcat/parser.py", line 258, in __getitem__
        return self.root[index]
    IndexError: list index out of range

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/spyder_kernels/py3compat.py", line 356, in compat_exec
        exec(code, globals, locals)
      File "/Users/dabeaulieu/Documents/Initiatives/quantum/machine learning/notebooks/qnlp/ankush/Quantum_NLP/testcode.py", line 12, in <module>
        diagram = parser.sentence2diagram('Senate Advances Bill To Approve Keystone Pipeline Despite Obamas Veto Threat')
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/text2diagram/ccg_parser.py", line 231, in sentence2diagram
        return self.sentences2diagrams(
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/text2diagram/ccg_parser.py", line 161, in sentences2diagrams
        trees = self.sentences2trees(sentences,
      File "/Users/dabeaulieu/opt/anaconda3/envs/qcware/lib/python3.9/site-packages/lambeq/text2diagram/bobcat_parser.py", line 387, in sentences2trees
        raise BobcatParseError(' '.join(sent.words))
    BobcatParseError: Bobcat failed to parse 'Senate Advances Bill To Approve Keystone Pipeline Despite Obamas Veto Threat'.

    opened by dancbeaulieu 1
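When a corpus contains sentences the parser cannot handle, one option is to call sentences2diagrams with suppress_exceptions=True, as in the first issue above, so that a failed parse yields None instead of raising. The failed entries then need to be dropped together with their labels. The sketch below shows that filtering step in plain Python; `drop_failed` is a hypothetical helper, and strings stand in for diagrams so that the example runs without lambeq.

```python
def drop_failed(diagrams, labels):
    """Remove entries whose diagram is None, keeping diagrams and labels aligned."""
    kept = [(d, y) for d, y in zip(diagrams, labels) if d is not None]
    kept_diagrams = [d for d, _ in kept]
    kept_labels = [y for _, y in kept]
    return kept_diagrams, kept_labels


# Stand-ins: the second sentence is assumed to have failed to parse.
raw_diagrams = ["diagram_0", None, "diagram_2"]
raw_labels = [1, 0, 1]

diagrams, labels = drop_failed(raw_diagrams, raw_labels)
print(len(diagrams), labels)
```

Filtering diagrams and labels in one pass avoids the silent misalignment that occurs when None entries are removed from only one of the two lists.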
  • Error from_tk

    Discussed in https://github.com/CQCL/lambeq/discussions/49

    Originally posted by JVM1982 October 10, 2022

    Hello.

    Why does the following code not work?

    sentence = 'person runs program .'
    diagram = remove_cups( parser.sentence2diagram( sentence ) )
    circuit = ansatz( diagram )
    print( model( [ circuit ] ) ) # OK #
    print( model( [ from_tk( circuit.to_tk() ) ] ) ) # ERROR #

    [[0.14473685 0.85526315]]
    Unexpected exception formatting exception. Falling back to standard exception

    Traceback (most recent call last):
      File "/home/javier.valera/.local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3378, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "/tmp/ipykernel_16444/2916265434.py", line 9, in <module>
        print( model( [ from_tk( circuit.to_tk() ) ] ) )
      File "/home/javier.valera/.local/lib/python3.8/site-packages/lambeq/training/model.py", line 59, in __call__
        return self.forward(*args, **kwds)
      File "/home/javier.valera/.local/lib/python3.8/site-packages/lambeq/training/tket_model.py", line 131, in forward
        return self.get_diagram_output(x)
      File "/home/javier.valera/.local/lib/python3.8/site-packages/lambeq/training/tket_model.py", line 101, in get_diagram_output
        *[diag_f(*self.weights) for diag_f in lambdified_diagrams],
      File "/home/javier.valera/.local/lib/python3.8/site-packages/lambeq/training/tket_model.py", line 101, in
        *[diag_f(*self.weights) for diag_f in lambdified_diagrams],
      File "/home/javier.valera/.local/lib/python3.8/site-packages/discopy/monoidal.py", line 509, in
        return lambda xs: self.id(self.dom).then((
      File "/home/javier.valera/.local/lib/python3.8/site-packages/discopy/monoidal.py", line 510, in
        self.id(left) @ box.lambdify(*symbols, **kwargs)(*xs)
      File "/home/javier.valera/.local/lib/python3.8/site-packages/discopy/quantum/gates.py", line 321, in
        return lambda *xs: type(self)(c_fn(*xs), distance=self.distance)
      File "/home/javier.valera/.local/lib/python3.8/site-packages/discopy/quantum/gates.py", line 448, in
        return lambda *xs: type(self)(data(*xs))
      File "", line 2, in _lambdifygenerated
        return [email protected]@n.l_0
    NameError: name 'runs__n' is not defined

    Thanks.

    opened by Thommy257 0
  • cfg: other languages compatibility

    I am interested in adapting lambeq (and consequently DisCoCat) to languages other than English, particularly Italian. As far as I understand from the documentation, from a linguistic point of view the framework is based on cfg grammars, a theoretical formalism I know very well. How can I visualize and extract the structures and formalisms of these grammars from the library so that they can be extended or modified?

    opened by nlpirate 0
Releases (0.2.7)
  • 0.2.7 (Oct 11, 2022)

    Added:

    • Added support for Japanese to DepCCGParser. (credit: KentaroAOKI https://github.com/CQCL/lambeq/pull/24)
    • Overhauled the CircuitAnsatz interface, and added three new ansätze.
    • Added helper methods to CCGTree to get the children of a tree. Added a new .tree2diagram method to TreeReader, extracted from TreeReader.sentence2diagram.
    • Added a new TreeReaderMode named HEIGHT.
    • Added new methods to Checkpoint for creating, saving and loading checkpoints for training.
    • Documentation: added a section for how to select the right model and trainer for training.
    • Documentation: added links to glossary terms throughout the documentation.
    • Documentation: added UML class diagrams for the sub-packages in lambeq.

    Changed:

    • Dependencies: bumped the minimum versions of discopy and torch.
    • IQPAnsatz now post-selects in the Hadamard basis.
    • PytorchModel now initialises using xavier_uniform.
    • CCGTree.to_json can now be applied to None, returning None.
    • Several slow imports have been deferred, making lambeq much faster to import for the first time.
    • In CCGRule.infer_rule, direction checks have been made explicit.
    • UnarySwap is now specified to be a unaryBoxConstructor.
    • BobcatParser has been refactored for easier use with external evaluation tools.
    • Documentation: headings have been organised in the tutorials into subsections.
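The xavier_uniform initialisation mentioned in the list above can be illustrated with a minimal sketch. It assumes the standard Xavier/Glorot uniform rule, as used by torch.nn.init.xavier_uniform_: weights are drawn from U(-a, a) with a = gain * sqrt(6 / (fan_in + fan_out)). The function name and the plain-list representation are illustrative only; how PytorchModel applies the rule to its own tensors is not shown here.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, gain=1.0, seed=0):
    """Sample a fan_out x fan_in matrix from U(-a, a), a = gain * sqrt(6 / (fan_in + fan_out))."""
    bound = gain * math.sqrt(6.0 / (fan_in + fan_out))
    rng = random.Random(seed)  # seeded for reproducibility (illustrative choice)
    return [[rng.uniform(-bound, bound) for _ in range(fan_in)]
            for _ in range(fan_out)]


# For fan_in=4 and fan_out=2 the bound is sqrt(6 / 6) = 1.0.
weights = xavier_uniform(fan_in=4, fan_out=2)
bound = math.sqrt(6.0 / (4 + 2))
```

Scaling the sampling range by the layer's fan-in and fan-out is intended to keep the variance of activations roughly constant across layers.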

    Fixed:

    • Fixed how CCGRule.infer_rule assigns a punc + X instance: if the result is X\X the assigned rule is CONJUNCTION, otherwise the rule is REMOVE_PUNCTUATION_LEFT (similarly for punctuation on the right).
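The decision described in this fix can be sketched as a toy function. This is not the lambeq implementation: categories are plain strings, only atomic categories are handled (no bracketing of complex categories), and the function name is hypothetical.

```python
def infer_punctuation_rule(right, result):
    """Toy version of the 'punc + X' case, with categories given as plain strings."""
    # A result of the form X\X (a modifier of X) is treated as conjunction.
    if result == right + "\\" + right:
        return "CONJUNCTION"
    # Otherwise the punctuation mark on the left is simply dropped.
    return "REMOVE_PUNCTUATION_LEFT"


print(infer_punctuation_rule("NP", "NP\\NP"))
print(infer_punctuation_rule("NP", "NP"))
```

The mirrored case for punctuation on the right would follow the same pattern with the arguments swapped and a corresponding right-hand rule name.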

    Removed:

    • Removed unnecessary override of .from_diagrams in NumpyModel.
    • Removed unnecessary kwargs parameters from several constructors.
    • Removed unused special_cases parameter and _ob method from CircuitAnsatz.
  • 0.2.6(Aug 11, 2022)

    Added:

    • A strict pregroups mode to the CLI. With this mode enabled, all swaps are removed from the output string diagrams by changing the ordering of the atomic types, converting them into a valid pregroup form.

    Fixed:

    • Adjusted the behaviour of output normalisation in quantum models. Now, NumpyModel always returns probabilities instead of amplitudes.
    • Removed the prediction from the output of the SPSAOptimizer, which now returns just the loss.
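The difference between amplitudes and probabilities mentioned above can be sketched as follows. The sketch assumes the usual Born rule (squared magnitudes) followed by renormalisation so that the outputs sum to 1; the function name is hypothetical and this is not NumpyModel's actual code.

```python
def amplitudes_to_probabilities(amplitudes):
    """Convert complex amplitudes to probabilities: |a|^2, renormalised to sum to 1."""
    weights = [abs(a) ** 2 for a in amplitudes]
    total = sum(weights)
    if total == 0:
        raise ValueError("all amplitudes are zero; cannot normalise")
    return [w / total for w in weights]


probs = amplitudes_to_probabilities([0.6 + 0j, 0.8j])
```

Renormalising matters when the input state is not already normalised, for example after post-selection.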
  • 0.2.5(Jul 26, 2022)

    Added:

    • Added a "swapping" unary rule box to handle unary rules that change the direction of composition, improving the coverage of the BobcatParser.
    • Added a --version flag to the CLI.
    • Added a make_checkpoint method to all training models.

    Changed:

    • Changed the WebParser so that the online service to use is specified by name rather than by URL.
    • Changed the BobcatParser to only allow one tree per category in a cell, doubling parsing speed without affecting the structure of the parse trees (in most cases).
    • Made the linting of the codebase stricter, enforced by the GitHub action. The flake8 configuration can be viewed in the setup.cfg file.

    Fixed:

    • Fixed the parameter names in CCGRule, where dom and cod had inadvertently been swapped.
  • 0.2.4(Jul 4, 2022)

    Added:

    • Support for using jax as backend of tensornetwork when setting use_jit=True in the NumpyModel. The interface is not affected by this change, but performance of the model is significantly improved.

    Fixed:

    • Fix a bug that caused the BobcatParser and the WebParser to trigger an SSL certificate error on Windows.
    • Fix false positives in assigning conjunction rule using the CCGBankParser. The rule , + X[conj] -> X[conj] is a case of removing left punctuation, but was being assigned conjunction erroneously.
  • 0.2.3(Jun 8, 2022)

    Added:

    • CCGRule: Add symbol method that returns the ASCII symbol of a given CCG rule.
    • CCGTree: Extend deriv method with CCG output. It is now capable of returning standard CCG diagrams.
    • Command-line interface: CCG mode. When enabled, the output will be a string representation of the CCG diagram corresponding to the CCGTree object produced by the parser, instead of DisCoPy diagram or circuit.
    • Documentation: Add a troubleshooting page.

    Changed:

    • Change the behaviour of spiders_reader such that the spiders decompose logarithmically. This change also affects other rewrite rules that use spiders, such as coordination and relative pronouns.
    • Rename AtomicType.PREPOSITION to AtomicType.PREPOSITIONAL_PHRASE.
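One way to read "decompose logarithmically" is that a many-legged spider is built from two-input merges arranged as a balanced tree rather than a linear chain. The sketch below only counts layers under that assumption; it does not use lambeq or DisCoPy, and the function names are hypothetical.

```python
def balanced_merge_depth(n_wires):
    """Layers of pairwise merges needed to join n_wires into one wire (balanced tree)."""
    if n_wires < 1:
        raise ValueError("need at least one wire")
    depth = 0
    while n_wires > 1:
        n_wires = (n_wires + 1) // 2  # pair up wires; an odd one is carried over
        depth += 1
    return depth


def chain_merge_depth(n_wires):
    """Layers needed when wires are merged one at a time in a linear chain."""
    if n_wires < 1:
        raise ValueError("need at least one wire")
    return n_wires - 1


for n in (2, 5, 8):
    print(n, balanced_merge_depth(n), chain_merge_depth(n))
```

For eight wires the balanced arrangement needs three layers, while the chain needs seven.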

    Fixed:

    • Fix a bug that raised a dtype error when using the TketModel on Windows.
    • Fix a bug that caused scalar outputs of circuits without open wires to be normalised when using a QuantumModel.
  • 0.2.2(Apr 24, 2022)

    Added:

    • Add support for Python 3.10.
    • Unify class hierarchies for parsers and readers: CCGParser is now a subclass of Reader and placed in the common package text2diagram. The old packages reader and ccg2discocat are no longer available. Compatibility problems with previous versions should be minimal, since from Release 0.2.0 and onwards all lambeq classes can be imported from the global namespace.
    • Add CurryRewriteRule, which uses map-state duality in order to remove adjoint types from the boxes of a diagram. When used in conjunction with discopy.rigid.Diagram.normal_form(), this removes cups from the diagram, eliminating post-selection.
    • The Bobcat parser now updates automatically when new versions are made available online.
    • Allow customising available root categories for the parser when using the command-line interface.

    Fixed:

    • Update grammar file of Bobcat parser to avoid problems with conflicting unary rules.
  • 0.2.1(Apr 7, 2022)

    Added:

    • A new Checkpoint class that implements pickling and file operations from Trainer and Model.

    Changed:

    • Improvements to the training module, allowing multiple diagrams to be accepted as input to the SPSAOptimizer.
    • Updated documentation, including sub-package structures and class diagrams.
  • 0.2.0(Mar 21, 2022)

    Added:

    • A new state-of-the-art CCG parser, fully integrated with lambeq, which replaces depccg as the default parser of the toolkit. The new Bobcat parser has better performance, simplifies installation, and provides compatibility with Windows (which was not supported due to a depccg conflict). depccg is still supported as an alternative external dependency.
    • A training package, providing a selection of trainers, models, and optimizers that greatly simplify supervised training for most of lambeq's use cases, classical and quantum. The new package adds several new features to lambeq, such as the ability to save to and restore models from checkpoints.
    • Furthermore, the training package uses DisCoPy's tensor network capability to contract tensor diagrams efficiently. In particular, DisCoPy 0.4.1's new unitary and density matrix simulators result in substantially faster training speeds compared to the previous version.
    • A command-line interface, which provides most of lambeq's functionality from the command line. For example, lambeq can now be used as a standard command-line pregroup parser.
    • A web parser class that can send parsing queries to an online API, so that local installation of a parser is not strictly necessary anymore. The web parser is particularly helpful for testing purposes, interactive usage or when a local parser is unavailable, but should not be used for serious experiments.
    • A new lambeq.pregroups package that provides methods for easy creation of pregroup diagrams, removal of cups, and printing of diagrams in text form (i.e. in a terminal).
    • A new TreeReader class that exploits the biclosed structure of CCG grammatical derivations.
    • Three new rewrite rules for relative pronouns and coordination.
    • Tokenisation features have been added in all parsers and readers.
    • Additional generator methods and minor improvements for the CCGBankParser class.

    Changed:

    • Improved and more detailed package structure.
    • Most classes and functions can now be imported from lambeq directly, instead of having to import from the sub-packages.
    • The circuit and tensor modules have been combined into a lambeq.ansatz package. (However, as mentioned above, the classes and functions they define can now be imported directly from lambeq, and this should remain possible in future releases.)
    • Improved documentation and additional tutorials.
  • 0.1.2(Oct 12, 2021)

    Minor changes:

    • Add URLs to setup file

    Fixes:

    • Fix logo link in README
    • Fix missing version when building docs in GitHub action
    • Fix typo to description keyword in setup file
  • 0.1.1(Oct 12, 2021)

    Minor changes:

    • Update install script to use PyPI package
    • Add badges and link to documentation to README file
    • Add lambeq logo and link to GitHub to documentation
    • Allow documentation to automatically get package version
    • Add keywords and classifiers to setup file

    Fixes:

    • Add lambeq.circuit submodule to top-level lambeq module
    • Fix references to license file
  • 0.1.0(Oct 12, 2021)

    Initial release of lambeq. This contains a lot of core material:

    • converting sentences to string diagrams
    • CCG parsing, including reading from CCGBank
    • support for depccg parsing
    • rewriting diagrams
    • ansatze for circuits and tensors, including MPS ansatze
    • support for JAX and PyTorch integration
    • example notebooks and documentation
Owner
Cambridge Quantum
Quantum Software and Technologies
A repo for materials relating to the tutorial of CS-332 NLP

CS-332-NLP A repo for materials relating to the tutorial of CS-332 NLP Contents Tutorial 1: Introduction Corpus Regular expression Tokenization Tutori

Alok singh 9 Feb 15, 2022
Built for cleaning purposes in military institutions

Ferramenta do AL Built for cleaning purposes in military institutions. Installation Requires python = 3.2 pip install -r requirements.txt Usage Exe

0 Aug 13, 2022
Tracking Progress in Natural Language Processing

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Sebastian Ruder 21.2k Dec 30, 2022
Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts

gpt-2-simple A simple Python package that wraps existing model fine-tuning and generation scripts for OpenAI's GPT-2 text generation model (specifical

Max Woolf 3.1k Jan 07, 2023
A paper list of pre-trained language models (PLMs).

Large-scale pre-trained language models (PLMs) such as BERT and GPT have achieved great success and become a milestone in NLP.

RUCAIBox 124 Jan 02, 2023
Python library to make development of portfolio analysis faster and easier

Trafalgar Python library to make development of portfolio analysis faster and easier Installation 🔥 For the moment, Trafalgar is still in beta develo

Santosh Passoubady 641 Jan 01, 2023
Source code for the paper "TearingNet: Point Cloud Autoencoder to Learn Topology-Friendly Representations"

TearingNet: Point Cloud Autoencoder to Learn Topology-Friendly Representations Created by Jiahao Pang, Duanshun Li, and Dong Tian from InterDigital In

InterDigital 21 Dec 29, 2022
GCRC: A Gaokao Chinese Reading Comprehension dataset for interpretable Evaluation

GCRC GCRC: A New Challenging MRC Dataset from Gaokao Chinese for Explainable Eva

Yunxiao Zhao 5 Nov 04, 2022
The FinQA dataset from paper: FinQA: A Dataset of Numerical Reasoning over Financial Data

Data and code for EMNLP 2021 paper "FinQA: A Dataset of Numerical Reasoning over Financial Data"

Zhiyu Chen 114 Dec 29, 2022
Natural Language Processing

NLP Natural Language Processing apps Multilingual_NLP.py start #This script is demonstration of Mul

Ritesh Sharma 1 Oct 31, 2021
NLP, before and after spaCy

textacy: NLP, before and after spaCy textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the hig

Chartbeat Labs Projects 2k Jan 04, 2023
⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x using fastT5.

Reduce T5 model size by 3X and increase the inference speed up to 5X. Install Usage Details Functionalities Benchmarks Onnx model Quantized onnx model

Kiran R 399 Jan 05, 2023
Composed Image Retrieval using Pretrained LANguage Transformers (CIRPLANT)

CIRPLANT This repository contains the code and pre-trained models for Composed Image Retrieval using Pretrained LANguage Transformers (CIRPLANT) For d

Zheyuan (David) Liu 29 Nov 17, 2022
Code for paper "Role-oriented Network Embedding Based on Adversarial Learning between Higher-order and Local Features"

Role-oriented Network Embedding Based on Adversarial Learning between Higher-order and Local Features Train python main.py --dataset brazil-flights C

wang zhang 0 Jun 28, 2022
PyABSA - Open & Efficient for Framework for Aspect-based Sentiment Analysis

PyABSA - Open & Efficient for Framework for Aspect-based Sentiment Analysis

YangHeng 567 Jan 07, 2023
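Classical Chinese poetry generation based on pytorch_rnn

pytorch_peot_rnn Classical Chinese poetry generation based on pytorch_rnn Notes config.py contains the parameters for training, testing, and prediction; after changing them, run: python main.py Prediction results if config.do_predict: result = trainer.generate('丽日照残春')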

西西嘛呦 3 May 26, 2022
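An AI that plays Honor of Kings, built with Resnet101 + GPT

An AI that plays Honor of Kings, built with resnet101 plus GPT on the PyTorch framework. The model in this source code mainly uses the decoder part of the SamLynnEvans Transformer source code, together with PyTorch's bundled pretrained model "resnet101-5d3b4d8f.pth"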

冯泉荔 2.2k Jan 03, 2023
Nested Named Entity Recognition

Nested Named Entity Recognition Training Dataset: CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark url: https://tianchi.aliyun.

8 Dec 25, 2022
DAGAN - Dual Attention GANs for Semantic Image Synthesis

Contents Semantic Image Synthesis with DAGAN Installation Dataset Preparation Generating Images Using Pretrained Model Train and Test New Models Evalu

Hao Tang 104 Oct 08, 2022
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

Deepvoice3_pytorch PyTorch implementation of convolutional networks-based text-to-speech synthesis models: arXiv:1710.07654: Deep Voice 3: Scaling Tex

Ryuichi Yamamoto 1.8k Dec 30, 2022