Huon Wilson huonw Sydney, Australia http://huonw.github.io/ @data61. Formerly: Swift frontend @apple, core team @rust-lang.

issue closed stellargraph/stellargraph

ImportError: cannot import name 'rng_integers' from 'scipy._lib._util'

Describe the bug

Hi! Thank you for this great library. I'm trying to install StellarGraph on Ubuntu 18.04. However, after successfully installing StellarGraph via conda (following these instructions), when I try to import it, this error occurs:

ImportError: cannot import name 'rng_integers' from 'scipy._lib._util' (/home/zihan/anaconda3/envs/TF2/lib/python3.7/site-packages/scipy/_lib/_util.py)

To rule out a problem with scipy itself, I reinstalled it, and confirmed it is installed in this environment:

conda list -n TF2

scipy                     1.4.1                    pypi_0    pypi
stellargraph              1.2.1                      py_0    stellargraph
tensorflow                2.2.0           mkl_py37h6e9ce2d_0  

Hence I'm quite confused about why this problem happens. Could you please help me with this issue? Thank you in advance!

Environment

Operating system: Ubuntu 18.04

Python version: 3.7.7

Package versions: stellargraph==1.2.1, tensorflow==2.2.0


More detailed error description:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-11-855dad28d1e4> in <module>
----> 1 import stellargraph as sg
      2 #from stellargraph.data import EdgeSplitter
      3 #from stellargraph.mapper import FullBatchLinkGenerator
      4 #from stellargraph.layer import GCN, LinkEmbedding
      5 # import scipy

~/anaconda3/envs/TF2/lib/python3.7/site-packages/stellargraph/__init__.py in <module>
     37 
     38 # Import modules
---> 39 from stellargraph import (
     40     data,
     41     calibration,

~/anaconda3/envs/TF2/lib/python3.7/site-packages/stellargraph/data/__init__.py in <module>
     21 
     22 # Expose the stellargraph.data classes:
---> 23 from .explorer import *
     24 from .edge_splitter import *
     25 from .node_splitter import *

~/anaconda3/envs/TF2/lib/python3.7/site-packages/stellargraph/data/explorer.py in <module>
     29 import warnings
     30 from collections import defaultdict, deque
---> 31 from scipy import stats
     32 from scipy.special import softmax
     33 

~/anaconda3/envs/TF2/lib/python3.7/site-packages/scipy/stats/__init__.py in <module>
    386 
    387 """
--> 388 from .stats import *
    389 from .distributions import *
    390 from .morestats import *

~/anaconda3/envs/TF2/lib/python3.7/site-packages/scipy/stats/stats.py in <module>
    174 from scipy.spatial.distance import cdist
    175 from scipy.ndimage import measurements
--> 176 from scipy._lib._util import (_lazywhere, check_random_state, MapWrapper,
    177                               rng_integers)
    178 import scipy.special as special

ImportError: cannot import name 'rng_integers' from 'scipy._lib._util' (/home/zihan/anaconda3/envs/TF2/lib/python3.7/site-packages/scipy/_lib/_util.py)

closed time in a day

ZihanChen1995

issue comment stellargraph/stellargraph

ImportError: cannot import name 'rng_integers' from 'scipy._lib._util'

That's great that it now works. Thanks for filing an issue, and good luck! I'll close this issue now, but let us know if there's anything more.

ZihanChen1995

comment created time in a day

issue comment stellargraph/stellargraph

ImportError: cannot import name 'rng_integers' from 'scipy._lib._util'

It looks like the Ubuntu pip (on all Python versions) and Conda (Python 3.8) testing passed, using Scipy 1.4.1. For example, line 748 of https://github.com/stellargraph/stellargraph/pull/1785/checks?check_run_id=964620995 is scipy: 1.4.1-py37h0b6359f_0. (Windows failed, but I think that was due to a different problem.)

As an additional check, I've switched Conda testing to use Python 3.7 in that PR. That also passed.

Thus, my remaining thought is that somehow scipy is failing to install properly in your environment. Could you try just importing it without going via StellarGraph? For example: python -c 'from scipy import stats', using the appropriate python command/binary for your Conda Python environment.
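Concretely, something like the following, where the interpreter path is taken from the traceback in your report (adjust if your environment lives elsewhere):

# Use the python from the affected Conda environment; if this also fails,
# the scipy installation itself is broken, independent of StellarGraph.
~/anaconda3/envs/TF2/bin/python -c 'from scipy import stats'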

ZihanChen1995

comment created time in a day

push event stellargraph/stellargraph

Huon Wilson

commit sha fa4d8f743eb856b2e6d383abb7c68a3bb7d798bb

Try python 3.7

view details

push time in a day

issue comment stellargraph/stellargraph

ImportError: cannot import name 'rng_integers' from 'scipy._lib._util'

That's very strange! It looks like a scipy submodule is failing to import another scipy internal submodule. This suggests to me that the problem isn't directly related to StellarGraph, but is instead a corrupted installation of scipy. StellarGraph runs successfully on my macOS machine with scipy 1.4.1 (installed with pip), and on Ubuntu CI with scipy 1.5.2 (pip) and scipy 1.5.0 (conda).

To help narrow down potential issues here, I've opened https://github.com/stellargraph/stellargraph/pull/1785, which pins the version of scipy to 1.4.1 exactly, and thus will run through the full test suite and all of the demo notebooks with that version. Once CI has finished on it, hopefully we'll have more insight into the issue.

ZihanChen1995

comment created time in a day

push event stellargraph/stellargraph

Huon Wilson

commit sha 03370a021b997cf3988f51bbc719b06ae730044b

Pin conda dep

view details

push time in a day

PR opened stellargraph/stellargraph

Use scipy 1.4.1 exactly

This is a test for #1784, which reports some issues that may be connected to scipy==1.4.1.

+1 -1

0 comment

1 changed file

pr created time in a day

create branch stellargraph/stellargraph

branch : test/issue-1784-scipy-1.4.1

created branch time in a day

issue comment stellargraph/stellargraph

Supervised graph classification with GCN produce different output for the same input

There are several sources of randomness in a deep learning model, with the train/test split being just one; another is the initial values of the trainable parameters. Most sources of randomness can be controlled via tensorflow.random.set_seed and stellargraph.random.set_seed, which should make the model outputs far more reproducible.
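For example, a minimal sketch of fixing both seeds before building and training the model (the exact seed value is arbitrary):

import tensorflow as tf
import stellargraph as sg

tf.random.set_seed(1234)  # TensorFlow randomness: weight initialisation, dropout, shuffling
sg.random.set_seed(1234)  # StellarGraph randomness: e.g. random walks and sampling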

evan42mr

comment created time in 4 days

issue opened plotly/plotly.js

Support Choropleth's locationmode in Choroplethmapbox

A choropleth trace supports specifying locations as high-level string names/IDs, via the locations and locationmode keys. A choroplethmapbox trace seems to not support this, and thus requires specifying any geometries as a manual GeoJSON file. It would be nice if choroplethmapbox supported the same automatic "inference" of appropriate geometries too.
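To make the request concrete, here is a sketch in plotly.py; the choropleth call works today, while the choroplethmapbox call is the hypothetical requested API, not something that currently works:

import plotly.graph_objects as go

# Works today: a choropleth trace infers geometries from high-level string names.
go.Figure(go.Choropleth(
    locations=["Australia", "New Zealand"], locationmode="country names", z=[1, 2]
))

# The request: the same inference for the mapbox variant, instead of a
# mandatory geojson= argument (hypothetical, not currently supported):
# go.Choroplethmapbox(locations=["Australia", "New Zealand"],
#                     locationmode="country names", z=[1, 2])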

Related issues:

  • #3988 mentions the possibility of this, but it seems there's no progress:

locationmode. The current behavior would correspond to locationmode: 'geojson-id'. Some users might expect to identify GeoJSON features using their properties.name (this would correspond to e.g. locationmode: 'geojson-prop-name'). Moreover, we could also support the locationmode values found in geo traces: 'ISO-3', 'USA-states', 'country names'.

  • #4154 was, I think, about generalising the GeoJSON support in choroplethmapbox, but not about adding the no-GeoJSON mode proposed here

  • #4267 was the opposite, I think: adding support for GeoJSON to choropleth

created time in 6 days

started ankitrohatgi/WebPlotDigitizer

started time in 7 days

pull request comment stellargraph/stellargraph

Test sparse GAT properly

Thanks for the review.

Status: this depends on #1779 so that the test of saving and loading the (sparse) model can work (with TF 2.3), and that seems to be waiting on TF 2.3 being available in conda (https://github.com/stellargraph/stellargraph/pull/1779#issuecomment-664747491).

huonw

comment created time in 14 days

pull request comment stellargraph/stellargraph

Un-xfail full-batch saving tests fixed by TensorFlow 2.3

This is blocked by TF 2.3.0 being available on Conda.

huonw

comment created time in 14 days

PR opened stellargraph/stellargraph

Test sparse GAT properly

This replaces the TestGATsparse function with a class that has the correct subclass behaviour to pick up the tests from the Test_GAT base class. This requires adjusting some tests, but otherwise the code seems to work.

Fixes: #1780

+12 -7

0 comment

1 changed file

pr created time in 14 days

create branch stellargraph/stellargraph

branch : bugfix/1780-test-sparse-gat

created branch time in 14 days

issue opened stellargraph/stellargraph

Sparse GAT isn't tested properly

Describe the bug

GAT tries to use subclassing to share tests between dense and sparse models, but accidentally defines a function instead of a subclass:

https://github.com/stellargraph/stellargraph/blob/ec132647e5cf43ff683e3f2e72e18ac6daa98202/tests/layer/test_graph_attention.py#L625-L627

To Reproduce

  1. Look at test_graph_attention.py

Observed behavior

The sparse tests define a new function that is ignored.

Expected behavior

The sparse tests should declare a subclass, like class TestGATsparse(Test_GAT):
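That is, roughly (a sketch of the two forms; the real test bodies live in test_graph_attention.py):

# Current (buggy): a def statement creates a function, so pytest collects
# nothing new, and the inherited Test_GAT tests never run against the
# sparse model.
def TestGATsparse(Test_GAT):
    ...

# Expected: a class statement, so all Test_GAT tests are inherited and run.
class TestGATsparse(Test_GAT):
    ...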

Environment

Operating system: all

Python version: all

Package versions: StellarGraph ec132647e5cf43ff683e3f2e72e18ac6daa98202

Additional context

N/A

created time in 14 days

PR opened stellargraph/stellargraph

Un-xfail full-batch saving tests fixed by TensorFlow 2.3

TensorFlow 2.3.0 was released within the last day. This release includes the fix for https://github.com/tensorflow/tensorflow/issues/38465 (of which we filed a duplicate at https://github.com/tensorflow/tensorflow/issues/40373), which is the underlying issue behind #1251.

As such, we can remove the xfail markings from tests involving the saving of full-batch models like APPNP, GCN and RGCN.

+3 -8

0 comment

3 changed files

pr created time in 14 days

create branch stellargraph/stellargraph

branch : bugfix/1251-test-full-batch-saving

created branch time in 14 days

Pull request review comment tensorflow/tensorflow

Try to fix bce loss check

 def compile(self,
       self.optimizer = self._get_optimizer(optimizer)
       self.compiled_loss = compile_utils.LossesContainer(
           loss, loss_weights, output_names=self.output_names)
-      self.compiled_metrics = compile_utils.MetricsContainer(
-          metrics, weighted_metrics, output_names=self.output_names)
+      mc = compile_utils.MetricsContainer(metrics,
+                                          weighted_metrics,
+                                          loss=loss,

I reran the https://colab.research.google.com/gist/huonw/4ac8796f598bc742b476fd8e36a1f866/gcn-link-prediction.ipynb notebook and it seemed to work (other than the change from model.compiled_metrics._loss to model.compiled_metrics._loss_container). The cell that tests various ways to specify binary crossentropy now has output:

'bce' binary_accuracy
'binary_crossentropy' binary_accuracy
<function binary_crossentropy at 0x7f94729b1950> binary_accuracy
<tensorflow.python.keras.losses.BinaryCrossentropy object at 0x7f94629c13c8> binary_accuracy
<function binary_crossentropy at 0x7f94729b1950> binary_accuracy

And the accuracy and validation accuracy metrics now look correct too 👍

bhack

comment created time in 14 days

Pull request review comment tensorflow/tensorflow

Try to fix bce loss check

 def test_accuracy(self):
     self.assertEqual(metric_container.metrics[0]._fn,
                      metrics_mod.binary_accuracy)

+    loss = losses_mod.BinaryCrossentropy()
+    metric_container = compile_utils.MetricsContainer('accuracy', loss=loss)

Ah, good trick. I applied it on Colab: https://colab.research.google.com/gist/huonw/4ac8796f598bc742b476fd8e36a1f866/gcn-link-prediction.ipynb . It didn't work.

Discussion (numbers [n] refer to the cell execution count on the rendered form):

  1. I applied the patch with curl ... | patch [2]
  2. I confirmed the patch was applied with grep [3]
  3. I ran the example from #41361 with many ways to specify the loss [4]. For example, "bce" and referring to the keras.losses.binary_crossentropy function directly (the example was extended to include printing the loss value used). Output here for convenience:
    'bce' binary_accuracy
    'binary_crossentropy' categorical_accuracy
    <function binary_crossentropy at 0x7ff1b8db0950> categorical_accuracy
    <tensorflow.python.keras.losses.BinaryCrossentropy object at 0x7ff1a8dc1208> binary_accuracy
    <function binary_crossentropy at 0x7ff1b8db0950> categorical_accuracy
    
    Note that only loss="bce" and loss=tf.keras.losses.BinaryCrossentropy() work, loss="binary_crossentropy" and the two forms of loss=tf.keras.losses.binary_crossentropy do not.
  4. I ran the rest of the notebook from https://github.com/stellargraph/stellargraph/issues/1766 without changes to build and compile a Keras model [5]-[19] (using loss=tf.keras.losses.binary_crossentropy, as in the original)
  5. I inspected the Keras model to confirm that it's using the MetricsContainer class with the loss parameter to its constructor and the _loss attribute [20], [21]
  6. I evaluated and trained the model (also without changes from the original notebook), observing that the accuracy was the incorrect 0 value [22]-[25]
bhack

comment created time in 15 days

Pull request review comment tensorflow/tensorflow

Try to fix bce loss check

 def compile(self,
       self.optimizer = self._get_optimizer(optimizer)
       self.compiled_loss = compile_utils.LossesContainer(
           loss, loss_weights, output_names=self.output_names)
-      self.compiled_metrics = compile_utils.MetricsContainer(
-          metrics, weighted_metrics, output_names=self.output_names)
+      mc = compile_utils.MetricsContainer(metrics,
+                                          weighted_metrics,
+                                          loss=loss,

Given my other comment above, maybe this should be using the compiled/normalized self.compiled_loss value(s) rather than the generic loss one that can be in 'any' form? I don't know what a LossesContainer contains in particular, but I'm comparing to the TF 2.1 code, where there didn't need to be a special case for string forms like "bce":

https://github.com/tensorflow/tensorflow/blob/3ffdb91f122f556a74a6e1efd2469bfe1063cb5c/tensorflow/python/keras/engine/training_utils.py#L1114-L1121

bhack

comment created time in 15 days

Pull request review comment tensorflow/tensorflow

Try to fix bce loss check

 def test_accuracy(self):
     self.assertEqual(metric_container.metrics[0]._fn,
                      metrics_mod.binary_accuracy)

+    loss = losses_mod.BinaryCrossentropy()
+    metric_container = compile_utils.MetricsContainer('accuracy', loss=loss)

Great, thanks! I'd be happy to say that this PR resolves the issue from our side then.

Unfortunately, I am not in a position to build TensorFlow from source, so unless there's an easy way for me to install a package from CI, I can't confirm on our real code myself. However, this issue was flagged in public code: in particular, https://github.com/stellargraph/stellargraph/issues/1766 found a Jupyter notebook where the model reported accuracy metrics of 0 with TF 2.2 (but not TF 2.1). This can be run locally using something like:

  1. Download the notebook from https://stellargraph.readthedocs.io/en/v1.2.1/demos/link-prediction/gcn-link-prediction.ipynb
  2. Install the necessary libraries: pip install stellargraph[demos]==1.2.1
  3. Run the notebook and see whether the fit call reports an accuracy of constant zero (bad) or decreasing non-zero (good)
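Those steps as shell commands (a sketch; the notebook can equally be downloaded and run interactively in Jupyter):

curl -LO https://stellargraph.readthedocs.io/en/v1.2.1/demos/link-prediction/gcn-link-prediction.ipynb
pip install 'stellargraph[demos]==1.2.1'
jupyter nbconvert --to notebook --execute gcn-link-prediction.ipynb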

This is a bit fiddly, so I'm not expecting you to do so. 😄

bhack

comment created time in 15 days

issue comment stellargraph/stellargraph

Node classification with Node2Vec "Data Splitting" section inconsistent

You're entirely correct that the text and code are inconsistent here.

It looks like this was correct and consistent in the original form of the notebook (added in 7d14eb28ab3254aae2c8c64e70a9fc2357bb83af) https://github.com/stellargraph/stellargraph/blob/7d14eb28ab3254aae2c8c64e70a9fc2357bb83af/demos/node-classification/stellargraph-node2vec-node-classification.ipynb but was changed in https://github.com/stellargraph/stellargraph/commit/7a6742213e0e8765bc6b712f9783a13a80199abb with apparently no explanation.

There's a few options:

  1. update the text to match the code (that is, "We use 10% of the data for training and the remaining 90% for testing as a hold out test set"). For example: https://stellargraph.readthedocs.io/en/stable/demos/node-classification/keras-node2vec-node-classification.html#Data-Splitting does 10-90 'properly'
  2. update the code to match the text (that is, what you state). For example: https://stellargraph.readthedocs.io/en/stable/demos/node-classification/node2vec-weighted-node-classification.html#Comparing-the-accuracy-of-node-classification-for-weighted-(weight-==1)-and-unweighted-random-walks.
  3. switch both to something completely different. For example: https://stellargraph.readthedocs.io/en/stable/demos/node-classification/attri2vec-node-classification.html#Data-Splitting does a 20-80 split

I'm inclined towards 2, potentially including updating those other examples that do splits other than 75-25, because I suspect at least one of them inherited the 10-90 split from this inconsistent notebook.

I'm happy to open a PR for this sometime later today or tomorrow, unless someone else gets to it first.

ankitatalwar

comment created time in 15 days

Pull request review comment tensorflow/tensorflow

Try to fix bce loss check

 def test_accuracy(self):
     self.assertEqual(metric_container.metrics[0]._fn,
                      metrics_mod.binary_accuracy)

+    loss = losses_mod.BinaryCrossentropy()
+    metric_container = compile_utils.MetricsContainer('accuracy', loss=loss)

Looks like it, but there is a reduced test case in #41361 that I cut down from our real code and so is the "true" measure of success. Does that test case match the behaviour of 2.1 with this PR?

bhack

comment created time in 15 days

fork huonw/terriajs

A library for building rich, web-based geospatial data explorers.

https://terria.io

fork in 18 days

pull request comment stellargraph/stellargraph

Run CI workflow at 00:00 UTC every day

This ran (and passed) at https://github.com/stellargraph/stellargraph/actions/runs/177796066

huonw

comment created time in 20 days

issue comment stellargraph/stellargraph

Use directed graph in GCN_LSTM

Thanks for the context. The current demo is indeed focused on predicting traffic flow at nodes, representing traffic sensors. It's definitely the case that another way to phrase this sort of problem is to predict the attributes of edges between points of interest like cities (https://github.com/stellargraph/stellargraph/issues/1630 is slightly related).

Is it possible to change it to a pair of nodes level?

It's not built in, and edge features (like having a time series of traffic flows along an edge) aren't fully supported yet (#1326).

So I will have N*N rows of flow value for each timestep instead of N rows of target value (in your case is speed of each node). Does that make sense to you?

Yeah, that makes sense. The way I would approach it (you may have already thought of this and tried it 😄) would be to have a multivariate time series associated with each node as its features, representing the flow to each of the other nodes. This means a feature matrix of shape N × T × N, where each of the N "rows" (of shape T × N) represents the T observations of flows to the N nodes, for a single node. The model is then making predictions of size N × N.

This will work best when N is relatively small because of the O(N^2) size of the features involved.
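As a sketch of the shapes involved, with random stand-in data (flows is a hypothetical array of observed flows, not something from the library):

import numpy as np

N, T = 10, 100                   # number of nodes, number of observed timesteps
flows = np.random.rand(T, N, N)  # flows[t, i, j]: flow from node i to node j at time t

# One multivariate series per node: features[i] is the T x N series of
# outgoing flows from node i, giving the N x T x N feature matrix above.
features = flows.transpose(1, 0, 2)
assert features.shape == (N, T, N)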

In this setup, the adjacency matrix is just capturing some sort of "closeness" between nodes: e.g. if there's an edge A -> B, then we're suggesting to the model that data from A will help with the predictions at B. Notably, it's not necessary that the matrix is directed in this case, because I imagine that in many cases data from B would also help with predictions at A.

What do you think?

demiding0729

comment created time in 20 days

issue comment stellargraph/stellargraph

Allow different output dimensions in GCN-LSTM

they might be predicting a time series with a different number of variates to the inputs, e.g. using a multivariate series of observations to predict non-observed variates (both univariate or multivariate)

This in particular could be used for link prediction, where sequences on nodes (i.e. num_nodes input sequences) are used to make predictions for sequences on edges (i.e. num_edges outputs).

huonw

comment created time in 20 days

issue comment stellargraph/stellargraph

Representing a scene graph using StellarDiGraph for caption generation (GCN-LSTM)

Thanks for all the info.

It looks like a relatively complicated model, so I'd recommend that we simplify progressively to understand where the failure is. In particular, we can try to narrow down whether there's no signal to learn from, or (more likely) which part of the model is causing issues. Here are some ideas:

  1. Is it the directedness? Use an undirected StellarGraph, to see whether the directed edges are causing the problems (this will of course lose information, but there would hopefully still be signal in the dataset)
  2. Is it the connection between the graph classification model and the later layers? Look at the output of the graph classification model separately; one way to do this would be to train a graph classification model unsupervised and use the resulting embeddings in two ways:
    1. look at the embeddings themselves, to understand if there's any signal in them
    2. train a classifier/LSTM on top of those embeddings, to work out if it's the end-to-end training that's causing a problem (if this approach does work better, one could also try semi-supervised training, where the graph classification model is first trained without supervision and then the full LSTM model is fine-tuned end-to-end, similar to https://stellargraph.readthedocs.io/en/stable/demos/node-classification/gcn-deep-graph-infomax-fine-tuning-node-classification.html )
  3. Is it the graph classification layer? Use something other than the graph classification layer; for instance, ignore the edges and just pool the node features for each graph (with a manual collection of layers)
  4. Is it the other layers? Simplify the layers after the graph classification, to work out if the graph classification layer is capturing the signal appropriately and that's getting lost in the later layers; for instance:
    1. do binary classification (for instance, "does the caption contain word X?" for some particular X) to work out whether there's much signal in the data at all, e.g. instead of the layers from Embedding, just do predictions = Dense(units=1, activation="sigmoid")(predictions) on the output of the 32-unit dense layer
    2. generalising that, generate a single word caption, like Dense(units=t_vocabsize, activation="softmax")
    3. if all of these work, then I guess one would have to progressively re-introduce the Embedding, SpatialDropout1D, ... Activation layers to work out where the failure is

I'm not using the GCN-LSTM model given in the StellarGraph documentation

Ah, okay, most of my comment was thus just a misunderstanding about what you were referring to. We're on the same page now. 👍

medea-learner

comment created time in 20 days

push event stellargraph/stellargraph

Huon Wilson

commit sha ec132647e5cf43ff683e3f2e72e18ac6daa98202

Run CI workflow at 00:00 UTC every day (#1774) We aren't running CI as often these days (since there's less development/contributions), which means if a dependency version change breaks something, we may not realise this for a while. This means that (a) the code may be broken for longer, and (b) may make it hard to work out what went wrong, since there will likely be some unrelated changes to dependency versions. Running CI regularly ensures that we at least only have to determine the dependency versions that change over the course of a single day (easier with #1712). Reference docs: https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#onschedule

view details

push time in 21 days

delete branch stellargraph/stellargraph

delete branch : feature/actions-cron

delete time in 21 days

PR merged stellargraph/stellargraph

Run CI workflow at 00:00 UTC every day

We aren't running CI as often these days (since there's less development/contributions), which means if a dependency version change breaks something, we may not realise this for a while. This means that (a) the code may be broken for longer, and (b) may make it hard to work out what went wrong, since there will likely be some unrelated changes to dependency versions. Running CI regularly ensures that we at least only have to determine the dependency versions that change over the course of a single day (easier with #1712).

Reference docs: https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#onschedule

+2 -0

1 comment

1 changed file

huonw

pr closed time in 21 days

started konstantinstadler/country_converter

started time in 21 days

issue comment stellargraph/stellargraph

Representing a scene graph using StellarDiGraph for caption generation (GCN-LSTM)

Hi, thanks for getting in touch.

As suggested on the linked thread, whole-graph summarisation is best served by using a graph classification model to encode the graph as a vector, and then generating sequences from that vector. GCN-LSTM is designed for prediction on a graph with a time series associated with each node. See https://stellargraph.readthedocs.io/en/stable/demos/graph-classification/index.html for info about graph classification.

Just to make sure we're on the same page: are you trying to use GCN-LSTM with a directed graph/StellarDiGraph, or are you using one of the graph classification models?

keep in mind that I tried one representation but my model didn't learn from (i.e., both train accuracy and val accuracy remain almost the same whatever the number of epochs and the layer used and ...

We'll be able to be much more helpful if you give us a little more info about this. It sounds like you tried something but it failed to learn? Could you say more about what it was? For instance, maybe there's a misconfiguration in the model.

medea-learner

comment created time in 22 days

push event stellargraph/stellargraph

Huon Wilson

commit sha 32e2a7808a344d8776d1e6cddac96eaf88f582f6

Point to GitHub Discussions, not Discourse (#1771) See #1773 for more info. This also updates the email addresses to a more appropriate one for the current status of the library. Rendered README: https://github.com/stellargraph/stellargraph/blob/474865d7ea85f876279e6f1f9889668957ab0c69/README.md Fixes: #1765

view details

push time in a month

delete branch stellargraph/stellargraph

delete branch : feature/1765-discussions

delete time in a month

PR merged stellargraph/stellargraph

Point to GitHub Discussions, not Discourse

See #1773 for more info.

This also updates the email addresses to a more appropriate one for the current status of the library.

Rendered README: https://github.com/stellargraph/stellargraph/blob/474865d7ea85f876279e6f1f9889668957ab0c69/README.md

Fixes: #1765

+4 -4

1 comment

2 changed files

huonw

pr closed time in a month

issue closed stellargraph/stellargraph

Potentially migrating from Discourse to GitHub Discussions

Description

Unfortunately, we're considering migrating away from https://community.stellargraph.io/. It'd be good to preserve the discussion there in some form, because there are useful insights in it.

closed time in a month

huonw

issue comment stellargraph/stellargraph

Use directed graph in GCN_LSTM

Hi @demiding0729, thanks for getting in touch. Could you say how you're using the GCN-LSTM model? Are you passing in an adjacency matrix directly or are you using SlidingFeaturesNodeGenerator?

If you're passing in the adjacency matrix directly, I think it should be fine to pass in a directed (asymmetrical) adjacency matrix. This can be created using graph.to_adjacency_matrix().todense(), if graph is a StellarDiGraph (docs for the to_adjacency_matrix method).
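A minimal sketch of that conversion, assuming graph is a stellargraph.StellarDiGraph built from your data:

# to_adjacency_matrix returns a scipy sparse matrix; .todense() converts it
# to the dense form needed when passing an adjacency matrix in directly.
adj = graph.to_adjacency_matrix().todense()

# For a directed graph, this matrix is generally asymmetric:
# adj[i, j] reflects the edge i -> j, independently of j -> i.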

demiding0729

comment created time in a month

PR opened stellargraph/stellargraph

Run CI workflow at 00:00 UTC every day

We aren't running CI as often these days (since there's less development/contribution activity), which means that if a dependency upgrade breaks something, we may not realise it for a while. This means that (a) the code may be broken for longer, and (b) it may be hard to work out what went wrong, since there will likely be some unrelated changes. Running CI regularly ensures that we at least only have to determine the dependency versions that change over the course of a single day (easier with #1712).

+2 -0

0 comment

1 changed file

pr created time in a month

create branch stellargraph/stellargraph

branch : feature/actions-cron

created branch time in a month

push event stellargraph/stellargraph

Huon Wilson

commit sha 474865d7ea85f876279e6f1f9889668957ab0c69

Update email

view details

push time in a month

create branch stellargraph/stellargraph

branch : feature/1765-discussions

created branch time in a month

push event stellargraph/stellargraph

Huon Wilson

commit sha 6def6aa72c077286b6fa743dc8f89a1f28ac49f9

Avoid TF 2.2 bug via 'binary_accuracy' metric in GCN link prediction (#1770) This works around a change in behaviour of `metrics=["acc"]` in TensorFlow 2.2 (https://github.com/tensorflow/tensorflow/issues/41361) by replacing that with `metrics=["binary_accuracy"]`. All other changes here are just from rerunning the notebook. Previously, in TF 2.0 and 2.1, the `acc`/`accuracy` metric was being translated into `binary_accuracy`, because the `compile` method was noticing that the loss function was `binary_crossentropy`. In TF 2.2, the inference for `acc`/`accuracy` no longer looks at the loss function, and instead just relies on the last dimension of the prediction tensor: if the dimension is equal to 1, it uses binary accuracy, otherwise categorical accuracy. (See #1766 and https://github.com/tensorflow/tensorflow/issues/41361 for more details.) This GCN link prediction demo uses the binary crossentropy loss function, but has a `Reshape((-1,))` layer, that removes the natural last dimension of size 1, leaving a predictions tensor with last dimension of size 2708 (number of nodes in the graph). This means that the inference of the true metric for the `acc` name was changing from the correct `binary_accuracy` in TF 2.1 to the incorrect `categorical_accuracy` in TF 2.2. (Another option would be removing the `Reshape((-1,))` layer, leaving the last dimension of size 1, but I decided that this could have large knock-on consequences.) (I looked for other instances of `Reshape((-1,))` and `Flatten()` layers that hit this same problem, but it seems like the others are fine. For instance, https://github.com/stellargraph/stellargraph/blob/f0590c0bed1a482f63ef1cd10e42fdba62ff8685/demos/link-prediction/homogeneous-comparison-link-prediction.ipynb uses `metrics=["binary_accuracy"]` directly.) Fixes: #1766

view details

push time in a month

delete branch stellargraph/stellargraph

delete branch : bugfix/1766-gcn-lp

delete time in a month

PR merged stellargraph/stellargraph

Avoid TF 2.2 bug via 'binary_accuracy' metric in GCN link prediction

This works around a change in behaviour of metrics=["acc"] in TensorFlow 2.2 (https://github.com/tensorflow/tensorflow/issues/41361) by replacing that with metrics=["binary_accuracy"]. All other changes here are just from rerunning the notebook.

Previously, in TF 2.0 and 2.1, the acc/accuracy metric was being translated into binary_accuracy, because the compile method was noticing that the loss function was binary_crossentropy. In TF 2.2, the inference for acc/accuracy no longer looks at the loss function, and instead just relies on the last dimension of the prediction tensor: if the dimension is equal to 1, it uses binary accuracy, otherwise categorical accuracy. (See #1766 and https://github.com/tensorflow/tensorflow/issues/41361 for more details.)

This GCN link prediction demo uses the binary crossentropy loss function, but has a Reshape((-1,)) layer that removes the natural last dimension of size 1, leaving a predictions tensor with a last dimension of size 2708 (the number of nodes in the graph). This means that the inference of the true metric for the acc name changed from the correct binary_accuracy in TF 2.1 to the incorrect categorical_accuracy in TF 2.2.

(Another option would be removing the Reshape((-1,)) layer, leaving the last dimension of size 1, but I decided that this could have large knock-on consequences.)

(I looked for other instances of Reshape((-1,)) and Flatten() layers that hit this same problem, but it seems like the others are fine. For instance, https://github.com/stellargraph/stellargraph/blob/f0590c0bed1a482f63ef1cd10e42fdba62ff8685/demos/link-prediction/homogeneous-comparison-link-prediction.ipynb uses metrics=["binary_accuracy"] directly.)

Fixes: #1766

+68 -65

2 comments

1 changed file

huonw

pr closed time in a month

issue closed stellargraph/stellargraph

Problem in GCN link prediction example

Describe the bug

It seems that the GCN link prediction example doesn't achieve the performance shown in the docs, and the GCN doesn't seem to learn anything from the data; I am not sure why this happens.

To Reproduce

Steps to reproduce the behavior:

  1. Go to the colab example code https://colab.research.google.com/github/stellargraph/stellargraph/blob/master/demos/link-prediction/gcn-link-prediction.ipynb#scrollTo=cW8U-jvPKdob.
  2. Just run all the commands.
  3. Observe that you can't get the same performance as in https://stellargraph.readthedocs.io/en/stable/demos/link-prediction/gcn-link-prediction.html: the acc is close to 0 when reproducing the example.

I just ran the code without any modification.

Environment

Operating system: 1. Ubuntu 2. colab

Python version: 1. python3.6.9 2. colab example default

Package versions: 1. stellargraph==1.2.1, tensorflow==2.2.0 2. colab example default

<IPython.core.display.HTML object>
StellarGraph: Undirected multigraph
 Nodes: 2708, Edges: 5429

 Node types:
  paper: [2708]
    Features: float32 vector, length 1440
    Edge types: paper-cites->paper

 Edge types:
    paper-cites->paper: [5429]
        Weights: all 1 (default)
        Features: none
** Sampled 542 positive and 542 negative edges. **
** Sampled 488 positive and 488 negative edges. **
Using GCN (local pooling) filters...
Using GCN (local pooling) filters...
2020-07-08 02:02:51.125100: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
1/1 [==============================] - 0s 261us/step - loss: 1.7441 - acc: 0.0000e+00
1/1 [==============================] - 0s 372us/step - loss: 1.7190 - acc: 0.0000e+00

Train Set Metrics of the initial (untrained) model:
        loss: 1.7441
        acc: 0.0000

Test Set Metrics of the initial (untrained) model:
        loss: 1.7190
        acc: 0.0000
Epoch 1/50
1/1 - 0s - loss: 1.7230 - acc: 0.0000e+00 - val_loss: 1.6174 - val_acc: 0.0000e+00
Epoch 2/50
1/1 - 0s - loss: 1.8744 - acc: 0.0000e+00 - val_loss: 0.6929 - val_acc: 0.0000e+00
Epoch 3/50
1/1 - 0s - loss: 0.7559 - acc: 0.0000e+00 - val_loss: 0.8678 - val_acc: 0.0000e+00
Epoch 4/50
1/1 - 0s - loss: 0.8746 - acc: 0.0000e+00 - val_loss: 0.9703 - val_acc: 0.0000e+00
Epoch 5/50
1/1 - 0s - loss: 0.9689 - acc: 0.0000e+00 - val_loss: 0.9082 - val_acc: 0.0000e+00
Epoch 6/50
1/1 - 0s - loss: 0.8899 - acc: 0.0000e+00 - val_loss: 0.7689 - val_acc: 0.0000e+00
Epoch 7/50
1/1 - 0s - loss: 0.7362 - acc: 0.0000e+00 - val_loss: 0.6734 - val_acc: 0.0000e+00
Epoch 8/50
1/1 - 0s - loss: 0.6950 - acc: 0.0000e+00 - val_loss: 0.7262 - val_acc: 0.0000e+00
Epoch 9/50
1/1 - 0s - loss: 0.7597 - acc: 0.0000e+00 - val_loss: 0.9009 - val_acc: 0.0000e+00
Epoch 10/50
1/1 - 0s - loss: 1.0559 - acc: 0.0000e+00 - val_loss: 0.8252 - val_acc: 0.0000e+00
Epoch 11/50
1/1 - 0s - loss: 0.9358 - acc: 0.0000e+00 - val_loss: 0.6823 - val_acc: 0.0000e+00
Epoch 12/50
1/1 - 0s - loss: 0.7335 - acc: 0.0000e+00 - val_loss: 0.6319 - val_acc: 0.0000e+00
Epoch 13/50
1/1 - 0s - loss: 0.5645 - acc: 0.0000e+00 - val_loss: 0.6511 - val_acc: 0.0000e+00
Epoch 14/50
1/1 - 0s - loss: 0.6124 - acc: 0.0000e+00 - val_loss: 0.6859 - val_acc: 0.0000e+00
Epoch 15/50
1/1 - 0s - loss: 0.6148 - acc: 0.0000e+00 - val_loss: 0.7000 - val_acc: 0.0000e+00
Epoch 16/50
1/1 - 0s - loss: 0.6342 - acc: 0.0000e+00 - val_loss: 0.6994 - val_acc: 0.0000e+00
Epoch 17/50
1/1 - 0s - loss: 0.6002 - acc: 0.0000e+00 - val_loss: 0.6820 - val_acc: 0.0000e+00
Epoch 18/50
1/1 - 0s - loss: 0.5741 - acc: 0.0000e+00 - val_loss: 0.6816 - val_acc: 0.0000e+00
Epoch 19/50
1/1 - 0s - loss: 0.5270 - acc: 0.0000e+00 - val_loss: 0.6888 - val_acc: 0.0000e+00
Epoch 20/50
1/1 - 0s - loss: 0.5358 - acc: 0.0000e+00 - val_loss: 0.7120 - val_acc: 0.0000e+00
Epoch 21/50
1/1 - 0s - loss: 0.5582 - acc: 0.0000e+00 - val_loss: 0.7397 - val_acc: 0.0000e+00
Epoch 22/50
1/1 - 0s - loss: 0.6449 - acc: 0.0000e+00 - val_loss: 0.7800 - val_acc: 0.0000e+00
Epoch 23/50
1/1 - 0s - loss: 0.5558 - acc: 0.0000e+00 - val_loss: 0.6514 - val_acc: 0.0000e+00
Epoch 24/50
1/1 - 0s - loss: 0.5180 - acc: 0.0000e+00 - val_loss: 0.6666 - val_acc: 0.0000e+00
Epoch 25/50
1/1 - 0s - loss: 0.5070 - acc: 0.0000e+00 - val_loss: 0.6780 - val_acc: 0.0000e+00
Epoch 26/50
1/1 - 0s - loss: 0.5492 - acc: 0.0000e+00 - val_loss: 0.7167 - val_acc: 0.0000e+00
Epoch 27/50
1/1 - 0s - loss: 0.5637 - acc: 0.0000e+00 - val_loss: 0.7343 - val_acc: 0.0000e+00
Epoch 28/50
1/1 - 0s - loss: 0.6122 - acc: 0.0000e+00 - val_loss: 0.7493 - val_acc: 0.0000e+00
Epoch 29/50
1/1 - 0s - loss: 0.6122 - acc: 0.0000e+00 - val_loss: 0.7420 - val_acc: 0.0000e+00
Epoch 30/50
1/1 - 0s - loss: 0.5376 - acc: 0.0000e+00 - val_loss: 0.7160 - val_acc: 0.0000e+00
Epoch 31/50
1/1 - 0s - loss: 0.5866 - acc: 0.0000e+00 - val_loss: 0.6874 - val_acc: 0.0000e+00
Epoch 32/50
1/1 - 0s - loss: 0.5429 - acc: 0.0000e+00 - val_loss: 0.6786 - val_acc: 0.0000e+00
Epoch 33/50
1/1 - 0s - loss: 0.6190 - acc: 0.0000e+00 - val_loss: 0.6995 - val_acc: 0.0000e+00
Epoch 34/50
1/1 - 0s - loss: 0.5032 - acc: 0.0000e+00 - val_loss: 0.7124 - val_acc: 0.0000e+00
Epoch 35/50
1/1 - 0s - loss: 0.5199 - acc: 0.0000e+00 - val_loss: 0.7479 - val_acc: 0.0000e+00
Epoch 36/50
1/1 - 0s - loss: 0.5283 - acc: 0.0000e+00 - val_loss: 0.7146 - val_acc: 0.0000e+00
Epoch 37/50
1/1 - 0s - loss: 0.5227 - acc: 0.0000e+00 - val_loss: 0.6946 - val_acc: 0.0000e+00
Epoch 38/50
1/1 - 0s - loss: 0.4818 - acc: 0.0000e+00 - val_loss: 0.6404 - val_acc: 0.0000e+00
Epoch 39/50
1/1 - 0s - loss: 0.4635 - acc: 0.0000e+00 - val_loss: 0.6060 - val_acc: 0.0000e+00
Epoch 40/50
1/1 - 0s - loss: 0.4405 - acc: 0.0000e+00 - val_loss: 0.6063 - val_acc: 0.0000e+00
Epoch 41/50
1/1 - 0s - loss: 0.4395 - acc: 0.0000e+00 - val_loss: 0.6093 - val_acc: 0.0000e+00
Epoch 42/50
1/1 - 0s - loss: 0.4369 - acc: 0.0000e+00 - val_loss: 0.6352 - val_acc: 0.0000e+00
Epoch 43/50
1/1 - 0s - loss: 0.4165 - acc: 0.0000e+00 - val_loss: 0.6561 - val_acc: 0.0000e+00
Epoch 44/50
1/1 - 0s - loss: 0.3748 - acc: 0.0000e+00 - val_loss: 0.6558 - val_acc: 0.0000e+00
Epoch 45/50
1/1 - 0s - loss: 0.3929 - acc: 0.0000e+00 - val_loss: 0.6668 - val_acc: 0.0000e+00
Epoch 46/50
1/1 - 0s - loss: 0.3927 - acc: 0.0000e+00 - val_loss: 0.6897 - val_acc: 0.0000e+00
Epoch 47/50
1/1 - 0s - loss: 0.3800 - acc: 0.0000e+00 - val_loss: 0.7172 - val_acc: 0.0000e+00
Epoch 48/50
1/1 - 0s - loss: 0.3637 - acc: 0.0000e+00 - val_loss: 0.7382 - val_acc: 0.0000e+00
Epoch 49/50
1/1 - 0s - loss: 0.3720 - acc: 0.0000e+00 - val_loss: 0.7558 - val_acc: 0.0000e+00
Epoch 50/50
1/1 - 0s - loss: 0.3790 - acc: 0.0000e+00 - val_loss: 0.8127 - val_acc: 0.0000e+00
1/1 [==============================] - 0s 487us/step - loss: 0.3245 - acc: 0.0000e+00
1/1 [==============================] - 0s 266us/step - loss: 0.8127 - acc: 0.0000e+00

Train Set Metrics of the trained model:
        loss: 0.3245
        acc: 0.0000

Test Set Metrics of the trained model:
        loss: 0.8127
        acc: 0.0000

closed time in a month

davidho27941

issue comment stellargraph/stellargraph

Problem in GCN link prediction example

You're pretty much correct, yeah.

I'll rephrase the options, starting from the best/easiest:

  1. replace acc with binary_accuracy: model.compile(..., metrics=["binary_accuracy"])
  2. remove the Reshapeing
  3. use TF 2.1 with stellargraph 1.2.1 (that is, version 2.1 seems to work for me, not just 2.0), where fit, evaluate and predict should work too (no need to use ..._generator)

The last option is much better than using TF 2.0; I only noticed it while working out the true underlying problem for my previous comment.
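For reference, option 1 is a one-line change to the notebook's compile call (a sketch; keep the notebook's other compile arguments as they are):

model.compile(
    optimizer="adam",              # whatever the notebook already uses
    loss="binary_crossentropy",
    metrics=["binary_accuracy"],   # instead of the ambiguous "acc"
)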

davidho27941

comment created time in a month

PR opened stellargraph/stellargraph

Avoid TF 2.2 bug via 'binary_accuracy' metric in GCN link prediction

This works around a change in behaviour of metrics=["acc"] in TensorFlow 2.2 (https://github.com/tensorflow/tensorflow/issues/41361) by replacing that with metrics=["binary_accuracy"]. All other changes here are just from rerunning the notebook.

Previously, in TF 2.0 and 2.1, the acc/accuracy metric was being translated into binary_accuracy, because the compile method was noticing that the loss function was binary_crossentropy. In TF 2.2, the inference for acc/accuracy no longer looks at the loss function, and instead just relies on the last dimension of the prediction tensor: if the dimension is equal to 1, it uses binary accuracy, otherwise categorical accuracy.

This GCN link prediction demo uses the binary crossentropy loss function, but has a Reshape((-1,)) layer that removes the natural last dimension of size 1, leaving a predictions tensor with a last dimension of size 2708 (the number of nodes in the graph). This means that the inference of the true metric for the acc name changed from the correct binary_accuracy in TF 2.1 to the incorrect categorical_accuracy in TF 2.2.

Fixes: #1766

+68 -65

0 comment

1 changed file

pr created time in a month

create branch stellargraph/stellargraph

branch : bugfix/1766-gcn-lp

created branch time in a month

issue comment stellargraph/stellargraph

Problem in GCN link prediction example

Ah, I think this is actually caused by TF 2.2, not TF 2.1. The commit that had the failure (https://github.com/stellargraph/stellargraph/commit/36c2a493f457902af1373150ff8eb6564c6d8492) relaxes the requirement from tensorflow>=2.0.0, <2.1.0 to tensorflow>=2.0.1, meaning that tensorflow==2.2.0 gets installed.

It looks like TF 2.2 dramatically changed how metrics worked, including how the acc name is computed. In particular, it changes the is_binary_crossentropy vs. is_binary decision:

  • TF 2.1: https://github.com/tensorflow/tensorflow/blob/v2.1.0/tensorflow/python/keras/engine/training_utils.py#L1094-L1133
    is_binary_crossentropy = (
        isinstance(loss_fn, losses.BinaryCrossentropy) or
        (isinstance(loss_fn, losses.LossFunctionWrapper) and
         loss_fn.fn == losses.binary_crossentropy))
    ...
      if output_shape[-1] == 1 or is_binary_crossentropy:
    
  • TF 2.2: https://github.com/tensorflow/tensorflow/blob/v2.2.0/tensorflow/python/keras/engine/compile_utils.py#L423-L477
    is_binary = y_p_last_dim == 1
    ...
      if is_binary
    

That is, TF 2.1 introspects the loss function and notices that we're using binary cross-entropy (despite the last dimension being > 1), while TF 2.2 does not, and just looks at the last dimension. This change in behaviour actually also explicitly violates the documentation for the metrics to the Model.compile method:

When you pass the strings 'accuracy' or 'acc', we convert this to one of tf.keras.metrics.BinaryAccuracy, tf.keras.metrics.CategoricalAccuracy, tf.keras.metrics.SparseCategoricalAccuracy based on the loss function used and the model output shape. We do a similar conversion for the strings 'crossentropy' and 'ce' as well.

I filed https://github.com/tensorflow/tensorflow/issues/41361 about this mismatch/regression.

davidho27941

comment created time in a month

issue opened tensorflow/tensorflow

Keras model.compile(..., metrics=["accuracy"]) no longer introspects loss function in TF 2.2

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS, Colab
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow installed from (source or binary): pip/binary
  • TensorFlow version (use command below): v2.2.0-rc4-8-g2b96f3662b, 2.2.0
  • Python version: 3.7.6
  • Bazel version (if compiling from source): N/A
  • GCC/Compiler version (if compiling from source): N/A
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A

Describe the current behavior

The documentation of tf.keras.Model.compile includes the following for the metrics parameter:

When you pass the strings 'accuracy' or 'acc', we convert this to one of tf.keras.metrics.BinaryAccuracy, tf.keras.metrics.CategoricalAccuracy, tf.keras.metrics.SparseCategoricalAccuracy based on the loss function used and the model output shape. We do a similar conversion for the strings 'crossentropy' and 'ce' as well.

The code in question only looks at the model shape, and ignores the loss function:

https://github.com/tensorflow/tensorflow/blob/2b96f3662bd776e277f86997659e61046b56c315/tensorflow/python/keras/engine/compile_utils.py#L447-L453

This means that using acc/accuracy for a model doing binary classification with last dimension > 1 (e.g. binary predictions about multiple independent values for each batch element) will incorrectly use categorical accuracy, not binary accuracy. (That is, the compile invocation is equivalent to passing metrics=["categorical_accuracy"].)

Describe the expected behavior

The behaviour stated by the documentation is to look at the loss function in addition to the output shape. That is, if the loss function is binary cross-entropy, metrics=["accuracy"] should be equivalent to metrics=["binary_accuracy"].

This is behaviour of TF < 2.2, such as TF 2.1.1:

https://github.com/tensorflow/tensorflow/blob/3ffdb91f122f556a74a6e1efd2469bfe1063cb5c/tensorflow/python/keras/engine/training_utils.py#L1114-L1121

Standalone code to reproduce the issue

#%pip install tensorflow==2.2.0
#%pip install tensorflow==2.1.1
import tensorflow as tf

inp = tf.keras.Input(3)
model = tf.keras.Model(inp, inp)

model.compile(loss="bce", metrics=["acc"])
model.evaluate(tf.constant([[0.1, 0.6, 0.9]]), tf.constant([[0, 1, 1]]))

if tf.version.VERSION == "2.2.0":
    print(model.compiled_metrics.metrics[0]._fn.__name__)
else:
    print(model._per_output_metrics[0]["acc"]._fn.__name__)

Output

  • TF 2.2.0: categorical_accuracy
  • TF 2.1.1: binary_accuracy

Notebook: https://gist.github.com/huonw/4a95b73e3d8a1c48a8b5fc5297d30772 Colab: https://colab.research.google.com/gist/huonw/4a95b73e3d8a1c48a8b5fc5297d30772

Other info / logs N/A

created time in a month

issue comment stellargraph/stellargraph

Problem in GCN link prediction example

It looks like this may've been caused by the meaning of the acc metric changing between TF 2.0 and TF 2.1. Using acc as a metric name ends up in special-case handling (that is, it's not the same as tf.keras.metrics.get("acc")), where Keras tries to deduce whether binary, categorical or sparse categorical accuracy is the appropriate one, based on the shape of the input tensors (example of this logic, in TF 2.2: https://github.com/tensorflow/tensorflow/blob/v2.2.0/tensorflow/python/keras/engine/compile_utils.py#L423-L477).

The binary-or-categorical decision seems to be driven by whether the last dimension is 1 or not. For the GCN link prediction example, there's the extra batch dimension of size 1, meaning the binary prediction output is of shape (1, 2708, 1) (for cora with 2708 nodes), but this is flattened to (1, 2708) in the following line:

prediction = keras.layers.Reshape((-1,))(prediction)

This means that Keras thinks that the output is categorical (with 2708 categories) and thus categorical accuracy is used for the acc metric. (Other link prediction examples do not do this flattening, e.g. GraphSAGE has predictions of shape (batch size, 1), which is correct for Keras's inference.)

Thus, this can be fixed in two ways:

  • replace the acc metric with binary_accuracy
  • remove the Reshapeing, to have the prediction output have shape (1, num nodes, 1) instead of (1, num nodes)

This seems to just be a 'surface' issue with the calculation and display of the acc metric, but the model is still being trained correctly and performs predictions correctly.

I don't quite understand why the behaviour of acc seems to have changed between TF 2.0 and TF 2.1; I'll have a quick look, but I'm not entirely sure it matters too much.

I'll open a PR to fix this in the demo later today, unless someone beats me to it.

(Unfortunately I'd done all this investigation yesterday, but I couldn't post the comment because GitHub was having problems. Thanks for waiting!)

davidho27941

comment created time in a month

started outline/outline

started time in a month

fork huonw/pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

https://pandas.pydata.org

fork in a month

issue comment stellargraph/stellargraph

Problem in GCN link prediction example

I haven't been able to make any progress on this today, so any further debugging by me will have to wait until Monday next week (AEST).

davidho27941

comment created time in a month

issue comment stellargraph/stellargraph

Problem in GCN link prediction example

Did you mean I can run stellargraph with TF 2.0?

Yes, this would be overriding stellargraph's dependency specification with TF 2.0 because we "know what we are doing". The main reason that TF >= 2.1 is specified is the problem with the non-..._generator methods on Keras Models that I mentioned, and we can manually work around that by using fit_generator instead of fit.

davidho27941

comment created time in a month

delete branch stellargraph/stellargraph

delete branch : feature/prereleases-20200710

delete time in a month

PR closed stellargraph/stellargraph

Test prereleases (including tensorflow 2.3.0-rc1)

https://github.com/tensorflow/tensorflow/releases/tag/v2.3.0-rc1 was released a day ago (at time of filing)

+17 -17

1 comment

1 changed file

huonw

pr closed time in a month

issue comment stellargraph/stellargraph

Notebooks fail with py2neo==5.0b1

The versioning syntax has changed, e.g. https://pypi.org/project/py2neo/2020.7b6/, but tests still fail (#1769).

huonw

comment created time in a month

pull request comment stellargraph/stellargraph

Test prereleases (including tensorflow 2.3.0-rc1)

Looks like the only failure is #1743.

huonw

comment created time in a month

issue comment stellargraph/stellargraph

Problem in GCN link prediction example

It looks like this was caused by the update from TensorFlow 2.0 to 2.1: 36c2a493f457902af1373150ff8eb6564c6d8492.

Unfortunately, the current GCN link prediction demo doesn't work with TensorFlow 2.0 due to a bug in TensorFlow: the predict and fit methods don't properly support the Keras Sequence values used. However, the ..._generator methods like fit_generator do work. That is, one workaround is:

  • install tensorflow 2.0 (like pip install tensorflow==2.0.2)
  • use fit_generator, predict_generator, evaluate_generator instead of fit, predict, evaluate
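A sketch of the second step, assuming train_flow and test_flow are the Sequence objects from the notebook's generator.flow(...) calls:

# The deprecated ..._generator methods handle Keras Sequences correctly on TF 2.0.
history = model.fit_generator(train_flow, epochs=50, validation_data=test_flow)
test_metrics = model.evaluate_generator(test_flow)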

This is suboptimal, since there are various bugs that affect StellarGraph and are fixed in TensorFlow 2.1 and 2.2, and because those ..._generator methods are deprecated.

I haven't narrowed down the exact cause yet, but am still looking at it.

<details><summary>Bisect info, for reference</summary>

$ git bisect log
# bad: [e13075bd6eccc0d01fedd489f02a895cb2acc6d6] Release 0.10.0
# good: [d1499f61a42db8bef4db6b0f0d7d74d673bc2e20] Release 0.9.0
git bisect start 'v0.10.0' 'v0.9.0'
# good: [a120fa0be92489b20bd32ef99d65110a167370e7] Speed up notebooks with Papermill to add them to CI (#820) (#860)
git bisect good a120fa0be92489b20bd32ef99d65110a167370e7
# bad: [bd55000a9ffc759aa0e20ceeb0d572fb52dd0d59] Test for warning about from_networkx 0-vector feature default (#897)
git bisect bad bd55000a9ffc759aa0e20ceeb0d572fb52dd0d59
# bad: [6688f5eb4ff80d7a2d2a27190ebe4d1d71689b06] Add Neo4j neighbourhood sampling for GraphSAGE (#799)
git bisect bad 6688f5eb4ff80d7a2d2a27190ebe4d1d71689b06
# good: [ceda25a21d8ea8a402b51d7cc6e7653cc1ef6f06] Add remaining notebooks to CI that were too slow to test (#820) (#874)
git bisect good ceda25a21d8ea8a402b51d7cc6e7653cc1ef6f06
# good: [50bb3e2b9aac5af831caa444d36f8304f4fa3be8] Install papermill as a test dependency (#878)
git bisect good 50bb3e2b9aac5af831caa444d36f8304f4fa3be8
# bad: [bbd6a924b8e3d482180ad04ef0116d0efbbcce6c] Check notebook formatting in its own step on CI (#877)
git bisect bad bbd6a924b8e3d482180ad04ef0116d0efbbcce6c
# bad: [36c2a493f457902af1373150ff8eb6564c6d8492] Remove pinning to tensorflow 2.0 to update to latest tensorflow (currently 2.1.0) (#875)
git bisect bad 36c2a493f457902af1373150ff8eb6564c6d8492
# first bad commit: [36c2a493f457902af1373150ff8eb6564c6d8492] Remove pinning to tensorflow 2.0 to update to latest tensorflow (currently 2.1.0) (#875)

Each step was assessed by:

  • loading: https://colab.research.google.com/github/stellargraph/stellargraph/blob/$COMMIT/demos/link-prediction/gcn/cora-gcn-links-example.ipynb
  • adding a first cell: %pip install git+https://github.com/stellargraph/stellargraph.git@$COMMIT
  • running the whole notebook and observing the training metrics

</details>

davidho27941

comment created time in a month

PR opened stellargraph/stellargraph

Test prereleases (including tensorflow 2.3.0-rc1)

https://github.com/tensorflow/tensorflow/releases/tag/v2.3.0-rc1 was released a day ago (at time of filing)

+17 -17

0 comment

1 changed file

pr created time in a month

create branch stellargraph/stellargraph

branch : feature/prereleases-20200710

created branch time in a month

issue comment stellargraph/stellargraph

KeyError: 0 when training with Stellargraph generator.flow

Thanks for providing so much information; it's very helpful!

Firstly, I'm a little confused: it looks like the wget download lines are downloading HTML files, not the real CSV files. When I run those commands, the first few lines are:






<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
  <link rel="dns-prefetch" href="https://github.githubassets.com">

I think the 'raw' file needs to be downloaded instead, e.g. https://github.com/pranavn91/blockchain/raw/master/tx2009partvertices_new.csv (notice /raw/ instead of /blob/). I found that link using the "Raw" button on the GitHub view:

[screenshot: the "Raw" button in the GitHub file view]

Based on the error, I suspect the subjects have a different index to the nodes. Could you show how you're parsing the files? Looking at the files, you might need to pass index_col=0 to pd.read_csv (if you're using Pandas) or use .set_index(...).
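For example, a sketch with Pandas (the file name comes from this issue; the node_id column name is hypothetical):

import pandas as pd

# index_col=0 makes the node IDs the DataFrame index, so the subjects and
# the graph nodes share the same index.
vertices = pd.read_csv("tx2009partvertices_new.csv", index_col=0)

# Alternatively, if the ID column has a name:
# vertices = pd.read_csv("tx2009partvertices_new.csv").set_index("node_id")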

(I've edited the description to use Markdown code fences, like ``` to make it easier to understand. I hope that's okay!)

pranavn91

comment created time in a month

issue comment stellargraph/stellargraph

Problem in GCN link prediction example

Yeah, version 0.9 worked for me. However, I wouldn't recommend going so far back in time: there have been many usability, speed and memory-use improvements between 0.9 and the latest 1.2 version. I'm working on narrowing down the problem now and will hopefully have more understanding and maybe a fix in an hour (if not, I'll let you know).

davidho27941

comment created time in a month

issue comment stellargraph/stellargraph

Problem in GCN link prediction example

Thanks for filing an issue. The fact that it's failing catastrophically suggests some serious misconfiguration, rather than merely random classification. I can reproduce it: on Google Colab, it looks like this works in 0.9.0 (when the demo was first added) but fails in 0.10.0 (and beyond). I'm going to try to bisect to find the cause.

davidho27941

comment created time in a month

issue comment stellargraph/stellargraph

Inductive/feature-based knowledge graph algorithms

The "allow embeddings lookups" generalisation to RGCN could be applied equally well to GCN (as hinted by #621), and even to GraphSAGE and any other algorithm using node features.

huonw

comment created time in a month

issue commentstellargraph/stellargraph

Potentially migrating from Discourse to Discussions

Potential options:

  • switch to a read-only/archive mode (that's hopefully free)
  • take a HTML snapshot:
    • https://meta.discourse.org/t/a-basic-discourse-archival-tool/62614
    • https://github.com/mcmcclur/ArchiveDiscourse
  • similar to an HTML snapshot, use archive.org:
    • Current archives (at the time of writing this comment, it seems only the top-level page is archived, and only in 2019): https://web.archive.org/web/*/https://community.stellargraph.io
    • Discussions of how to add to the archive:
      • https://help.archive.org/hc/en-us/articles/360001513491-Save-Pages-in-the-Wayback-Machine
      • https://www.archiveteam.org/
    • Donation link (if we're going to use them as an archive-of-reference, the least we can do is give them some of the money that we're no longer spending on Discourse): https://archive.org/donate/

No matter how we turn off Discourse, it'd be good to do it in a nice way, so someone either visiting an old link or looking for a previous discussion can find it. For example:

  1. put it in read-only mode with an announcement of the migration (for a while)
  2. once disabled, have every https://community.stellargraph.io/... page redirect to a GitHub Discussions post that talks about the migration, including linking to the snapshot and suggesting continuing the discussion as a Discussions post.
huonw

comment created time in a month

issue openedstellargraph/stellargraph

Potentially migrating from Discourse to Discussions

Description

Unfortunately, we're considering migrating away from https://community.stellargraph.io/. It'd be good to preserve the discussion there in some form, because there are useful insights in it.

created time in a month

push eventstellargraph/stellargraph

Huon Wilson

commit sha f0590c0bed1a482f63ef1cd10e42fdba62ff8685

Revert develop version to 1.3.0b

view details

push time in a month

push eventstellargraph/stellargraph

Huon Wilson

commit sha df610a9d697c170127af7bfd7426dcb300afe22b

Changelog and version bump for 1.2.1 release (#1760)

The changelog includes the changes from 1.2.0 to the current develop: https://github.com/stellargraph/stellargraph/compare/v1.2.0...70d9952

view details

Huon Wilson

commit sha 9370caed1137c2527523a39212072df1760ca00f

Release 1.2.1

view details

Huon Wilson

commit sha 082c0838bcf11a608db028ccf7306483c19a3bb7

Merge branch 'master' into develop

view details

push time in a month

startedzotero/zotero

started time in a month

PR opened stellargraph/stellargraph

Implement UGraphEmb pooling

This implements the pooling layer from UGraphEmb. It doesn't yet implement the full algorithm:

  • the output of each GCN/GIN layer needs to be pooled separately
  • the loss function seems to become NaN when training with this in the existing unsupervised graph classification algorithm, so this clearly isn't fully usable yet

See: #1627

+154 -0

0 comment

4 changed files

pr created time in a month

create branch stellargraph/stellargraph

branch : feature/1627-ugraphemb-pooling

created branch time in a month

issue commentstellargraph/stellargraph

Google colab: Cora dataset load reports HTTP 404

Thanks for letting us know, @kylebfred !

@huonw , Lise brought up hosting the data externally, and we are looking into it, thanks.

Cool 😄

sourabhXIII

comment created time in a month

issue commentstellargraph/stellargraph

Use multiple graphs in mini batch generator

Hey @davidxujiayang, you may have noticed my comment on #1355: APPNP and GAT (and GCN) now support taking a ClusterNodeGenerator in addition to a FullBatchNodeGenerator, and so can be trained with clusters (#1531, #1585).
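For reference, a minimal sketch of what that looks like (Cora, the layer sizes and the cluster settings are all just placeholders):

```python
import stellargraph as sg
import tensorflow as tf

G, subjects = sg.datasets.Cora().load()

# a ClusterNodeGenerator where a FullBatchNodeGenerator used to be
# required; 10 clusters with 2 combined per batch is an arbitrary choice
generator = sg.mapper.ClusterNodeGenerator(G, clusters=10, q=2)

gcn = sg.layer.GCN(layer_sizes=[16, 16], generator=generator)
x_in, x_out = gcn.in_out_tensors()

predictions = tf.keras.layers.Dense(units=7, activation="softmax")(x_out)
model = tf.keras.Model(inputs=x_in, outputs=predictions)
model.compile(optimizer="adam", loss="categorical_crossentropy")
# then train with model.fit(generator.flow(train_ids, train_targets), ...)
```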

davidxujiayang

comment created time in a month

issue closedstellargraph/stellargraph

Support inductive/scalable training for non-GCN full batch models via clustering

Description

As noted by @davidxujiayang in https://github.com/stellargraph/stellargraph/issues/1275#issuecomment-617982144, training via clustering can apply to more than just GCN, e.g. we could have Cluster-GAT and even Cluster-APPNP.

This issue is about generalising more models to support inductive and scalable training in this manner.

closed time in a month

huonw

issue commentstellargraph/stellargraph

Support inductive/scalable training for non-GCN full batch models via clustering

This ended up being duplicated at #1531, and fixed by #1585.

huonw

comment created time in a month

pull request commentstellargraph/stellargraph

Update LINQS dataset URLs for directory change

Oh, haha! It looks like the new locations still work, though.

huonw

comment created time in a month

issue closedstellargraph/stellargraph

Graph classification

Description

Support at least one graph classification/matching algorithm, expanding the applications of StellarGraph.

User Story

As a data scientist working with StellarGraph, I want access to graph classification algorithms.

Done Checklist

  • [ ] Produced code for required functionality
  • [ ] Tests written and coverage checked
  • [ ] Code review performed
  • [ ] Documentation on Google Docs (if applicable)
  • [ ] Documentation in repo
  • [ ] Version number reflects new status
  • [ ] CHANGELOG.md updated
  • [ ] Team demo

closed time in a month

PantelisElinas

issue closedstellargraph/stellargraph

Cannot save an APPNP model: AttributeError: 'APPNPPropagationLayer' object has no attribute 'final_layer'

Describe the bug

An APPNP model cannot be saved, because it hits an error in APPNPPropagationLayer.get_config.

https://buildkite.com/stellar/stellargraph-public/builds/4753#e083dc95-02d2-40b4-95b1-81d0770d383a/322-469

To Reproduce

Steps to reproduce the behavior:

  1. Run the test_APPNP_save_load test added in https://github.com/stellargraph/stellargraph/pull/1676

Observed behavior

_________________________ test_APPNP_save_load[False] __________________________
 
tmpdir = local('/tmp/pytest-of-root/pytest-0/test_APPNP_save_load_False_0')
sparse = False
 
    @pytest.mark.parametrize("sparse", [False, True])
    def test_APPNP_save_load(tmpdir, sparse):
        G, _ = create_graph_features()
        generator = FullBatchNodeGenerator(G, sparse=sparse)
        appnp = APPNP([2, 3], generator, ["relu", "relu"])
>       test_utils.model_save_load(tmpdir, appnp)
 
tests/layer/test_appnp.py:299:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/test_utils/__init__.py:33: in model_save_load
    keras.models.save_model(model, str(save_model_dir))
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/saving/save.py:138: in save_model
    signatures, options)
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/save.py:78: in save
    save_lib.save(model, filepath, signatures, options)
/usr/local/lib/python3.6/site-packages/tensorflow/python/saved_model/save.py:951: in save
    obj, export_dir, signatures, options, meta_graph_def)
/usr/local/lib/python3.6/site-packages/tensorflow/python/saved_model/save.py:1037: in _build_meta_graph
    asset_info.asset_index)
/usr/local/lib/python3.6/site-packages/tensorflow/python/saved_model/save.py:697: in _serialize_object_graph
    saveable_view.function_name_map)
/usr/local/lib/python3.6/site-packages/tensorflow/python/saved_model/save.py:737: in _write_object_proto
    metadata=obj._tracking_metadata)
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:2742: in _tracking_metadata
    return self._trackable_saved_model_saver.tracking_metadata
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/base_serialization.py:54: in tracking_metadata
    return json_utils.Encoder().encode(self.python_properties)
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/layer_serialization.py:41: in python_properties
    return self._python_properties_internal()
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/model_serialization.py:35: in _python_properties_internal
    metadata = super(ModelSavedModelSaver, self)._python_properties_internal()
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/network_serialization.py:33: in _python_properties_internal
    metadata = super(NetworkSavedModelSaver, self)._python_properties_internal()
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/layer_serialization.py:57: in _python_properties_internal
    metadata.update(get_config(self.obj))
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/layer_serialization.py:115: in get_config
    config = generic_utils.serialize_keras_object(obj)['config']
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py:270: in serialize_keras_object
    config = instance.get_config()
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/engine/network.py:968: in get_config
    return copy.deepcopy(get_network_config(self))
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/engine/network.py:2119: in get_network_config
    layer_config = serialize_layer_fn(layer)
/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py:270: in serialize_keras_object
    config = instance.get_config()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
 
self = <stellargraph.layer.appnp.APPNPPropagationLayer object at 0x7f476525dd30>
 
    def get_config(self):
        """
        Gets class configuration for Keras serialization.
        Used by Keras model serialization.
    
        Returns:
            A dictionary that contains the config of the layer
        """
    
        config = {
            "units": self.units,
>           "final_layer": self.final_layer,
            "teleport_probability": self.teleport_probability,
        }
E       AttributeError: 'APPNPPropagationLayer' object has no attribute 'final_layer'

Expected behavior

Saving should work.

Environment

StellarGraph develop https://github.com/stellargraph/stellargraph/commit/8a3a74c81c33a85b6d8f5a81eb701dd896adf6a3

Additional context

N/A

closed time in a month

huonw

issue closedstellargraph/stellargraph

RGCN model fails to save with "'NoneType' object has no attribute 'replace'"

Describe the bug

As noticed in #1251, the RGCN model fails to save.

To Reproduce

import stellargraph as sg
import tensorflow as tf

def test(x):
    print(f"### {type(x).__name__}")
    print("```")
    model = tf.keras.Model(*x.in_out_tensors())
    try:
        tf.keras.models.save_model(model, "/tmp")
    except Exception as e:
        print(f"tf.keras.models.save_model: {e!r}")
        
    try:
        tf.saved_model.save(model, "/tmp")
    except Exception as e:
        print(f"tf.saved_model.save: {e!r}")

    try:
        model.save("/tmp")
    except Exception as e:
        print(f"model.save: {e!r}")
    print("```")


G, _ = sg.datasets.Cora().load()

rg_gen = sg.mapper.RelationalFullBatchNodeGenerator(G)
test(sg.layer.RGCN([4], rg_gen))

Observed behavior

RGCN fails to save with:

AttributeError: 'NoneType' object has no attribute 'replace'

<details><summary>Full stack trace</summary>

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-61-922e5f4550e0> in <module>
      1 rg_gen = sg.mapper.RelationalFullBatchNodeGenerator(G)
----> 2 test(sg.layer.RGCN([4], rg_gen))
      3 

<ipython-input-60-050090835af0> in test(x)
      7     model = tf.keras.Model(*x.in_out_tensors())
      8     try:
----> 9         tf.keras.models.save_model(model, "/tmp")
     10     except Exception as e:
     11         print(f"tf.keras.models.save_model: {e!r}")

~/.pyenv/versions/sg/lib/python3.6/site-packages/tensorflow_core/python/keras/saving/save.py in save_model(model, filepath, overwrite, include_optimizer, save_format, signatures, options)
    113   else:
    114     saved_model_save.save(model, filepath, overwrite, include_optimizer,
--> 115                           signatures, options)
    116 
    117 

~/.pyenv/versions/sg/lib/python3.6/site-packages/tensorflow_core/python/keras/saving/saved_model/save.py in save(model, filepath, overwrite, include_optimizer, signatures, options)
     76     # we use the default replica context here.
     77     with distribution_strategy_context._get_default_replica_context():  # pylint: disable=protected-access
---> 78       save_lib.save(model, filepath, signatures, options)
     79 
     80   if not include_optimizer:

~/.pyenv/versions/sg/lib/python3.6/site-packages/tensorflow_core/python/saved_model/save.py in save(obj, export_dir, signatures, options)
    897   # Note we run this twice since, while constructing the view the first time
    898   # there can be side effects of creating variables.
--> 899   _ = _SaveableView(checkpoint_graph_view)
    900   saveable_view = _SaveableView(checkpoint_graph_view)
    901 

~/.pyenv/versions/sg/lib/python3.6/site-packages/tensorflow_core/python/saved_model/save.py in __init__(self, checkpoint_view)
    163     self.checkpoint_view = checkpoint_view
    164     trackable_objects, node_ids, slot_variables = (
--> 165         self.checkpoint_view.objects_ids_and_slot_variables())
    166     self.nodes = trackable_objects
    167     self.node_ids = node_ids

~/.pyenv/versions/sg/lib/python3.6/site-packages/tensorflow_core/python/training/tracking/graph_view.py in objects_ids_and_slot_variables(self)
    416     object_names = object_identity.ObjectIdentityDictionary()
    417     for obj, path in path_to_root.items():
--> 418       object_names[obj] = _object_prefix_from_path(path)
    419     node_ids = object_identity.ObjectIdentityDictionary()
    420     for node_id, node in enumerate(trackable_objects):

~/.pyenv/versions/sg/lib/python3.6/site-packages/tensorflow_core/python/training/tracking/graph_view.py in _object_prefix_from_path(path_to_root)
     62   return "/".join(
     63       (_escape_local_name(trackable.name)
---> 64        for trackable in path_to_root))
     65 
     66 

~/.pyenv/versions/sg/lib/python3.6/site-packages/tensorflow_core/python/training/tracking/graph_view.py in <genexpr>(.0)
     62   return "/".join(
     63       (_escape_local_name(trackable.name)
---> 64        for trackable in path_to_root))
     65 
     66 

~/.pyenv/versions/sg/lib/python3.6/site-packages/tensorflow_core/python/training/tracking/graph_view.py in _escape_local_name(name)
     55   # edges traversed to reach the variable, so we escape forward slashes in
     56   # names.
---> 57   return (name.replace(_ESCAPE_CHAR, _ESCAPE_CHAR + _ESCAPE_CHAR)
     58           .replace(r"/", _ESCAPE_CHAR + "S"))
     59 

AttributeError: 'NoneType' object has no attribute 'replace'

</details>

<details><summary>Full output</summary>

RGCN

tf.keras.models.save_model: AttributeError("'NoneType' object has no attribute 'replace'",)
tf.saved_model.save: AttributeError("'NoneType' object has no attribute 'replace'",)
model.save: AttributeError("'NoneType' object has no attribute 'replace'",)

</details>

Expected behavior

Every model should support saving.

Environment

Operating system: Darwin-18.6.0-x86_64-i386-64bit

Python version:

3.6.9 (default, Jul 10 2019, 12:25:55) 
[GCC 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.46.4)]

Package versions: <details>

absl-py==0.8.0
ansiwrap==0.8.4
appdirs==1.4.3
appnope==0.1.0
astor==0.8.0
atomicwrites==1.3.0
attrs==19.3.0
backcall==0.1.0
black==19.10b0
bleach==3.1.0
boto==2.49.0
boto3==1.9.230
botocore==1.12.230
cachetools==4.0.0
certifi==2019.9.11
chardet==3.0.4
Click==7.0
coverage==4.5.4
coveralls==1.8.2
cycler==0.10.0
decorator==4.4.0
defusedxml==0.6.0
docopt==0.6.2
docutils==0.15.2
entrypoints==0.3
gast==0.2.2
gensim==3.8.0
google-auth==1.10.0
google-auth-oauthlib==0.4.1
google-pasta==0.1.7
gprof2dot==2019.11.30
grpcio==1.23.0
h5py==2.10.0
idna==2.8
importlib-metadata==0.23
ipykernel==5.1.3
ipython==7.9.0
ipython-genutils==0.2.0
ipywidgets==7.5.1
isodate==0.6.0
jedi==0.15.1
Jinja2==2.10.3
jmespath==0.9.4
joblib==0.13.2
jsonschema==3.2.0
jupyter==1.0.0
jupyter-client==5.3.4
jupyter-console==6.0.0
jupyter-core==4.6.1
Keras==2.2.5
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.1.0
llvmlite==0.30.0
Mako==1.1.0
Markdown==3.1.1
MarkupSafe==1.1.1
matplotlib==3.1.1
mistune==0.8.4
more-itertools==7.2.0
mplleaflet==0.0.5
mypy==0.750
mypy-extensions==0.4.3
nbclient==0.1.0
nbconvert==5.6.1
nbformat==4.4.0
networkx==2.3
notebook==6.0.2
numba==0.46.0
numpy==1.17.2
oauthlib==3.1.0
opt-einsum==3.1.0
packaging==19.2
pandas==0.25.1
pandocfilters==1.4.2
papermill==1.2.1
parso==0.5.1
pathspec==0.6.0
pdoc3==0.7.2
pexpect==4.7.0
pickleshare==0.7.5
pluggy==0.13.1
prometheus-client==0.7.1
prompt-toolkit==2.0.10
protobuf==3.9.1
ptyprocess==0.6.0
py==1.8.0
py-cpuinfo==5.0.0
py4j==0.10.7
pyasn1==0.4.8
pyasn1-modules==0.2.7
pydot==1.4.1
Pygments==2.4.2
Pympler==0.8
pyparsing==2.4.2
pyrsistent==0.15.6
pyspark==2.4.4
pyspark-stubs==2.4.0.post6
pytest==5.3.1
pytest-benchmark==3.2.2
pytest-cov==2.8.1
pytest-profiling==1.7.0
pytest-repeat==0.8.0
python-dateutil==2.8.0
pytz==2019.2
PyYAML==5.1.2
pyzmq==18.1.1
qtconsole==4.6.0
rdflib==4.2.2
regex==2019.12.9
requests==2.22.0
requests-oauthlib==1.3.0
rsa==4.0
s3transfer==0.2.1
scikit-learn==0.21.3
scipy==1.4.1
seaborn==0.10.0
Send2Trash==1.5.0
six==1.12.0
smart-open==1.8.4
-e git+git@github.com:stellargraph/stellargraph.git@c337c18fa391169234ccc0f46b699493d96748a7#egg=stellargraph
tenacity==6.0.0
tensorboard==2.1.1
tensorflow==2.1.0
tensorflow-estimator==2.1.0
termcolor==1.1.0
terminado==0.8.3
testpath==0.4.4
textwrap3==0.9.2
toml==0.10.0
tornado==6.0.3
tqdm==4.42.1
traitlets==4.3.3
treon==0.1.3
typed-ast==1.4.0
typing-extensions==3.7.4.1
urllib3==1.25.3
wcwidth==0.1.7
webencodings==0.5.1
Werkzeug==0.15.6
widgetsnbextension==3.5.1
wrapt==1.11.2
zipp==0.6.0

</details>

Additional context

N/A

closed time in a month

huonw

issue closedstellargraph/stellargraph

Some algorithms supported in StellarGraph via demos aren't listed in documentation on readthedocs

Describe the bug

In StellarGraph, some algorithms are in demo notebooks or scripts, without being listed in our main documentation. This could mean that a user reading our docs thinks they're not supported in StellarGraph.

closed time in a month

timpitman

issue closedstellargraph/stellargraph

Flaky test: test_nodemapper_isolated_nodes in tests.mapper.test_node_mappers

Describe the bug

https://buildkite.com/stellar/stellargraph-public/builds/4206#_

def test_nodemapper_isolated_nodes():
        n_feat = 4
        n_batch = 2
    
        # test graph
        G = example_graph_random(feature_size=n_feat, n_nodes=6, n_isolates=1, n_edges=20)
    
        # Check connectedness
        Gnx = G.to_networkx()
        ccs = list(nx.connected_components(Gnx))
>       assert len(ccs) == 2
E       assert 3 == 2
E        +  where 3 = len([{0}, {1, 2, 3, 4}, {5}])

tests/mapper/test_node_mappers.py:305: AssertionError
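For what it's worth, a hedged guess at the cause (not confirmed): the random edge generation doesn't guarantee that every intended non-isolate actually receives an edge, so the graph can end up with more than two components. One possible loosening, as a sketch:

```python
# the deliberate isolate guarantees at least two components, but the
# random edges may leave other nodes disconnected too, so avoid
# asserting an exact count
assert len(ccs) >= 2
```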

Environment

StellarGraph: https://github.com/stellargraph/stellargraph/commit/e2086659d8b1c3921978cec6749a78699e6f522a

<details>

absl-py==0.9.0
ansiwrap==0.8.4
appdirs==1.4.4
astunparse==1.6.3
async-generator==1.10
attrs==19.3.0
backcall==0.1.0
black==19.10b0
bleach==3.1.5
boto==2.49.0
boto3==1.13.11
botocore==1.16.11
cachetools==4.1.0
certifi==2020.4.5.1
chardet==3.0.4
click==7.1.2
commonmark==0.9.1
coverage==4.5.4
cycler==0.10.0
decorator==4.4.2
defusedxml==0.6.0
docopt==0.6.2
docutils==0.15.2
entrypoints==0.3
gast==0.3.3
gensim==3.8.3
google-auth==1.14.3
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
grpcio==1.29.0
h5py==2.10.0
idna==2.9
ipykernel==5.2.1
ipython==7.14.0
ipython-genutils==0.2.0
ipywidgets==7.5.1
isodate==0.6.0
jedi==0.17.0
Jinja2==2.11.2
jmespath==0.10.0
joblib==0.15.1
jsonschema==3.2.0
jupyter==1.0.0
jupyter-client==6.1.3
jupyter-console==6.1.0
jupyter-core==4.6.3
Keras-Preprocessing==1.1.2
kiwisolver==1.2.0
llvmlite==0.32.1
Markdown==3.2.2
MarkupSafe==1.1.1
matplotlib==3.2.1
mistune==0.8.4
more-itertools==8.3.0
mplleaflet==0.0.5
nbclient==0.3.0
nbconvert==5.6.1
nbformat==5.0.6
nest-asyncio==1.3.3
networkx==2.4
notebook==6.0.3
numba==0.49.1
numpy==1.18.4
oauthlib==3.1.0
opt-einsum==3.2.1
packaging==20.3
pandas==1.0.3
pandocfilters==1.4.2
papermill==2.1.1
parso==0.7.0
pathspec==0.8.0
pexpect==4.8.0
pickleshare==0.7.5
pluggy==0.13.1
prometheus-client==0.7.1
prompt-toolkit==3.0.5
protobuf==3.12.0
ptyprocess==0.6.0
py==1.8.1
py-cpuinfo==5.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
Pygments==2.6.1
pyparsing==2.4.7
pyrsistent==0.16.0
pytest==5.3.1
pytest-benchmark==3.2.3
pytest-cov==2.8.1
python-dateutil==2.8.1
pytz==2020.1
PyYAML==5.3.1
pyzmq==19.0.1
qtconsole==4.7.4
QtPy==1.9.0
rdflib==5.0.0
regex==2020.5.14
requests==2.23.0
requests-oauthlib==1.3.0
rsa==4.0
s3transfer==0.3.3
scikit-learn==0.23.0
scipy==1.4.1
seaborn==0.10.1
Send2Trash==1.5.0
six==1.14.0
smart-open==2.0.0
tenacity==6.2.0
tensorboard==2.2.1
tensorboard-plugin-wit==1.6.0.post3
tensorflow==2.2.0
tensorflow-estimator==2.2.0
termcolor==1.1.0
terminado==0.8.3
testpath==0.4.4
textwrap3==0.9.2
threadpoolctl==2.0.0
toml==0.10.1
tornado==6.0.4
tqdm==4.46.0
traitlets==4.3.3
treon==0.1.3
typed-ast==1.4.1
urllib3==1.25.9
wcwidth==0.1.9
webencodings==0.5.1
Werkzeug==1.0.1
widgetsnbextension==3.5.1
wrapt==1.12.1

</details>

Additional context

N/A

closed time in a month

huonw

issue closedstellargraph/stellargraph

Links to documentation in Read the Docs demo notebooks should stay within the same doc version

Description

Several of our notebooks have links to our documentation, such as

https://github.com/stellargraph/stellargraph/blob/625a99b798ef73957426c74066405457ff31174a/demos/node-classification/gcn/gcn-cora-node-classification-example.ipynb#L125

[the `Cora` loader](https://stellargraph.readthedocs.io/en/stable/api.html#stellargraph.datasets.datasets.Cora)

When published on Read the Docs (https://stellargraph.readthedocs.io/en/latest/demos/node-classification/gcn/gcn-cora-node-classification-example.html#Loading-the-CORA-network, although the link is broken at the time of writing, fixed by #1394), this link should point to the same version of the docs. Specifically, when viewing the develop docs (.../latest/...), the link should stay on the develop docs, but currently it always goes to .../stable/....

Once we've done several stable releases with notebooks like this, it'll ensure that looking at 1.0.0 notebooks always links to 1.0.0 documentation, and similarly for 1.1.0, etc.
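As a purely illustrative sketch of the mapping we want (not the mechanism the docs build actually uses):

```python
import re

def retarget_docs_link(url: str, version: str) -> str:
    # point a readthedocs link at the doc version being built, e.g.
    # "stable" becomes "latest" when building the develop docs
    return re.sub(
        r"(stellargraph\.readthedocs\.io/en/)[^/]+", rf"\g<1>{version}", url
    )

assert (
    retarget_docs_link(
        "https://stellargraph.readthedocs.io/en/stable/api.html", "latest"
    )
    == "https://stellargraph.readthedocs.io/en/latest/api.html"
)
```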

closed time in a month

huonw

issue closedstellargraph/stellargraph

40k instances of DeprecationWarning: tostring() is deprecated. Use tobytes() instead. in Github Actions logs

Describe the bug

Tests runs on GitHub Actions result in thousands of lines of logging, related to a single deprecation warning.

Details:

| Actions | Buildkite | status | commit |
| --- | --- | --- | --- |
| https://github.com/stellargraph/stellargraph/runs/783114964?check_suite_focus=true#step:6:88 | https://buildkite.com/stellar/stellargraph-public/builds/4814#3e6b3710-da57-4f89-bd4f-410a8cf6c628/262-340 | good | https://github.com/stellargraph/stellargraph/commit/3165a5281004c88c556a67e7a2d0e4eb59c8d427 (last good) |
| https://github.com/stellargraph/stellargraph/runs/793566010?check_suite_focus=true#step:6:39880 | https://buildkite.com/stellar/stellargraph-public/builds/4886#e1f2642d-36ed-4dce-9fa6-3ef999115437/262-340 | bad | https://github.com/stellargraph/stellargraph/commit/aa2accf6ecc61347e71cd64d6b16218a7444d239 (first bad) |
| https://github.com/stellargraph/stellargraph/runs/797744297?check_suite_focus=true#step:6:39881 | https://buildkite.com/stellar/stellargraph-public/builds/4962#8f167b08-b3a5-49e5-947b-81c3bffeaca9/262-340 | bad | https://github.com/stellargraph/stellargraph/commit/ba95b0ea5212799b39278341c5c450bfef393632 (current develop) |

To Reproduce

???

Observed behavior

=============================== warnings summary ===============================
...
tests/test_ensemble.py::test_ensemble_init_parameters
tests/test_ensemble.py::test_ensemble_init_parameters
tests/test_ensemble.py::test_ensemble_init_parameters
...
tests/reproducibility/test_graphsage.py::test_link_prediction[False]
tests/reproducibility/test_graphsage.py::test_link_prediction[False]
tests/reproducibility/test_graphsage.py::test_link_prediction[False]
tests/reproducibility/test_graphsage.py::test_link_prediction[False]
  /opt/hostedtoolcache/Python/3.6.10/x64/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
    tensor_proto.tensor_content = nparray.tostring()

Expected behavior

Test runs should not flood the logs with tens of thousands of copies of a single deprecation warning.

Environment

Operating system: Ubuntu, Windows

Python version: 3.6, 3.7, 3.8

Package versions: see above for StellarGraph commits

Additional context

N/A

closed time in a month

huonw

issue closedstellargraph/stellargraph

Deprecate Cluster-GCN in favour of GCN

Description

With #1585, the GCN model (among others) now supports training with ClusterNodeGenerator. We can thus deprecate (and deduplicate) the ClusterGCN model. This includes migrating the demos.

closed time in a month

huonw

issue closedstellargraph/stellargraph

Flaky test: test_squeezedsparseconversion_dtype

Describe the bug

The test_squeezedsparseconversion_dtype in tests/layer/test_misc.py test sometimes fails.

To Reproduce

  1. Run the test many times

Observed behavior

https://github.com/stellargraph/stellargraph/pull/1694/checks?check_run_id=797348419#step:6:113

2020-06-23T00:01:31.3083345Z _____________________ test_squeezedsparseconversion_dtype ______________________
2020-06-23T00:01:31.3083472Z 
2020-06-23T00:01:31.3083628Z     def test_squeezedsparseconversion_dtype():
2020-06-23T00:01:31.3084006Z         N = 10
2020-06-23T00:01:31.3084163Z         x_t = keras.Input(batch_shape=(1, N, 1), dtype="float64")
2020-06-23T00:01:31.3084326Z         A_ind = keras.Input(batch_shape=(1, None, 2), dtype="int64")
2020-06-23T00:01:31.3084492Z         A_val = keras.Input(batch_shape=(1, None), dtype="float32")
2020-06-23T00:01:31.3084646Z     
2020-06-23T00:01:31.3084807Z         A_mat = SqueezedSparseConversion(shape=(N, N), dtype="float64")([A_ind, A_val])
2020-06-23T00:01:31.3084966Z     
2020-06-23T00:01:31.3085115Z         x_out = keras.layers.Lambda(
2020-06-23T00:01:31.3085378Z             lambda xin: K.expand_dims(K.dot(xin[0], K.squeeze(xin[1], 0)), 0)
2020-06-23T00:01:31.3085533Z         )([A_mat, x_t])
2020-06-23T00:01:31.3085665Z     
2020-06-23T00:01:31.3085985Z         model = keras.Model(inputs=[x_t, A_ind, A_val], outputs=x_out)
2020-06-23T00:01:31.3086136Z     
2020-06-23T00:01:31.3086280Z         x = np.random.randn(1, N, 1)
2020-06-23T00:01:31.3086438Z         A_indices, A_values, A = sparse_matrix_example(N)
2020-06-23T00:01:31.3086592Z     
2020-06-23T00:01:31.3086753Z         z = model.predict([x, np.expand_dims(A_indices, 0), np.expand_dims(A_values, 0)])
2020-06-23T00:01:31.3086912Z     
2020-06-23T00:01:31.3087055Z         assert A_mat.dtype == tf.dtypes.float64
2020-06-23T00:01:31.3087822Z >       np.testing.assert_allclose(z.squeeze(), A.dot(x.squeeze()), atol=1e-7)
2020-06-23T00:01:31.3087981Z E       AssertionError: 
2020-06-23T00:01:31.3088303Z E       Not equal to tolerance rtol=1e-07, atol=1e-07
2020-06-23T00:01:31.3088448Z E       
2020-06-23T00:01:31.3088595Z E       Mismatched elements: 1 / 10 (10%)
2020-06-23T00:01:31.3088913Z E       Max absolute difference: 1.34913013e-07
2020-06-23T00:01:31.3089317Z E       Max relative difference: 1.06145628e-06
2020-06-23T00:01:31.3089648Z E        x: array([ 0.      ,  0.      , -0.121112,  0.      , -0.27779 , -0.127102,
2020-06-23T00:01:31.3089974Z E               0.      , -0.117679, -1.698527,  0.      ], dtype=float32)
2020-06-23T00:01:31.3090312Z E        y: array([ 0.      ,  0.      , -0.121112,  0.      , -0.27779 , -0.127102,
2020-06-23T00:01:31.3090797Z E               0.      , -0.117679, -1.698527,  0.      ])
2020-06-23T00:01:31.3090879Z 
2020-06-23T00:01:31.3091022Z tests/layer/test_misc.py:82: AssertionError

Expected behavior

The test should pass consistently.

Environment

Operating system: Ubuntu CI

Python version: 3.8

Package versions: StellarGraph https://github.com/stellargraph/stellargraph/commit/bcf3ab08ed3b300ca2c9df3d0e9bebc3f67cd357

Additional context

N/A

closed time in a month

huonw

issue closedstellargraph/stellargraph

Hyperbolic tests fail on Windows: ValueError: high is out of bounds for int32

Describe the bug

CI run on Windows: #1696 https://github.com/stellargraph/stellargraph/pull/1696/checks?check_run_id=793669259 Results: https://gistpreview.github.io/?f35d36d87f30d84d9c39a7e90512b240#3a757c57-c84c-4ba4-9262-33306656c50d

To Reproduce

Run tests on Windows

Observed behavior

The following tests fail:

tests.utils.test_hyperbolic.test_poincare_ball_exp_specialisation
tests.utils.test_hyperbolic.test_poincare_ball_distance_self
tests.utils.test_hyperbolic.test_poincare_ball_distance_exp
tests.utils.test_hyperbolic.test_poincare_ball_distance_vs_euclidean

with an error like:

@pytest.fixture
    def seeded():
>       seed = np.random.randint(2 ** 32)

tests\utils\test_hyperbolic.py:25: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
mtrand.pyx:743: in numpy.random.mtrand.RandomState.randint
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   ValueError: high is out of bounds for int32

_bounded_integers.pyx:1343: ValueError
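For context, np.random.randint defaults to the platform's long type, which is 32-bit on Windows, so a bound of 2 ** 32 overflows there. A hedged sketch of one possible fix (not necessarily the one that was applied):

```python
import numpy as np

# request a 64-bit result explicitly so the exclusive bound 2 ** 32 is
# valid on every platform, including Windows where the default is int32
seed = np.random.randint(2 ** 32, dtype=np.int64)
```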

Expected behavior

Tests should pass.

Environment

Operating system: Windows

Python: 3.6

StellarGraph develop: https://github.com/stellargraph/stellargraph/commit/e9a6ea19a8be05583ef1319bd50db2072123b596

Additional context

N/A

closed time in a month

huonw

issue closedstellargraph/stellargraph

Cora loading tests fail on windows: AssertionError: assert dtype('int64') == int

Describe the bug

CI run on Windows: #1696 https://github.com/stellargraph/stellargraph/pull/1696/checks?check_run_id=793669259 Results: https://gistpreview.github.io/?f35d36d87f30d84d9c39a7e90512b240#3a757c57-c84c-4ba4-9262-33306656c50d

To Reproduce

Run tests on Windows

Observed behavior

The following tests fail:

tests.datasets.test_datasets.test_cora_load[False-False-False]
tests.datasets.test_datasets.test_cora_load[False-False-True]
tests.datasets.test_datasets.test_cora_load[False-True-False]
tests.datasets.test_datasets.test_cora_load[False-True-True]
tests.datasets.test_datasets.test_cora_load[True-False-False]
tests.datasets.test_datasets.test_cora_load[True-False-True]
tests.datasets.test_datasets.test_cora_load[True-True-False]
tests.datasets.test_datasets.test_cora_load[True-True-True]

with an error like:

>       assert g.nodes().dtype == int
E       AssertionError: assert dtype('int64') == int
E        +  where dtype('int64') = Int64Index([  31336, 1061127, 1106406,   13195,   37879, 1126012, 1107140,\n            1102850,   31349, 1106418,\n    ...454, 1131184, 1128974, 1128975, 1128977,\n            1128978,  117328,   24043],\n           dtype='int64', length=2708).dtype
E        +    where Int64Index([  31336, 1061127, 1106406,   13195,   37879, 1126012, 1107140,\n            1102850,   31349, 1106418,\n    ...454, 1131184, 1128974, 1128975, 1128977,\n            1128978,  117328,   24043],\n           dtype='int64', length=2708) = <bound method StellarGraph.nodes of <stellargraph.core.graph.StellarGraph object at 0x000001EC1EF736A0>>()
E        +      where <bound method StellarGraph.nodes of <stellargraph.core.graph.StellarGraph object at 0x000001EC1EF736A0>> = <stellargraph.core.graph.StellarGraph object at 0x000001EC1EF736A0>.nodes

tests\datasets\test_datasets.py:194: AssertionError
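For context, NumPy maps the builtin int to the platform's default integer, which is int32 on Windows, while the loaded index is int64. A hedged sketch of a platform-independent assertion (not necessarily the fix that was applied; g is the graph loaded in the test):

```python
import numpy as np

# compare against an explicit fixed-width dtype instead of the builtin
# int, whose NumPy equivalent differs between platforms
assert g.nodes().dtype == np.int64
```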

Expected behavior

Pass

Environment

Operating system: Windows

Python: 3.6

StellarGraph develop: https://github.com/stellargraph/stellargraph/commit/e9a6ea19a8be05583ef1319bd50db2072123b596

Additional context

N/A

closed time in a month

huonw

issue closedstellargraph/stellargraph

APPNP, GCN, RGCN tests fail on windows: ValueError: Unsupported numpy type: NPY_INT

Describe the bug

CI run on Windows: #1696 https://github.com/stellargraph/stellargraph/pull/1696/checks?check_run_id=793669259 Results: https://gistpreview.github.io/?f35d36d87f30d84d9c39a7e90512b240#3a757c57-c84c-4ba4-9262-33306656c50d

To Reproduce

Run tests on Windows

Observed behavior

The following tests fail:

tests.layer.test_appnp.test_APPNP_apply_sparse
tests.layer.test_appnp.test_APPNP_linkmodel_apply_sparse
tests.layer.test_appnp.test_APPNP_apply_propagate_model_sparse
tests.layer.test_deep_graph_infomax.test_dgi[True-RGCN]
tests.layer.test_gcn.test_GraphConvolution_sparse
tests.layer.test_gcn.test_GCN_apply_sparse
tests.layer.test_gcn.test_GCN_linkmodel_apply_sparse
tests.layer.test_rgcn.test_RelationalGraphConvolution_sparse
tests.layer.test_rgcn.test_RGCN_apply_sparse
tests.layer.test_rgcn.test_RGCN_apply_sparse_directed

with an error like:

E     ValueError: Failed to convert a NumPy array to a Tensor (Unsupported numpy type: NPY_INT).

Expected behavior

Tests should pass.

Environment

Operating system: Windows

Python: 3.6

StellarGraph develop: https://github.com/stellargraph/stellargraph/commit/e9a6ea19a8be05583ef1319bd50db2072123b596

Additional context

N/A

closed time in a month

huonw

issue closedstellargraph/stellargraph

Add support for edge weights to Watch Your Step

Description

Watch Your Step uses an adjacency matrix. It could use a weighted adjacency matrix to support edge weights.
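For reference, StellarGraph can already produce the weighted matrix this would need; a minimal sketch (Cora is just a stand-in, and all of its weights happen to be 1):

```python
import stellargraph as sg

G, _ = sg.datasets.Cora().load()

# the unweighted adjacency Watch Your Step effectively uses today,
# vs the weighted one it could consume instead (a scipy sparse matrix)
A = G.to_adjacency_matrix()
A_weighted = G.to_adjacency_matrix(weighted=True)
```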

closed time in a month

huonw

issue closedstellargraph/stellargraph

Test on Windows on Github Actions

Description

Windows is a common platform. We should test on it.

There are some issues that seem to be Windows-specific: #1668, #1669, #1670, #1698, #1699, #1700.

closed time in a month

huonw

issue closedstellargraph/stellargraph

Unsupervised Cluster-GCN via Deep Graph Infomax

Description

Cluster-GCN is similar to GCN, and so it should be trainable in an unsupervised fashion with DGI.

closed time in a month

huonw

issue closedstellargraph/stellargraph

Knowledge graph model documentation lists only the constructor, no methods

Describe the bug

With #1573, the documentation for most of the knowledge graph model methods has disappeared.

To Reproduce

Steps to reproduce the behavior:

  1. Look at the documentation for a knowledge graph model, e.g. ComplEx https://stellargraph.readthedocs.io/en/latest/api.html#stellargraph.layer.ComplEx

Observed behavior

[screenshot: the ComplEx API documentation, showing only the constructor]

Expected behavior

Other methods should be included (see the sketch after this list):

  • in_out_tensors
  • rank_edges_against_all_nodes
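A hedged sketch of those methods in use, on a tiny hypothetical graph (the graph, dimensions and batch size are all placeholders):

```python
import pandas as pd
import stellargraph as sg

# a tiny hypothetical knowledge graph, just to exercise the methods
edges = pd.DataFrame(
    {"source": ["a", "b"], "target": ["b", "c"], "label": ["r1", "r2"]}
)
graph = sg.StellarDiGraph(
    nodes=pd.DataFrame(index=["a", "b", "c"]),
    edges=edges,
    edge_type_column="label",
)

gen = sg.mapper.KGTripleGenerator(graph, batch_size=2)
complex_model = sg.layer.ComplEx(gen, embedding_dimension=10)

x_in, x_out = complex_model.in_out_tensors()  # should be documented
ranks = complex_model.rank_edges_against_all_nodes(gen.flow(edges), graph)  # too
```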

Environment

Operating system: N/A

Python version: N/A

Package versions: https://github.com/stellargraph/stellargraph/commit/ef5389c3411f06d02bc28fd48c80c73ab675558f

Additional context

N/A

closed time in a month

huonw

issue closedstellargraph/stellargraph

API docs and demos should have more inter-links

Description

Our API docs and our demos are somewhat separate, e.g. https://stellargraph.readthedocs.io/en/v1.1.0/api.html#stellargraph.StellarGraph.from_networkx doesn't link to the NetworkX demo.

There should be more links, to make it easier for people to jump between them and find example code.

closed time in a month

huonw

issue closedstellargraph/stellargraph

Demo notebook navigation would be easier with more information in side-bar ToC

Describe the bug

We have quite a few notebooks, with logically distinct subsections. It might be nice to have this represented in the sidebar, so that someone can more easily jump to adjacent notebooks for a task or to a different section.

For instance, in the Pandas notebook, it'd be nice to be able to see the various sections to jump back to them, and see the NetworkX notebooks (and so on):

[screenshot: the sidebar for the Pandas notebook, showing no subsections or sibling notebooks]

closed time in a month

huonw

issue closedstellargraph/stellargraph

Implement RotatE's self-adversarial loss function

Description

These are equations (5) and (6) from [1], and that paper includes a comparison of different scoring techniques in Table 7.

[1] Z. Sun, Z.-H. Deng, J.-Y. Nie, and J. Tang, “RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space,” arXiv:1902.10197, Feb. 2019.

The sampling distribution (5) and loss (6) are:

$$p(h'_j, r, t'_j \mid \{(h_i, r_i, t_i)\}) = \frac{\exp(\alpha f_r(h'_j, t'_j))}{\sum_i \exp(\alpha f_r(h'_i, t'_i))} \tag{5}$$

$$L = -\log \sigma(\gamma - d_r(h, t)) - \sum_{i=1}^{n} p(h'_i, r, t'_i) \log \sigma(d_r(h'_i, t'_i) - \gamma) \tag{6}$$

where $\alpha$ is the temperature of sampling, $\gamma$ is a fixed margin, $\sigma$ is the sigmoid, $d_r$ is the distance scoring function, and $f_r = -d_r$.
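A minimal TensorFlow sketch of these equations, purely illustrative (the names, shapes and default alpha and gamma values are assumptions; following the paper, no gradient flows through the weights from (5)):

```python
import tensorflow as tf

def self_adversarial_loss(pos_scores, neg_scores, alpha=1.0, gamma=12.0):
    """pos_scores: [batch] scores f_r(h, t) for true triples;
    neg_scores: [batch, n] scores for n negative samples per triple."""
    # equation (5): softmax over each triple's negatives at temperature
    # alpha, used as fixed weights (no gradient through the weighting)
    weights = tf.stop_gradient(tf.nn.softmax(alpha * neg_scores, axis=-1))
    # equation (6), with f_r = -d_r
    positive = tf.math.log_sigmoid(gamma + pos_scores)
    negative = tf.reduce_sum(
        weights * tf.math.log_sigmoid(-neg_scores - gamma), axis=-1
    )
    return -tf.reduce_mean(positive + negative)
```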

closed time in a month

huonw
more