Commit 70d480cb authored by Carlos GO

readme predict

parent aab6faed
@@ -94,7 +94,7 @@ The output of the GED computation on a set of graphs is a pickled list object co
You can convert this output to a distance matrix and a list indicating which graph each entry in the distance matrix corresponds to.
```python
>>> from RNAmigos.post_ged import data_prepare
>>> from RNAmigos.post_ged import prepare_data
>>> geds = '../data/geds_delta.pickle'
>>> fps = '../data/all_rna_ligands_fingerprints.pickle'
>>> DM, L, graphlist = prepare_data(geds, fps)
@@ -149,10 +149,23 @@ array([ 8., 10., 10., 8., 22., 10., 6., 8., 8., 18., 20., 8., 10.,
### Embedding a single graph
```python
>>> G = nx.read_gpickle('/data/1jau.nxpickle')
>>> prototypes = pickle.load(open('/data/sample_prototoypes.pickle', 'rb'))
>>> graph_embed(G, prototypes)
array([1., 0., 2., 4., 1., 4., 5.])
>>> G = nx.read_gpickle('data/1jau.nxpickle')
>>> prototypes = pickle.load(open('data/sample_prototoypes.pickle', 'rb'))
>>> x = graph_embed(G, prototypes)
>>> x
array([ 0, 21, 17, 19, 17, 17, 13, 12, 11, 17, 14, 18, 17, 11, 17, 15, 7,
11, 19, 9, 19, 4, 14, 15, 23, 18, 18, 7, 19, 18, 16, 24, 23, 16,
16, 16, 24, 19, 13, 15, 14, 12, 13, 21, 17, 16, 17, 17, 6, 12, 13,
14, 2, 18, 19, 18, 16, 20, 19, 19, 23, 2, 19, 19, 4, 3, 4, 17,
18, 18, 9, 15, 17, 19, 21, 22, 18, 1, 21, 21, 16, 15, 18, 10, 2,
11, 0, 14, 19, 10, 15, 18, 14, 13, 11, 23, 1, 0, 3, 6, 2, 14,
21, 19, 22, 19, 20, 16, 17, 1, 20, 18, 11, 0, 10, 1, 3, 21, 21,
10, 27, 16, 14, 4, 5, 0, 14, 18, 12, 15, 18, 1, 1, 18, 6, 20,
0, 19, 17, 23, 14, 19, 16, 17, 17, 21, 21, 29, 16, 18, 18, 20, 19,
16, 18, 1, 1, 1, 18, 3, 0, 15, 21, 18, 15, 17, 12, 18, 6, 1,
8, 3, 6, 21, 19, 1, 15, 22, 14, 20, 20, 20, 16, 20, 16, 9, 5,
5, 13, 6])
```
@@ -163,7 +176,24 @@ User-friendly API coming soon.
Once all graphs are embedded, we have the standard machine learning input matrix $X$ with $n$ examples as rows and $r$ features as columns, where each feature is the distance to one prototype.
Any type of classification can now be performed, using a label (output) vector for single-output classification or a label matrix for multi-output classification.
Alternatively, we can classify graphs using k-nearest neighbours and skip the embedding procedure.
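The k-nearest-neighbours route can be sketched as follows. This is a minimal illustration, not the project's own code: `knn_predict` is a hypothetical helper, and the distances and labels are toy values standing in for one row of a GED distance matrix and the labels of the training graphs.

```python
from collections import Counter


def knn_predict(distances, labels, k=3):
    """Predict a label from distances between a query and each training graph.

    distances: distance from the query graph to each training graph
    labels: one hashable label per training graph, in the same order
    """
    if len(distances) != len(labels):
        raise ValueError("distances and labels must have the same length")
    # Indices of the k training graphs closest to the query.
    nearest = sorted(range(len(distances)), key=lambda i: distances[i])[:k]
    # Majority vote over the labels of those neighbours.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy values (assumed for illustration only).
query_distances = [8.0, 10.0, 22.0, 6.0, 18.0]
train_labels = ["A", "B", "B", "A", "B"]
prediction = knn_predict(query_distances, train_labels, k=3)
```

Because only pairwise distances are needed, no prototype selection or embedding step is involved; the trade-off is that a distance to every training graph must be computed for each query.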
You can load a pre-trained model from `models/` and use it to make predictions for newly embedded graphs:
```python
>>> mlp = pickle.load(open('models/mlp_pre_train_spanning_190.pickle', 'rb'))
>>> x = np.reshape(x, (1, 190))
>>> mlp.predict(x)
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0,
0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1,
0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1,
1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0]])
```
The result is the 166-bit MACCS molecular fingerprint predicted for the given graph embedding.
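A predicted bit vector like the one above can be inspected by listing its set bits, or compared against a known fingerprint. The sketch below is an assumption about how one might do that, not something this README describes: `on_bits` and `tanimoto` are hypothetical helpers, and the 8-bit vectors are toy stand-ins for 166-bit fingerprints.

```python
def on_bits(fingerprint):
    """Return the positions of the bits that are set to 1."""
    return [i for i, bit in enumerate(fingerprint) if bit]


def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two equal-length bit vectors."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a, b = set(on_bits(fp_a)), set(on_bits(fp_b))
    if not a and not b:
        # Convention chosen here: two all-zero fingerprints count as identical.
        return 1.0
    return len(a & b) / len(a | b)


# Toy 8-bit vectors (assumed for illustration only).
predicted = [0, 1, 1, 0, 0, 1, 0, 0]
reference = [0, 1, 0, 0, 0, 1, 1, 0]
similarity = tanimoto(predicted, reference)
```

With the real model output, the first row of the array returned by `mlp.predict(x)` would take the place of `predicted`.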
@@ -200,7 +200,7 @@ def graph_embed(G, prototypes):
`array`: numpy array representing embedding vector.
"""
embedding = np.zeros(len(prototypes))
embedding = []
for p in prototypes:
ops,_,_ = ged((G,p), source_only=True)
embedding.append(ops.cost)
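
The loop above builds the embedding as one distance per prototype. A self-contained sketch of the same idea follows, under loud assumptions: `embed` and `edge_set_distance` are hypothetical names, and the edge-set distance is only a crude stand-in for the `ged` call, which is not reproduced here.

```python
def edge_set_distance(edges_a, edges_b):
    """Crude stand-in for a graph edit distance: the number of edges
    present in exactly one of the two graphs."""
    return len(set(edges_a) ^ set(edges_b))


def embed(item, prototypes, distance):
    """Embed an item as the vector of its distances to each prototype."""
    return [distance(item, p) for p in prototypes]


# Toy graphs given as sets of edges (assumed for illustration only).
graph = {(1, 2), (2, 3)}
toy_prototypes = [
    {(1, 2), (2, 3)},  # identical to the query graph
    {(1, 2)},          # one edge missing
    {(3, 4), (4, 5)},  # no edges in common
]
embedding = embed(graph, toy_prototypes, edge_set_distance)
```

The length of the embedding always equals the number of prototypes, which is why a list that is appended to (or a preallocated array that is assigned by index) works, while appending to a preallocated array does not.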