{% extends "default.html" %}

{% block body %}

<h1> What is RNAmigos? </h1>

RNAmigos is a graph neural network that predicts small-molecule ligands for RNA binding sites. <br>

We trained RNAmigos on a set of RNA-small molecule complexes from the Protein Data Bank (PDB) to predict the associated ligand given only the binding site structure.

The input to RNAmigos is an RNA binding site structure. We accept either atomic coordinates, which are converted internally to a base pairing network, or a base pairing network supplied directly.

From the base pairing network, we predict a molecular fingerprint.

The resulting fingerprint can be used to identify similar molecules in a ligand database and thus accelerate virtual screening.

For full details, see the article listed under Citing.



<h1> What isn't RNAmigos? </h1>

RNAmigos is not a docking tool or an affinity predictor, so we do not require a docked complex as input and we do not return an affinity score.
We also do not scan full RNAs for binding sites; this tool assumes that the input structure already represents a binding site.

<h1> Uploading Structure Data </h1>

We accept two input formats: <code>mmCIF (.cif)</code> and <code>JSON (.json)</code>. mmCIF files describe atomic coordinates; JSON files describe a base pairing graph encoded in the <a href="https://networkx.github.io/documentation/networkx-1.10/reference/generated/networkx.readwrite.json_graph.node_link_data.html#networkx.readwrite.json_graph.node_link_data">node-link</a> format.

The graph must have a node attribute <code>'nt'</code> which stores one of 5 possible nucleotides as characters <code>'A', 'U', 'C', 'G', 'N'</code> and the edges must have a <code>'label'</code> attribute which stores one of 13 possible edge labels, according to the Leontis-Westhof nomenclature.

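Here is a sample node-link graph, shown as a Python dictionary. When it is serialized to JSON, strings take double quotes, <code>False</code> becomes <code>false</code>, and tuples become arrays: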

<pre>
<code>
{'directed': False,
 'multigraph': False,
 'graph': {},
 'nodes': [{'nt': 'A', 'id': ('A', 5)},
  {'nt': 'G', 'id': ('A', 6)},
  {'nt': 'U', 'id': ('A', 26)},
  {'nt': 'A', 'id': ('A', 7)},
  {'nt': 'C', 'id': ('A', 25)},
  {'nt': 'U', 'id': ('A', 9)},
  {'nt': 'U', 'id': ('A', 8)},
  {'nt': 'U', 'id': ('A', 24)},
  {'nt': 'A', 'id': ('A', 11)},
  {'nt': 'G', 'id': ('A', 10)},
  {'nt': 'C', 'id': ('A', 23)},
  {'nt': 'U', 'id': ('A', 22)},
  {'nt': 'G', 'id': ('A', 12)},
  {'nt': 'C', 'id': ('A', 13)},
  {'nt': 'C', 'id': ('A', 21)},
  {'nt': 'G', 'id': ('A', 20)}],
 'links': [{'label': 'B53', 'source': ('A', 5), 'target': ('A', 6)},
  {'label': 'CWW', 'source': ('A', 5), 'target': ('A', 26)},
  {'label': 'B53', 'source': ('A', 6), 'target': ('A', 7)},
  {'label': 'CWW', 'source': ('A', 6), 'target': ('A', 25)},
  {'label': 'TSW', 'source': ('A', 6), 'target': ('A', 9)},
  {'label': 'CSS', 'source': ('A', 26), 'target': ('A', 9)},
  {'label': 'B53', 'source': ('A', 26), 'target': ('A', 25)},
  {'label': 'B53', 'source': ('A', 7), 'target': ('A', 8)},
  {'label': 'CWW', 'source': ('A', 7), 'target': ('A', 24)},
  {'label': 'B53', 'source': ('A', 25), 'target': ('A', 24)},
  {'label': 'B53', 'source': ('A', 9), 'target': ('A', 8)},
  {'label': 'B53', 'source': ('A', 9), 'target': ('A', 10)},
  {'label': 'CHW', 'source': ('A', 8), 'target': ('A', 11)},
  {'label': 'B53', 'source': ('A', 24), 'target': ('A', 23)},
  {'label': 'B53', 'source': ('A', 11), 'target': ('A', 10)},
  {'label': 'B53', 'source': ('A', 11), 'target': ('A', 12)},
  {'label': 'CWW', 'source': ('A', 11), 'target': ('A', 22)},
  {'label': 'CWW', 'source': ('A', 10), 'target': ('A', 23)},
  {'label': 'CSW', 'source': ('A', 10), 'target': ('A', 22)},
  {'label': 'B53', 'source': ('A', 23), 'target': ('A', 22)},
  {'label': 'B53', 'source': ('A', 22), 'target': ('A', 21)},
  {'label': 'B53', 'source': ('A', 12), 'target': ('A', 13)},
  {'label': 'CWW', 'source': ('A', 12), 'target': ('A', 21)},
  {'label': 'CWW', 'source': ('A', 13), 'target': ('A', 20)},
  {'label': 'B53', 'source': ('A', 21), 'target': ('A', 20)}]}
</code>
</pre>

The <code>'id'</code> attribute can be whatever you like. In this case, it represents a chain and position in a PDB structure.
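
The requirements above can be checked before uploading. The following is a minimal sketch using only the Python standard library; the names <code>check_node_link</code> and <code>ALLOWED_NT</code> are hypothetical and not part of RNAmigos, and the check only confirms that each edge carries a non-empty <code>'label'</code> string, since the full list of 13 labels is not reproduced here.

```python
import json

# Nucleotide characters allowed in the 'nt' node attribute.
ALLOWED_NT = {"A", "U", "C", "G", "N"}


def _key(node_id):
    # JSON arrays load as lists, which are unhashable; normalise to tuples.
    return tuple(node_id) if isinstance(node_id, list) else node_id


def check_node_link(graph):
    """Return a list of problems found; an empty list means the graph looks OK."""
    problems = []
    node_ids = set()
    for node in graph.get("nodes", []):
        node_ids.add(_key(node.get("id")))
        if node.get("nt") not in ALLOWED_NT:
            problems.append(f"node {node.get('id')!r}: 'nt' must be one of A, U, C, G, N")
    for link in graph.get("links", []):
        label = link.get("label")
        if not isinstance(label, str) or not label:
            problems.append(f"link {link.get('source')!r}-{link.get('target')!r}: missing 'label'")
        for end in ("source", "target"):
            if _key(link.get(end)) not in node_ids:
                problems.append(f"link endpoint {link.get(end)!r} is not a node id")
    return problems


# A three-node excerpt of the sample graph.
graph = {
    "directed": False,
    "multigraph": False,
    "graph": {},
    "nodes": [
        {"nt": "A", "id": ("A", 5)},
        {"nt": "G", "id": ("A", 6)},
        {"nt": "U", "id": ("A", 26)},
    ],
    "links": [
        {"label": "B53", "source": ("A", 5), "target": ("A", 6)},
        {"label": "CWW", "source": ("A", 5), "target": ("A", 26)},
    ],
}

problems = check_node_link(graph)
serialized = json.dumps(graph, indent=1)  # tuples are written as JSON arrays
```

Writing <code>serialized</code> to a <code>.json</code> file gives a valid JSON document; the tuple ids appear there as two-element arrays such as <code>["A", 5]</code>.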

<b>IMPORTANT:</b> The file must not exceed 1 MB. We do not process whole RNAs, only binding sites, which should contain just a handful of residues.

<h1> Uploading Ligand Database </h1>

If you provide your own ligand database to screen, we will return the ligands in that database that are most similar to the predicted fingerprint.

The format for each line is <code>[SMILES] [some data]</code>.

Here is a sample file with 10 ligands.

<pre>
<code>
c1ccc(cc1)[C@@H](C(=O)O)N 004
c1nc(c2c(n1)n3c(n2)[C@@H]([C@@H]4[C@H](C[C@H]3O4)O)OP(=O)(O)O)N 02I
C1COCCN1CC2=CC[C@H](NC2)C(=O)O 04X
c1nc(c2c(n1)n(cn2)[C@@H]3[C@H]([C@H]([C@@H](O3)COP(=O)(O)O)O)O)N 0A
C1=CN(C(=O)N=C1N)[C@@H]2[C@H]([C@H]([C@@H](O2)COP(=O)(O)O)O)O 0C
COc1cc2c(cc1OC)nc(nc2N)N3CCNCC3 0EC
c1nc2c(n1[C@@H]3[C@H]([C@H]([C@@H](O3)COP(=O)(O)O)O)O)N=C(NC2=O)N 0G
c1ccc(cc1)C[C@H](C(=O)N2CCC[C@H]2C(=O)N[C@@H](CCCNC(=[NH2+])N)[C@@H](CCl)O)N 0G6
CS[C@@H]([C@@H](C(=O)O)N)C(=O)O 0TD
C1=CN(C(=O)NC1=O)[C@@H]2[C@H]([C@H]([C@@H](O2)COP(=O)(O)O)O)O 0U
</code>
</pre>
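
The line format above can be read with a few lines of Python. This is a minimal sketch, assuming the SMILES string and the trailing data are separated by whitespace as in the sample; the function names <code>parse_ligand_line</code> and <code>parse_ligand_file</code> are hypothetical and not part of RNAmigos.

```python
def parse_ligand_line(line):
    """Split one line into (smiles, data); return None for a blank line."""
    # SMILES strings contain no whitespace, so split once on the first gap.
    parts = line.strip().split(None, 1)
    if not parts:
        return None
    smiles = parts[0]
    data = parts[1] if len(parts) > 1 else ""
    return smiles, data


def parse_ligand_file(text):
    """Parse the whole file content into a list of (smiles, data) pairs."""
    ligands = []
    for line in text.splitlines():
        parsed = parse_ligand_line(line)
        if parsed is not None:
            ligands.append(parsed)
    return ligands


# Two lines taken from the sample file.
sample = """c1ccc(cc1)[C@@H](C(=O)O)N 004
COc1cc2c(cc1OC)nc(nc2N)N3CCNCC3 0EC
"""
ligands = parse_ligand_file(sample)
```

Running the parser over a database before uploading is a quick way to spot lines that are blank or missing the SMILES field.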

The default library consists of all ligands that have been found co-crystallized with RNA in the Protein Data Bank (PDB).


<h1> Interpreting output </h1>

When your query completes, we display a list of the 30 most similar ligands to the prediction, chosen from the ligand library. 

You can then download the full list of distances to each element in the library by clicking the 'Download' button.
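
To illustrate how a predicted fingerprint can be ranked against a ligand library, here is a toy sketch. It assumes Jaccard (Tanimoto) distance between bit vectors purely for illustration; the distance RNAmigos actually uses is not stated on this page. The 8-bit fingerprints, ligand names, and function names are made up for the example.

```python
def jaccard_distance(a, b):
    """Return 1 - |a AND b| / |a OR b| for two equal-length 0/1 sequences."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero fingerprints are treated as identical.
    return 0.0 if either == 0 else 1.0 - both / either


def rank_library(predicted, library, top=30):
    """Return up to `top` (distance, name) pairs, closest first."""
    scored = [(jaccard_distance(predicted, fp), name) for name, fp in library.items()]
    scored.sort()
    return scored[:top]


# Toy data: real MACCS fingerprints are much longer than 8 bits.
predicted = [1, 0, 1, 1, 0, 0, 1, 0]
library = {
    "lig_a": [1, 0, 1, 1, 0, 0, 1, 0],
    "lig_b": [1, 0, 1, 0, 0, 0, 0, 0],
    "lig_c": [0, 1, 0, 0, 1, 1, 0, 1],
}
hits = rank_library(predicted, library)
```

Here <code>lig_a</code> matches the prediction exactly and ranks first, while <code>lig_c</code> shares no set bits with it and ranks last.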

When you download the results, you will get three files: <code>hits.csv</code>, a list of ligands and their distances to the prediction; <code>graph.json</code>, a JSON description of the base pairing network used to make the prediction; and <code>fingerprint.txt</code>, the predicted MACCS fingerprint.

<h1> Source Code </h1>

<a href="http://jwgitlab.cs.mcgill.ca/cgoliver/rnamigos_gcn/tree/master">GitLab</a><br>
<a href="https://github.com/cgoliver/RNAmigos">GitHub</a>


<h1> Citing </h1>

<pre>
<code>
@article {Oliver701326,
	author = {Oliver, Carlos and Mallet, Vincent and Sarrazin Gendron, 
	Roman and Reinharz, Vladimir and Hamilton, 
	William L. and Moitessier, Nicolas and Waldisp{\"u}hl, J{\'e}r{\^o}me},
	title = {Augmented base pairing networks encode RNA-small molecule binding preferences},
	elocation-id = {701326},
	year = {2020},
	doi = {10.1101/701326},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2020/02/01/701326},
	eprint = {https://www.biorxiv.org/content/early/2020/02/01/701326.full.pdf},
	journal = {bioRxiv}
}
</code>
</pre>

<h1> Contact </h1>

<code>rnamigos@cs.mcgill.ca</code> <br><br>

{% endblock body %}