Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them as pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50, drug-target binding data from BindingDB using IC50 measurements. (1) The small molecule is C[C@@H](O)c1c2ccccc2cc2ccccc12. The target protein (O75795) has sequence MSLKWMSVFLLMQLSCYFSSGSCGKVLVWPTEYSHWINMKTILEELVQRGHEVIVLTSSASILVNASKSSAIKLEVYPTSLTKNDLEDFFMKMFDRWTYSISKNTFWSYFSQLQELCWEYSDYNIKLCEDAVLNKKLMRKLQESKFDVLLADAVNPCGELLAELLNIPFLYSLRFSVGYTVEKNGGGFLFPPSYVPVVMSELSDQMIFMERIKNMIYMLYFDFWFQAYDLKKWDQFYSEVLGRPTTLFETMGKAEMWLIRTYWDFEFPRPFLPNVDFVGGLHCKPAKPLPKEMEEFVQSSGENGIVVFSLGSMISNMSEESANMIASALAQIPQKVLWRFDGKKPNTLGSNTRLYKWLPQNDLLGHPKTKAFITHGGTNGIYEAIYHGIPMVGIPLFADQHDNIAHMKAKGAALSVDIRTMSSRDLLNALKSVINDPIYKENIMKLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRVAAHNLTWIQYHSLDVIAFLLA.... The pIC50 is 3.1. (2) The drug is CC(=O)NCCc1ccc(O)c(-c2c(O)c(O)c3c(c2O)C(=O)c2c(cc(O)c(C(=O)O)c2C(=O)O)C3=O)c1. The target protein sequence is TKLVYQIFDTFFAEQIEKDDREDKENAFKRRRCGVCEVCQQPECGKCKACKDMVKFGGSGRSKQACQERRCPNMAMKEADDDEEVDDNIPEMPSPKKMHQGKKKKQNKNRISWVGEAVKTDGKKSYYKKVCIDAETLEVGDCVSVIPDDSSKPLYLARVTALWEDSSNGQMFHAHWFCAGTDTVLGATSDPLELFLVDECEDMQLSYIHSKVKVIYKAPSENWAMEGGMDPESLLEGDDGKTYFYQLWYDQDYARFESPPKTQPTEDNKFKFCVSCARLAEMRQKEIPRVLEQLEDLDSRVLYYSATKNGILYRVGDGVYLPPEAFTFNIKLSSPVKRPRKEPVDEDLYPEHYRKYSDYIKGSNLDAPEPYRIGRIKEIFCPKKSNGRPNETDIKIRVNKFYRPENTHKSTPASYHADINLLYWSDEEAVVDFKAVQGRCTVEYGEDLPECVQVYSMGGPNRFYFLEAYNAKSKSFEDPPNHARSPGNKGKGKGKGKGKP.... The pIC50 is 6.2. (3) The compound is Cn1cc(NC(=O)N2CCC[C@H]2C(=O)NC(c2ccccc2)c2ccccc2)c2ccccc21. The target protein sequence is GRILGGREAEAHARPYMASVQLNGAHLCGGVLVAEQWVLSAAHCLEDAADGKVQVLLGAHSLSQPEPSKRLYDVLRAVPHPDSQPDTIDHDLLLLQLSEKATLGPAVRPLPWQRVDRDVAPGTLCDVAGWGIVNHAGRRPDSLQHVLLPVLDRATCNRRTHHDGAITERLMCAESNRRDSCKGDSGGPLVCGGVLEGVVTSGSRVCGNRKKPGIYTRVASYAAWIDSVLA. The pIC50 is 5.2. 
(4) The compound is CC[C@H](C)[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@@H](N)CCC(=O)O)[C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)[C@@H](C)O)[C@@H](C)CC)[C@@H](C)O)C(C)C)[C@@H](C)O. The target protein (P26818) has sequence MADLEAVLADVSYLMAMEKSKATPAARASKKIVLPEPSIRSVMQKYLEERHEITFDKIFNQRIGFLLFKDFCLNEINEAVPQVKFYEEIKEYEKLENEEDRLCRSRQIYDTYIMKELLSCSHPFSKQAVEHVQSHLSKKQVTSTLFQPYIEEICESLRGSIFQKFMESDKFTRFCQWKNVELNIHLTMNDFSVHRIIGRGGFGEVYGCRKADTGKMYAMKCLDKKRIKMKQGETLALNERIMLSLVSTGDCPFIVCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSEKEMRFYATEIILGLEHMHNRFVVYRDLKPANILLDEHGHVRISDLGLACDFSKKKPHASVGTHGYMAPEVLQKGTAYDSSADWFSLGCMLFKLLRGHSPFRQHKTKDKHEIDRMTLTMNVELPDVFSPELKSLLEGLLQRDVSKRLGCHGGSAQELKTHDFFRGIDWQHVYLQKYPPPLIPPRGEVNAADAFDIGSFDEEDTKGIKLLD.... The pIC50 is 4.7. (5) The compound is COc1ccnc(C(=O)Nc2ccccc2Oc2ccc(C(=O)O)c(C(=O)O)c2)c1. The target protein (Q9ET01) has sequence MAKPLTDQEKRRQISIRGIVGVENVAELKKGFNRHLHFTLVKDRNVATPRDYYFALAHTVRDHLVGRWIRTQQHYYDKCPKRVYYLSLEFYMGRTLQNTMINLGLQNACDEAIYQLGLDMEELEEIEEDAGLGNGGLGRLAACFLDSMATLGLAAYGYGIRYEYGIFNQKIREGWQVEEADDWLRHGNPWEKARPEFMLPVHFYGRVEHTQTGTKWVDTQVVLALPYDTPVPGYMNNTVNTMRLWSARAPNDFNLQDFNVGDYIQAVLDRNLAENISRVLYPNDNFFEGKELRLKQEYFVVAATLQDVIRRFKASKFGSKDGMGTVFDAFPDQVAIQLNDTHPALAIPELMRIFVDIEKLPWAKAWEITKKTFAYTNHTVLPEALERWPVELVEKLLPRHLEIIYEINQKHLDRIVALFPKDISRMRRMSLIEEEGGKRINMAHLCIVGCHAVNGVAKIHSDIVKTQVFKDFSELEPDKFQNKTNGITPRRWLLLCNPGL.... The pIC50 is 7.7. (6) The drug is O=C1C(Nc2ccccc2)=CC(=O)c2ncccc21. 
The target protein (P46943) has sequence MLKFRIRPVRHIRCYKRHAYFLRYNHTTTPAQKLQAQIEQIPLENYRNFSIVAHVDHGKSTLSDRLLEITHVIDPNARNKQVLDKLEVERERGITIKAQTCSMFYKDKRTGKNYLLHLIDTPGHVDFRGEVSRSYASCGGAILLVDASQGIQAQTVANFYLAFSLGLKLIPVINKIDLNFTDVKQVKDQIVNNFELPEEDIIGVSAKTGLNVEELLLPAIIDRIPPPTGRPDKPFRALLVDSWYDAYLGAVLLVNIVDGSVRKNDKVICAQTKEKYEVKDIGIMYPDRTSTGTLKTGQVGYLVLGMKDSKEAKIGDTIMHLSKVNETEVLPGFEEQKPMVFVGAFPADGIEFKAMDDDMSRLVLNDRSVTLERETSNALGQGWRLGFLGSLHASVFRERLEKEYGSKLIITQPTVPYLVEFTDGKKKLITNPDEFPDGATKRVNVAAFHEPFIEAVMTLPQEYLGSVIRLCDSNRGEQIDITYLNTNGQVMLKYYLPLSH.... The pIC50 is 3.5. (7) The compound is Cc1cc(C)c(CNC(=O)c2cc(-c3ccc(CN4CCOCC4)cc3)cc(N(CC3CCC3)C3CCOCC3)c2C)c(=O)[nH]1. The target protein sequence is MGQTGKKSEKGPVCWRKRVKSEYMRLRQLKRFRRADEVKSMFSSNRQKILERTEILNQEWKQRRIQPVHILTSVSSLRGTRECSVTSDLDFPTQVIPLKTLNAVASVPIMYSWSPLQQNFMVEDETVLHNIPYMGDEVLDQDGTFIEELIKNYDGKVHGDRECGFINDEIFVELVNALGQYNDDDDDDDGDDPEEREEKQKDLEDHRDDKESRPPRKFPSDKIFEAISSMFPDKGTAEELKEKYKELTEQQLPGALPPECTPNIDGPNAKSVQREQSLHSFHTLFCRRCFKYDCFLHPFHATPNTYKRKNTETALDNKPCGPQCYQHLEGAKEFAAALTAERIKTPPKRPGGRRRGRLPNNSSRPSTPTINVLESKDTDSDREAGTETGGENNDKEEEEKKDETSSSSEANSRCQTPIKMKPNIEPPENVEWSGAEASMFRVLIGTYYDNFCAIARLIGTKTCRQVYEFRVKESSIIAPAPAEDVDTPPRKKKRKHRLWA.... The pIC50 is 7.2.
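The pIC50 definition used for the labels above (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: the function names ic50_to_pic50, pic50_to_ic50, and nanomolar_to_molar are hypothetical helpers, and the nanomolar conversion is included only on the assumption that raw IC50 values are often recorded in nM.

```python
import math


def nanomolar_to_molar(ic50_nm: float) -> float:
    """Convert a concentration from nanomolar (nM) to molar (M)."""
    return ic50_nm * 1e-9


def ic50_to_pic50(ic50_molar: float) -> float:
    """Return pIC50 = -log10(IC50), with IC50 given in molar units."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in molar units = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # An IC50 of 1 micromolar (1e-6 M) gives a pIC50 of about 6.
    print(ic50_to_pic50(1e-6))
    # The same concentration expressed as 1000 nM.
    print(ic50_to_pic50(nanomolar_to_molar(1000.0)))
    # A pIC50 label converted back to an IC50 in M.
    print(pic50_to_ic50(7.2))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean more potent compounds.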