From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is N#Cc1c(NC(=O)c2cc(OCCN)ccc2Cl)sc2c1CCN(Cc1ccc(C(=O)O)cc1)C2. The target protein sequence is MKLTIHEIAQVVGAKNDISIFEDTQLEKAEFDSRLIGTGDLFVPLKGARDGHDFIETAFENGAAVTLSEKEVSNHPYILVDDVLTAFQSLASYYLEKTTVDVFAVTGSNGKTTTKDMLAHLLSTRYKTYKTQGNYNNEIGLPYTVLHMPEGTEKLVLEMGQDHLGDIHLLSELARPKTAIVTLVGEAHLAFFKDRSEIAKGKMQIADGMASGSLLLAPADPIVEDYLPTDKKVVRFGQGAELEITDLVERKDSLTFKANFLEQVLDLPVTGKYNATNAMIASYVALQEGVSEEQIHQAFQDLELTRNRTEWKKAANGADILSDVYNANPTAMKLILETFSAIPANEGGKKIAVLADMKELGNQSVQLHNQMILSLSPDVLDTVIFYGEDIAELAQLASQMFPIGHVYYFKKTEDQDQFEDLVKQVKESLSANDQILLKGSNSMNLAMLVESLENETK. The pIC50 is 3.2. (2) The drug is CC(=O)N1CCN(c2ccc(OCc3cc4cnc(C#N)nc4n3CCC3CCCCC3)cc2)CC1. The target protein (P07154) has sequence MTPLLLLAVLCLGTALATPKFDQTFNAQWHQWKSTHRRLYGTNEEEWRRAVWEKNMRMIQLHNGEYSNGKHGFTMEMNAFGDMTNEEFRQIVNGYRHQKHKKGRLFQEPLMLQIPKTVDWREKGCVTPVKNQGQCGSCWAFSASGCLEGQMFLKTGKLISLSEQNLVDCSHDQGNQGCNGGLMDFAFQYIKENGGLDSEESYPYEAKDGSCKYRAEYAVANDTGFVDIPQQEKALMKAVATVGPISVAMDASHPSLQFYSSGIYYEPNCSSKDLDHGVLVVGYGYEGTDSNKDKYWLVKNSWGKEWGMDGYIKIAKDRNNHCGLATAASYPIVN. The pIC50 is 5.0. (3) The pIC50 is 5.8. The drug is CC(C)(C)c1ccc2[nH]c3c(c2c1)CC(=O)N(Cc1ccccc1)c1ccccc1-3. The target protein (P40926) has sequence MLSALARPASAALRRSFSTSAQNNAKVAVLGASGGIGQPLSLLLKNSPLVSRLTLYDIAHTPGVAADLSHIETKAAVKGYLGPEQLPDCLKGCDVVVIPAGVPRKPGMTRDDLFNTNATIVATLTAACAQHCPEAMICVIANPVNSTIPITAEVFKKHGVYNPNKIFGVTTLDIVRANTFVAELKGLDPARVNVPVIGGHAGKTIIPLISQCTPKVDFPQDQLTALTGRIQEAGTEVVKAKAGAGSATLSMAYAGARFVFSLVDAMNGKEGVVECSFVKSQETECTYFSTPLLLGKKGIEKNLGIGKVSSFEEKMISDAIPELKASIKKGEDFVKTLK. (4) The small molecule is Clc1cc(N2CCOCC2)c2nc([C@H]3C[C@H](c4ccc5ccccc5n4)C3)cn2n1. 
The target protein (Q9QYJ6) has sequence MEDGPSNNASCFRRLTECFLSPSLTDEKVKAYLSLHPQVLDEFVSESVSAETVEKWLKRKNNKAEDEPSPKEVSRYQDTNMQGVVYELNSYIEQRLDTGGDNHLLLYELSSIIRIATKADGFALYFLGECNNSLCVFTPPGMKEGQPRLIPAGPITQGTTISAYVAKSRKTLLVEDILGDERFPRGTGLESGTRIQSVLCLPIVTAIGDLIGILELYRHWGKEAFCLSHQEVATANLAWASVAIHQVQVCRGLAKQTELNDFLLDVSKTYFDNIVAIDSLLEHIMIYAKNLVNADRCALFQVDHKNKELYSDLFDIGEEKEGKPVFKKTKEIRFSIEKGIAGQVARTGEVLNIPDAYADPRFNREVDLYTGYTTRNILCMPIVSRGSVIGVVQMVNKISGSAFSKTDENNFKMFAVFCALALHCANMYHRIRHSECIYRVTMEKLSYHSICTSEEWQGLMHFNLPARICRDIELFHFDIGPFENMWPGIFVYMIHRSCGT.... The pIC50 is 7.5. (5) The target protein (Q07523) has sequence MPLVCLADFKAHAQKQLSKTSWDFIEGEADDGITYSENIAAFKRIRLRPRYLRDMSKVDTRTTIQGQEISAPICISPTAFHSIAWPDGEKSTARAAQEANICYVISSYASYSLEDIVAAAPEGFRWFQLYMKSDWDFNKQMVQRAEALGFKALVITIDTPVLGNRRRDKRNQLNLEANILLKDLRALKEEKPTQSVPVSFPKASFCWNDLSLLQSITRLPIILKGILTKEDAELAMKHNVQGIVVSNHGGRQLDEVSASIDALREVVAAVKGKIEVYMDGGVRTGTDVLKALALGARCIFLGRPILWGLACKGEDGVKEVLDILTAELHRCMTLSGCQSVAEISPDLIQFSRL. The pIC50 is 6.7. The small molecule is Cc1[nH]nc(C(=O)O)c1Cc1cccc(-c2ccc(C(N)=O)cc2)c1. (6) The compound is O=C(/C=C/c1ccccc1)OCc1cccc(Oc2ccccc2)c1. The target protein sequence is MPHVENASETYIPGRLDGKVALVTGSGRGIGAAVAVHLGRLGAKVVVNYANSTKDAEKVVSEIKALGSDAIAIKADIRQVPEIVKLFDQAVAHFGHLDIAVSNSGVVSFGHLKDVTEEEFDRVFSLNTRGQFFVAREAYRHLTEGGRIVLTSSNTSKDFSVPKHSLYSGSKGAVDSFVRIFSKDCGDKKITVNAVAPGGTVTDMFHEVSHHYIPNGTSYTAEQRQQMAAHASPLHRNGWPQDVANVVGFLVSKEGEWVNGKVLTLDGGAA. The pIC50 is 3.9. (7) The drug is O=c1c2ccccc2[se]n1-c1ccccc1. The target protein (P49407) has sequence MGDKGTRVFKKASPNGKLTVYLGKRDFVDHIDLVDPVDGVVLVDPEYLKERRVYVTLTCAFRYGREDLDVLGLTFRKDLFVANVQSFPPAPEDKKPLTRLQERLIKKLGEHAYPFTFEIPPNLPCSVTLQPGPEDTGKACGVDYEVKAFCAENLEEKIHKRNSVRLVIRKVQYAPERPGPQPTAETTRQFLMSDKPLHLEASLDKEIYYHGEPISVNVHVTNNTNKTVKKIKISVRQYADICLFNTAQYKCPVAMEEADDTVAPSSTFCKVYTLTPFLANNREKRGLALDGKLKHEDTNLASSTLLREGANREILGIIVSYKVKVKLVVSRGGLLGDLASSDVAVELPFTLMHPKPKEEPPHREVPENETPVDTNLIELDTNDDDIVFEDFARQRLKGMKDDKEEEEDGTGSPQLNNR. The pIC50 is 5.8. 
(8) The drug is Cc1[nH]c(/C=C2\C(=O)Nc3ccc(S(=O)(=O)N(C)c4cccc(Cl)c4)cc32)c(C)c1C(=O)N1CCN(C)CC1. The target protein (Q75ZY9) has sequence MKAPAVLAPGILVLLFTLVQKSYGECKEALVKSEMNVNMKYQLPNFTAETPIQNVVLHKHHIYLGAVNYIYVLNDKDLQKVAEYKTGPVLEHPDCSPCQDCSHKANLSGGVWEDNINMALLVDTYYDDQLISCGSVHRGTCQRHILPPSNIADIQSEVHCMYSSQADEEPSQCPDCVVSALGTKVLISEKDRFINFFVGNTINSSDHPDHSLHSISVRRLKETQDGFKFLTDQSYIDVLPEFRDSYPIKYVHAFESNHFIYFLTVQRETLDAQTFHTRIIRFCSVDSGLHSYMEMPLECILTEKRRKRSTREEVFNILQAAYVSKPGAHLAKQIGANLNDDILYGVFAQSKPDSAEPMNRSAVCAFPIKYVNEFFNKIVNKNNVRCLQHFYGPNHEHCFNRTLLRNSSGCEARNDEYRTEFTTALQRVDLFMGQFNQVLLTSISTFIKGDLTIANLGTSEGRFMQVVVSRSGLSTPHVNFRLDSHPVSPEAIVEHPLNQN.... The pIC50 is 6.7. (9) The target protein (P9WHJ3) has sequence MTQTPDREKALELAVAQIEKSYGKGSVMRLGDEARQPISVIPTGSIALDVALGIGGLPRGRVIEIYGPESSGKTTVALHAVANAQAAGGVAAFIDAEHALDPDYAKKLGVDTDSLLVSQPDTGEQALEIADMLIRSGALDIVVIDSVAALVPRAELEGEMGDSHVGLQARLMSQALRKMTGALNNSGTTAIFINQLRDKIGVMFGSPETTTGGKALKFYASVRMDVRRVETLKDGTNAVGNRTRVKVVKNKCLAEGTRIFDPVTGTTHRIEDVVDGRKPIHVVAAAKDGTLHARPVVSWFDQGTRDVIGLRIAGGAIVWATPDHKVLTEYGWRAAGELRKGDRVAQPRRFDGFGDSAPIPADHARLLGYLIGDGRDGWVGGKTPINFINVQRALIDDVTRIAATLGCAAHPQGRISLAIAHRPGERNGVADLCQQAGIYGKLAWEKTIPNWFFEPDIAADIVGNLLFGLFESDGWVSREQTGALRVGYTTTSEQLAHQIH.... The small molecule is CC(C)c1nnc(NC(=O)c2nc(S(=O)(=O)Cc3ccc(F)cc3)ncc2Cl)s1. The pIC50 is 4.3. (10) The drug is CCN1CCN(C(=O)N[C@@H](C(=O)N[C@H](C(=O)O)[C@H]2N[C@@H](C(=O)O)C(C)(C)S2)C2C=CC=CC2)C(=O)C1=O. The target protein sequence is MKLNHFQGALYPWRFCVIVGLLLAMVGAIVWRIVDLHVIDHDFLKGQGDARSVRHIAIPAHRGLITDRNGEPLAVSTPVTTLWANPKELMAAKERWPQLAAALGQDTKLFADRIEQNAEREFIYLVRGLTPEQGEGVISLKVPGVYSIEEFRRFYPAGEVVAHAVGFTDVDDRGREGIELAFDEWLAGVPGKRQVLKDRRGRVIKDVQVTKNAKPGKTLALSIDLRLQYLAHRELRNALVENGAKAGSLVIMDVKTGEILAMTNQPTYNPNNRRNLQPAAMRNRAMIDVFEPGSTVKPFSMSAALASGRWKPSDIVDVYPGTLQIGRYTIRDVSRNSRQLDLTGILIKSSNVGISKIAFDIGAESIYSVMQQVGLGQDTGLGFPGERVGNLPNHRKWPKAETATLAYGYGLSVTAIQLAHAYAALANDGKSVPLSMTRVDRVPDGVQVISPEVASTVQGMLQQVVEAQGGVFRAQVPGYHAAGKSGTARKVSVGTKGYRE.... The pIC50 is 3.9.
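The pIC50 definition given at the top (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset's own tooling: the function names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the choice of nanomolar as the input unit is an assumption made for convenience.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50.

    Applies pIC50 = -log10(IC50 in M), so the nM value is first
    rescaled to molar.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: recover the IC50 in nanomolar from a pIC50."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # M -> nM


if __name__ == "__main__":
    # pIC50 labels taken from examples (1), (2) and (4) above.
    for label in (3.2, 5.0, 7.5):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, each unit increase in pIC50 is a tenfold drop in IC50: the label 5.0 in example (2) corresponds to an IC50 of 10 µM, while 7.5 in example (4) corresponds to roughly 32 nM.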