From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COc1cc(OC)c(S(=O)(=O)N2c3ccccc3CCC2C)cc1NC(=S)NCC(=O)O. The target protein (Q97R46) has sequence MSNFAIILAAGKGTRMKSDLPKVLHKVAGISMLEHVFRSVGAIQPEKTVTVVGHKAELVEEVLAGQTEFVTQSEQLGTGHAVMMTEPILEGLSGHTLVIAGDTPLITGESLKNLIDFHINHKNVATILTAETDNPFGYGRIVRNDNAEVLRIVEQKDATDFEKQIKEINTGTYVFDNERLFEALKNINTNNAQGEYYITDVIGIFRETGEKVGAYTLKDFDESLGVNDRVALATAESVMRRRINHKHMVNGVSFVNPEATYIDIDVEIASEVQIEANVTLKGQTKIGAETVLTNGTYVVDSTIGAGAVITNSMIEESSVADGVIVGPYAHIRPNSSLGAQVHIGNFVEVKGSSIGENTKAGHLTYIGNCEVGSNVNFGAGTITVNYDGKNKYKTVIGNNVFVGSNSTIIAPVELGDNSLVGAGSTITKDVPADAIAIGRGRQINKDEYATRLPHHPKNQ. The pIC50 is 4.5. (2) The compound is O=c1oc2ccccc2cc1-c1cn(-c2ccccc2)c(=S)[nH]1. The target protein (Q99728) has sequence MPDNRQPRNRQPRIRSGNEPRSAPAMEPDGRGAWAHSRAALDRLEKLLRCSRCTNILREPVCLGGCEHIFCSNCVSDCIGTGCPVCYTPAWIQDLKINRQLDSMIQLCSKLRNLLHDNELSDLKEDKPRKSLFNDAGNKKNSIKMWFSPRSKKVRYVVSKASVQTQPAIKKDASAQQDSYEFVSPSPPADVSERAKKASARSGKKQKKKTLAEINQKWNLEAEKEDGEFDSKEESKQKLVSFCSQPSVISSPQINGEIDLLASGSLTESECFGSLTEVSLPLAEQIESPDTKSRNEVVTPEKVCKNYLTSKKSLPLENNGKRGHHNRLSSPISKRCRTSILSTSGDFVKQTVPSENIPLPECSSPPSCKRKVGGTSGRKNSNMSDEFISLSPGTPPSTLSSSSYRRVMSSPSAMKLLPNMAVKRNHRGETLLHIASIKGDIPSVEYLLQNGSDPNVKDHAGWTPLHEACNHGHLKVVELLLQHKALVNTTGYQNDSPLHD.... The pIC50 is 8.4. (3) The drug is O=c1sn(-c2cccc(Cl)c2)c(=O)n1Cc1ccc(F)cc1. The target protein (P57771) has sequence MAALLMPRRNKGMRTRLGCLSHKSDSCSDFTAILPDKPNRALKRLSTEEATRWADSFDVLLSHKYGVAAFRAFLKTEFSEENLEFWLACEEFKKTRSTAKLVSKAHRIFEEFVDVQAPREVNIDFQTREATRKNLQEPSLTCFDQAQGKVHSLMEKDSYPRFLRSKMYLDLLSQSQRRLS. The pIC50 is 4.9. (4) The compound is CCC(CC)CN[C@H]1CC(C(=O)O)=C[C@@H](OC(CC)CC)[C@@H]1NC(C)=O. 
The target protein sequence is MNPNQKIITIGSICMVIGIVSLMLQIGNMISIWVSHSIQTGNQRQAEPISNTKFLTEKAVASVTLAGNSSLCPISGWAVYSKDNSIRIGSRGDVFVIREPFISCSHLECRTFFLTQGALLNDKHSNGTVKDRSPHRTLMSCPVGEAPSPYNSRFESVAWSASACHDGTSWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNNILRTQESECACVNGSCFTVMTDGPSSGQASYKIFKMEKGKVVKSVELDAPNYHYEECSCYPDAGEITCVCRDNWHGSNRPWVSFNQNLEYQIGYICSGVFGDNPRPNDGTGSCGPVSPNGAYGVKGFSFKYGNGVWIGRTKSTNSRSGFEMIWDPNGWTGTDSSFSVKQDIVAITDWSGYSGSFVQHPELTGLDCIRPCFWVELIRGRPKESTIWTSGSSISFCGVNSDTVSWSWPDGAELPFTIDK. The pIC50 is 7.6. (5) The small molecule is CC(/C=C/CCC(=O)N1CCCC1=O)=C\C1CCCCO1. The target protein (P06526) has sequence MDPLCTASSGPRKKRPRQVGASMASPPHDIKFQNLVLFILEKKMGTTRRNFLMELARRKGFRVENELSDSVTHIVAENNSGSEVLEWLQVQNIRASSQLELLDVSWLIESMGAGKPVEITGKHQLVVRTDYSATPNPGFQKTPPLAVKKISQYACQRKTTLNNYNHIFTDAFEILAENSEFKENEVSYVTFMRAASVLKSLPFTIISMKDTEGIPCLGDKVKCIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVFGVGLKTSEKWFRMGFRSLSKIMSDKTLKFTKMQKAGFLYYEDLVSCVTRAEAEAVGVLVKEAVWAFLPDAFVTMTGGFRRGKKIGHDVDFLITSPGSAEDEEQLLPKVINLWEKKGLLLYYDLVESTFEKFKLPSRQVDTLDHFQKCFLILKLHHQRVDSSKSNQQEGKTWKAIRVDLVMCPYENRAFALLGWTGSRQFERDIRRYATHERKMMLDNHALYDKTKRVFLKAESEEEIFAHLGLD.... The pIC50 is 3.7. (6) The compound is Cc1cccc(NC(=O)c2nn(Cc3c[nH]cn3)c3c2CN(C(=O)c2ccc[nH]2)CC3)c1. The target protein sequence is MAGYLRVVRSLCRASGSRPAWAPAALTAPTSQEQPRRHYADKRIKVAKPVVEMDGDEMTRIIWQFIKEKLILPHVDIQLKYFDLGLPNRDQTDDQVTIDSALATQKYSVAVKCATITPDEARVEEFKLKKMWKSPNGTIRNILGGTVFREPIICKNIPRLVPGWTKPITIGSHAHGDQYKATDFVADRAGTFKMVFTPKDGSGVKEWEVYNFPAGGVGMGMYNTDESISGFAHSCFQYAIQKKWPLYMSTKNTILKAYDGRFKDIFQEIFDKHYKTDFDKNKIWYEHRLIDDMVAQVLKSSGGFVWACKNYDGDVQSDILAQGFGSLGLMTSVLVCPDGKTIEAEAAHGTVTRHYREHQKGRPTSTNPIASIFAWTRGLEHRGKLDGNQDLIRFAQMLEKVCVETVESGAMTKDLAGCIHGLSNVKLNEHFLNTTDFLDTIKSNLDRALGRQ. The pIC50 is 4.0. (7) The compound is Cc1cc(-c2ccc(OC(=O)Nc3ccc(-c4cnccn4)cn3)cc2)ccn1. 
The target protein (Q9H237) has sequence MATFSRQEFFQQLLQGCLLPTAQQGLDQIWLLLAICLACRLLWRLGLPSYLKHASTVAGGFFSLYHFFQLHMVWVVLLSLLCYLVLFLCRHSSHRGVFLSVTILIYLLMGEMHMVDTVTWHKMRGAQMIVAMKAVSLGFDLDRGEVGTVPSPVEFMGYLYFVGTIVFGPWISFHSYLQAVQGRPLSCRWLQKVARSLALALLCLVLSTCVGPYLFPYFIPLNGDRLLRNKKRKARGTMVRWLRAYESAVSFHFSNYFVGFLSEATATLAGAGFTEEKDHLEWDLTVSKPLNVELPRSMVEVVTSWNLPMSYWLNNYVFKNALRLGTFSAVLVTYAASALLHGFSFHLAAVLLSLAFITYVEHVLRKRLARILSACVLSKRCPPDCSHQHRLGLGVRALNLLFGALAIFHLAYLGSLFDVDVDDTTEEQGYGMAYTVHKWSELSWASHWVTFGCWIFYRLIG. The pIC50 is 9.5. (8) The drug is COc1cccc(OCc2cc3n(n2)[C@H](C)CN(c2ccc(F)cc2)C3=O)c1. The target protein (P31422) has sequence MKMLTRLQILMLALFSKGFLLSLGDHNFMRREIKIEGDLVLGGLFPINEKGTGTEECGRINEDRGIQRLEAMLFAIDEINKDNYLLPGVKLGVHILDTCSRDTYALEQSLEFVRASLTKVDEAEYMCPDGSYAIQENIPLLIAGVIGGSYSSVSIQVANLLRLFQIPQISYASTSAKLSDKSRYDYFARTVPPDFYQAKAMAEILRFFNWTYVSTVASEGDYGETGIEAFEQEARLRNICIATAEKVGRSNIRKSYDSVIRELLQKPNARVVVLFMRSDDSRELIAAANRVNASFTWVASDGWGAQESIVKGSEHVAYGAITLELASHPVRQFDRYFQSLNPYNNHRNPWFRDFWEQKFQCSLQNKRNHRQVCDKHLAIDSSNYEQESKIMFVVNAVYAMAHALHKMQRTLCPNTTKLCDAMKILDGKKLYKEYLLKINFTAPFNPNKGADSIVKFDTFGDGMGRYNVFNLQQTGGKYSYLKVGHWAETLSLDVDSIHWS.... The pIC50 is 5.9. (9) The small molecule is CC(C)C(=O)c1ccc(O)c(O)c1. The target protein (P80041) has sequence MNASDFRRRGKEMVDYMADYLEGIEGRQVYPDVQPGYLRPLIPATAPQEPDTFEDILQDVEKIIMPGVTHWHSPYFFAYFPTASSYPAMLADMLCGAIGCIGFSWAASPACTELETVMMDWLGKMLQLPEAFLAGEAGEGGGVIQGSASEATLVALLAARTKVVRRLQAASPGLTQGAVLEKLVAYASDQAHSSVERAGLIGGVKLKAIPSDGKFAMRASALQEALERDKAAGLIPFFVVATLGTTSCCSFDNLLEVGPICHEEDIWLHVDAAYAGSAFICPEFRHLLNGVEFADSFNFNPHKWLLVNFDCSAMWVKRRTDLTGAFKLDPVYLKHSHQGSGLITDYRHWQLPLGRRFRSLKMWFVFRMYGVKGLQAYIRKHVQLSHEFEAFVLQDPRFEVCAEVTLGLVCFRLKGSDGLNEALLERINSARKIHLVPCRLRGQFVLRFAICSRKVESGHVRLAWEHIRGLAAELLAAEEGKAEIKS. The pIC50 is 4.3. (10) The drug is CNCCC(Oc1ccc(C(F)(F)F)cc1)c1ccccc1. 
The target protein (P25122) has sequence MGQGDESERIVINVGGTRHQTYRSTLRTLPGTRLAWLAEPDAHSHFDYDPRADEFFFDRHPGVFAHILNYYRTGKLHCPADVCGPLYEEELAFWGIDETDVEPCCWMTYRQHRDAEEALDSFGGAPLDNSADDADADGPGDSGDGEDELEMTKRLALSDSPDGRPGGFWRRWQPRIWALFEDPYSSRYARYVAFASLFFILVSITTFCLETHERFNPIVNKTEIENVRNGTQVRYYREAETEAFLTYIEGVCVVWFTFEFLMRVVFCPNKVEFIKNSLNIIDFVAILPFYLEVGLSGLSSKAAKDVLGFLRVVRFVRILRIFKLTRHFVGLRVLGHTLRASTNEFLLLIIFLALGVLIFATMIYYAERIGAQPNDPSASEHTHFKNIPIGFWWAVVTMTTLGYGDMYPQTWSGMLVGALCALAGVLTIAMPVPVIVNNFGMYYSLAMAKQKLPKKKKKHIPRPPQLGSPNYCKSVVNSPHHSTQSDTCPLAQEEILEINR.... The pIC50 is 4.9.
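The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset or of any BindingDB tooling; the function names `ic50_to_pic50`, `pic50_to_ic50`, and `nanomolar_to_pic50` are hypothetical, and the nanomolar helper assumes the raw IC50 is reported in nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def nanomolar_to_pic50(ic50_nm: float) -> float:
    """Assumed convenience helper for IC50 values reported in nM.

    1 nM = 1e-9 M, so pIC50 = 9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration in nM")
    return 9.0 - math.log10(ic50_nm)


if __name__ == "__main__":
    # Values taken from the examples in this record: pIC50 of 4.5 and 9.5.
    for p in (4.5, 9.5):
        print(f"pIC50 {p} corresponds to IC50 = {pic50_to_ic50(p):.3e} M")
```

Under this definition, the pIC50 of 4.5 in example (1) corresponds to an IC50 of about 3.2e-5 M (roughly 32 µM), while the pIC50 of 9.5 in example (7) corresponds to about 3.2e-10 M, which is why a higher score means a more potent compound.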