From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=C(NCCN1CCC(n2c(=O)[nH]c3ccccc32)CC1)c1cc2cc(F)ccc2[nH]1. The target protein sequence is ADSATPHLDAVEQTLRQVSPGLEGDVWERTSGNKLDGSAADPSDWLLQTPGCWGDDKCADRVGTKRLLAKMTENIGNATRTVDISTLAPFPNGAFQDAIVAGLKESAAKGNKLKVRILVGAAPVYHMNVIPSKYRDELTAKLGKAAENITLNVASMTTSKTAFSWNHSKILVVDGQSALTGGINSWKDDYLDTTHPVSDVDLALTGPAAGSAGRYLDTLWTWTCQNKSNIASVWFAASGNAGCMPTMHKDTNPKASPATGNVPVIAVGGLGVGIKDVDPKSTFRPDLPTASDTKCVVGLHDNTNADRDYDTVNPEESALRALVASAKGHIEISQQDLNATCPPLPRYDIRLYDALAAKMAAGVKVRIVVSDPANRGAVGSGGYSQIKSLSEISDTLRNRLANITGGQQAAKTAMCSNLQLATFRSSPNGKWADGHPYAQHHKLVSVDSSTFYIGSKNLYPSWLQDFGYIVESPEAAKQLDAKLLDPQWKYSQETATVDYA.... The pIC50 is 7.0. (2) The target protein (Q04844) has sequence MARAPLGVLLLLGLLGRGVGKNEELRLYHHLFNNYDPGSRPVREPEDTVTISLKVTLTNLISLNEKEETLTTSVWIGIDWQDYRLNYSKDDFGGIETLRVPSELVWLPEIVLENNIDGQFGVAYDANVLVYEGGSVTWLPPAIYRSVCAVEVTYFPFDWQNCSLIFRSQTYNAEEVEFTFAVDNDGKTINKIDIDTEAYTENGEWAIDFCPGVIRRHHGGATDGPGETDVIYSLIIRRKPLFYVINIIVPCVLISGLVLLAYFLPAQAGGQKCTVSINVLLAQTVFLFLIAQKIPETSLSVPLLGRFLIFVMVVATLIVMNCVIVLNVSQRTPTTHAMSPRLRHVLLELLPRLLGSPPPPEAPRAASPPRRASSVGLLLRAEELILKKPRSELVFEGQRHRQGTWTAAFCQSLGAAAPEVRCCVDAVNFVAESTRDQEATGEEVSDWVRMGNALDNICFWAALVLFSVGSSLIFLGAYFNRVPDLPYAPCIQP. The pIC50 is 6.9. The compound is Nc1c2c(nc3c(CO)cccc13)CCCC2. (3) The compound is Cc1ccc(-c2c(C3=NN(C(=O)CCC(=O)O)C(c4ccc(Cl)cc4)C3)c(=O)[nH]c3ccccc23)cc1. 
The target protein (Q00961) has sequence MGGALGPALLLTSLLGAWARLGAGQGEQAVTVAVVFGSSGPLQTQARTRLTSQNFLDLPLEIQPLTVGVNNTNPSSILTQICGLLGAARVHGIVFEDNVDTEAVAQLLDFVSSQTHVPILSISGGSAVVLTPKEPGSAFLQLGVSLEQQLQVLFKVLEEYDWSAFAVITSLHPGHALFLEGVRAVADASYLSWRLLDVLTLELGPGGPRARTQRLLRQVDAPVLVAYCSREEAEVLFAEAAQAGLVGPGHVWLVPNLALGSTDAPPAAFPVGLISVVTESWRLSLRQKVRDGVAILALGAHSYRRQYGTLPAPAGDCRSHPGPVSPAREAFYRHLLNVTWEGRDFSFSPGGYLVRPTMVVIALNRHRLWEMVGRWDHGVLYMKYPVWPRYSTSLQPVVDSRHLTVATLEERPFVIVESPDPGTGGCVPNTVPCRRQSNHTFSSGDLTPYTKLCCKGFCIDILKKLAKVVKFSYDLYLVTNGKHGKRVRGVWNGMIGEVYY.... The pIC50 is 5.6. (4) The drug is COC[C@@H](C(=O)Nc1cccc2c(-c3nc(Nc4cn(C)nc4C)ncc3F)c[nH]c12)N1CCN(C)CC1. The target protein sequence is ISSDYELLSDPTPGALAPRDGLWNGAQLYACQDPTIFEERHLKYISQLGKGNFGSVELCRYDPLGDNTGALVAVKQLQHSGPDQQRDFQREIQILKALHSDFIVKYRGVSYGPGRQSLRLVMEYLPSGCLRDFLQRHRARLDASRLLLYSSQICKGMEYLGSRRCVHRDLAARNILVESEAHVKIADFGLAKLLPLDKDYYVVREPGQSPIFWYAPESLSDNIFSRQSDVWSFGVVLYELFTYCDKSCSPSAEFLRMMGCERDVPALCRLLELLEEGQRLPAPPACPAEVHELMKLCWAPSPQDRPSFSALGPQLDMLWSGSRGCETHAFTAHPEGKHHSLSFS. The pIC50 is 4.5. (5) The small molecule is CCc1nc(N)nc(N)c1C#CCc1cc(OC)cc(-c2ccc(C(=O)O)cc2)c1. The target protein (P13955) has sequence MTLSIIVAHDKQRVIGYQNQLPWHLPNDLKHIKQLTTGNTLVMARKTFNSIGKPLPNRRNVVLTNQASFHHEGVDVINSLDEIKELSGHVFIFGGQTLYEAMIDQVDDMYITVIDGKFQGDTFFPPYTFENWEVESSVEGQLDEKNTIPHTFLHLVRRKGK. The pIC50 is 6.0. (6) The small molecule is N#C/C(=C\c1ccc(O)c(O)c1)C(=O)OCCCCCCCCn1ccnc1. The target protein (Q02759) has sequence MGVYRIRVSTGDSKYAGSNNEVYLWLVGQHGEASLGKLLRPCRDSEAEFKVDVSEYLGPLLFVRVQKWHYLTDDAWFCNWISVKGPGDQGSEYMFPCYRWVQGRSILSLPEGTGCTVVEDSQGLFRKHREEELEERRSLYRWGNWKDGSILNVAAASISDLPVDQRFREDKRIEFEASQVIGVMDTVVNFPINTVTCWKSLDDFNCVFKSGHTKMAERVRNSWKEDAFFGYQFLNGANPMVLKRSTCLPARLVFPPGMEKLQAQLNKELQKGTLFEADFFLLDGIKANVILCSQQYLAAPLVMLKLMPDGQLLPIAIQLELPKTGSTPPPIFTPSDPPMDWLLAKCWVRSSDLQLHELQAHLLRGHLMAEVFAVATMRCLPSVHPVFKLLVPHLLYTMEINVRARSDLISERGFFDKAMSTGGGGHLDLLKQAGAFLTYCSLCPPDDLAERGLLDIETCFYAKDALRLWQIMNRYVVGMFNLHYKTDKAVQDDYELQSWC.... The pIC50 is 6.7. 
(7) The compound is COc1cc(/C=C2\SC(=O)N(Cc3ccc(C(=O)O)cc3)C2=O)ccc1O. The target protein (P40347) has sequence MTIEKPKISVAFICLGNFCRSPMAEAIFKHEVEKANLENRFNKIDSFGTSNYHVGESPDHRTVSICKQHGVKINHKGKQIKTKHFDEYDYIIGMDESNINNLKKIQPEGSKAKVCLFGDWNTNDGTVQTIIEDPWYGDIQDFEYNFKQITYFSKQFLKKEL. The pIC50 is 3.4. (8) The target protein (P40136) has sequence MTRNKFIPNKFSIISFSVLLFAISSSQAIEVNAMNEHYTESDIKRNHKTEKNKTEKEKFKDSINNLVKTEFTNETLDKIQQTQDLLKKIPKDVLEIYSELGGEIYFTDIDLVEHKELQDLSEEEKNSMNSRGEKVPFASRFVFEKKRETPKLIINIKDYAINSEQSKEVYYEIGKGISLDIISKDKSLDPEFLNLIKSLSDDSDSSDLLFSQKFKEKLELNNKSIDINFIKENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFEYMNKLEKGGFEKISESLKKEGVEKDRIDVLKGEKALKASGLVPEHADAFKKIARELNTYILFRPVNKLATNLIKSGVATKGLNVHGKSSDWGPVAGYIPFDQDLSKKHGQQLAVEKGNLENKKSITEHEGEIGKIPLKLDHLRIEELKENGIILKGKKEIDNGKKYYLLESNNQVYEFRISDENNEVQYKTKEGKITVLGEKFNWRNIEVMAKNVEGVLKPLTADYDLFALAPS.... The pIC50 is 5.7. The small molecule is O=C(O)c1cccc(NC(=O)c2cccc3c2C(=O)c2ccccc2-3)c1.
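The pIC50 definition stated at the top of these records (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset tooling: the function names are hypothetical, and it assumes the raw IC50 is supplied in nanomolar, in which case the definition reduces to pIC50 = 9 - log10(IC50 in nM).

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 (hypothetical helper).

    pIC50 = -log10(IC50 in M). With IC50 in nM, IC50_M = IC50_nM * 1e-9,
    so pIC50 = 9 - log10(IC50_nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Inverse conversion: pIC50 back to an IC50 in nanomolar."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # An IC50 of 100 nM (1e-7 M) corresponds to a pIC50 of 7.0.
    print(pic50_from_ic50_nm(100.0))
    # A higher pIC50 means a lower IC50, i.e. a more potent compound.
    print(ic50_nm_from_pic50(7.0) < ic50_nm_from_pic50(3.4))
```

Because the scale is logarithmic, a difference of one pIC50 unit between two records corresponds to a tenfold difference in IC50.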