Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC(C)CC(CN)Nc1cc(Nc2cccc(C(C)(C)C)n2)c(C(N)=O)nn1. The target protein sequence is PKEVYLDRKLLTLEDKELGSGNFGTVKKGYYQMKKVVKTVAVKILKNEANDPALKDELLAEANVMQQLDNPYIVRMIGICEAESWMLVMEMAELGPLNKYLQQNRHVKDKNIIELVHQVSMGMKYLEESNFVHRDLAARNVLLVTQHYAKISDFGLSKALRADENYYKAQTHGKWPVKWYAPECINYYKFSSKSDVWSFGVLMWEAFSYGQKPYRGMKGSEVTAMLEKGERMGCPAGCPREMYDLMNLCWTYDVENRPGFAAVELRLRNYYYDVVN. The pIC50 is 8.2. (2) The small molecule is C=CCNC(=O)O[C@H]1CC[C@@](CNC(=O)c2cc(Cl)ccc2OC)(c2ccccc2)CC1. The target protein (P22001) has sequence MDERLSLLRSPPPPSARHRAHPPQRPASSGGAHTLVNHGYAEPAAGRELPPDMTVVPGDHLLEPEVADGGGAPPQGGCGGGGCDRYEPLPPSLPAAGEQDCCGERVVINISGLRFETQLKTLCQFPETLLGDPKRRMRYFDPLRNEYFFDRNRPSFDAILYYYQSGGRIRRPVNVPIDIFSEEIRFYQLGEEAMEKFREDEGFLREEERPLPRRDFQRQVWLLFEYPESSGPARGIAIVSVLVILISIVIFCLETLPEFRDEKDYPASTSQDSFEAAGNSTSGSRAGASSFSDPFFVVETLCIIWFSFELLVRFFACPSKATFSRNIMNLIDIVAIIPYFITLGTELAERQGNGQQAMSLAILRVIRLVRVFRIFKLSRHSKGLQILGQTLKASMRELGLLIFFLFIGVILFSSAVYFAEADDPTSGFSSIPDAFWWAVVTMTTVGYGDMHPVTIGGKIVGSLCAIAGVLTIALPVPVIVSNFNYFYHRETEGEEQSQYM.... The pIC50 is 6.6. (3) The compound is N=C(CC1CCCCC1)P(=O)(O)CC(Cc1ccc(C(F)(F)F)cc1)C(=O)O. The target protein (P16444) has sequence MWSGWWLWPLVAVCTADFFRDEAERIMRDSPVIDGHNDLPWQLLDMFNNRLQDERANLTTLAGTHTNIPKLRAGFVGGQFWSVYTPCDTQNKDAVRRTLEQMDVVHRMCRMYPETFLYVTSSAGIRQAFREGKVASLIGVEGGHSIDSSLGVLRALYQLGMRYLTLTHSCNTPWADNWLVDTGDSEPQSQGLSPFGQRVVKELNRLGVLIDLAHVSVATMKATLQLSRAPVIFSHSSAYSVCASRRNVPDDVLRLVKQTDSLVMVNFYNNYISCTNKANLSQVADHLDHIKEVAGARAVGFGGDFDGVPRVPEGLEDVSKYPDLIAELLRRNWTEAEVKGALADNLLRVFEAVEQASNLTQAPEEEPIPLDQLGGSCRTHYGYSSGASSLHRHWGLLLASLAPLVLCLSLL. The pIC50 is 6.7. 
(4) The target protein sequence is MGNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPAQNTAHLDQFERIKTIGTGSFGRVMLVKHMETGNHYAMKILDKQKVVKLKQIEHTLNEKRILQAVNFPFLVKLEFSFKDNSNLYMVMEYMPGGEMFSHLRRIGRFSEPHARFYAAQIVLTFEYLHSLDLIYRDLKPENLLIDQQGYIKVADFGFAKRVKGRTWTLCGTPEYLAPEIILSKGYNKAVDWWALGVLIYEMAAGYPPFFADQPIQIYEKIVSGKVRFPSHFSSDLKDLLRNLLQVDLTKRFGNLKNGVNDIKNHKWFATTDWIAIYQRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSINEKCGKEFSEF. The pIC50 is 7.0. The small molecule is Cc1cncc2cccc(S(=O)(=O)N3CCCNC[C@@H]3C)c12. (5) The compound is CC/C=C/CCn1ccnc1. The target protein (Q16696) has sequence MLASGLLLVTLLACLTVMVLMSVWRQRKSRGKLPPGPTPLPFIGNYLQLNTEQMYNSLMKISERYGPVFTIHLGPRRVVVLCGHDAVKEALVDQAEEFSGRGEQATFDWLFKGYGVAFSNGERAKQLRRFSIATLRGFGVGKRGIEERIQEEAGFLIDALRGTHGANIDPTFFLSRTVSNVISSIVFGDRFDYEDKEFLSLLRMMLGSFQFTATSTGQLYEMFSSVMKHLPGPQQQAFKELQGLEDFIAKKVEHNQRTLDPNSPRDFIDSFLIRMQEEEKNPNTEFYLKNLVMTTLNLFFAGTETVSTTLRYGFLLLMKHPEVEAKVHEEIDRVIGKNRQPKFEDRAKMPYTEAVIHEIQRFGDMLPMGLAHRVNKDTKFRDFFLPKGTEVFPMLGSVLRDPRFFSNPRDFNPQHFLDKKGQFKKSDAFVPFSIGKRYCFGEGLARMELFLFFTTIMQNFRFKSPQSPKDIDVSPKHVGFATIPRNYTMSFLPR. The pIC50 is 5.2. (6) The compound is O=C(N[C@H]1CS[C@H]2CCC[C@@H](C(=O)O)N2C1=O)[C@@H](S)Cc1ccccc1. The target protein (P12822) has sequence MGAAPGRRGPRLLRPPPPLLLLLLLLRPPPAALTLDPGLLPGDFAADEAGARLFASSYNSSAEQVLFRSTAASWAHDTNITAENARRQEEEALLSQEFAEAWGKKAKELYDPVWQNFTDPELRRIIGAVRTLGPANLPLAKRQQYNSLLSNMSQIYSTGKVCFPNKTASCWSLDPDLNNILASSRSYAMLLFAWEGWHNAVGIPLKPLYQEFTALSNEAYRQDGFSDTGAYWRSWYDSPTFEEDLERIYHQLEPLYLNLHAYVRRVLHRRYGDRYINLRGPIPAHLLGNMWAQSWESIYDMVVPFPDKPNLDVTSTMVQKGWNATHMFRVAEEFFTSLGLLPMPPEFWAESMLEKPEDGREVVCHASAWDFYNRKDFRIKQCTQVTMDQLSTVHHEMGHVQYYLQYKDQPVSLRRANPGFHEAIGDVLALSVSTPAHLHKIGLLDHVTNDTESDINYLLKMALEKIAFLPFGYLVDQWRWGVFSGRTPSSRYNFDWWYLR.... The pIC50 is 8.3. (7) The drug is C=CS(=O)(=O)Nc1cccc(-c2nn(C(C)C)c3ncnc(N)c23)c1. 
The target protein sequence is QTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVCEYMSKGSLLDFLKGEMGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVCKVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMVNREVLDQVERGYRMPCPPECPESLHDLMCQCWRKDPEERPTFEYLQAFLEDYFTSTEPQYQPGENL. The pIC50 is 7.0. (8) The compound is Fc1cc(F)cc(Nc2nc(NC3CC(F)(F)C3)nc(-c3nc(F)ccc3F)n2)c1. The target protein sequence is MSKKISGGSVVEMQGDEMTRIIWELIKEKLIFPYVELDLHSYDLGIENRDATNDQVTKDAAEAIKKHNVGVKCATITPDEKRVEEFKLKQMWKSPNGTIRNILGGTVFREAIICKNIPRLVSGWVKPIIIGHHAYGDQYRATDFVVPGPGKVEITYTPSDGTQKVTYLVHNFEEGGGVAMGMYNQDKSIEDFAHSSFQMALSKGWPLYLSTKNTILKKYDGRFKDIFQEIYDKQYKSQFEAQKIWYEHRLIDDMVAQAMKSEGGFIWACKNYDGDVQSDSVAQGYGSLGMMTSVLVCPDGKTVEAEAAHGTVTRHYRMYQKGQETSTNPIASIFAWTRGLAHRAKLDNNKELAFFANALEEVSIETIEAGFMTKDLAACIKGLPNVQRSDYLNTFEFMDKLGENLKIKLAQAKL. The pIC50 is 7.3. (9) The compound is O=C(Nc1cccc(O)c1C(=O)O)c1ccc(Br)c(Oc2ccccc2)c1. The target protein (P43711) has sequence MNSRILSTGSYLPSHIRTNADLEKMVDTSDEWIVTRSGIRERRIAAEDETVATMGFEAAKNAIEAAQINPQDIELIIVATTSHSHAYPSAACQVQGLLNIDDAISFDLAAACTGFVYALSVADQFIRAGKVKKALVIGSDLNSRKLDETDRSTVVLFGDGAGAVILEASEQEGIISTHLHASADKNNALVLAQPERGIEKSGYIEMQGNETFKLAVRELSNVVEETLLANNLDKKDLDWLVPHQANLRIITATAKKLEMDMSQVVVTLDKYANNSAATVPVALDEAIRDGRIQRGQLLLLEAFGGGWTWGSALVRF. The pIC50 is 4.7. (10) The compound is Cc1ccc(=O)[nH]c1C(=O)N[C@@H](Cc1ccccc1)C(=O)C(N)=O. The target protein (P07384) has sequence MSEEIITPVYCTGVSAQVQKQRARELGLGRHENAIKYLGQDYEQLRVRCLQSGTLFRDEAFPPVPQSLGYKDLGPNSSKTYGIKWKRPTELLSNPQFIVDGATRTDICQGALGDCWLLAAIASLTLNDTLLHRVVPHGQSFQNGYAGIFHFQLWQFGEWVDVVVDDLLPIKDGKLVFVHSAEGNEFWSALLEKAYAKVNGSYEALSGGSTSEGFEDFTGGVTEWYELRKAPSDLYQIILKALERGSLLGCSIDISSVLDMEAITFKKLVKGHAYSVTGAKQVNYRGQVVSLIRMRNPWGEVEWTGAWSDSSSEWNNVDPYERDQLRVKMEDGEFWMSFRDFMREFTRLEICNLTPDALKSRTIRKWNTTLYEGTWRRGSTAGGCRNYPATFWVNPQFKIRLDETDDPDDYGDRESGCSFVLALMQKHRRRERRFGRDMETIGFAVYEVPPELVGQPAVHLKRDFFLANASRARSEQFINLREVSTRFRLPPGEYVVVPST.... The pIC50 is 5.5.
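The pIC50 definition used in this record (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal sketch. The function names below are hypothetical helpers and are not part of the dataset or of BindingDB. They assume the IC50 is already expressed in molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar."""
    return (10.0 ** (-pic50)) * 1e9


if __name__ == "__main__":
    # A 10 nM compound (1e-8 M) has a pIC50 of about 8.
    print(round(ic50_to_pic50(1e-8), 3))
    # The pIC50 of 8.2 from example (1) corresponds to roughly 6.3 nM.
    print(round(pic50_to_ic50_nm(8.2), 2))
```

Because the scale is logarithmic, a one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.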