Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is OC[C@@H]1O[C@H](SCC#CC#CCS[C@@H]2O[C@H](CO)[C@H](O)[C@H](O)[C@H]2O)[C@@H](O)[C@@H](O)[C@@H]1O. The target protein (P09382) has sequence MACGLVASNLNLKPGECLRVRGEVAPDAKSFVLNLGKDSNNLCLHFNPRFNAHGDANTIVCNSKDGGAWGTEQREAVFPFQPGSVAEVCITFDQANLTVKLPDGYEFKFPNRLNLEAINYMAADGDFKIKCVAFD. The pIC50 is 2.3. (2) The small molecule is CC(C)(C)c1ccc(C(O)CCCN2CCC(C(O)(c3ccccc3)c3ccccc3)CC2)cc1. The target protein (P51589) has sequence MLAAMGSLAAALWAVVHPRTLLLGTVAFLLAADFLKRRRPKNYPPGPWRLPFLGNFFLVDFEQSHLEVQLFVKKYGNLFSLELGDISAVLITGLPLIKEALIHMDQNFGNRPVTPMREHIFKKNGLIMSSGQAWKEQRRFTLTALRNFGLGKKSLEERIQEEAQHLTEAIKEENGQPFDPHFKINNAVSNIICSITFGERFEYQDSWFQQLLKLLDEVTYLEASKTCQLYNVFPWIMKFLPGPHQTLFSNWKKLKLFVSHMIDKHRKDWNPAETRDFIDAYLKEMSKHTGNPTSSFHEENLICSTLDLFFAGTETTSTTLRWALLYMALYPEIQEKVQAEIDRVIGQGQQPSTAARESMPYTNAVIHEVQRMGNIIPLNVPREVTVDTTLAGYHLPKGTMILTNLTALHRDPTEWATPDTFNPDHFLENGQFKKREAFMPFSIGKRACLGEQLARTELFIFFTSLMQKFTFRPPNNEKLSLKFRMGITISPVSHRLCAVP.... The pIC50 is 6.0. (3) The drug is COc1cc(-c2cc(C3CCCNC3)n3ncnc(N)c23)cc(F)c1CO. The target protein (P37023) has sequence MTLGSPRKGLLMLLMALVTQGDPVKPSRGPLVTCTCESPHCKGPTCRGAWCTVVLVREEGRHPQEHRGCGNLHRELCRGRPTEFVNHYCCDSHLCNHNVSLVLEATQPPSEQPGTDGQLALILGPVLALLALVALGVLGLWHVRRRQEKQRGLHSELGESSLILKASEQGDSMLGDLLDSDCTTGSGSGLPFLVQRTVARQVALVECVGKGRYGEVWRGLWHGESVAVKIFSSRDEQSWFRETEIYNTVLLRHDNILGFIASDMTSRNSSTQLWLITHYHEHGSLYDFLQRQTLEPHLALRLAVSAACGLAHLHVEIFGTQGKPAIAHRDFKSRNVLVKSNLQCCIADLGLAVMHSQGSDYLDIGNNPRVGTKRYMAPEVLDEQIRTDCFESYKWTDIWAFGLVLWEIARRTIVNGIVEDYRPPFYDVVPNDPSFEDMKKVVCVDQQTPTIPNRLAADPVLSGLAQMMRECWYPNPSARLTALRIKKTLQKISNSPEKPK.... The pIC50 is 8.7. (4) The small molecule is O=C(/C=C/c1ccc([N+](=O)[O-])cc1)c1cccc([N+](=O)[O-])c1. 
The target protein sequence is MENNSTERYIFKPNFLGEGSYGKVYKAYDTILKKEVAIKKMKLNEISNYIDDCGINFVLLREIKIMKEIKHKNIMSALDLYCEKDYINLVMEIMDYDLSKIINRKIFLTDSQKKCILLQILNGLNVLHKYYFMHRDLSPANIFINKKGEVKLADFGLCTKYGYDMYSDKLFRDKYKKNLNLTSKVVTLWYRAPELLLGSNKYNSSIDMWSFGCIFAELLLQKALFPGENEIDQLGKIFFLLGTPNENNWPEALCLPLYTEFTKATKKDFKTYFKIDDDDCIDLLTSFLKLNAHERISAEDAMKHRYFFNDPLPCDISQLPFNDL. The pIC50 is 3.6. (5) The drug is CC(C)C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)CNC(=O)CNC(=O)[C@@H](N)Cc1ccccc1)[C@@H](C)O)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)O. The target protein (P35370) has sequence MESLFPAPYWEVLYGSHFQGNLSLLNETVPHHLLLNASHSAFLPLGLKVTIVGLYLAVCIGGLLGNCLVMYVILRHTKMKTATNIYIFNLALADTLVLLTLPFQGTDILLGFWPFGNALCKTVIAIDYYNMFTSTFTLTAMSVDRYVAICHPIRALDVRTSSKAQAVNVAIWALASVVGVPVAIMGSAQVEDEEIECLVEIPAPQDYWGPVFAICIFLFSFIIPVLIISVCYSLMIRRLRGVRLLSGSREKDRNLRRITRLVLVVVAVFVGCWTPVQVFVLVQGLGVQPGSETAVAILRFCTALGYVNSCLNPILYAFLDENFKACFRKFCCASSLHREMQVSDRVRSIAKDVGLGCKTSETVPRPA. The pIC50 is 9.0. (6) The compound is CC(C)C[C@H]1C[C@@H](C)[C@]2(C(=O)Nc3ccc(F)cc32)N1C(=O)c1cn(Cc2ccccc2)cn1. The target protein (Q7Z478) has sequence MGGKNKKHKAPAAAVVRAAVSASRAKSAEAGIAGEAQSKKPVSRPATAAAAAAGSREPRVKQGPKIYSFNSTNDSSGPANLDKSILKVVINNKLEQRIIGVINEHKKQNNDKGMISGRLTAKKLQDLYMALQAFSFKTKDIEDAMTNTLLYGGDLHSALDWLCLNLSDDALPEGFSQEFEEQQPKSRPKFQSPQIQATISPPLQPKTKTYEEDPKSKPKKEEKNMEVNMKEWILRYAEQQNEEEKNENSKSLEEEEKFDPNERYLHLAAKLLDAKEQAATFKLEKNKQGQKEAQEKIRKFQREMETLEDHPVFNPAMKISHQQNERKKPPVATEGESALNFNLFEKSAAATEEEKDKKKEPHDVRNFDYTARSWTGKSPKQFLIDWVRKNLPKSPNPSFEKVPVGRYWKCRVRVIKSEDDVLVVCPTILTEDGMQAQHLGATLALYRLVKGQSVHQLLPPTYRDVWLEWSDAEKKREELNKMETNKPRDLFIAKLLNKLK.... The pIC50 is 4.0.
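Note on the label: the task description defines pIC50 = -log10(IC50 in M). A minimal sketch of that conversion follows. The function names are hypothetical, and the nanomolar helper assumes the raw IC50 values are recorded in nM, which the text above does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Assumption: the raw value is in nanomolar; rescale to mol/L first."""
    return ic50_to_pic50(ic50_nm * 1e-9)
```

On this scale a pIC50 of 9.0 corresponds to an IC50 of 1 nM and a pIC50 of 6.0 to 1 µM, so each unit of pIC50 is a tenfold change in potency.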