Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CC(C)(C)C#C/C=C/CN(Cc1ccccc1)c1cccc2c1CCC(=O)N2. The target protein (P23316) has sequence MHNINNGYVPNREKTITKRKVRLVGGKAGNLVLENPVPTELRKVLTRTESPFGEFTNMTYTACTSQPDTFSAEGFTLRAAKYGRETEIVICITMYNEDEVAFARTMHGVMKNIAHLCSRHKSKIWGKDSWKKVQVIIVADGRNKVQQSVLELLTATGCYQENLARPYVNNSKVNAHLFEYTTQISIDENLKFKGDEKNLAPVQVLFCLKESNQKKINSHRWLFNAFCPVLDPNVIVLLDVGTKPDNHAIYNLWKAFDRDSNVAGAAGEIKAMKGKGWINLTNPLVASQNFEYKLSNILDKPLESLFGYISVLPGALSAYRYIALKNHDDGTGPLASYFKGEDLLCSHDKDKENTKANFFEANMYLAEDRILCWELVSKRNDNWVLKFVKSATGETDVPETIAEFLSQRRRWINGAFFAALYSLYHFRKIWTTDHSYARKFWLHVEEFIYQLVSLLFSFFSLSNFYLTFYFLTGSLVSYKSLGKKGGFWIFTLFNYLCIGV.... The pIC50 is 7.3. (2) The compound is CS(=O)(=O)Nc1cccnc1OCC(=O)NCC(O)CN1CCc2ccccc2C1. The target protein (O14744) has sequence MAAMAVGGAGGSRVSSGRDLNCVPEIADTLGAVAKQGFDFLCMPVFHPRFKREFIQEPAKNRPGPQTRSDLLLSGRDWNTLIVGKLSPWIRPDSKVEKIRRNSEAAMLQELNFGAYLGLPAFLLPLNQEDNTNLARVLTNHIHTGHHSSMFWMRVPLVAPEDLRDDIIENAPTTHTEEYSGEEKTWMWWHNFRTLCDYSKRIAVALEIGADLPSNHVIDRWLGEPIKAAILPTSIFLTNKKGFPVLSKMHQRLIFRLLKLEVQFIITGTNHHSEKEFCSYLQYLEYLSQNRPPPNAYELFAKGYEDYLQSPLQPLMDNLESQTYEVFEKDPIKYSQYQQAIYKCLLDRVPEEEKDTNVQVLMVLGAGRGPLVNASLRAAKQADRRIKLYAVEKNPNAVVTLENWQFEEWGSQVTVVSSDMREWVAPEKADIIVSELLGSFADNELSPECLDGAQHFLKDDGVSIPGEYTSFLAPISSSKLYNEVRACREKDRDPEAQFEM.... The pIC50 is 5.3. (3) The drug is O=C(O)c1cn2c3c(c(N4CCNCC4)c(F)cc3c1=O)Oc1cc([N+](=O)[O-])ccc1-2. 
The target protein sequence is MGKALVIVESPAKAKTINKYLGSDYVVKSSVGHIRDLPTSGSAAKKSADSTSTKTAKKPKKDERGALVNRMGVDPWHNWEAHYEVLPGKEKVVSELKQLAEKADHIYLATDLDREGEAIAWHLREVIGGDDARYSRVVFNEITKNAIRQAFNKPGELNIDRVNAQQARRFMDRVVGYMVSPLLWKKIARGLSAGRVQSVAVRLVVEREREIKAFVPEEFWEVDASTTTPSGEALALQVTHQNDKPFRPVNKEQTQAAVSLLEKARYSVLEREDKPTTSKPGAPFITSTLQQAASTRLGFGVKKTMMMAQRLYEAGYITYMRTDSTNLSQDAVNMVRGYISDNFGKKYLPESPNQYASKENSQEAHEAIRPSDVNVMAESLKDMEADAQKLYQLIWRQFVACQMTPAKYDSTTLTVGAGDFRLKARGRILRFDGWTKVMPALRKGDEDRILPAVNKGDALTLVELTPAQHFTKPPARFSEASLVKELEKRGIGRPSTYASI.... The pIC50 is 5.4. (4) The drug is Oc1ccc(/C(F)=C/c2ccccc2)cc1. The target protein sequence is MSFPATPDYTGLNKPVGQEVSIKGLKASEGTIPADVRGAFFRAVPDPQFPPFFHPDTALSDDGMISRVLFNADGTVDYDIRYVQTPRWKAERAAGKRLFGRYRNPYTNDPSAFDLEGTVSNTTPVWHA. The pIC50 is 5.5. (5) The drug is O=C1CC(C2CCCCC2)NC(=O)C1C(=O)Nc1ccc(C2CCCCC2)cc1. The target protein (Q97SR4) has sequence MFGFFKKDKAVEVEVPTQVPAHIGIIMDGNGRWAKKRMQPRVFGHKAGMEALQTVTKAANKLGVKVITVYAFSTENWTRPDQEVKFIMNLPVEFYDNYVPELHANNVKIQMIGETDRLPKQTFEALTKAEELTKNNTGLILNFALNYGGRAEITQALKLISQDVLDAKINPGDITEELIGNYLFTQHLPKDLRDPDLIIRTSGELRLSNFLPWQGAYSELYFTDTLWPDFDEAALQEAILAYNRRHRRFGGV. The pIC50 is 7.2. (6) The compound is C[C@]12CC[C@H](O)C[C@H]1CC[C@@H]1[C@@H]2CC[C@]2(C)[C@@H](C=CC=CC=NN=C(N)N)CC[C@]12O. The target protein (P50997) has sequence MGKGVGRDKYEPAAVSEHGDKKKAKKERDMDELKKEVSMDDHKLSLDELHRKYGTDLSRGLTTARAAEILARDGPNALTPPPTTPEWVKFCRQLFGGFSMLLWIGAILCFLAYGIQAATEEEPQNDNLYLGVVLSAVVIITGCFSYYQEAKSSKIMESFKNMVPQQALVIRNGEKMSINAEEVVIGDLVEVKGGDRIPADLRIISANGCKVDNSSLTGESEPQTRSPDFTNENPLETRNIAFFSTNCVKGTARGIVVYTGDRTVMGRIATLASGLEGGQTPIAAEIEHFIHIITGVAVFLGVSFFILSLILEYTWLEAVIFLIGIIVANVPEGLLATVTVCLTLTAKRMARKNCLVKNLEAVETLGSTSTICSDKTGTLTQNRMTVAHMWFDNQIHEADTTENQSGVSFDKSSATWLALSRIAGLCNRAVFQANQENLPILKRAVAGDASESALLKCIELCCGSVKEMRDRYAKIVEIPFNSTNKYQLSIHKNPNTSEPR.... The pIC50 is 5.1. (7) The pIC50 is 4.9. 
The target protein (Q8WUK0) has sequence MAATALLEAGLARVLFYPTLLYTLFRGKVPGRAHRDWYHRIDPTVLLGALPLRSLTRQLVQDENVRGVITMNEEYETRFLCNSSQEWKRLGVEQLRLSTVDMTGIPTLDNLQKGVQFALKYQSLGQCVYVHCKAGRSRSATMVAAYLIQVHKWSPEEAVRAIAKIRSYIHIRPGQLDVLKEFHKQITARATKDGTFVISKT. The compound is O=C(O)CCc1cccc([Sb](=O)(O)O)c1. (8) The drug is N#C/C=C/c1ccc(-c2ccncc2)cc1. The target protein sequence is MPSRAEDYEVLYTIGTGSYGRCQKIRRKSDGKILVWKELDYGSMTEAEKQMLVSEVNLLRELKHPNIVRYYDRIIDRTNTTLYIVMEYCEGGDLASVITKGTKERQYLDEEFVLRVMTQLTLALKECHRRSDGGHTVLHRDLKPANVFLDGKQNVKLGDFGLARILNHDTSFAKAFVGTPYYMSPEQMNRMSYNEKSDIWSLGCLLYELCALMPPFTAFSQKELAGKIREGKFRRIPYRYSDELNEIITRMLNLKDYHRPSVEEILENPLIADLVADEQRRNLERRGRQLGEPEKSQDSSPVLSELKLKEIQLQERERALKAREERLEQKEQELCVRERLAEDKLARAENLLKNYSLLKERKFLSLASNPELLNLPSSVIKKKVHFSGESKENIMRSENSESQLTSKSKCKDLKKRLHAAQLRAQALSDIEKNYQLKSRQILGMR. The pIC50 is 3.5. (9) The compound is Cc1ncc([N+](=O)[O-])n1CC(C)OC(=O)/C=C/c1ccc(-c2ccccc2)cc1. The target protein sequence is MYTKIIGTGSYLPEQVRTNADLEKMVDTSDEWIVTRTGIRERHIAAPNETVSTMGFEAATRAIEMAGIEKDQIGLIVVATTSATHAFPSAACQIQSMLGIKGCPAFDVAAACAGFTYALSVADQYVKSGAVKYALVVGSDVLARTCDPTDRGTIIIFGDGAGAAVLAASEEPGIISTHLHADGSYGELLTLPNADRVNPENSIHLTMAGNEVFKVAVTELAHIVDETLAANNLDRSQLDWLVPHQANLRIISATAKKLGMSMDNVVVTLDRHGNTSAASVPCALDEAVRDGRIKPGQLVLLEAFGGGFTWGSALVRF. The pIC50 is 5.6.
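
The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration, not part of the dataset's own tooling: the function names are hypothetical, and the nanomolar variant assumes the raw IC50 is reported in nM, which the record above does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nanomolar (assumed unit)."""
    return ic50_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition: recover the IC50 in nanomolar from a pIC50."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # A pIC50 of 7.3 corresponds to an IC50 of roughly 50 nM;
    # a pIC50 of 3.5 corresponds to roughly 316 micromolar.
    for value in (7.3, 5.3, 3.5):
        print(f"pIC50 {value} ~ IC50 {pic50_to_ic50_nm(value):.1f} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit is a tenfold difference in IC50, which is why higher values mean more potent binding.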