Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is O=C1N[C@@H]2[C@H](CCCCCc3cn(CCCCOc4ccc5ccccc5c4)nn3)SC[C@@H]2N1. The target protein sequence is MSKYSQDVLQLLYKNKPNYISGQSIAESLNISRTAVKKVIDQLKLEGCKIDSVNHKGHLLQQLPDIWYQGIIDQYTKSSALFDFSEVYDSIDSTQLAAKKSLVGNQSSFFILSDEQTKGRGRFNRHWSSSKGQGLWMSVVLRPNVAFSMISKFNLFIALGIRDAIQHFSQDEVKVKWPNDIYIDNGKVCGFLTEMVANNDGIEAIICGIGINLTQQLENFDESIRHRATSIQLHDKNKLDRYQFLERLLQEIEKRYNQFLTLPFSEIREEYIAASNIWNRTLLFTENDKQFKGQAIDLDYDGYLIVRDEAGESHRLISADIDF. The pIC50 is 5.2. (2) The target protein sequence is MPQTLVFFSPDGGDPPLPKCLCSQIYCHGELLRQVQMARLYQDDKQFVDMPLSVAPDQVLQRFSELAQAHNFSIPQQELQDFIREHFQAVGQELQPWTPEDWKDSPQFLQKILDPKLRAWAGQLHQLWKKLGKKVKPEVLSHPERFSLIYSGHPFIVPGGRFVEFYYWDSYWVMEGLLLSEMPGTVKGMLQNFLDLVQTYGHVPNGARVYYLQRSQPPLLSLMMERYVTQANDTAFLRDNLETLALELDFWTKNRSISVSSGGKSYVLNRYHVPYGGPRPESYSKDAELAATLSEGDHEALWAELKAGAESGWDFSSRWFVGGPNPDSLSSIRTSKLVPVDLNASGNDSQAEKYRNLRAQRMAAMKDILWDEEKGAWFDYDLENGKKNLEFYPSNLAPLWAGCFSDPGDVDKALKYLEDSQILTYHYGIPTSLRKTGQQWDFPNAWAPLQDLVIRGLAKSPSARAQEVAFQLAQNWIRTNFDVYSRRSAMYEKYDISNGG.... The compound is CCCCN1C[C@H](O)[C@@H](O)[C@H](O)[C@H]1CO. The pIC50 is 4.4. (3) The compound is COc1cc(C=Nc2nnc(S)s2)cc(OC)c1. The target protein sequence is LIPFDDAVGPTEFSPFDQWTGYCTHGSTLFPTWHRPYVLILEQILSGHAQQIADTYTVNKSEWKKAATEFRHPYWDWASNSVPPPEVISLPKVTITTPNGQKTSVANPLMRYTFNSVNDGGFYGPYNQWDTTLRQPDSTGVNAKDNVNRLKSVLKNAQASLTRATYDMFNRVTTWPHFSSHTPASGGSTSNSIEAIHDNIHVLVGGNGHMSDPSVAPFDPIFFLHHANVDRLIALWSAIRYDVWTSPGDAQFGTYTLRYKQSVDESTDLAPWWKTQNEYWKSNELRSTESLGYTYPEFVGLDMYNKDAVNKTISRKVAQLYGPQRGGQRSLVEDLSNSHARRSQRPAKRSRLGQLLKGLFSDWSAQIKFNRHEVGQSFSVCLFLGNVPEDPREWLVSPNLVGARHAFVRSVKTDHVAEEIGFIPINQWIAEHTGLPSFAVDLVKPLLAQGLQWRVLLADGTPAELDSLEVTILEVPSELTDDEPNPRSRPPRYHKDITHG.... The pIC50 is 5.3. 
(4) The drug is c1cc(-c2cnn3cc(-c4ccc(OCCN5CCCCC5)cc4)cnc23)ccn1. The target protein (P12644) has sequence MIPGNRMLMVVLLCQVLLGGASHASLIPETGKKKVAEIQGHAGGRRSGQSHELLRDFEATLLQMFGLRRRPQPSKSAVIPDYMRDLYRLQSGEEEEEQIHSTGLEYPERPASRANTVRSFHHEEHLENIPGTSENSAFRFLFNLSSIPENEVISSAELRLFREQVDQGPDWERGFHRINIYEVMKPPAEVVPGHLITRLLDTRLVHHNVTRWETFDVSPAVLRWTREKQPNYGLAIEVTHLHQTRTHQGQHVRISRSLPQGSGNWAQLRPLLVTFGHDGRGHALTRRRRAKRSPKHHSQRARKKNKNCRRHSLYVDFSDVGWNDWIVAPPGYQAFYCHGDCPFPLADHLNSTNHAIVQTLVNSVNSSIPKACCVPTELSAISMLYLDEYDKVVLKNYQEMVVEGCGCR. The pIC50 is 6.4. (5) The drug is CN(C)C(=O)N1CC2(CCN(c3ccc(-c4cc(-c5cnn(C)c5)cn5ncc(C#N)c45)cn3)CC2)C1. The target protein sequence is HCYHKFAHKPPISSAEMTFRRPAQAFPVSYSSSGARRPSLDSMENQVSVDAFKILEDPKWEFPRKNLVLGKTLGEGEFGKVVKATAFHLKGRAGYTTVAVKMLKENASPSELRDLLSEFNVLKQVNHPHVIKLYGACSQDGPLLLIMEYAKYGSLRGFLRESRKVGPGYLGSGGSRNSSSLDHPDERALTMGDLISFAWQISQGMQYLAEMKLVHRDLAARNILVAEGRKMKISDFGLSRDVYEEDSYVKRSQGRIPVKWMAIESLFDHIYTTQSDVWSFGVLLWEIVTLGGNPYPGIPPERLFNLLKTGHRMERPDNCSEEMYRLMLQCWKQEPDKRPVFADISKDLEKMMVKRRDYLDLAASTPSDSLIYDDGLSEEETPLVDCNNAPLPRALPSTWIENKLYGMSDPNWPGESPVPLTRADGTNTGFPRYPNDSVYANWMLSPSAAKLMDTFDS. The pIC50 is 7.5. (6) The compound is Nc1ccc([As]2SCCCS2)cc1. The target protein (P70619) has sequence VNVGCVPKKVMWNTAVHSEFIHDHVDYGFQNCKSKFNWHVIKEKRDAYVSRLNNIYQNNLTKSHIEVIHGYATFRDGPQPTAEVNGKKFTAPHILIATGGVPTVPHENQIPGASLGITSDGFFQLEDLPSRSVIVGAGYIAVEIAGILSALGSKTSLMIRHDKVLRSFDSLISSNCTEELENAGGVEVLTVKKFSQVKEVKKTSSGLELHVVTALPGRKPTVTTIPDVDCLLWAIGRDPNSKGLNLNKLGIQTDDKGHILVDEFQNTNVKGVYAVGDVCGKALLTPVAIAAGRKLAHRLFEGKEDSRLDYDNIPTVVFSHPPIGTVGLTEDEAVHKYGKDNVKIYSTAFTPMYHAVTTRKTKCVMKMVCANKEEKVVGIHMQGIGCDEMLQGFAVAVKMGATKADFDNRVAIHPTSSEELVTLR. The pIC50 is 5.0. (7) The compound is COc1cc(Cc2cccc(OC)c2O)cc(OC)c1. The target protein (P0DP25,P0DP24,P0DP23) has sequence MADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMINEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMTAK. The pIC50 is 4.2. (8) The compound is C[n+]1ccc(-c2cc[n+](C)cc2)cc1. 
The target protein (P04993) has sequence MKLQKQLLEAVEHKQLRPLDVQFALTVAGDEHPAVTLAAALLSHDAGEGHVCLPLSRLENNEASHPLLATCVSEIGELQNWEECLLASQAVSRGDEPTPMILCGDRLYLNRMWCNERTVARFFNEVNHAIEVDEALLAQTLDKLFPVSDEINWQKVAAAVALTRRISVISGGPGTGKTTTVAKLLAALIQMADGERCRIRLAAPTGKAAARLTESLGKALRQLPLTDEQKKRIPEDASTLHRLLGAQPGSQRLRHHAGNPLHLDVLVVDEASMIDLPMMSRLIDALPDHARVIFLGDRDQLASVEAGAVLGDICAYANAGFTAERARQLSRLTGTHVPAGTGTEAASLRDSLCLLQKSYRFGSDSGIGQLAAAINRGDKTAVKTVFQQDFTDIEKRLLQSGEDYIAMLEEALAGYGRYLDLLQARAEPDLIIQAFNEYQLLCALREGPFGVAGLNERIEQFMQQKRKIHRHPHSRWYEGRPVMIARNDSALGLFNGDIGI.... The pIC50 is 3.9.
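
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration, not part of the dataset tooling: the function names are hypothetical, and the nanomolar input unit is an assumption about how raw IC50 values might be supplied.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Apply pIC50 = -log10(IC50 in M), as stated in the task description."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Hypothetical helper: assumes the raw IC50 is given in nanomolar."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition to recover an IC50 in nanomolar."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # pIC50 labels taken from examples (5) and (8) above.
    for label in (7.5, 3.9):
        print(f"pIC50 {label} corresponds to IC50 of about {pic50_to_ic50_nm(label):.1f} nM")
```

Because the scale is logarithmic, a one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent binding.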