Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (Q9Y286) has sequence MLLLLLLPLLWGRERVEGQKSNRKDYSLTMQSSVTVQEGMCVHVRCSFSYPVDSQTDSDPVHGYWFRAGNDISWKAPVATNNPAWAVQEETRDRFHLLGDPQTKNCTLSIRDARMSDAGRYFFRMEKGNIKWNYKYDQLSVNVTALTHRPNILIPGTLESGCFQNLTCSVPWACEQGTPPMISWMGTSVSPLHPSTTRSSVLTLIPQPQHHGTSLTCQVTLPGAGVTTNRTIQLNVSYPPQNLTVTVFQGEGTASTALGNSSSLSVLEGQSLRLVCAVDSNPPARLSWTWRSLTLYPSQPSNPLVLELQVHLGDEGEFTCRAQNSLGSQHVSLNLSLQQEYTGKMRPVSGVLLGAVGGAGATALVFLSFCVIFIVVRSCRKKSARPAADVGDIGMKDANTIRGSASQGNLTESWADDNPRHHGLAAHSSGEEREIQYAPLSFHKGEPQDLSGQEATNNEYSEIKIPK. The drug is CC(=O)N[C@@H]1[C@@H](O)C[C@](OCCCn2cc(-c3ccc(-c4ccccc4)cc3)nn2)(C(=O)[O-])O[C@H]1[C@H](O)[C@H](O)CNS(C)(=O)=O.[Na+]. The pIC50 is 5.6. (2) The compound is NCc1ccc(Nc2c(C(=O)C3CC3)cnc3ccc(-c4cc(F)c(O)c(Cl)c4)cc23)cc1. The target protein (Q14680) has sequence MKDYDELLKYYELHETIGTGGFAKVKLACHILTGEMVAIKIMDKNTLGSDLPRIKTEIEALKNLRHQHICQLYHVLETANKIFMVLEYCPGGELFDYIISQDRLSEEETRVVFRQIVSAVAYVHSQGYAHRDLKPENLLFDEYHKLKLIDFGLCAKPKGNKDYHLQTCCGSLAYAAPELIQGKSYLGSEADVWSMGILLYVLMCGFLPFDDDNVMALYKKIMRGKYDVPKWLSPSSILLLQQMLQVDPKKRISMKNLLNHPWIMQDYNYPVEWQSKNPFIHLDDDCVTELSVHHRNNRQTMEDLISLWQYDHLTATYLLLLAKKARGKPVRLRLSSFSCGQASATPFTDIKSNNWSLEDVTASDKNYVAGLIDYDWCEDDLSTGAATPRTSQFTKYWTESNGVESKSLTPALCRTPANKLKNKENVYTPKSAVKNEEYFMFPEPKTPVNKNQHKREILTTPNRYTTPSKARNQCLKETPIKIPVNSTGTDKLMTGVISPE.... The pIC50 is 8.6. (3) The drug is COc1cnc(-c2ccccc2C(F)(F)CNC(=O)c2ccc(COCC(F)(F)F)nc2)cn1. 
The target protein (O08562) has sequence MAMLPPPGPQSFVHFTKQSLALIEQRISEEKAKEHKDEKKDDEEEGPKPSSDLEAGKQLPFIYGDIPPGMVSEPLEDLDPYYADKKTFIVLNKGKAIFRFNATPALYMLSPFSPLRRISIKILVHSLFSMLIMCTILTNCIFMTLSNPPEWTKNVEYTFTGIYTFESLIKILARGFCVGEFTFLRDPWNWLDFVVIVFAYLTEFVNLGNVSALRTFRVLRALKTISVIPGLKTIVGALIQSVKKLSDVMILTVFCLSVFALIGLQLFMGNLKHKCFRKELEENETLESIMNTAESEEELKKYFYYLEGSKDALLCGFSTDSGQCPEGYICVKAGRNPDYGYTSFDTFSWAFLALFRLMTQDYWENLYQQTLRAAGKTYMIFFVVVIFLGSFYLINLILAVVAMAYEEQNQANIEEAKQKELEFQQMLDRLKKEQEEAEAIAAAAAEFTSIGRSRIMGLSESSSETSRLSSKSAKERRNRRKKKKQKMSSGEEKGDDEKLS.... The pIC50 is 6.0. (4) The target protein (Q9EQ60) has sequence MTEGTLAADEVRVPLGASPPAPAAPVRASPASPGAPGREEQGGSGSGVLAPESPGTECGADLGADEEQPVPYPALAATVFFCLGQTTRPRSWCLRLVCNPWFEHISMLVIMLNCVTLGMFRPCEDVECRSERCSILEAFDDFIFAFFAVEMVIKMVALGLFGQKCYLGDTWNRLDFFIVMAGMMEYSLDGHNVSLSAIRTVRVLRPLRAINRVPSMRILVTLLLDTLPMLGNVLLLCFFVFFIFGIVGVQLWAGLLRNRCFLDSAFVRNNNLTFLRPYYQTEEGEENPFICSSRRDNGMQKCSHIPSRRELRVQCTLGWEAYGQPQAEDGGAGRNACINWNQYYNVCRSGEFNPHNGAINFDNIGYAWIAIFQVITLEGWVDIMYYVMDAHSFYNFIYFILLIIMGSFFMINLCLVVIATQFSETKQRENQLMREQRARYLSNDSTLASFSEPGSCYEELLKYVGHIFRKVKRRSLRLYARWQSRWRKKVDPSSTVHGQG.... The pIC50 is 5.1. The small molecule is O=C(c1cc(S(=O)(=O)Nc2ccccc2F)c(F)cc1Cl)N1CCN2CCC[C@@H]2C1.
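Note on the label: the pIC50 definition given above (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset tooling. The function names are hypothetical, and the nanomolar input unit is an assumption about how the raw IC50 values are stored.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 value to pIC50 = -log10(IC50 in M).

    Assumption: the input is in nanomolar (nM); 1 nM = 1e-9 M.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9
    return -math.log10(ic50_molar)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the transform: pIC50 back to IC50 in nanomolar."""
    return (10.0 ** (-pic50)) * 1e9


if __name__ == "__main__":
    # The pIC50 values quoted in the examples above, mapped back to nM.
    for label in (5.6, 8.6, 6.0, 5.1):
        print(f"pIC50 {label} corresponds to IC50 of {ic50_nm_from_pic50(label):.1f} nM")
```

Because the scale is logarithmic, a one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why a higher value means a more potent compound.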