Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50 (drug-target binding data from BindingDB using IC50 measurements).
(1) The drug is CC(O)(c1ccc(N(C2CCCC2)S(=O)(=O)c2ccccc2Cl)cc1)C(F)(F)F. The target protein (P28845) has sequence MAFMKKYLLPILGLFMAYYYYSANEEFRPEMLQGKKVIVTGASKGIGREMAYHLAKMGAHVVVTARSKETLQKVVSHCLELGAASAHYIAGTMEDMTFAEQFVAQAGKLMGGLDMLILNHITNTSLNLFHDDIHHVRKSMEVNFLSYVVLTVAALPMLKQSNGSIVVVSSLAGKVAYPMVAAYSASKFALDGFFSSIRKEYSVSRVNVSITLCVLGLIDTETAMKAVSGIVHMQAAPKEECALEIIKGGALRQEEVYYDSSLWTTLLIRNPCRKILEFLYSTSYNMDRFINK. The pIC50 is 7.2.
(2) The compound is COc1ccc2oc(Nc3cccc(Br)c3)nc2c1. The target protein (P48999) has sequence MPSYTVTVATGSQWFAGTDDYIYLSLIGSAGCSEKHLLDKAFYNDFERGAVDSYDVTVDEELGEIYLVKIEKRKYWLHDDWYLKYITLKTPHGDYIEFPCYRWITGEGEIVLRDGRAKLARDDQIHILKQHRRKELEARQKQYRWMEWNPGFPLSIDAKCHKDLPRDIQFDSEKGVDFVLNYSKAMENLFINRFMHMFQSSWHDFADFEKIFVKISNTISERVKNHWQEDLMFGYQFLNGCNPVLIKRCTALPPKLPVTTEMVECSLERQLSLEQEVQEGNIFIVDYELLDGIDANKTDPCTHQFLAAPICLLYKNLANKIVPIAIQLNQTPGESNPIFLPTDSKYDWLLAKIWVRSSDFHVHQTITHLLRTHLVSEVFGIAMYRQLPAVHPLFKLLVAHVRFTIAINTKAREQLICEYGLFDKANATGGGGHVQMVQRAVQDLTYSSLCFPEAIKARGMDSTEDIPFYFYRDDGLLVWEAIQSFTMEVVSIYYENDQVV.... The pIC50 is 5.7.
(3) The pIC50 is 5.3. The target protein sequence is MKISEFLHLALPEEQWLPTISGVLRQFAEEECYVYERPPCWYLGKGCQARLHINADGTQATFIDDAGEQKWAVDSIADCARRFMAHPQVKGRRVYGQVGFNFAAHARGIAFNAGEWPLLTLTVPREELIFEKGNVTVYADSADGCRRLCEWVKEASTTTQNAPLAVDTALNGEAYKQQVARAVAEIRRGEYVKVIVSRAIPLPSRIDMPATLLYGRQANTPVRSFMFRQEGREALGFSPELVMSVTGNKVVTEPLAGTRDRMGNPEHNKAKEAELLHDSKEVLEHILSVKEAIAELEAVCLPGSVVVEDLMSVRQRGSVQHLGSGVSGQLAENKDAWDAFTVLFPSITASGIPKNAALNAIMQIEKTPRELYSGAILLLDDTRFDAALVLRSVFQDSQRCWIQAGAGIIAQSTPERELTETREKLASIAPYLMV. The compound is COC(=O)Oc1c([N+](=O)[O-])cc2oc(=O)sc2c1[N+](=O)[O-].
(4) The drug is CCOC(=O)c1c(C)n(C)c2ccc(O)c(CN(C)C)c12. The target is TRQARRNRRRRWRERQR. The pIC50 is 4.1.
(5) The small molecule is Oc1ccc2c(c1)C1C(CC2)C1c1ccncc1. The target protein (P30099) has sequence MGACDNDFIELHSRVTADVWLARPWQCLHRTRALGTTATLAPKTLKPFEAIPQYSRNKWLKMIQILREQGQENLHLEMHQAFQELGPIFRHSAGGAQIVSVMLPEDAEKLHQVESILPRRMHLEPWVAHRELRGLRRGVFLLNGAEWRFNRLKLNPNVLSPKAVQNFVPMVDEVARDFLEALKKKVRQNARGSLTMDVQQSLFNYTIEASNFALFGERLGLLGHDLNPGSLKFIHALHSMFKSTTQLLFLPRSLTRWTSTQVWKEHFDAWDVISEYANRCIWKVHQELRLGSSQTYSGIVAALITQGALPLDAIKANSMELTAGSVDTTAIPLVMTLFELARNPDVQQALRQETLAAEASIAANPQKAMSDLPLLRAALKETLRLYPVGGFLERILNSDLVLQNYHVPAGTLVLLYLYSMGRNPAVFPRPERYMPQRWLERKRSFQHLAFGFGVRQCLGRRLAEVEMLLLLHHMLKTFQVETLRQEDVQMAYRFVLMPSS.... The pIC50 is 5.2.
(6) The small molecule is COC1=CC(=O)C[C@@H](C)[C@]12Oc1c(Cl)c(OC)cc(OC)c1C2=O. The pIC50 is 4.0. The target protein sequence is MSSEVETSEGVDESENNSTAPEKENHTKMADLSELLKEGTKEAHDRAENTQFVKDFLKGNIKKELFKLATTALYFTYSALEEEMDRNKDHPAFAPLYFPTELHRKEALIKDMEYFFGENWEEQVKCSEAAQKYVDRIHYVGQNEPELLVAHAYTRYMGDLSGGQVLKKVAQRALKLPSTGEGTQFYLFEHVDNAQQFKQFYRARMNALDLSMKTKERIVEEANKAFEYNMQIFSELDQAGSMLTKETLEDGLPVHDGKGDVRKCPFYAAQPDKGTLGGSNCPFRTAMAVLRKPSLQLILAASVALVAGLLAWYYM.
(7) The small molecule is Nc1nc2cc(-c3ccc4ncn(CCN5CCCC5)c(=O)c4c3)ccc2o1. The target protein (P27986) has sequence MSAEGYQYRALYDYKKEREEDIDLHLGDILTVNKGSLVALGFSDGQEARPEEIGWLNGYNETTGERGDFPGTYVEYIGRKKISPPTPKPRPPRPLPVAPGSSKTEADVEQQALTLPDLAEQFAPPDIAPPLLIKLVEAIEKKGLECSTLYRTQSSSNLAELRQLLDCDTPSVDLEMIDVHVLADAFKRYLLDLPNPVIPAAVYSEMISLAPEVQSSEEYIQLLKKLIRSPSIPHQYWLTLQYLLKHFFKLSQTSSKNLLNARVLSEIFSPMLFRFSAASSDNTENLIKVIEILISTEWNERQPAPALPPKPPKPTTVANNGMNNNMSLQDAEWYWGDISREEVNEKLRDTADGTFLVRDASTKMHGDYTLTLRKGGNNKLIKIFHRDGKYGFSDPLTFSSVVELINHYRNESLAQYNPKLDVKLLYPVSKYQQDQVVKEDNIEAVGKKLHEYNTQFQEKSREYDRLYEEYTRTSQEIQMKRTAIEAFNETIKIFEEQCQT.... The pIC50 is 5.0.
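
The pIC50 definition stated in the task (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration only: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and not part of the dataset or any library, and the nanomolar conversion is an added convenience, not something the task specifies.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (hypothetical helper)."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # An IC50 of 100 nM (1e-7 M) corresponds to a pIC50 of about 7.
    print(round(ic50_to_pic50(1e-7), 3))
    # A pIC50 of 7.2, as in example (1), corresponds to roughly 63 nM.
    print(round(pic50_to_ic50_nm(7.2), 1))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean more potent compounds.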