Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50 (drug-target binding data from BindingDB using IC50 measurements).
(1) The compound is O=C(O)c1ccc(-c2ccc(O[C@H]3O[C@H](CO)[C@@H](O)[C@H](O)[C@@H]3O)c(Cl)c2)cc1. The target protein (Q9UJ71) has sequence MTVEKEAPDAHFTVDKQNISLWPREPPPKSGPSLVPGKTPTVRAALICLTLVLVASVLLQAVLYPRFMGTISDVKTNVQLLKGRVDNISTLDSEIKKNSDGMEAAGVQIQMVNESLGYVRSQFLKLKTSVEKANAQIQILTRSWEEVSTLNAQIPELKSDLEKASALNTKIRALQGSLENMSKLLKRQNDILQVVSQGWKYFKGNFYYFSLIPKTWYSAEQFCVSRNSHLTSVTSESEQEFLYKTAGGLIYWIGLTKAGMEGDWSWVDDTPFNKVQSVRFWIPGEPNNAGNNEHCGNIKAPSLQAWNDAPCDKTFLFICKRPYVPSEP. The pIC50 is 3.0.
(2) The drug is O=C1NC(=O)/C(=C/c2cnn3c(NC4CC4)cc(N4CCN(c5ccccn5)CC4)nc23)S1. The target protein (P67870) has sequence MSSSEEVSWISWFCGLRGNEFFCEVDEDYIQDKFNLTGLNEQVPHYRQALDMILDLEPDEELEDNPNQSDLIEQAAEMLYGLIHARYILTNRGIAQMLEKYQQGDFGYCPRVYCENQPMLPIGLSDIPGEAMVKLYCPKCMDVYTPKSSRHHHTDGAYFGTGFPHMLFMVHPEYRPKRPANQFVPRLYGFKIHPMAYQLQLQAASNFKSPVKTIR. The pIC50 is 5.7.
(3) The pIC50 is 6.2. The drug is Oc1c(F)cc(Cn2c(=S)[nH]c3ccccc32)cc1F. The target protein (P15101) has sequence MQVPSPSVREAASMYGTAVAVFLVILVAALQGSAPAESPFPFHIPLDPEGTLELSWNISYAQETIYFQLLVRELKAGVLFGMSDRGELENADLVVLWTDRDGAYFGDAWSDQKGQVHLDSQQDYQLLRAQRTPEGLYLLFKRPFGTCDPNDYLIEDGTVHLVYGFLEEPLRSLESINTSGLHTGLQRVQLLKPSIPKPALPADTRTMEIRAPDVLIPGQQTTYWCYVTELPDGFPRHHIVMYEPIVTEGNEALVHHMEVFQCAAEFETIPHFSGPCDSKMKPQRLNFCRHVLAAWALGAKAFYYPEEAGLAFGGPGSSRFLRLEVHYHNPLVITGRRDSSGIRLYYTAALRRFDAGIMELGLAYTPVMAIPPQETAFVLTGYCTDKCTQLALPASGIHIFASQLHTHLTGRKVVTVLARDGRETEIVNRDNHYSPHFQEIRMLKKVVSVQPGDVLITSCTYNTEDRRLATVGGFGILEEMCVNYVHYYPQTQLELCKSAV....
(4) The drug is COc1cc(C(=O)OCCCN2CCCN(CCCOC(=O)c3cc(OC)c(OC)c(OC)c3)CC2)cc(OC)c1OC. The target protein (O54699) has sequence MAHGNAPRDSYHLVGISFFILGLGTLLPWNFFITAIPYFQGRLAGTNSSAETPSTNHTSPTDTFNFNNWVTLLSQLPLLLFTLLNSFLYQCIPESVRILGSLLAILLLFALTAALVKVDLSPGLFFSITMASVWFINSFCAVLQGSLFGQLGTMPSTYSTLFLSGQGLAGIFAALAMLTSLASGVDPQTSALGYFITPCVGILLSIICYLSLPHLKFARYYLTKKPQAPVQELETKAELLGADEKNGIPVSPQQAGPTLDLDPEKELELGLEEPQKPGKPSVFVVFRKIWLTALCLVLVFTVTLSVFPAITAMVTTSSNSPGKWSQFFNPICCFLLFNVMDWLGRSLTSYFLWPDEDSQLLPLLVCLRFLFVPLFMLCHVPQRARLPIIFWQDAYFITFMLLFAISNGYFVSLTMCLAPRQVLPHEREVAGALMTFFLALGLSCGASLSFLFKALL. The pIC50 is 5.1.
(5) The small molecule is CCOc1cc2ncnc(Nc3ccc(C)c(Cl)c3)c2c2c1OCCO2. The target protein sequence is MRPSGTAGAALLALLAALCPASRALEEKKVCQGTSNKLTQLGTFEDHFLSLQRMFNNCEVVLGNLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVERIPLENLQIIRGNMYYENSYALAVLSNYDANKTGLKELPMRNLQEILHGAVRFSNNPALCNVESIQWRDIVSSDFLSNMSMDFQNHLGSCQKCDPSCPNGSCWGAGEENCQKLTKIICAQQCSGRCRGKSPSDCCHNQCAAGCTGPRESDCLVCRKFRDEATCKDTCPPLMLYNPTTYQMDVNPEGKYSFGATCVKKCPRNYVVTDHGSCVRACGADSYEMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDSLSINATNIKHFKNCTSISGDLHILPVAFRGDSFTHTPPLDPQELDILKTVKEITGFLLIQAWPENRTDLHAFENLEIIRGRTKQHGQFSLAVVSLNITSLGLRSLKEISDGDVIISGNKNLCYANTINWKKLFGTSGQKTKIISNRGENSCK.... The pIC50 is 7.3.
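
The pIC50 definition in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration only: the function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical and are not part of the dataset or of any BindingDB tooling, and the IC50 values used below are round numbers chosen to show the scale, not measurements taken from the records above.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


# Illustrative values only (hypothetical concentrations):
# 1 mM = 1e-3 M is a weak inhibitor, 50 nM = 5e-8 M a much more potent one.
weak = ic50_to_pic50(1e-3)
potent = ic50_to_pic50(5e-8)
print(f"1 mM  -> pIC50 {weak:.1f}")
print(f"50 nM -> pIC50 {potent:.1f}")
```

Because the scale is logarithmic, each additional pIC50 unit corresponds to a tenfold lower IC50, which is why a higher pIC50 means a more potent compound.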