Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50 (drug-target binding data from BindingDB using IC50 measurements). (1) The pIC50 is 6.3. The compound is COc1ccc(C(=O)Nc2c(Cl)cncc2Cl)c2c1nc(C(F)(F)F)n2C. The target protein (P14644) has sequence NSSRTSSAASDLHGEDMIVTPFAQVLASLRTVRSNVAALAHGAGSATRQALLGTPPQSSQQAAPAEESGLQLAQETLEELDWCLEQLETLQTRRSVGEMASNKFKRMLNRELTHLSETSRSGNQVSEYISQTFLDQQAEVELPAPPTEDHPWPMAQITGLRKSCHTSLPTAAIPRFGVQTDQEEQLAKELEDTNKWGLDVFKVAELSGNRPLTAVIFRVLQERDLLKTFQIPADTLLRYLLTLEGHYHSNVAYHNSIHAADVVQSAHVLLGTPALEAVFTDLEVLAAIFACAIHDVDHPGVSNQFLINTNSELALMYNDSSVLENHHLAVGFKLLQGENCDIFQNLSTKQKLSLRRMVIDMVLATDMSKHMSLLADLKTMVETKKVTSLGVLLLDNYSDRIQVLQSLVHCADLSNPAKPLPLYRQWTERIMAEFFQQGDRERESGLDISPMCDKHTASVEKSQVGFIDYIAHPLWETWADLVHPDAQELLDTLEDNREWY.... (2) The compound is Cc1c(Cn2ccnc2)oc2ccc(C(=O)O)cc12. The target protein (P15393) has sequence MALRVTADVWLARPWQCLHRTRALGTTAKVAPKTLKPFEAIPQYSRNKWLKMIQILREQGQENLHLEMHQAFQELGPIFRHSAGGAQIVSVMLPEDAEKLHQVESILPHRMPLEPWVAHRELRGLRRGVFLLNGADWRFNRLQLNPNMLSPKAIQSFVPFVDVVARDFVENLKKRMLENVHGSMSINIQSNMFNYTMEASHFVISGERLGLTGHDLKPESVTFTHALHSMFKSTTQLMFLPKSLTRWTSTRVWKEHFDSWDIISEYVTKCIKNVYRELAEGRQQSWSVISEMVAQSTLSMDAIHANSMELIAGSVDTTAISLVMTLFELARNPDVQQALRQESLAAEASIVANPQKAMSDLPLLRAALKETLRLYPVGSFVERIVHSDLVLQNYHVPAGTFVIIYLYSMGRNPAVFPRPERYMPQRWLERKRSFQHLAFGFGVRQCLGRRLAEVEMLLLLHHMLKTFQVETLRQEDMQMVFRFLLMPSSSPFLTFRPVS. The pIC50 is 4.0. (3) The pIC50 is 7.4.
The target protein sequence is MSLHFLYYCSEPTLDVKIAFCQGFDKHVDVSSVVKHYNMSKSKVDNQFYSVEVGDSTFTVLKRYQNLKPIGSGAQGIVCAAYDAVLDRNVAIKKLSRPFQNQTHAKRAYRELVLMKCVNHKNIISLLNVFTPQKTLEEFQDVYLVMELMDANLCQVIQMELDHERMSYLLYQMLSAIKHLHSAGIIHRDLKPSNIVVKSDCTLKILDFGLARTAGTSFMMTPYVVTRYYRAPEVILGMGYKENVDIWSVGCIMGEMVRHKILFPGRDYIDQWNKVIEQLGTPCPEFMKKLQPTVRNYVENRPKYAGLTFPKLFPDSLFPADSEHNKLKASQARDLLSKMLVIDPAKRISVDDALQHPYINVWYDPAEVEAPPPQIYDKQLDEREHTIEEWKELIYKEVMNSEEKTKNGVVKGQPSPSGAAVNSSESLPPSSSVNDISSMSTDQTLASDTDSSLEASAGPLGCCR. The small molecule is N#CC(c1ccnc(NCCc2ccc(S(N)(=O)=O)cc2)n1)c1nc2ccccc2s1. (4) The drug is CC1(C)[C@H](C(=O)O)N2C(=O)[C@@H](CO)[C@H]2S1(=O)=O. The target protein (P14488) has sequence MKNTLLKLGVCVSLLGITPFVSTISSVQAERTVEHKVIKNETGTISISQLNKNVWVHTELGYFSGEAVPSNGLVLNTSKGLVLVDSSWDDKLTKELIEMVEKKFKKRVTDVIITHAHADRIGGMKTLKERGIKAHSTALTAELAKKNGYEEPLGDLQSVTNLKFGNMKVETFYPGKGHTEDNIVVWLPQYQILAGGCLVKSASSKDLGNVADAYVNEWSTSIENVLKRYGNINLVVPGHGEVGDRGLLLHTLDLLK. The pIC50 is 3.7.
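
The pIC50 definition in the task statement (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration only: the function names `ic50_to_pic50` and `pic50_to_ic50` and the unit table are assumptions made for this example and are not part of the dataset or of any BindingDB tooling.

```python
import math

# Hypothetical unit table for this sketch: factor that converts a value to molar (M).
_UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ic50_to_pic50(ic50, unit="nM"):
    """Convert an IC50 given in `unit` to pIC50 = -log10(IC50 in M)."""
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50 * _UNIT_TO_MOLAR[unit])


def pic50_to_ic50(pic50, unit="nM"):
    """Invert the definition: IC50 in M is 10**(-pIC50), then rescale to `unit`."""
    return 10 ** (-pic50) / _UNIT_TO_MOLAR[unit]


if __name__ == "__main__":
    # pIC50 labels taken from the four examples above.
    for label in (6.3, 4.0, 7.4, 3.7):
        print(f"pIC50 {label} corresponds to IC50 of about {pic50_to_ic50(label):.1f} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why a higher pIC50 means a more potent compound.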