From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is N#C/C(=C\c1cccc(Cl)c1)c1c[nH]c2ccccc12. The target protein (O95235) has sequence MSQGILSPPAGLLSDDDVVVSPMFESTAADLGSVVRKNLLSDCSVVSTSLEDKQQVPSEDSMEKVKVYLRVRPLLPSELERQEDQGCVRIENVETLVLQAPKDSFALKSNERGIGQATHRFTFSQIFGPEVGQASFFNLTVKEMVKDVLKGQNWLIYTYGVTNSGKTHTIQGTIKDGGILPRSLALIFNSLQGQLHPTPDLKPLLSNEVIWLDSKQIRQEEMKKLSLLNGGLQEEELSTSLKRSVYIESRIGTSTSFDSGIAGLSSISQCTSSSQLDETSHRWAQPDTAPLPVPANIRFSIWISFFEIYNELLYDLLEPPSQQRKRQTLRLCEDQNGNPYVKDLNWIHVQDAEEAWKLLKVGRKNQSFASTHLNQNSSRSHSIFSIRILHLQGEGDIVPKISELSLCDLAGSERCKDQKSGERLKEAGNINTSLHTLGRCIAALRQNQQNRSKQNLVPFRDSKLTRVFQGFFTGRGRSCMIVNVNPCASTYDETLHVAKF.... The pIC50 is 4.7. (2) The drug is N#Cc1ccc(OCc2ccsc2)cc1OC(C(=O)O)c1ccc(Cl)cc1. The target protein (P26684) has sequence MGVLCFLASFWLALVGGAIADNAERYSANLSSHVEDFTPFPGTEFNFLGTTLQPPNLALPSNGSMHGYCPQQTKITTAFKYINTVISCTIFIVGMVGNATLLRIIYQNKCMRNGPNALIASLALGDLIYVVIDLPINVFKLLAGRWPFDHNDFGVFLCKLFPFLQKSSVGITVLNLCALSVDRYRAVASWSRVQGIGIPLITAIEIVSIWILSFILAIPEAIGFVMVPFEYKGEQHRTCMLNATTKFMEFYQDVKDWWLFGFYFCMPLVCTAIFYTLMTCEMLNRRNGSLRIALSEHLKQRREVAKTVFCLVVIFALCWFPLHLSRILKKTVYDEMDKNRCELLSFLLLMDYIGINLATMNSCINPIALYFVSKKFKNCFQSCLCCCCHQSKSLMTSVPMNGTSIQWKNQEQNHNTERSSHKDSMN. The pIC50 is 6.3. (3) The compound is OC[C@@H]1NC[C@H](O)[C@@H](O)[C@H]1O. 
The target protein sequence is MLASLSSSSRAAISCIPLCLLFLTLASSNGVFAAAPPKVGSGYKLVSLVEHPEGGALVGYLQVKQRTSTYGPDIPLLRLYVKHETKDRIRVQITDADKPRWEVPYNLLQREPAPPVTGGRITGVPFAAGEYPGEELVFTYGRDPFWFAVHRKSSREALFNTSCGALVFKDQYIEASTSLPRDAALYGLGENTQPGGIRLRPNDPYTIYTTDISAINLNTDLYGSHPVYVDLRSRGGHGVAHAVLLLNSNGMDVFYRGTSLTYKVIGGLLDFYLFSGPTPLAVVDQYTSMIGRPAPMPYWAFGFHQCRWGYKNLSVVEGVVEGYRNAQIPLDVIWNDDDHMDAAKDFTLDPVNYPRPKLLEFLDKIHAQGMKYIVLIDPGIAVNNTYGVYQRGMQGDVFIKLDGKPYLAQVWPGPVYFPDFLNPNGVSWWIDEVRRFHDLVPVDGLWIDMNEASNFCTGKCEIPTTHLCPLPNTTTPWVCCLDCKNLTNTRWDEPPYKINA.... The pIC50 is 3.4. (4) The small molecule is CCCN(CCC)CCC(=O)Nc1ccc(NC(=O)CCN(CCC)CCC)c2c1C(=O)c1ccccc1C2=O. The target protein (O14746) has sequence MPRAPRCRAVRSLLRSHYREVLPLATFVRRLGPQGWRLVQRGDPAAFRALVAQCLVCVPWDARPPPAAPSFRQVSCLKELVARVLQRLCERGAKNVLAFGFALLDGARGGPPEAFTTSVRSYLPNTVTDALRGSGAWGLLLRRVGDDVLVHLLARCALFVLVAPSCAYQVCGPPLYQLGAATQARPPPHASGPRRRLGCERAWNHSVREAGVPLGLPAPGARRRGGSASRSLPLPKRPRRGAAPEPERTPVGQGSWAHPGRTRGPSDRGFCVVSPARPAEEATSLEGALSGTRHSHPSVGRQHHAGPPSTSRPPRPWDTPCPPVYAETKHFLYSSGDKEQLRPSFLLSSLRPSLTGARRLVETIFLGSRPWMPGTPRRLPRLPQRYWQMRPLFLELLGNHAQCPYGVLLKTHCPLRAAVTPAAGVCAREKPQGSVAAPEEEDTDPRRLVQLLRQHSSPWQVYGFVRACLRRLVPPGLWGSRHNERRFLRNTKKFISLGKH.... The pIC50 is 4.3. (5) The drug is O=C1CC[C@H](N2Cc3c(OCc4ccc(CN5CCOCC5)cc4)cccc3C2=O)C(=O)N1. The target protein (P10147) has sequence MQVSTAALAVLLCTMALCNQFSASLAADTPTACCFSYTSRQIPQNFIADYFETSSQCSKPGVIFLTKRSRQVCADPSEEWVQKYVSDLELSA. The pIC50 is 7.5. (6) The compound is C=C1NC(=O)C(C)C(CCC(C)C(=O)C=CC(C)=CCC(C)CCCCCCC)OC(=O)[C@H](CC(OS(=O)(=O)O)C(N)=O)NC(=O)[C@@H](C)CNC1=O. The target protein (P00642) has sequence MSNKKQSNRLTEQHKLSQGVIGIFGDYAKAHDLAVGEVSKLVKKALSNEYPQLSFRYRDSIKKTEINEALKKIDPDLGGTLFVSNSSIKPDGGIVEVKDDYGEWRVVLVAEAKHQGKDIINIRNGLLVGKRGDQDLMAAGNAIERSHKNISEIANFMLSESHFPYVLFLEGSNFLTENISITRPDGRVVNLEYNSGILNRLDRLTAANYGMPINSNLCINKFVNHKDKSIMLQAASIYTQGDGREWDSKIMFEIMFDISTTSLRVLGRDLFEQLTSK. The pIC50 is 3.6. (7) The small molecule is CCn1sc(=O)n(-c2ccccc2)c1=O. The target is XTSFAESXKPVQQPSAFGS. 
The pIC50 is 5.2. (8) The compound is FC(F)(F)c1ccc(-c2cc3nccnc3[nH]2)cc1. The target protein (P18266) has sequence MSGRPRTTSFAESCKPVQQPSAFGSMKVSRDKDGSKVTTVVATPGQGPDRPQEVSYTDTKVIGNGSFGVVYQAKLCDSGELVAIKKVLQDKRFKNRELQIMRKLDHCNIVRLRYFFYSSGEKKDEVYLNLVLDYVPETVYRVARHYSRAKQTLPVIYVKLYMYQLFRSLAYIHSFGICHRDIKPQNLLLDPDTAVLKLCDFGSAKQLVRGEPNVSYICSRYYRAPELIFGATDYTSSIDMWSAGCVLAELLLGQPIFPGDSGVDQLVEIIKVLGTPTREQIREMNPNYTEFKFPQIKAHPWTKVFRPRTPPEAIALCSRLLEYTPTARLTPLEACAHSFFDELRDPNVKLPNGRDTPALFNFTTQELSSNPPLATILIPPHARIQAAASPPANATAASDTNAGDRGQTNNAASASASNST. The pIC50 is 5.1. (9) The drug is Cc1ccc(-n2nc3c(SCCN4CCOCC4)nnc(C)c3c2C)cc1. The target protein (P04156) has sequence MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGAVVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCVNITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIVG. The pIC50 is 4.9.
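As a note on the label definition given at the top (pIC50 = -log10(IC50 in M)), the conversion can be sketched as follows. This is a minimal illustration, not part of the dataset; the function names are hypothetical, and the nanomolar helper assumes the raw IC50 is reported in nM, which the text above does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


def nanomolar_to_pic50(ic50_nm: float) -> float:
    """Assumption: the raw IC50 is in nanomolar; rescale to mol/L first."""
    return ic50_to_pic50(ic50_nm * 1e-9)


if __name__ == "__main__":
    # A pIC50 of 4.7, as in example (1), corresponds to roughly 2e-5 M.
    print(pic50_to_ic50(4.7))
    # An IC50 of 1 micromolar (1e-6 M) gives a pIC50 close to 6.
    print(ic50_to_pic50(1e-6))
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50, so the 7.5 in example (5) indicates a compound roughly 1000 times more potent than the 4.3 in example (4).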