From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is N#Cc1c(-c2ccc(Cl)cc2)nc(SCC(=O)Nc2ccc(F)c(Cl)c2)[nH]c1=O. The target protein (P07195) has sequence MATLKEKLIAPVAEEEATVPNNKITVVGVGQVGMACAISILGKSLADELALVDVLEDKLKGEMMDLQHGSLFLQTPKIVADKDYSVTANSKIVVVTAGVRQQEGESRLNLVQRNVNVFKFIIPQIVKYSPDCIIIVVSNPVDILTYVTWKLSGLPKHRVIGSGCNLDSARFRYLMAEKLGIHPSSCHGWILGEHGDSSVAVWSGVNVAGVSLQELNPEMGTDNDSENWKEVHKMVVESAYEVIKLKGYTNWAIGLSVADLIESMLKNLSRIHPVSTMVKGMYGIENEVFLSLPCILNARGLTSVINQKLKDDEVAQLKKSADTLWDIQKDLKDL. The pIC50 is 4.0. (2) The target protein (Q8ZLS1) has sequence MSHLALQPGFDFQQAGKEVLEIEREGLAELDQYINQHFTLACEKMFNCTGKVVVMGMGKSGHIGRKMAATFASTGTSSFFVHPGEAAHGDLGMVTPQDVVIAISNSGESSEIAALIPVLKRLHVPLICITGRPESSMARAADVHLCVKVPKEACPLGLAPTSSTTATLVMGDALAVALLKARGFTAEDFALSHPGGALGRKLLLRVSDIMHTGDEIPHVNKHATLRDALLEITRKNLGMTVICDESMKIDGIFTDGDLRRVFDMGGDMRQLGIAEVMTPGGIRVRPGILAVDALNLMQSRHITSVLVADGDQLLGVLHMHDLLRAGVV. The pIC50 is 2.3. The drug is O=C(O)C(F)C(O)C(O)COP(=O)(O)O. (3) The small molecule is Cc1ccc2c(=O)c3ccccc3n(CCCN)c2c1C. The target protein (Q64725) has sequence MAGNAVDNANHLTYFFGNITREEAEDYLVQGGMTDGLYLLRQSRNYLGGFALSVAHNRKAHHYTIERELNGTYAISGGRAHASPADLCHYHSQEPEGLVCLLKKPFNRPPGVQPKTGPFEDLKENLIREYVKQTWNLQGQALEQAIISQKPQLEKLIATTAHEKMPWFHGNISRDESEQTVLIGSKTNGKFLIRARDNNGSFALCLLHEGKVLHYRIDRDKTGKLSIPEGKKFDTLWQLVEHYSYKPDGLLRVLTVPCQKIGVQMGHPGSSNAHPVTWSPGGIISRIKSYSFPKPGHKKPPPPQGSRPESTVSFNPYEPTGGAWGPDRGLQREALPMDTEVYESPYADPEEIRPKEVYLDRKLLTLEDNELGSGNFGTVKKGYYQMKKVVKTVAVKILKNEANDPALKDELLAEANVMQQLDNPYIVRMIGICEAESWMLVMEMAAWGPLNKYLQQNRHIKDKNIIELVHQVSMGMKYLEESNFVHRDLAARNVLLVTQH.... The pIC50 is 5.0. (4) The compound is O=C1C=C(Nc2cccc(C(=O)O)c2)C(=O)C=C1Nc1cccc(C(=O)O)c1. 
The target protein (P39052) has sequence MGNRGMEELIPLVNKLQDAFSSIGQSCHLDLPQIAVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRPLILQLIFSKTEYAEFLHCKSKKFTDFDEVRQEIEAETDRVTGTNKGISPVPINLRVYSPHVLNLTLIDLPGITKVPVGDQPPDIEYQIKDMILQFISRESSLILAVTPANMDLANSDALKLAKEVDPQGLRTIGVITKLDLMDEGTDARDVLENKLLPLRRGYIGVVNRSQKDIEGRKDIRAALAAERKFFLSHPAYRHMADRMGTPHLQKTLNQQLTNHIRESLPTLRSKLQSQLLSLEKEVEEYKNFRPDDPTRKTKALLQMVQQFGVDFEKRIEGSGDQVDTLELSGGARINRIFHERFPFELVKMEFDEKDLRREISYAIKNIHGVRTGLFTPDLAFEAIVKKQVVKLKEPCLKCVDLVIQELISTVRQCTSKLSSYPRLREETERIVTTYIREREGRTKDQILLLIDIEQSYINTNHEDFIGFANAQ.... The pIC50 is 4.8. (5) The compound is CNS(=O)(=O)c1ccc(-c2oc3ncnc(NC[C@@H]4CCCO4)c3c2-c2ccccc2)cc1. The target protein (O54967) has sequence MQPEEGTGWLLELLSEVQLQQYFLRLRDDLNITRLSHFEYVKNEDLEKIGMGRPGQRRLWEAVKRRKAMCKRKSWMSKVFSGKRLEAEFPSQHSQSTFRKPSPTPGSLPGEGTLQSLTCLIGEKDLRLLEKLGDGSFGVVRRGEWDAPAGKTVSVAVKCLKPDVLSQPEAMDDFIREVNAMHSLDHRNLIRLYGVVLTLPMKMVTELAPLGSLLDRLRKHQGHFLLGTLSRYAVQVAEGMAYLESKRFIHRDLAARNLLLATRDLVKIGDFGLMRALPQNDDHYVMQEHRKVPFAWCAPESLKTRTFSHASDTWMFGVTLWEMFTYGQEPWIGLNGSQILHKIDKEGERLPRPEDCPQDIYNVMVQCWAHKPEDRPTFVALRDFLLEAQPTDMRALQDFEEPDKLHIQMNDVITVIEGRAENYWWRGQNTRTLCVGPFPRNVVTSVAGLSAQDISQPLQNSFIHTGHGDSDPRHCWGFPDRIDELYLGNPMDPPDLLSVE.... The pIC50 is 6.5. (6) The compound is C[C@H]1COc2c(N3CCN(C)CC3)c(F)cc3c(=O)c(C(=O)O)cn1c23. The target protein sequence is MGNRIPEEVVEQIRTSSDIVEVIGEYVQLRKQGRNYFGLCPFHGENSPSFSVSSDKQIFHCFGCGEGGNVFSFLMKMEGLAFTEAVQKLGERNGIAVAEYTSGQGQQEDISDDTVIMQQAHELLKKYYHHLLVNTEEGNEALSYLLKRGITKEMIEKFEIGYASPAWDAATKILQKRGLSLSSMEQAGLLIRSEKDGSHYDRFRGRVMFPIYTLQGKVIAFSGRALGDDTPKYLNSPETPIFHKSKLLYNFHQARPFIRKRGQVVLFEGYADVLAAVKSGVEEAVATMGTALTEEQAKLLRRNVETVVLCYDGDKAGREATMKAGQLLLQVGCQVKVTSLPDKLDPDEYVQQYGTTAFENLVKSSISFVGFKINYLRLGKNLQDESGKEEYVKSVLKELSLLQDAMQAESYLKSLSQEFSYSMETLLNQLHQYRKEQKVQQKQVKQVSKPSQIVQTKPKLTGFERAEREIIYHMLQSPEVAVRMESHIEDFHTEEHKGIL.... The pIC50 is 4.1. (7) The drug is CC(=O)N[C@@H]1[C@@H](O)C=C(C(=O)O)O[C@H]1[C@@H](O)[C@@H](O)CNC(=O)c1ccccc1. 
The target protein (Q8WWR8) has sequence MGVPRTPSRTVLFERERTGLTYRVPSLLPVPPGPTLLAFVEQRLSPDDSHAHRLVLRRGTLAGGSVRWGALHVLGTAALAEHRSMNPCPVHDAGTGTVFLFFIAVLGHTPEAVQIATGRNAARLCCVASRDAGLSWGSARDLTEEAIGGAVQDWATFAVGPGHGVQLPSGRLLVPAYTYRVDRRECFGKICRTSPHSFAFYSDDHGRTWRCGGLVPNLRSGECQLAAVDGGQAGSFLYCNARSPLGSRVQALSTDEGTSFLPAERVASLPETAWGCQGSIVGFPAPAPNRPRDDSWSVGPGSPLQPPLLGPGVHEPPEEAAVDPRGGQVPGGPFSRLQPRGDGPRQPGPRPGVSGDVGSWTLALPMPFAAPPQSPTWLLYSHPVGRRARLHMGIRLSQSPLDPRSWTEPWVIYEGPSGYSDLASIGPAPEGGLVFACLYESGARTSYDEISFCTFSLREVLENVPASPKPPNLGDKPRGCCWPS. The pIC50 is 3.1.
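The pIC50 values in the records above follow the stated definition, pIC50 = -log10(IC50 in M). Below is a minimal sketch of that conversion in both directions. The function names are hypothetical and not part of the dataset, and the nanomolar helper assumes the raw IC50 is reported in nM, which should be checked against the actual source data.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def ic50_nanomolar_to_pic50(ic50_nm: float) -> float:
    """Assumed convenience helper: rescale nM to M, then convert."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


if __name__ == "__main__":
    # A pIC50 of 4.0, as in record (1), corresponds to an IC50 of 1e-4 M.
    print(ic50_molar_to_pic50(1e-4))
    # Recover the molar IC50 implied by record (5), whose pIC50 is 6.5.
    print(pic50_to_ic50_molar(6.5))
```

Because the scale is logarithmic, a difference of 1.0 in pIC50 corresponds to a tenfold difference in IC50, so higher values mean more potent binding.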