From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COc1ccc(N2CCN(C(=O)c3cc4c(s3)-c3ccccc3SC4)CC2)cc1. The target protein sequence is MCGNTMSVPLLTDAATVSGAERETAAVIFLHGLGDTGHSWADALSTIRLPHVKYICPHAPRIPVTLNMKMVMPSWFDLMGLSPDAPEDEAGIKKAAENIKALIEHEMKNGIPANRIVLGGFAQGGALSLYTALTCPHPLAGIVALSCWLPLHRAFPQAANGSAKDLAILQCHGELDPMVPVRFGALTAEKLRSVVTPARVQFKTYPGVMHSSCPQEMAAVKEFLEKLLPPV. The pIC50 is 4.0. (2) The target protein (Q04750) has sequence MSGDHLHNDSQIEADFRLNDSHKHKDKHKDREHRHKEHKKDKDKDREKSKHSNSEHKDSEKKHKEKEKTKHKDGSSEKHKDKHKDRDKERRKEEKIRAAGDAKIKKEKENGFSSPPRIKDEPEDDGYFAPPKEDIKPLKRLRDEDDADYKPKKIKTEDIKKEKKRKSEEEEDGKLKKPKNKDKDKKVAEPDNKKKKPKKEEEQKWKWWEEERYPEGIKWKFLEHKGPVFAPPYEPLPESVKFYYDGKVMKLSPKAEEVATFFAKMLDHEYTTKEIFRKNFFKDWRKEMTNDEKNTITNLSKCDFTQMSQYFKAQSEARKQMSKEEKLKIKEENEKLLKEYGFCVMDNHRERIANFKIEPPGLFRGRGNHPKMGMLKRRIMPEDIIINCSKDAKVPSPPPGHKWKEVRHDNKVTWLVSWTENIQGSIKYIMLNPSSRIKGEKDWQKYETARRLKKCVDKIRNQYREDWKSKEMKVRQRAVALYFIDKLALRAGNEKEEGET.... The pIC50 is 4.4. The compound is CC[C@@]1(O)C(=O)OCc2c1cc1n(c2=O)Cc2cc3ccccc3nc2-1. (3) The small molecule is COc1cc(O)c2c(c1)O[C@]1(C)[C@@H](O)[C@@H](C)C[C@H](O)[C@@H]1C2=O. The target protein (P47199) has sequence MATGQKLMRAIRVFEFGGPEVLKLQSDVVVPVPQSHQVLIKVHACGVNPVETYIRSGAYSRKPALPYTPGSDVAGIIESVGDKVSAFKKGDRVFCYSTVSGGYAEFALAADDTIYPLPETLNFRQGAALGIPYFTACRALFHSARARAGESVLVHGASGGVGLATCQIARAHGLKVLGTAGSEEGKKLVLQNGAHEVFNHKEANYIDKIKMSVGDKDKGVDVIIEMLANENLSNDLKLLSHGGRVVVVGCRGPIEINPRDTMAKETSIIGVSLSSSTKEEFQQFAGLLQAGIEKGWVKPVIGSEYPLEKAAQAHEDIIHGSGKTGKMILLL. The pIC50 is 4.3. (4) The pIC50 is 8.2. 
The target protein sequence is APITAYAQQTRGLLGTIVTSLTGRDKNVVTGEVQVLSTTTQTFLGTTVGGVMWTVYHGAGSRTLAGAKHPALQMYTNVDQDLVGWPAPPGAKSLELCTCGSADLYLVTRDADVIPARRRGDSTASLLSPRPLACLKGSSGGPVMCPSGHVAGIFRAAVCTRGVAKALQFIPVETLSTQARSPSFSDNSTPPAVPQSYQVGYLHAPTGSGKSTKVPAAYVAQGYNVLVLNPSVAATLGFGSYMSRAHGIDPNIRTGNRTVTTGAKLTYSTYGKFLADGGCSGGAYDVIICDECHAQDATSILGIGTVLDQAETAGVRLTVLATATPPGSITVPHSNIEEVALGSEGGIPFYGKAIPIAQLEGGRHLIFCHSRKKCDELASKLRGMGLNAVAYYRGLDVSVIPTVGDVVVCATDALMTGFTGDFDSVIDCNVAVEQYVDFSLDPTFSIETRTAPQDAVSRSQRRGRTGRGRPSTYRYVTPGERPSGMFDSVVLCECYDAGCS.... The drug is C=C[C@@H]1C[C@]1(NC(=O)[C@@H]1C[C@@]2(CN1C(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@@H]1CCCCN1C(C)C)C(C)(C)C)C(C)(C)C)C(C)(C)C21CCC1)C(=O)NS(=O)(=O)N1CCCC1. (5) The drug is CC1(C)CCC(n2cnc(C(Cc3ccc(N)nc3)C(=O)O)c2)CC1. The target protein (P15086) has sequence MLALLVLVTVALASAHHGGEHFEGEKVFRVNVEDENHINIIRELASTTQIDFWKPDSVTQIKPHSTVDFRVKAEDTVTVENVLKQNELQYKVLISNLRNVVEAQFDSRVRATGHSYEKYNKWETIEAWTQQVATENPALISRSVIGTTFEGRAIYLLKVGKAGQNKPAIFMDCGFHAREWISPAFCQWFVREAVRTYGREIQVTELLDKLDFYVLPVLNIDGYIYTWTKSRFWRKTRSTHTGSSCIGTDPNRNFDAGWCEIGASRNPCDETYCGPAAESEKETKALADFIRNKLSSIKAYLTIHSYSQMMIYPYSYAYKLGENNAELNALAKATVKELASLHGTKYTYGPGATTIYPAAGGSDDWAYDQGIRYSFTFELRDTGRYGFLLPESQIRATCEETFLAIKYVASYVLEHLY. The pIC50 is 8.5. (6) The compound is Nc1ncnc2c1ncn2CCCCOP(=O)(O)OP(=O)(O)OC[C@H]1OC(O)[C@H](O)[C@@H]1O. The target protein (O94759) has sequence MEPSALRKAGSEQEEGFEGLPRRVTDLGMVSNLRRSNSSLFKSWRLQCPFGNNDKQESLSSWIPENIKKKECVYFVESSKLSDAGKVVCQCGYTHEQHLEEATKPHTFQGTQWDPKKHVQEMPTDAFGDIVFTGLSQKVKKYVRVSQDTPSSVIYHLMTQHWGLDVPNLLISVTGGAKNFNMKPRLKSIFRRGLVKVAQTTGAWIITGGSHTGVMKQVGEAVRDFSLSSSYKEGELITIGVATWGTVHRREGLIHPTGSFPAEYILDEDGQGNLTCLDSNHSHFILVDDGTHGQYGVEIPLRTRLEKFISEQTKERGGVAIKIPIVCVVLEGGPGTLHTIDNATTNGTPCVVVEGSGRVADVIAQVANLPVSDITISLIQQKLSVFFQEMFETFTESRIVEWTKKIQDIVRRRQLLTVFREGKDGQQDVDVAILQALLKASRSQDHFGHENWDHQLKLAVAWNRVDIARSEIFMDEWQWKPSDLHPTMTAALISNKPEFV.... The pIC50 is 4.0.
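The pIC50 definition used above (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the bindingdb_ic50 dataset tooling: the function names `ic50_to_pic50` and `pic50_to_ic50` and the unit table `_TO_MOLAR` are hypothetical helpers introduced here, and the sketch assumes the raw IC50 is given in one of a few common concentration units.

```python
import math

# Hypothetical helper table: factors that convert a concentration unit to molar (M).
_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ic50_to_pic50(ic50, unit="nM"):
    """Return pIC50 = -log10(IC50 in M) for an IC50 given in `unit`."""
    if ic50 <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50 * _TO_MOLAR[unit])


def pic50_to_ic50(pic50, unit="nM"):
    """Invert the transform: return the IC50 in `unit` for a given pIC50."""
    return 10.0 ** (-pic50) / _TO_MOLAR[unit]


if __name__ == "__main__":
    # A label of 4.0 corresponds to an IC50 of 100 uM (weak binder);
    # each additional pIC50 unit is a tenfold lower IC50 (more potent).
    print(ic50_to_pic50(100, unit="uM"))
    print(pic50_to_ic50(8.2, unit="nM"))
```

Because the scale is logarithmic, the difference between labels such as 4.0 and 8.2 in the examples above spans more than four orders of magnitude in IC50.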