Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50, drug-target binding data from BindingDB using IC50 measurements. (1) The drug is CCN1CCN(CCC(=O)Nc2ccc3c(c2)C(=O)c2cc(NC(=O)CCN4CCN(CC)CC4)ccc2-3)CC1. The target protein (O14746) has sequence MPRAPRCRAVRSLLRSHYREVLPLATFVRRLGPQGWRLVQRGDPAAFRALVAQCLVCVPWDARPPPAAPSFRQVSCLKELVARVLQRLCERGAKNVLAFGFALLDGARGGPPEAFTTSVRSYLPNTVTDALRGSGAWGLLLRRVGDDVLVHLLARCALFVLVAPSCAYQVCGPPLYQLGAATQARPPPHASGPRRRLGCERAWNHSVREAGVPLGLPAPGARRRGGSASRSLPLPKRPRRGAAPEPERTPVGQGSWAHPGRTRGPSDRGFCVVSPARPAEEATSLEGALSGTRHSHPSVGRQHHAGPPSTSRPPRPWDTPCPPVYAETKHFLYSSGDKEQLRPSFLLSSLRPSLTGARRLVETIFLGSRPWMPGTPRRLPRLPQRYWQMRPLFLELLGNHAQCPYGVLLKTHCPLRAAVTPAAGVCAREKPQGSVAAPEEEDTDPRRLVQLLRQHSSPWQVYGFVRACLRRLVPPGLWGSRHNERRFLRNTKKFISLGKH.... The pIC50 is 4.8. (2) The drug is O=C(O)CNC(=O)c1c(=O)oc(O)c2ccc(Br)cc12. The target protein (P59722) has sequence PRAQPAPAQPRVAPPPGGAPGAARAGGAARRGDSSTAASRVPGPEDATQAGSGPGPAEPSSEDPPPSRSPGPERASLCPAGGGPGEALSPSGGLRPNGQTKPLPALKLALEYIVPCMNKHGICVVDDFLGRETGQQIGDEVRALHDTGKFTDGQLVSQKSDSSKDIRGDKITWIEGKEPGCETIGLLMSSMDDLIRHCSGKLGNYRINGRTKAMVACYPGNGTGYVRHVDNPNGDGRCVTCIYYLNKDWDAKVSGGILRIFPEGKAQFADIEPKFDRLLFFWSDRRNPHEVQPAYATRYAITVWYFDADERARAKVKYLTGEKGVRVELKPNSVSKDV. The pIC50 is 6.8. (3) The drug is Cc1cnc(-c2ccc([C@@H]3[C@@H](c4ccccc4)[C@@]3(F)C(=O)NO)cc2)nc1. The target protein sequence is TKPRFTTGLVYDTLMLKHQCTCGSSSSHPEHAGRIQSIWSRLQETGLRGKCECIRGRKATLEELQTVHSEAHTLLYGTNPLNGSGSDSKKLLGSLASVFVRLPCGGVGVDSDTIWNEVHSAGAARLAVGCVVELVFKVATGELKNGFAVVRPPGHHAEESTPMGFCYFNSVAVAAKLLQQRLSVSKILIVDWDVHHGNGTQQAFYSDPSVLYMSLHRYDDGNFFPGSGAPDEVGTGPGVGFNVNMAFTGGLDPPMGDAEYLAAFRTVVMPIASEFAPDVVLVSSGFDAVEGHPTPLGGYNLSARCFGYLTKQLMGLAGGRIVLALEGGHDLTAICDASEACVSALLGNELDPLPEKVLQQRPNANAVRSMEKVMEIHSKYWRCLQRTTSTAGRSLIEAQTCENEEAETVT. The pIC50 is 7.7. 
(4) The drug is CCc1nc(N)nc(N)c1-c1ccc(Cl)cc1. The target protein (P07686) has sequence MELCGLGLPRPPMLLALLLATLLAAMLALLTQVALVVQVAEAARAPSVSAKPGPALWPLPLSVKMTPNLLHLAPENFYISHSPNSTAGPSCTLLEEAFRRYHGYIFGFYKWHHEPAEFQAKTQVQQLLVSITLQSECDAFPNISSDESYTLLVKEPVAVLKANRVWGALRGLETFSQLVYQDSYGTFTINESTIIDSPRFSHRGILIDTSRHYLPVKIILKTLDAMAFNKFNVLHWHIVDDQSFPYQSITFPELSNKGSYSLSHVYTPNDVRMVIEYARLRGIRVLPEFDTPGHTLSWGKGQKDLLTPCYSRQNKLDSFGPINPTLNTTYSFLTTFFKEISEVFPDQFIHLGGDEVEFKCWESNPKIQDFMRQKGFGTDFKKLESFYIQKVLDIIATINKGSIVWQEVFDDKAKLAPGTIVEVWKDSAYPEELSRVTASGFPVILSAPWYLDLISYGQDWRKYYKVEPLDFGGTQKQKQLFIGGEACLWGEYVDATNLTP.... The pIC50 is 5.3. (5) The compound is O=C(O)CCN1Cc2ccccc2C1=O. The target protein (P53686) has sequence MSVSTASTEMSVRKIAAHMKSNPNAKVIFMVGAGISTSCGIPDFRSPGTGLYHNLARLKLPYPEAVFDVDFFQSDPLPFYTLAKELYPGNFRPSKFHYLLKLFQDKDVLKRVYTQNIDTLERQAGVKDDLIIEAHGSFAHCHCIGCGKVYPPQVFKSKLAEHPIKDFVKCDVCGELVKPAIVFFGEDLPDSFSETWLNDSEWLREKITTSGKHPQQPLVIVVGTSLAVYPFASLPEEIPRKVKRVLCNLETVGDFKANKRPTDLIVHQYSDEFAEQLVEELGWQEDFEKILTAQGGMGDNSKEQLLEIVHDLENLSLDQSEHESADKKDKKLQRLNGHDSDEDGASNSSSSQKAAKE. The pIC50 is 4.7. (6) The drug is CNC(=O)NC(N)=[NH+]CCC[C@@H]1NC(=O)[C@@H](Cc2ccccc2)NC(=O)C[C@@H](C(=O)O)NC(=O)C[C@@H](C(=O)[O-])NC(=O)[C@H](Cc2ccccc2)N(C)C1=O. The target protein (P11797) has sequence MSTRKAVIGYYFIPTNQINNYTETDTSVVPFPVSNITPAKAKQLTHINFSFLDINSNLECAWDPATNDAKARDVVNRLTALKAHNPSLRIMFSIGGWYYSNDLGVSHANYVNAVKTPAARTKFAQSCVRIMKDYGFDGVDIDWEYPQAAEVDGFIAALQEIRTLLNQQTIADGRQALPYQLTIAGAGGAFFLSRYYSKLAQIVAPLDYINLMTYDLAGPWEKITNHQAALFGDAAGPTFYNALREANLGWSWEELTRAFPSPFSLTVDAAVQQHLMMEGVPSAKIVMGVPFYGRAFKGVSGGNGGQYSSHSTPGEDPYPNADYWLVGCDECVRDKDPRIASYRQLEQMLQGNYGYQRLWNDKTKTPYLYHAQNGLFVTYDDAESFKYKAKYIKQQQLGGVMFWHLGQDNRNGDLLAALDRYFNAADYDDSQLDMGTGLRYTGVGPGNLPIMTAPAYVPGTTYAQGALVSYQGYVWQTKWGYITSAPGSDSAWLKVGRLA. The pIC50 is 5.0.
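The pIC50 definition in the task statement, pIC50 = -log10(IC50 in M), can be sketched in code to make the scale of the labels above easier to read. This is a minimal illustration, not part of the dataset tooling; the function names `pic50_from_ic50_molar` and `ic50_nm_from_pic50` are hypothetical helpers introduced here.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the definition and report IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # Labels taken from examples (1) and (3) above.
    for label in (4.8, 7.7):
        print(f"pIC50 {label} corresponds to IC50 of about {ic50_nm_from_pic50(label):.0f} nM")
```

Under this definition, the label 4.8 in example (1) corresponds to an IC50 of roughly 16 µM, while 7.7 in example (3) corresponds to roughly 20 nM, which is why a higher pIC50 means a more potent compound.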