Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50, drug-target binding data from BindingDB using IC50 measurements. (1) The pIC50 is 5.0. The drug is CCCCC1NC(CO)C(O)C(O)C1O. The target protein sequence is MLASLSSSSRAAISCIPLCLLFLTLASSNGVFAAAPPKVGSGYKLVSLVEHPEGGALVGYLQVKQRTSTYGPDIPLLRLYVKHETKDRIRVQITDADKPRWEVPYNLLQREPAPPVTGGRITGVPFAAGEYPGEELVFTYGRDPFWFAVHRKSSREALFNTSCGALVFKDQYIEASTSLPRDAALYGLGENTQPGGIRLRPNDPYTIYTTDISAINLNTDLYGSHPVYVDLRSRGGHGVAHAVLLLNSNGMDVFYRGTSLTYKVIGGLLDFYLFSGPTPLAVVDQYTSMIGRPAPMPYWAFGFHQCRWGYKNLSVVEGVVEGYRNAQIPLDVIWNDDDHMDAAKDFTLDPVNYPRPKLLEFLDKIHAQGMKYIVLIDPGIAVNNTYGVYQRGMQGDVFIKLDGKPYLAQVWPGPVYFPDFLNPNGVSWWIDEVRRFHDLVPVDGLWIDMNEASNFCTGKCEIPTTHLCPLPNTTTPWVCCLDCKNLTNTRWDEPPYKINA.... (2) The small molecule is O=C(N[C@H]1CCN(Cc2ccc(OC3CCNCC3)c(Br)c2)C1)c1ccc(Cl)c(Cl)c1. The target protein (Q9HB55) has sequence MDLIPNFAMETWVLVATSLVLLYIYGTHSHKLFKKLGIPGPTPLPFLGTILFYLRGLWNFDRECNEKYGEMWGLYEGQQPMLVIMDPDMIKTVLVKECYSVFTNQMPLGPMGFLKSALSFAEDEEWKRIRTLLSPAFTSVKFKEMVPIISQCGDMLVRSLRQEAENSKSINLKDFFGAYTMDVITGTLFGVNLDSLNNPQDPFLKNMKKLLKLDFLDPFLLLISLFPFLTPVFEALNIGLFPKDVTHFLKNSIERMKESRLKDKQKHRVDFFQQMIDSQNSKETKSHKALSDLELVAQSIIIIFAAYDTTSTTLPFIMYELATHPDVQQKLQEEIDAVLPNKAPVTYDALVQMEYLDMVVNETLRLFPVVSRVTRVCKKDIEINGVFIPKGLAVMVPIYALHHDPKYWTEPEKFCPERFSKKNKDSIDLYRYIPFGAGPRNCIGMRFALTNIKLAVIRALQNFSFKPCKETQIPLKLDNLPILQPEKPIVLKVHLRDGIT.... The pIC50 is 5.7. (3) The small molecule is CNC(=O)C(=O)[C@@H](NC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)c1ccccc1)C(C)(C)C)C(C)C. 
The target protein (P08176) has sequence MKIVLAIASLLALSAVYARPSSIKTFEEYKKAFNKSYATFEDEEAARKNFLESVKYVQSNGGAINHLSDLSLDEFKNRFLMSAEAFEHLKTQFDLNAETNACSINGNAPAEIDLRQMRTVTPIRMQGGCGSCWAFSGVAATESAYLAYRNQSLDLAEQELVDCASQHGCHGDTIPRGIEYIQHNGVVQESYYRYVAREQSCRRPNAQRFGISNYCQIYPPNVNKIREALAQTHSAIAVIIGIKDLDAFRHYDGRTIIQRDNGYQPNYHAVNIVGYSNAQGVDYWIVRNSWDTNWGDNGYGYFAANIDLMMIEEYPYVVIL. The pIC50 is 7.6. (4) The drug is O=C(Nc1ccccc1)C1CCN(c2ncc(-c3ccc4c(c3)OCCCO4)s2)CC1. The target protein (Q01158) has sequence MENMENDENIVYGPEPFYPIEEGSAGAQLRKYMDRYAKLGAIAFTNALTGVDYTYAEYLEKSCCLGEALKNYGLVVDGRIALCSENCEEFFIPVLAGLFIGVGVAPTNEIYTLRELVHSLGISKPTIVFSSKKGLDKVITVQKTVTAIKTIVILDSKVDYRGYQSMDNFIKKNTPQGFKGSSFKTVEVNRKEQVALIMNSSGSTGLPKGVQLTHENAVTRFSHARDPIYGNQVSPGTAILTVVPFHHGFGMFTTLGYLTCGFRIVMLTKFDEETFLKTLQDYKCSSVILVPTLFAILNRSELLDKYDLSNLVEIASGGAPLSKEIGEAVARRFNLPGVRQGYGLTETTSAIIITPEGDDKPGASGKVVPLFKAKVIDLDTKKTLGPNRRGEVCVKGPMLMKGYVDNPEATREIIDEEGWLHTGDIGYYDEEKHFFIVDRLKSLIKYKGYQVPPAELESVLLQHPNIFDAGVAGVPDPIAGELPGAVVVLEKGKSMTEKEV.... The pIC50 is 5.4. (5) The compound is COc1cccc(Nc2nc(Cl)nc3nc[nH]c23)c1. The target protein (Q9FUJ3) has sequence MANLRLMITLITVLMITKSSNGIKIDLPKSLNLTLSTDPSIISAASHDFGNITTVTPGGVICPSSTADISRLLQYAANGKSTFQVAARGQGHSLNGQASVSGGVIVNMTCITDVVVSKDKKYADVAAGTLWVDVLKKTAEKGVSPVSWTDYLHITVGGTLSNGGIGGQVFRNGPLVSNVLELDVITGKGEMLTCSRQLNPELFYGVLGGLGQFGIITRARIVLDHAPKRAKWFRMLYSDFTTFTKDQERLISMANDIGVDYLEGQIFLSNGVVDTSFFPPSDQSKVADLVKQHGIIYVLEVAKYYDDPNLPIISKVIDTLTKTLSYLPGFISMHDVAYFDFLNRVHVEENKLRSLGLWELPHPWLNLYVPKSRILDFHNGVVKDILLKQKSASGLALLYPTNRNKWDNRMSAMIPEIDEDVIYIIGLLQSATPKDLPEVESVNEKIIRFCKDSGIKIKQYLMHYTSKEDWIEHFGSKWDDFSKRKDLFDPKKLLSPGQDI.... The pIC50 is 5.7. (6) The drug is CC(C)OC(=O)OCOP(=O)(CO[C@H](C)Cn1cnc2c(N)ncnc21)OCOC(=O)OC(C)C. 
The target protein sequence is PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTELEKEGKISKIGPENPYNTPVFAIKKKNSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLWRWGLYTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCKLLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETKLGKAGYVTNRGRQKVVTLTDTTNQKTELQAIYLALQDSGLEVNIVTDSQ.... The pIC50 is 5.1.
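
The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal Python sketch. The function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical helpers introduced here for illustration only; they are not part of the dataset or of any BindingDB tooling, and the sketch assumes IC50 values are already expressed in mol/L.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50).

    Hypothetical helper; assumes the input is already in molar units.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # An IC50 of 10 micromolar (1e-5 M) corresponds to a pIC50 of about 5.
    print(ic50_to_pic50(1e-5))
    # A larger pIC50 corresponds to a smaller IC50, i.e. a more potent compound.
    print(pic50_to_ic50(7.6) < pic50_to_ic50(5.0))
```

Under this definition, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent compounds.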