From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Cc1ccc(C(=O)NCc2cccc(C(C)(C)C)c2)cc1Nc1ncnc2cnc(N3CCCN(C)CC3)nc12. The target protein sequence is MAALSGGGGGGAEPGQALFNGDMEPEAGAGAGAAASSAADPAIPEEVWNIKQMIKLTQEHIEALLDKFGGEHNPPSIYLEAYEEYTSKLDALQQREQQLLESLGNGTDFSVSSSASMDTVTSSSSSSLSVLPSSLSVFQNPTDVARSNPKSPQKPIVRVFLPNKQRTVVPARCGVTVRDSLKKALMMRGLIPECCAVYRIQDGEKKPIGWDTDISWLTGEELHVEVLENVPLTTHNFVRKTFFTLAFCDFCRKLLFQGFRCQTCGYKFHQRCSTEVPLMCVNYDQLDLLFVSKFFEHHPIPQEEASLAETALTSGSSPSAPASDSIGPQILTSPSPSKSIPIPQPFRPADEDHRNQFGQRDRSSSAPNVHINTIEPVNIDDLIRDQGFRGDGGSTTGLSATPPASLPGSLTNVKALQKSPGPQRERKSSSSSEDRNRMKTLGRRDSSDDWEIPDGQITVGQRIGSGSFGTVYKGKWHGDVAVKMLNVTAPTPQQLQAFKN.... The pIC50 is 6.8. (2) The drug is CC[C@H](C)[C@H](NC(=O)[C@H](CO)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](NC(=O)[C@H](C)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@@H](NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](NC(=O)[C@H](C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)Cc1cnc[nH]1)C(C)C)[C@@H](C)O)[C@@H](C)O)C(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(N)=O. 
The target protein (P32241) has sequence MRPPSPLPARWLCVLAGALAWALGPAGGQAARLQEECDYVQMIEVQHKQCLEEAQLENETIGCSKMWDNLTCWPATPRGQVVVLACPLIFKLFSSIQGRNVSRSCTDEGWTHLEPGPYPIACGLDDKAASLDEQQTMFYGSVKTGYTIGYGLSLATLLVATAILSLFRKLHCTRNYIHMHLFISFILRAAAVFIKDLALFDSGESDQCSEGSVGCKAAMVFFQYCVMANFFWLLVEGLYLYTLLAVSFFSERKYFWGYILIGWGVPSTFTMVWTIARIHFEDYGCWDTINSSLWWIIKGPILTSILVNFILFICIIRILLQKLRPPDIRKSDSSPYSRLARSTLLLIPLFGVHYIMFAFFPDNFKPEVKMVFELVVGSFQGFVVAILYCFLNGEVQAELRRKWRRWHLQGVLGWNPKYRHPSGGSNGATCSTQVSMLTRVSPGARRSSSFQAEVSLV. The pIC50 is 8.9. (3) The drug is COC(=O)C1C2CC[C@H](C[C@@H]1OC(=O)Nc1cccc([N+](=O)[O-])c1)N2C. The target protein (P27922) has sequence MSEGRCSVAHMSSVVAPAKEANAMGPKAVELVLVKEQNGVQLTNSTLLNPPQSPTEAQDRETWSKKADFLLSVIGFAVDLANVWRFPYLCYKNGGGAFLVPYLFFMVVAGVPLFYMELALGQFNREGAAGVWKICPILRGVGYTAILISLYIGFFYNVIIAWALHYLLSSFTTELPWTHCNHSWNSPRCSDARAPNASSGPNGTSRTTPAAEYFERGVLHLHESQGIDDLGPPRWQLTSCLVLVIVLLYFSLWKGVKTSGKVVWITATMPYVVLFALLLRGITLPGAVDAIRAYLSVDFHRLCEASVWIDAAIQICFSLGVGLGVLIAFSSYNKFTNNCYRDAIITTSVNSLTSFSSGFVVFSFLGYMAQKHSVPIGDVAKDGPGLIFIIYPEALATLPLSSVWAVVFFVMLLTLGIDSAMGGMESVITGLADEFQLLHRHRELFTLLVVLATFLLSLFCVTNGGIYVFTLLDHFAAGTSILFGVLMEVIGVAWFYGVWQ.... The pIC50 is 7.4. (4) The small molecule is CN1C(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN)NCc2cc(Br)cnc2Sc2ccccc2CNC(=O)[C@@H]1Cc1c[nH]c2ccccc12. The target protein (A3M3H0) has sequence MNKVYKVIWNASIGAWVATSEIAKSKTKTKSKTLNLSAAVLSGVICFAPNAFAGTNTEGGIGQGTSISGTTSCREGSANTANQKDIAIGCGAQTQDRTGSNIANRNNPYNNSTGAYAGAMKQGGAISVGTGAVVEKGLGTAIGSYATTQGISGVAIGTGALSSGNTALAVGRQSAATADFSQAIGNVAAATGKGSLAIGHSATAEGYRSIAIGSPDIENADPVAGQAGAAYQPKMATKATGKDSIAFGGGAVATEENALAIGAFSESKGKKSVAIGTGAKAQKDNAVVIGDQAEASFEGGVAIGKGARSEAENSIALGKDSKASQATGESFLTKQSAPTGVLSIGDIGTERRIQNVADGAADSDAATVRQLKAARTHYVSINDNGQPGGNFENDGATGRNAIAVGVNASAAGREAMAIGGNAQAIGSGAIAMGSSSQTVGRGDVAIGRNASTQGAEGVNSNQSVAIGDQTKAIGDQSVAIGADVIAKGNSSVAIGGDDVD.... The pIC50 is 5.5. (5) The compound is CCC[C@H](c1ccc(C(=O)O)c(Oc2cccc(Cl)c2)c1)N1CCC[C@H](n2cc(C)c(=O)[nH]c2=O)C1. 
The target protein (P23919) has sequence MAARRGALIVLEGVDRAGKSTQSRKLVEALCAAGHRAELLRFPERSTEIGKLLSSYLQKKSDVEDHSVHLLFSANRWEQVPLIKEKLSQGVTLVVDRYAFSGVAFTGAKENFSLDWCKQPDVGLPKPDLVLFLQLQLADAAKRGAFGHERYENGAFQERALRCFHQLMKDTTLNWKMVDASKSIEAVHEDIRVLSEDAIRTATEKPLGELWK. The pIC50 is 3.8. (6) The compound is CC[C@H](C)[C@H](NC(=O)[C@H](CCCCN=[N+]=[N-])NC(=O)OCc1ccccc1)C(=O)N[C@@H](CC(C)C)B1O[C@@H]2C[C@@H]3C[C@@H](C3(C)C)[C@]2(C)O1. The target protein (P28063) has sequence MALLDLCGAARGQRPEWAALDAGSGGRSDPGHYSFSAQAPELALPRGMQPTAFLRSFGGDQERNVQIEMAHGTTTLAFKFQHGVIVAVDSRATAGSYISSLRMNKVIEINPYLLGTMSGCAADCQYWERLLAKECRLYYLRNGERISVSAASKLLSNMMLQYRGMGLSMGSMICGWDKKGPGLYYVDDNGTRLSGQMFSTGSGNTYAYGVMDSGYRQDLSPEEAYDLGRRAIAYATHRDNYSGGVVNMYHMKEDGWVKVESSDVSDLLYKYGEAAL. The pIC50 is 6.9. (7) The drug is O=C([O-])c1cc(NC(=S)NCCCCCCOP(=O)([O-])OP(=O)([O-])OCC2OC(n3ccc(=O)[nH]c3=O)C(O)C2O)ccc1-c1c2ccc(=O)cc-2oc2cc(O)ccc12. The target protein (P9WIQ1) has sequence MQPMTARFDLFVVGSGFFGLTIAERVATQLDKRVLVLERRPHIGGNAYSEAEPQTGIEVHKYGAHLFHTSNKRVWDYVRQFTDFTDYRHRVFAMHNGQAYQFPMGLGLVSQFFGKYFTPEQARQLIAEQAAEIDTADAQNLEEKAISLIGRPLYEAFVKGYTAKQWQTDPKELPAANITRLPVRYTFDNRYFSDTYEGLPTDGYTAWLQNMAADHRIEVRLNTDWFDVRGQLRPGSPAAPVVYTGPLDRYFDYAEGRLGWRTLDFEVEVLPIGDFQGTAVMNYNDLDVPYTRIHEFRHFHPERDYPTDKTVIMREYSRFAEDDDEPYYPINTEADRALLATYRARAKSETASSKVLFGGRLGTYQYLDMHMAIASALNMYDNVLAPHLRDGVPLLQDGA. The pIC50 is 5.2. (8) The drug is CC(=O)C=CC(=O)NC[C@H](N)C(=O)O. The target protein sequence is MCGIFGYCNYLVERSRGEIIDTLVDGLQRLEYRGYDSTGIAIDGDEADSTFIYKQIGKVSALKEEITKQNPNRDVTFVSHCGIAHTRWATHGRPEQVNCHPQRSDPEDQFVVVHNGIITNFRELKTLLINKGYKFESDTDTECIAKLYLHLYNTNLQNGHDLDFHELTKLVLLELEGSYGLLCKSCHYPNEVIATRKGSPLLIGVKSEKKLKVDFVDVEFPEENAGQPEIPLKSNNKSFGLGPKKAREFEAGSQNANLLPIAANEFNLRHSQSRAFLSEDGSPTPVEFFVSSDAASVVKHTKKVLFLEDDDLAHIYDGELHIHRSRREVGASMTRSIQTLEMELAQIMKGPYDHFMQKEIYEQPESTFNTMRGRIDYENNKVILGGLKAWLPVVRRARRLIMIACGTSYHSCLATRAIFEELSDIPVSVELASDFLDRKCPVFRDDVCVFVSQSGETADTMLALNYCLERGALTVGIVNSVGSSISRVTHCGVHINAGPE.... The pIC50 is 2.2.
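
The pIC50 definition stated at the top, pIC50 = -log10(IC50 in M), can be sketched as a small conversion helper. This is a minimal illustration only; the function names below are hypothetical and are not part of the bindingdb_ic50 dataset or of any BindingDB tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nM (1 nM = 1e-9 M)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Back-convert two of the labels listed above to molar IC50 values.
    for label in (6.8, 2.2):
        print(f"pIC50 {label} -> IC50 {pic50_to_ic50(label):.3e} M")
```

Under this definition, the label 6.8 in example (1) corresponds to an IC50 of roughly 158 nM, and the label 2.2 in example (8) to roughly 6.3 mM, which is why higher values mean more potent binding.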