Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (Q9BSW2) has sequence MAAPDGRVVSRPQRLGQGSGQGPKGSGACLHPLDSLEQKETQEQTSGQLVMLRKAQEFFQTCDAEGKGFIARKDMQRLHKELPLSLEELEDVFDALDADGNGYLTPQEFTTGFSHFFFSQNNPSQEDAGEQVAQRHEEKVYLSRGDEDLGDMGEDEEAQFRMLMDRLGAQKVLEDESDVKQLWLQLKKEEPHLLSNFEDFLTRIISQLQEAHEEKNELECALKRKIAAYDEEIQHLYEEMEQQIKSEKEQFLLKDTERFQARSQELEQKLLCKEQELEQLTQKQKRLEGQCTALHHDKHETKAENTKLKLTNQELARELERTSWELQDAQQQLESLQQEACKLHQEKEMEVYRVTESLQREKAGLLKQLDFLRCVGGHWPVLRAPPRSLGSEGPV. The pIC50 is 6.6. The small molecule is O=C(Nc1ccc(-n2nc(C(F)(F)F)cc2C2CC2)cc1)c1ccncc1. (2) The compound is CCCCCCCCCCCCCCCCCC(=O)c1c(C(=O)O)n(CCCCCCCC(=O)O)c2ccccc12. The target protein (A4IFJ5) has sequence MSFIDPYQHIIVEHHYSHKFTVVVLRATKVTKGTFGDMLDTPDPYVELFISSTPDSRKRTRHFNNDINPVWNETFEFILDPNQENILEITLMDANYVMDETLGTTTFPISSMKVGEKKQVPFIFNQVTEMILEMSLEVCSSPDLRFSMALCDQEKAFRQQRKENIKENMKKLLGPKNSEGLHSTRDVPVVAILGSGGGFRAMVGFSGVMKALYESGILDCATYIAGLSGSTWYMSTLYSHPDFPEKGPEEINKELMKNVSHNPLLLLTPQKIKRYVESLWRKKSSGQPVTFTDIFGMLIGETLIHNRMNTTLSSLKEKVNTGQCPLPLFTCLHVKPDVSELMFADWVEFSPFEIGMAKYGTFMAPDLFGSKFFMGTVVKKYEENPLHFLMGVWGSAFSILFNRVLGVSGSQSKGSTMEEELENITAKHIVSNDSSDSDDESQGPKGTEHEEAEREYQNDNQASWVQRMLMALVSDSALFNTREGRAGKVHNFMLGLNLNT.... The pIC50 is 5.8. (3) The small molecule is C[C@H](CC(=O)O)NC(=O)c1ccc([C@@H](CCC(C)(C)C)N2C(=O)C(c3cc(Cl)cc(Cl)c3)=N[C@]23CC[C@@H](C(C)(C)C)CC3)cc1. 
The target protein (Q61606) has sequence MPLTQLHCPHLLLLLLVLSCLPEAPSAQVMDFLFEKWKLYSDQCHHNLSLLPPPTELVCNRTFDKYSCWPDTPPNTTANISCPWYLPWYHKVQHRLVFKRCGPDGQWVRGPRGQPWRNASQCQLDDEEIEVQKGVAKMYSSQQVMYTVGYSLSLGALLLALVILLGLRKLHCTRNYIHGNLFASFVLKAGSVLVIDWLLKTRYSQKIGDDLSVSVWLSDGAMAGCRVATVIMQYGIIANYCWLLVEGVYLYSLLSLATFSERSFFSLYLGIGWGAPLLFVIPWVVVKCLFENVQCWTSNDNMGFWWILRIPVFLALLINFFIFVHIIHLLVAKLRAHQMHYADYKFRLARSTLTLIPLLGVHEVVFAFVTDEHAQGTLRSTKLFFDLFLSSFQGLLVAVLYCFLNKEVQAELMRRWRQWQEGKALQEERLASSHGSHMAPAGPCHGDPCEKLQLMSAGSSSGTGCVPSMETSLASSLPRLADSPT. The pIC50 is 5.2. (4) The small molecule is CNCCN1CCC(OCc2ccc(C3CCCCC3)cc2)CC1. The target protein (Q96LA8) has sequence MSQPKKRKLESGGGGEGGEGTEEEDGAEREAALERPRRTKRERDQLYYECYSDVSVHEEMIADRVRTDAYRLGILRNWAALRGKTVLDVGAGTGILSIFCAQAGARRVYAVEASAIWQQAREVVRFNGLEDRVHVLPGPVETVELPEQVDAIVSEWMGYGLLHESMLSSVLHARTKWLKEGGLLLPASAELFIAPISDQMLEWRLGFWSQVKQHYGVDMSCLEGFATRCLMGHSEIVVQGLSGEDVLARPQRFAQLELSRAGLEQELEAGVGGRFRCSCYGSAPMHGFAIWFQVTFPGGESEKPLVLSTSPFHPATHWKQALLYLNEPVQVEQDTDVSGEITLLPSRDNPRRLRVLLRYKVGDQEEKTKDFAMED. The pIC50 is 7.5. (5) The pIC50 is 5.9. The compound is CNC(=NC#N)NCCSCc1[nH]cnc1C. The target protein (Q96FL8) has sequence MEAPEEPAPVRGGPEATLEVRGSRCLRLSAFREELRALLVLAGPAFLVQLMVFLISFISSVFCGHLGKLELDAVTLAIAVINVTGVSVGFGLSSACDTLISQTYGSQNLKHVGVILQRSALVLLLCCFPCWALFLNTQHILLLFRQDPDVSRLTQTYVTIFIPALPATFLYMLQVKYLLNQGIVLPQIVTGVAANLVNALANYLFLHQLHLGVIGSALANLISQYTLALLLFLYILGKKLHQATWGGWSLECLQDWASFLRLAIPSMLMLCMEWWAYEVGSFLSGILGMVELGAQSIVYELAIIVYMVPAGFSVAASVRVGNALGAGDMEQARKSSTVSLLITVLFAVAFSVLLLSCKDHVGYIFTTDRDIINLVAQVVPIYAVSHLFEALACTSGGVLRGSGNQKVGAIVNTIGYYVVGLPIGIALMFATTLGVMGLWSGIIICTVFQAVCFLGFIIQLNWKKACQQAQVHANLKVNNVPRSGNSALPQDPLHPGCPEN.... (6) The compound is Cc1cc(Cl)cc2c(COS(N)(=O)=O)cc(-c3ccc(Br)cc3)nc12. 
The target protein (P78600) has sequence MIIIKRFLHIKTVPKSYGNQLSKFKYSKQIPTHEVLTKLGYITYPRAGLVNWSKMGLLIQNKISQIIRQRMDEIQFEEVSLSLISHKELWKLTNRWDQEEIFKLVGDEYLLVPTAEEEITNYVKKQFLESYKNFPLALYQINPKFRNEKRPRGGLLRGKEFLMKDAYSFDLNESEAMKTYEKVVGAYHKIFQDLGIPYVKAEADSGDIGGSLSHEWHYLNSSGEDTVFECNECHNVSNMEKALSYPKEIDETIEVSVIYFTTEDKSTLICAYYPSNRVLEPKFIQNEIPDIDLDSINDLSEFNHDISTRIVRIMDSRLSSRSKFPDFPISNFINRSLITTLTDIPIVLAQEGEICGHCEEGKLSASSAIEVGHTFYLGDKYSKPLDLEVDVPTSNNSIEKQRIMMGCYGIGISRIIAAIAEINRDEKGLKWPRSIAPWEVTVVEVSKQKQLKNVNDNNHHNNPQDNFQEIYNILNQANIDYRLDNRSDSMGKKLKQSDLL.... The pIC50 is 5.6. (7) The small molecule is Cc1cccc(NC(=O)c2ccc3[nH]c4c(c3c2)C(C)CNC4=O)c1. The target protein sequence is QQFPQFHVKSGLQIKKNAIIDDYKVTSQVLGLGINGKVLQIFNKRTQEKFALKMLQDCPKARREVELHWRASQCPHIVRIVDVYENLYAGRKCLLIVMECLDGGELFSRIQDRGDQAFTEREASEIMKSIGEAIQYLHSINIAHRDVKPENLLYTSKRPNAILKLTDFGFAKETTSHNSLTTPCYTPYYVAPEVLGPEKYDKSCDMWSLGVIMYILLCGYPPFYSNHGLAISPGMKTRIRMGQYEFPNPEWSEVSEEVKMLIRNLLKTEPTQRMTITEFMNHPWIMQSTKVPQTPLHTSRVLKEDKERWEDVKEEMTSALATMR. The pIC50 is 5.5.
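Note on the label scale: the pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration only. The function names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical and not part of the dataset, and the use of nanomolar as the input unit is an assumption made for the example.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed unit) to pIC50."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: pIC50 back to IC50 in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # M -> nM


if __name__ == "__main__":
    # An IC50 of 1000 nM (1 micromolar) corresponds to a pIC50 of about 6.
    print(round(ic50_nm_to_pic50(1000.0), 3))
    # A pIC50 of 6.6, as in example (1), corresponds to roughly 251 nM.
    print(round(pic50_to_ic50_nm(6.6), 1))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50.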