Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50, drug-target binding data from BindingDB using IC50 measurements.
(1) The compound is O=C1c2cccc(O)c2C(=O)c2c(O)cc(CO)cc21. The target protein (Q9Z0J5) has sequence MAAIVAALRGSSGRFRPQTRVLTRGTRGAAGAASAAGGQQNFDLLVIGGGSGGLACAKEAAQLGRKVAVADYVEPSPRGTKWGLGGTCVNVGCIPKKLMHQAALLGGMIRDAQHYGWEVAQPVQHNWKAMAEAVQNHVKSLNWGHRVQLQDRKVKYFNIKASFVNEHTVHGVDKAGKVTQLSAKHIVIATGGRPKYPTQVKGALEHGITSDDIFWLKESPGKTLVVGASYVALECAGFLTGIGLDTTVMMRSVPLRGFDQQMASLVTEHMESHGTRFLKGCVPSLIRKLPTNQLQVTWEDLASGKEDVGTFDTVLWAIGRVPETRNLNLEKAGVNTNPKNQKIIVDAQEATSVPHIYAIGDVAEGRPELTPTAIKAGKLLAQRLFGKSSTLMNYSNVPTTVFTPLEYGCVGLSEEEAVALHGQEHIEVYHAYYKPLEFTVADRDASQCYIKMVCMREPPQLVLGLHFLGPNAGEVTQGFALGIQCGASYAQVMQTVGIHP.... The pIC50 is 3.7.
(2) The compound is C=CCCNc1nc(NCc2csc(-c3cccs3)n2)nc(N2CCC[C@@H]2CNS(=O)(=O)c2ccc(CCC)cc2)n1. The target protein (Q9UHI8) has sequence MQRAVPEGFGRRKLGSDMGNAERAPGSRSFGPVPTLLLLAAALLAVSDALGRPSEEDEELVVPELERAPGHGTTRLRLHAFDQQLDLELRPDSSFLAPGFTLQNVGRKSGSETPLPETDLAHCFYSGTVNGDPSSAAALSLCEGVRGAFYLLGEAYFIQPLPAASERLATAAPGEKPPAPLQFHLLRRNRQGDVGGTCGVVDDEPRPTGKAETEDEDEGTEGEDEGAQWSPQDPALQGVGQPTGTGSIRKKRFVSSHRYVETMLVADQSMAEFHGSGLKHYLLTLFSVAARLYKHPSIRNSVSLVVVKILVIHDEQKGPEVTSNAALTLRNFCNWQKQHNPPSDRDAEHYDTAILFTRQDLCGSQTCDTLGMADVGTVCDPSRSCSVIEDDGLQAAFTTAHELGHVFNMPHDDAKQCASLNGVNQDSHMMASMLSNLDHSQPWSPCSAYMITSFLDNGHGECLMDKPQNPIQLPGDLPGTSYDANRQCQFTFGEDSKHCP.... The pIC50 is 4.3.
(3) The drug is CCCCCCc1ccc(C(=O)CCN(C)CCO)cc1. The target protein sequence is EEMIRSLQQRPEPTPEEWDLIHIATEAHRSTNAQGSHWKQRRKFLPDDIGQSPIVSMPDGDKVDLEAFSEFTKIITPAITRVVDFAKKLPMFSELPCEDQIILLKGCCMEIMSLRAAVRYDPESDTLTLSGEMAVKREQLKNGGLGVVSDAIFELGKSLSAFNLDDTEVALLQAVLLMSTDRSGLLCVDKIEKSQEAYLLAFEHYVNHRKHNIPHFWPKLLMKEREVQSSILYKGAAAEGRPGGSLGVHPEGQQLLGMHVVQV. The pIC50 is 5.5.
(4) The drug is C[C@]12CC[C@H]3[C@@H](CCC4=CC(=O)CC[C@@]43C)[C@@H]1CC[C@@H]2C(=O)COS(=O)(=O)c1ccc(Br)cc1. The target protein sequence is MSSPNRKLKPTILVVDDEPDNLDLLYRTFHREFKVLKAESGPAALKILEEVGEVAVIISDQRMPYMSGTEFLSLTATQYPDSIRIILTGYTDVEDLVEAINSGKVFKYVTKPWKSDELKAIVQQGLETHNVLKSRTEELRLAQKQESLLYEVTSTIRACPNSQEMLQRIVETVGKMFEVSYCLLRSFGVGSDLIGLGAGVSPTKQDITATQGKEWFAYLAEGQNHQNSTTDNISVINNNDLELRSLVWETTEVMILSEGLGNDISDHDGPEWQQRRDVYQRADIRSSLIVPLYYRQELLAVLALHHTGSPRNWHEHEVQLAAGVADQAALALSQVRAYEQVRELARREALVNTITNAIRSSLDPQKIFAAITEQLGEALEVDGCALSLWSPGDEYMQCVGLYNAAIKETVVETRPAALSEPDTSTTTNLPLLGVETNQSIESDQSDDLPQSAAPISGNPVLQELIRTRAPVAIADIEQRPDSMVMLPLRSPSKALLVVPL.... The pIC50 is 4.8.
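
The pIC50 definition given at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration only: the function names `ic50_to_pic50` and `pic50_to_ic50_molar` and the unit table are hypothetical helpers, not part of BindingDB or of the dataset's tooling.

```python
import math

# Hypothetical unit table: multiply a concentration by this factor to get mol/L.
_UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ic50_to_pic50(ic50: float, unit: str = "nM") -> float:
    """Convert an IC50 concentration to pIC50 = -log10(IC50 in M)."""
    if ic50 <= 0:
        raise ValueError("IC50 must be a positive concentration")
    if unit not in _UNIT_TO_MOLAR:
        raise ValueError(f"unknown unit: {unit!r}")
    return -math.log10(ic50 * _UNIT_TO_MOLAR[unit])


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # An IC50 of 1000 nM (1 uM) corresponds to a pIC50 of about 6.
    print(round(ic50_to_pic50(1000.0, "nM"), 3))
    # The pIC50 values reported in the examples above, mapped back to molar IC50.
    for value in (3.7, 4.3, 5.5, 4.8):
        print(value, pic50_to_ic50_molar(value))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values indicate more potent compounds.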