Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is Nc1ncc(C(=O)Nc2cc(Cl)cc(Cl)c2)c(C(F)(F)F)n1. The target protein (P18846) has sequence MEDSHKSTTSETAPQPGSAVQGAHISHIAQQVSSLSESEESQDSSDSIGSSQKAHGILARRPSYRKILKDLSSEDTRGRKGDGENSGVSAAVTSMSVPTPIYQTSSGQYIAIAPNGALQLASPGTDGVQGLQTLTMTNSGSTQQGTTILQYAQTSDGQQILVPSNQVVVQTASGDMQTYQIRTTPSATSLPQTVVMTSPVTLTSQTTKTDDPQLKREIRLMKNREAARECRRKKKEYVKCLENRVAVLENQNKTLIEELKTLKDLYSNKSV. The pIC50 is 5.0. (2) The small molecule is C[S+](C)CC(=O)NCC(=O)N1CCN(C(=O)C23CC4CC(CC(C4)C2)C3)CC1. The target protein (O95932) has sequence MAGIRVTKVDWQRSRNGAAHHTQEYPCPELVVRRGQSFSLTLELSRALDCEEILIFTMETGPRASEALHTKAVFQTSELERGEGWTAAREAQMEKTLTVSLASPPSAVIGRYLLSIRLSSHRKHSNRRLGEFVLLFNPWCAEDDVFLASEEERQEYVLSDSGIIFRGVEKHIRAQGWNYGQFEEDILNICLSILDRSPGHQNNPATDVSCRHNPIYVTRVISAMVNSNNDRGVVQGQWQGKYGGGTSPLHWRGSVAILQKWLKGRYKPVKYGQCWVFAGVLCTVLRCLGIATRVVSNFNSAHDTDQNLSVDKYVDSFGRTLEDLTEDSMWNFHVWNESWFARQDLGPSYNGWQVLDATPQEESEGVFRCGPASVTAIREGDVHLAHDGPFVFAEVNADYITWLWHEDESRERVYSNTKKIGRCISTKAVGSDSRVDITDLYKYPEGSRKERQVYSKAVNRLFGVEASGRRIWIRRAGGRCLWRDDLLEPATKPSIAGKFK.... The pIC50 is 5.0. (3) The small molecule is O=c1c2ccccc2n(CC2CC2)c(=O)n1O. The target protein sequence is MGIQGLAKLIADVAPSAIRENDIKSYFGRKVAIDASMSIYQFLIAVRQGGDVLQNEEGETTSHLMGMFYRTIRMMENGIKPVYVFDGKPPQLKSGELAKRSERRAEAEKQLQQAQAAGAEQEVEKFTKRLVKVTKQHNDECKHLLSLMGIPYLDAPSEAEASCAALVKAGKVYAAATEDMDCLTFGSPVLMRHLTASEAKKLPIQEFHLSRILQELGLNQEQFVDLCILLGSDYCESIRGIGPKRAVDLIQKHKSIEEIVRRLDPNKYPVPENWLHKEAHQLFLEPEVLDPESVELKWSEPNEEELIKFMCGEKQFSEERIRSGVKRLSKSRQGST. The pIC50 is 7.5. (4) The small molecule is CNC(C)C(=O)Nc1cc(-c2c(C)ccc3ncccc23)cc(NC(=O)c2ccccn2)n1. 
The target protein sequence is SDAVSSDRNFPNSTNLPRNPSMADYEARIFTFGTWIYSVNKEQLARAGFYALGEGDKVKCFHCGGGLTDWKPSEDPWEQHAKWYPGCKYLLEQKGQEYINNIHLTHSLEECLVRTT. The pIC50 is 5.8. (5) The small molecule is O=C1CCCC(=O)C1C(=O)c1ccccc1C(F)(F)F. The target protein (Q02110) has sequence MTSYSDKGEKPERGRFLHFHSVTFWVGNAKQAASYYCSKIGFEPLAYKGLETGSREVVSHVVKQDKIVFVFSSALNPWNKEMGDHLVKHGDGVKDIAFEVEDCDYIVQKARERGAIIVREEVCCAADVRGHHTPLDRARQVWEGTLVEKMTFCLDSRPQPSQTLLHRLLLSKLPKCGLEIIDHIVGNQPDQEMESASQWYMRNLQFHRFWSVDDTQIHTEYSALRSVVMANYEESIKMPINEPAPGKKKSQIQEYVDYNGGAGVQHIALKTEDIITAIRSLRERGVEFLAVPFTYYKQLQEKLKSAKIRVKESIDVLEELKILVDYDEKGYLLQIFTKPMQDRPTVFLEVIQRNNHQGFGAGNFNSLFKAFEEEQELRGNLTDTDPNGVPFRL. The pIC50 is 6.6. (6) The drug is COCCOCCC(C#N)c1ccc(Cl)cc1. The target protein sequence is MGMRTVLTGLAGMLLGSMMPVQADMPRPTGLAADIRWTAYGVPHIRAKDERGLGYGIGYAYARDNACLLAEEIVTARGERARYFGSEGKSSAELDNLPSDIFYAWLNQPEALQAFWQAQTPAVRQLLEGYAAGFNRFLREADGKTTSCLGQPWLRAIATDDLLRLTRRLLVEGGVGQFADALVAAAPPGTEKVALSGEQAFQVAEQRRQRFRLERGSNAIAVGSERSADGKGMLLANPHFPWNGAMRFYQMHLTIPGRLDVMGASLPGLPVVNIGFSRHLAWTHTVDTSSHFTLYRLALDPKDPRRYLVDGRSLPLEEKSVAIEVRGADGKLSRVEHKVYQSIYGPLVVWPGKLDWNRSEAYALRDANLENTRVLQQWYSINQASDVADLRRRVEALQGIPWVNTLAADEQGNALYMNQSVVPYLKPELIPACAIPQLVAEGLPALQGQDSRCAWSRDPAAAQAGITPAAQLPVLLRRDFVQNSNDSAWLTNPASPLQGF.... The pIC50 is 5.3. (7) The compound is CC(C)N(C[C@H]1O[C@@H](n2ccc3c(N)ncnc32)[C@H](O)[C@@H]1O)[C@H]1C[C@H](CCc2nc3cc(C(C)(C)C)ccc3[nH]2)C1. The target protein sequence is MGEKLELRLKSPVGAEPAVYPWPLPVYDKHHDAAHEIIETIRWVCEEIPDLKLAMENYVLIDYDTKSFESMQRLCDKYNRAIDSIHQLWKGTTQPMKLNTRPSTGLLRHILQQVYNHSVTDPEKLNNYEPFSPEVYGETSFDLVAQMIDEIKMTDDDLFVDLGSGVGQVVLQVAAATNCKHHYGVEKADIPAKYAETMDREFRKWMKWYGKKHAEYTLERGDFLSEEWRERIANTSVIFVNNFAFGPEVDHQLKERFANMKEGGRIVSSKPFAPLNFRINSRNLSDIGTIMRVVELSPLKGSVSWTGKPVSYYLHTIDRTILENYFSSLKNPKLREEQEAARRRQQRESKSNAATPTKGPEGKVAGPADAPMDSGAEEEKAGAATVKKPSPSKARKKKLNKKGRKMAGRKRGRPKK. The pIC50 is 9.2. (8) The drug is O=c1ssc(=O)n1-c1ccccc1. The target is XTSFAESXKPVQQPSAFGS. The pIC50 is 4.9. 
(9) The compound is CC(=O)OC[C@@]12[C@@H](OC(C)=O)CC[C@@H](C)[C@]13OC(C)(C)[C@H](C(=O)[C@H]2OC(=O)c1ccccc1)[C@H]3OC(C)=O. The target protein (O75387) has sequence MAPTLQQAYRRRWWMACTAVLENLFFSAVLLGWGSLLIILKNEGFYSSTCPAESSTNTTQDEQRRWPGCDQQDEMLNLGFTIGSFVLSATTLPLGILMDRFGPRPVRLVGSACFTASCTLMALASRDVEALSPLIFLALSLNGFGGICLTFTSLTLPNMFGNLRSTLMALMIGSYASSAITFPGIKLIYDAGVAFVVIMFTWSGLACLIFLNCTLNWPIEAFPAPEEVNYTKKIKLSGLALDHKVTGDLFYTHVTTMGQRLSQKAPSLEDGSDAFMSPQDVRGTSENLPERSVPLRKSLCSPTFLWSLLTMGMTQLRIIFYMAAVNKMLEYLVTGGQEHETNEQQQKVAETVGFYSSVFGAMQLLCLLTCPLIGYIMDWRIKDCVDAPTQGTVLGDARDGVATKSIRPRYCKIQKLTNAISAFTLTNLLLVGFGITCLINNLHLQFVTFVLHTIVRGFFHSACGSLYAAVFPSNHFGTLTGLQSLISAVFALLQQPLFMA.... The pIC50 is 3.8.
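The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in a few lines. This is a minimal illustration, assuming the IC50 is supplied in mol/L; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and not part of BindingDB or any dataset tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (-pic50) * 1e9


if __name__ == "__main__":
    # An IC50 of 10 micromolar (1e-5 M) corresponds to pIC50 = 5.0.
    print(ic50_to_pic50(1e-5))
    # A pIC50 of 9.2 corresponds to a sub-nanomolar IC50.
    print(pic50_to_ic50_nm(9.2))
```

Under this definition, the pIC50 of 5.0 reported for examples (1) and (2) corresponds to an IC50 of 10 µM, and the pIC50 of 9.2 in example (7) corresponds to roughly 0.63 nM, so each unit of pIC50 is a tenfold change in potency.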