Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCCCC(C(=O)COc1c(F)c(F)cc(F)c1F)n1cc(C2(C)CCCN2)nn1. The target protein (P97821) has sequence MGPWTHSLRAVLLLVLLGVCTVRSDTPANCTYPDLLGTWVFQVGPRSSRSDINCSVMEATEEKVVVHLKKLDTAYDELGNSGHFTLIYNQGFEIVLNDYKWFAFFKYEVRGHTAISYCHETMTGWVHDVLGRNWACFVGKKVESHIEKVNMNAAHLGGLQERYSERLYTHNHNFVKAINTVQKSWTATAYKEYEKMSLRDLIRRSGHSQRIPRPKPAPMTDEIQQQILNLPESWDWRNVQGVNYVSPVRNQESCGSCYSFASMGMLEARIRILTNNSQTPILSPQEVVSCSPYAQGCDGGFPYLIAGKYAQDFGVVEESCFPYTAKDSPCKPRENCLRYYSSDYYYVGGFYGGCNEALMKLELVKHGPMAVAFEVHDDFLHYHSGIYHHTGLSDPFNPFELTNHAVLLVGYGRDPVTGIEYWIIKNSWGSNWGESGYFRIRRGTDECAIESIAVAAIPIPKL. The pIC50 is 6.4. (2) The pIC50 is 7.3. The small molecule is C[C@H](NC1=NC(=O)[C@](C)(C(C)(C)O)S1)c1ccc(F)cc1. The target protein sequence is MAFKKKYLPPLLGFFLAYYYYSANEEFRPEMLQGKKVIVTGASKGIGEQMAYHLAKMGAHVVVTARSKETLKKVVSHCLELGAASAHYIPGTMEDMTFAEQFVAKAGKLMGGLDMLILNHITNTSMNLFSGDIHLVRRSMEVNFLSYVVLSAAALPMLKQSNGSIVVVSSKAGKMSSPLVAPYSASKFALDGFFSSVRMEHSVTKVNVSITLCILGLINTDTAMKAVSGILSTVGASSKEECALEIIKGGALRQEEVYYDNSVWTAFLLGNPGRKILEFLSLRSYKLDKFINN. (3) The compound is CCCCCCCC/C=C\CCCCCCCC(=O)N[C@H](COP(=O)(O)O)Cc1ccc(OCc2cccc(OC)c2)cc1. The target protein (Q9UBY5) has sequence MNECHYDKHMDFFYNRSNTDTVDDWTGTKLVIVLCVGTFFCLFIFFSNSLVIAAVIKNRKFHFPFYYLLANLAAADFFAGIAYVFLMFNTGPVSKTLTVNRWFLRQGLLDSSLTASLTNLLVIAVERHMSIMRMRVHSNLTKKRVTLLILLVWAIAIFMGAVPTLGWNCLCNISACSSLAPIYSRSYLVFWTVSNLMAFLIMVVVYLRIYVYVKRKTNVLSPHTSGSISRRRTPMKLMKTVMTVLGAFVVCWTPGLVVLLLDGLNCRQCGVQHVKRWFLLLALLNSVVNPIIYSYKDEDMYGTMKKMICCFSQENPERRPSRIPSTVLSRSDTGSQYIEDSISQGAVCNKSTS. The pIC50 is 6.2. (4) The compound is Nc1nc2nc(C(=O)NCc3cn(Cc4cccnc4)nn3)cnc2c(=O)[nH]1. 
The target protein (P02879) has sequence MKPGGNTIVIWMYAVATWLCFGSTSGWSFTLEDNNIFPKQYPIINFTTAGATVQSYTNFIRAVRGRLTTGADVRHEIPVLPNRVGLPINQRFILVELSNHAELSVTLALDVTNAYVVGYRAGNSAYFFHPDNQEDAEAITHLFTDVQNRYTFAFGGNYDRLEQLAGNLRENIELGNGPLEEAISALYYYSTGGTQLPTLARSFIICIQMISEAARFQYIEGEMRTRIRYNRRSAPDPSVITLENSWGRLSTAIQESNQGAFASPIQLQRRNGSKFSVYDVSILIPIIALMVYRCAPPPSSQFSLLIRPVVPNFNADVCMDPEPIVRIVGRNGLCVDVRDGRFHNGNAIQLWPCKSNTDANQLWTLKRDNTIRSNGKCLTTYGYSPGVYVMIYDCNTAATDATRWQIWDNGTIINPRSSLVLAATSGNSGTTLTVQTNIYAVSQGWLPTNNTQPFVTTIVGLYGLCLQANSGQVWIEDCSSEKAEQQWALYADGSIRPQQN.... The pIC50 is 4.0. (5) The drug is CC1(C)CN(Cc2ccccc2)CCC1Oc1ccc(S(=O)(=O)Nc2ncns2)cc1Cl. The target protein (Q9UQD0) has sequence MAARLLAPPGPDSFKPFTPESLANIERRIAESKLKKPPKADGSHREDDEDSKPKPNSDLEAGKSLPFIYGDIPQGLVAVPLEDFDPYYLTQKTFVVLNRGKTLFRFSATPALYILSPFNLIRRIAIKILIHSVFSMIIMCTILTNCVFMTFSNPPDWSKNVEYTFTGIYTFESLVKIIARGFCIDGFTFLRDPWNWLDFSVIMMAYITEFVNLGNVSALRTFRVLRALKTISVIPGLKTIVGALIQSVKKLSDVMILTVFCLSVFALIGLQLFMGNLRNKCVVWPINFNESYLENGTKGFDWEEYINNKTNFYTVPGMLEPLLCGNSSDAGQCPEGYQCMKAGRNPNYGYTSFDTFSWAFLALFRLMTQDYWENLYQLTLRAAGKTYMIFFVLVIFVGSFYLVNLILAVVAMAYEEQNQATLEEAEQKEAEFKAMLEQLKKQQEEAQAAAMATSAGTVSEDAIEEEGEEGGGSPRSSSEISKLSSKSAKERRNRRKKRKQ.... The pIC50 is 6.4. (6) The small molecule is CNC(=O)[C@H](Cc1ccccc1)NC(=O)[C@]1(CS)CC[C@H](C(C)(C)C)CC1. The target protein (P48032) has sequence MTPWLGLVVLLSCWSLGHWGTEACTCSPSHPQDAFCNSDIVIRAKVVGKKLVKEGPFGTLVYTIKQMKMYRGFSKMPHVQYIHTEASESLCGLKLEVNKYQYLLTGRVYEGKMYTGLCNFVERWDHLTLSQRKGLNYRYHLGCNCKIKSCYYLPCFVTSKKECLWTDMLSNFGYPGYQSKHYACIRQKGGYCSWYRGWAPPDKSISNATDP. The pIC50 is 5.0. (7) The drug is CCN1C(=O)/C(=C/c2ccc(OC)c(OC)c2)S/C1=N\c1cccc(C(=O)O)c1. The target protein (P9WP55) has sequence MSIAEDITQLIGRTPLVRLRRVTDGAVADIVAKLEFFNPANSVKDRIGVAMLQAAEQAGLIKPDTIILEPTSGNTGIALAMVCAARGYRCVLTMPETMSLERRMLLRAYGAELILTPGADGMSGAIAKAEELAKTDQRYFVPQQFENPANPAIHRVTTAEEVWRDTDGKVDIVVAGVGTGGTITGVAQVIKERKPSARFVAVEPAASPVLSGGQKGPHPIQGIGAGFVPPVLDQDLVDEIITVGNEDALNVARRLAREEGLLVGISSGAATVAALQVARRPENAGKLIVVVLPDFGERYLSTPLFADVAD. 
The pIC50 is 7.2. (8) The compound is C=CC(=O)NC[C@H](NC(=O)NC(C)(C)C)C(=O)N1CC2[C@@H]([C@H]1C(=O)NC(CC1CCC1)C(=O)C(N)=O)C2(C)C. The target protein sequence is APITAYAQQTRGLLGCIITSLTGRDKNQVEGEVQIVSTAAQTFLATCINGVCWTVYHGAGTRTIASPKGPVIQMYTNVDQDLVGWPAPQGARSLTPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPISYLKGSSGGPLLCPAGHAVGLFKAAVCTRGVAKAVDFIPVENLETTMRS. The pIC50 is 5.5. (9) The compound is NC[C@@H]1C[C@@]1(C(=O)N1CCOc2ccccc21)c1ccc2c(c1)OCO2. The target protein (Q01959) has sequence MSKSKCSVGLMSSVVAPAKEPNAVGPKEVELILVKEQNGVQLTSSTLTNPRQSPVEAQDRETWGKKIDFLLSVIGFAVDLANVWRFPYLCYKNGGGAFLVPYLLFMVIAGMPLFYMELALGQFNREGAAGVWKICPILKGVGFTVILISLYVGFFYNVIIAWALHYLFSSFTTELPWIHCNNSWNSPNCSDAHPGDSSGDSSGLNDTFGTTPAAEYFERGVLHLHQSHGIDDLGPPRWQLTACLVLVIVLLYFSLWKGVKTSGKVVWITATMPYVVLTALLLRGVTLPGAIDGIRAYLSVDFYRLCEASVWIDAATQVCFSLGVGFGVLIAFSSYNKFTNNCYRDAIVTTSINSLTSFSSGFVVFSFLGYMAQKHSVPIGDVAKDGPGLIFIIYPEAIATLPLSSAWAVVFFIMLLTLGIDSAMGGMESVITGLIDEFQLLHRHRELFTLFIVLATFLLSLFCVTNGGIYVFTLLDHFAAGTSILFGVLIEAIGVAWFYG.... The pIC50 is 5.3.
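Note on the label scale: the header defines pIC50 = -log10(IC50 in M). The short Python sketch below illustrates only that conversion. The function names and the 400 nM input are hypothetical, chosen for illustration, and are not taken from BindingDB or from any example above.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nanomolar(ic50_nm: float) -> float:
    """Convert an IC50 given in nM to pIC50 (1 nM = 1e-9 M)."""
    return pic50_from_ic50_molar(ic50_nm * 1e-9)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Hypothetical measurement: an IC50 of 400 nM.
    print(f"{pic50_from_ic50_nanomolar(400.0):.2f}")  # prints 6.40
```

Because the scale is logarithmic, a difference of 1.0 in pIC50 corresponds to a tenfold difference in IC50, and a lower IC50 (a more potent compound) gives a higher pIC50.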