Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. From a dataset of Drug-target binding data from BindingDB using IC50 measurements. (1) The target protein (P06856) has sequence MDVRCINWFESHGENRFLYLKSRCRNGETVFIRFPHYFYYVVTDEIYQSLSPPPFNARPLGKMRTIDIDETISYNLDIKDRKCSVADMWLIEEPKKRSIQNATMDEFLNISWFYISNGISPDGCYSLDEQYLTKINNGCYHCDDPRNCFAKKIPRFDIPRSYLFLDIECHFDKKFPSVFINPISHTSYCYIDLSGKRLLFTLINEEMLTEQEIQEAVDRGCLRIQSLMEMDYERELVLCSEIVLLRIAKQLLELTFDYVVTFNGHNFDLRYITNRLELLTGEKIIFRSPDKKEAVYLCIYERNQSSHKGVGGMANTTFHVNNNNGTIFFDLYSFIQKSEKLDSYKLDSISKNAFSCMGKVLNRGVREMTFIGDDTTDAKGKAAAFAKVLTTGNYVTVDEDIICKVIRKDIWENGFKVVLLCPTLPNDTYKLSFGKDDVDLAQMYKDYNLNIALDMARYCIHDACLCQYLWEYYGVETKTDAGASTYVLPQSMVFEYRAST.... The pIC50 is 4.3. The drug is Cc1cc(OC(=O)c2ccco2)nc(SCc2c(Cl)cccc2Cl)n1. (2) The small molecule is CCCCN(CCCC)c1ccc(/C=C(\C#N)C(=O)O)c(OC)c1. The target protein (P53987) has sequence MPPAIGGPVGYTPPDGGWGWAVVVGAFISIGFSYAFPKSITVFFKEIEIIFSATTSEVSWISSIMLAVMYAGGPISSILVNKYGSRPVMIAGGCLSGCGLIAASFCNTVQELYFCIGVIGGLGLAFNLNPALTMIGKYFYKKRPLANGLAMAGSPVFLSTLAPLNQAFFGIFGWRGSFLILGGLLLNCCVAGSLMRPIGPQQGKVEKLKSKESLQEAGKSDANTDLIGGSPKGEKLSVFQTVNKFLDLSLFTHRGFLLYLSGNVVMFFGLFTPLVFLSNYGKSKHFSSEKSAFLLSILAFVDMVARPSMGLAANTRWIRPRVQYFFAASVVANGVCHLLAPLSTTYVGFCIYAGVFGFAFGWLSSVLFETLMDLVGPQRFSSAVGLVTIVECCPVLLGPPLLGRLNDMYGDYKYTYWACGVILIIAGLYLFIGMGINYRLVAKEQKAEEKKRDGKEDETSTDVDEKPKKTMKETQSPAPLQNSSGDPAEEESPV. The pIC50 is 8.1. (3) The small molecule is COC1=CC=CN2N=C(/C=C\c3nc(-c4cccs4)cn3C)NC12. 
The target protein sequence is ICTSEEWQGLMQFTLPVRLCKEIELFHFDIGPFENMWPGIFVYMVHRSCGTSCFELEKLCRFIMSVKKNYRRVPYHNWKHAVTVAHCMYAILQNNHTLFTDLERKGLLIACLCHDLDHRGFSNSYLQKFDHPLAALYSTSTMEQHHFSQTVSILQLEGHNIFSTLSSSEYEQVLEIIRKAIIATDLALYFGNRKQLEEMYQTGSLNLNNQSHRDRVIGLMMTACDLCSVTKLWPVTKLTANDIYAEFWAEGDEMKKLGIQPIPMMDRDKKDEVPQGQLGFYNAVAIPCYTTLTQILPPTEPLLKACRDNLSQWEKVIRGEETATWISSPSVAQKAAASED. The pIC50 is 7.2. (4) The compound is CCN(Cc1ccc(OCCN2C(=O)CCC2=O)c(C)c1)C(CC(=O)O)c1ccc(Cl)cc1. The target protein (P02778) has sequence MNQTAILICCLIFLTLSGIQGVPLSRTVRCTCISISNQPVNPRSLEKLEIIPASQFCPRVEIIATMKKKGEKRCLNPESKAIKNLLKAVSKERSKRSP. The pIC50 is 6.4. (5) The compound is COc1ccc([N+](=O)[O-])cc1NC(=O)Cc1ccccc1O. The target protein (Q62976) has sequence MANGGGGGGGGSSGSSGGGGGGGGGETALRMSSNIHANHLSLDASSSSSSSSSSSSSSSSSVHEPKMDALIIPVTMEVPCDSRGQRMWWAFLASSMVTFFGGLFIILLWRTLKYLWTVCCHCGGKTKEAQKINNGSSQADGTLKPVDEKEEVVAAEVGWMTSVKDWAGVMISAQTLTGRVLVVLVFALSIGALVIYFIDSSNPIESCQNFYKDFTLQIDMAFNVFFLLYFGLRFIAANDKLWFWLEVNSVVDFFTVPPVFVSVYLNRSWLGLRFLRALRLIQFSEILQFLNILKTSNSIKLVNLLSIFISTWLTAAGFIHLVENSGDPWENFQNNQALTYWECVYLLMVTMSTVGYGDVYAKTTLGRLFMVFFILGGLAMFASYVPEIIELIGNRKKYGGSYSAVSGRKHIVVCGHITLESVSNFLKDFLHKDRDDVNVEIVFLHNISPNLELEALFKRHFTQVEFYQGSVLNPHDLARVKIESADACLILANKYCADPD.... The pIC50 is 4.7. (6) The compound is O=[N+]([O-])c1cccnc1NCCNc1ncc(C(F)(F)F)cc1Cl. The target protein (Q96LD8) has sequence MDPVVLSYMDSLLRQSDVSLLDPPSWLNDHIIGFAFEYFANSQFHDCSDHVSFISPEVTQFIKCTSNPAEIAMFLEPLDLPNKRVVFLAINDNSNQAAGGTHWSLLVYLQDKNSFFHYDSHSRSNSVHAKQVAEKLEAFLGRKGDKLAFVEEKAPAQQNSYDCGMYVICNTEALCQNFFRQQTESLLQLLTPAYITKKRGEWKDLITTLAKK. The pIC50 is 4.6. (7) The small molecule is CCN(C(=O)Cn1c(C(=O)N[C@H]2CC[C@H](C(=O)OC)CC2)cc2sccc21)c1cccc(C(N)=O)c1. The target protein (Q9GV45) has sequence MAYSTLFIIALTAVVTQASSTQKSNLTFTLADFVGDWQQTAGYNQDQVLEQGGLSSLFQALGVSVTPIQKVVLSGENGLKADIHVIIPYEGLSGFQMGLIEMIFKVVYPVDDHHFKIILHYGTLVIDGVTPNMIDYFGRPYPGIAVFDGKQITVTGTLWNGNKIYDERLINPDGSLLFRVTINGVTGWRLCENILA. The pIC50 is 6.6. 
(8) The drug is O=C1CC(Cc2ccccc2)NC(=O)C1C(=O)NC1CCCCC1. The target protein (Q97SR4) has sequence MFGFFKKDKAVEVEVPTQVPAHIGIIMDGNGRWAKKRMQPRVFGHKAGMEALQTVTKAANKLGVKVITVYAFSTENWTRPDQEVKFIMNLPVEFYDNYVPELHANNVKIQMIGETDRLPKQTFEALTKAEELTKNNTGLILNFALNYGGRAEITQALKLISQDVLDAKINPGDITEELIGNYLFTQHLPKDLRDPDLIIRTSGELRLSNFLPWQGAYSELYFTDTLWPDFDEAALQEAILAYNRRHRRFGGV. The pIC50 is 7.2. (9) The compound is O=C(OC(C(F)(F)F)C(F)(F)F)N1CCC2(CCN(Cc3c(N4CC5COCC5C4)cccc3C(F)(F)F)C2)CC1. The target protein (O08914) has sequence MVLSEVWTALSGLSGVCLACSLLSAAVVLRWTRSQTARGAVTRARQKQRAGLETMDKAVQRFRLQNPDLDSEALLALPLLQLVQKLQSGELSPEAVLFTYLGKAWEVNKGTNCVTSYLTDCETQLSQAPRQGLLYGVPVSLKECFSYKGHASTLGLSLNEGVTSESDCVVVQVLKLQGAVPFVHTNVPQSMLSYDCSNPLFGQTMNPWKPSKSPGGSSGGEGALIGSGGSPLGLGTDIGGSIRFPSAFCGICGLKPTGNRLSKSGLKSCVYGQTAVQLSVGPMARDVDSLALCMKALLCEDLFRLDSTIPPLPFREEIYRSSRPLRVGYYETDNYTMPTPAMRRAVMETKQSLEAAGHTLVPFLPNNIPYALEVLSAGGLFSDGGCSFLQNFKGDFVDPCLGDLVLVLKLPRWFKKLLSFLLKPLFPRLAAFLNSMCPRSAEKLWELQHEIEMYRQSVIAQWKAMNLDVVLTPMLGPALDLNTPGRATGAISYTVLYNCL.... The pIC50 is 6.0.
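The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration only: the helper names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and are not part of the bindingdb_ic50 dataset or any BindingDB tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar.

    IC50 in M is 10 ** (-pIC50); multiplying by 1e9 gives nM,
    which is the same as 10 ** (9 - pIC50).
    """
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # pIC50 labels taken from records (1) and (2) above.
    for label in (4.3, 8.1):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition each unit of pIC50 is a tenfold change in IC50, so the label 8.1 in record (2) corresponds to an IC50 of roughly 8 nM, while the label 4.3 in record (1) corresponds to roughly 50 µM.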