Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O=c1nc2sc3ccccc3n2c(=O)n1-c1ccccc1. The target protein sequence is MPLVDFFCETCSKPWLVGWWDQFKRMLNRELTHLSEMSRSGNQVSEYISTTFLDKQNEVEIPSPTMKEREKQQAPRPRPSQPPPPPVPHLQPMSQITGLKKLMHSNSLNNSNIPRFGVKTDQEELLAQELENLNKWGLNIFCVSDYAGGRSLTCIMYMIFQERDLLKKFRIPVDTMVTYMLTLEDHYHADVAYHNSLHAADVLQSTHVLLATPALDAVFTDLEILAALFAAAIHDVDHPGVSNQFLINTNSELALMYNDESVLENHHLAVGFKLLQEDNCDIFQNLSKRQRQSLRKMVIDMVLATDMSKHMTLLADLKTMVETKKVTSSGVLLLDNYSDRIQVLRNMVHCADLSNPTKPLELYRQWTDRIMAEFFQQGDRERERGMEISPMCDKHTASVEKSQVGFIDYIVHPLWETWADLVHPDAQEILDTLEDNRDWYYSAIRQSPSPPPEEESRGPGHPPLPDKFQFELTLEEEEEEEISMAQIPCTAQEALTAQGL.... The pIC50 is 4.0. (2) The drug is COC(=O)[C@@H]1CCCN1C(=O)[C@@H](Cc1ccccc1)N(C)C(=O)[C@H](C)NC(=O)[C@@H](NC(=O)C[C@H](O)[C@H](CC(C)C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](OC(=O)[C@@H](OC(=O)[C@H](C(C)C)N(C)C)C(C)C)C(C)C)[C@@H](C)O. The target protein (P70343) has sequence MAENKHPDKPLKVLEQLGKEVLTEYLEKLVQSNVLKLKEEDKQKFNNAERSDKRWVFVDAMKKKHSKVGEMLLQTFFSVDPGSHHGEANLEMEEPEESLNTLKLCSPEEFTRLCREKTQEIYPIKEANGRTRKALIICNTEFKHLSLRYGANFDIIGMKGLLEDLGYDVVVKEELTAEGMESEMKDFAALSEHQTSDSTFLVLMSHGTLHGICGTMHSEKTPDVLQYDTIYQIFNNCHCPGLRDKPKVIIVQACRGGNSGEMWIRESSKPQLCRGVDLPRNMEADAVKLSHVEKDFIAFYSTTPHHLSYRDKTGGSYFITRLISCFRKHACSCHLFDIFLKVQQSFEKASIHSQMPTIDRATLTRYFYLFPGN. The pIC50 is 5.3. (3) The drug is O=C(/C=C/c1ccc(-c2ccc(O)c(C(=O)O)c2)o1)/C=C/c1ccc(-c2ccc(O)c(C(=O)O)c2)o1. 
The target protein sequence is MAGIFYFALFSCLFGICDAVTGSRVYPANEVTLLDSRSVQGELGWIASPLEGGWEEVSIMDEKNTPIRTYQVCNVMEPSQNNWLRTDWITREGAQRVYIEIKFTLRDCNSLPGVMGTCKETFNLYYYESDNDKERFIRENQFVKIDTIAADESFTQVDIGDRIMKLNTEIRDVGPLSKKGFYLAFQDVGACIALVSVRVFYKKCPLTVRNLAQFPDTITGADTSSLVEVRGSCVNNSEEKDVPKMYCGADGEWLVPIGNCLCNAGHEERSGECQACKIGYYKALSTDATCAKCPPHSYSVWEGATSCTCDRGFFRADNDAASMPCTRPPSAPLNLISNVNETSVNLEWSSPQNTGGRQDISYNVVCKKCGAGDPSKCRPCGSGVHYTPQQNGLKTTKVSITDLLAHTNYTFEIWAVNGVSKYNPNPDQSVSVTVTTNQAAPSSIALVQAKEVTRYSVALAWLEPDRPNGVILEYEVKYYEKDQNERSYRIVRTAARNTDI.... The pIC50 is 5.6. (4) The drug is CC(C)OC(=O)OCOP(=O)(CO[C@H](C)Cn1cnc2c(N)ncnc21)OCOC(=O)OC(C)C. The target protein sequence is PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTELEKEGKISKIGPENPYNTPVFAIKKKDSSKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYVDDLYVGSDLEIGQHRTKIEELRQHLWRWGLYTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCKLLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETKLGKAGYVTNRGRQKVVTLTDTTNQKTELQAIYLALQDSGLEVNIVTDSQ.... The pIC50 is 5.2. (5) The small molecule is c1ccc(-c2c[nH]c([C@H]3Cc4c([nH]c5ccccc45)[C@@H](C4CCOCC4)N3)n2)cc1. The target protein (O08858) has sequence MEPLSLTSTPSWNASAASSSSHNWSLVDPVSPMGARAVLVPVLYLLVCTVGLGGNTLVIYVVLRYAKMKTVTNVYILNLAVADVLFMLGLPFLATQNAVSYWPFGSFLCRLVMTLDGINQFTSIFCLMVMSVDRYLAVVHPLRSARWRRPRVAKLASAAVWVFSLLMSLPLLVFADVQEGWGTCNLSWPEPVGLWGAAFITYTSVLGFFGPLLVICLCYLLIVVKVKAAGMRVGSSRRRRSERKVTRMVVVVVLVFVGCWLPFFIVNIVNLAFTLPEEPTSAGLYFFVVVLSYANSCANPLLYGFLSDNFRQSFRKALCLRRGYGVEDADAIEPRPDKSGRPQTTLPTRSCEANGLMQTSRL. The pIC50 is 5.0. (6) The compound is O=C1Oc2cc(O)ccc2/C1=C\c1ccc(O)cc1. 
The target protein (O52691) has sequence MINDQLPRWVREARVGTRTGGPAMRPKTSDSPYFGWDSEDWPEVTRQLLSEQPLSGDTLVDAVLASWESIFESRLGSGFHIGTQIRPTPQVMGFLLHALIPLELANGDPSWRADLNSSEKDLVYQPDHKYSIEMKTSSHKDQIFGNRSFGVENPGKGKKAKDGYYVAVNFEKWSDAPGRLPRIRTIRYGWLDHTDWVAQKSQTGQQSSLPAVVSNTQLLAIHTGGQR. The pIC50 is 3.4. (7) The compound is O=C(CCCOc1ccc(Cl)cc1)NC1CCOC1=O. The target protein sequence is MVISKPINARPLPAGLTASQQWTLLEWIHMAGHIETENELKAFLDQVLSQAPSERLLLALGRLNNQNQIQRLERVLNVSYPSDWLDQYMKENYAQHDPILRIHLGQGPVMWEERFNRAKGAEEKRFIAEATQNGMGSGITFSAASERNNIGSILSIAGREPGRNAALVAMLNCLTPHLHQAAIRVANLPPASPSNMPLSQREYDIFHWMSRGKTNWEIATILDISERTVKFHVANVIRKLNANNRTHAIVLGMHLAMPPSTVANE. The pIC50 is 6.4. (8) The compound is CCCc1c(C(=O)O)c(O)cc2c1C(=O)c1c(O)cccc1C2=O. The target protein (P17325) has sequence MTAKMETTFYDDALNASFLQSESGAYGYSNPKILKQSMTLNLADPVGNLKPHLRAKNSDLLTSPDVGLLKLASPELERLIIQSSNGHITTTPTPTQFLCPKNVTDEQEGFAEGFVRALAELHSQNTLPSVTSAAQPVSGAGMVAPAVASVAGAGGGGGYSASLHSEPPVYANLSNFNPGALSSGGGAPSYGATGLAFPSQPQQQQQPPQPPHHLPQQIPVQHPRLQALKEEPQTVPEMPGETPPLSPIDMESQERIKAERKRMRNRIAASKCRKRKLERIARLEEKVKTLKAQNSELASTANMLREQVAQLKQKVMNHVNSGCQLMLTQQLQTF. The pIC50 is 4.2. (9) The small molecule is CCOC(=O)C1=C[C@@H](OC(CC)CC)[C@H](NC(C)=O)[C@@H](N)C1. The target protein sequence is MDSNTVSSFQVDCFLWHVRKRFADQELGDAPFLDRLRRDQKSLRGRGSTLGLDIETATRAGKQIVERILEEESDEALKMTITSVPASRYLTDMTLEEMSRDWFMLMPKQKVAGSLCIRMDQAIMDKNIKLKANFSVIFDRLETLILLRAFTEEGAIVGEISPLPSLPGHTDEDVKNAIGVLIGGLEWNDNTVRVSETLQRFAWRSSNEDGRPPLPPKQKRKMARTIEPEV. The pIC50 is 9.5.
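As a minimal sketch of the definition stated at the top (pIC50 = -log10(IC50 in M)), the conversion between IC50 and pIC50 can be written as below. The function names are hypothetical and are not part of BindingDB or of any dataset tooling. The nanomolar helper assumes the raw IC50 is reported in nM, which the text above does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Hypothetical helper: assumes the raw IC50 is reported in nanomolar."""
    return ic50_to_pic50(ic50_nm * 1e-9)


if __name__ == "__main__":
    # Example (1) lists pIC50 = 4.0, which corresponds to an IC50 of 1e-4 M (100 uM).
    print(pic50_to_ic50(4.0))
    # Example (9) lists pIC50 = 9.5, i.e. an IC50 of roughly 0.32 nM.
    print(pic50_to_ic50(9.5) * 1e9)
```

Because the scale is logarithmic, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.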