Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CO/C(=C\C=C\c1cc2cc(Cl)c(Cl)cc2[nH]1)C(=O)NCCCN1CCN(c2ncccn2)CC1. The target protein (Q9I8D0) has sequence MGELFRSEEMTLAQLFLQSEAAYCCVSELGELGKVQFRDLNPDVNVFQRKFVNEVRRCEEMDRKLRFVEKEIKKANIPIMDTGENPEVPFPRDMIDLEANFEKIENELKEINTNQEALKRNFLELTELKFILRKTQQFFDEMADPDLLEESSSLLEPSEMGRGAPLRLGFVAGVINRERIPTFERMLWRVCRGNVFLRQAEIENPLEDPVTGDYVHKSVFIIFFQGDQLKNRVKKICEGFRASLYPCPETPQERKEMASGVNTRIDDLQMVLNQTEDHRQRVLQAAAKNIRVWFIKVRKMKAIYHTLNLCNIDVTQKCLSAEVWCPVADLDSIQFALRRGTEHSGSTVPSILNRMQTNQTPPTYNKTNKFTCGFQNIVDAYGIGTYREINPAPYTIITFPFLFAVMFGDFGHGILMTLIAIWMVLRESRILSQKSDNEMFSTVFSGRYIILLMGLFSTYTGLIYNDCFSKSLNMFGSSWSVRPMFSKANWSDELLKTTPL.... The pIC50 is 6.0. (2) The small molecule is Cc1cc(Br)cn2c(Cc3ccccc3C(F)(F)F)c(-c3ccco3)nc12. The target protein (O60755) has sequence MADAQNISLDSPGSVGAVAVPVVFALIFLLGTVGNGLVLAVLLQPGPSAWQEPGSTTDLFILNLAVADLCFILCCVPFQATIYTLDAWLFGALVCKAVHLLIYLTMYASSFTLAAVSVDRYLAVRHPLRSRALRTPRNARAAVGLVWLLAALFSAPYLSYYGTVRYGALELCVPAWEDARRRALDVATFAAGYLLPVAVVSLAYGRTLRFLWAAVGPAGAAAAEARRRATGRAGRAMLAVAALYALCWGPHHALILCFWYGRFAFSPATYACRLASHCLAYANSCLNPLVYALASRHFRARFRRLWPCGRRRRHRARRALRRVRPASSGPPGCPGDARPSGRLLAGGGQGPEPREGPVHGGEAARGPE. The pIC50 is 4.8. (3) The drug is COC(=O)[C@H]1[C@H]2C[C@@H]3c4[nH]c5cc(OC)ccc5c4CCN3C[C@H]2C[C@@H](OC(=O)c2cc(OC)c(OC)c(OC)c2)[C@@H]1OC. 
The target protein sequence is MALSDLVLLRWLRDSRHSRKLILFIVFLALLLDNMLLTVVIPIIPSYLYSIKHEKNSTEIQTTRPELVVSTSESIFSYYNNSTVLITGNATGTLPGGQSHKATSTQHTVANTTVPSDCPSEDRDLLNENVQVGLLFASKATVQLLTNPFIGLLTNRIGYPIPMFAGFCIMFISTVMFAFSSSYAFLLIARSLQGIGSSCSSVAGMGMLASVYTDDEERGKPMGIALGGLAMGVLVGPPFGSVLYEFVGKTAPFLVLAALVLLDGAIQLFVLQPSRVQPESQKGTPLTTLLKDPYILIAAGSICFANMGIAMLEPALPIWMMETMCSRKWQLGVAFLPASISYLIGTNIFGILAHKMGRWLCALLGMVIVGISILCIPFAKNIYGLIAPNFGVGFAIGMVDSSMMPIMGYLVDLRHVSVYGSVYAIADVAFCMGYAIGPSAGGAIAKAIGFPWLMTIIGIIDIAFAPLCFFLRSPPAKEEKMAILMDHNCPIKRKMYTQNN.... The pIC50 is 8.4. (4) The small molecule is O=C(O)CNC(=O)c1c(=O)oc(O)c2ccc(-c3ccc(C(F)(F)F)cc3)cc12. The target protein (P59722) has sequence PRAQPAPAQPRVAPPPGGAPGAARAGGAARRGDSSTAASRVPGPEDATQAGSGPGPAEPSSEDPPPSRSPGPERASLCPAGGGPGEALSPSGGLRPNGQTKPLPALKLALEYIVPCMNKHGICVVDDFLGRETGQQIGDEVRALHDTGKFTDGQLVSQKSDSSKDIRGDKITWIEGKEPGCETIGLLMSSMDDLIRHCSGKLGNYRINGRTKAMVACYPGNGTGYVRHVDNPNGDGRCVTCIYYLNKDWDAKVSGGILRIFPEGKAQFADIEPKFDRLLFFWSDRRNPHEVQPAYATRYAITVWYFDADERARAKVKYLTGEKGVRVELKPNSVSKDV. The pIC50 is 6.8. (5) The compound is O=C1Nc2cc(Nc3cccc(NC(=O)c4cccc(C(F)(F)F)c4)c3)ccc2/C1=C/c1ccc[nH]1. The target protein (P14234) has sequence MGCVFCKKLEPASKEDVGLEGDFRSQTAEERYFPDPTQGRTSSVFPQPTSPAFLNTGNMRSISGTGVTIFVALYDYEARTGDDLTFTKGEKFHILNNTEYDWWEARSLSSGHRGYVPSNYVAPVDSIQAEEWYFGKISRKDAERQLLSSGNPQGAFLIRESETTKGAYSLSIRDWDQNRGDHIKHYKIRKLDTGGYYITTRAQFDSIQDLVRHYMEVNDGLCYLLTAPCTTTKPQTLGLAKDAWEIDRNSIALERRLGTGCFGDVWLGTWNCSTKVAVKTLKPGTMSPKAFLEEAQIMKLLRHDKLVQLYAVVSEEPIYIVTEFMCYGSLLDFLKDREGQNLMLPHLVDMAAQVAEGMAYMERMNYIHRDLRAANILVGEYLICKIADFGLARLIEDNEYNPQQGTKFPIKWTAPEAALFGRFTVKSDVWSFGILLTELITKGRVPYPGMNNREVLEQVEHGYHMPCPPGCPASLYEVMEQAWRLDPEERPTFEYLQSFL.... The pIC50 is 5.0. (6) The compound is O=C1Nc2cn(C3CCOCC3)nc2C(=O)NCCOCCn2cc(cn2)-c2cccc1c2. 
The target protein sequence is RPFPFCWPLCEISRGTHNFSEELKIGEGGFGCVYRAVMRNTVYAVKRLKENADLEWTAVKQSFLTEVEQLSRFRHPNIVDFAGYCAQNGFYCLVYGFLPNGSLEDRLHCQTQACPPLSWPQRLDILLGTARAIQFLHQDSPSLIHGDIKSSNVLLDERLTPKLGDFGLARFSRFAGSSPSQSSMVARTQTVRGTLAYLPEEYIKTGRLAVDTDTFSFGVVVLETLAGQRAVKTHGARTKYLKDLVEEEAEEAGVALRSTQSTLQAGLAADAWAAPIAMQIYKKHLDPRPGPCPPELGLGLGQLACCCLHRRAKRRPPMTQVYERLEKLQAVVAGVPGHSEAASCIPPSPQENSYVSSTGRAHSGAAPWQPLAAPSGASAQAAEQLQRGPNQPVESDESLGGLSAALRSWHLTPSCPLDPAPLREAGCPQGDTAGESSWGSGPGSRPTAVEGLALGSSASSSSEPPQIIINPARQKMVQKLALYEDGALDSLQLLSSSSLP.... The pIC50 is 5.3. (7) The pIC50 is 4.5. The target protein sequence is MSKTLKKKKHWLSKVQECAVSWAGPPGDFGAEIRGGAERGEFPYLGRLREEPGGGTCCVVSGKAPSPGDVLLEVNGTPVSGLTNRDTLAVIRHFREPIRLKTVKPGKVINKDLRHYLSLQFQKGSIDHKLQQVIRDNLYLRTIPCTTRAPRDGEVPGVDYNFISVEQFKALEESGALLESGTYDGNFYGTPKPPAEPSPFQPDPVDQVLFDNEFDAESQRKRTTSVSKMERMDSSLPEEEEDEDKEAINGSGNAENRERHSESSDWMKTVPSYNQTNSSMDFRNYMMRDETLEPLPKNWEMAYTDTGMIYFIDHNTKTTTWLDPRLCKKAKAPEDCEDGELPYGWEKIEDPQYGTYYVDFTLVAQAGVQWHDLGSLQPPPPGFNHLNQKTQFENPVEEAKRKKQLGQVEIGSSKPDMEKSHFTRDPSQLKGVLVRASLKKSTMGFGFTIIGGDRPDEFLQVKNVLKDGPAAQDGKIAPGDVIVDINGNCVLGHTHADVVQ.... The small molecule is CC[C@H](C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](NC(C)=O)[C@H](C)O)C(=O)N(C)[C@H](C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@H](C(=O)O)C(C)C)[C@H](C)O. (8) The drug is COc1cc2[nH]ncc2cc1Nc1ncnc2sc(C(=O)N(C)CC(=O)N(C)C)nc12. The target protein sequence is MVSSQKLEKPIEMGSSEPLPIADGDRRRKKKRRGRATDSLPGKFEDMYKLTSELLGEGAYAKVQGAVSLQNGKEYAVKIIEKQAGHSRSRVFREVETLYQCQGNKNILELIEFFEDDTRFYLVFEKLQGGSILAHIQKQKHFNEREASRVVRDVAAALDFLHTKGIAHRDLKPENILCESPEKVSPVKICDFDLGSGMKLNNSCTPITTPELTTPCGSAEYMAPEVVEVFTDQATFYDKRCDLWSLGVVLYIMLSGYPPFVGHCGADCGWDRGEVCRVCQNKLFESIQEGKYEFPDKDWAHISSEAKDLISKLLVRDAKQRLSAAQVLQHPWVQGQAPEKGLPDPQVLQRNSSTMDLTLFAAEAIALNRQLSQHEENELAEEPEALADGLCSMKLSPPCKSRLARRRALAQAGRGEDRSPPTAL. The pIC50 is 6.9. (9) The compound is CCCCOc1ccc(CNC(=O)Oc2cccc(Cl)c2)cc1. 
The target protein (O00519) has sequence MVQYELWAALPGASGVALACCFVAAAVALRWSGRRTARGAVVRARQRQRAGLENMDRAAQRFRLQNPDLDSEALLALPLPQLVQKLHSRELAPEAVLFTYVGKAWEVNKGTNCVTSYLADCETQLSQAPRQGLLYGVPVSLKECFTYKGQDSTLGLSLNEGVPAECDSVVVHVLKLQGAVPFVHTNVPQSMFSYDCSNPLFGQTVNPWKSSKSPGGSSGGEGALIGSGGSPLGLGTDIGGSIRFPSSFCGICGLKPTGNRLSKSGLKGCVYGQEAVRLSVGPMARDVESLALCLRALLCEDMFRLDPTVPPLPFREEVYTSSQPLRVGYYETDNYTMPSPAMRRAVLETKQSLEAAGHTLVPFLPSNIPHALETLSTGGLFSDGGHTFLQNFKGDFVDPCLGDLVSILKLPQWLKGLLAFLVKPLLPRLSAFLSNMKSRSAGKLWELQHEIEVYRKTVIAQWRALDLDVVLTPMLAPALDLNAPGRATGAVSYTMLYNCL.... The pIC50 is 7.9. (10) The compound is O=C([O-])CC(S[Au])C(=O)[O-]. The target protein (Q94715) has sequence MKQFLTAAIVTLLMTAGYYHLQEDDTNDFERWALKNNKFYTESEKLYRMEIYNSNKRMIEEHNQREDVTYQMGENQFMTLSHEEFVDLYLQKSDSSVNIMGASLPEVQLEGLGAVDWRNYTTVKEQGQCASGWAFSVSNSLEAWYAIRGFQKINASTQQIVDCDYNNTGCSGGYNAYAMEYVLRVGLVSSTNYPYVAKNQTCKQSRNGTYFINGYSFVGGSQSNLQYYLNNYPISVGVEASNWQFYRSGLFSNCSSNGTNHYALAVGFDSANNWIVQNSWGTQWGESGNIRLYPQNTCGILNYPYQVY. The pIC50 is 3.6.
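
Note on the label scale: the task description defines pIC50 = -log10(IC50 in M). The following is a minimal sketch of that conversion in Python, under the assumption that the raw IC50 values are available in molar or nanomolar units. The function names are hypothetical helpers chosen for illustration and are not part of BindingDB or any library.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nanomolar(ic50_nm: float) -> float:
    """Convert an IC50 given in nM to pIC50 (1 nM = 1e-9 M)."""
    return pic50_from_ic50_molar(ic50_nm * 1e-9)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)
```

On this scale an IC50 of 1 micromolar corresponds to a pIC50 of 6, and each additional unit of pIC50 is a tenfold decrease in IC50, which is why higher values mean a more potent compound.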