Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50 (drug-target binding data from BindingDB using IC50 measurements). (1) The target protein (P54132) has sequence MAAVPQNNLQEQLERHSARTLNNKLSLSKPKFSGFTFKKKTSSDNNVSVTNVSVAKTPVLRNKDVNVTEDFSFSEPLPNTTNQQRVKDFFKNAPAGQETQRGGSKSLLPDFLQTPKEVVCTTQNTPTVKKSRDTALKKLEFSSSPDSLSTINDWDDMDDFDTSETSKSFVTPPQSHFVRVSTAQKSKKGKRNFFKAQLYTTNTVKTDLPPPSSESEQIDLTEEQKDDSEWLSSDVICIDDGPIAEVHINEDAQESDSLKTHLEDERDNSEKKKNLEEAELHSTEKVPCIEFDDDDYDTDFVPPSPEEIISASSSSSKCLSTLKDLDTSDRKEDVLSTSKDLLSKPEKMSMQELNPETSTDCDARQISLQQQLIHVMEHICKLIDTIPDDKLKLLDCGNELLQQRNIRRKLLTEVDFNKSDASLLGSLWRYRPDSLDGPMEGDSCPTGNSMKELNFSHLPSNSVSPGDCLLTTTLGKTGFSATRKNLFERPLFNTHLQKSF.... The drug is O=C(Nc1ccc(-n2cncn2)c(F)c1)Nc1nnc(-c2ccncc2)s1. The pIC50 is 4.8. (2) The compound is Nc1nc2ccc(-c3cnn(C4CC4)c3)cc2s1. The target protein (P42356) has sequence MAAAPARGGGGGGGGGGGCSGSGSSASRGFYFNTVLSLARSLAVQRPASLEKVQKLLCMCPVDFHGIFQLDERRRDAVIALGIFLIESDLQHKDCVVPYLLRLLKGLPKVYWVEESTARKGRGALPVAESFSFCLVTLLSDVAYRDPSLRDEILEVLLQVLHVLLGMCQALEIQDKEYLCKYAIPCLIGISRAFGRYSNMEESLLSKLFPKIPPHSLRVLEELEGVRRRSFNDFRSILPSNLLTVCQEGTLKRKTSSVSSISQVSPERGMPPPSSPGGSAFHYFEASCLPDGTALEPEYYFSTISSSFSVSPLFNGVTYKEFNIPLEMLRELLNLVKKIVEEAVLKSLDAIVASVMEANPSADLYYTSFSDPLYLTMFKMLRDTLYYMKDLPTSFVKEIHDFVLEQFNTSQGELQKILHDADRIHNELSPLKLRCQANAACVDLMVWAVKDEQGAENLCIKLSEKLQSKTSSKVIIAHLPLLICCLQGLGRLCERFPVVV.... The pIC50 is 5.1. (3) The compound is Cc1c(C(C)C)c(=O)on1C(=O)N1CCCCC1C. 
The target protein (Q05469) has sequence MEPGSKSVSRSDWQPEPHQRPITPLEPGPEKTPIAQPESKTLQGSNTQQKPASNQRPLTQQETPAQHDAESQKEPRAQQKSASQEEFLAPQKPAPQQSPYIQRVLLTQQEAASQQGPGLGKESITQQEPALRQRHVAQPGPGPGEPPPAQQEAESTPAAQAKPGAKREPSAPTESTSQETPEQSDKQTTPVQGAKSKQGSLTELGFLTKLQELSIQRSALEWKALSEWVTDSESESDVGSSSDTDSPATMGGMVAQGVKLGFKGKSGYKVMSGYSGTSPHEKTSARNHRHYQDTASRLIHNMDLRTMTQSLVTLAEDNIAFFSSQGPGETAQRLSGVFAGVREQALGLEPALGRLLGVAHLFDLDPETPANGYRSLVHTARCCLAHLLHKSRYVASNRRSIFFRTSHNLAELEAYLAALTQLRALVYYAQRLLVTNRPGVLFFEGDEGLTADFLREYVTLHKGCFYGRCLGFQFTPAIRPFLQTISIGLVSFGEHYKRNE.... The pIC50 is 6.4. (4) The drug is CC[C@H](C)[C@H](NC(=O)CN1C/C=C\CCC(=O)N[C@@H](Cc2ccccc2)C(=O)N[C@@H](CCC(=O)O)C1=O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(N)=O. The target protein (Q07889) has sequence MQAQQLPYEFFSEENAPKWRGLLVPALKKVQGQVHPTLESNDDALQYVEELILQLLNMLCQAQPRSASDVEERVQKSFPHPIDKWAIADAQSAIEKRKRRNPLSLPVEKIHPLLKEVLGYKIDHQVSVYIVAVLEYISADILKLVGNYVRNIRHYEITKQDIKVAMCADKVLMDMFHQDVEDINILSLTDEEPSTSGEQTYYDLVKAFMAEIRQYIRELNLIIKVFREPFVSNSKLFSANDVENIFSRIVDIHELSVKLLGHIEDTVEMTDEGSPHPLVGSCFEDLAEELAFDPYESYARDILRPGFHDRFLSQLSKPGAALYLQSIGEGFKEAVQYVLPRLLLAPVYHCLHYFELLKQLEEKSEDQEDKECLKQAITALLNVQSGMEKICSKSLAKRRLSESACRFYSQQMKGKQLAIKKMNEIQKNIDGWEGKDIGQCCNEFIMEGTLTRVGAKHERHIFLFDGLMICCKSNHGQPRLPGASNAEYRLKEKFFMRKVQ.... The pIC50 is 4.6. (5) The drug is O=C1Nc2ccc(-c3cc[nH]n3)cc2[C@](CNC(=O)c2ccc(F)cc2)(C(F)(F)F)O1. The target protein (Q9NXB9) has sequence MEHLKAFDDEINAFLDNMFGPRDSRVRGWFMLDSYLPTFFLTVMYLLSIWLGNKYMKNRPALSLRGILTLYNLGITLLSAYMLAELILSTWEGGYNLQCQDLTSAGEADIRVAKVLWWYYFSKSVEFLDTIFFVLRKKTSQITFLHVYHHASMFNIWWCVLNWIPCGQSFFGPTLNSFIHILMYSYYGLSVFPSMHKYLWWKKYLTQAQLVQFVLTITHTMSAVVKPCGFPFGCLIFQSSYMLTLVILFLNFYVQTYRKKPMKKDMQEPPAGKEVKNGFSKAYFTAANGVMNKKAQ. The pIC50 is 5.0. (6) The compound is N#Cc1coc2ccccc2c1=O. 
The target protein sequence is MFKLLSKLLVYLTASIMAIASPLAFSVDSSGEYPTVSEIPVGEVRLYQIADGVWSHIATQSFDGAVYPSNGLIVRDGDELLLIDTAWGAKNTAALLAEIEKQIGLPVTRAVSTHFHDDRVGGVDVLRAAGVATYASPSTRRLAEVEGNEIPTHSLEGLSSSGDAVRFGPVELFYPGAAHSTDNLVVYVPSASVLYGGCAIYELSRTSAGNVADADLAEWPTSIERIQQHYPEAQFVIPGHGLPGGLDLLKHTTNVVKAHTNRSVVE. The pIC50 is 2.9. (7) The compound is Cc1cc(CNC(=O)c2cc([C@@H]3CCNC[C@H]3COc3ccc4c(c3)OCO4)ccc2F)n[nH]1. The target protein (P28327) has sequence MDFGSLETVVANSAFIAARGSFDASSGPASRDRKYLARLKLPPLSKCEALRESLDLGFEGMCLEQPIGKRLFQQFLRTHEQHGPALQLWKDIEDYDTADDALRPQKAQALRAAYLEPQAQLFCSFLDAETVARARAGAGDGLFQPLLRAVLAHLGQAPFQEFLDSLYFLRFLQWKWLEAQPMGEDWFLDFRVLGRGGFGEVFACQMKATGKLYACKKLNKKRLKKRKGYQGAMVEKKILAKVHSRFIVSLAYAFETKTDLCLVMTIMNGGDIRYHIYNVDEDNPGFQEPRAIFYTAQIVSGLEHLHQRNIIYRDLKPENVLLDDDGNVRISDLGLAVELKAGQTKTKGYAGTPGFMAPELLLGEEYDFSVDYFALGVTLYEMIAARGPFRARGEKVENKELKQRVLEQAVTYPDKFSPASKDFCEALLQKDPEKRLGFRDGSCDGLRTHPLFRDISWRQLEAGMLTPPFVPDSRTVYAKNIQDVGAFSTVKGVAFEKADT.... The pIC50 is 4.1. (8) The target protein (P0A2Y6) has sequence MRYLTAGESHGPRLTAIIEGIPAGLPLTAEDINEDLRRRQGGYGRGGRMKIENDQVVFTSGVRHGKTTGAPITMDVINKDHQKWLDIMSAEDIEDRLKSKRKITHPRPGHADLVGGIKYRFDDLRNSLERSSARETTMRVAVGAVAKRLLAELDMEIANHVVVFGGKEIDVPENLTVAEIKQRAAQSEVSIVNQEREQEIKDYIDQIKRDGDTIGGVVETVVGGVPVGLGSYVQWDRKLDARLAQAVVSINAFKGVEFGLGFEAGYRKGSQVMDEILWSKEDGYTRRTNNLGGFEGGMTNGQPIVVRGVMKPIPTLYKPLMSVDIETHEPYKATVERSDPTALPAAGMVMEAVVATVLAQEILEKFSSDNLEELKEAVAKHRDYTKNY. The drug is CCCOc1ccc(/C=C2/Oc3c(ccc(O)c3O)C2=O)c(O)c1. The pIC50 is 6.3.
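
The task statement defines the label as pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion (not part of the dataset itself), the Python helpers below map between IC50 and pIC50. The function names are hypothetical and chosen for illustration only, and the nanomolar variant assumes the raw IC50 is reported in nM, which the text above does not state.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Return pIC50 = -log10(IC50), with IC50 given in mol/L."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nanomolar_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 in nM (assumed unit): 1 nM = 1e-9 M."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    # -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 (mol/L) = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Example (1) above has pIC50 4.8; print the implied IC50 in micromolar.
    print(pic50_to_ic50_molar(4.8) * 1e6)
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean a more potent compound.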