This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCc1nc(C(N)=O)c(Nc2ccc(N3CCC(N4CCN(C)CC4)CC3)c(OC)c2)nc1NC1CCOCC1. The target protein sequence is MDGFAGSLDDSISAASTSDVQDRLSALESRVQQQEDEITVLKAALADVLRRLAISEDHVASVKKSVSSKGQPSPRAVIPMSCITNGSGANRKPSHTSAVSIAGKETLSSAAKSGTEKKKEKPQGQREKKEESHSNDQSPQIRASPSPQPSSQPLQIHRQTPESKNATPTKSIKRPSPAEKSHNSWENSDDSRNKLSKIPSTPKLIPKVTKTADKHKDVIINQAKMSTREKNSQVYRRKHQELQAMQMELQSPEYKLSKLRTSTIMTDYNPNYCFAGKTSSISDLKEVPRKNITLIRGLGHGAFGEVYEGQVSGMPNDPSPLQVAVKTLPEVCSEQDELDFLMEALIISKFNHQNIVRCIGVSLQSLPRFILLELMAGGDLKSFLRETRPRPSQPSSLAMLDLLHVARDIACGCQYLEENHFIHRDIAARNCLLTCPGPGRVAKIGDFGMARDIYRASYYRKGGCAMLPVKWMPPEAFMEGIFTSKTDTWSFGVLLWEIFS.... The pIC50 is 8.8. (2) The small molecule is Cc1ccc(F)c(Oc2ccc(S(=O)(=O)Nc3ccc(F)cn3)cc2C#N)c1F. The target protein (Q96S37) has sequence MAFSELLDLVGGLGRFQVLQTMALMVSIMWLCTQSMLENFSAAVPSHRCWAPLLDNSTAQASILGSLSPEALLAISIPPGPNQRPHQCRRFRQPQWQLLDPNATATSWSEADTEPCVDGWVYDRSIFTSTIVAKWNLVCDSHALKPMAQSIYLAGILVGAAACGPASDRFGRRLVLTWSYLQMAVMGTAAAFAPAFPVYCLFRFLLAFAVAGVMMNTGTLLMEWTAARARPLVMTLNSLGFSFGHGLTAAVAYGVRDWTLLQLVVSVPFFLCFLYSWWLAESARWLLTTGRLDWGLQELWRVAAINGKGAVQDTLTPEVLLSAMREELSMGQPPASLGTLLRMPGLRFRTCISTLCWFAFGFTFFGLALDLQALGSNIFLLQMFIGVVDIPAKMGALLLLSHLGRRPTLAASLLLAGLCILANTLVPHEMGALRSALAVLGLGGVGAAFTCITIYSSELFPTVLRMTAVGLGQMAARGGAILGPLVRLLGVHGPWLPLLV.... The pIC50 is 7.0. (3) The compound is CCCC[C@H](NC(=O)OC(C(C)C)C(C)C)C(=O)C(=O)Nc1ccnn1C1CCC1. 
The target protein (O35186) has sequence MWVFKFLLLPVVSFALSPEETLDTQWELWKKTHGKQYNSKVDEISRRLIWEKNLKKISVHNLEASLGAHTYELAMNHLGDMTSEEVVQKMTGLRVPPSRSFSNDTLYTPEWEGRVPDSIDYRKKGYVTPVKNQGQCGSCWAFSSAGALEGQLKKKTGKLLALSPQNLVDCVSENYGCGGGYMTTAFQYVQQNGGIDSEDAYPYVGQDESCMYNATAKAAKCRGYREIPVGNEKALKRAVARVGPVSVSIDASLTSFQFYSRGVYYDENCDRDNVNHAVLVVGYGTQKGNKYWIIKNSWGESWGNKGYVLLARNKNNACGITNLASFPKM. The pIC50 is 6.0. (4) The small molecule is O=C1CCC2(CO2)c2c1nnn2CC1CCCCC1. The target protein (P50579) has sequence MAGVEEVAASGSHLNGDLDPDDREEGAASTAEEAAKKKRRKKKKSKGPSAAGEQEPDKESGASVDEVARQLERSALEDKERDEDDEDGDGDGDGATGKKKKKKKKKRGPKVQTDPPSVPICDLYPNGVFPKGQECEYPPTQDGRTAAWRTTSEEKKALDQASEEIWNDFREAAEAHRQVRKYVMSWIKPGMTMIEICEKLEDCSRKLIKENGLNAGLAFPTGCSLNNCAAHYTPNAGDTTVLQYDDICKIDFGTHISGRIIDCAFTVTFNPKYDTLLKAVKDATNTGIKCAGIDVRLCDVGEAIQEVMESYEVEIDGKTYQVKPIRNLNGHSIGQYRIHAGKTVPIVKGGEATRMEEGEVYAIETFGSTGKGVVHDDMECSHYMKNFDVGHVPIRLPRTKHLLNVINENFGTLAFCRRWLDRLGESKYLMALKNLCDLGIVDPYPPLCDIKGSYTAQFEHTILLRPTCKEVVSRGDDY. The pIC50 is 4.3. (5) The pIC50 is 5.9. The compound is Nc1ncnc2c1c(-c1ccccc1)nn2-c1ccccc1. The target protein (Q8KRU5) has sequence MTSRYRSSEAHQGLASFSPRRRTVVKAAAATAVLAGPLAAALPARATTGTPAFLHGVASGDPLPDGVLLWTRVTPTADATPGSGLGPDTEVGWTVATDKAFTNVVAKGSTTATAASDHTVKADIRGLAPATDHWFRFSAGGTDSPAGRARTAPAADAAVAGLRFGVVSCANWEAGYFAAYRHLAARGDLDAWLHLGDYIYEYGAGEYGTRGTSVRSHAPAHEILTLADYRVRHGRYKTDPDLQALHAAAPVVAIWDDHEIANDTWSGGAENHTEGVEGAWAARQAAAKQAYFEWMPVRPAIAGTTYRRLRFGKLADLSLLDLRSFRAQQVSLGDGDVDDPDRTLTGRAQLDWLKAGLKSSDTTWRLVGNSVMIAPFAIGSLSAELLKPLAKLLGLPQEGLAVNTDQWDGYTDDRRELLAHLRSNAIRNTVFLTGDIHMAWANDVPVNAGTYPLSASAATEFVVTSVTSDNLDDLVKVPEGTVSALASPVIRAANRHVHWV.... (6) The drug is CCCCCCc1c(C(=O)CCCC(=O)O)c2ccccc2n1C. 
The target protein (Q8TDS5) has sequence MLCHRGGQLIVPIIPLCPEHSCRGRRLQNLLSGPWPKQPMELHNLSSPSPSLSSSVLPPSFSPSPSSAPSAFTTVGGSSGGPCHPTSSSLVSAFLAPILALEFVLGLVGNSLALFIFCIHTRPWTSNTVFLVSLVAADFLLISNLPLRVDYYLLHETWRFGAAACKVNLFMLSTNRTASVVFLTAIALNRYLKVVQPHHVLSRASVGAAARVAGGLWVGILLLNGHLLLSTFSGPSCLSYRVGTKPSASLRWHQALYLLEFFLPLALILFAIVSIGLTIRNRGLGGQAGPQRAMRVLAMVVAVYTICFLPSIIFGMASMVAFWLSACRSLDLCTQLFHGSLAFTYLNSVLDPVLYCFSSPNFLHQSRALLGLTRGRQGPVSDESSYQPSRQWRYREASRKAEAIGKLKVQGEVSLEKEGSSQG. The pIC50 is 5.2. (7) The target protein (P18054) has sequence MGRYRIRVATGAWLFSGSYNRVQLWLVGTRGEAELELQLRPARGEEEEFDHDVAEDLGLLQFVRLRKHHWLVDDAWFCDRITVQGPGACAEVAFPCYRWVQGEDILSLPEGTARLPGDNALDMFQKHREKELKDRQQIYCWATWKEGLPLTIAADRKDDLPPNMRFHEEKRLDFEWTLKAGALEMALKRVYTLLSSWNCLEDFDQIFWGQKSALAEKVRQCWQDDELFSYQFLNGANPMLLRRSTSLPSRLVLPSGMEELQAQLEKELQNGSLFEADFILLDGIPANVIRGEKQYLAAPLVMLKMEPNGKLQPMVIQIQPPNPSSPTPTLFLPSDPPLAWLLAKSWVRNSDFQLHEIQYHLLNTHLVAEVIAVATMRCLPGLHPIFKFLIPHIRYTMEINTRARTQLISDGGIFDKAVSTGGGGHVQLLRRAAAQLTYCSLCPPDDLADRGLLGLPGALYAHDALRLWEIIARYVEGIVHLFYQRDDIVKGDPELQAWCR.... The drug is C[C@@H](Cc1cc(O)c(O)cc1Cl)[C@H](C)Cc1cc(O)c(O)cc1Cl. The pIC50 is 6.4. (8) The small molecule is CC(=O)Oc1ccc(C=Cc2ccccc2)cc1. The target protein (P11344) has sequence MFLAVLYCLLWSFQISDGHFPRACASSKNLLAKECCPPWMGDGSPCGQLSGRGSCQDILLSSAPSGPQFPFKGVDDRESWPSVFYNRTCQCSGNFMGFNCGNCKFGFGGPNCTEKRVLIRRNIFDLSVSEKNKFFSYLTLAKHTISSVYVIPTGTYGQMNNGSTPMFNDINIYDLFVWMHYYVSRDTLLGGSEIWRDIDFAHEAPGFLPWHRLFLLLWEQEIRELTGDENFTVPYWDWRDAENCDICTDEYLGGRHPENPNLLSPASFFSSWQIICSRSEEYNSHQVLCDGTPEGPLLRNPGNHDKAKTPRLPSSADVEFCLSLTQYESGSMDRTANFSFRNTLEGFASPLTGIADPSQSSMHNALHIFMNGTMSQVQGSANDPIFLLHHAFVDSIFEQWLRRHRPLLEVYPEANAPIGHNRDSYMVPFIPLYRNGDFFITSKDLGYDYSYLQESDPGFYRNYIEPYLEQASRIWPWLLGAALVGAVIAAALSGLSSRLC.... The pIC50 is 3.3. (9) The small molecule is O=C(O)c1cnn(-c2nc(N3CCCCC3)c3ccc(-c4ccccc4)cc3n2)c1. 
The target protein sequence is KNPFSTGDTDLDLEMLAPYIPMDDDFQLRSFDQLSNGQTKPLPALKLALEYIVPCMNKHGICVVDDFLGKETGQQIGDEVRALHDTGKFTDGQLVSQKSDSSKDIRGDKITWIEGKEPGCETIGLLMSSMDDLIRHCNGKLGSYKINGRTKAMVACYPGNGTGYVRHVDNPNGDGRCVTCIYYLNKDWDAKVSGGILRIFPEGKAQFADIEPKFDRLLFFWSDRRNPHEVQPAYATRYAITVWYFDADERARAKVKYLTGEKGVRVELNKPSDSVGKDVF. The pIC50 is 6.1.
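
The pIC50 definition given at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration, not part of the dataset: the function names are hypothetical, and the use of nanomolar as the input unit is an assumption.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in M).

    Assumption: the input is in nM; since 1 nM = 1e-9 M,
    -log10(ic50_nm * 1e-9) simplifies to 9 - log10(ic50_nm).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition: IC50 in nM = 10 ** (9 - pIC50)."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # Higher pIC50 means a lower IC50, i.e. a more potent compound.
    for pic50 in (8.8, 7.0, 3.3):
        print(f"pIC50 {pic50} corresponds to IC50 of about {pic50_to_ic50_nm(pic50):.3g} nM")
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50: an example labeled 8.8 is in the low-nanomolar range, while one labeled 3.3 is in the high-micromolar range.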