This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CC1(C)CCCC1CNC(=O)c1ccc(-c2noc(CCC(F)(F)F)n2)cc1. The target protein (O70344) has sequence MAAASSPPRTERKRGGWGRLLGSRRGSASLAKKCPFSLELAEGGPAGGTLYAPVAPPGALSPGSPAPPASPAAPPAGLELGPRPPVSLDPRVSIYSARRPLLARTHIQGRVYNFLERPTGWKCFVYHFAVFLIVLACLIFSVLSTIEQYAALATGTLFWMEIVLVVFFGTEYVVRLWSAGCRSKYVGIWGRLRFARKPISIIDLIVVVASMVVLCVGSKGQVFATSAIRGIRFLQILRMLHVDRQGGTWRLLGSVVFIHRQELITTLYIGFLGLIFSSYFVYLAEKDAVNESGRVEFGSYADALWWGVVTVTTIGYGDKVPQTWVGKTIASCFSVFAISFFALPAGILGSGFALKVQQKQRQKHFNRQIPAAASLIQTAWRCYAAENPDSSTWKIYVRKPARSHTLLSPSPKPKKSAMVRKKKFKPDKDNGVSPGEKMLTVPHITCDPPEERRPDHFSVDGYDSSVRKSPTLLEVSPTHFMRTNSFAEDLDLEGETLLTP.... The pIC50 is 7.6. (2) The small molecule is Cc1c(C(=O)Nc2cc(Cl)cc(Cl)c2)cccc1[N+](=O)[O-]. The target protein sequence is MISKLKPQFMFLPKKHILSYCRKDVLNLFEQKFYYTSKRKESNNMKNESLLRLINYNRYYNKIDSNNYYNGGKILSNDRQYIYSPLCEYKKKINDISSYVSVPFKINIRNLGTSNFVNNKKDVLDNDYIYENIKKEKSKHKKIIFLLFVSLFGLYGFFESYNPEFFLYDIFLKFCLKYIDGEICHDLFLLLGKYNILPYDTSNDSIYACTNIKHLDFINPFGVAAGFDKNGVCIDSILKLGFSFIEIGTITPRGQTGNAKPRIFADVESRSIINSCGFNNMGCDKVTENLILFRKRQEEDKLLSKHIVGVSIGKNKDTVNIVDDLKYCINKIGRYADYIAINVSSPNTPGLRDNQEAGKLKNIILSVKEEIDNLEKNNIMNDESTYNEDNKIVEKKNNFNKNNSHMMKDAKDNFLWFNTTKKKPLVFVKLAPDLNQEQKKEIADVLLETNIDGMIISNTTTQINDIKSFENKKGGVSGAKLKDISTKFICEMYNYTNKQI.... The pIC50 is 6.4. (3) The compound is COc1ncccc1C(=O)Nc1cc2c(cc1OC(C)C)n(C)c(=O)n2C. 
The target protein (P55201) has sequence MGVDFDVKTFCHNLRATKPPYECPVETCRKVYKSYSGIEYHLYHYDHDNPPPPQQTPLRKHKKKGRQSRPANKQSPSPSEVSQSPGREVMSYAQAQRMVEVDLHGRVHRISIFDNLDVVSEDEEAPEEAPENGSNKENTETPAATPKSGKHKNKEKRKDSNHHHHHNVSASTTPKLPEVVYRELEQDTPDAPPRPTSYYRYIEKSAEELDEEVEYDMDEEDYIWLDIMNERRKTEGVSPIPQEIFEYLMDRLEKESYFESHNKGDPNALVDEDAVCCICNDGECQNSNVILFCDMCNLAVHQECYGVPYIPEGQWLCRRCLQSPSRAVDCALCPNKGGAFKQTDDGRWAHVVCALWIPEVCFANTVFLEPIDSIEHIPPARWKLTCYICKQRGSGACIQCHKANCYTAFHVTCAQQAGLYMKMEPVRETGANGTSFSVRKTAYCDIHTPPGSARRLPALSHSEGEEDEDEEEDEGKGWSSEKVKKAKAKSRIKMKKARKI.... The pIC50 is 7.5. (4) The drug is O=C(CCc1cccc(O)c1)N[C@@H](Cc1ccccc1)C(=O)CCl. The target protein (P08311) has sequence MQPLLLLLAFLLPTGAEAGEIIGGRESRPHSRPYMAYLQIQSPAGQSRCGGFLVREDFVLTAAHCWGSNINVTLGAHNIQRRENTQQHITARRAIRHPQYNQRTIQNDIMLLQLSRRVRRNRNVNPVALPRAQEGLRPGTLCTVAGWGRVSMRRGTDTLREVQLRVQRDRQCLRIFGSYDPRRQICVGDRRERKAAFKGDSGGPLLCNNVAHGIVSYGKSSGVPPEVFTRVSSFLPWIRTTMRSFKLLDQMETPL. The pIC50 is 4.2. (5) The drug is CCN(CC)Cc1ccc(C(=O)N(CCc2ccccc2OC)[C@@H]2CCNC2)cc1. The target protein (Q14703) has sequence MKLVNIWLLLLVVLLCGKKHLGDRLEKKSFEKAPCPGCSHLTLKVEFSSTVVEYEYIVAFNGYFTAKARNSFISSALKSSEVDNWRIIPRNNPSSDYPSDFEVIQIKEKQKAGLLTLEDHPNIKRVTPQRKVFRSLKYAESDPTVPCNETRWSQKWQSSRPLRRASLSLGSGFWHATGRHSSRRLLRAIPRQVAQTLQADVLWQMGYTGANVRVAVFDTGLSEKHPHFKNVKERTNWTNERTLDDGLGHGTFVAGVIASMRECQGFAPDAELHIFRVFTNNQVSYTSWFLDAFNYAILKKIDVLNLSIGGPDFMDHPFVDKVWELTANNVIMVSAIGNDGPLYGTLNNPADQMDVIGVGGIDFEDNIARFSSRGMTTWELPGGYGRMKPDIVTYGAGVRGSGVKGGCRALSGTSVASPVVAGAVTLLVSTVQKRELVNPASMKQALIASARRLPGVNMFEQGHGKLDLLRAYQILNSYKPQASLSPSYIDLTECPYMWPY.... The pIC50 is 6.8. (6) The compound is O=C1Nc2ccc(Br)cc2/C1=C\C(=O)c1ccc(F)cc1. The target protein sequence is MENNSTERYIFKPNFLGEGSYGKVYKAYDTILKKEVAIKKMKLNEISNYIDDCGINFVLLREIKIMKEIKHKNIMSALDLYCEKDYINLVMEIMDYDLSKIINRKIFLTDSQKKCILLQILNGLNVLHKYYFMHRDLSPANIFINKKGEVKLADFGLCTKYGYDMYSDKLFRDKYKKNLNLTSKVVTLWYRAPELLLGSNKYNSSIDMWSFGCIFAELLLQKALFPGENEIDQLGKIFFLLGTPNENNWPEALCLPLYTEFTKATKKDFKTYFKIDDDDCIDLLTSFLKLNAHERISAEDAMKHRYFFNDPLPCDISQLPFNDL. 
The pIC50 is 5.8. (7) The compound is CN(Cc1cnc2nc(N)nc(N)c2n1)c1ccc(C(=O)N[C@@H](CCC(=O)O)C(=O)O)cc1. The target protein (P00376) has sequence MVRPLNCIVAVSQNMGIGKNGDLPWPPLRNEFQYFQRMTTVSSVEGKQNLVIMGRKTWFSIPEKNRPLKDRINIVLSRELKEPPKGAHFLAKSLDDALELIEDPELTNKVDVVWIVGGSSVYKEAMNKPGHVRLFVTRIMQEFESDAFFPEIDFEKYKLLPEYPGVPLDVQEEKGIKYKFEVYEKNN. The pIC50 is 8.7. (8) The small molecule is Cc1ccc(C[C@H](N)COc2cncc(-c3ccc4[nH]nc(C)c4c3)c2)cc1F. The target protein sequence is AKPKHRVTMNEFEYLKLLGKGTFGKVILVKEKATGRYYAMKILKKEVIVAKDEVAHTLTENRVLQNSRHPFLTALKYSFQTHDRLCFVMEYANGGELFFHLSRERVFSEDRARFYGAEIVSALDYLHSEKNVVYRDLKLENLMLDKDGHIKITDFGLCKEGIKDGATMKTFCGTPEYLAPEVLEDNDYGRAVDWWGLGVVMYEMMCGRLPFYNQDHEKLFELILMEEIRFPRTLGPEAKALLAGLLKKDPKQRLGGGSEDAKEIMQHRFFAGIVWQHVYEKKLSPPFKPQVTSETDTRYFDEEFTAQMITIDPPDQDDSMECVDSERRPHFPQFDYSASGTA. The pIC50 is 8.6. (9) The compound is c1cc(Nc2ccnc(Nc3cc(N4CCOCC4)cc(N4CCOCC4)c3)n2)c2cn[nH]c2c1. The target protein (P54760) has sequence MELRVLLCWASLAAALEETLLNTKLETADLKWVTFPQVDGQWEELSGLDEEQHSVRTYEVCDVQRAPGQAHWLRTGWVPRRGAVHVYATLRFTMLECLSLPRAGRSCKETFTVFYYESDADTATALTPAWMENPYIKVDTVAAEHLTRKRPGAEATGKVNVKTLRLGPLSKAGFYLAFQDQGACMALLSLHLFYKKCAQLTVNLTRFPETVPRELVVPVAGSCVVDAVPAPGPSPSLYCREDGQWAEQPVTGCSCAPGFEAAEGNTKCRACAQGTFKPLSGEGSCQPCPANSHSNTIGSAVCQCRVGYFRARTDPRGAPCTTPPSAPRSVVSRLNGSSLHLEWSAPLESGGREDLTYALRCRECRPGGSCAPCGGDLTFDPGPRDLVEPWVVVRGLRPDFTYTFEVTALNGVSSLATGPVPFEPVNVTTDREVPPAVSDIRVTRSSPSSLSLAWAVPRAPSGAVLDYEVKYHEKGAEGPSSVRFLKTSENRAELRGLKRG.... The pIC50 is 8.0. (10) The drug is CS(=O)(=O)c1ccc(-c2cnc(NCc3ccco3)n3cnnc23)cc1. 
The target protein sequence is PTGKMPGAPETAPGDGAGASRQRKLEALIRDPRSPINVESLLDGLNSLVLDLDFPALRKNKNIDNFLNRYEKIVKKIRGLQMKAEDYDVVKVIGRGAFGEVQLVRHKASQKVYAMKLLSKFEMIKRSDSAFFWEERDIMAFANSPWVVQLFYAFQDDRYLYMVMEYMPGGDLVNLMSNYDVPEKWAKFYTAEVVLALDAIHSMGLIHRDVKPDNMLLDKHGHLKLADFGTCMKMDETGMVHCDTAVGTPDYISPEVLKSQGGDGFYGRECDWWSVGVFLYEMLVGDTPFYADSLVGTYSKIMDHKNSLCFPEDAEISKHAKNLICAFLTDREVRLGRNGVEEIRQHPFFKNDQWHWDNIRETAAPVVPELSSDIDSSNFDDIEDDKGDVETFPIPKAFVGNQLPFIGFTYYRENLLLSDSPSCRETDSIQSRKNEESQEIQKKLYTLEEHLSNEMQAKEELEQKCKSVNTRLEKTAKELEEEITLRKSVESALRQLEREK.... The pIC50 is 5.0.
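The pIC50 definition stated at the top of this record, pIC50 = -log10(IC50 in M), can be sketched as a small conversion helper. This is a minimal illustration only: the function names, the unit labels, and the default unit of nM are assumptions made for the example and are not part of the dataset or of any BindingDB tooling.

```python
import math

# Factors that convert a concentration in the given unit to molar (M).
# The unit labels are illustrative assumptions, not dataset fields.
_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ic50_to_pic50(ic50, unit="nM"):
    """Return pIC50 = -log10(IC50 in M) for a positive IC50 value."""
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50 * _TO_MOLAR[unit])


def pic50_to_ic50(pic50, unit="nM"):
    """Invert the transform: return IC50 in the requested unit."""
    return 10.0 ** (-pic50) / _TO_MOLAR[unit]
```

Under this definition, a pIC50 of 8.0 corresponds to an IC50 of 10 nM and a pIC50 of 5.0 to 10 uM, so each unit of pIC50 is a tenfold change in potency.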