This is drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC(C)(C)CC(=O)NCc1nc2cc(C(F)(F)F)cc(C(F)(F)F)c2[nH]1. The target protein (O95180) has sequence MTEGARAADEVRVPLGAPPPGPAALVGASPESPGAPGREAERGSELGVSPSESPAAERGAELGADEEQRVPYPALAATVFFCLGQTTRPRSWCLRLVCNPWFEHVSMLVIMLNCVTLGMFRPCEDVECGSERCNILEAFDAFIFAFFAVEMVIKMVALGLFGQKCYLGDTWNRLDFFIVVAGMMEYSLDGHNVSLSAIRTVRVLRPLRAINRVPSMRILVTLLLDTLPMLGNVLLLCFFVFFIFGIVGVQLWAGLLRNRCFLDSAFVRNNNLTFLRPYYQTEEGEENPFICSSRRDNGMQKCSHIPGRRELRMPCTLGWEAYTQPQAEGVGAARNACINWNQYYNVCRSGDSNPHNGAINFDNIGYAWIAIFQVITLEGWVDIMYYVMDAHSFYNFIYFILLIIVGSFFMINLCLVVIATQFSETKQRESQLMREQRARHLSNDSTLASFSEPGSCYEELLKYVGHIFRKVKRRSLRLYARWQSRWRKKVDPSAVQGQGP.... The pIC50 is 5.2. (2) The small molecule is N#Cc1cnc(C(=O)Nc2ccc(S(N)(=O)=O)cc2C2=CCCCC2)[nH]1. The target protein (P09581) has sequence MELGPPLVLLLATVWHGQGAPVIEPSGPELVVEPGETVTLRCVSNGSVEWDGPISPYWTLDPESPGSTLTTRNATFKNTGTYRCTELEDPMAGSTTIHLYVKDPAHSWNLLAQEVTVVEGQEAVLPCLITDPALKDSVSLMREGGRQVLRKTVYFFSPWRGFIIRKAKVLDSNTYVCKTMVNGRESTSTGIWLKVNRVHPEPPQIKLEPSKLVRIRGEAAQIVCSATNAEVGFNVILKRGDTKLEIPLNSDFQDNYYKKVRALSLNAVDFQDAGIYSCVASNDVGTRTATMNFQVVESAYLNLTSEQSLLQEVSVGDSLILTVHADAYPSIQHYNWTYLGPFFEDQRKLEFITQRAIYRYTFKLFLNRVKASEAGQYFLMAQNKAGWNNLTFELTLRYPPEVSVTWMPVNGSDVLFCDVSGYPQPSVTWMECRGHTDRCDEAQALQVWNDTHPEVLSQKPFDKVIIQSQLPIGTLKHNMTYFCKTHNSVGNSSQYFRAVS.... The pIC50 is 6.8. (3) The drug is N=C(N)NCCC[C@H]1N[C@@H](CNC(=O)c2ccc3ccccc3c2)CCN(Cc2cc(F)cc(Cl)c2)C1=O.
The target protein (P33032) has sequence MNSSFHLHFLDLNLNATEGNLSGPNVKNKSSPCEDMGIAVEVFLTLGVISLLENILVIGAIVKNKNLHSPMYFFVCSLAVADMLVSMSSAWETITIYLLNNKHLVIADAFVRHIDNVFDSMICISVVASMCSLLAIAVDRYVTIFYALRYHHIMTARRSGAIIAGIWAFCTGCGIVFILYSESTYVILCLISMFFAMLFLLVSLYIHMFLLARTHVKRIAALPGASSARQRTSMQGAVTVTMLLGVFTVCWAPFFLHLTLMLSCPQNLYCSRFMSHFNMYLILIMCNSVMDPLIYAFRSQEMRKTFKEIICCRGFRIACSFPRRD. The pIC50 is 6.0. (4) The compound is O=C(CNC(=O)C1CCCCC1C(=O)O)NO. The target protein (P47820) has sequence MGAASGQRGRWPLSPPLLMLSLLLLLLLPPSPAPALDPGLQPGNFSADEAGAQLFADSYNSSAEVVMFQSTAASWAHDTNITEENARLQEEAALINQEFAEVWGKKAKELYESIWQNFTDQKLRRIIGSVQTLGPANLPLTQRLQYNSLLSNMSRIYSTGKVCFPNKTATCWSLDPELTNILASSRNYAKVLFAWEGWHDAVGIPLRPLYQDFTALSNEAYRQDGFSDTGAYWRSWYESPSFEESLEHLYHQVEPLYLNLHAFVRRALHRRYGDKYINLRGPIPAHLLGDMWAQSWENIYDMVVPFPDKPNLDVTSTMVQKGWNATHMFRVAEEFFTSLGLSPMPPEFWAESMLEKPADGREVVCHASAWDFYNRKDFRIKQCTRVTMDQLSTVHHEMGHVQYYLQYKDLHVSLRRGANPGFHEAIGDVLALSVSTPAHLHKIGLLDRVANDIESDINYLLKMALEKIAFLPFGYLVDQWRWGVFSGRTPPSRYNYDWWY.... The pIC50 is 5.8. (5) The target protein (Q9NY47) has sequence MAVPARTCGASRPGPARTARPWPGCGPHPGPGTRRPTSGPPRPLWLLLPLLPLLAAPGASAYSFPQQHTMQHWARRLEQEVDGVMRIFGGVQQLREIYKDNRNLFEVQENEPQKLVEKVAGDIESLLDRKVQALKRLADAAENFQKAHRWQDNIKEEDIVYYDAKADAELDDPESEDVERGSKASTLRLDFIEDPNFKNKVNYSYAAVQIPTDIYKGSTVILNELNWTEALENVFMENRRQDPTLLWQVFGSATGVTRYYPATPWRAPKKIDLYDVRRRPWYIQGASSPKDMVIIVDVSGSVSGLTLKLMKTSVCEMLDTLSDDDYVNVASFNEKAQPVSCFTHLVQANVRNKKVFKEAVQGMVAKGTTGYKAGFEYAFDQLQNSNITRANCNKMIMMFTDGGEDRVQDVFEKYNWPNRTVRVFTFSVGQHNYDVTPLQWMACANKGYYFEIPSIGAIRINTQEYLDVLGRPMVLAGKEAKQVQWTNVYEDALGLGLVVT.... The pIC50 is 5.8. The compound is CCOc1ccc(-n2nc3c(N4CC[C@@H](O)C4)nnc(C)c3c2C)cc1. (6) The drug is Cc1oc2cc3oc(=O)c4c(c3cc2c1C)CCC4. 
The target protein (P30837) has sequence MLRFLAPRLLSLQGRTARYSSAAALPSPILNPDIPYNQLFINNEWQDAVSKKTFPTVNPTTGEVIGHVAEGDRADVDRAVKAAREAFRLGSPWRRMDASERGRLLNLLADLVERDRVYLASLETLDNGKPFQESYALDLDEVIKVYRYFAGWADKWHGKTIPMDGQHFCFTRHEPVGVCGQIIPWNFPLVMQGWKLAPALATGNTVVMKVAEQTPLSALYLASLIKEAGFPPGVVNIITGYGPTAGAAIAQHVDVDKVAFTGSTEVGHLIQKAAGDSNLKRVTLELGGKSPSIVLADADMEHAVEQCHEALFFNMGQCCCAGSRTFVEESIYNEFLERTVEKAKQRKVGNPFELDTQQGPQVDKEQFERVLGYIQLGQKEGAKLLCGGERFGERGFFIKPTVFGGVQDDMRIAKEEIFGPVQPLFKFKKIEEVVERANNTRYGLAAAVFTRDLDKAMYFTQALQAGTVWVNTYNIVTCHTPFGGFKESGNGRELGEDGLK.... The pIC50 is 6.8. (7) The compound is Cc1nc(SCc2cccc(C(=O)O)c2)[nH]c(=O)c1C#N. The target protein (Q8TDX5) has sequence MKIDIHSHILPKEWPDLKKRFGYGGWVQLQHHSKGEAKLLKDGKVFRVVRENCWDPEVRIREMDQKGVTVQALSTVPVMFSYWAKPEDTLNLCQLLNNDLASTVVSYPRRFVGLGTLPMQAPELAVKEMERCVKELGFPGVQIGTHVNEWDLNAQELFPVYAAAERLKCSLFVHPWDMQMDGRMAKYWLPWLVGMPAETTIAICSMIMGGVFEKFPKLKVCFAHGGGAFPFTVGRISHGFSMRPDLCAQDNPMNPKKYLGSFYTDALVHDPLSLKLLTDVIGKDKVILGTDYPFPLGELEPGKLIESMEEFDEETKNKLKAGNALAFLGLERKQFE. The pIC50 is 5.8.
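For readers who want to relate the pIC50 labels above to raw concentrations, the following is a minimal sketch of the stated definition, pIC50 = -log10(IC50 in M), and its inverse. The function names are hypothetical and are not part of BindingDB or of the bindingdb_ic50 dataset; the only assumption is that IC50 is given in molar units, as in the definition.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50 in M).

    Hypothetical helper for illustration; not a BindingDB API.
    """
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from the examples above; higher pIC50 means a lower IC50.
    for label in (5.2, 5.8, 6.0, 6.8):
        ic50_um = ic50_molar_from_pic50(label) * 1e6  # M to micromolar
        print(f"pIC50 {label:.1f} corresponds to IC50 of about {ic50_um:.2f} uM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why a pIC50 of 6.8 indicates a more potent compound than a pIC50 of 5.8.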