This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is C/C(=C\Cn1oc(=O)[nH]c1=O)c1cccc(OCc2nc(-c3ccc(C(F)(F)F)cc3)oc2-c2ccccc2)c1. The target protein (P20417) has sequence MEMEKEFEQIDKAGNWAAIYQDIRHEASDFPCRIAKLPKNKNRNRYRDVSPFDHSRIKLHQEDNDYINASLIKMEEAQRSYILTQGPLPNTCGHFWEMVWEQKSRGVVMLNRIMEKGSLKCAQYWPQKEEKEMVFDDTNLKLTLISEDVKSYYTVRQLELENLATQEAREILHFHYTTWPDFGVPESPASFLNFLFKVRESGSLSPEHGPIVVHCSAGIGRSGTFCLADTCLLLMDKRKDPSSVDIKKVLLEMRRFRMGLIQTADQLRFSYLAVIEGAKFIMGDSSVQDQWKELSHEDLEPPPEHVPPPPRPPKRTLEPHNGKCKELFSNHQWVSEESCEDEDILAREESRAPSIAVHSMSSMSQDTEVRKRMVGGGLQSAQASVPTEEELSPTEEEQKAHRPVHWKPFLVNVCMATALATGAYLCYRVCFH. The pIC50 is 6.5. (2) The small molecule is CNc1ccc(/C=C/c2c(F)cccc2Cl)cc1. The target protein sequence is AKHLFTSESVSEGHPDKIADQISDAVLDAILEQDPKARVACETYVKTGMVLVGGEITTSAWVDIEEITRNTVREIGYVHSDMGFDANSCAVLSAIGKQSPDINQGVDRADPLEQGAGDQGLMFGYATNETDVLMPAPITYAHRLVQRQAEVRKNGTLPWLRPDAKSQVTFQYDDGKIVGIDAVVLSTQHSEEIDQKSLQEAVMEEIIKPILPAEWLTSATKFFINPTGRFVIGGPMGDCGLTGRKIIVDTYGGMARHGGGAFSGKDPSKVDRSAAYAARYVAKNIVAAGLADRCEIQVSYAIGVAEPTSIMVETFGTEKVPSEQLTLLVREFFDLRPYGLIQMLDLLHPIYKETAAYGHFGREHFPWEKTDKAQLLRDAAGLK. The pIC50 is 5.7. (3) The small molecule is CCc1ccc(-c2cn3nc(C)c(C)nc3n2)cc1. The target protein sequence is MNEGAPGDSDLETEARVPWSIMGHCLRTGQARMSATPTPAGEGARRDELFGILQILHQCILSSGDAFVLTGVCCSWRQNGKPPYSQKEDKEVQTGYMNAQIEIIPCKICGDKSSGIHYGVITCEGCKGFFRRSQQSNATYSCPRQKNCLIDRTSRNRCQHCRLQKCLAVGMSRDAVKFGRMSKKQRDSLYAEVQKHRMQQQQRDHQQQPGEAEPLTPTYNISANGLTELHDDLSNYIDGHTPEGSKADSAVSSFYLDIQPSPDQSGLDINGIKPEPICDYTPASGFFPYCSFTNGETSPTVSMAELEHLAQNISKSHLETCQYLREELQQITWQTFLQEEIENYQNKQREVMWQLCAIKITEAIQYVVEFAKRIDGFMELCQNDQIVLLKAGSLEVVFIRMCRAFDSQNNTVYFDGKYASPDVFKSLGCEDFISFVFEFGKSLCSMHLTEDEIALFSAFVLMSADRSWLQEKVKIEKLQQKIQLALQHVLQKNHREDGIL.... 
The pIC50 is 5.8. (4) The small molecule is O=C(O)c1cc(-c2ccccc2)c2ccc(OCc3ccccc3)cc2c1. The target protein (Q9ESG6) has sequence MNNSTTTDPPNQPCSWNTLITKQIIPVLYGMVFITGLLLNGISGWIFFYVPSSKSFIIYLKNIVVADFLMGLTFPFKVLGDSGLGPWQVNVFVCRVSAVIFYVNMYVSIVFFGLISFDRYYKIVKPLLTSIVQSVNYSKLLSVLVWMLMLLLAVPNIILTNQGVKEVTKIQCMELKNELGRKWHKASNYIFVSIFWVVFLLLIVFYTAITRKIFKSHLKSRKNSTSVKRKSSRNIFSIVLVFVVCFVPYHIARIPYTKSQTEGHYSCRTKETLLYAKEFTLLLSAANVCLDPIIYFFLCQPFREVLNKKLHMSLKVQNDLEVSKTKRENAIHESTDTL. The pIC50 is 5.5. (5) The drug is CC(C)CCC[C@@H](C)[C@H]1CCC2C3CC=C4C[C@@H](OC(O)CO)CC[C@]4(C)C3CC[C@@]21C. The target protein (P09884) has sequence MAPVHGDDSLSDSGSFVSSRARREKKSKKGRQEALERLKKAKAGEKYKYEVEDFTGVYEEVDEEQYSKLVQARQDDDWIVDDDGIGYVEDGREIFDDDLEDDALDADEKGKDGKARNKDKRNVKKLAVTKPNNIKSMFIACAGKKTADKAVDLSKDGLLGDILQDLNTETPQITPPPVMILKKKRSIGASPNPFSVHTATAVPSGKIASPVSRKEPPLTPVPLKRAEFAGDDVQVESTEEEQESGAMEFEDGDFDEPMEVEEVDLEPMAAKAWDKESEPAEEVKQEADSGKGTVSYLGSFLPDVSCWDIDQEGDSSFSVQEVQVDSSHLPLVKGADEEQVFHFYWLDAYEDQYNQPGVVFLFGKVWIESAETHVSCCVMVKNIERTLYFLPREMKIDLNTGKETGTPISMKDVYEEFDEKIATKYKIMKFKSKPVEKNYAFEIPDVPEKSEYLEVKYSAEMPQLPQDLKGETFSHVFGTNTSSLELFLMNRKIKGPCWLE.... The pIC50 is 3.3. (6) The compound is CC(C)(C)Sc1ccc([N+](=O)[O-])c2nonc12. The pIC50 is 6.5. The target protein (P09211) has sequence MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTVETWQEGSLKASCLYGQLPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ.
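
The pIC50 definition stated above (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: the function names `pic50_from_ic50` and `ic50_nm_from_pic50` and the unit table are hypothetical helpers, and the sketch assumes the IC50 is given as a positive number in one of the listed molar units.

```python
import math

# Hypothetical helper table: power-of-ten exponent of each unit relative to molar (M).
UNIT_EXPONENT = {"M": 0, "mM": -3, "uM": -6, "nM": -9, "pM": -12}


def pic50_from_ic50(ic50, unit="nM"):
    """Return pIC50 = -log10(IC50 in M) for a positive IC50 in the given unit."""
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    if unit not in UNIT_EXPONENT:
        raise ValueError(f"unknown unit: {unit}")
    # log10(ic50 * 10**e) = log10(ic50) + e, which avoids a lossy multiplication.
    return -(math.log10(ic50) + UNIT_EXPONENT[unit])


def ic50_nm_from_pic50(pic50):
    """Invert the definition: IC50 in nM = 10**(9 - pIC50)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # pIC50 values quoted in the examples above, converted back to nM.
    for value in (6.5, 5.7, 5.8, 5.5, 3.3):
        print(f"pIC50 {value} -> IC50 {ic50_nm_from_pic50(value):.1f} nM")
```

Under this definition a pIC50 of 6.5 corresponds to an IC50 of roughly 316 nM, and each additional pIC50 unit means a tenfold lower IC50, which is why higher values indicate a more potent compound.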