From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O=c1c2ccccc2[se]n1-c1ccccc1. The target protein (Q4QLK7) has sequence MDKFRVYGQSRLSGSVNISGAKNAALPILFAAILATEPVKLTNVPELKDIETTLNILRQLGVIANRDETGAVLLDASNINHFTAPYELVKTMRASIWALAPLVARFHQAQVSLPGGCSIGARPVDLHISGLEKLGADIVLEEGYVKAQVSDRLVGTRIVIEKVSVGATLSIMMAATLAKGTTVIENAAREPEIVDTADFLNKMGAKITGAGSDHITTEGVERLTGCEHSIVPDRIETGTFLIAAAISGGRVVCQNTKADTLDAVIDKLREAGAQVDVTENSITLDMLGNRPKAVNIRTAPHPGFPTDMQAQFTLLNMVAEGTSIITETIFENRFMHIPELIRMGGKAEIEGNTAVCHGVEQLSGTEVIATDLRASISLVLAGCIATGETIVDRIYHIDRGYEHIEDKLRALGAKIERFSRSDEA. The pIC50 is 7.0. (2) The compound is N[C@H]1Cc2ccccc2N(O)C1=O. The target protein (Q6YP21) has sequence MFLAQRSLCSLSGRAKFLKTISSSKILGFSTSAKMSLKFTNAKRIEGLDSNVWIEFTKLAADPSVVNLGQGFPDISPPTYVKEELSKIAAIDSLNQYTRGFGHPSLVKALSYLYEKLYQKQIDSNKEILVTVGAYGSLFNTIQALIDEGDEVILIVPFYDCYEPMVRMAGATPVFIPLRSKPVYGKRWSSSDWTLDPQELESKFNSKTKAIILNTPHNPLGKVYNREELQVIADLCIKYDTLCISDEVYEWLVYSGNKHLKIATFPGMWERTITIGSAGKTFSVTGWKLGWSIGPNHLIKHLQTVQQNTIYTCATPLQEALAQAFWIDIKRMDDPECYFNSLPKELEVKRDRMVRLLESVGLKPIVPDGGYFIIADVSLLDPDLSDMKNNEPYDYKFVKWMTKHKKLSAIPVSAFCNSETKSQFEKFVRFCFIKKDSTLDAAEEIIKAWSVQKS. The pIC50 is 5.0. (3) The target protein (P12938) has sequence MELLAGTGLWPMAIFTVIFILLVDLMHRRQRWTSRYPPGPVPWPVLGNLLQVDLCNMPYSMYKLQNRYGDVFSLQMGWKPVVVINGLKAVQELLVTCGEDTADRPEMPIFQHIGYGHKAKGVVLAPYGPEWREQRRFSVSTLRNFGVGKKSLEQWVTDEASHLCDALTAEAGRPLDPYTLLNKAVCNVIASLIYARRFDYGDPDFIKVLKILKESMGEQTGLFPEVLNMFPVLLRIPGLADKVFPGQKTFLTMVDNLVTEHKKTWDPDQPPRDLTDAFLAEIEKAKGNPESSFNDANLRLVVNDLFGAGMVTTSITLTWALLLMILHPDVQCRVQQEIDEVIGQVRHPEMADQAHMPFTNAVIHEVQRFADIVPMNLPHKTSRDIEVQGFLIPKGTTLIPNLSSVLKDETVWEKPLRFHPEHFLDAQGNFVKHEAFMPFSAGRRACLGEPLARMELFLFFTCLLQRFSFSVPTGQPRPSDYGVFAFLLSPSPYQLCAFKR.... The drug is COc1ccc2c(CN(C)C)cc(=O)oc2c1. 
The pIC50 is 3.1. (4) The drug is CCc1c(C(=O)C(N)=O)c2c(OCC(=O)O)cccc2n1Cc1ccccc1. The target protein (P31482) has sequence MKVLLLLAASIMAFGSIQVQGNIAQFGEMIRLKTGKRAELSYAFYGCHCGLGGKGSPKDATDRCCVTHDCCYKSLEKSGCGTKLLKYKYSHQGGQITCSANQNSCQKRLCQCDKAAAECFARNKKTYSLKYQFYPNMFCKGKKPKC. The pIC50 is 7.2. (5) The small molecule is CNc1nc2cc(F)ccc2n1-c1nc2c(c(C3(S(C)(=O)=O)CCC3)n1)OC[C@@H]1COCCN21. The target protein sequence is LAEFMEHSDKGPLPLRDDNGIVLLGERAAKCRAYAKALHYKELEFQKGPTPAILESLISINNKLQQPEAAAGVLEYAMKHFGELEIQATWYEKLHEWEDALVAYDKKMDTNKDDPELMLGRMRCLEALGEWGQLHQQCCEKWTLVNDETQAKMARMAAAAAWGLGQWDSMEEYTCMIPRDTHDGAFYRAVLALHQDLFSLAQQCIDKARDLLDAELTAMAGESYSRAYGAMVSCHMLSELEEVIQYKLVPERREIIRQIWWERLQGCQRIVEDWQKILMVRSLVVSPHEDMRTWLKYASLCGKSGRLALAHKTLVLLLGVDPSRQLDHPLPTVHPQVTYAYMKNMWKSARKIDAFQHMQHFVQTMQQQAQHAIATEDQQHKQELHKLMARCFLKLGEWQLNLQGINESTIPKVLQYYSAATEHDRSWYKAWHAWAVMNFEAVLHYKHQNQARDEKKKLRHASGANITNATTAATTAATATTTASTEGSNSESEAESTENS.... The pIC50 is 5.2. (6) The compound is COc1ccc(-c2cc(NC(=O)NCC(=O)O)c(C(=O)O)s2)cc1. The target protein sequence is MGNPILAGLGFSLPKRQVSNHDLVGRINTSDEFIVERTGVRTRYHVEPEQAVSALMVPAARQAIEAAGLLPEDIDLLLVNTLSPDHHDPSQACLIQPLLGLRHIPVLDIRAQCSGLLYGLQMARGQILAGLARHVLVVCGEVLSKRMDCSDRGRNLSILLGDGAGAVVVSAGESLDDGLLDLRLGADGNYFDLLMTAAPGSASPTFLDENVLREGGGEFLMRGRPMFEHASQTLVRIAGEMLAAHELTLDDIDHVICHQPNLRILDAVQEQLGIPQHKFAVTVDRLGNMASASTPVTLAMFWPDIQPGQRVLVLTYGSGATWGAALYRKPEEVNRPC. The pIC50 is 5.2. (7) The drug is N[C@H](CCc1ns[nH]c1=O)C(=O)O. The target protein (Q6P6R0) has sequence MSHEKSFLVSGDSYPPPNPGYPVGPQAPMPPYVQPPYPGAPYPQAAFQPSPYGQPGYPHGPGPYPQGGYPQGPYPQGGYPQGPYPQSPFPPNPYGQPPPFQDPGSPQHGNYQEEGPPSYYDNQDFPSVNWDKSIRQAFIRKVFLVLTLQLSVTLSTVAIFTFVGEVKGFVRANVWTYYVSYAIFFISLIVLSCCGDFRRKHPWNLVALSILTISLSYMVGMIASFYNTEAVIMAVGITTAVCFTVVIFSMQTRYDFTSCMGVLLVSVVVLFIFAILCIFIRNRILEIVYASLGALLFTCFLAVDTQLLLGNKQLSLSPEEYVFAALNLYTDIINIFLYILTIIGRAKE. The pIC50 is 3.5.
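
The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration only, assuming the IC50 is supplied in molar units; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers and are not part of BindingDB or the bindingdb_ic50 dataset.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # The logarithm is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # Label values taken from examples (1) and (2) in this record.
    for label in (7.0, 5.0):
        print(f"pIC50 {label} corresponds to IC50 of {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, each unit of pIC50 is a tenfold change in IC50, so the label 7.0 in example (1) corresponds to an IC50 of 100 nM and the label 5.0 in example (2) to 10 µM.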