From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is COc1cc2c(Oc3ccc(NC(=O)C4(C(=O)Nc5ccc(F)cc5)CC4)cc3F)ccnc2cc1OCCCN1CCOCC1. The target protein sequence is MGTSHPAFLVLGCLLTGLSLILCQLSLPSILPNENEKVVQLNSSFSLRCFGESEVSWQYPMSEEESSDVEIRNEENNSGLFVTVLEVSSASAAHTGLYTCYYNHTQTEENELEGRHIYIYVPDPDVAFVPLGMTDYLVIVEDDDSAIIPCRTTDPETPVTLHNSEGVVPASYDSRQGFNGTFTVGPYICEATVKGKKFQTIPFNVYALKATSELDLEMEALKTVYKSGETIVVTCAVFNNEVVDLQWTYPGEVKGKGITMLEEIKVPSIKLVYTLTVPEATVKDSGDYECAARQATREVKEMKKVTISVHEKGFIEIKPTFSQLEAVNLHEVKHFVVEVRAYPPPRISWLKNNLTLIENLTEITTDVEKIQEIRYRSKLKLIRAKEEDSGHYTIVAQNEDAVKSYTFELLTQVPSSILDLVDDHHGSTGGQTVRCTAEGTPLPDIEWMICKDIKKCNNETSWTILANNVSNIITEIHSRDRSTVEGRVTFAKVEETIAVR.... The pIC50 is 8.2. (2) The small molecule is O=C(O)c1cccc(-n2c(/C=C/c3cccc([N+](=O)[O-])c3)nc3ccccc3c2=O)c1. The target protein (P19490) has sequence MPYIFAFFCTGFLGAVVGANFPNNIQIGGLFPNQQSQEHAAFRFALSQLTEPPKLLPQIDIVNISDSFEMTYRFCSQFSKGVYAIFGFYERRTVNMLTSFCGALHVCFITPSFPVDTSNQFVLQLRPELQEALISIIDHYKWQTFVYIYDADRGLSVLQRVLDTAAEKNWQVTAVNILTTTEEGYRMLFQDLEKKKERLVVVDCESERLNAILGQIVKLEKNGIGYHYILANLGFMDIDLNKFKESGANVTGFQLVNYTDTIPARIMQQWRTSDSRDHTRVDWKRPKYTSALTYDGVKVMAEAFQSLRRQRIDISRRGNAGDCLANPAVPWGQGIDIQRALQQVRFEGLTGNVQFNEKGRRTNYTLHVIEMKHDGIRKIGYWNEDDKFVPAATDAQAGGDNSSVQNRTYIVTTILEDPYVMLKKNANQFEGNDRYEGYCVELAAEIAKHVGYSYRLEIVSDGKYGARDPDTKAWNGMVGELVYGRADVAVAPLTITLVRE.... The pIC50 is 4.5. (3) The drug is Cc1ccc(S(=O)(=O)Oc2ccc(Cl)cc2Cl)cc1. 
The target protein sequence is MAVAKVEPIKIMLKPGKDGPKLRQWPLTKEKIEALKEICEKMEKEGQLEEAPPTNPYNTPTFAIKKKDKNKWRMLIDFRELNKVTQDFTEIQLGIPHPAGLAKKRRITVLDVGDAYFSIPLHEDFRPYTAFTLPSVNNAEPGKRYIYKVLPQGWKGSPAIFQHTMRQVLEPFRKANKDVIIIQYMDDILIASDRTDLEHDRVILQLKELLNGLGFSTPDEKFQKDPPYHWMGYELWPTKWKLQKIQLPQKEIWTVNDIQKLVGVLNWAAQLYPGIKTKHLCRLIRGKMTLTEEVQWTELAEAELEENRIILSQEQEGHYYQEEKELEATVQKDQDNQWTYKIHQEDKILKVGKYAKVKNTHTNGIRLLAQVVQKIGKEALVIWGRIPKFHLPVEREIWEQWWDNYWQVTWIPDWDFVSTPPLVRLAFNLVGDPIPGAETFYTDGSCNRQSKEGKAGYVTDRGKDKVKKLEQTTNQQAELEAFAMALTDSGPKVNIIVDSQ.... The pIC50 is 4.1. (4) The compound is CC1(C)O[C@@H]2O[C@H]3C(CO)=NO[C@H]3[C@@H]2O1. The target protein (P10482) has sequence MDMSFPKGFLWGAATASYQIEGAWNEDGKGESIWDRFTHQKRNILYGHNGDVACDHYHRFEEDVSLMKELGLKAYRFSIAWTRIFPDGFGTVNQKGLEFYDRLINKLVENGIEPVVTLYHWDLPQKLQDIGGWANPEIVNYYFDYAMLVINRYKDKVKKWITFNEPYCIAFLGYFHGIHAPGIKDFKVAMDVVHSLMLSHFKVVKAVKENNIDVEVGITLNLTPVYLQTERLGYKVSEIEREMVSLSSQLDNQLFLDPVLKGSYPQKLLDYLVQKDLLDSQKALSMQQEVKENFIFPDFLGINYYTRAVRLYDENSSWIFPIRWEHPAGEYTEMGWEVFPQGLFDLLIWIKESYPQIPIYITENGAAYNDIVTEDGKVHDSKRIEYLKQHFEAARKAIENGVDLRGYFVWSLMDNFEWAMGYTKRFGIIYVDYETQKRIKKDSFYFYQQYIKENS. The pIC50 is 3.3. (5) The small molecule is Cc1cc(O)n(-c2ccccc2C(=O)OC[C@@H]2CCCN(CCCc3ccccc3)C2)c1O. The target protein (P54131) has sequence MRGSLCLALAASILHVSLQGEFQRKLYKDLVKNYNPLERPVANDSLPLTVYFSLSLLQIMDVDEKNQVLTTNIWLQMTWTDHYLQWNASEYPGVKTVRFPDGQIWKPDILLYNSADERFDATFHTNVLVNSSGHCQYLPPGIFKSSCYIDVRWFPFDVQQCKLKFGSWSYGGWSLDLQMQEADISGYIPNGEWDLVGVLGKRSEKFYECCKEPYPDVTFTVSIRRRTLYYGLNLLIPCVLISALALLVFLLPADSGEKISLGITVLLSLTVFMLLVAEIMPATSDSVPLIAQYFASTMIIVGLSVVVTVIVLQYHHHDPDGGKMPKWTRVVLLNWCAWFLRMKRPGEDKVRPACQHNERRCSLASVEMSAVAGPPATNGNLLYIGFRGLDTMHCAPTPDSGVVCGRVACSPTHDEHLLHAGQPSEGDPDLAKILEEVRYIAHRFRCQDESEAVCSEWKFAACVVDRLCLMAFSVFTILCTIGILMSAPNFVEAVSKDFA. The pIC50 is 3.8. (6) The small molecule is O=C(CCc1ccccc1)N(CCO)Cc1nc2c(c(=O)[nH]1)COCC2. 
The target protein sequence is APEDKEYQSVEEEMQSTIREHRDGGNAGGIFNRYNVIRIQKVVNKKLRERFCHRQKEVSEENHNHHNERMLFHGSPFINAIIHKGFDERHAYIGGMFGAGIYFAENSSKSNQYVYGIGGGTGCPTHKDRSCYICHRQMLFCRVTLGKSFLQFSTMKMAHAPPGHHSVIGRPSVNGLAYAEYVIYRGEQAYPEYLITYQIMKPEAPS. The pIC50 is 4.8.
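The pIC50 definition stated above (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is only an illustrative sketch: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers and are not part of the bindingdb_ic50 dataset or any BindingDB tooling. It assumes the IC50 is given in mol/L, as in the definition.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar.

    IC50 in M is 10**(-pIC50); multiplying by 1e9 gives nM,
    which is the same as 10**(9 - pIC50).
    """
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # pIC50 values taken from examples (1) and (3) above.
    for label, pic50 in [("(1)", 8.2), ("(3)", 4.1)]:
        print(label, "pIC50", pic50, "IC50 in nM:", round(pic50_to_ic50_nm(pic50), 1))
```

Under this definition, the pIC50 of 8.2 in example (1) corresponds to an IC50 of about 6.3 nM, while the 4.1 in example (3) corresponds to roughly 79 µM, which is why a higher pIC50 means a more potent compound.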