Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50 (drug-target binding data from BindingDB using IC50 measurements). (1) The small molecule is Nc1ccc2ccccc2n1. The target protein sequence is MEEKEILWNEAKAFIAACYQELGKAAEVKDRLADIKSEIDLTGSYVHTKEELEHGAKMAWRNSNRCIGRLFWNSLNVIDRRDVRTKEEVRDALFHHIETATNNGKIRPTITIFPPEEKGEKQVEIWNHQLIRYAGYESDGERIGDPASCSLTAACEELGWRGERTDFDLLPLIFRMKGDEQPVWYELPRSLVIEVPITHPDIEAFSDLELKWYGVPIVSDMKLEVGGIHYNAAPFNGWYMGTEIGARNLADEKRYDKLKKVASVIGIAADYNTDLWKDQALVELNKAVLHSYKKQGVSIVDHHTAASQFKRFEEQAEEAGRKLTGDWTWLIPPISPAATHIFHRSYDNSIVKPNYFYQDKPYE. The pIC50 is 4.2. (2) The small molecule is COC(=O)C1C(c2ccc(I)cc2)CC2CCC1N2C. The target protein (P23977) has sequence MSKSKCSVGPMSSVVAPAKESNAVGPREVELILVKEQNGVQLTNSTLINPPQTPVEAQERETWSKKIDFLLSVIGFAVDLANVWRFPYLCYKNGGGAFLVPYLLFMVIAGMPLFYMELALGQFNREGAAGVWKICPVLKGVGFTVILISFYVGFFYNVIIAWALHYFFSSFTMDLPWIHCNNTWNSPNCSDAHASNSSDGLGLNDTFGTTPAAEYFERGVLHLHQSRGIDDLGPPRWQLTACLVLVIVLLYFSLWKGVKTSGKVVWITATMPYVVLTALLLRGVTLPGAMDGIRAYLSVDFYRLCEASVWIDAATQVCFSLGVGFGVLIAFSSYNKFTNNCYRDAIITTSINSLTSFSSGFVVFSFLGYMAQKHNVPIRDVATDGPGLIFIIYPEAIATLPLSSAWAAVFFLMLLTLGIDSAMGGMESVITGLVDEFQLLHRHRELFTLGIVLATFLLSLFCVTNGGIYVFTLLDHFAAGTSILFGVLIEAIGVAWFYGV.... The pIC50 is 7.1. (3) The pIC50 is 9.2. The target protein (P32871) has sequence MPPRPSSGELWGIHLMPPRILVECLLPNGMIVTLECLREATLITIKHELFKEARKYPLHQLLQDESSYIFVSVTQEAEREEFFDETRRLCDLRLFQPFLKVIEPVGNREEKILNREIGFAIGMPVCEFDMVKDPEVQDFRRNILNVCKEAVDLRDLNSPHSRAMYVYPPNVESSPELPKHIYNKLDKGQIIVVIWVIVSPNNDKQKYTLKINHDCVPEQVIAEAIRKKTRSMLLSSEQLKLCVLEYQGKYILKVCGCDEYFLEKYPLSQYKYIRSCIMLGRMPNLMLMAKESLYSQLPMDCFTMPSYSRRISTATPYMNGETSTKSLWVINSALRIKILCATYVNVNIRDIDKIYVRTGIYHGGEPLCDNVNTQRVPCSNPRWNEWLNYDIYIPDLPRAARLCLSICSVKGRKGAKEEHCPLAWGNINLFDYTDTLVSGKMALNLWPVPHGLEDLLNPIGVTGSNPNKETPCLELEFDWFSSVVKFPDMSVIEEHANWSV....
The small molecule is O=c1[nH]c(CN2CCOCC2)nc2c1C1CCCN1C(=S)N2c1ccccc1. (4) The small molecule is CNC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](CSC)NC(=O)CS. The target protein (O88766) has sequence MLHLKTLPFLFFFHTQLATALPVPPEHLEEKNMKTAENYLRKFYHLPSNQFRSARNATMIAEKLKEMQRFFGLPETGKPDAATIEIMEKPRCGVPDSGDFLLTPGSPKWTHTNLTYRIINHTPQMSKAEVKTEIEKAFKIWSVPSTLTFTETLEGEADINIAFVSRDHGDNSPFDGPNGILAHAFQPGRGIGGDAHFDSEETWTQDSKNYNLFLVAAHEFGHSLGLSHSTDPGALMYPNYAYREPSTYSLPQDDINGIQTIYGPSDNPVQPTGPSTPTACDPHLRFDAATTLRGEIYFFKDKYFWRRHPQLRTVDLNFISLFWPFLPNGLQAAYEDFDRDLVFLFKGRQYWALSAYDLQQGYPRDISNYGFPRSVQAIDAAVSYNGKTYFFVNNQCWRYDNQRRSMDPGYPTSIASVFPGINCRIDAVFQQDSFFLFFSGPQYFAFNLVSRRVTRVARSNLWLNCP. The pIC50 is 7.7.
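
The pIC50 definition in the task statement, pIC50 = -log10(IC50 in M), can be sketched in code as follows. This is only an illustration of the unit conversion: the function names are hypothetical, and the assumption that raw IC50 values are expressed in nanomolar is ours, not something the dataset description states.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed unit) to pIC50.

    pIC50 = -log10(IC50 in M). Since 1 nM = 1e-9 M, this equals
    9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: return the IC50 in nanomolar for a pIC50."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # pIC50 labels taken from the four examples above.
    for label in (4.2, 7.1, 9.2, 7.7):
        print(f"pIC50 {label} -> IC50 of {pic50_to_ic50_nm(label):.3g} nM")
```

Under this conversion, the label 4.2 in example (1) corresponds to an IC50 of about 63 µM, while the label 9.2 in example (3) corresponds to about 0.63 nM, which is why a higher pIC50 means a more potent compound.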