Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (Q9Y6T7) has sequence MTNQEKWAHLSPSEFSQLQKYAEYSTKKLKDVLEEFHGNGVLAKYNPEGKQDILNQTIDFEGFKLFMKTFLEAELPDDFTAHLFMSFSNKFPHSSPMVKSKPALLSGGLRMNKGAITPPRTTSPANTCSPEVIHLKDIVCYLSLLERGRPEDKLEFMFRLYDTDGNGFLDSSELENIISQMMHVAEYLEWDVTELNPILHEMMEEIDYDHDGTVSLEEWIQGGMTTIPLLVLLGLENNVKDDGQHVWRLKHFNKPAYCNLCLNMLIGVGKQGLCCSFCKYTVHERCVARAPPSCIKTYVKSKRNTDVMHHYWVEGNCPTKCDKCHKTVKCYQGLTGLHCVWCQITLHNKCASHLKPECDCGPLKDHILPPTTICPVVLQTLPTSGVSVPEERQSTVKKEKSGSQQPNKVIDKNKMQRANSVTVDGQGLQVTPVPGTHPLLVFVNPKSGGKQGERIYRKFQYLLNPRQVYSLSGNGPMPGLNFFRDVPDFRVLACGGDGTV.... The pIC50 is 4.3. The drug is O=C(c1cc([C@H]2CCCN2c2cc(F)cc(F)c2)c2oc(N3CCOCC3)cc(=O)c2c1)N1CCOCC1. (2) The small molecule is Cc1ccc2c(c1)[C@@H](O)CCO2. The target protein (O75795) has sequence MSLKWMSVFLLMQLSCYFSSGSCGKVLVWPTEYSHWINMKTILEELVQRGHEVIVLTSSASILVNASKSSAIKLEVYPTSLTKNDLEDFFMKMFDRWTYSISKNTFWSYFSQLQELCWEYSDYNIKLCEDAVLNKKLMRKLQESKFDVLLADAVNPCGELLAELLNIPFLYSLRFSVGYTVEKNGGGFLFPPSYVPVVMSELSDQMIFMERIKNMIYMLYFDFWFQAYDLKKWDQFYSEVLGRPTTLFETMGKAEMWLIRTYWDFEFPRPFLPNVDFVGGLHCKPAKPLPKEMEEFVQSSGENGIVVFSLGSMISNMSEESANMIASALAQIPQKVLWRFDGKKPNTLGSNTRLYKWLPQNDLLGHPKTKAFITHGGTNGIYEAIYHGIPMVGIPLFADQHDNIAHMKAKGAALSVDIRTMSSRDLLNALKSVINDPIYKENIMKLSRIHHDQPVKPLDRAVFWIEFVMRHKGAKHLRVAAHNLTWIQYHSLDVIAFLLA.... The pIC50 is 3.5. (3) The compound is CC(C)COc1ccc(Cc2cnc(N)nc2N)cc1. The target is TRQARRNRRRRWRERQR. The pIC50 is 4.1. (4) The drug is Cc1cc(OCc2ccc(-c3ccccc3-c3nnn[nH]3)cc2)c2cccc(C)c2n1. 
The target protein (Q9WV26) has sequence MILNSSTEDGIKRIQDDCPKAGRHSYIFVMIPTLYSIIFVVGIFGNSLVVIVIYFYMKLKTVASVFLLNLALADICFLLTLPLWAVYTAMEYRWPFGNYLCKIASASVSFNLYASVFLLTCLSIDRYLAIVHPMKSRLRRTMLVAKVTCVIIWLMAGLASLPAVIHRNVFFIENTNITVCAFHYESQNSTLPIGLGLTKNILGFMFPFLIILTSYTLIWKALKKAYEIQKNKPRNDDIFKIIMAIVLFFFFSWVPHQIFTFLDVLIQLGIIHDCKISDIVDTAMPITICIAYFNNCLNPLFYGFLGKKFKKYFLQLLKYIPPKAKSHSTLSTKMSTLSYRPSDNVSSSAKKPVQCFEVE. The pIC50 is 6.5. (5) The small molecule is CCCCN1C[C@@H](O)[C@H](O)[C@H]1CO. The target protein sequence is MGTGSLAPGVRAGGGNTGWLWMSSCNLGLPVLSISFLIWLLLAAPGAQAAGYKTCPTTKPGMLNVHLLPHTHDDVGWLKTVDQYYYGIMSDVQHASVQYILDSVIYSLLNDPTRRFIYVEMAFFSRWWKQQTNVTQDAVRNLVRQGRLEFVNGGWVMNDEAATHYGAIVDQMTLGLRFLQDTFGSDGLPRVAWHIDPFGHSREQASLFAQMGFDGFFLGRIDYQDKFNRKRKLKMEELWRASASLKPPAADLFTGVLPNNYNPPKDLCWDVLCTDPPVVDDPTSPEFNANKLVDYFLNLASSQKKYYRTNHTVMTMGSDFQYENANMWFKNMDKLIRLVNEQQANGSKVHVLYSTPSCYLWELNKANLTWTVKEDDFFPYADGPHMFWTGYFSSRPALKRYERLSYNFLQVCNQLEALVGPEAKVGPYGSGDSAPLNEAMAVLQHHDAVTGTARQNVVNDYAKQLAAGWGPCEVLVSNALARLSLYKQNFSFCREINISI.... The pIC50 is 4.3. (6) The drug is COc1ccc(OCCCCCCCC(=O)O)cc1Cc1cnc(N)nc1N. The target protein sequence is MQKPVCLVVAMTPKRGIGINNGLPWPHLTTDFKHFSRVTKTTPEEASRLNGWLPRKFAKTGDSGLPSPSVGKRFNAVVMGRKTWESMPRKFRPLVDRLNIVVSSSLKEEDIAAEKPQAEGQQRVRVCASLPAALSLLEEEYKDSVDQIFVVGGAGLYEAALSLGVASHLYITRVAREFPCDVFFPAFPGDDILSNKSTAAQAAAPAESVFVPFCPELGREKDNEATYRPIFISKTFSDNGVPYDFVVLEKRRKTDDAATAEPSNAMSSLTSTRETTPVHGLQAPSSAAAIAPVLAWMDEEDRKKREQKELIRAVPHVHFRGHEEFQYLDLIADIINNGRTMDDRTGVGVISKFGCTMRYSLDQAFPLLTTKRVFWKGVLEELLWFIRGDTNANHLSEKGVKIWDKNVTREFLDSRNLPHREVGDIGPGYGFQWRHFGAAYKDMHTDYTGQGVDQLKNVIQMLRTNPTDRRMLMTAWNPAALDEMALPPCHLLCQFYVNDQ.... The pIC50 is 6.4.
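For reference, the pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration, not part of the dataset: the function names and the choice of nanomolar as the input unit are assumptions made for the example.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed input unit) to pIC50.

    pIC50 = -log10(IC50 in mol/L), so the value is first converted to molar.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: recover the IC50 in mol/L from a pIC50 value."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A 1 micromolar (1000 nM) IC50 corresponds to a pIC50 of about 6.
    print(round(ic50_nm_to_pic50(1000.0), 3))
    # A pIC50 of 4.3, as in examples (1) and (5), is an IC50 of roughly 50 micromolar.
    print(pic50_to_ic50_molar(4.3))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values indicate more potent binding.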