Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50 (drug-target binding data from BindingDB using IC50 measurements). (1) The small molecule is O=C1c2cccc3cc([N+](=O)[O-])cc(c23)C(=O)N1NCCO. The target protein (P35739) has sequence MLRGQRHGQLGWHRPAAGLGGLVTSLMLACACAASCRETCCPVGPSGLRCTRAGTLNTLRGLRGAGNLTELYVENQRDLQRLEFEDLQGLGELRSLTIVKSGLRFVAPDAFHFTPRLSHLNLSSNALESLSWKTVQGLSLQDLTLSGNPLHCSCALLWLQRWEQEDLCGVYTQKLQGSGSGDQFLPLGHNNSCGVPSVKIQMPNDSVEVGDDVFLQCQVEGQALQQADWILTELEGTATMKKSGDLPSLGLTLVNVTSDLNKKNVTCWAENDVGRAEVSVQVSVSFPASVHLGKAVEQHHWCIPFSVDGQPAPSLRWFFNGSVLNETSFIFTQFLESALTNETMRHGCLRLNQPTHVNNGNYTLLAANPYGQAAASIMAAFMDNPFEFNPEDPIPVSFSPVDTNSTSRDPVEKKDETPFGVSVAVGLAVSAALFLSALLLVLNKCGQRSKFGINRPAVLAPEDGLAMSLHFMTLGGSSLSPTEGKGSGLQGHIMENPQYF.... The pIC50 is 5.2. (2) The target protein (Q2FYS5) has sequence MNKQNNYSDDSIQVLEGLEAVRKRPGMYIGSTDKRGLHHLVYEIVDNSVDEVLNGYGNEIDVTINKDGSISIEDNGRGMPTGIHKSGKPTVEVIFTVLHAGGKFGQGGYKTSGGLHGVGASVVNALSEWLEVEIHRDGNIYHQSFKNGGSPSSGLVKKGKTKKTGTKVTFKPDDTIFKASTSFNFDVLSERLQESAFLLKNLKITLNDLRSGKERQEHYHYEEGIKEFVSYVNEGKEVLHDVATFSGEANGIEVDVAFQYNDQYSESILSFVNNVRTKDGGTHEVGFKTAMTRVFNDYARRINELKTKDKNLDGNDIREGLTAVVSVRIPEELLQFEGQTKSKLGTSEARSAVDSVVADKLPFYLEEKGQLSKSLVKKAIKAQQAREAARKAREDARSGKKNKRKDTLLSGKLTPAQSKNTEKNELYLVEGDSAGGSAKLGRDRKFQAILPLRGKVINTEKARLEDIFKNEEINTIIHTIGAGVGTDFKIEDSNYNRVII.... The drug is CCn1c(=O)ccc2c(-c3cnc(-c4cccnc4)s3)n[nH]c21. The pIC50 is 6.1. (3) The drug is O=C(Nc1ccc(Oc2ccccc2)cc1)c1ccc2nn[nH]c2c1O. The target protein (O42772) has sequence MALRLATRRFAPIAFRRGMATTIEHTKEPISATAEALSASRPPIKETKTSTVKEPQMDADAKTKTFHIYRWNPDQPTDKPRMQSYTLDLNKTGPMMLDALIRIKNEVDPTLTFRRSCREGICGSCAMNIDGVNTLACLCRIPTDTAKETRIYPLPHTYVVKDLVPDMTQFYKQYKSIKPYLQRDTAPPDGKENRQSVADRKKLDGLYECILCACCSTSCPSYWWNSEEYLGPAVLLQSYRWINDSRDEKTAQRKDALNNSMSLYRCHTILNCSRTCPKGLNPALAIAEIKKSMAFTG. The pIC50 is 4.5. 
(4) The small molecule is CC(C)(C)OC(=O)N[C@@H](CC(=O)OCc1ccccc1)C(=O)O[C@@H]1CO[C@@H]2[C@H](OCc3ccccc3)CO[C@H]12. The target protein (P49862) has sequence MARSLLLPLQILLLSLALETAGEEAQGDKIIDGAPCARGSHPWQVALLSGNQLHCGGVLVNERWVLTAAHCKMNEYTVHLGSDTLGDRRAQRIKASKSFRHPGYSTQTHVNDLMLVKLNSQARLSSMVKKVRLPSRCEPPGTTCTVSGWGTTTSPDVTFPSDLMCVDVKLISPQDCTKVYKDLLENSMLCAGIPDSKKNACNGDSGGPLVCRGTLQGLVSWGTFPCGQPNDPGVYTQVCKFTKWINDTMKKHR. The pIC50 is 5.0. (5) The drug is CC(=O)c1csc(C(=O)CCl)c1. The target is XTSFAESXKPVQQPSAFGS. The pIC50 is 5.1. (6) The drug is O=c1oc2cc(OCc3ccccc3)ccc2c2ccccc12. The target protein (P27338) has sequence MSNKCDVVVVGGGISGMAAAKLLHDSGLNVVVLEARDRVGGRTYTLRNQKVKYVDLGGSYVGPTQNRILRLAKELGLETYKVNEVERLIHHVKGKSYPFRGPFPPVWNPITYLDHNNFWRTMDDMGREIPSDAPWKAPLAEEWDNMTMKELLDKLCWTESAKQLATLFVNLCVTAETHEVSALWFLWYVKQCGGTTRIISTTNGGQERKFVGGSGQVSERIMDLLGDRVKLERPVIYIDQTRENVLVETLNHEMYEAKYVISAIPPTLGMKIHFNPPLPMMRNQMITRVPLGSVIKCIVYYKEPFWRKKDYCGTMIIDGEEAPVAYTLDDTKPEGNYAAIMGFILAHKARKLARLTKEERLKKLCELYAKVLGSLEALEPVHYEEKNWCEEQYSGGCYTTYFPPGILTQYGRVLRQPVDRIYFAGTETATHWSGYMEGAVEAGERAAREILHAMGKIPEDEIWQSEPESVDVPAQPITTTFLERHLPSVPGLLRLIGLTT.... The pIC50 is 8.9. (7) The small molecule is N=C(N)NCCC[C@@H](NC(=O)[C@@H](CCCNC(=N)N)NC(=O)[C@@H](CCCNC(=N)N)NC(=O)[C@@H](CCCNC(=N)N)NC(=O)CCCCCNC(=O)[C@H]1OC(n2cnc3c(N)ncnc32)[C@H](O)[C@@H]1O)C(=O)O. The target protein sequence is MRRGGAGAPPDLGSVLGHTTPNLRDLYALGRKLGQGQFGTTYLCTELATGIDYACKSISKRKLITKEDVDDVRREIQIMHHLSGHKNVVAIKGAYEDQVYVHIVMELCAGGELFDRIIQRGHYSERKAAALTRIIVGVVEACHSLGVMHRDLKPENFLLANRDDDLSLKAIDFGLSVFFKPGQVFTDVVGSPYYVAPEVLLKSYGPAADVWTAGVILYILLSGVPPFWAETQQGIFDAVLKGAIDFDSDPWPVISDSAKDLIRRMLNPRPAERLTAHEVLCHPWIRDHGVAPDRPLDPAVLSRIKQFSAMNKLKKMALRVIAESLSEEEIAGLKEMFQTMDTDNSGAITYDELKEGLRKYGSTLKDTEIRDLMDAADIDNSGTIDYIEFIAATLHLNKLEREEHLVAAFSYFDKDGSGYITVDELQLACKEHNMPDAFLDDVINEADQDNDGRIDYGEFVAMMTKGNMGVGRRTMRNSLNISMRDDLVCSET. The pIC50 is 4.3. (8) The compound is COc1ccc(NC(=O)Nc2ccc(/C=C/c3cc(O)cc(O)c3)cc2)cc1. 
The target protein (P53341) has sequence MTISDHPETEPKWWKEATIYQIYPASFKDSNNDGWGDLKGITSKLQYIKDLGVDAIWVCPFYDSPQQDMGYDISNYEKVWPTYGTNEDCFELIDKTHKLGMKFITDLVINHCSTEHEWFKESRSSKTNPKRDWFFWRPPKGYDAEGKPIPPNNWKSFFGGSAWTFDETTNEFYLRLFASRQVDLNWENEDCRRAIFESAVGFWLDHGVDGFRIDTAGLYSKRPGLPDSPIFDKTSKLQHPNWGSHNGPRIHEYHQELHRFMKNRVKDGREIMTVGEVAHGSDNALYTSAARYEVSEVFSFTHVEVGTSPFFRYNIVPFTLKQWKEAIASNFLFINGTDSWATTYIENHDQARSITRFADDSPKYRKISGKLLTLLECSLTGTLYVYQGQEIGQINFKEWPIEKYEDVDVKNNYEIIKKSFGKNSKEMKDFFKGIALLSRDHSRTPMPWTKDKPNAGFTGPDVKPWFLLNESFEQGINVEQESRDDDSVLNFWKRALQARK.... The pIC50 is 4.0.
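The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration only; the function names ic50_to_pic50, ic50_nm_to_pic50, and pic50_to_ic50 are hypothetical and not part of the dataset, and the nanomolar helper assumes the raw IC50 is reported in nM, which the records above do not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nanomolar: float) -> float:
    """Same conversion for an IC50 given in nM (assumed unit; 1 nM = 1e-9 M)."""
    return ic50_to_pic50(ic50_nanomolar * 1e-9)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A higher pIC50 corresponds to a lower IC50, i.e. a more potent compound.
    for label in (4.0, 5.2, 8.9):
        print(f"pIC50 {label} -> IC50 {pic50_to_ic50(label):.3e} M")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50.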