This dataset contains drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The label is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CC[C@H](C)[C@@H]1NC(=O)[C@H](Cc2ccc(O)cc2)NC(=O)[C@@H](N)CSSC[C@@H](C(=O)N2Cc3ccccc3C[C@@H]2C(=O)N[C@@H](CC(C)C)C(=O)NCC(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCC(N)=O)NC1=O. The target protein (P30559) has sequence MEGALAANWSAEAANASAAPPGAEGNRTAGPPRRNEALARVEVAVLCLILLLALSGNACVLLALRTTRQKHSRLFFFMKHLSIADLVVAVFQVLPQLLWDITFRFYGPDLLCRLVKYLQVVGMFASTYLLLLMSLDRCLAICQPLRSLRRRTDRLAVLATWLGCLVASAPQVHIFSLREVADGVFDCWAVFIQPWGPKAYITWITLAVYIVPVIVLAACYGLISFKIWQNLRLKTAAAAAAEAPEGAAAGDGGRVALARVSSVKLISKAKIRTVKMTFIIVLAFIVCWTPFFFVQMWSVWDANAPKEASAFIIVMLLASLNSCCNPWIYMLFTGHLFHELVQRFLCCSASYLKGRRLGETSASKKSNSSSFVLSHRSSSQRSCSQPSTA. The pIC50 is 6.1. (2) The small molecule is O=S1(=O)CCC2(CCC[C@@]3(S(=O)(=O)c4ccc(Cl)cc4)c4c(F)ccc(F)c4OC[C@@H]23)CC1. The target protein sequence is DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVV. The pIC50 is 7.2. (3) The compound is Cc1ncc(CCP(=O)(O)O)c(CO)c1O. The target protein (P60487) has sequence MARCERLRGAALRDVLGQAQGVLFDCDGVLWNGERIVPGAPELLQRLARAGKNTLFVSNNSRRARPELALRFARLGFAGLRAEQLFSSALCAARLLRQRLSGPPDASGAVFVLGGEGLRAELRAAGLRLAGDPGEDPRVRAVLVGYDEQFSFSRLTEACAHLRDPDCLLVATDRDPWHPLSDGSRTPGTGSLAAAVETASGRQALVVGKPSPYMFQCITEDFSVDPARTLMVGDRLETDILFGHRCGMTTVLTLTGVSSLEEAQAYLTAGQRDLVPHYYVESIADLMEGLED. The pIC50 is 2.9. (4) The small molecule is CC(C)[C@@H](NS(=O)(=O)c1ccc(-c2cccc(O)c2)cc1)C(=O)O. 
The target protein (O75173) has sequence MSQTGSHPGRGLAGRWLWGAQPCLLLPIVPLSWLVWLLLLLLASLLPSARLASPLPREEEIVFPEKLNGSVLPGSGAPARLLCRLQAFGETLLLELEQDSGVQVEGLTVQYLGQAPELLGGAEPGTYLTGTINGDPESVASLHWDGGALLGVLQYRGAELHLQPLEGGTPNSAGGPGAHILRRKSPASGQGPMCNVKAPLGSPSPRPRRAKRFASLSRFVETLVVADDKMAAFHGAGLKRYLLTVMAAAAKAFKHPSIRNPVSLVVTRLVILGSGEEGPQVGPSAAQTLRSFCAWQRGLNTPEDSDPDHFDTAILFTRQDLCGVSTCDTLGMADVGTVCDPARSCAIVEDDGLQSAFTAAHELGHVFNMLHDNSKPCISLNGPLSTSRHVMAPVMAHVDPEEPWSPCSARFITDFLDNGYGHCLLDKPEAPLHLPVTFPGKDYDADRQCQLTFGPDSRHCPQLPPPCAALWCSGHLNGHAMCQTKHSPWADGTPCGPAQA.... The pIC50 is 5.0. (5) The small molecule is CC(C)[C@H](NC(=O)C[C@H](C)C1CCCCC1)C(=O)N1CCCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(N)=O)[C@@H](C)O. The target protein (O00220) has sequence MAPPPARVHLGAFLAVTPNPGSAASGTEAAAATPSKVWGSSAGRIEPRGGGRGALPTSMGQHGPSARARAGRAPGPRPAREASPRLRVHKTFKFVVVGVLLQVVPSSAATIKLHDQSIGTQQWEHSPLGELCPPGSHRSEHPGACNRCTEGVGYTNASNNLFACLPCTACKSDEEERSPCTTTRNTACQCKPGTFRNDNSAEMCRKCSRGCPRGMVKVKDCTPWSDIECVHKESGNGHNIWVILVVTLVVPLLLVAVLIVCCCIGSGCGGDPKCMDRVCFWRLGLLRGPGAEDNAHNEILSNADSLSTFVSEQQMESQEPADLTGVTVQSPGEAQCLLGPAEAEGSQRRRLLVPANGADPTETLMLFFDKFANIVPFDSWDQLMRQLDLTKNEIDVVRAGTAGPGDALYAMLMKWVNKTGRNASIHTLLDALERMEERHAREKIQDLLVDSGKFIYLEDGTGSAVSLE. The pIC50 is 5.5.
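
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal sketch. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced here for illustration; they are not part of BindingDB or of the dataset itself.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar.

    IC50 in M is 10**(-pIC50); multiplying by 1e9 gives nM,
    which is the same as 10**(9 - pIC50).
    """
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # The pIC50 labels of the five records above, in order.
    for label in (6.1, 7.2, 2.9, 5.0, 5.5):
        print(f"pIC50 {label:.1f} corresponds to IC50 of {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, one pIC50 unit is a tenfold change in IC50: the label 6.1 in record (1) corresponds to an IC50 of roughly 794 nM, while the label 5.0 in record (4) corresponds to 10,000 nM (10 µM).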