Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O=C1c2c(O)cccc2C(SCCCS)c2cccc(O)c21. The target protein sequence is MPSYTVTVATGSQWFAGTDDYIYLSLVGSAGCSEKHLLDKPFYNDFERGAVDSYDVTVDEELGDIQLIKIEKRKYWFHDDWYLKYITVKTPCGDYIEFPCYRWISGEGEIVLRDGQAKLACDDQIHVLKQHRRKELETRQKQYRWMEWNPGFPLSIDAKCHKDLPRDIQFDSEKGVDFVLNYSKAMENLFINRFMHMFQSSWSDFADFEKIFVRISNTISERVMNHWQEDRMFGYQFLNGCNPVMIQRCLKLPDNLPVTTEMVECSLERQLTLEQEIEQGNIFIVDFKLLDGIDANKTDPCTLQFLAAPICLLYKNLANKIVPIAIQLNQVPGEENPIFLPSDAKYDWLLAKIWVRSSDFHVHQTITHLLRTHLVSEVFGIAMYRQLPAVHPIFKLLVAHVRFTIAINTKAREQLICEYGLFDKANATGG. The pIC50 is 4.5. (2) The compound is CC(=O)/C=C/C1=CC[C@H]2[C@H](C[C@@H]1C)OC(=O)[C@@H]2C. The target protein (P01120) has sequence MPLNKSNIREYKLVVVGGGGVGKSALTIQLTQSHFVDEYDPTIEDSYRKQVVIDDEVSILDILDTAGQEEYSAMREQYMRNGEGFLLVYSITSKSSLDELMTYYQQILRVKDTDYVPIVVVGNKSDLENEKQVSYQDGLNMAKQMNAPFLETSAKQAINVEEAFYTLARLVRDEGGKYNKTLTENDNSKQTSQDTKGSGANSVPRNSGGHRKMSNAANGKNVNSSTTVVNARNASIESKTGLAGNQATNGKTQTDRTNIDNSTGQAGQANAQSANTVNNRVNNNSKAGQVSNAKQARKQQAAPGGNTSEASKSGSGGCCIIS. The pIC50 is 3.7. (3) The small molecule is CN1CCN(Cc2ccc([C@H]3CC[C@H](Oc4nc(-c5ccncn5)cc(=O)n4C)CC3)cc2)CC1. The target protein sequence is MSGRPRTTSFAESCKPVQQPSAFGSMKVSRDKDGSKVTTVVATPGQGPDRPQEVSYTDTKVIGNGSFGVVYQAKLCDSGELVAIKKVLQDKRFKNRELQIMRKLDHCNIVRLRYFFYSSGEKKDEVYLNLVLDYVPETVYRVARHYSRAKQTLPVIYVKLYMYQLFRSLAYIHSFGICHRDIKPQNLLLDPDTAVLKLCDFGSAKQLVRGEPNVSYICSRYYRAPELIFGATDYTSSIDVWSAGCVLAELLLGQPIFPGDSGVDQLVEIIKVLGTPTREQIREMNPNYTEFKFPQIKAHPWTKVFRPRTPPEAIALCSRLLEYTPTARLTPLEACAHSFFDELRDPNVKLPNGRDTPALFNFTTQELSSNPPLATILIPPHARIQAAASTPSNTTAASDANAGDRGQTNNAASASASDS. The pIC50 is 8.1. (4) The compound is NCCCCN1c2ccccc2Sc2ccc(C(F)(F)F)cc21. 
The target protein (P33302) has sequence MPEAKLNNNVNDVTSYSSASSSTENAADLHNYNGFDEHTEARIQKLARTLTAQSMQNSTQSAPNKSDAQSIFSSGVEGVNPIFSDPEAPGYDPKLDPNSENFSSAAWVKNMAHLSAADPDFYKPYSLGCAWKNLSASGASADVAYQSTVVNIPYKILKSGLRKFQRSKETNTFQILKPMDGCLNPGELLVVLGRPGSGCTTLLKSISSNTHGFDLGADTKISYSGYSGDDIKKHFRGEVVYNAEADVHLPHLTVFETLVTVARLKTPQNRIKGVDRESYANHLAEVAMATYGLSHTRNTKVGNDIVRGVSGGERKRVSIAEVSICGSKFQCWDNATRGLDSATALEFIRALKTQADISNTSATVAIYQCSQDAYDLFNKVCVLDDGYQIYYGPADKAKKYFEDMGYVCPSRQTTADFLTSVTSPSERTLNKDMLKKGIHIPQTPKEMNDYWVKSPNYKELMKEVDQRLLNDDEASREAIKEAHIAKQSKRARPSSPYTVS.... The pIC50 is 5.9. (5) The compound is CNC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CC(C)C)NC(=O)CCS. The target protein (P50282) has sequence MSPWQPLLLVLLALGYSFAAPHQRQPTYVVFPRDLKTSNLTDTQLAEDYLYRYGYTRAAQMMGEKQSLRPALLMLQKQLSLPQTGELDSETLKAIRSPRCGVPDVGKFQTFDGDLKWHHHNITYWIQSYTEDLPRDVIDDSFARAFAVWSAVTPLTFTRVYGLEADIVIQFGVAEHGDGYPFDGKDGLLAHAFPPGPGIQGDAHFDDDELWSLGKGAVVPTYFGNANGAPCHFPFTFEGRSYLSCTTDGRNDGKPWCGTTADYDTDRKYGFCPSENLYTEHGNGDGKPCVFPFIFEGHSYSACTTKGRSDGYRWCATTANYDQDKADGFCPTRADVTVTGGNSAGEMCVFPFVFLGKQYSTCTSEGRSDGRLWCATTSNFDADKKWGFCPDQGYSLFLVAAHEFGHALGLDHSSVPEALMYPMYHYHEDSPLHEDDIKGIHHLYGRGSKPDPRPPATTAAEPQPTAPPTMCSTAPPMAYPTGGPTVAPTGAPSPGPTGPP.... The pIC50 is 5.0.
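
For reference, the pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration, not part of the dataset: the function names are hypothetical, and taking the raw IC50 in nanomolar is an assumption about how the underlying measurement is reported.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 value to pIC50 = -log10(IC50 in M).

    Assumption: the input is in nanomolar (nM), so it is scaled by 1e-9
    to obtain molar units before taking the logarithm.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_nm * 1e-9)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: return the IC50 in nanomolar for a given pIC50."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # A pIC50 of 5.0 corresponds to an IC50 of 10 micromolar (10,000 nM).
    print(round(pic50_to_ic50_nm(5.0), 3))
    # Each unit increase in pIC50 is a tenfold lower IC50, i.e. a more potent compound.
    print(round(ic50_nm_to_pic50(1000.0), 3))
```

Because the scale is logarithmic, the gap between a pIC50 of 3.7 and one of 8.1 in the examples above spans more than four orders of magnitude in IC50.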