KinasePro

Kinase Chemistry – Just a year and a half behind the times.

Monday night OT

Posted by kinasepro on December 5, 2006

There’s been a fair amount of talk over the last little while on the topic of chemoinformatics and chemblogs. Here’s my two cents.

smiles inchi Aldrich
smiles inchi ChemExper
smiles inchi The PDB
smiles inchi Chemdraw (until v10)
smiles inchi The entire pharmaceutical industry.
smiles inchi Peter Murray Rust
smiles inchi IUPAC

So somehow a couple of librarians have convinced Google that InChI > SMILES. Result? Google may well do InChI, but no one but the librarians is currently using it, and meanwhile Google doesn’t index SMILES very well. I’m reminded of a day when it was thought to be a good idea to put the CAS#s of new entities at the bottom of ACS journal articles. Don’t worry, we survived those librarians too.

Lookit, we don’t need a string of XML code that you need an advanced degree to use. We don’t need people telling us to tag our blog posts; we need an integrated solution. We need something that can draw structures and present them attractively in an index-friendly HTML format. Near term: get Google to index picture descriptions, and code a Firefox plugin that can insert SMILES into said descriptions.
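
To make the near-term idea a bit more concrete, here is a rough Python sketch of what "inserting SMILES into picture descriptions" could look like: it writes the SMILES string into the alt and title text of an image tag. The function name `structure_img_tag` and the file name `ethanol.png` are made up for this example, and whether any search engine would actually index these attributes is an assumption, not something shown here.

```python
from html import escape


def structure_img_tag(image_url: str, smiles: str) -> str:
    """Return an <img> tag that carries the SMILES string in its alt and title text."""
    # escape(..., quote=True) makes the values safe inside double-quoted attributes
    # (reaction SMILES, for example, contain '>' characters).
    safe_url = escape(image_url, quote=True)
    safe_smiles = escape(smiles, quote=True)
    return f'<img src="{safe_url}" alt="SMILES: {safe_smiles}" title="{safe_smiles}">'


if __name__ == "__main__":
    # "CCO" is the SMILES string for ethanol; the file name is hypothetical.
    print(structure_img_tag("ethanol.png", "CCO"))
```

A plugin along these lines would only need the drawn structure's SMILES and the image location; everything else stays invisible to the person writing the post.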

Till Google has a SMILES substructure search, I’m not going to bother.

14 Responses to “Monday night OT”

  1. Hey – i know i really got it together but two entries in the blogroll – you are too kind.
    Couldn’t agree more with the sentiment – not only does my ancient version of ChemDraw not support this exotic format, but I have enuff hassle in my life without learning some obscure new coding system.

  2. kinasepro said

    fixed the roll thanks.

  3. Paul said

    I could not agree more about the need for an integrated solution! I got a really thoughtful response from Peter Murray-Rust and friends, and I feel kind of bad about not acting on it, but putting random InChI designations at the bottom of all our blog posts doesn’t seem worth it to me. I think that CML is indeed the future, and I look forward to the day of being able to download a CML plugin for WordPress that will take care of everything for us lazy bloggers.

  4. Chris said

    The argument against SMILES seems to be that they are not an open format and that a single molecule can be represented by multiple SMILES strings. For my part, I can read and write SMILES (and SMARTS and SMIRKS). I find InChI impenetrable, and I don’t think there is a syntax for substructure or similarity queries. In addition, I don’t think there is a system for describing reactions.

    I’ve started to add SMILES to my web pages in the hope that someone will build an index at some point. I guess it would help if there was a SMILES tag?

  5. kinasepro said

    InChI and CML may well be the future, and no one will embrace it more than me, but SMILES is the present. For people working in the field not to understand that boggles the mind!

    I’ve experimented on this site a little with SMILES. For instance, a Google search for the following string brings you here:

    O=C(C2=CN=C(NC3=NC(C)=NC(N4CCN(CCO)CC4)=C3)S2)NC1=C(C)C=CC=C1Cl

    Of note, I’m not the only one with that string on the web! Maybe that’s an important compound? Sadly, Google indexed that page under my SRC tag rather than as a standalone page. Put that together with the fact that SMILES strings are not substructure searchable via Google, and it’s clear to me that Google is not ready to be a chemistry informatics platform. It’s sad really, because it doesn’t seem to me that it would be that difficult for them to make SMILES strings substructure searchable via the same algorithm the PDB, Relibase, Aldrich and everybody else is using.

  6. […] Kinasepro has blogged about discussions of new chemoinformatics technology (specifically CML (Chemical Markup Language) and InChI (chemical identifier)). Here’s the post and some correspondence. It’s basically about the introduction of new technology. Obviously I’m not neutral but I will try to discuss it in a neutral manner. For that reason I have copied it more or less in full. There’s been a fair amount of talk [ChemBark] over the last little while on the topic of chemoinformatics and chemblogs. Here’s my two cents.smiles inchi Aldrich smiles inchi ChemExper smiles inchi The PDB smiles inchi Chemdraw (until v10) smiles inchi The entire pharmaceutical industry. smiles inchi Peter Murray Rust smiles inchi IUPAC […]

  7. I have tried to address these points in my blog:

    PDB Update


    and I hope this is an objective and dispassionate analysis. I have no intention of telling anyone to do anything – simply pointing out opportunities

    P.

  8. Egon said

    How to tag things in your blog…

    Chris, I wrote up how you can tag SMILES (or CAS numbers or InChIs):

    http://chem-bla-ics.blogspot.com/2006/12/including-smiles-cml-and-inchi-in.html

    How to get an InChI…

    A good source of InChIs is PubChem. That is a quick and easy way to get the InChIs for your molecules of interest. Alternatively, just use your favorite chemical editor, save the structure and let OpenBabel 2.0 create an InChI.

    Why use InChIs…

    Recently someone blogged about open access journals having a higher impact. While googling for InChIs is not commonplace, this will increasingly become the method to find information on a certain molecule. Adding InChIs to your blog has the advantage that they would show up in such searches, and will increase the impact of your blog.

    My promise…

    I appreciate that using CML and InChIs is more difficult than just cut-n-pasting SMILES, but I promise to answer all questions asked about these issues in my blog, under this blog item:

    http://chem-bla-ics.blogspot.com/2006/12/including-smiles-cml-and-inchi-in.html

  9. Hi KinasePro,

    Nice blog!

    You are absolutely correct that “integrated solutions” are needed. Expecting chemists and other scientists to be manually pasting CML, InChIs, or even SMILES or molfiles into their blogs, reports, and other content has never worked and will never work. The best solution would eliminate the need for an end-user to even know about these technologies.

    A few months ago, I wrote about the need for “invisible” solutions to the problem you describe:

    http://depth-first.com/articles/2006/09/13/the-chemically-aware-web-are-we-there-yet

    The technical details are open to debate, but we’ve got to stop expecting scientists to become information technology experts.

    cheers,
    Rich

  10. […] Egon on SMILES InChI CML and RSS on Planet Blue Obelisk Egon Willighagen (chemblaics) blogged on: Including SMILES, CML and InChI in blogs I agree with everything Egon says and add comments. (Incidentally WordPress and Planet remove the microformats so please read his original for the correct syntax) The blogs ChemBark and KinasePro, have been some discussions on the use of SMILES, CML and InChI in Chemical Blogspace (with 70 chemistry blogs now!). Chemists seem to prefer SMILES over InChI, while there is interest in moving towards CML too. Peter commented. […]

  11. kinasepro said

    /em *nods dispassionately

  12. I’ve just published a small Greasemonkey script which addresses some of the issues discussed here. Using proper markup, the script will recognize the chemistry in the HTML page, and automatically link to PubChem and Google.

    Read it at:

    http://chem-bla-ics.blogspot.com/2006/12/smiles-cas-and-inchi-in-blogs.html

  13. With the release of InChIKey this might be possible under the vision outlined at http://www.chemspider.com/blog/?p=133

  14. […] if you like. There is a very good description of the syntax here and the KinasePro blog has a short comment on how many people use SMILES vs InChi. The bottom line is that more people use SMILES, but it […]
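
Comments 8 and 12 above describe tagging SMILES in a page and letting a script recognize the markup and link out to a search. A minimal Python sketch of that idea might look like the following. It is not the Greasemonkey script itself: the class name `chem:smiles` and the names `SmilesSpanFinder`, `google_exact_search_url` and `find_smiles_links` are assumptions made for illustration (see Egon's post for the actual markup), and the PubChem link is left out.

```python
from html.parser import HTMLParser
from urllib.parse import quote_plus


class SmilesSpanFinder(HTMLParser):
    """Collect the text of <span class="chem:smiles"> elements (class name assumed)."""

    def __init__(self, class_name: str = "chem:smiles"):
        super().__init__()
        self.class_name = class_name
        self.found = []      # SMILES strings collected so far
        self._depth = 0      # > 0 while inside a matching span
        self._buffer = []    # text pieces of the span being read

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._depth:
            self._depth += 1  # a nested span inside a matching one
        elif self.class_name in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "span" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.found.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def google_exact_search_url(smiles: str) -> str:
    """Build a Google URL that searches for the SMILES string as a quoted phrase."""
    # quote_plus percent-encodes characters such as '=', '(' and '"'.
    return "https://www.google.com/search?q=" + quote_plus(f'"{smiles}"')


def find_smiles_links(html_text: str) -> dict:
    """Map every tagged SMILES string in the page to a Google search URL."""
    finder = SmilesSpanFinder()
    finder.feed(html_text)
    finder.close()
    return {smiles: google_exact_search_url(smiles) for smiles in finder.found}


if __name__ == "__main__":
    page = '<p>Ethanol: <span class="chem:smiles">CCO</span></p>'
    for smiles, url in find_smiles_links(page).items():
        print(smiles, url)
```

Note that this is only an exact-string lookup. It does nothing about the two complaints raised in the thread: different SMILES strings for the same molecule will not match each other, and there is no substructure search.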
