Async Google Spellcheck API Adaptor for TinyMCE
February 25, 2013 2 Comments
I recently added TinyMCE to a project to provide a stripped-down rich text editor with bold, italic and underline capability. I discovered that the spell check functionality requires either a client-side plugin for IE or a server-side JSON-RPC implementation called by TinyMCE via Ajax. Unfortunately, the only server-side implementations provided by the TinyMCE project are in PHP, and my project is in ASP.NET MVC 4.
Looking at the PHP implementations, one option is to adapt the Google Spellcheck API — which I didn’t even know existed. Basically this API allows you to post an XML document that contains a list of space-delimited words and get back a document which defines the substrings that are misspelled.
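To make the offset arithmetic concrete, here is a minimal JavaScript sketch built on the sample request and response quoted in the code comments further down. Each c element carries an offset (o) and a length (l) into the text that was submitted. The function name extractMisspelled is my own invention for illustration, and the regex is only good enough for this simple sample; real code should use a proper XML parser.

```javascript
// Hypothetical helper (not part of any Google or TinyMCE API): pulls the
// (offset, length) pairs out of a spellresult document and slices the
// matching substrings from the text that was originally submitted.
function extractMisspelled(text, responseXml) {
  const words = [];
  // Matches <c o="..." l="..." ...>; assumes o comes before l, as in the sample.
  const pattern = /<c\s+o="(\d+)"\s+l="(\d+)"[^>]*>/g;
  let match;
  while ((match = pattern.exec(responseXml)) !== null) {
    const offset = Number(match[1]);
    const length = Number(match[2]);
    words.push(text.substring(offset, offset + length));
  }
  return words;
}

// Sample data taken from the comments in the GoogleSpell class.
const text = "Ths is a tst";
const responseXml =
  '<?xml version="1.0" encoding="UTF-8"?>' +
  '<spellresult error="0" clipped="0" charschecked="12">' +
  '<c o="0" l="3" s="1">This\tTh\'s\tThus\tTh\tHS</c>' +
  '<c o="9" l="3" s="1">test\ttat\tST\tSt\tst</c>' +
  '</spellresult>';

console.log(extractMisspelled(text, responseXml)); // [ 'Ths', 'tst' ]
```

The suggestions inside each c element are tab-delimited, which is why the C# code below splits on '\t'.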
Using some examples of how the API works on the Google side, I was able to throw together a Google Spellcheck API client class that uses the new async/await pattern in C# so that it doesn’t block while waiting for its result.
    using System;
    using System.IO;
    using System.Text;
    using System.Net;
    using System.Xml;
    using System.Threading.Tasks;
    using System.Collections.Generic;
    using System.Diagnostics;

    namespace WolfeReiter.Web.Utility
    {
        /*
         * http post to http://www.google.com/tbproxy/spell?lang=en&hl=en
         *
         * Google spellcheck API request looks like this.
         *
         * <?xml version="1.0" encoding="utf-8" ?>
         * <spellrequest textalreadyclipped="0" ignoredups="0" ignoredigits="1" ignoreallcaps="1">
         * <text>Ths is a tst</text>
         * </spellrequest>
         *
         * The response looks like ...
         *
         * <?xml version="1.0" encoding="UTF-8"?>
         * <spellresult error="0" clipped="0" charschecked="12">
         * <c o="0" l="3" s="1">This Th's Thus Th HS</c>
         * <c o="9" l="3" s="1">test tat ST St st</c>
         * </spellresult>
         */
        public class GoogleSpell
        {
            const string GOOGLE_REQUEST_TEMPLATE = "<?xml version=\"1.0\" encoding=\"utf-8\" ?><spellrequest textalreadyclipped=\"0\" ignoredups=\"0\" ignoredigits=\"1\" ignoreallcaps=\"1\"><text>{0}</text></spellrequest>";

            public async Task<IEnumerable<string>> SpellcheckAsync(string lang, IEnumerable<string> wordList)
            {
                //convert list of words to space-delimited string.
                var words = string.Join(" ", wordList);
                var result = (await QueryGoogleAsync(lang, words));

                var doc = new XmlDocument();
                doc.LoadXml(result);

                // Build misspelled word list
                var misspelledWords = new List<string>();
                foreach (XmlNode node in doc.SelectNodes("//c"))
                {
                    var cElm = (XmlElement)node;
                    //google sends back bad word positions to slice out of original data we sent.
                    try
                    {
                        var badword = words.Substring(Convert.ToInt32(cElm.GetAttribute("o")), Convert.ToInt32(cElm.GetAttribute("l")));
                        misspelledWords.Add(badword);
                    }
                    catch (ArgumentOutOfRangeException e)
                    {
                        Trace.WriteLine(e);
                        Debug.WriteLine(e);
                    }
                }
                return misspelledWords;
            }

            public async Task<IEnumerable<string>> SuggestionsAsync(string lang, string word)
            {
                var result = (await QueryGoogleAsync(lang, word));

                // Parse XML result
                var doc = new XmlDocument();
                doc.LoadXml(result);

                // Build suggestion list (suggestions are tab-delimited inside each <c> element)
                var suggestions = new List<string>();
                foreach (XmlNode node in doc.SelectNodes("//c"))
                {
                    var element = (XmlElement)node;
                    if (!string.IsNullOrWhiteSpace(element.InnerText))
                    {
                        foreach (var suggestion in element.InnerText.Split('\t'))
                        {
                            if (!string.IsNullOrEmpty(suggestion))
                            {
                                suggestions.Add(suggestion);
                            }
                        }
                    }
                }
                return suggestions;
            }

            async Task<string> QueryGoogleAsync(string lang, string data)
            {
                var scheme = "https";
                var server = "www.google.com";
                var port = 443;
                var path = "/tbproxy/spell";
                //both lang and hl carry the language code, as in the URL in the header comment
                var query = string.Format("?lang={0}&hl={0}", lang);
                var uriBuilder = new UriBuilder(scheme, server, port, path, query);

                string xml = string.Format(GOOGLE_REQUEST_TEMPLATE, EncodeUnicodeToASCII(data));
                var xmlData = Encoding.ASCII.GetBytes(xml);

                var request = WebRequest.CreateHttp(uriBuilder.Uri);
                request.Method = "POST";
                request.KeepAlive = false;
                request.ContentType = "application/PTI26";
                request.ContentLength = xmlData.Length;

                // Google-specific headers
                var headers = request.Headers;
                headers.Add("MIME-Version: 1.0");
                headers.Add("Request-number: 1");
                headers.Add("Document-type: Request");
                headers.Add("Interface-Version: Test 1.4");

                //write the request body and close the stream before asking for the response
                using (var requestStream = (await request.GetRequestStreamAsync()))
                {
                    await requestStream.WriteAsync(xmlData, 0, xmlData.Length);
                }

                using (var response = (await request.GetResponseAsync()))
                using (var responseStream = new StreamReader(response.GetResponseStream()))
                {
                    return await responseStream.ReadToEndAsync();
                }
            }

            string EncodeUnicodeToASCII(string s)
            {
                var builder = new StringBuilder();
                foreach (var c in s.ToCharArray())
                {
                    //encode Unicode characters that can't be represented as ASCII
                    if (c > 127)
                    {
                        builder.AppendFormat("&#{0};", (int)c);
                    }
                    else
                    {
                        builder.Append(c);
                    }
                }
                return builder.ToString();
            }
        }
    }
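The GoogleSpell class above exposes two methods: SpellcheckAsync, which returns the misspelled words from a list of words, and SuggestionsAsync, which returns the suggested replacements for a single word.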
My MVC controller class exposes this functionality to TinyMCE by translating JSON to and from calls on the GoogleSpell class.
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;
    using System.Web;
    using System.Web.Mvc;
    using WolfeReiter.Web.Utility;

    namespace MvcProject.Controllers
    {
        public class TinyMCESpellcheckGatewayController : AsyncController
        {
            [HttpPost]
            public async Task<JsonResult> Index(SpellcheckRequest model)
            {
                var spellService = new GoogleSpell();
                IEnumerable<string> result = null;
                if (string.Equals(model.method, "getSuggestions", StringComparison.InvariantCultureIgnoreCase))
                {
                    result = (await spellService.SuggestionsAsync(model.@params.First().Single(), model.@params.Skip(1).First().Single()));
                }
                else //assume checkWords
                {
                    result = (await spellService.SpellcheckAsync(model.@params.First().Single(), model.@params.Skip(1).First()));
                }
                string error = null;
                return Json(new { result, id = model.id, error });
            }

            //class models JSON posted by TinyMCE allows MVC Model Binding to "just work"
            public class SpellcheckRequest
            {
                public SpellcheckRequest()
                {
                    @params = new List<IEnumerable<string>>();
                }
                public string method { get; set; }
                public string id { get; set; }
                public IEnumerable<IEnumerable<string>> @params { get; set; }
            }
        }
    }
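For reference, the controller above implies roughly the following request and response shapes. This is a JavaScript sketch inferred from the SpellcheckRequest model and the anonymous object passed to Json(...). The field names (method, id, params, result, error) come from the C# code. The helper names buildRequest and isValidResponse are hypothetical, the id value is arbitrary, and I have not confirmed here whether TinyMCE sends the single word for getSuggestions as a bare string or as a one-element array.

```javascript
// Hypothetical helpers illustrating the JSON exchanged with the controller.
// Only the field names are taken from the C# code; the rest is assumed.

// params[0] is the language code. params[1] is the word list for
// "checkWords"; for "getSuggestions" the controller expects exactly one word.
function buildRequest(method, lang, words) {
  return { id: "c0", method: method, params: [lang, words] };
}

// The controller replies with { result, id, error }: result is an array of
// strings, id echoes the request id, and error is null on success.
function isValidResponse(request, response) {
  return response.id === request.id &&
    response.error === null &&
    Array.isArray(response.result) &&
    response.result.every(function (w) { return typeof w === "string"; });
}

const checkRequest = buildRequest("checkWords", "en", ["Ths", "is", "a", "tst"]);
console.log(JSON.stringify(checkRequest));
// {"id":"c0","method":"checkWords","params":["en",["Ths","is","a","tst"]]}

const sampleResponse = { result: ["Ths", "tst"], id: "c0", error: null };
console.log(isValidResponse(checkRequest, sampleResponse)); // true
```

Posting a payload like checkRequest to the controller by hand (from a browser console or an HTTP client) is a quick way to check the gateway independently of the editor.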
Integrating the above controller with TinyMCE is straightforward. All you need to do is include the “spellchecker” plugin, add the “spellchecker” toolbar button and set spellchecker_rpc_url to point to the controller.
    /*global $, jQuery, tinyMCE, tinymce */
    /// <reference path="jquery-1.8.3.js" />
    /// <reference path="jquery-ui-1.8.24.js" />
    /// <reference path="modernizr-2.6.2.js" />
    /// <reference path="tinymce/tinymce.jquery.js" />
    /// <reference path="tinymce/tiny_mce_jquery.js" />
    (function () {
        "use strict";
        $(document).ready(function () {
            $('textarea.rich-text').tinymce({
                mode: "exact",
                theme: "advanced",
                plugins: "safari,spellchecker,paste",
                gecko_spellcheck: true,
                theme_advanced_buttons1: "bold,italic,underline,|,undo,redo,|,spellchecker,code",
                theme_advanced_statusbar_location: "none",
                spellchecker_rpc_url: "/TinyMCESpellcheckGateway", //<-- point TinyMCE to GoogleSpell adaptor controller
                /*strip pasted microsoft office styles*/
                paste_strip_class_attributes: "mso"
            });
        });
    }());
That’s all there is to it. Here’s how TinyMCE renders on a <textarea class="rich-text"></textarea>.
Your blog is amazing. You write about very interesting things. Thanks for all your tips and information
Man, thanks for the work. I have a question: for some reason, if I have a misspelled word, I can see that it returns the misspelled word, but nothing else happens. It doesn't display options or anything else.