
Ilka Guigova - Professional Profile



Summary

Reputation points by category:
Author: 3,733
Authority: 60
Debator: 328
Editor: 193
Enquirer: 5
Organiser: 346
Participant: 888

How To Change compare merge tool in TFS? (Ilka Guigova, 7-Nov-11 10:15)
GetDeploymentDescription() (Ilka Guigova, 3-Nov-11 13:42)
Removing Event Handlers Using Reflection (Ilka Guigova, 3-Nov-11 7:42)
Logos. No images. No JS. Just CSS (Ilka Guigova, 31-Oct-11 10:07)
Operating System Version Info (Ilka Guigova, 24-Oct-11 8:25)
Set a .NET 4.0 Application to auto-start upon login (Ilka Guigova, 16-Oct-11 16:50)
A sample implementation (Ilka Guigova, 26-Jun-12 7:13)
Extended ExtJS HtmlEditor to handle paste from MS Word and table manipulations
Ilka Guigova, 6-Sep-11 9:49

Context: Knowledge Center Site
Need: Allow for pages with predefined styles to be added
Feature: Support a template engine in combination with a set of stylesheets
Requirement: Implement an editor for manipulating text, images, and tables without interfering with the stylesheets' fonts, colors, alignments, and overall structure.


There are a number of available WYSIWYG editors, such as CKEditor and TinyMCE. However, with their full spectrum of functionality, they are heavyweight and detract from the predictability of the content layout. A much simpler tool is required, yet it seems ineffective to implement one from scratch.

It was decided to use the ExtJS Ext.form.HtmlEditor component, since the system already uses the ExtJS framework. For this to work, the component had to be configured correctly and extended to handle pasting from MS Word and table manipulations.


The copy-and-paste problem turned out to be a little tricky. There are many bits and pieces of information on the web that, when collected, seem random and inconsistent; e.g.,
JavaScript
cleanWordHtml: function(html){
    // http://stackoverflow.com/questions/2875027/clean-microsoft-word-pasted-text-using-javascript
    // http://stackoverflow.com/questions/1068280/javascript-regex-multiline-flag-doesnt-work
    html = html.replace(/<!--[\s\S]*-->/g,""); 
    //html = html.replace(/\n/g, "<br/>");

    // http://www.1stclassmedia.co.uk/developers/clean-ms-word-formatting.php
    html = html.replace(/<o:p>\s*<\/o:p>/g, "") ;
    html = html.replace(/<o:p>.*?<\/o:p>/g, "&nbsp;") ;
    html = html.replace( /\s*mso-[^:]+:[^;"]+;?/gi, "" ) ;
    html = html.replace( /\s*MARGIN: 0cm 0cm 0pt\s*;/gi, "" ) ;
    html = html.replace( /\s*MARGIN: 0cm 0cm 0pt\s*"/gi, "\"" ) ;
    html = html.replace( /\s*TEXT-INDENT: 0cm\s*;/gi, "" ) ;
    html = html.replace( /\s*TEXT-INDENT: 0cm\s*"/gi, "\"" ) ;
    html = html.replace( /\s*TEXT-ALIGN: [^\s;]+;?"/gi, "\"" ) ;
    html = html.replace( /\s*PAGE-BREAK-BEFORE: [^\s;]+;?"/gi, "\"" ) ;
    html = html.replace( /\s*FONT-VARIANT: [^\s;]+;?"/gi, "\"" ) ;
    html = html.replace( /\s*tab-stops:[^;"]*;?/gi, "" ) ;
    html = html.replace( /\s*tab-stops:[^"]*/gi, "" ) ;
    html = html.replace( /\s*face="[^"]*"/gi, "" ) ;
    html = html.replace( /\s*face=[^ >]*/gi, "" ) ;
    html = html.replace( /\s*FONT-FAMILY:[^;"]*;?/gi, "" ) ;
    html = html.replace(/<(\w[^>]*) class=([^ |>]*)([^>]*)/gi, "<$1$3") ;
    html = html.replace( /<(\w[^>]*) style="([^\"]*)"([^>]*)/gi, "<$1$3" ) ;
    html = html.replace( /\s*style="\s*"/gi, '' ) ; 
    html = html.replace( /<SPAN\s*[^>]*>\s*&nbsp;\s*<\/SPAN>/gi, '&nbsp;' ) ; 
    html = html.replace( /<SPAN\s*[^>]*><\/SPAN>/gi, '' ) ; 
    html = html.replace(/<(\w[^>]*) lang=([^ |>]*)([^>]*)/gi, "<$1$3") ; 
    html = html.replace( /<SPAN\s*>(.*?)<\/SPAN>/gi, '$1' ) ; 
    html = html.replace( /<FONT\s*>(.*?)<\/FONT>/gi, '$1' ) ;
    html = html.replace(/<\\?\?xml[^>]*>/gi, "") ; 
    html = html.replace(/<\/?\w+:[^>]*>/gi, "") ; 
    html = html.replace( /<H\d>\s*<\/H\d>/gi, '' ) ;
    html = html.replace( /<H1([^>]*)>/gi, '' ) ;
    html = html.replace( /<H2([^>]*)>/gi, '' ) ;
    html = html.replace( /<H3([^>]*)>/gi, '' ) ;
    html = html.replace( /<H4([^>]*)>/gi, '' ) ;
    html = html.replace( /<H5([^>]*)>/gi, '' ) ;
    html = html.replace( /<H6([^>]*)>/gi, '' ) ;
    html = html.replace( /<\/H\d>/gi, '<br>' ) ; //remove this to take out breaks where Heading tags were 
    html = html.replace( /<(U|I|STRIKE)>&nbsp;<\/\1>/g, '&nbsp;' ) ;
    html = html.replace( /<(B|b)>&nbsp;<\/\b|B>/g, '' ) ;
    html = html.replace( /<([^\s>]+)[^>]*>\s*<\/\1>/g, '' ) ;
    html = html.replace( /<([^\s>]+)[^>]*>\s*<\/\1>/g, '' ) ;
    html = html.replace( /<([^\s>]+)[^>]*>\s*<\/\1>/g, '' ) ;
    html = html.replace( /(<P)([^>]*>.*?)(<\/P>)/gi, "<div$2</div>" ) ;
    html = html.replace( /(<font|<FONT)([^*>]*>.*?)(<\/FONT>|<\/font>)/gi, "<div$2</div>") ;
    html = html.replace( /size|SIZE = ([\d]{1})/g, '' ) ;

    // http://www.codinghorror.com/blog/2006/01/cleaning-words-nasty-html.html
    html = html.replace(/<!--(\w|\W)+?-->/gi, '');
    html = html.replace(/<title>(\w|\W)+?<\/title>/gi, '');
    html = html.replace(/\s?class=\w+/gi, '');
    html = html.replace(/\s+style='[^']+'/gi, '');
    html = html.replace(/<(meta|link|\/?o:|\/?style|\/?div|\/?st\d|\/?head|\/?html|body|\/?body|\/?span|!\[)[^>]*?>/gi, '');
    html = html.replace(/(<[^>]+>)+&nbsp;(<\/\w+>)+/gi, '');
    html = html.replace(/\s+v:\w+=""[^""]+""/gi, '');
    html = html.replace(/(\n\r){2,}/gi, '');

    // http://www.tim-jarrett.com/labs_javascript_scrub_word.php
    html = html.replace(new RegExp(String.fromCharCode(8220), 'gi'), '"'); // left double quote (U+201C)
    html = html.replace(new RegExp(String.fromCharCode(8221), 'gi'), '"'); // right double quote (U+201D)
    html = html.replace(new RegExp(String.fromCharCode(8216), 'gi'), "'"); // left single quote (U+2018)
    html = html.replace(new RegExp(String.fromCharCode(8217), 'gi'), "'"); // right single quote (U+2019)
    html = html.replace(new RegExp(String.fromCharCode(8211), 'gi'), "-"); // en dash (U+2013)
    html = html.replace(new RegExp(String.fromCharCode(8212), 'gi'), "--"); // em dash (U+2014)
    html = html.replace(new RegExp(String.fromCharCode(189), 'gi'), "1/2"); // ½
    html = html.replace(new RegExp(String.fromCharCode(188), 'gi'), "1/4"); // ¼
    html = html.replace(new RegExp(String.fromCharCode(190), 'gi'), "3/4"); // ¾
    html = html.replace(new RegExp(String.fromCharCode(169), 'gi'), "(C)"); // ©
    html = html.replace(new RegExp(String.fromCharCode(174), 'gi'), "(R)"); // ®
    html = html.replace(new RegExp(String.fromCharCode(8230), 'gi'), "..."); // ellipsis (U+2026)

    return html;
}
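
One inconsistency in the collected snippets is easy to show in isolation. The first rule strips comments with a greedy `[\s\S]*`, while the later codinghorror rule uses a lazy `+?`. The short sketch below, which uses a made-up sample string and plain JavaScript rather than the editor component, shows why the difference matters when a pasted fragment contains more than one comment.

```javascript
// Hypothetical sample: two Word-style comments with real text between them.
const sample = "<!--[if gte mso 9]>x<![endif]-->Keep me<!-- StartFragment -->";

// Greedy: [\s\S]* runs to the LAST "-->", swallowing the text in between.
const greedy = sample.replace(/<!--[\s\S]*-->/g, "");

// Lazy: [\s\S]*? stops at the FIRST "-->", so each comment is removed on its own.
const lazy = sample.replace(/<!--[\s\S]*?-->/g, "");

console.log(JSON.stringify(greedy)); // ""
console.log(JSON.stringify(lazy));   // "Keep me"
```

With the greedy form, everything between the first `<!--` and the last `-->` disappears, including legitimate content.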


I ended up refining and splitting the set into tag replacement and character replacement sets that work in the Ext HtmlEditor component (and do not interfere with its tags).

JavaScript
dirtyHtmlTags: [
    // http://stackoverflow.com/questions/2875027/clean-microsoft-word-pasted-text-using-javascript
    // http://stackoverflow.com/questions/1068280/javascript-regex-multiline-flag-doesnt-work
    {regex: /<!--[\s\S]*?-->/gi, replaceVal: ""}, 

    // http://www.1stclassmedia.co.uk/developers/clean-ms-word-formatting.php
    {regex: /<\\?\?xml[^>]*>/gi, replaceVal: ""}, 
    {regex: /<\/?\w+:[^>]*>/gi, replaceVal: ""}, // e.g. <o:p...

    {regex: /\s*MSO[-:][^;"']*/gi, replaceVal: ""},
    {regex: /\s*MARGIN[-:][^;"']*/gi, replaceVal: ""},
    {regex: /\s*PAGE[-:][^;"']*/gi, replaceVal: ""},
    {regex: /\s*TAB[-:][^;"']*/gi, replaceVal: ""},
    {regex: /\s*LINE[-:][^;"']*/gi, replaceVal: ""},
    {regex: /\s*FONT-SIZE[^;"']*/gi, replaceVal: ""},
    {regex: /\s*LANG=(["'])[^"']*?\1/gi, replaceVal: ""},
    {regex: /<(P|H\d)[^>]*>([\s\S]*?)<\/\1>/gi, replaceVal: "$2"},

    {regex: /\s*\w+=(["'])((&nbsp;|\s|;)*|\s*;+[^"']*?|[^"']*?;{2,})\1/gi, replaceVal: ""}, 
    {regex: /<span[^>]*>(&nbsp;|\s)*<\/span>/gi, replaceVal: ""},
    //{regex: /<([^\s>]+)[^>]*>(&nbsp;|\s)*<\/\1>/gi, replaceVal: ""},

    // http://www.codinghorror.com/blog/2006/01/cleaning-words-nasty-html.html
    {regex: /<(\/?title|\/?meta|\/?style|\/?st\d|\/?head|\/?html|\/?body|!\[)[^>]*?>/gi, replaceVal: ""},
    {regex: /(\n(\r)?){2,}/gi, replaceVal: ""}        
],

cleanHtml: function(html) {
    if (!html) return html; // nothing to clean: return the empty value rather than undefined

    Ext.each(this.dirtyHtmlTags, function(tag, idx){
        html = html.replace(tag.regex, tag.replaceVal);
    });

    // http://www.tim-jarrett.com/labs_javascript_scrub_word.php
    html = html.replace(new RegExp(String.fromCharCode(8220), 'gi'), '"'); // left double quote (U+201C)
    html = html.replace(new RegExp(String.fromCharCode(8221), 'gi'), '"'); // right double quote (U+201D)
    html = html.replace(new RegExp(String.fromCharCode(8216), 'gi'), "'"); // left single quote (U+2018)
    html = html.replace(new RegExp(String.fromCharCode(8217), 'gi'), "'"); // right single quote (U+2019)
    html = html.replace(new RegExp(String.fromCharCode(8211), 'gi'), "-"); // en dash (U+2013)
    html = html.replace(new RegExp(String.fromCharCode(8212), 'gi'), "--"); // em dash (U+2014)
    html = html.replace(new RegExp(String.fromCharCode(189), 'gi'), "1/2"); // ½
    html = html.replace(new RegExp(String.fromCharCode(188), 'gi'), "1/4"); // ¼
    html = html.replace(new RegExp(String.fromCharCode(190), 'gi'), "3/4"); // ¾
    html = html.replace(new RegExp(String.fromCharCode(169), 'gi'), "(C)"); // ©
    html = html.replace(new RegExp(String.fromCharCode(174), 'gi'), "(R)"); // ®
    html = html.replace(new RegExp(String.fromCharCode(8230), 'gi'), "..."); // ellipsis (U+2026)

    return Ext.ux.form.HtmlLintEditor.superclass.cleanHtml.call(this, html);
}


These regular expressions seem to cover most cases and have been successfully tested for the purposes of the project I worked on. I thought I might share the info.
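
For anyone who wants to try the rule-table idea outside ExtJS, here is a minimal sketch in plain JavaScript. It copies only a subset of the regexes from `dirtyHtmlTags`; the names `rules`, `charMap`, and `cleanPastedHtml`, and the sample input, are made up for illustration and are not part of the component above.

```javascript
// Rule table: a subset of the regexes from dirtyHtmlTags, applied in order.
const rules = [
  { regex: /<!--[\s\S]*?-->/gi, replaceVal: "" },                  // HTML comments
  { regex: /<\/?\w+:[^>]*>/gi, replaceVal: "" },                   // namespaced tags, e.g. <o:p>
  { regex: /\s*MSO[-:][^;"']*/gi, replaceVal: "" },                // mso-* style declarations
  { regex: /\s*MARGIN[-:][^;"']*/gi, replaceVal: "" },             // margin declarations
  { regex: /\s*LANG=(["'])[^"']*?\1/gi, replaceVal: "" },          // lang attributes
  { regex: /<(P|H\d)[^>]*>([\s\S]*?)<\/\1>/gi, replaceVal: "$2" }, // unwrap <p> and headings
  { regex: /<span[^>]*>(&nbsp;|\s)*<\/span>/gi, replaceVal: "" }   // empty spans
];

// Typographic characters mapped to plain ASCII (a subset of the list in cleanHtml).
const charMap = {
  "\u201C": '"', "\u201D": '"', "\u2018": "'", "\u2019": "'",
  "\u2013": "-", "\u2014": "--", "\u2026": "..."
};

function cleanPastedHtml(html) {
  if (!html) return html; // nothing to clean
  for (const rule of rules) {
    html = html.replace(rule.regex, rule.replaceVal);
  }
  // One pass over the typographic characters, looked up in charMap.
  return html.replace(/[\u201C\u201D\u2018\u2019\u2013\u2014\u2026]/g, ch => charMap[ch]);
}

// Hypothetical fragment resembling what Word puts on the clipboard.
const pasted =
  '<!--StartFragment--><p style="margin: 0cm 0cm 0pt; mso-pagination: none">' +
  '<span lang="EN-US">\u201CHello\u201D \u2013 world\u2026</span><o:p></o:p></p>';

const cleaned = cleanPastedHtml(pasted);
console.log(cleaned); // <span>"Hello" - world...</span>
```

Keeping the rules in an array makes their order explicit, which matters here: the style declarations are stripped before the paragraph is unwrapped, and the empty-span rule runs last so it can catch spans emptied by earlier rules.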

See Also:
Introduction to Ranges
Intercepting the Clipboard data on Paste

Convert Word documents to Clean HTML (Ilka Guigova, 24-Jul-12 10:12)
Invoking a Generic Method With Parameters using Reflection (Ilka Guigova, 28-Aug-11 13:35)
Re: Invoking a Generic Method With Parameters using Reflection (Ilka Guigova, 1-Mar-12 7:25)
Slide-In Image Captions (Ilka Guigova, 27-Aug-11 19:49)
30 Useful jQuery Tabs Tutorials (Ilka Guigova, 9-Jul-11 14:36)
One-liners (Ilka Guigova, 4-Jun-12 7:34)
Logical XOR in Javascript (Ilka Guigova, 25-Apr-11 14:36)
A Collapsible Outline of Indented Text in Javascript (Ilka Guigova, 6-Feb-11 16:19)
Python calculations - arrangements (Ilka Guigova, 30-Aug-10 7:44)
VisualStudio 2005 Slow On Save (Ilka Guigova, 23-May-10 13:44)
T-SQL Cursor and XML Basic Example (Ilka Guigova, 3-Apr-10 13:13)
T-SQL Auto Increment Variable (Ilka Guigova, 2-Apr-10 20:04)
Applying XSL Transformations (Ilka Guigova, 29-Mar-10 11:05)
XML DOM Load Functions (Ilka Guigova, 10-Aug-12 13:46)
Working with durations in MSSQL server (Ilka Guigova, 31-Jan-10 13:11)
Re: Working with durations in MSSQL server (Grunge Boy, 31-Jan-10 23:29)
Re: Working with durations in MSSQL server (Ilka Guigova, 1-Feb-10 7:17)
