Archive for the ‘Browsers’ Category

Interesting code with HtmlAgilityPack

Yesterday I was busy with HTML to PDF conversion, and for this I used the HTML Agility Pack. Everything worked great, except that IE and FF/Chrome seemed to render different HTML. So today I took some fairly straightforward HTML and pushed it through the HTML Agility Pack:






	
	




New Website Under Construction

And if I use this code to loop through the childnodes:

            // Requires: using System; using System.IO; using System.Text; using HtmlAgilityPack;
            HtmlDocument doc = new HtmlDocument();
            string s;
            StringBuilder builder = new StringBuilder();

            // Read the saved page line by line into a single string.
            using (StreamReader reader = new StreamReader(@"C:\Documents and Settings\user\Desktop\fremus.net\index.htm"))
            {
                while ((s = reader.ReadLine()) != null)
                {
                    builder.AppendLine(s);
                }
            }

            // Parse the HTML and report how many top-level nodes the document has.
            doc.LoadHtml(builder.ToString());
            Console.WriteLine(doc.DocumentNode.ChildNodes.Count);

            // Print the names of the nodes three levels deep, indenting each level.
            foreach (HtmlNode node in doc.DocumentNode.ChildNodes)
            {
                Console.WriteLine(node.Name);
                foreach (HtmlNode childNode in node.ChildNodes)
                {
                    Console.WriteLine("\t\t" + childNode.Name);
                    foreach (HtmlNode grandChildNode in childNode.ChildNodes)
                    {
                        Console.WriteLine("\t\t\t" + grandChildNode.Name);
                    }
                }
            }

I get the following result in my command line window:
[Screenshot: command-line output]

As you can see from the output, the html node has a text node. The head node has a text node as well, and it has 9 child nodes, 5 of which are #text nodes. The body node also has a text node, and it has 7 child nodes: four #text nodes and three div elements. So what is this #text node? If you read this article on the W3C site, you will see that it states:

A common error in DOM processing is to expect an element node to contain text.

However, the text of an element node is stored in a text node.
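
To make the quoted point concrete, here is a minimal sketch using the HTML Agility Pack. The sample markup, the `TitleTextNodeDemo` class and the `FirstTitleChild` helper are made up for illustration and are not the page or code from this post; the idea is only to show that a title element keeps its text in a child node rather than in the element itself.

```csharp
using System;
using HtmlAgilityPack;

public static class TitleTextNodeDemo
{
    // Hypothetical helper: returns the first child node of the first <title>
    // element, or null when the document has no title or the title is empty.
    public static HtmlNode FirstTitleChild(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNode title = doc.DocumentNode.SelectSingleNode("//title");
        return title?.FirstChild;
    }

    public static void Main()
    {
        // Assumed sample markup, not the HTML from the post.
        HtmlNode child = FirstTitleChild(
            "<html><head><title>Example page</title></head><body></body></html>");

        if (child == null)
        {
            Console.WriteLine("No title text found.");
            return;
        }

        Console.WriteLine(child.Name);      // node name of the child
        Console.WriteLine(child.NodeType);  // HtmlNodeType of the child
        Console.WriteLine(child.InnerText); // the text the title appears to contain
    }
}
```

If this behaves the way the output in this post suggests, the child's name should come back as #text, with the title's text sitting in that child.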

On the same page it then gives an example using a title tag. If you search Google for “html #text node”, you will see that the second result points to an article, and if you read the bit on nodes it seems that each #text node is a child node in its own right. The #text nodes that appear in the body node seem to correspond to the whitespace after each div, or after each element, inside the body node. If I change my code slightly:

                    Console.WriteLine("\t\t" + childNode.Name);
                    foreach (HtmlNode grandChildNode in childNode.ChildNodes)
                    {
                        Console.WriteLine("\t\t\t" + grandChildNode.Name);
                        Console.WriteLine("\t\t\t\t" + grandChildNode.HasChildNodes);
                    }

It tells me that the divs have child nodes, but the #text nodes do not. Thus it seems that for each stretch of ‘empty space’ inside a node there exists a #text node. If I amend the HTML from earlier like this:





	
	







Then the footer div will have two text nodes, and the paragraph node will have one text node. My issues yesterday had to do with the way IE rendered the HTML: when I used the HTML Agility Pack to parse it, the node counts weren’t the same. With the sample HTML I have given so far the difference is negligible, but when I went to a site like this one, saved the HTML from IE and from Chrome into separate HTML files and ran my code against each file, I got different node counts. Here are two screenshots that illustrate this:
[Screenshots: node output for the HTML saved from Chrome and from IE]

The first screenshot shows the HTML of the page as saved from Chrome, and the second shows the HTML as saved from IE. Notice the extra text nodes.
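
One way to compare two saved copies of a page without the whitespace noise is to count whitespace-only #text nodes separately from everything else. The following is only a sketch under assumptions: the `NodeCountDemo` class, the `Count` helper and the two tiny HTML strings are hypothetical stand-ins for the Chrome and IE files, and it will not explain differences that come from the browsers actually rewriting the markup.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class NodeCountDemo
{
    // True for #text nodes that contain nothing but whitespace.
    private static bool IsWhitespaceText(HtmlNode node)
    {
        return node.NodeType == HtmlNodeType.Text
            && string.IsNullOrWhiteSpace(node.InnerText);
    }

    // Hypothetical helper: counts all descendant nodes of the document and,
    // separately, how many of them are whitespace-only #text nodes.
    public static (int Total, int WhitespaceOnly) Count(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var all = doc.DocumentNode.Descendants().ToList();
        int whitespaceOnly = all.Count(n => IsWhitespaceText(n));
        return (all.Count, whitespaceOnly);
    }

    public static void Main()
    {
        // Assumed samples: the same markup, once compact and once indented.
        string compact = "<div><p>Hi</p></div>";
        string indented = "<div>\n\t<p>Hi</p>\n</div>\n";

        foreach (string html in new[] { compact, indented })
        {
            var counts = Count(html);
            Console.WriteLine(
                "total: " + counts.Total +
                ", whitespace-only #text: " + counts.WhitespaceOnly +
                ", everything else: " + (counts.Total - counts.WhitespaceOnly));
        }
    }
}
```

If the two files differ only in indentation and line breaks, the "everything else" figure should match between them even though the totals do not.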

Is Firefox slowly dying?

I mean really, the performance sucks! The Firebug add-on is a 600 KB+ download, and it feels as if it really, really slows Firefox down. I switched over to Google Chrome a while ago, because it is such a fast and responsive browser. I have found one or two issues when using Facebook with Chrome, but other than that it's a pretty cool browser. The thing I dislike most about Firefox is that its process sometimes does not terminate completely, and if you then start a new instance of the browser, the OS complains that Firefox is still running. Even IE is way quicker than Firefox. I used Firefox for development, specifically with the Firebug and Web Developer plugins, but I have decided to switch over to Chrome, because it has a developer plugin too. IE 8 also has a developer plugin. That's my two cents for now.

Chrome fades as users return to IE, Firefox

So Google released its own browser, Chrome, about three weeks ago. As with many things in life, if something is interesting and new, everybody is likely to give it a try or take a look, but as soon as the shine wears off they go back to their old habits. The same holds true for Google Chrome, which had about 1% browser share after its release but now seems to be steadily losing ground and giving share back to Internet Explorer and Firefox. Does this mean that Google will stop developing Chrome, or will they continue? Was it a mistake for Google to build a browser? Only time will tell.

You can read more about this in Computer Weekly’s article.

How often do you use GMail?

How often do you use GMail? Well, if you use GMail and you use the web-based version in your browser, then you will know how boring the interface can be. With that being said, I found Globex Designs and the Stylish Firefox plug-in, which transforms a mundane, boring-looking GMail into a slick, awesome-looking interface.

From Flock to Firefox 3

So finally I ditched Flock for Firefox. One of the main reasons I chose to drop Flock was Firefox’s cool address bar, which is better than sliced cheese in my opinion. You simply type in a few words (title tags) and Firefox will take you to that page. Another reason for ditching Flock is the plugins that can be used in Firefox. Some of the plugins that I really like using are:

  • Firebug: An essential web development tool. It was interesting to note that Brad Abrams from Microsoft used Firebug at Mix Essentials 08.
  • MeasureIt: A measuring tool that is incredibly useful in a web development environment.
  • IE Tab: Unfortunately, web developers have to code for audiences that use Internet Explorer, but that doesn't mean you have to leave your Firefox abode. This add-on renders pages as Internet Explorer would. You can also change the rendering engine at any given time.